Validation Panel Report -- 1998



Student Evaluation Standards Project



prepared by



W. Todd Rogers, Chair


Daniel L. Stufflebeam, Member


Validation Panel



October 13, 1998






This report is based on

  1. the Joint Committee's Panel of Writers' Draft Student Evaluation Standards;

  2. the discussion among members of the Joint Committee at their annual meeting held on October 1 - 3, 1998, at which the Draft Student Evaluation Standards were considered and revised; and

  3. our interactions with members of the Joint Committee during the meeting.

We first want to acknowledge the serious attention given by the members of the Joint Committee, in attendance at the October, 1998 meeting, to the task of preparing the first draft of the Student Evaluation Standards to be sent out to the panels of national and international reviewers. Clearly, a lot of thought and effort was devoted to this task.

However, we are concerned that the material provided to the Joint Committee members was inferior, and that there was a lack of clarity about what was expected of the Joint Committee members at the meeting. We make the following observations and suggestions in the hope that they will assist the Joint Committee members as they work toward the development of a credible and sound set of student evaluation standards. Some actions were taken after we presented this report orally to the Joint Committee following lunch on the third day of the meeting. We have indicated these actions in italics after each concern and the suggestions intended to help address it.

A.   We start with two concerns which may be outside our mandate, but which we feel must be quickly addressed so as to ensure the development of a sound and credible set of standards for student evaluation.
A1.   Is it reasonable and wise for the Joint Committee to be involved both in the development and the validation of the Student Evaluation Standards and in the revision and re-validation of the Personnel Evaluation Standards? What will the impact of such an involvement be on the quality of the two products to be produced? In our opinion, the quality of both the Student Evaluation Standards and the Personnel Evaluation Standards will likely be diminished. No opportunity was provided to the members of the Joint Committee who attended the 1998 Annual Meeting to thoroughly review and discuss the standards produced by each of the four working groups. Joint Committee members indicated that it was difficult to "change gears," to change from focusing intently on standards for student evaluation to focusing intently on standards for personnel evaluation.

We suggest the following options, which are listed in order of preference, to address this concern:

  1. Postpone the revision of the Personnel Evaluation Standards to a later time.

  2. Use specifically constituted task forces with expertise to work on tasks identified by the Joint Committee and related to the work that needs to be done on both sets of standards.

  3. Schedule two meetings of the Joint Committee each year, with one meeting devoted to the development and validation of the Student Evaluation Standards and the second meeting devoted to the revision and re-validation of the Personnel Evaluation Standards.

A2.   On the third day of the meeting, Dr. Arlen Gullickson was elected to be the chair of the Joint Committee, with his term to begin following completion of the 1998 Joint Committee Annual Meeting. Like the previous two chairs, Dr. Dan Stufflebeam (1981 Program Evaluation Standards and 1988 Personnel Evaluation Standards) and Dr. Jim Sanders (1994 revision of the Program Evaluation Standards; Student Evaluation Standards; revised Personnel Evaluation Standards), the new chair is employed at The Evaluation Center, Western Michigan University. The Center has kindly provided and continues to provide infrastructure support for the Joint Committee. Unlike the previous two chairs, Dr. Gullickson is not the principal investigator for the grant received from the Kellogg Foundation to develop and validate the Student Evaluation Standards and the grant received from the Kellogg Foundation to revise and re-validate the Personnel Evaluation Standards. Dr. Stufflebeam was the principal investigator for the development and validation of the 1981 edition of the Program Evaluation Standards and the 1988 edition of the Personnel Evaluation Standards. Dr. Sanders was the principal investigator for the 1994 edition of the Program Evaluation Standards and is the principal investigator for both of the current projects. Hence, we have a situation where the new chair has the responsibility but not necessarily the authority for ensuring the successful completion of the two projects in a timely manner.

To address this concern, we suggest:

  that the new chair become the principal investigator for both projects and that the Kellogg Foundation be asked to approve this change.

Dr. Sanders indicated that he agreed with this suggestion, and that he would work with the new chair (Dr. Arlen Gullickson) and the officials at the Kellogg Foundation to effect the suggested change.

B.   The following comments and suggestions are related to the development of the Student Evaluation Standards.
B1.   There is a need to be clear on the foci for the Student Evaluation Standards and on how the Student Evaluation Standards will differ from other standards in the area of measurement and evaluation.

In terms of previous work by the Joint Committee, the object of the evaluation needs to be clearly articulated. Our suggestion is that the object of the evaluation be students. The following statement submitted by one member on the panel of writers best characterizes why we feel students, and not teachers, programs, or educational systems such as a district or state, should be the evaluation object for the Student Evaluation Standards:

A significant element of the American traditional education system is the use of teacher-made, classroom tests or other teacher-made performance tasks. Judgment of performance or "scores" prepared by the teachers are used for calculation of "grades" that become the foundation of the record system documenting academic achievement of students. Probably this "classroom testing system" is the largest testing system operating in American schools and may have the most far-reaching effects on the status of students within the schools. For the most part, "grades" become the primary basis for actions taken by the school regarding eligibility for services, for placement decisions, for student and system expectations of what the student might achieve, and for promotion in grade level. (from Hayes, U7. Follow-Up, Illustrative Case 1)

Concerning how the Student Evaluation Standards differ from other sets of published measurement and evaluation standards, the Joint Committee might consider demonstrating the difference in a table in which the different sets of standards are listed together with their purpose, intended audience, object of the evaluation, and setting in which the evaluation takes place (see Table 1 for a suggestion).

Table 1
Comparison of Standards

| Standard | Purpose | Audience | Evaluation Object | Setting |
|---|---|---|---|---|
| Student Evaluation Standards | | Classroom Teachers | Student | Pre-school, elementary, and secondary school, and university/college classroom |
| Standards for Educational and Psychological Testing | | | | |
| Principles for Fair Student Assessment Practices for Education in Canada | Assessments depend on professional judgment; the principles and related guidelines presented in this document identify the issues to consider in exercising this professional judgment and in striving for the fair and equitable assessment of all students. | Part A: Student. Part B: Commercial Test Publishers; provincial and territorial ministries and departments of education, and local school jurisdictions | Part A: Student. Part B: Student, School | Part A: Elementary and secondary school, and university/college classroom. Part B: Student, School, School Jurisdiction |
| · | · | · | · | · |
| · | · | · | · | · |

B2.   There is a need to clearly identify the audiences for the Student Evaluation Standards; whether these audiences are of equal importance or whether some are primary and others are secondary; the knowledge and skill in the areas of measurement and evaluation possessed by the members of the audience(s); and the attitudes of these members toward measurement and evaluation. In keeping with our suggestion that the object of the evaluation be the student, and given our knowledge of other available standards for measurement and evaluation, we suggest the primary audience be classroom teachers and school administrators. There clearly is a need for a set of sound and relevant standards that teachers and administrators can follow when assessing the students for whom they are responsible. We suggest that individuals like psychologists and school psychologists be considered a secondary audience. While they work with individual students, they do so with standardized tests. Standards such as the Standards for Educational and Psychological Testing are available for the standardized testing and subsequent evaluations that psychologists and school psychologists do when they work with students.

B3.   The setting(s) in which the student evaluations take place should be clearly established. As indicated above, we suggest that the setting be the classroom. We further suggest that the standards be written to cover elementary and secondary education, with perhaps some attention given to postsecondary education. We are of the opinion that extending the applications to education and training programs in industry, business, and the military may stretch the Student Evaluation Standards in a way that leads to difficulties (e.g., getting a sufficient number of case illustrations for each setting that is balanced across standards).

The Joint Committee decided in the afternoon session on October 3 that the object of the evaluation is to be the student, and that the settings in which the student evaluations take place are to include pre-school, elementary, secondary, and university/college classrooms.

B4.   Regarding the standards, we strongly encourage the Joint Committee to include a standard on stakeholder identification. Inclusion of a stakeholder standard will set out clearly who should be involved in a student evaluation and under what conditions, and who is affected by the evaluation and the decisions made.

The Joint Committee decided in the afternoon session on October 3 to include a standard on stakeholder identification.

Other topics that should receive greater attention, either as part of an existing standard or as separate standards, include:

  1. Legal Viability/Appeal Process
  2. Marking/rating constructed response assessments, performance assessments, and portfolio assessments
  3. Peer and self assessment of small group work
  4. Making and recording observations
  5. Evaluation of student attitudes
  6. Computerized testing, and the use of computerized data files
  7. Combining scores for grading purposes
  8. Frames of reference used to determine grades
  9. Anecdotal reporting, use of portfolios for reporting, student led conferences, computerized reports
  10. Inclusion of special needs students and students for whom English is a second language
  11. Follow-up procedures to improve student learning
  12. Teacher as evaluation instrument
  13. Constructivist evaluation
  14. Payment for results

This list is intended to be neither inclusive nor exhaustive. It is provided with the intent of helping the Joint Committee consider carefully all of the elements of student evaluation at the classroom level so that a comprehensive and complete set of standards and guidelines for sound and valid practice is provided in the final product.

We encourage the Joint Committee to reconsider some of the descriptors to better reflect the intent of the corresponding standards. For example, the descriptor for P1 might be changed to Service Orientation to better reflect the fact that teachers "serve" their students and the students' parents. Teachers establish expected learning outcomes for their students, provide learning opportunities for the students to achieve or acquire the expected learning outcomes, and then evaluate the students to determine whether they have achieved or acquired the expected learning outcomes, with the evaluation reported to the students and their parents. A better descriptor for P6 is Complete and Fair Evaluation, in the sense that a complete and fair evaluation calls for a comprehensive assessment which yields data and information that can be validly interpreted in terms of strengths and weaknesses.

It appears that some of the summary statements of standards may need to be rewritten to remove ambiguity. For example, the summary statements for the following standards were rewritten by at least one member of the panel of writers, suggesting that there may be some ambiguity with the summary statement: P1, P3, P6, U1, U3, U6, U7, F2, and F3.

Other standards appear to overlap, and might be combined. For example, the submissions for U6 and U7 are not distinct. P3 and P5, and A1 and A3, seem to overlap.

Further, some of the illustrative cases seem to illustrate better a standard other than the one for which they were written. For example, Case 2 (Rogien), U5 looks like a good illustration for the proposed Stakeholder Identification Standard; Case 2 (Shastri), U7 is a good example of the need for A12.

B5.   We have a number of concerns regarding the writing of the standards to be included in the Student Evaluation Standards.

  1. There is a clear need for a section defining terms. This can be seen in the different ways the members of the Panel of Writers used the same term. Some writers felt it necessary to define the terms they were using. Terms like testing, assessment, and evaluation were used inconsistently by the members of the panel of writers.

     We agree with the definition of student evaluation adopted by the members of the Joint Committee. This definition should appear in the Introduction to the standards, and be illustrated with two or three examples. Care must be exercised to ensure that this definition is used consistently throughout the entire set of standards.

  2. There is a clear need for a model student evaluation standard that can then be used to guide the further writing and editing of the remaining standards. We noted that the panel of writers was provided a model standard, but that this standard was for program evaluation and not student evaluation. It may be that the inclusion of a program evaluation standard as a model standard led to the lack of consistency across writers as to what the object of the evaluation should be.

     Two standards, U7 and A1, revised by the working groups of the Joint Committee were proposed as model student evaluation standards. However, the reactions of the members of the Joint Committee during the afternoon sessions were such that neither can be considered to be a model standard.

  3. There is a need to be clear on the use of terms like child vs. student, and teacher vs. instructor vs. evaluator.

  4. We believe there is a need to clarify the specifications for preparing the guidelines, common errors, and illustrative cases. For example,

  5. Steps need to be taken to ensure that the references for each standard are a selective, yet sufficient set.

  6. To increase the readability of the standards by the audiences for whom they are intended, we suggest that

    • wherever possible, active rather than passive voice should be used.
    • concrete rather than abstract writing should be used.
    • the writing should be as parsimonious as possible.
    • terms should be used consistently across standards.
    • standards, guidelines, and common errors should be cross referenced as was done in the program and personnel evaluation standards.
    • the number of positive illustrations should be greater than the number of negative illustrations.

B6.   We are concerned about the representation of the individuals and groups who will be involved in developing and validating the Student Evaluation Standards. This concern was raised by the Validation Panels for the Personnel Evaluation Standards and the revised Program Evaluation Standards, seemingly after the fact. We strongly encourage the Joint Committee to become proactive in ensuring adequate representation within the Joint Committee, the national and international panels, and the field tests. Each of the stakeholder groups identified for the Student Evaluation Standards should be adequately represented.

The system for identifying national and international panel members and field testers is limited. One tends to nominate individuals whom one knows or whom the association one represents has identified. Teachers, principals, parents, and students are not likely to be adequately represented using the present approach. We suggest that the present approach be augmented by using specially constituted focus groups. For example, members of the Joint Committee might identify groups of parents and students close to their homes. Graduate students in education (especially part-time students who are likely teaching or in administration full-time) would be relevant representatives of teachers and administrators.

C.   Closing Statement

We are deeply concerned about the low quality of the draft produced at the Joint Committee meeting. First, as suggested earlier, we are concerned about the inferior quality of the standards submitted by the panel of writers. This was the material that the Joint Committee members worked with. Second, the members of the Joint Committee were not clear on the task they were to do: could standards be combined, deleted, completely rewritten? This lack of clarity is understandable given that many of the members were attending their first meeting of the Joint Committee. The orientation provided to them appears not to have been sufficient. Third, unlike the development of the program and personnel standards, there was no opportunity for the members of each working group to see and review thoroughly the standards produced by the other working groups. The depth of this concern is confirmed by the failure of the members of the Joint Committee in attendance on the third afternoon to embrace the two standards presented as model standards. These observations, together with the material presented above, lead us to two recommendations which we wish we did not have to make but which, given the importance and potential positive impact of the Student Evaluation Standards, we know we have to make.

We recommend that

the national and international reviews be delayed until a credible, well-written set of student standards is developed and fully reviewed and approved, in the presence of all members of the Joint Committee, by the Joint Committee.

In our opinion several of the standards need to be greatly revised, if not rewritten. To assist with this matter and to expedite the process, we recommend that

the Joint Committee allow the Chair to assemble a small team of writers with known expertise in student evaluation in the classroom to assist with the preparation of the initial draft standards. The Joint Committee should then review and revise these initial standards to produce and approve the draft to be sent to members of the national and international review panel.