
Standardized test

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by Chris53516 (talk | contribs) at 17:12, 10 October 2006 (rv - some of these edits appear to violate copyright law. content for first paragraph under "Scoring" was cut and pasted from somewhere). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

A standardized test is a test administered and scored in a standard manner. The tests are designed in such a way that the "questions, conditions for administering, scoring procedures, and interpretations are consistent" (Sylvan Learning, 2006[1]) and are "administered and scored in a predetermined, standard manner" (Popham, 1999[2]).

Design

In practice, standardized tests can be composed of multiple-choice and true-false questions, as well as short-answer or essay components that are scored by independent evaluators. Human graders of the written portions use rubrics (rules or guidelines) and anchor papers (example papers for each possible score) to determine the grade given to a response.

Score reference

There are two types of standardized tests: norm-referenced tests and criterion-referenced tests,[1] resulting in a norm-referenced score or a criterion-referenced score, respectively. Norm-referenced scores compare test-takers to a sample of peers. Criterion-referenced scores compare test-takers to a criterion, and may also be described as standards-based assessment as they are aligned with the standards-based education reform movement.[3] Norm-referenced tests are associated with traditional education, which measures success by rank ordering students, while standards-based assessments are based on the egalitarian belief that all students can succeed if they are assessed against high standards which are required of all students regardless of ability or economic background.[citation needed]
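The difference between the two score types can be illustrated with a minimal sketch. The sample scores, the cut score, and the function names below are hypothetical and chosen only for illustration; real testing programs use more elaborate norming and standard-setting procedures.

```python
from bisect import bisect_left


def percentile_rank(score, norm_sample):
    """Norm-referenced: percentage of the peer sample scoring strictly below `score`."""
    ordered = sorted(norm_sample)
    below = bisect_left(ordered, score)  # number of peers with a lower score
    return 100.0 * below / len(ordered)


def meets_criterion(score, cut_score):
    """Criterion-referenced: compare the score to a fixed standard, ignoring peers."""
    return score >= cut_score


# Hypothetical data: raw scores of a peer sample and an illustrative cut score.
norm_sample = [52, 61, 64, 70, 73, 75, 78, 81, 88, 94]
cut_score = 80

score = 78
print(percentile_rank(score, norm_sample))  # 60.0
print(meets_criterion(score, cut_score))    # False
```

In this sketch the same raw score of 78 places the test-taker above 60 percent of the sample, yet falls short of the assumed standard of 80, which shows that the two interpretations can diverge.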

History

The earliest evidence of standardized testing based on merit comes from China during the Han dynasty. The concept of a state ruled by men of ability and virtue was an outgrowth of Confucian philosophy. The imperial examinations covered the so-called Six Arts, which included music, archery and horsemanship, arithmetic, writing, and knowledge of the rituals and ceremonies of both public and private life. Later, the Five Studies were added to the testing (military strategies, civil law, revenue and taxation, agriculture and geography).[citation needed]

United States

The first large-scale use of the IQ test in the US was during World War I (circa 1914-18). The Educational Testing Service (ETS), established in 1948, is the world's largest private educational testing and measurement organization, operating on an annual budget of approximately $900 million.

The Elementary and Secondary Education Act, as reauthorized in 1994, requires standardized testing in public schools. US Public Law 107-110, known as the No Child Left Behind Act of 2001, further ties public school funding to standardized testing.

The US educational system judges the academic qualifications of applicants by their results on standardized tests, including standardized college and graduate-school entrance tests.

Standards

Validity and reliability are typically viewed as essential for determining the quality of any standardized test. However, professional and practitioner associations have frequently placed these concerns within broader contexts when developing standards and judging the overall quality of a standardized test in a given setting.

Evaluation standards

In the field of evaluation, and in particular educational evaluation, the Joint Committee on Standards for Educational Evaluation [4] has published three sets of standards for evaluations. The Personnel Evaluation Standards [5] was published in 1988, The Program Evaluation Standards (2nd edition) [6] was published in 1994, and The Student Evaluation Standards [7] was published in 2003.

Each publication presents and elaborates a set of standards for use in a variety of educational settings. The standards provide guidelines for designing, implementing, assessing and improving the identified form of evaluation. Each of the standards has been placed in one of four fundamental categories to promote educational evaluations that are proper, useful, feasible, and accurate. In these sets of standards, validity and reliability considerations are covered under the accuracy topic. For example, the student accuracy standards help ensure that student evaluations will provide sound, accurate, and credible information about student learning and performance.

Testing standards

In the field of psychometrics, the Standards for Educational and Psychological Testing [8] place standards about validity and reliability, along with errors of measurement and related considerations under the general topic of test construction, evaluation and documentation. The second major topic covers standards related to fairness in testing, including fairness in testing and test use, the rights and responsibilities of test takers, testing individuals of diverse linguistic backgrounds, and testing individuals with disabilities. The third and final major topic covers standards related to testing applications, including the responsibilities of test users, psychological testing and assessment, educational testing and assessment, testing in employment and credentialing, plus testing in program evaluation and public policy.

Advantages

One of the main advantages of standardized testing is that it is able to provide assessments that are psychometrically valid and reliable, as well as results which are generalizable and replicable.

Another advantage is aggregation. A well designed standardized test provides an assessment of an individual's mastery of a domain of knowledge or skill which at some level of aggregation will provide useful information. That is, while individual assessments may not be accurate enough for practical purposes, the mean scores of classes, schools, branches of a company, or other groups may well provide useful information because of the reduction of error accomplished by increasing the sample size.
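The reduction of error through aggregation can be sketched numerically. The example below assumes that each individual score carries an independent, zero-mean measurement error with the same standard deviation; under that assumption the error of a group mean shrinks in proportion to the square root of the group size. The function names and the error value of 10 points are hypothetical.

```python
import math
import random
import statistics


def standard_error_of_mean(error_sd, group_size):
    """Theoretical error of a group mean, assuming independent errors of equal SD."""
    return error_sd / math.sqrt(group_size)


def simulated_mean_error(error_sd, group_size, trials=2000, seed=0):
    """Empirical spread of the group-mean error over many simulated groups."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    mean_errors = []
    for _ in range(trials):
        errors = [rng.gauss(0.0, error_sd) for _ in range(group_size)]
        mean_errors.append(statistics.fmean(errors))
    return statistics.pstdev(mean_errors)


# Hypothetical measurement error of 10 score points per individual.
print(standard_error_of_mean(10, 1))    # 10.0
print(standard_error_of_mean(10, 25))   # 2.0
print(standard_error_of_mean(10, 100))  # 1.0
print(simulated_mean_error(10, 25))     # close to the theoretical 2.0
```

Under these assumptions, a score that is uncertain by 10 points for one test-taker yields a class mean uncertain by only about 2 points for a class of 25. The sketch does not cover systematic error such as test bias, which averaging does not remove.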

While standardized tests are often criticized as unfair, the psychometric standards applied in the development of standardized tests would produce fairer testing if applied in other types of testing. In particular, the effectiveness of each test item in accomplishing the goal of the test would have to be demonstrated.

References

  1. ^ a b Sylvan Learning glossary
  2. ^ Popham, J. (1999). Why standardized tests don’t measure educational quality. Educational Leadership, 56(6), 8-15.
  3. ^ Where We Stand: Standards-Based Assessment and Accountability (American Federation of Teachers)
  4. ^ Joint Committee on Standards for Educational Evaluation
  5. ^ Joint Committee on Standards for Educational Evaluation. (1988). The Personnel Evaluation Standards: How to Assess Systems for Evaluating Educators. Newbury Park, CA: Sage Publications.
  6. ^ Joint Committee on Standards for Educational Evaluation. (1994). The Program Evaluation Standards, 2nd Edition. Newbury Park, CA: Sage Publications.
  7. ^ Joint Committee on Standards for Educational Evaluation. (2003). The Student Evaluation Standards: How to Improve Evaluations of Students. Newbury Park, CA: Corwin Press.
  8. ^ The Standards for Educational and Psychological Testing
