How the Test Results are Interpreted

A common misconception about testing is that scores must be interpreted either relative to other students' scores (norm-referencing) or relative to a fixed criterion, such as content standards (criterion-referencing). These two interpretations are not mutually exclusive, nor do they determine how a test is developed. Raw scores may be criterion-referenced, norm-referenced, or both, to learn different things from the results of the test.

Standardized Achievement Tests
Standardized tests are administered according to publisher-prescribed procedures, including exact directions, time limits, and scoring criteria. These procedures ensure that testing conditions are the same for all students. Results of standardized tests can help teachers address student needs by showing where specific instruction may be needed. From these results, teachers can develop programs that help individual students use their existing skills and knowledge effectively and build the skills and knowledge they lack.

Raw Scores
A raw score is the actual result of the test: a score of x out of y means the test taker answered x questions correctly out of a total of y questions.

Norm-Referenced Scores
To compare a student’s test performance against that of a “norm group,” raw scores from a standardized test are converted to a uniform ranking. A percentile, which indicates the percentage of scores in the norm group that fall below a given score, is a norm-referenced score.
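
The percentile definition above can be sketched in a few lines of Python. The function name `percentile_rank` and the sample norm-group scores are hypothetical, made up for illustration; published tests rely on the publisher's norm tables rather than a direct calculation like this one.

```python
def percentile_rank(score, norm_scores):
    """Percentage of norm-group scores that fall below `score`."""
    below = sum(1 for s in norm_scores if s < score)
    return 100.0 * below / len(norm_scores)


# Hypothetical raw scores (number correct out of 50) for a tiny norm group.
norm_group = [18, 22, 25, 27, 30, 31, 33, 36, 40, 44]

raw_score = 37
print("percentile rank:", percentile_rank(raw_score, norm_group))
print("percent correct:", 100.0 * raw_score / 50)
```

In this made-up group, the percentile rank and the percent correct for the same raw score come out as different numbers, which is why the two should not be confused.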

The California Achievement Test, Sixth Edition (CAT/6) is a norm-referenced test. Data is first collected from a sample group (the “norm group”) with defined characteristics. An individual student’s test performance, or raw score, is then translated into scores based on both the normative sample and the scoring method used.

CAT/6 reports scaled scores, percentile ranks, normal curve equivalents, and anticipated achievement. Student performance can be compared against national, state, or local scores.

Individual Percentile Rank Scores
Percentile rank scores range from 1 to 99.9, with a score of 50 denoting average performance. Percentile rank scores are useful for indicating a student’s relative standing compared to other students in the same grade who took the test at a comparable time of year.

Percentile rank scores can be used to compare an individual student’s test performance across subtests. A percentile rank of 80 does not indicate that the student answered 80 percent of the test questions correctly.

  • National Percentile indicates the percentage of students in a norm group whose scores fall below a given student’s scaled score. For example, if a student score converts to a national percentile rank of 75, the student scored higher than approximately 75 percent of the students in the national norm group.
  • Local Percentile is a comparison of students in the same grade within an individual school or district. If a student has a local percentile rank of 85, that means the student scored higher than 85 percent of the students in the local group. When a local percentile is reported, half the students will score below the 50th percentile and half will score above the 50th percentile.

Stanine Scores
"Stanine," or Standard Nine, scores were developed to address a shortcoming of percentile rank scores, namely their unequal intervals. Scores are converted into nine “approximately equal” units so that they can be readily compared. They are normed and grouped as follows:
  • 1 to 3 – below average
  • 4 to 6 – average
  • 7 to 9 – above average

Stanines are less precise than percentile rank scores, but they organize students' scores into nine approximately equal units.
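
As a rough illustration, a percentile rank can be mapped to a stanine and then to one of the three bands listed above. The cut points used here (4, 11, 23, 40, 60, 77, 89, 96) are the conventional normal-curve stanine boundaries, which is an assumption on our part; the source does not give the CAT/6 conversion, and the function names are hypothetical.

```python
from bisect import bisect_right

# Conventional percentile cut points between stanines 1|2, 2|3, ..., 8|9.
# Assumption: taken from the usual normal-curve stanine definition,
# not from CAT/6 documentation.
STANINE_CUTS = [4, 11, 23, 40, 60, 77, 89, 96]


def stanine_from_percentile(percentile_rank):
    """Map a percentile rank to a stanine from 1 to 9."""
    return bisect_right(STANINE_CUTS, percentile_rank) + 1


def stanine_band(stanine):
    """Label a stanine using the three bands listed above."""
    if stanine <= 3:
        return "below average"
    if stanine <= 6:
        return "average"
    return "above average"


for pr in (10, 50, 80):
    s = stanine_from_percentile(pr)
    print(pr, s, stanine_band(s))
```

The loss of precision is visible here: under these cut points, every percentile rank from 40 through 59 collapses into the same stanine.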

Grade Equivalents

The scales for grade equivalents range from 0.09 to 12.9, representing kindergarten through grade 12. A grade level equivalent indicates the school year and month for which the student has displayed typical achievement.

Example: A grade equivalent of 7.6 means that the student’s achievement is at the performance level typical of students in the norm group who have completed the sixth month of grade 7.
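
The grade-and-month reading of a grade equivalent can be shown with a small sketch. The function name `split_grade_equivalent` is hypothetical, and the value is handled as text so that the month digit is not disturbed by floating-point rounding.

```python
def split_grade_equivalent(grade_equivalent):
    """Split a grade equivalent such as "7.6" into (grade, month)."""
    grade_text, month_text = grade_equivalent.split(".")
    return int(grade_text), int(month_text)


grade, month = split_grade_equivalent("7.6")
print(f"Typical of norm-group students in month {month} of grade {grade}")
```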

Grade equivalents have serious drawbacks. They are frequently misinterpreted and should not be used with parents, students, or others lacking a sound statistical foundation.

While grade equivalents represent the typical performance of students tested in a given month within the school year, a grade eight student obtaining a grade equivalent of 10.7 on a language test designed for grade eight students has not necessarily mastered grade ten language. The score only represents what an average grade ten student would achieve if he or she were to take a grade eight test in the seventh month of the school year. While this level of performance may be advanced for many grade eight students, it is not comparable to the achievement of a grade ten student.

Scaled Scores
The scaled score is the basic score for CAT/6. It is used to derive other norm-referenced scores (e.g., national percentiles or grade equivalents) that describe test performance. There are two ways to obtain scaled scores for the content areas tested in CAT/6:

  • Pattern Scoring or Item Response Theory (IRT) – IRT scores are numerically derived using information contained in student responses.
    • IRT takes into account student ability and certain characteristics of test items. The characteristics are called "parameters."
    • The primary benefit of using IRT is that all components of CAT/6 are on the same scale.
    • The scores that result are intended to reflect increasing levels of performance.
  • Number-Correct Scoring (traditional score) – Traditional scoring requires converting the raw test score (the number of correct answers) to a scaled score.
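
To show why pattern scoring can differ from number-correct scoring, here is a minimal sketch that assumes a two-parameter logistic IRT model. The source does not say which IRT model CAT/6 uses, so the model, the item parameters, and the simple grid search are all illustrative assumptions, not the publisher's procedure.

```python
import math

# Hypothetical item parameters: (discrimination a, difficulty b).
ITEMS = [(0.5, -1.0), (1.0, -0.5), (1.5, 0.5), (2.0, 1.0)]


def p_correct(theta, a, b):
    """Two-parameter logistic probability of a correct answer."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def log_likelihood(theta, responses):
    """Log-likelihood of a 0/1 response pattern at ability theta."""
    total = 0.0
    for (a, b), u in zip(ITEMS, responses):
        p = p_correct(theta, a, b)
        total += math.log(p) if u else math.log(1.0 - p)
    return total


def estimate_ability(responses):
    """Grid-search the ability value that maximizes the likelihood."""
    grid = [i / 100.0 for i in range(-400, 401)]
    return max(grid, key=lambda theta: log_likelihood(theta, responses))


# Two students with the same number correct but different patterns.
student_a = [1, 1, 0, 0]
student_b = [0, 0, 1, 1]
print(sum(student_a), estimate_ability(student_a))
print(sum(student_b), estimate_ability(student_b))
```

Both students have the same raw score, so number-correct scoring would treat them identically, while the pattern-based estimate places the student who answered the harder, more discriminating items correctly higher on the ability scale.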

The following chart is helpful when making comparisons using test results. For example, if a student receives a standard score of 115 on the Wechsler IQ scale and a percentile ranking of 50, there is a discrepancy: a standard score of 115 lies one standard deviation above the mean, so one would expect a percentile ranking of about 84. In another example, one could compare a percentile score of 80 with a stanine score of six. The chart can be a useful visual aid for teachers and parents.

 

Example of relationship of standard scoring methods
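
The relationships between these scoring methods can also be approximated numerically. The sketch below assumes scores follow a normal distribution, a standard-score scale with mean 100 and standard deviation 15, and the usual normal curve equivalent formula (50 plus 21.06 times the z-score); the function names are hypothetical and this is not a substitute for a publisher's conversion tables.

```python
from statistics import NormalDist


def percentile_from_standard_score(score, mean=100.0, sd=15.0):
    """Approximate percentile rank for a standard score on a normal scale."""
    return 100.0 * NormalDist(mean, sd).cdf(score)


def nce_from_percentile(percentile_rank):
    """Approximate normal curve equivalent for a percentile rank."""
    z = NormalDist().inv_cdf(percentile_rank / 100.0)
    return 50.0 + 21.06 * z


print(round(percentile_from_standard_score(115)))
print(round(percentile_from_standard_score(100)))
print(round(nce_from_percentile(80), 1))
```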
