How is a test-item writing activity defined?

An item can have a spoken, written, or visual stimulus, or any combination of the three. Thus, while an item or task may ostensibly assess one modality, it may also be testing another.

The data from item analysis can drive the way in which you design future tests. As noted previously, if student knowledge assessment is the bridge between teaching and learning, then exams ought to measure the student learning gap as accurately as possible.

Reliability      Interpretation
.90 and above    Excellent reliability; at the level of the best standardized tests.
.80 – .90        Very good for a classroom test.
.70 – .80        Good for a classroom test; in the range of most.

Ensure Item Relevancy

ADVANTAGES Large portion of content can be covered. Easy to construct, because it measures simple learning outcomes. It is useful in interpreting diagrams, charts etc. Test length — a test with more items will have a higher reliability, all other things being equal. In summary, «Test Item» is the item to be tested while «Features to be Tested» are the specific aspects of the Test Item that will be evaluated during testing.

If the objective asks the test taker to identify genres of music from the 1990s, and your item asks the test taker to identify different wind instruments, your item does not align with the objective.

Fill in the ____________ questions are featured frequently on exams.

A matching test item features two columns, a numbered column and a lettered column. Students are asked to match the correct answer with the correct stem.

These items are easier to construct than multiple-choice or matching items. They minimize guessing by requiring students to provide an original response rather than select from several alternatives. Read and grade the answers without looking at the students’ names to avoid possible preferential treatment.

Test Item

Each test-item writing activity should be reported for a period of at most 12 months. If the activity lasts longer than 12 months, it should be reported as separate activities.

Furthermore, separate analyses must be requested for different versions of the same exam. Item analysis data are not synonymous with item validity. An external criterion is required to accurately judge the validity of test items.

Coefficient alpha is the general form of the more commonly reported KR-20 and can be applied to tests composed of items with different numbers of points given for different response alternatives. When coefficient alpha is applied to tests in which each item has only one correct answer and all correct answers are worth the same number of points, the resulting coefficient is identical to KR-20. A basic assumption made by ScorePak® is that the test under analysis is composed of items measuring a single subject area or underlying ability.
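The relationship between coefficient alpha and KR-20 can be checked numerically. The following is a minimal sketch, not ScorePak®'s implementation: the function names and the response matrix are hypothetical, and population variance is used throughout so that the variance of a 0/1 item equals p(1 - p).

```python
from statistics import pvariance


def coefficient_alpha(scores):
    """Cronbach's alpha for rows of item scores (one row per examinee)."""
    k = len(scores[0])
    item_vars = [pvariance([row[i] for row in scores]) for i in range(k)]
    total_var = pvariance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def kr20(scores):
    """KR-20 for dichotomous (0/1) items, using p * (1 - p) per item."""
    n = len(scores)
    k = len(scores[0])
    p = [sum(row[i] for row in scores) / n for i in range(k)]
    sum_pq = sum(pi * (1 - pi) for pi in p)
    total_var = pvariance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum_pq / total_var)


# Hypothetical data: 6 examinees, 4 items scored right (1) or wrong (0).
responses = [
    [1, 1, 1, 0],
    [1, 0, 1, 0],
    [0, 0, 1, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [1, 1, 0, 1],
]

alpha = coefficient_alpha(responses)
kr = kr20(responses)
print(f"alpha = {alpha:.4f}, KR-20 = {kr:.4f}")
```

For items with weighted alternatives, only `coefficient_alpha` applies, since the p(1 - p) shortcut in `kr20` holds only for 0/1 scoring.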

Test Item Specifications

For example, a test taker who wants to become a chef may be asked to prepare a specific dish to show that they can execute it properly.

MQC is the acronym for “minimally qualified candidate.” The MQC is a conceptualization of the assessment candidate who possesses the minimum knowledge, skills, experience, and competence to just meet the expectations of a credentialed individual. If the credential is entry level, the expectations of the MQC will be lower than if the credential is designated at an intermediate or expert level. Think of an ability continuum that runs from low ability to high ability. Somewhere along that continuum, a cut point will be set.

  • For each option, the test taker chooses “yes” or “no.” When the question is answered correctly or incorrectly, the next question is presented.
  • In a complex system, there may be multiple levels of components and sub-systems that are integrated and tested at various levels.
  • You can disable test items to temporarily exclude them from the run by clearing the check box next to them.
  • This test should not contribute heavily to the course grade, and it needs revision. The measure of reliability used by ScorePak® is Cronbach’s Alpha.
  • In TestComplete projects, a test item can represent a single test case, or just part of a testing procedure, or even an auxiliary procedure.

Biserial correlation coefficients are computed to determine whether the attribute or attributes measured by the criterion are also measured by the item, and the extent to which the item measures them. The rbis gives an estimate of the well-known Pearson product-moment correlation between the criterion score and the hypothesized item continuum when the item is dichotomized into right and wrong. Ebel and Frisbie state that the rbis simply describes the relationship between scores on a test item (e.g., “0” or “1”) and scores (e.g., “0”, “1”, …, “50”) on the total test for all examinees.

For example, many teachers may think that the minimum score on a test consisting of 100 items with four alternatives each is 0, when in actuality the theoretical floor on such a test is 25, because a student guessing at random is expected to answer one item in four correctly.
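The chance floor mentioned above follows from simple arithmetic: the number of items divided by the number of alternatives per item. A small sketch, with hypothetical function names and a simulation added only as a sanity check, assuming every item has the same number of equally likely alternatives:

```python
import random


def expected_chance_score(n_items, n_alternatives):
    """Expected number correct for a student who guesses on every item."""
    return n_items / n_alternatives


def simulate_guessing(n_items, n_alternatives, n_students, seed=0):
    """Average score of simulated students who guess at random.

    Alternative 0 is treated as the keyed answer for every item
    (an arbitrary choice for the simulation).
    """
    rng = random.Random(seed)
    scores = [
        sum(rng.randrange(n_alternatives) == 0 for _ in range(n_items))
        for _ in range(n_students)
    ]
    return sum(scores) / n_students


floor = expected_chance_score(100, 4)
simulated = simulate_guessing(100, 4, n_students=2000)
print(f"expected chance score: {floor}")
print(f"simulated mean score:  {simulated:.2f}")
```

Individual guessers will still score above or below this value; the floor describes the expected score, not a hard minimum.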

This information should be looked at in conjunction with the discrimination index; higher total test scores should be obtained by students choosing the correct, or most highly weighted, alternative. Incorrect alternatives with relatively high means should be examined to determine why “better” students chose that particular alternative. Item discrimination indices must always be interpreted in the context of the type of test being analyzed. Items with low discrimination indices are often ambiguously worded and should be examined. Items with negative indices should be examined to determine why a negative value was obtained. For example, a negative value may indicate that the item was mis-keyed, so that students who knew the material tended to choose an unkeyed, but correct, response option.

The item discrimination index provided by ScorePak® is a Pearson product-moment correlation between student responses to a particular item and total scores on all other items on the test. This index is the equivalent of a point-biserial coefficient in this application. It provides an estimate of the degree to which an individual item is measuring the same thing as the rest of the items.

Short questions and answers are typically asked at the school and college level, because universities usually set comprehensive questions.
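The item discrimination index and the per-alternative means described earlier can be sketched in a few lines. This is an illustration under stated assumptions, not ScorePak®'s code: all function names and data are hypothetical, every item and rest score is assumed to have non-zero variance, and at least one student is assumed to have chosen the keyed alternative.

```python
from collections import defaultdict
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)  # assumes neither variance is zero


def item_discrimination(scores, item):
    """Correlate one item's scores with the total on all other items."""
    item_scores = [row[item] for row in scores]
    rest_scores = [sum(row) - row[item] for row in scores]
    return pearson(item_scores, rest_scores)


def mean_total_by_alternative(choices, totals):
    """Mean total test score of the students who chose each alternative."""
    groups = defaultdict(list)
    for choice, total in zip(choices, totals):
        groups[choice].append(total)
    return {alt: sum(v) / len(v) for alt, v in groups.items()}


def distractors_to_review(choices, totals, key):
    """Incorrect alternatives whose mean total is at least the key's mean."""
    means = mean_total_by_alternative(choices, totals)
    return sorted(a for a, m in means.items() if a != key and m >= means[key])


# Hypothetical 0/1 score matrix: 6 students, 5 items.
scores = [
    [1, 1, 1, 1, 0],
    [1, 1, 1, 0, 0],
    [1, 1, 0, 1, 0],
    [1, 0, 1, 0, 1],
    [0, 1, 0, 0, 1],
    [0, 0, 0, 0, 1],
]
indices = [item_discrimination(scores, i) for i in range(5)]
print([round(r, 2) for r in indices])

# Hypothetical responses to one item keyed "A", with total test scores.
choices = ["A", "B", "B", "C", "A", "B", "D", "C"]
totals = [42, 30, 28, 45, 40, 25, 20, 47]
print(mean_total_by_alternative(choices, totals))
print(distractors_to_review(choices, totals, key="A"))
```

In this made-up data the last item correlates negatively with the rest of the test, and alternative "C" attracts students with higher totals than the keyed answer; both are the kind of pattern the text says should prompt a check for mis-keying or ambiguous wording.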

