8.8 Standard Setting Terminology

Standard Setting is the process used to determine exam outcomes for candidates. risr/assess includes several common Standard Setting methods, each of which can be customised; it also allows you to use custom Standard Setting methods of your own.

Supported Standard Setting Methods 

Angoff

Each item in the item bank can be assigned an Angoff percentage, representing the percentage of minimally competent (borderline) candidates* that would be expected to know the correct answer to the item.
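As an illustration (a minimal sketch, not risr/assess's internal implementation), an Angoff cut score is commonly derived by averaging the judges' percentages for each item and then averaging across items. The item names and ratings below are hypothetical:

```python
# Hypothetical judge ratings per item: each value is one judge's estimate (%)
# of borderline candidates expected to answer the item correctly.
angoff_ratings = {
    "item_1": [60, 70, 65],
    "item_2": [40, 50, 45],
    "item_3": [80, 75, 85],
}

def angoff_cut_score(ratings):
    """Mean of the per-item mean ratings: the expected percentage score
    of a minimally competent candidate."""
    item_means = [sum(r) / len(r) for r in ratings.values()]
    return sum(item_means) / len(item_means)

cut = angoff_cut_score(angoff_ratings)  # about 63.3% for this data
```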

Ebel

risr/assess has full support for Ebel standard setting, allowing each exam paper to be standard set by a group of invited judges. The Ebel matrix can be customised; each judge inputs a difficulty and a relevance for each item, and the Ebel score is calculated automatically for each item in the exam. Ebel score histories are kept for each item in the item bank, allowing Ebel scoring trends to be tracked separately across different exams and exam papers.
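As a rough sketch of the calculation (the matrix weights, categories and item names below are hypothetical, and the real matrix is customisable): each judge classifies every item by difficulty and relevance, the matrix maps each classification to an expected percentage correct for a borderline candidate, and the judges' weights are averaged per item.

```python
# Hypothetical Ebel matrix: expected % correct for a borderline candidate,
# indexed by (difficulty, relevance).
EBEL_MATRIX = {
    ("easy", "essential"): 90,   ("easy", "important"): 80,   ("easy", "supplementary"): 70,
    ("medium", "essential"): 70, ("medium", "important"): 60, ("medium", "supplementary"): 50,
    ("hard", "essential"): 50,   ("hard", "important"): 40,   ("hard", "supplementary"): 30,
}

# Each judge's (difficulty, relevance) classification for each item.
judgements = {
    "item_1": [("easy", "essential"), ("medium", "essential")],
    "item_2": [("hard", "important"), ("hard", "important")],
}

def ebel_item_score(classifications):
    """Mean matrix weight over the judges' classifications of one item."""
    weights = [EBEL_MATRIX[c] for c in classifications]
    return sum(weights) / len(weights)

item_scores = {item: ebel_item_score(cs) for item, cs in judgements.items()}
exam_cut = sum(item_scores.values()) / len(item_scores)
```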

Borderline Group

The borderline group method applies only to examiner-marked exams. For each item answered by the candidate, the examiner indicates a global mark, which can have multiple levels but at minimum pass/borderline/fail. The pass mark is determined by analysing the scores of the borderline group of candidates.
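A common form of this analysis (sketched below with hypothetical data, not necessarily risr/assess's exact calculation) takes the mean score of the candidates who received a borderline global mark as the pass mark:

```python
# Hypothetical results: (candidate, station score, examiner's global mark).
results = [
    ("c1", 72, "pass"),
    ("c2", 55, "borderline"),
    ("c3", 49, "borderline"),
    ("c4", 61, "borderline"),
    ("c5", 38, "fail"),
]

def borderline_group_cut(results):
    """Pass mark = mean score of the candidates rated borderline."""
    scores = [score for _, score, grade in results if grade == "borderline"]
    return sum(scores) / len(scores)
```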

Borderline Regression

Similar to the borderline group method, the examiner specifies a global mark for each item answered by the candidate, with at minimum pass/borderline/fail levels. The pass mark is determined by constructing a regression line through all of the groups.
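In the usual formulation (sketched below with a hypothetical numeric grade scale and data), scores are regressed on the global marks, and the cut score is the score the fitted line predicts at the borderline grade:

```python
# Hypothetical mapping of global marks to a numeric scale.
GRADE_SCALE = {"fail": 0, "borderline": 1, "pass": 2}

# Hypothetical (global mark, station score) observations.
observations = [
    ("fail", 30), ("fail", 35),
    ("borderline", 50), ("borderline", 55),
    ("pass", 70), ("pass", 75),
]

def borderline_regression_cut(observations):
    """Least-squares line of score on grade; the cut score is the
    predicted score at the 'borderline' grade."""
    xs = [GRADE_SCALE[g] for g, _ in observations]
    ys = [s for _, s in observations]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    intercept = my - slope * mx
    return slope * GRADE_SCALE["borderline"] + intercept
```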

McManus Borderline Regression

Similar to borderline regression, the examiner specifies a global mark for each item answered by the candidate, with at minimum pass/borderline/fail levels. The pass mark is determined by constructing the regression line through all of the groups and combining it with a negative confidence interval.
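One plausible reading of "combining with a negative confidence interval" is to lower the regression-derived cut score by a multiple of its standard error; the function below is a hypothetical sketch of that adjustment, not risr/assess's exact calculation:

```python
def mcmanus_adjusted_cut(regression_cut, standard_error, z=1.96):
    """Subtract z standard errors from the regression cut score,
    giving candidates near the cut the benefit of the doubt."""
    return regression_cut - z * standard_error

# e.g. a regression cut of 52.5 with a (hypothetical) standard error of 2.0
adjusted = mcmanus_adjusted_cut(52.5, 2.0)  # about 48.58
```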

Related terms

Cronbach’s Alpha

For each of the standard setting methods the Cronbach’s Alpha reliability metric is also calculated for the exam. This is given for the whole exam as well as what it would be if each item in turn were omitted from the analysis. This allows items that are lowering the reliability of the exam to be excluded from the results.
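The calculation can be sketched as follows (hypothetical dichotomous item data; population variances are used, though sample variances give the same alpha because the scaling factors cancel):

```python
from statistics import pvariance

def cronbach_alpha(item_scores):
    """item_scores: one list per item, aligned across candidates.
    alpha = k/(k-1) * (1 - sum of item variances / variance of totals)."""
    k = len(item_scores)
    totals = [sum(vals) for vals in zip(*item_scores)]
    sum_item_var = sum(pvariance(vals) for vals in item_scores)
    return k / (k - 1) * (1 - sum_item_var / pvariance(totals))

def alpha_if_item_deleted(item_scores):
    """Alpha recomputed with each item omitted in turn, to spot items
    that lower the reliability of the exam."""
    return [cronbach_alpha(item_scores[:i] + item_scores[i + 1:])
            for i in range(len(item_scores))]

# Hypothetical 0/1 item scores for four candidates
items = [
    [1, 0, 1, 1],
    [1, 0, 1, 0],
    [1, 1, 1, 0],
]
alpha = cronbach_alpha(items)  # 0.5625 for this data
```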

Standard Error of Measurement

The Standard Error of Measurement (not to be confused with the Standard Error of the Mean) gives an indication of the spread of the measurement errors when estimating candidates' true scores from their observed scores. It is calculated from the reliability coefficient (risr/assess uses Cronbach's alpha). It is assumed that the sampling errors are normally distributed.

The SEM is calculated as

SEM = S × √(1 − rxx)

where S is the standard deviation of the exam scores and rxx is the reliability coefficient (Cronbach's alpha).

The key application of the SEM in risr/assess is to apply a confidence interval to the cut score. Candidates within 1 SEM of the cut score may fluctuate to the other side of the cut score should they take the exam again, so allowing a margin of 1 SEM gives roughly 68% confidence in the pass/fail decision. If you want to be 95% sure of your decision on outcomes, an SEM multiplier of 1.96 can be applied instead. These figures are based on the normal distribution. risr/assess applies the interval on the positive side for most Standard Setting methods, as these are competency exams: in practice, this means you are 95% certain that the passing candidates' scores represent their true scores.
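Putting the formula and the multiplier together (the standard deviation, alpha and cut score below are hypothetical):

```python
from math import sqrt

def standard_error_of_measurement(sd, reliability):
    """SEM = S * sqrt(1 - r_xx)."""
    return sd * sqrt(1 - reliability)

# Hypothetical exam: score SD of 10, Cronbach's alpha of 0.84
sem = standard_error_of_measurement(10, 0.84)  # about 4.0
cut_score = 60
# 95% confidence band around the cut score (1.96 SEM either side)
lower, upper = cut_score - 1.96 * sem, cut_score + 1.96 * sem
```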

Score Normalisation

It is possible to normalise the marks for each item in an exam to a consistent maximum. You may want to do this so that each item in an exam is equally weighted, even when the items carry different numbers of marks. Note that the normalisation is on an item basis, not an exam basis. It is possible to revert to non-normalised scores.
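The rescaling amounts to the following (a sketch with a hypothetical target maximum of 10, not the product's actual implementation):

```python
def normalise_item_score(raw, item_max, target_max=10):
    """Rescale a raw item score so every item is marked out of target_max."""
    return raw / item_max * target_max

# Hypothetical items marked out of 10, 5 and 20 respectively
raw_results = [(7, 10), (3, 5), (12, 20)]
normalised = [normalise_item_score(raw, mx) for raw, mx in raw_results]
# each item is now scored out of 10, so all items carry equal weight
```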

Minimum Items to pass

On the standard setting page you can set the minimum number of stations required to pass an exam. Enter the number and click "Apply"; this recalculates the number of passes and fails based on this additional criterion together with the total score. If you wish to pass candidates on stations alone, set the "passing score" to zero.
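The combined decision logic can be sketched like this (hypothetical function and figures, not the product's internal code):

```python
def exam_outcome(total_score, stations_passed, passing_score, min_stations):
    """A candidate passes only if both the total-score criterion and the
    minimum-stations criterion are met."""
    return total_score >= passing_score and stations_passed >= min_stations

# With a passing score of 0, only the station criterion applies
stations_only = exam_outcome(65, 10, passing_score=0, min_stations=10)  # True
```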

Candidates

Candidates are the individuals who are assigned to take, or sit, an exam within risr/assess. The minimum information risr/assess requires for a Candidate is their ID. The ID must be unique; otherwise there is a risk of duplication within risr/assess.

Notes

*It is important that the assessment function has a clear definition of what a borderline or minimally competent candidate is, and whether these are equivalent.