Does this make sense? Flor-i-duh?
Here is how Florida decides on the level of question difficulty for its high-stakes testing. They actually create questions they think fewer than 40% of the students will get correct. Later they use the actual results for each question to figure out its “difficulty” – despite the fact that a different group of students takes the test each year. Let’s design questions so confusing that we know fewer than 40% will get them correct! Brilliant.
The difficulty of FCAT 2.0 items is initially estimated by committees of educators participating in Item Content Review meetings each year. As each test item is reviewed, committee members make a prediction of difficulty based upon their knowledge of student performance at the given grade level. The classification scheme used for this prediction of item difficulty is based on the following:
More than 70 percent of the students are likely to respond correctly.
Between 40 percent and 70 percent of the students are likely to respond correctly.
Less than 40 percent of the students are likely to respond correctly.
After an item appears on a test, item difficulty refers to the actual percentage of students who chose the correct answer.
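To make the arithmetic concrete, here is a rough sketch of the two steps described above: computing the actual percentage of students who chose the correct answer, then placing that percentage in one of the three bands. The function names and band strings are my own, not anything from the state's documentation, and I am assuming that exactly 40 and exactly 70 percent fall in the middle band, since the quoted wording says "more than 70" and "less than 40".

```python
def percent_correct(responses, correct_answer):
    """Actual item difficulty: percentage of students who chose the correct answer."""
    if not responses:
        raise ValueError("need at least one response")
    hits = sum(1 for r in responses if r == correct_answer)
    return 100.0 * hits / len(responses)


def difficulty_band(pct):
    """Place a percent-correct value in one of the three quoted bands.

    Assumption: exactly 40 and exactly 70 count as the middle band.
    """
    if pct > 70:
        return "more than 70 percent"
    if pct < 40:
        return "less than 40 percent"
    return "between 40 and 70 percent"


# Hypothetical responses from ten students to one multiple-choice item.
answers = ["A", "C", "A", "B", "D", "A", "C", "B", "D", "C"]
pct = percent_correct(answers, "A")
band = difficulty_band(pct)
```

Note that the same item could land in a different band the next year simply because a different group of students answered it.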