Point biserial correlation

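The point biserial correlation measures item reliability: how well a single test question separates students who did well on the test from those who did not.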

How? It correlates student scores on one particular question with their scores on the test as a whole.

The driving assumption is simple: students who score well on the test as a whole should, on average, score well on the question under review, and students who struggle on the test as a whole should, on average, struggle on that question. If a question deviates from this assumption (making it a "suspect" question), the point biserial correlation lets us know.
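To make the idea concrete, here is a minimal sketch of the calculation, assuming each question is scored 1 (correct) or 0 (incorrect) and that the point biserial is computed as the ordinary Pearson correlation between those item scores and the total test scores. The function name point_biserial and the six-student data set are hypothetical, and this is not necessarily how EAC computes the statistic (for example, some tools exclude the question itself from the total).

```python
from math import sqrt

def point_biserial(item_scores, total_scores):
    """Pearson correlation between 0/1 item scores and total test scores."""
    n = len(item_scores)
    if n != len(total_scores) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 students")
    mean_item = sum(item_scores) / n
    mean_total = sum(total_scores) / n
    # Sums of cross-products and squared deviations from the means
    cov = sum((i - mean_item) * (t - mean_total)
              for i, t in zip(item_scores, total_scores))
    ss_item = sum((i - mean_item) ** 2 for i in item_scores)
    ss_total = sum((t - mean_total) ** 2 for t in total_scores)
    if ss_item == 0 or ss_total == 0:
        # Assumption: report 0.0 when either score has no variance
        return 0.0
    return cov / sqrt(ss_item * ss_total)

# Hypothetical data: 1 = answered the question correctly, 0 = missed it
item = [1, 1, 1, 0, 0, 0]
totals = [95, 88, 80, 70, 62, 55]
r = point_biserial(item, totals)
print(round(r, 2))  # about 0.90 for this made-up data
```

In this made-up example the three highest scorers got the question right and the three lowest scorers missed it, so the correlation is strongly positive.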

The point biserial correlation ranges from a low of -1.0 to a high of +1.0.

The closer the point biserial correlation is to +1.0, the more reliable the question is considered, because it discriminates well between students who mastered the test material and those who did not.

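A point biserial correlation of 0.0 means the question didn't discriminate at all. Imagine a test where all 20 students answered Question 1 correctly. Since every student got it right, Question 1 cannot distinguish among students however they performed on the rest of the test, so its point biserial correlation of 0.0 makes perfect sense. (Strictly speaking, a correlation cannot be calculated when every student has the same score on a question, so 0.0 in this case is a reporting convention.)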
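The Question 1 scenario can be checked numerically. The sketch below uses made-up total scores and assumes the point biserial is computed as a Pearson correlation, whose denominator is built from the variance of the item scores and the variance of the total scores.

```python
from statistics import pvariance

# Hypothetical data: all 20 students answered Question 1 correctly (1 = correct)
question_1 = [1] * 20
total_scores = [52 + 2 * k for k in range(20)]  # made-up totals from 52 to 90

item_variance = pvariance(question_1)     # no spread at all in the item scores
total_variance = pvariance(total_scores)  # the totals do vary

# Pearson r divides by sqrt(item_variance * total_variance)
denominator = (item_variance * total_variance) ** 0.5
print(item_variance, denominator)
```

Because the item variance is zero, the denominator is zero as well, which is why the question carries no information about how students differ.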

A negative point biserial correlation means that students who performed well on the test as a whole tended to miss the question under review, while students who didn't perform as well on the test as a whole tended to get it right. It's a red flag, and there are several things to check. Is the answer key correct? Is the question clearly worded? If it's multiple choice, are the choices too similar?
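One of the checks above, an incorrect answer key, can be illustrated with a small sketch. The responses, total scores, and helper names (pearson, item_scores) are all hypothetical, and the total scores are held fixed for simplicity, which a real re-scoring would not do.

```python
from math import sqrt

def pearson(x, y):
    """Plain Pearson correlation; assumes both lists have some variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) *
                      sum((b - my) ** 2 for b in y))

# Hypothetical responses to one multiple-choice question, with total scores
responses = ["B", "B", "B", "C", "C", "B", "C", "C"]
totals = [92, 88, 85, 60, 58, 81, 65, 55]

def item_scores(key):
    """Score each response 1 if it matches the answer key, else 0."""
    return [1 if r == key else 0 for r in responses]

r_wrong_key = pearson(item_scores("C"), totals)  # key mistakenly set to C
r_right_key = pearson(item_scores("B"), totals)  # key corrected to B
print(r_wrong_key < 0, r_right_key > 0)
```

With the wrong key, the strongest students appear to miss the question and the correlation comes out negative; correcting the key flips the sign.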

EAC suggestion: For high-stakes exams intended to distinguish students who mastered the material from those who did not, shoot for questions with point biserial correlations greater than +0.30. They're considered very good items. Questions with point biserial correlations between +0.09 and +0.30 are considered acceptable to reasonably good. Questions with point biserial correlations less than +0.09 are considered poor.
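These cutoffs can be written as a small lookup. The function name classify_item is hypothetical, and treating values of exactly +0.09 or +0.30 as "acceptable to reasonably good" is an assumption, since the suggestion does not say which band the boundary values belong to.

```python
def classify_item(r_pb):
    """Label a question using the suggested point biserial cutoffs."""
    if r_pb > 0.30:
        return "very good"
    if r_pb >= 0.09:
        # Assumption: the boundary values 0.09 and 0.30 fall in this band
        return "acceptable to reasonably good"
    return "poor"  # includes negative values, which deserve a closer look

for value in (0.45, 0.20, 0.05, -0.30):
    print(value, classify_item(value))
```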
