Preventing and Exploring Bias in Examinations

For years, the ABP has actively worked to diversify the membership of our committees and subboards. A more diverse group of volunteers will help ensure that pediatric exams are unbiased. But we also know that guarding against implicit (unconscious) bias requires a clear and ongoing prevention and evaluation strategy.

To help prevent implicit bias in exams, the ABP introduced training materials focused on keeping content that may lead to bias out of our exams. These materials have been provided to the pediatrician volunteers who write, review, and approve exam items (questions). In 2020, the ABP also began an additional process to evaluate individual exam items for bias.

Bias Prevention

Bias prevention activities at the ABP take place in three steps. First, the item-writing training module for volunteers who are new to the item-writing process includes a section on checking questions for bias and for language that could be considered offensive or culturally inappropriate. Volunteers are instructed to check each item they write against a series of questions adapted from Hambleton and Rogers.1 The questions help item writers screen for racial favoritism, stereotypes, and inaccessible language.

Next, these instructions are reviewed again when the volunteers meet in person to review and evaluate exam items. Then, the medical editors review the items a final time, again screening for insensitivity or bias.

Item Analysis and Evaluation

To further ensure unbiased exams, the ABP convened and trained a Bias and Sensitivity Review (BSR) Panel in 2020 to analyze exam items in the General Pediatrics Initial Certifying Exam after the exam was administered in the fall. The ABP staff and volunteers wanted to identify exam items on which one racial/ethnic group or one gender performed significantly differently from another, after controlling for overall knowledge.

The BSR Panel consists of 11 general pediatricians, selected to reflect the racial/ethnic and gender demographics of the pediatric workforce, along with two nonphysicians. All have expertise in bias, sensitivity, equity, and inclusion.

First, the ABP staff flagged items in the 2020 exam on which performance differed between groups by a statistically significant margin. Flagged items were then reviewed by the BSR Panel to identify specific content within these items that might have contributed to the observed performance differences. Items deemed problematic were removed from scoring (i.e., not counted toward a test taker’s overall score).
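The article does not say which statistical method the ABP staff used to flag items. As an illustration only, one widely used technique for comparing group performance on an item while controlling for overall knowledge is the Mantel-Haenszel procedure: test takers are grouped into strata by total score, and within each stratum the odds of answering the item correctly are compared between a reference group and a focal group. The sketch below is a minimal version under that assumption. The function name, the record layout, and the "ref"/"focal" labels are hypothetical and are not taken from the ABP's process.

```python
import math
from collections import defaultdict


def mantel_haenszel_dif(records):
    """Illustrative Mantel-Haenszel screen for one exam item (hypothetical helper).

    records: iterable of (group, total_score, correct) tuples, where group is
    "ref" or "focal", total_score is the matching variable (overall knowledge),
    and correct is 1 or 0 for this item.
    Returns (common_odds_ratio, chi_square, p_value).
    """
    # One 2x2 table per total-score stratum:
    # [ref_correct, ref_incorrect, focal_correct, focal_incorrect]
    strata = defaultdict(lambda: [0, 0, 0, 0])
    for group, score, correct in records:
        idx = (0 if group == "ref" else 2) + (0 if correct else 1)
        strata[score][idx] += 1

    num = den = 0.0                # pieces of the common odds ratio
    a_sum = e_sum = v_sum = 0.0    # observed, expected, variance of ref_correct
    for a, b, c, d in strata.values():
        n = a + b + c + d
        n_ref, n_focal = a + b, c + d
        m_correct, m_incorrect = a + c, b + d
        if n < 2 or n_ref == 0 or n_focal == 0:
            continue  # stratum carries no between-group information
        num += a * d / n
        den += b * c / n
        a_sum += a
        e_sum += n_ref * m_correct / n
        v_sum += n_ref * n_focal * m_correct * m_incorrect / (n * n * (n - 1))

    odds_ratio = num / den if den > 0 else float("inf")
    # Chi-square with continuity correction (clamped at zero), 1 degree of freedom
    diff = max(abs(a_sum - e_sum) - 0.5, 0.0)
    chi_square = diff * diff / v_sum if v_sum > 0 else 0.0
    # Survival function of a chi-square variable with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi_square / 2))
    return odds_ratio, chi_square, p_value


# Made-up data: within every score stratum the reference group answers
# correctly 15 times out of 20 and the focal group 5 times out of 20.
sample = []
for score in (10, 20, 30):
    sample += [("ref", score, 1)] * 15 + [("ref", score, 0)] * 5
    sample += [("focal", score, 1)] * 5 + [("focal", score, 0)] * 15

odds_ratio, chi_square, p_value = mantel_haenszel_dif(sample)
print(f"odds ratio={odds_ratio:.2f}, chi-square={chi_square:.2f}, p={p_value:.2g}")
```

A common odds ratio near 1 suggests that matched test takers in the two groups performed similarly on the item. A statistic like this can only flag an item for attention; deciding whether the content is actually problematic remains a judgment for human reviewers, which is the role the BSR Panel plays in the process described here.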

To help ensure that this process would succeed, the ABP conducted a pilot analysis in the spring of 2020, using data from the 2019 General Pediatrics Initial Certifying Exam. These data were reviewed by a subset of General Pediatrics Exam Committee members, who recommended convening a BSR Panel for future review of potentially biased exam items.

“We have been encouraged to find that, based on these preliminary studies, item bias does not appear to be a major problem for our exams,” says Andrew Dwyer, PhD, ABP Director of Psychometrics. “We will, however, continue to evaluate and monitor future exams for potential bias, and we will continue to improve our detection and prevention methods.”


1. Hambleton R, Rogers J. Item bias review. Practical Assessment, Research, and Evaluation. 1994; Vol. 4, Article 6. doi: 10.7275/jymp-md73. https://scholarworks.umass.edu/pare/vol4/iss1/6. Accessed Dec. 31, 2020.