Enhancing NPTE test items
Currently, the NPTE contains questions that include graphics. Does the Federation have plans to add other testing technologies such as “hot spots” or video?
The Federation has been exploring other testing formats for some time. While we have the technical capability to use these formats, we do not plan to add these types of test questions in the immediate future. These types of questions create a host of additional issues and complexities.
For example, a question that requires the examinee to review a video clip of a gait pattern and determine the dysfunction may seem like the perfect question for a physical therapy licensing examination. However, before such a question can be used, several issues have to be addressed:
- How much time needs to be allowed for a question that requires the viewing of a video as compared to a standard multiple-choice question?
- How many times should the candidate be allowed to re-run the video?
- Is the video clip clear enough that all candidates are able to see the action taking place clearly?
- How does one account for the psychometric differences between the video question and other questions?
We continue to explore whether additional test formats provide us with better information related to the competency of the entry-level physical therapist.
Can you provide the background on the 20 PT NPTE scores that were invalidated in 2007?
The following information was taken from the “News and Events” section.
Forensic Analysis Conducted to Investigate Effect of Trafficking in Recalled Test Items Leads to Invalidation of 20 Candidate Test Scores
On Friday, August 17, 2007, the Board of Directors of the Federation of State Boards of Physical Therapy approved the invalidation of 20 candidates’ National Physical Therapy Exam (“NPTE”) test results. This decision resulted from an extensive forensic analysis of the test performances of all candidates who sat for the NPTE between March 1, 2005 and June 5, 2007.
The forensic analysis, conducted by Caveon, a test security company, was commissioned in response to the unlawful trafficking of NPTE questions by Philippines-based exam prep centers. Through its own private investigation efforts, as well as Philippines government surveillance and raids of two Manila test centers in January 2007, FSBPT has confirmed that the centers have distributed to customers compilations of actual NPTE test questions memorized and shared by prior test takers (“recalled items”). In an effort to assess the potential effects of this practice of using recalled test items, Caveon analyzed approximately 23,512 test performances of all NPTE candidates, regardless of place of education.
Caveon’s analysis conclusively establishes that at least twenty individuals benefited unfairly from advance access to recalled test items. All twenty candidates are Philippines-educated, some but not all of whom are already licensed to practice physical therapy. FSBPT’s assessment and review of the Caveon forensic analysis is continuing, so as to determine whether additional candidate score invalidation is appropriate.
In identifying these twenty candidates, the forensic analysis used three statistical indices to identify aberrant candidate performances. First, performance on compromised test questions (those known to be compromised by distribution at Philippines-based test prep centers) was compared to performance on non-compromised test items. Second, the similarity among candidate response choices was examined, with higher degrees of similarity suggesting the possibility of prior knowledge of test content. Third, the analysis computed the probability that each test taker had attended a course at which recalled items were used. In each case, the percentage of candidates flagged as aberrant was highest for Philippines-educated test takers.
FSBPT limited the universe of “aberrant” test performances under each of the three indices to those test results whose likelihood of occurring by chance was no greater than 1 in 10,000 (one in ten thousand). The twenty invalidated candidate scores are those that appeared aberrant on all three statistical indices. The likelihood of a performance appearing aberrant on all three indices by chance is extremely small: less than 1 in one million.
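The selection rule described above can be illustrated with a minimal sketch. It is not Caveon's or FSBPT's actual implementation. It assumes each of the three indices has already been reduced to a single chance probability per candidate, and every name in it (`CandidateResult`, `p_differential`, `p_similarity`, `p_course`, `flagged_on_all_indices`) and every sample value is hypothetical. Only the 1 in 10,000 cutoff and the all-three-indices requirement come from the text.

```python
from dataclasses import dataclass

# Per-index cutoff stated in the text: 1 in 10,000.
THRESHOLD = 1 / 10_000


@dataclass
class CandidateResult:
    """Hypothetical record: one chance probability per statistical index."""
    candidate_id: str
    p_differential: float  # index 1: compromised vs. non-compromised items
    p_similarity: float    # index 2: similarity of response choices
    p_course: float        # index 3: link to courses using recalled items


def is_aberrant(p_chance: float, threshold: float = THRESHOLD) -> bool:
    """A result is treated as aberrant if its chance probability is at or below the cutoff."""
    return p_chance <= threshold


def flagged_on_all_indices(results, threshold: float = THRESHOLD):
    """Return the IDs of candidates who are aberrant on every one of the three indices."""
    flagged = []
    for r in results:
        probabilities = (r.p_differential, r.p_similarity, r.p_course)
        if all(is_aberrant(p, threshold) for p in probabilities):
            flagged.append(r.candidate_id)
    return flagged


# Made-up sample data for illustration only.
sample = [
    CandidateResult("C-001", 0.20, 0.35, 0.50),   # unremarkable on all indices
    CandidateResult("C-002", 5e-5, 1e-6, 2e-5),   # aberrant on all three
    CandidateResult("C-003", 5e-5, 0.10, 2e-5),   # aberrant on only two
]

print(flagged_on_all_indices(sample))
```

Requiring all three indices to agree is the conservative part of the rule: a candidate who looks unusual on only one or two measures, such as the hypothetical `C-003`, is not flagged.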
“As with every decision we’ve made in addressing the troubling use of recalled items, the FSBPT Board did not take this action lightly,” stated E. Dargan Ervin, Jr., FSBPT President. “We made the decision only after careful consideration of the issues and in light of the overwhelming statistical data that calls into question the legitimacy of these scores.”