Not all standardized admissions tests are created equal. They vary in terms of testing time, content, and design. But what is the difference between a good test and a great test designed for the same purpose? If you were comparing cricket players, the debate would center on performance indicators such as batting average or bowling average. In high-stakes testing there are a handful of questions you should ask.
How valid is the test? Validity refers to how well a test measures what it is designed to measure. To that end, it is critically important to first identify the skills you want to measure. For example, a great testing program continuously surveys the programs and individuals that use the test results to ensure that the exam continues to measure the skills that users have identified as important for success.
Time allotment is a major, yet often overlooked, feature of test design that directly impacts validity. In a timed exam, there is a tradeoff between content and test length. Designers have to strike a balance between the difficulty of the questions and how much time test takers are permitted to answer them.
The goal of a standardized test is to offer test takers a full opportunity to demonstrate what they know and what they can do with regard to the test content. It should not be to differentiate students on a skill such as processing speed that is tangentially relevant, if at all, to educational success.
How is the exam administered? The method of test delivery or administration also plays a key role in validity and reliability. Our preference at GMAC is to employ a computer-adaptive testing algorithm that ensures that every test taker sees a unique set of questions and, at the same time, that every form of the test is equivalent in terms of content and difficulty.
What exactly is computer-adaptive testing? Rather than using the computer as an electronic page turner, an adaptive test uses the computer's processing power to analyze each examinee's responses during the test session. By using the computer to calculate a score estimate after each question and using that estimate to select subsequent questions, an adaptive test is able to provide a more valid, reliable, secure, and shorter test.
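To make the estimate-then-select loop concrete, here is a minimal sketch in Python. It is an illustration only, not the GMAT algorithm: it assumes a one-parameter (Rasch) response model, a standard normal prior on ability, and a simple rule that picks the unused question whose difficulty is closest to the current estimate. The function names, the item bank, and the simulated test taker are all hypothetical.

```python
import math


def p_correct(theta, difficulty):
    """Rasch (1PL) probability of a correct answer (an assumed model)."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))


def estimate_ability(responses):
    """Posterior-mean ability estimate on a grid, with a standard normal prior.

    responses is a list of (difficulty, correct) pairs seen so far.
    """
    grid = [g / 10.0 for g in range(-40, 41)]  # ability values from -4.0 to 4.0
    weights = []
    for theta in grid:
        w = math.exp(-0.5 * theta * theta)  # unnormalized N(0, 1) prior
        for difficulty, correct in responses:
            p = p_correct(theta, difficulty)
            w *= p if correct else (1.0 - p)
        weights.append(w)
    return sum(t * w for t, w in zip(grid, weights)) / sum(weights)


def next_item(theta, bank, used):
    """Choose the unused question whose difficulty is closest to the estimate."""
    candidates = [i for i in range(len(bank)) if i not in used]
    return min(candidates, key=lambda i: abs(bank[i] - theta))


def run_adaptive_test(bank, answer_fn, n_items):
    """Administer n_items questions, re-estimating ability after each answer."""
    used, responses = set(), []
    theta = 0.0  # start from an average ability estimate
    for _ in range(n_items):
        i = next_item(theta, bank, used)
        used.add(i)
        responses.append((bank[i], answer_fn(bank[i])))
        theta = estimate_ability(responses)
    return theta, responses


if __name__ == "__main__":
    # Hypothetical bank of nine questions with difficulties from -2.0 to 2.0.
    bank = [x / 2.0 for x in range(-4, 5)]
    # Hypothetical test taker who answers correctly whenever difficulty < 1.0.
    theta, responses = run_adaptive_test(bank, lambda d: d < 1.0, 5)
    print(round(theta, 2), responses)
```

Because each question is chosen near the current estimate, few questions are wasted on material that is far too easy or far too hard for the test taker, which is how an adaptive test can be shorter without giving up precision. Operational programs also balance content coverage and limit how often each question is shown, which this sketch leaves out.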
Reliability goes hand in hand with validity. A test's reliability is the extent to which test scores are consistent over repeated sittings. This concept is critically important in standardized testing, and reliability is a key reason why test scores have meaning. Candidates with the same ability should always get approximately the same score regardless of when they sit for an examination. If there is a great deal of inconsistency in the scores, then the test cannot provide an accurate or fair assessment of what it is supposed to measure.
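One common way to put a number on consistency over repeated sittings is the test-retest correlation: the Pearson correlation between the scores the same candidates earn on two occasions. The sketch below assumes that approach; the scores are invented purely for illustration and do not come from any real exam.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of the same length with at least 2 scores")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("scores must vary for the correlation to be defined")
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Hypothetical scores for the same six candidates on two sittings.
    first_sitting = [520, 580, 610, 650, 700, 740]
    second_sitting = [530, 570, 620, 640, 710, 730]
    print(round(pearson(first_sitting, second_sitting), 3))
```

A value close to 1 means candidates are ranked almost identically on both occasions. A value well below 1 signals the kind of inconsistency that undermines the meaning of a score.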
Another basic concept of a high-quality standardized test is that it is administered and scored in a consistent manner. Every one of our GMAT examinations is administered with the same software, same interface, same hardware, same number of test questions, same mix of test questions, and the same test center environment.
Similarly, if the content mix were to vary, then the skills being assessed would vary, and reliability would be reduced. The responses to the test questions must also reflect the test taker's ability with regard to the intended skills. If not, then extraneous variation is introduced, and reliability is reduced.
Candidates, schools, and test providers should share a desire for everyone's test score to accurately reflect their ability. Test takers need to properly prepare for the exam by becoming familiar with the question types and learning how the test is structured. High-quality testing programs provide free practice examinations, publish test-preparation books, and provide online diagnostic tools. The truly exceptional testing program will actively reach out to the test prep industry to provide accurate, useful information to test takers about its exam.
Test providers also have a responsibility to do everything they can to protect the integrity of the exam and to ensure that the person taking the test is the same person who shows up in the classroom. A quality testing program will make a substantial investment in test security, including the best and most accurate technology for identity authentication.
Another element that makes a good test great is a commitment to the rights of test takers. A quality testing program will publish a formal document outlining test taker rights and responsibilities. Such a document will cover areas such as adherence to professional standards of testing, access to information about the test, and the protection of personal information.
While a test program may purport to be fair to all test takers and of the highest quality, a few basic questions can help evaluate the veracity of such claims.
The author, Lawrence M. Rudner, is Vice President of Research and Development and Chief Psychometrician of the Graduate Management Admission Council (GMAC).