Reliability in psychometric tests

What is “reliability” in psychometrics?

When we say an assessment is “reliable”, we mean that it measures the same thing consistently: the same person, taking it under similar conditions, should get similar results. While it’s a pretty simple concept in theory, it’s tougher to measure in practice.

There are quite a few different ways to test the reliability of a psychometric assessment, each offering a different ‘type’ of reliability: 

Test-Retest Reliability

This is where you get someone to take the same test twice in quick succession, and check whether they get similar results both times. As you can imagine, if the results vary wildly, then the assessment isn’t consistent or reliable.
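
To put a number on “similar results”, one common approach is to correlate people’s scores from the first sitting with their scores from the second. The snippet below is a minimal sketch of that idea. The scores are made up for illustration, and using a Pearson correlation is our assumption here rather than a fixed rule.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical scores for five people who sat the same test twice.
first_sitting = [12, 18, 25, 31, 40]
second_sitting = [14, 17, 27, 30, 41]

test_retest_r = pearson_r(first_sitting, second_sitting)
print(round(test_retest_r, 2))
```

A value close to 1 suggests people land in roughly the same place both times. A value near 0 suggests the results are bouncing around.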

Parallel Forms

This is when you compare two different versions of a psychometric assessment, each built from the same pool of content. There are three steps to this process:

  1. Testers pool lots of test ‘items’ together (e.g. questions, or tasks in the case of our behaviour-based assessment).
  2. They randomly split the pool into two different psychometric tests.
  3. They give both tests to the same people, on the same day, and check for similar results.
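
The random split in step 2 can be sketched in a few lines. This is only an illustration: the item names, the pool size and the `split_into_parallel_forms` helper are all hypothetical, and a real test builder would usually also check that the two forms are balanced for difficulty and content.

```python
import random


def split_into_parallel_forms(item_pool, seed=None):
    """Shuffle a copy of the pool and deal it into two halves."""
    rng = random.Random(seed)  # a seed makes the split repeatable
    shuffled = list(item_pool)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Hypothetical pool of 20 items.
item_pool = [f"item_{i}" for i in range(1, 21)]
form_a, form_b = split_into_parallel_forms(item_pool, seed=42)

print(len(form_a), len(form_b))
```

Once the same people have taken both forms, the correlation between their form A scores and their form B scores gives the parallel forms estimate.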

Internal Consistency Reliability

Because most tests measure each trait several times, using different items, it’s important that those items agree with one another. That’s what ‘internal consistency reliability’ measures. There’s no point having an assessment that measures certain traits unevenly throughout!
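
One widely used statistic for internal consistency is Cronbach’s alpha, which compares the variance of each item with the variance of people’s total scores. It isn’t the only option, and the sketch below is just one way to compute it. The response data is invented, and we’re assuming every item is meant to measure the same trait.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha. `responses` has one row per person, one column per item."""
    k = len(responses[0])  # number of items
    item_columns = list(zip(*responses))
    item_variances = [variance(column) for column in item_columns]
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical answers from five people to four items, each scored 1-5.
responses = [
    [4, 5, 4, 5],
    [2, 2, 3, 2],
    [5, 4, 5, 5],
    [1, 2, 1, 2],
    [3, 3, 4, 3],
]

alpha = cronbach_alpha(responses)
print(round(alpha, 2))
```

The closer alpha is to 1, the more the items move together. What counts as “high enough” depends on what the assessment is used for.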

Inter-Rater Reliability

This is where several independent judges score the same test, and compare their results. The more closely their scores agree, the better the inter-rater reliability.

This can be done in two ways:

  1. Each judge scores each ‘item’ in an assessment, perhaps on a scale from 1 to 10. Then, you can compare these scores and see how strongly they correlate.
  2. Each judge categorises each observation made by a psychometric assessment. Then, you can check for the percentage of ‘agreement’ between them. If they agree on 5 of 10 observations, for example, you’d say the assessment has an inter-rater reliability rate of 50%.
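
The percentage agreement in the second approach is simple to work out. Here’s a minimal sketch that reproduces the 5-out-of-10 example. The two judges and their category labels are hypothetical.

```python
def percent_agreement(ratings_a, ratings_b):
    """Percentage of observations that two judges put in the same category."""
    if len(ratings_a) != len(ratings_b):
        raise ValueError("Both judges must rate the same observations")
    matches = sum(a == b for a, b in zip(ratings_a, ratings_b))
    return 100 * matches / len(ratings_a)


# Hypothetical categories assigned by two judges to the same 10 observations.
judge_1 = ["high", "low", "high", "medium", "low",
           "high", "medium", "low", "high", "low"]
judge_2 = ["high", "medium", "high", "low", "low",
           "medium", "medium", "high", "high", "medium"]

agreement = percent_agreement(judge_1, judge_2)
print(agreement)  # 50.0
```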

Why is reliability important?

Just as it’s difficult to trust an unreliable person, you’d have a tough time relying on the results of an unreliable psychometric assessment. Think of it this way: if an assessment throws up different results every time, which result should you put your faith in? An unreliable assessment, ultimately, makes it harder for you to see true potential – not easier.

Reliability & validity in psychometrics

Reliability in a psychometric test is vital. But reliability alone doesn’t make a test valid. This can feel quite confusing, but, in short, a valid assessment measures what it says it does. So, if you think about it, you could have a test that gives consistent results yet consistently fails to measure whatever it says it will. That test would be reliable, but not valid.

For a bit more on validity in psychometrics, jump to our glossary article here.

Or, if you’d like to learn a bit more about our assessment at Arctic Shores, just let us know here. We’ll get back to you pronto.
