www.cyclingnews.com news and analysis


Urine testing for rEPO

Dear Cyclingnews,

I have been following the news reports about Lance Armstrong and Roberto Heras, among others, and their alleged blood doping with recombinant erythropoietin (rEPO). I have been struck by the disconnect between the vehement denial of blood doping on the part of the athletes and the test results. Like many, I want to know where the truth lies, for the sake of the athletes and the sport of cycling.

The articles about the urinary rEPO test focus on important technical details about the testing procedure itself (effect of urinary proteins on test results, specificity of monoclonal antibodies used, scientific steps to validate the test) but do not discuss the performance of the test in a population - in this case, a population of athletes (as opposed to patients with cancer or kidney disease). I would like to review some basic principles of diagnostic testing that apply to all tests and have direct bearing on how to interpret the test results from these elite athletes.

No laboratory test is perfect. No matter how refined the test is, it will still misclassify samples, whether from patients or elite athletes. Good tests rarely misclassify samples and are used frequently in medicine. Poor tests, even if well conceived, misclassify too often to be useful and are typically abandoned. So, testing is used as a way to classify samples (from patients or athletes). It turns out that a test can be imperfect in only two ways. First, the test may indicate a substance is in a sample (e.g., urine or blood) when, in truth, it is absent. This type of error would misclassify a sample as 'positive' when it should be classified as 'negative'. This is called a 'false positive' test result. Second, the test may fail to identify a substance in a sample when, in truth, it is present. This type of error would misclassify a sample as 'negative' when it should have been found 'positive'. This is called a 'false negative' result.

In the field of medical science, the ability of a test to classify a sample is characterized by determining the sensitivity and specificity of the test. I will give a brief primer on these terms and then discuss how they are used to interpret diagnostic tests - in particular the urinary rEPO test.

Sensitivity and False Negatives - two sides of the same coin

The sensitivity of a test is the ability to identify a substance in a sample when, in truth, the substance is present in that sample. Medical scientists might assess the sensitivity of the urinary rEPO test by giving rEPO to 100 healthy people (who do not use rEPO for any reason, like most of us) and then performing the urinary test to determine the percentage of these people with a positive urinary rEPO test. With a perfect test, 100 out of 100 would test positive, because they were all given the drug under supervision, and the sensitivity would be 100% for rEPO in the urine. But since no diagnostic test is perfect, sensitivity is usually less than 100%.

Consider the same experiment where 100 healthy people were given rEPO and then urinary rEPO was measured. In the real world, not everyone would test positive. Let's say 90 of 100 test positive, so the sensitivity would be 90% for the test. But 10 of these 100 people did, in fact, have rEPO in their blood and urine that was not detected by the test! These are called 'false negative' test results. So we would say that the test has a false negative rate of 10%.
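The arithmetic in this example can be sketched in a few lines of Python. The function name is illustrative, and the counts are the hypothetical ones from the experiment described above, not data from any real validation study of the urinary rEPO test.

```python
def sensitivity(true_positives, false_negatives):
    """Fraction of samples that truly contain the substance and test positive."""
    return true_positives / (true_positives + false_negatives)


# Hypothetical experiment from the text: 100 people are given rEPO,
# 90 test positive and 10 are missed by the test.
sens = sensitivity(true_positives=90, false_negatives=10)

# The false negative rate is the other side of the same coin.
false_negative_rate = 1 - sens

print(f"sensitivity: {sens:.0%}")
print(f"false negative rate: {false_negative_rate:.0%}")
```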

Specificity and False Positives - two sides of a different coin

The specificity of a test is the ability to show that a substance is not present in a sample when, in truth, the substance is not in the sample. In other words, it is the ability of the test to identify correctly samples without the substance of interest. An example will clarify this. To assess the specificity of the urinary rEPO test, medical scientists would study 100 healthy people (who do not use rEPO, like most of us), give them saline injections (a substance without rEPO) and then measure rEPO in the urine. With a perfect test, 100 out of 100 would test negative and show no rEPO in the urine, so the specificity would be 100%. But, as you may now guess, no test is perfect and the specificity of a test is usually less than 100%. Consider our example again. If we measured urinary rEPO and found that 95 people test 'negative', we would say that the test had a specificity of 95%. It correctly classified 95 out of 100 samples as not having rEPO, but the test misclassified 5 of the 100 study subjects as 'positive'; these are called 'false positive' results. We would say that the test had a 'false positive' rate of 5%.
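The same kind of sketch works for specificity. Again, the function name is illustrative and the counts are the hypothetical ones from the saline example above, not measured properties of the real test.

```python
def specificity(true_negatives, false_positives):
    """Fraction of samples that truly lack the substance and test negative."""
    return true_negatives / (true_negatives + false_positives)


# Hypothetical experiment from the text: 100 people receive saline only,
# 95 test negative and 5 are wrongly flagged as positive.
spec = specificity(true_negatives=95, false_positives=5)

# The false positive rate is the complement of specificity.
false_positive_rate = 1 - spec

print(f"specificity: {spec:.0%}")
print(f"false positive rate: {false_positive_rate:.0%}")
```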

Interpretation of Tests using Sensitivity and Specificity

How do we use these principles to interpret the results of urinary rEPO tests? Here again, remember that no diagnostic test is perfect. For the sake of discussion, I will focus on specificity because that is what is at issue in most of the blood doping allegations. In the real world, the specificity of a useful test may range from 80% up to 99.99%. A test with a specificity of 99.99% seems excellent, but how does it perform when applied in a population? Let's apply this real-world test in an ideal world where all elite athletes are absolutely clean and have never used rEPO or other substances. If we apply our highly refined test with 99.99% specificity to 10,000 elite athletes from this ideal world, we would expect to find one 'positive' test. Suddenly, our ideal sporting world would be corrupted, not by the ideal and clean athletes but by the imperfect test! Imagine the effect of a test with 99% specificity! One out of 100 clean athletes would have 'positive' test results.

The real world is different from our ideal world in two important ways. First, some athletes do cheat and use performance enhancing substances. Second, we don't know who these athletes are with any certainty unless they confess to using these drugs. So now, let's consider the performance of our urinary rEPO test with a specificity of 99.99% in a peloton of 100 cyclists in the real world. Let's also say that over a season, 10,000 samples are collected from these cyclists in stage races and one-day races. From test error alone, we would expect about one 'false positive' result among these samples. But since we don't know who in the peloton is using performance enhancing drugs, we wouldn't know whether any given 'positive' was a 'true positive' result (identified a cyclist who was cheating) or a 'false positive' result (misclassified a clean athlete). And we will never know with complete certainty which is correct because of the imperfections inherent in diagnostic testing.

In clinical medicine, a test specificity of 99% or greater is considered excellent in most settings. If we applied a test with 99% specificity to the samples from this real world peloton, what would be the effect? When the test is applied to the 10,000 samples, we would expect there to be as many as 100 'false positive' results over the course of a season if the peloton were completely clean. This would lead to many ruined careers for honest athletes.
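The false positive counts quoted in this section follow from one multiplication: the number of clean samples times the false positive rate. A minimal sketch, assuming every one of the 10,000 samples comes from a clean rider (the function name is illustrative):

```python
def expected_false_positives(clean_samples, specificity):
    """Expected number of clean samples wrongly reported as positive."""
    return clean_samples * (1 - specificity)


# The two specificities discussed in the text, applied to 10,000 clean samples.
for spec in (0.9999, 0.99):
    count = expected_false_positives(10_000, spec)
    print(f"specificity {spec:.2%}: about {count:.0f} expected false positives")
```

These are expected values. The actual number in any one season would vary by chance around them.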

Although I have focused only on specificity in this discussion, the interpretation of any test, including the rEPO urinary test, is made by balancing the likelihood of 'false positive' and 'false negative' results (i.e., specificity and sensitivity) in a given population. In the end, the likelihood of a 'false positive' will depend upon the actual (yet unknown) level of rEPO use in the peloton. If you believed that most cyclists are corrupt and use performance enhancing drugs, then you would be inclined to argue that a 'positive' urinary rEPO test result correctly identified a cheater. If you believed that most cyclists were honest and did not use performance enhancing drugs, then you would be inclined to say that a 'positive' urinary rEPO result was a 'false positive' test result and that an innocent cyclist was falsely accused.
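The dependence on the actual level of rEPO use can be made concrete with Bayes' rule, which gives what epidemiologists call the positive predictive value: the probability that a 'positive' result comes from a sample that truly contains rEPO. The sketch below is illustrative only. The prevalence values are assumptions chosen to span a wide range, and the 90% sensitivity and 99% specificity are the hypothetical figures used earlier in this letter, not measured properties of the real test.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """Probability that a positive result is a true positive (Bayes' rule)."""
    true_positive_share = prevalence * sensitivity
    false_positive_share = (1 - prevalence) * (1 - specificity)
    total_positive_share = true_positive_share + false_positive_share
    if total_positive_share == 0:
        raise ValueError("no positive results are possible with these inputs")
    return true_positive_share / total_positive_share


# Assumed levels of rEPO use among tested samples (hypothetical values).
for prevalence in (0.001, 0.01, 0.10, 0.50):
    ppv = positive_predictive_value(prevalence, sensitivity=0.90, specificity=0.99)
    print(f"prevalence {prevalence:.1%}: "
          f"chance a positive is a true positive = {ppv:.1%}")
```

Under these assumed numbers, when only 1% of samples truly contain rEPO, fewer than half of the 'positive' results are true positives, while at 50% almost all of them are. The same test result therefore supports very different conclusions depending on how common rEPO use really is.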

Diagnostic testing is not absolute. The interpretation of urinary rEPO results depends on the level of perceived rEPO use among athletes. It is incumbent upon diagnostic testing companies to produce reliable and valid tests for detecting rEPO and other compounds. These tests must undergo strict testing in large enough groups to determine the sensitivity and specificity of the tests. Moreover, the results of their testing should be published in peer-reviewed journals, as is done in all of science, so that the performance of these new laboratory tests may be readily seen and evaluated by a knowledgeable public. At the same time, anonymous surveys (that protect athletes and others from recrimination) must be done on athletes, coaches, and support staff to estimate the current level of drug use. Only with this information can we begin to interpret the test results coming from the laboratory. In the meantime, athletes, coaches, support staff, sponsors, and international cycling agencies should renounce the use of performance enhancing drugs so that we can begin with the presumption that the peloton is clean.

Prof. Christopher C. Whalen

Director, Division of Epidemiology, Case Western Reserve University School of Medicine
Wednesday, Dec 28, 2005
