Every blood test report comes with a reference range. It looks authoritative — a clear boundary between healthy and sick. But where do these numbers actually come from? And why do they fail so predictably for athletes?
The answer lies in statistics, not medicine. Reference ranges are built using methods developed in the 1960s for general populations. They were never designed for people who deliberately alter their physiology through training, diet, or pharmacology. Understanding the construction process reveals exactly why they break down for athletes — and what to do about it.
📐 A reference range is not a medical diagnosis. It is a statistical observation: 95% of the people we tested fell between these two numbers. The other 5% were healthy too; they were just statistical outliers.
The Reference Range Construction Process
When a laboratory establishes a reference range for a new marker, it follows a standardized four-step process. The process is scientific, but it rests on assumptions that break down for athletic populations:
Recruit a Reference Population
The lab recruits a group of "healthy" individuals — typically 120 to 500 people from the local patient population. These are people who came in for routine screening, not athletes seeking performance optimization. The sample includes elderly patients, sedentary individuals, people on medications, and those with subclinical conditions that have not yet been diagnosed.
The problem is immediate: if 90% of your reference population does not train, markers affected by training — creatinine, CK, AST — will be calibrated to sedentary values. A muscular athlete automatically becomes an outlier.
Measure and Plot the Distribution
The lab measures the marker in every reference subject and plots the distribution. Many biological markers are approximately normally distributed (a bell curve), but some are right-skewed: creatinine, for example, has a long upper tail because a small number of people have unusually high values.
For skewed distributions, labs sometimes apply log transformation or use non-parametric methods. But the key point remains: the shape of the distribution is determined by the reference population. If that population is sedentary, the "normal" shape does not include athletic physiology.
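As a rough illustration of the log-transformation approach, the sketch below computes a parametric 95% interval on the log scale and back-transforms it. The function name and the sample values are invented for this example; real labs follow more detailed protocols, including outlier handling and confidence intervals on the limits.

```python
import math
from statistics import mean, stdev

def log_normal_reference_interval(values):
    """Parametric 95% interval computed on the log scale, then back-transformed.

    Assumes the logged values are roughly normal; all values must be > 0.
    """
    logs = [math.log(v) for v in values]
    mu, sigma = mean(logs), stdev(logs)
    z = 1.96  # covers the central 95% of a normal distribution
    return math.exp(mu - z * sigma), math.exp(mu + z * sigma)

# Synthetic, right-skewed illustrative values (not real patient data)
sample = [0.7, 0.8, 0.8, 0.9, 0.9, 1.0, 1.0, 1.1, 1.2, 1.6]
low, high = log_normal_reference_interval(sample)
print(f"{low:.2f} to {high:.2f}")
```

The resulting interval is asymmetric around the typical value, which is the point of the transformation. It still describes only the people in `sample`: if none of them train, neither limit reflects athletic physiology.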
Apply the 2.5th to 97.5th Percentile Rule
The standard methodology, established by the International Federation of Clinical Chemistry in 1987, defines the reference interval as the range between the 2.5th and 97.5th percentiles of the reference population. This captures the middle 95% of values.
By definition, 2.5% of healthy people fall below the lower limit and 2.5% fall above the upper limit. These are false positives — healthy individuals flagged as abnormal purely because they sit at the edges of the distribution. For athletes, who are outliers on multiple markers, the probability of being flagged is far higher than 5%.
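The percentile rule is simple enough to sketch in a few lines. The example below uses Python's standard library and a synthetic stand-in for the reference population (the integers 1 to 121, not real lab data). The `inclusive` interpolation method is an assumption made for the example; laboratories may use a different rank-based estimator.

```python
from statistics import quantiles

def reference_interval(values):
    """Return the (2.5th, 97.5th) percentile pair of the given values."""
    # n=40 places a cut point every 2.5%, so the first and last cut points
    # are the 2.5th and 97.5th percentiles.
    cuts = quantiles(values, n=40, method="inclusive")
    return cuts[0], cuts[-1]

# Synthetic stand-in for 121 reference subjects
subjects = list(range(1, 122))
lower, upper = reference_interval(subjects)

# Everyone in `subjects` is "healthy" by construction, yet some get flagged
flagged = [v for v in subjects if v < lower or v > upper]
print(lower, upper, len(flagged))  # prints: 4.0 118.0 6
```

Six of 121 subjects, roughly 5%, fall outside the interval even though every one of them belongs to the healthy reference group.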
Publish and Forget
Once established, reference ranges rarely change. Labs may update them every 5 to 10 years, but the methodology stays the same. The ranges on your 2026 blood work were likely validated using population data from the early 2010s, before the explosion of strength training popularity and before most physicians had ever heard of anabolic-androgenic steroid (AAS) use in non-competitive athletes.
Why 95% Is Not Enough
The 95% rule sounds conservative — after all, it covers almost everyone. But probability compounds across markers. If your blood panel tests 20 markers, and each has a 5% chance of flagging a healthy person, your probability of getting at least one false positive is surprisingly high.
The Math
With 20 independent markers, each with a 5% false positive rate, the probability of at least one flag is 1 - (0.95)^20 ≈ 0.64, or about 64%. Nearly two-thirds of healthy people will get at least one "abnormal" result purely by chance.
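The same arithmetic can be checked for any panel size. This is a minimal sketch; the function name is made up for the example, and it keeps the independence assumption used above, which real panels only approximate because many markers are correlated.

```python
def prob_at_least_one_flag(n_markers, per_marker_rate=0.05):
    """Probability that a healthy person gets one or more out-of-range
    results, assuming the markers are statistically independent."""
    return 1 - (1 - per_marker_rate) ** n_markers

print(f"{prob_at_least_one_flag(20):.0%}")  # prints: 64%

# Larger panels push the figure higher still
for n in (5, 10, 20, 40):
    print(n, round(prob_at_least_one_flag(n), 3))
```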
The Athlete Multiplier
Athletes are not randomly distributed — they are systematically shifted on multiple markers. Your probability of a false positive is not 64%. It is higher, because you are an outlier by design, not by chance.
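One way to see the effect of a systematic shift is a toy model: treat a marker as normally distributed in the reference population, then move one person's distribution by some number of reference standard deviations. The model, the function name, and the shift sizes are assumptions for illustration, not measured athlete data.

```python
from statistics import NormalDist

def flag_probability(shift_in_sd):
    """Chance of landing outside the central 95% of the reference
    distribution when a person's own distribution is shifted by
    shift_in_sd reference standard deviations (same spread assumed)."""
    z = NormalDist().inv_cdf(0.975)  # about 1.96
    person = NormalDist(mu=shift_in_sd, sigma=1.0)
    inside = person.cdf(z) - person.cdf(-z)
    return 1 - inside

for shift in (0.0, 0.5, 1.0, 2.0):
    print(shift, round(flag_probability(shift), 3))
```

With no shift the result is the familiar 5%. Under this model a shift of one standard deviation already raises the per-marker flag rate to roughly 17%, before any compounding across markers.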
The Multiple Testing Problem
Systematic Failure Modes
Standard ranges fail athletes in predictable ways. Understanding the failure modes helps you spot them before they cause panic or misdiagnosis:
False Positives: Expected Physiology Flagged as Disease
False Negatives: Real Problems Hidden in the Noise
Range Inflation: Labs Using Different Ranges
A Different Methodology
GearCheck's athletic reference bands are not simply wider versions of standard ranges. They are built using a fundamentally different approach that preserves sensitivity to real pathology while eliminating false positives from expected physiology:
Athlete-Only Reference Population
Instead of using the general population, GearCheck calibrates ranges against a reference population of training individuals — people with above-average muscle mass, regular resistance training, and known AAS status where applicable. This shifts the baseline to match the physiology being measured.
Three-Tier Band System
Rather than a single normal/abnormal boundary, GearCheck assigns each marker to one of three band types:
- 2.5th to 97.5th percentile — for markers where athletic distribution is shifted but maintains normal shape (creatinine, AST, CK).
- 5th to 95th percentile — for markers where extremes are driven by pharmacology rather than pathology (HDL on AAS, SHBG on AAS).
- Fixed safety thresholds — for markers where danger is non-negotiable regardless of population (eGFR below 45 mL/min/1.73 m², hematocrit above 55%, blood pressure above 140/90 mmHg).
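The fixed-threshold tier is the easiest of the three to express in code. The sketch below is not GearCheck's implementation; the function and parameter names are hypothetical. It also assumes that "above 140/90" means either the systolic or the diastolic reading exceeds its limit.

```python
def safety_flags(egfr=None, hematocrit_pct=None, systolic=None, diastolic=None):
    """Return the fixed safety thresholds that a set of results crosses.

    Thresholds are the ones listed above; they apply to everyone,
    regardless of which reference population the person belongs to.
    """
    flags = []
    if egfr is not None and egfr < 45:
        flags.append("eGFR below 45")
    if hematocrit_pct is not None and hematocrit_pct > 55:
        flags.append("hematocrit above 55%")
    # Assumption: either reading over its limit counts as "above 140/90"
    high_sys = systolic is not None and systolic > 140
    high_dia = diastolic is not None and diastolic > 90
    if high_sys or high_dia:
        flags.append("blood pressure above 140/90")
    return flags

print(safety_flags(egfr=88, hematocrit_pct=56, systolic=128, diastolic=82))
# prints: ['hematocrit above 55%']
```

Because these checks ignore the percentile bands entirely, no amount of "expected athletic physiology" can suppress them.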
Confirmatory Marker Cross-Checks
For markers known to be confounded by athletic physiology, GearCheck requires confirmatory markers. Low creatinine-based eGFR must be validated with Cystatin C. Elevated AST/ALT must be checked against GGT and CK. This prevents the most common false positive patterns from triggering alerts.
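A cross-check rule of this kind might look like the following sketch. The function names, the returned labels, and the exact decision order are hypothetical; the source describes which markers confirm which, not the precise logic.

```python
from typing import Optional

def review_low_creatinine_egfr(cystatin_c_normal: Optional[bool]) -> str:
    """Decide what to do with a low creatinine-based eGFR."""
    if cystatin_c_normal is None:
        return "pending: Cystatin C needed"
    if cystatin_c_normal:
        return "suppress: not confirmed by Cystatin C"
    return "alert: confirmed by Cystatin C"

def review_elevated_ast_alt(ggt_elevated: Optional[bool],
                            ck_elevated: Optional[bool]) -> str:
    """Decide what to do with elevated AST/ALT using GGT and CK."""
    if ggt_elevated is None or ck_elevated is None:
        return "pending: GGT and CK needed"
    if ggt_elevated:
        return "alert: GGT also elevated"
    if ck_elevated:
        # Assumed rule: normal GGT with high CK points toward muscle origin
        return "suppress: consistent with muscle origin"
    return "review: neither GGT nor CK explains the elevation"

print(review_low_creatinine_egfr(cystatin_c_normal=True))
print(review_elevated_ast_alt(ggt_elevated=False, ck_elevated=True))
```

The design point is that a confounded marker never raises an alert on its own: without its confirmatory marker, the result stays pending.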
Trend-Aware Interpretation
A single out-of-range value is less meaningful than the direction of change. GearCheck weights trends heavily: a stable "abnormal" value is interpreted differently from a rapidly worsening one. This mirrors how experienced clinicians reason, rather than how statistical software flags results.
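A simple way to quantify direction of change is the least-squares slope across serial results. The sketch below is one possible approach, not GearCheck's algorithm: the function names, the `worsening_slope` cutoff, and the hematocrit values are invented for illustration. It also assumes a marker where higher is worse.

```python
def slope_per_day(days, values):
    """Ordinary least-squares slope of values against time in days."""
    n = len(days)
    mean_x = sum(days) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in days)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, values))
    return sxy / sxx

def classify_trend(days, values, worsening_slope):
    """Label a series 'worsening' if it rises at least worsening_slope
    units per day, otherwise 'stable'."""
    return "worsening" if slope_per_day(days, values) >= worsening_slope else "stable"

# Hypothetical hematocrit (%) readings taken 90 days apart
days = [0, 90, 180]
print(classify_trend(days, [52.0, 52.5, 52.0], worsening_slope=0.02))  # prints: stable
print(classify_trend(days, [48.0, 52.0, 56.0], worsening_slope=0.02))  # prints: worsening
```

Both series share a middle reading near 52%, yet they tell very different stories, which is exactly the distinction a single-value flag cannot make.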
When to Trust the Standard Range
Standard ranges are not universally wrong. For markers unaffected by training or AAS, they remain accurate and should be trusted. These include:
GGT
Liver-specific and not elevated by training. A standard-range GGT elevation is a genuine signal of hepatobiliary stress.
Cystatin C
Muscle-independent kidney marker. Standard ranges work because training and AAS do not systematically shift Cystatin C.
Platelets
Not meaningfully affected by training. Standard thresholds for thrombocytosis remain valid for athletes.
TSH, Free T4, Free T3
Thyroid markers with well-established normal ranges across populations. Athletic activity does not significantly alter these.
