Many Apple Watch users rely on it to track steps and stay active. VO2 max, a measurement of how much oxygen the body consumes during intense workouts, is also a statistic the watch tracks. If your increased cardio activity is making you more fit, your VO2 max should show it, assuming that Apple’s VO2 max estimate is accurate. Other fitness measures, including the Strava fitness score, measure training load, or how much activity you have actually done. In theory the two are tied together, but VO2 max is a more objective bodily measurement than “fitness”.
But the amount of oxygen you consume has to do with the lungs and heart, not your wrist. So how accurate is the VO2 max reading on the Apple Watch? This article summarizes how the watch-based reading compares to a lab-based treadmill test. In a one-time test, the Apple Watch overestimated VO2 max by about 7%, or +4.0 mL/kg/min.
- Apple Watch VO2 Max Estimate: 57.9 mL/kg/min
- DexaFit Treadmill & Mask Test Output: 53.9 mL/kg/min
Individual readings will vary, and the accuracy of the estimate will depend on what your actual score is. For very fit individuals, or those with low fitness, the floor (14 mL/kg/min) and the ceiling (60 mL/kg/min) of the Apple Watch output will also limit accuracy.
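To illustrate why the reporting range matters at the extremes, here is a minimal sketch that clamps a true value to the floor and ceiling quoted above. The function names and the sample values are hypothetical, and this is not how Apple computes anything; it only shows the best-case error that the range alone would impose.

```python
WATCH_FLOOR = 14.0    # mL/kg/min, floor quoted in this article
WATCH_CEILING = 60.0  # mL/kg/min, ceiling quoted in this article


def reportable(true_vo2_max: float) -> float:
    """Clamp a value to the range the watch is able to report."""
    return max(WATCH_FLOOR, min(WATCH_CEILING, true_vo2_max))


def best_case_error_pct(true_vo2_max: float) -> float:
    """Smallest possible error (%) if the reporting range were the only limit."""
    return abs(reportable(true_vo2_max) - true_vo2_max) / true_vo2_max * 100.0


# Hypothetical true values: below the floor, mid-range, above the ceiling.
for true_value in (12.0, 45.0, 66.0):
    print(true_value, reportable(true_value), round(best_case_error_pct(true_value), 1))
```

Anyone whose true value sits inside the range is unaffected by the clamp; outside it, the error grows the further the true value is from the nearest limit.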
Understanding VO2 Max Baseline Over Time – Measured by Apple Watch
If you have monitored workouts for a while, you may see odd spikes in the estimated reading. That is likely because the wrist-worn watch, even with its blood oxygen sensor, relies on a calculation to estimate VO2 max rather than measuring it directly. If the equation behind that calculation changes, as it does during software updates meant to improve accuracy as the technology and science develop, you may see a spike. In 2020 and 2021 there were noticeable changes in VO2 max readings just after the watch updated.
Without a software update, the accuracy of the watch is still pretty good. It’s unlikely to fluctuate day to day, so trends over time are easy to spot. It is also important to understand that you will only get a VO2 max data point for uninterrupted outdoor workouts of 20 minutes or more. If you run on a treadmill, if your HR sensor loses its reading, or if you pause an outdoor activity, you will not see a new VO2 max data point.
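The conditions above can be written down as a simple checklist. The sketch below is only a restatement of what this article describes; the `Workout` record and its field names are hypothetical and are not Apple data types or Apple’s actual logic.

```python
from dataclasses import dataclass

MIN_MINUTES = 20  # minimum duration described in this article


@dataclass
class Workout:
    """Hypothetical workout summary, used only for illustration."""
    outdoor: bool
    minutes: float
    paused: bool
    hr_signal_lost: bool


def expects_vo2_data_point(workout: Workout) -> bool:
    """True if the workout meets every condition listed in the article."""
    return (
        workout.outdoor
        and workout.minutes >= MIN_MINUTES
        and not workout.paused
        and not workout.hr_signal_lost
    )


outdoor_run = Workout(outdoor=True, minutes=30, paused=False, hr_signal_lost=False)
treadmill_run = Workout(outdoor=False, minutes=30, paused=False, hr_signal_lost=False)
print(expects_vo2_data_point(outdoor_run), expects_vo2_data_point(treadmill_run))
```

If a workout you expected to count produces no new data point, checking it against these four conditions is a reasonable first step.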
The image above shows the VO2 max trend from a watch during a training cycle. Starting at 15-20 miles of running per week and moving up to 45-50 mile weeks over the course of a few months, the trend is obvious.
Even with the jumps from software updates, it’s easy to see that the reading at least tracks general fitness. So is a more accurate reading necessary? If all you are looking for is a metric to encourage continued training and wellness improvement, the trend from a wrist-worn calculation should be just fine. Even without a comparison to a lab-based setting, the Apple way of figuring VO2 max at least appears consistent outside of major watch firmware updates.
DexaFit VO2 Max Accuracy
The term “accuracy” assumes you know the true measurement. DexaFit, like other wellness centers, uses a mask-based monitoring system to read your true VO2 during a 10-20 minute effort.
The duration of the test is based on performance, and you can opt out on your own. During an initial test this might be at a submaximal effort. Personally, running at a full 12.5% incline on a treadmill while breathing into a mask that at least partially resists breathing in (or makes you think it does) did not feel normal. I pulled up probably 20-30 seconds before complete exhaustion, but was still happy with the test. The examiner agreed that most people perform better on a second test simply from knowing what to expect. Other situational cues, such as a coach cheering you on, a competitor on the treadmill next to you, or good music, could all affect your mental ability to push through the test.
Regardless of the setting, the measurement from this style of test is closer to your actual value. If you manage to get to exhaustion, the peak is likely to be your max. Having done the test, where my heart rate peaked (178 bpm) about 7 beats below what I’ve observed during workouts (185 bpm), it appears the DexaFit treadmill measurement aligns well with the Apple Watch prediction.
In total the treadmill test came in at 53.9 mL/kg/min, which is ~7% lower than the 57.9 mL/kg/min the Apple Watch predicted during a run just a day before.
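For anyone who wants to check the arithmetic, here is a short sketch using the two readings from this test. The function name is just illustrative. Note that the percentage shifts slightly depending on which reading you treat as the baseline, which is why “about 7%” is the honest way to state it.

```python
def percent_difference(estimate: float, reference: float) -> float:
    """Signed difference of estimate from reference, as a % of reference."""
    return (estimate - reference) / reference * 100.0


watch = 57.9  # mL/kg/min, Apple Watch estimate from this article
lab = 53.9    # mL/kg/min, DexaFit treadmill result from this article

print(f"absolute gap: {watch - lab:.1f} mL/kg/min")
print(f"watch relative to lab: {percent_difference(watch, lab):+.1f}%")
print(f"lab relative to watch: {percent_difference(lab, watch):+.1f}%")
```

With the lab result as the baseline the watch reads roughly 7.4% high; with the watch as the baseline the lab reads roughly 6.9% low.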
VO2 Max Apple Watch Accuracy Compared to DexaFit Treadmill Test
After a tough 13 minute effort on the treadmill, the DexaFit result came in almost within a rounding error of Apple’s prediction. Although the day-to-day predictions from my watch do not vary by as much as 7%, there was no indication that either measurement was fundamentally flawed. Since the treadmill is a one-time thing (given the cost), and my watch provides a nearly identical reading pretty much anytime I do a 30 minute run, the nod has to go to Apple for their VO2 max accuracy.
With the treadmill test being so similar to the Watch predictions on VO2 max accuracy, what good is a treadmill test? The pain of suffering through an uphill, 13 minute, all-out effort is not really worth it to close an accuracy gap of 7% (less when you consider that the lab test has some variability anyway).
A secondary, and arguably primary, benefit of the treadmill test is seeing when you pass through threshold levels. For heart rate training this is key. Below a factor of 1, your body is clearing lactic acid faster than it is building up. Above that threshold, the burn starts accumulating in the muscles. In endurance athletics, keeping this heart rate in mind is the difference between hitting the wall and finishing strong.
Because the treadmill test starts at a warmup and escalates to max effort, you can monitor these zones of lactate buildup. This again is something that calculators can estimate, and you can set your watch to alert you at different zones, but it is a personalized measure, and knowing your own body can help you make the small adjustments that calculators cannot.
Overall, the treadmill test and the watch estimate provide similar results. The VO2 max accuracy is close enough on either to get to the right general zones to monitor fitness. Because a clinic treadmill test costs more ($49-$149 depending on location), it is a more expensive way to find trends.
A watch does a good enough job of showing whether fitness is improving overall. Clinic treadmill tests are a solid option for an occasional check-in, and can be used the same way as a time trial or a target race: as motivation during a training cycle.
Overall I was impressed with how close the Apple VO2 max prediction was to the lab test. For the most part, based on these observations, the accuracy of the Apple Watch appears good enough to use it as a solid estimate of a true VO2 max. This assumes that you have done a hard workout and have seen enough consistency in the Apple readings to assume that the result is not a simple outlier.
Since there is not much more that you can do with a more accurate reading, the VO2 max that Apple Health provides is sufficient to encourage most behaviors. If for some reason you want to train your body to optimize for this metric, it is worth having a concrete goal in mind that serves as a proxy for the measurement (improving a 5k running time, feeling better on weekly hill climb bike rides with friends, hiking a few mountains). VO2 max is ultimately only an indicator of fitness, and it is not routinely treated as the end-all measure of it.
Accuracy Of Apple Watch in Other Metrics
VO2 max is not the only metric where the consistency and accuracy of the measurement matter. Even something as simple as step tracking will vary depending on the device you wind up using. We have previously reviewed the step tracking of the Apple Watch, comparing it to the Amazon Halo, where the difference in step count can be as high as 25%. When it comes to sleep tracking, the Apple Watch was within 10% of the total sleep time measured by the Amazon Halo as well.
When it comes to other metrics related to wellness, the measurement can be even further off. Comparing body composition types reveals that lab tests can vary by almost 30% when compared to app-based images and smart scales.
All this is to say that, when it comes to the accuracy of VO2 max, being within a single-digit percentage difference of a lab test means it is a pretty solid estimate.