A recent study from Cardiogram and the University of California, San Francisco suggested that the Apple Watch can be a test for obstructive sleep apnea. It is exciting to think that wearable devices could provide an accessible, low-cost approach to evaluating obstructive sleep apnea. At the same time, it is important to examine the implications of this work carefully. Are wearable devices like the Apple Watch really usable as a test for obstructive sleep apnea?
First and foremost, I am no expert in deep neural networks, the type of machine learning used in the research study. I know enough to appreciate their complexity because I work with machine learning experts from the USC Viterbi School of Engineering through our multidisciplinary group, Sleep Health using Bioengineering (SleepHuB). I also did not attend the study’s presentation at the American Heart Association conference earlier this month, so I will confine myself to the study’s reported findings and their implications for obstructive sleep apnea testing.
The study examined over 6,000 Apple Watch users who agreed to participate. Participants reported whether they had been diagnosed with obstructive sleep apnea, which is not ideal: some people may have sleep apnea without knowing it because they have never had a sleep study. The participants were divided into two groups: one to train the network and the other to test it. The results showed that the Apple Watch data predicted a diagnosis of obstructive sleep apnea with 90% sensitivity and 60% specificity.
What do the Apple Watch study findings mean for sleep apnea?
Sensitivity is the likelihood that the Apple Watch indicates obstructive sleep apnea in someone who has been diagnosed with it on a sleep study, so the 90% figure is encouraging and intriguing. However, the specificity may be more important for using the Apple Watch to test for sleep apnea. Specificity is the likelihood that the Apple Watch indicates no obstructive sleep apnea in someone who does not have it. Although this study is not perfect because many people did not have previous sleep studies (potentially affecting the specificity), a specificity of 60% means that roughly 4 in 10 people without obstructive sleep apnea would be flagged as having it. That is probably not good enough to use the Apple Watch in taking care of patients at this time.
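To make these two numbers concrete, here is a minimal sketch of how they would play out in a group of people. The 90% sensitivity and 60% specificity come from the study as reported; the group size of 1,000 and the 10% rate of obstructive sleep apnea are assumptions chosen purely for illustration, not figures from the study, and the function name is hypothetical.

```python
def screening_counts(population, prevalence, sensitivity, specificity):
    """Split a group into correct and incorrect test results."""
    with_osa = round(population * prevalence)
    without_osa = population - with_osa
    true_positives = round(with_osa * sensitivity)      # have OSA, flagged
    true_negatives = round(without_osa * specificity)   # no OSA, not flagged
    return {
        "true_positives": true_positives,
        "false_negatives": with_osa - true_positives,   # have OSA, missed
        "true_negatives": true_negatives,
        "false_positives": without_osa - true_negatives,  # no OSA, flagged anyway
    }


# Assumed for illustration only: 1,000 people, 10% of whom have OSA.
counts = screening_counts(
    population=1000, prevalence=0.10, sensitivity=0.90, specificity=0.60
)
for label, n in counts.items():
    print(f"{label}: {n}")
```

Under these assumed numbers, the test would catch 90 of the 100 people with obstructive sleep apnea, but it would also flag 360 of the 900 people without it. The false alarms outnumber the true cases, which is why the specificity matters so much here.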
Here is one example why. Often I see patients in whom I am concerned about obstructive sleep apnea, but we get a sleep study that shows no obstructive sleep apnea. We may worry that the sleep study was not entirely accurate, meaning that they really do have obstructive sleep apnea, yet the sleep study did not provide accurate information on the night of the test. There are many reasons that this could occur. For example, someone may not sleep comfortably during the one night of their study, making their sleep lighter or more disrupted than a typical night. In such a situation, we might evaluate data from the Apple Watch or another wearable device to see if it matched the negative sleep study. We could always repeat a sleep study, but we might not if Apple Watch data suggested that obstructive sleep apnea was unlikely.
How might we use this study about the Apple Watch and sleep apnea?
The Apple Watch could be used for screening patients, diagnosing patients, or monitoring patients during treatment. For all of these uses, we might want to replace the current approach of sleep studies with something simpler and cheaper like analysis of Apple Watch data. Clearly this study is an important first step but not enough evidence to consider replacing sleep studies with the Apple Watch.
From my time as a faculty member at UCSF, I happen to know two of the study’s authors, Dr. Greg Marcus and Dr. Mark Pletcher, and respect the substantial, interesting work that was done. Because a specificity of 60% is not really good enough to make clinical decisions, I am sure the authors would agree that more research and refinement are necessary before this would ever be used for the care of patients. A wearable device like the Apple Watch may be most useful as a screening test, so future research should evaluate the positive and negative predictive values of data from the Apple Watch or other wearable devices like the Fitbit: that is, how often a positive or negative result turns out to be correct. I would propose the following:
- Work with the existing data to determine whether different Apple Watch data criteria could increase the specificity, even if it compromises the sensitivity. The goal would be to determine whether there is a best way to define a positive or negative Apple Watch test for sleep apnea.
- Perform studies to determine the positive and negative predictive values in various subgroups, for example those with and without loud snoring (which can also be detected by the Apple Watch). This would require performing sleep studies in everyone to see how the Apple Watch test for sleep apnea compared with sleep study results. This obviously is an expensive endeavor, and it would require substantial research funding.
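The reason subgroups matter is that predictive values depend on how common obstructive sleep apnea is in the group being tested, even when sensitivity and specificity stay the same. The sketch below applies the standard formulas for positive and negative predictive value to the study's reported 90% sensitivity and 60% specificity. The prevalence values are hypothetical, picked only to show the trend, and are not estimates for any real subgroup.

```python
def predictive_values(prevalence, sensitivity=0.90, specificity=0.60):
    """Return (PPV, NPV) for a test applied to a group with the given prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)  # chance a positive result is correct
    npv = true_neg / (true_neg + false_neg)  # chance a negative result is correct
    return ppv, npv


# Hypothetical prevalences, for illustration only.
for prevalence in (0.05, 0.10, 0.25, 0.50):
    ppv, npv = predictive_values(prevalence)
    print(f"prevalence {prevalence:.0%}: PPV {ppv:.1%}, NPV {npv:.1%}")
```

With these inputs, the positive predictive value rises as obstructive sleep apnea becomes more common in the group, while the negative predictive value falls. A result that is fairly reassuring in a low-risk group may mean something quite different in loud snorers, which is why these values would need to be measured against sleep studies in each subgroup.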
I am certainly not opposed to consumer devices or smartphone apps. In fact, I routinely refer patients to the SnoreLab smartphone app because it is a standardized method to track snoring. There are numerous consumer devices with unsubstantiated claims about benefits for sleep and sleep apnea. Most of them are pretty much useless, although the marketing departments would not want you to think so. I applaud the researchers and companies involved in this work, as they have taken the opposite approach of examining the potential of their approach before rushing to sell more devices on pipe dreams. I look forward to learning more about this work with the Apple Watch and the potential for wearables to streamline the evaluation and monitoring of patients with sleep disorders.