When wearable devices first came to market, consumers were eager for a fun new way to track their fitness activity — while scientists immediately saw the opportunity to mine potentially game-changing data sets containing information relevant to human health. These products were rapidly adopted by millions, far surpassing the numbers of people who ever used clinically regulated products to measure similar activity.

Since then, scientists have found a number of ways to incorporate wearable device data into their studies. Some have built phone apps to recruit participants and deliver health-related surveys for consumers to fill out. Others have tapped into frameworks such as Apple’s ResearchKit to mine fitness and other information collected by users and shared with their consent.

Across these projects, a common challenge has emerged: the relatively low quality of data generated by these wearable devices. Unlike clinical devices that are regulated by the U.S. Food and Drug Administration and are extensively tested to ensure robust and reproducible results, consumer-oriented wearable devices or phone apps produce data that is messy.



If you’ve ever used a fitness tracker and found that it seriously undercounted or overcounted steps, or that its calculations of calories burned were way off, you’ve witnessed the problem. These products were not designed to meet the standards of rigorous research studies, even when they track important health traits such as heart activity. Their data often buries real signals amid mountains of meaningless noise.

That’s why a recent advance from scientists at Yale University is such a big deal. They developed a new AI-powered approach that filters wearable data, stripping out the noise while preserving that all-important signal. In a study published in a Nature-affiliated research journal, they used the algorithm to clean up the kind of electrocardiogram (ECG) data collected by phones and other devices. The concept could point the way to cleaning up other types of wearable data in the future.

In this project, scientists focused on the data needed to detect left ventricular systolic dysfunction, a risk marker for heart failure that can be captured in an ECG. They knew that AI tools could be used to identify relevant warning signs in that data from the reliable ECG machines used in hospitals and other clinical settings, but were curious about whether a similar approach could be adapted to less reliable data from wearable devices or smartphones.

They began by training an AI model on noisy data: the type of ECG information that is generated by wearable and other consumer devices. (Because there isn’t a widely available data set of wearable ECGs, the researchers had to use clinical ECG data; they selected a small subset of the data that’s representative of the information generated by wearables.)
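The paper does not give implementation details, but the general technique of corrupting clean clinical ECGs with wearable-style artifacts can be sketched as follows. This is a minimal illustration, not the authors' method; the function name, noise amplitudes, and the two artifact types chosen (baseline wander and broadband muscle/motion noise) are assumptions for the example.

```python
import numpy as np

def add_wearable_noise(ecg, fs=500, rng=None):
    """Corrupt a clean clinical ECG trace with wearable-style artifacts.

    ecg : 1-D array of voltage samples
    fs  : sampling rate in Hz
    Adds baseline wander (slow drift) and broadband muscle/motion
    noise, two artifacts common in consumer-device recordings.
    """
    rng = np.random.default_rng(rng)
    t = np.arange(len(ecg)) / fs
    # Baseline wander: slow sinusoidal drift (~0.3 Hz, random phase)
    wander = 0.1 * np.sin(2 * np.pi * 0.3 * t + rng.uniform(0, 2 * np.pi))
    # Muscle/motion artifact: white noise scaled to the signal amplitude
    emg = 0.05 * np.std(ecg) * rng.standard_normal(len(ecg))
    return ecg + wander + emg

# Example: corrupt a synthetic 10-second stand-in for a real trace
fs = 500
t = np.arange(10 * fs) / fs
clean = np.sin(2 * np.pi * 1.2 * t)
noisy = add_wearable_noise(clean, fs, rng=0)
print(noisy.shape)  # (5000,)
```

A model trained on pairs like `(noisy, label)` learns to look past the artifacts, which is the core idea behind a noise-adapted detector.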

The training data set included more than 385,000 ECGs, of which nearly 57,000 came from patients with left ventricular systolic dysfunction. Importantly, the data represented a diversity of patient ancestries. The AI model was then tested on a wide range of ECG recordings to evaluate its performance, including examples designed to contain much more noise than signal.

The outcome was a noise-adapted AI algorithm that accurately detects left ventricular systolic dysfunction from messy ECG data akin to that produced by wearables or smartphone apps. The approach proved robust, correctly ignoring sources of noise even when those particular noises had not been included in the training data. It also worked consistently across ages, sexes, and ethnicities.
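Checking that a detector holds up under noise sources it never saw in training can be framed as a simple stress test: re-score the model on the same records under each corruption and compare against the clean baseline. The sketch below is illustrative only; the function names, the toy "model," and the accuracy metric are assumptions, not the study's evaluation protocol.

```python
import numpy as np

def evaluate_under_noise(predict, ecgs, labels, noise_fns):
    """Re-score a binary detector under each noise condition.

    predict   : function mapping a batch of ECGs to scores in [0, 1]
    ecgs      : array of shape (n_records, n_samples)
    labels    : binary ground truth (1 = dysfunction present)
    noise_fns : dict of name -> function that corrupts a batch
    Returns accuracy per condition, including a clean baseline.
    """
    results = {}
    conditions = {"clean": lambda x: x, **noise_fns}
    for name, corrupt in conditions.items():
        preds = (predict(corrupt(ecgs)) >= 0.5).astype(int)
        results[name] = float((preds == labels).mean())
    return results

# Toy usage with a threshold "model" standing in for the trained network
rng = np.random.default_rng(0)
ecgs = rng.standard_normal((20, 100))
labels = (ecgs.mean(axis=1) > 0).astype(int)
predict = lambda x: (x.mean(axis=1) > 0).astype(float)
noise = {"gaussian": lambda x: x + 0.01 * rng.standard_normal(x.shape)}
scores = evaluate_under_noise(predict, ecgs, labels, noise)
print(scores["clean"])  # 1.0
```

Holding accuracy steady across held-out noise conditions is what the article means by the approach "correctly ignoring sources of noise" not present in training.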

Ultimately, the scientists hope this approach could allow for more accurate ECGs outside the clinic, enabling better detection of cardiac risk in low-resource settings or anywhere that lacks hospital-grade instrumentation. “Wearable devices [also] allow for community-wide screening, an important next step in the early detection of common and rare cardiomyopathies,” the team reports in their publication.