Uber-Inspired Software Flags COVID-19 Variants Before They Explode

Data scientists took a tool originally developed for Uber and built a new prediction model to help make sense of emerging variants.

Until recently, there was no scientific way to predict which COVID-19 variants would be the most transmissible, and therefore no way to guide public policy as new strains emerged. For the past year and a half, public health experts have had to base their planning on simple observation, and to some extent, guesswork: which variants were becoming dominant in other regions or countries, and how soon could we expect to see them here?

Now, though, a collaborative team of data scientists, biologists, and infectious disease experts has applied machine learning advances originally designed, believe it or not, for the ride-sharing industry to this challenge. The result: a new tool that can actually predict the transmissibility of variants well ahead of time, accurately forecasting variant transmission patterns for the next one to two months. (Note: this tool and a description of its scientific validation have been posted as a preprint, which is a scientific paper that has not yet undergone the rigors of peer review.)

The tool would not have been possible without an unusual pairing. In the summer of 2020, data scientists who had previously worked for Uber joined one of the world’s leading genomic institutes, teaming up with scientists dedicated to fighting the COVID-19 pandemic. Last year, the Broad Institute (in this case, Broad rhymes with “rode”) in Cambridge, Mass., quickly converted some of its industrial-scale genomics lab capacity into a pandemic testing facility. In addition to determining whether samples were positive or negative for the SARS-CoV-2 virus, the team also sequenced tens of thousands of viral genomes.

Laboratories around the world are also contributing to the database of viral genomes; the GISAID repository has received 3.7 million submissions. That’s a wealth of data, but comparing every genome against every other across so many sequences is prohibitively costly in computational terms.

At the Broad, scientists wanted to do more with this data, and they had just the team to make it happen — three data scientists recruited from Uber’s AI team who had created a machine learning tool called Pyro to help customize models of traffic patterns and other elements for cities or regions. The tool was particularly good for building new models that contained many uncertain variables. When it was publicly released by Uber as an open-source platform, it got a surprising amount of uptake in the life science community, where it could be used for probabilistic modeling of biological experiments. “It’s actually more useful for science than it is for a ride-sharing company,” says Fritz Obermeyer, one of the developers who formerly worked at Uber.

At the Broad, Obermeyer and his colleagues quickly took up the challenge of mining the millions of available SARS-CoV-2 genomes to try to forecast the transmissibility of new variants. Rather than comparing every genome to every other genome, they streamlined the process by analyzing clusters of closely related variants. Their preprint describes the analysis of 2.1 million genomes, clustered into nearly 1,300 lineages representing more than 1,000 different regions around the world.
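To get a feel for why clustering helps, here is a back-of-the-envelope sketch in Python. It assumes, purely for illustration, that the cost is driven by the number of pairwise comparisons, n(n-1)/2; the preprint's actual pipeline may count its work differently, and the function name is hypothetical.

```python
def pairwise_comparisons(n: int) -> int:
    """Number of unordered pairs among n items: n * (n - 1) / 2."""
    return n * (n - 1) // 2

# Figures quoted in the article (the lineage count is "nearly 1,300",
# so 1,300 is a rounded stand-in).
genomes = 2_100_000
lineages = 1_300

all_pairs = pairwise_comparisons(genomes)
lineage_pairs = pairwise_comparisons(lineages)

print(f"genome pairs:  {all_pairs:,}")
print(f"lineage pairs: {lineage_pairs:,}")
print(f"reduction:     {all_pairs / lineage_pairs:,.0f}x")
```

Under that assumption, working with lineages instead of individual genomes cuts the number of pairs from trillions to under a million.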

The machine learning tool they built is based on the original Pyro framework — this one is called PyR₀, a play on the R₀ metric used to assess disease transmissibility. It models variant patterns based on specific mutations in the viral genome. “The predictive capability of this model relies on the repeated emergence of the same mutation in different strains independently,” Obermeyer says. “That allows us to predict the growth rate of a particular strain based on the new mutations it has acquired.”
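The idea in Obermeyer's quote can be sketched in a few lines of Python. This is a simplified illustration, not the PyR₀ model itself: it assumes each mutation contributes an additive effect to a strain's log growth rate, and the mutation names and effect sizes below are made up.

```python
import math

# Hypothetical per-mutation effects on log growth rate
# (illustrative numbers only, not values from the preprint).
MUTATION_EFFECTS = {"mut_A": 0.30, "mut_B": 0.10, "mut_C": -0.05}

def relative_fitness(mutations, effects=MUTATION_EFFECTS):
    """Growth rate relative to a baseline strain with none of the mutations.

    Assumes effects add on the log scale; mutations with no known
    effect are treated as neutral (contribution of 0).
    """
    log_growth = sum(effects.get(m, 0.0) for m in mutations)
    return math.exp(log_growth)

# A new strain carrying two mutations already seen in other strains.
new_strain = {"mut_A", "mut_B"}
print(f"relative fitness: {relative_fitness(new_strain):.3f}")
```

Because the effects are attached to mutations rather than to strains, a strain that has never been observed before can still be scored as long as its mutations have appeared elsewhere.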

While the model relies on mutations that have been seen before, one of its most important features is that it does not need to know what any given mutation does. Typically, scientists seeking to assess transmissibility of a variant have to perform a series of lab experiments to tease out the precise function of each new mutation. For Obermeyer’s tool, those time-consuming functional tests aren’t necessary for forecasting. The model has access to all of the mutations from genome sequence data, and can infer from the data which ones are associated with increased transmissibility. That is a huge leap in capability for epidemiological researchers focusing on the COVID-19 pandemic.
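How a model might learn which mutations matter without knowing what they do can be shown with a toy example. The sketch below swaps in plain ridge-regularized least squares, fitted by gradient descent, for whatever inference the actual tool uses; the lineages, growth rates, and function name are all synthetic and hypothetical.

```python
def fit_mutation_effects(lineages, growth_rates, n_mutations,
                         ridge=1e-3, lr=0.5, steps=2000):
    """Estimate one additive effect per mutation from lineage growth rates.

    lineages: list of sets of mutation indices carried by each lineage.
    growth_rates: observed (log) growth rate of each lineage.
    """
    weights = [0.0] * n_mutations
    n = len(lineages)
    for _ in range(steps):
        # Gradient of mean squared error plus a small ridge penalty.
        grad = [ridge * w for w in weights]
        for muts, observed in zip(lineages, growth_rates):
            error = sum(weights[j] for j in muts) - observed
            for j in muts:
                grad[j] += error / n
        weights = [w - lr * g for w, g in zip(weights, grad)]
    return weights

# Synthetic data: mutation 0 speeds growth, 1 is neutral, 2 slows it.
TRUE_EFFECTS = [0.3, 0.0, -0.1]
# Each mutation shows up in several lineages, alone and in combination.
LINEAGES = [{0}, {1}, {0, 1}, {2}, {0, 2}, {1, 2}]
GROWTH = [sum(TRUE_EFFECTS[j] for j in muts) for muts in LINEAGES]

estimated = fit_mutation_effects(LINEAGES, GROWTH, n_mutations=3)
print([round(w, 2) for w in estimated])
```

The fit never sees a description of any mutation, only which lineages carry it and how fast those lineages grew. It works here because each mutation recurs across several lineages, which is the same property the article says the real model depends on.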

According to Bronwyn MacInnis, an infectious disease scientist at the Broad who described this work in a presentation at the recent AGBT Precision Health conference, the PyR₀ tool accurately predicted both the explosive growth of the Delta variant and the relatively minor emergence of the Mu variant (originally detected in Colombia earlier this year), long before conventional scientific approaches could have. Using genomic data for epidemiology and infectious disease has “really come of age” in the pandemic, she said. But genomic tools were not built for this kind of use. “The field really needs some great and quick innovation to keep up with the data,” she added, pointing to the former Uber team’s work as a great example.

Obermeyer points out that the model only works as well as it does because it has access to such an enormous trove of genomic data collected around the world. “It’s really important to be able to share observations [of mutations] across countries and across cities,” he says.

Now that the tool is available, public health experts have one more arrow in the quiver to help guide the pandemic response. Mask mandates, indoor capacity limits, and other measures can all be used in a more targeted manner if we can predict the likelihood of the spread of specific new COVID-19 variants. “As soon as we see that there’s a more highly transmissible strain in a particular region, then we [can] react to that by changing these intervention measures,” Obermeyer says.

About the Author

By Meredith Salisbury
