When a company like Yahoo buys a web widget company for a few tens of millions, nobody usually pays much attention. This week, however, Yahoo’s purchase of Summly is making international headlines, but for all the wrong reasons—reasons that entirely miss why Summly is exciting.
Most of the stories focus on the fact that Summly’s CEO, Nick D’Aloisio, is 17 years old, and sold the company for as much as $30 million. Other than stirring feelings of tremendous inadequacy in most of us, that story will get boring in a few days.
And then there are the descriptions of what Summly does: It takes long news stories and squishes them into 400-character summaries for cell phone screens, because, supposedly, nobody wants to read long text on cell phones. This application, though, is a Trojan horse. For one thing, the summaries can be inaccurate. For another, consumers are increasingly showing they’ll read whole stinkin’ books on a phone. If this is all Summly is about, Yahoo is wasting its money.

A Summly article for iPhone 5

But look more closely at the technological ambition behind Summly, provided in part by D’Aloisio, who seems to be a uniquely adept programmer, and the company’s partner, the scientific research firm SRI International. The heart of Summly today is still-crude natural language processing technology that “reads” a news story and attempts to understand it the way a human might—and then generate a summary based on that understanding. This is not just extraction—which looks for key words as clues to topics and meanings—but an attempt at abstraction, which is harder. Abstraction uses software to first understand, then summarize.
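To make the distinction concrete, here is a minimal sketch of the simpler approach, extraction, in Python. It is not Summly's code, and every name in it (`extractive_summary`, `STOP_WORDS`, and so on) is hypothetical. It scores each sentence by how often its key words appear in the whole story, then copies the top-scoring sentences verbatim until it hits the 400-character limit mentioned above. Nothing here "understands" the story, which is exactly the gap abstraction is meant to close.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; real systems use far larger ones).
STOP_WORDS = {
    "the", "a", "an", "and", "of", "to", "in", "is", "it", "for",
    "on", "that", "this", "with", "as", "by", "at", "was", "are",
}


def split_sentences(text):
    """Naively split text into sentences at '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def keywords(sentence):
    """Lowercase the sentence and return its words minus the stop words."""
    return [w for w in re.findall(r"[a-z']+", sentence.lower()) if w not in STOP_WORDS]


def extractive_summary(text, max_chars=400):
    """Pick the highest-scoring sentences that fit in max_chars, in original order."""
    sentences = split_sentences(text)
    # Key-word frequency over the whole story serves as a crude proxy for topic.
    freq = Counter(w for s in sentences for w in keywords(s))

    def score(sentence):
        words = keywords(sentence)
        # Average key-word frequency, so long sentences are not favored by length alone.
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)

    chosen, used = [], 0
    for i in ranked:
        cost = len(sentences[i]) + (1 if chosen else 0)  # +1 for the joining space
        if used + cost <= max_chars:
            chosen.append(i)
            used += cost

    # Sentences are copied verbatim; nothing is rephrased or "understood".
    return " ".join(sentences[i] for i in sorted(chosen))
```

Calling `extractive_summary(story_text)` returns a string of at most 400 characters made up entirely of sentences lifted from the story. An abstractive system would instead have to build some internal representation of what the story says and write new sentences from it.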
The undisputed champion of abstraction right now is IBM’s Watson computer—the one that beat everybody on Jeopardy!, and is now working with doctors at the Cleveland Clinic. Watson was a whole room of computer processors that would cost tens of millions of dollars. To win on Jeopardy!, Watson couldn’t just search for keywords. The questions were too complex. (One clue on that show: “In the 2004 opening ceremonies a sole member of this team opened the Parade of Nations; the rest of his team closed it.” Answer: “What is Greece?”) Instead, Watson was fed millions of pages of text from books, websites, song lyrics, newspapers, academic journals and more. It read those pages, categorized them, abstracted core bits for quick reference, and matched up information so it could form a higher-order concept when asked a question.
Summly is to Watson what a Hot Wheels toy car is to a real Indy race car. But still, there’s a link. The kind of thing Summly can do on a cell phone over the network will get more and more powerful—more and more Watson-like. The kind of thing Watson can do will, similarly, get packaged in ever smaller, cheaper, easier services, until it’s available on a cell phone over the network.
And then what do you have? Certainly, in the nearer term, a service that could search the world’s information much more intelligently than today’s search engines, and deliver accurate summaries linked to the original material. This has all kinds of implications for the media business, not all of them bad. It raises the prospect of keeping long-form content behind paywalls, while allowing abstraction software to deliver summaries to users. Want the full content? Click and buy it.
But well beyond that, the technology promises to become your personal research assistant. Finally, we will enter the era not of “search,” but of “find.” You’ll be able to speak or type a complex question into your cell phone: “What teenagers under 18 have sold a company for more than $1 million in the past 20 years?” The software will actually understand what you’re asking, and will understand whether the documents it’s looking at can answer the question. It could then deliver a report that attempts to answer the question, with links back to original material—much as a human research assistant might do.
Siri got us a little way toward that, but only a very little way; it is more gimmick than real natural language cognitive abstraction. Oh, and Google’s results for that question about teenage entrepreneurs? The search delivered a link to the Wikipedia page of a 60-year-old Texas billionaire, Andy Beal, and a Forbes story, “How to make a million before you turn 20.” Some future version of a Summly/Watson service will do much better.