MDNews - San Antonio

October 2011

Issue link: http://viewer.e-digitaledition.com/i/46995

Q&A: AN EXCLUSIVE INTERVIEW
With Dave Ferrucci, Team Leader, Watson Project
PART ONE OF FOUR

Q. Was the rationale behind the Watson project grounded in transforming health care at its initial stage?

A. No. IBM didn't invest in building Watson as a way to generate a revenue stream by winning at "Jeopardy!," but two things were obvious to us early on. One was that huge amounts of information out there are captured in natural language content: documents, news, blogs, textbooks, reference books, encyclopedias, dictionaries, and a huge wealth of knowledge included in journal papers, abstracts, research papers. There's just an enormous amount of information out there that could impact decision-making, whether it be in health care, business intelligence, finance or technical support, when we talk about guidelines and user support manuals. It's not carefully encoded in perfect database tables with exact, precise queries. How do you get at it?

This is the technology we knew that, if we pursued "Jeopardy!," we would be tackling, we would be exploring and we would be advancing, because that's the kind of technology that was really needed to solve the "Jeopardy!" problem. The expectation was that we would do this in a general enough way that we could then take that technology, with that vision in mind (that we would do a better job analyzing and extracting valuable knowledge from unstructured content), and that we were going to take what we did with "Jeopardy!" and apply it to these other domains.

Q. How would the volume of data Watson handled for "Jeopardy!" compare to the volume of data necessary for its application in the medical field?

A. If you restrict yourself to text data, there are millions and millions of books and an enormous amount of imagination. Imagine having access to that and deeper analysis of it. With "Jeopardy!," you see question in, answer out. We don't really want that for applications.
When we talk about decision support, informing professionals and helping them get access to the information they need, just delivering an answer is not sufficient. In "Jeopardy!," you saw an answer panel that showed the top three answers and the confidence associated with those answers. That confidence is coming from a deeper analysis of the unstructured content, what we call an evidence profile.

The way to think about Watson is that when that "Jeopardy!" clue went in, think of that as a bunch of facts about an unknown entity, whether it be a treatment, a disease or a test. What it's doing is saying, 'This looks like a question or a clue to you, but really it's a case, a body of information.' Even in "Jeopardy!," we considered not just the clue, but also the surrounding clues in that category and the category title. We extracted from that a bunch of things that are true about the unknown entity and the answer to the question, and then it generated many possible answers, for example, possible diagnoses or possible treatments.

Watson, powered by IBM POWER7, is a workload-optimized system that can answer questions posed in natural language over a nearly unlimited range of knowledge.

Q. For physicians who currently use some amount of data in their treatment of patients, and for individuals responsible for providing that information to physicians, what would be more compelling about Watson compared to the data systems they're using today?

A. First of all, I think it's the flexibility of the way the system works. In other words, it could start using whatever information humans have been creating and find value in this case without the typical investment that goes into collecting that information. If you think about it, there are two kinds of applications out there. There are text-based applications, such as Web search.
I can just give it any information you want: you just throw it in there, and no one analyzes it, no humans, just whatever it is and whoever wrote it. But what you're getting is very broad coverage. You can look at all of that data because it didn't need a whole lot of human preparation or organization in the databases.
