The DeepQA Research Team Publications

2015



Automated Problem List Generation from Electronic Medical Records in IBM Watson

Murthy Devarakonda, Ching-Huei Tsou

Innovative Applications of Artificial Intelligence (IAAI-15), 2015

Decision Making in IBM Watson Question Answering

J. William Murdock

Web presentation: Ontology Summit 2015



Unsupervised Entity-Relation Analysis in IBM Watson

Aditya Kalyanpur, J William Murdock

Proceedings of the Third Annual Conference on Advances in Cognitive Systems (ACS 2015)

Abstract: Text paraphrasing algorithms play a fundamental role in several NLP applications such as automated question answering (QA), summarization and machine translation. We propose a novel paraphrasing approach based on an entity-relation (ER) analysis of text. The algorithm uses a combination of deep linguistic analysis (part of speech, dependency parse information) and background resources (NGram, PRISMATIC KB, domain dictionaries) to detect and match entities and relations. We evaluate the ER approach in a QA setting by adding it to the suite of passage scoring algorithms in IBM Watson, a state-of-the-art question answering system. We show a statistically significant improvement in the ability of IBM Watson to identify justifying passages.





Commonsense Reasoning: An Event Calculus Based Approach

E T Mueller

Morgan Kaufmann/Elsevier, 2015

2014



Problem-oriented patient record summary: An early report on a Watson application

M. Devarakonda, Dongyang Zhang, Ching-Huei Tsou, M. Bornea

e-Health Networking, Applications and Services (Healthcom), 2014 IEEE 16th International Conference on, pp. 281-286

Medical Relation Extraction with Manifold Models

Chang Wang and James Fan

The 52nd Annual Meeting of the Association for Computational Linguistics (ACL 2014), pp. 828-838

Abstract: In this paper, we present a manifold model for medical relation extraction. Our model is built upon a medical corpus containing 80M sentences (11 gigabyte text) and designed to accurately and efficiently detect the key medical relations that can facilitate clinical decision making. Our approach integrates domain specific parsing and typing systems, and can utilize labeled as well as unlabeled examples. To provide users with more flexibility, we also take label weight into consideration. Effectiveness of our model is demonstrated both theoretically with a proof to show that the solution is a closed-form solution and experimentally with positive results in experiments.



2013

Parallel and Nested Decomposition for Factoid Questions

B. Boguraev, S. Patwardhan, A. Kalyanpur, J. Chu-Carroll, A. Lally

Natural Language Engineering, 2013





Computational models of narrative

E T Mueller

Sprache und Datenverarbeitung (International Journal for Language Data Processing) 1/2, 2013

Abstract: This article provides an overview of work on computational modeling of narrative, with a focus on narrative understanding. It reviews representations for narrative, systems for narrative understanding, and systems for narrative generation. It proposes a major project for narrative understanding and specifies five problems that the project must address: efficiency of reasoning, effective model finding, representation of knowledge for narrative, acquisition of knowledge for narrative, and acquisition of annotated training data.

Tools and Methods for Building Watson

Eric Brown, Eddie Epstein, J William Murdock, Tong-Haing Fin

IBM Research Report RC25356, 2013

Abstract: The DeepQA team built the Watson QA system for Jeopardy! in under four years by adopting a metrics-driven research and development methodology. This methodology relies heavily on disciplined integration of new and improved components, extensive experimentation at the component and end-to-end system level, and informative error analysis. To support this methodology, we adopted a formal protocol for integrating components into the overall system and running end-to-end integration tests, assembled a powerful computing environment for running a large volume of high-throughput experiments, and built several tools for deploying experiments and evaluating results. We describe our software development and integration protocol, the DeepQA computing environment for development, and the tools we built and used to create Watson for Jeopardy!. We also briefly allude to some of the more recent enhancements to these tools and methods as we extend Watson to operate in commercial applications.





Distant Supervision for Relation Extraction with an Incomplete Knowledge Base

Bonan Min, Ralph Grishman, Li Wan, Chang Wang, David Gondek

The 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL 2013)



Multiscale Manifold Learning

Chang Wang and Sridhar Mahadevan

The 27th AAAI Conference on Artificial Intelligence (AAAI 2013)



Manifold Alignment Preserving Global Geometry

Chang Wang and Sridhar Mahadevan

The 23rd International Joint Conference on Artificial Intelligence (IJCAI 2013)

2012



Question answering in natural language narratives using symbolic probabilistic reasoning

H Hajishirzi, E T Mueller

Proceedings of the Twenty-Fifth International Florida Artificial Intelligence Research Society Conference, 2012

Question analysis: How Watson reads a clue

A. Lally, J. M. Prager, M. C. McCord, B. K. Boguraev, S. Patwardhan, J. Fan, P. Fodor, J. Chu-Carroll

IBM Journal of Research and Development 56(3/4), 2--1, IBM, 2012

When Did that Happen? -- Linking Events and Relations to Timestamps

D. Hovy, J. Fan, A. Gliozzo, S. Patwardhan, C. Welty

Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics, 2012



Labeling by Landscaping: Classifying Tokens in Context by Pruning and Decorating Trees

S. Patwardhan, B. Boguraev, A. Agarwal, A. Moschitti, J. Chu-Carroll

Proceedings of CIKM '12: International Conference on Information and Knowledge Management, 2012



A framework for merging and ranking of answers in DeepQA

DC Gondek, A. Lally, A. Kalyanpur, JW Murdock, P. Duboue, L. Zhang, Y. Pan, ZM Qiu, C. Welty

IBM Journal of Research and Development 56(3/4), 14, 2012



Relation extraction and scoring in DeepQA

C. Wang, A. Kalyanpur, J. Fan, BK Boguraev, DC Gondek

IBM Journal of Research and Development 56(3/4), 9, 2012



Automatic knowledge extraction from documents

J. Fan, A. Kalyanpur, DC Gondek, DA Ferrucci

IBM Journal of Research and Development 56(3/4), 5, 2012



Typing candidate answers using type coercion

JW Murdock, A Kalyanpur, C Welty, J Fan, DA Ferrucci, DC Gondek, L Zhang, H Kanayama

IBM Journal of Research and Development 56(3/4), 7:1 - 7:13, IBM, 2012

Abstract: Many questions explicitly indicate the type of answer required. One popular approach to answering those questions is to develop recognizers to identify instances of common answer types (e.g., countries, animals, and food) and consider only answers on those lists. Such a strategy is poorly suited to answering questions from the Jeopardy! television quiz show. Jeopardy! questions have an extremely broad range of types of answers, and the most frequently occurring types cover only a small fraction of all answers. We present an alternative approach to dealing with answer types. We generate candidate answers without regard to type, and for each candidate, we employ a variety of sources and strategies to judge whether the candidate has the desired type. These sources and strategies provide a set of type coercion scores for each candidate answer. We use these scores to give preference to answers with more evidence of having the right type. Our question-answering system is significantly more accurate with type coercion than it is without type coercion; these components have a combined impact of nearly 5% on the accuracy of the IBM Watson question-answering system.



Deep parsing in Watson

MC McCord, JW Murdock, BK Boguraev

IBM Journal of Research and Development 56(3/4), 3:1 - 3:15, 2012



Special Questions and Techniques

J. Prager, E. Brown, J. Chu-Carroll

IBM Journal of Research and Development 56(3/4), 11--1, IBM, 2012

Finding needles in the haystack: Search and candidate generation

J. Chu-Carroll, J. Fan, BK Boguraev, D. Carmel, D. Sheinwald, C. Welty

IBM Journal of Research and Development 56(3/4), 2012

Textual resource acquisition and engineering

J Chu-Carroll, J Fan, N Schlaefer, W. Zadrozny

IBM Journal of Research and Development 56(3/4), 2012

Textual evidence gathering and analysis

J. W. Murdock, J. Fan, A. Lally, H. Shima, B. K. Boguraev

IBM Journal of Research and Development 56(3/4), 8:1 - 8:14, 2012

Abstract: One useful source of evidence for evaluating a candidate answer to a question is a passage that contains the candidate answer and is relevant to the question. In the DeepQA pipeline, we retrieve passages using a novel technique that we call Supporting Evidence Retrieval, in which we perform separate search queries for each candidate answer, in parallel, and include the candidate answer as part of the query. We then score these passages using an assortment of algorithms that use different aspects and relationships of the terms in the question and passage. We provide evidence that our mechanisms for obtaining and scoring passages have a substantial impact on the ability of our question-answering system to answer questions and judge the confidence of the answers.

Structured data and inference in DeepQA

A. Kalyanpur, B. K. Boguraev, S. Patwardhan, J. W. Murdock, A. Lally, C. Welty, J. M. Prager, B. Coppola, A. Fokoue-Nkoutche, L. Zhang, Y. Pan, Z. M. Qiu

IBM Journal of Research and Development 56(3/4), 2012



Fact-based question decomposition in DeepQA

A. Kalyanpur, S. Patwardhan, BK Boguraev, A. Lally, J. Chu-Carroll

IBM Journal of Research and Development 56(3/4), 13, 2012



2011

Jointly Learning Data-Dependent Label and Locality-Preserving Projections

Chang Wang and Sridhar Mahadevan

The 22nd International Joint Conference on Artificial Intelligence (IJCAI 2011)



Heterogeneous Domain Adaptation using Manifold Alignment

Chang Wang and Sridhar Mahadevan

The 22nd International Joint Conference on Artificial Intelligence (IJCAI 2011)



Relevance Feedback Exploiting Query-Specific Document Manifolds

Chang Wang, Emine Yilmaz, and Martin Szummer

The 20th ACM Conference on Information and Knowledge Management (CIKM2011)



Manifold Alignment

Chang Wang, Peter Krafft, and Sridhar Mahadevan

Manifold Learning: Theory and Applications, Taylor and Francis CRC Press, 2011





Reasoning about RoboCup soccer narratives

H Hajishirzi, E Amir, E T Mueller, J Hockenmaier

Proceedings of the Twenty-Seventh Conference on Uncertainty in Artificial Intelligence, AUAI Press, 2011



Symbolic probabilistic reasoning for narratives

H Hajishirzi, E T Mueller

Logical Formalizations of Commonsense Reasoning: Papers from the 2011 AAAI Spring Symposium, AAAI Press

Improving Recall on Conjunctive Queries against Text Documents using Query-Driven Hypothesis Generation

K Barker, J Fan, C Welty

Learning by Reading Workshop at IJCAI, 2011



Mining Knowledge from Large Corpora for Type Coercion in Question Answering

James Fan, Aditya Kalyanpur, J. William Murdock and Branimir K. Boguraev

Web Scale Knowledge Extraction (WEKEX) Workshop at International Semantic Web Conference, 2011





Statistical source expansion for question answering

Nico Schlaefer, Jennifer Chu-Carroll, Eric Nyberg, James Fan, Wlodek Zadrozny, David Ferrucci

CIKM '11 Proceedings of the 20th ACM international conference on Information and knowledge management, 2011

Fact-Based Question Decomposition for Candidate Answer Re-Ranking

Aditya Kalyanpur, Siddharth Patwardhan, Branimir Boguraev, Adam Lally, and Jennifer Chu-Carroll

Proceedings of the ACM Conference on Information and Knowledge Management (CIKM), 2011, pp. 2045--2048



Relation Extraction with Relation Topics

Chang Wang, James Fan, Aditya Kalyanpur, and David Gondek

The 2011 Conference on Empirical Methods in Natural Language Processing (EMNLP 2011)



Leveraging Wikipedia Characteristics for Search and Candidate Generation in Question Answering

J Chu-Carroll, J Fan

Twenty-Fifth AAAI Conference on Artificial Intelligence, 2011



Using Syntactic and Semantic Structural Kernels for Classifying Definition Questions in Jeopardy!

A Moschitti, J Chu-Carroll, S Patwardhan, J Fan, G Riccardi

Proceedings of the Conference on Empirical Methods for Natural Language Processing, pp. 73--76, 2011



Leveraging Community-built Knowledge for Type Coercion in Question Answering

Aditya Kalyanpur, J William Murdock, James Fan, Chris Welty

International Semantic Web Conference (ISWC 2011). Winner of the Best Paper Award (In-Use Track), pp. 144--156, Springer



Structure Mapping for Jeopardy! Clues

J William Murdock

19th International Conference on Case Based Reasoning (ICCBR'11), pp. 6-10, Springer-Verlag, 2011



2010

Computational models of narrative: Papers from the 2010 AAAI Fall Symposium, Technical Report FS-10-04

M Finlayson, P Gervas, E T Mueller, S Narayanan, P Winston

AAAI Press, 2010



Learning to Predict Readability using Diverse Linguistic Features

Rohit Kate, Xiaoqiang Luo, Siddharth Patwardhan, Martin Franz, Radu Florian, Raymond Mooney, Salim Roukos, Chris Welty

Proceedings of the 23rd International Conference on Computational Linguistics, pp. 546--554, 2010



Prismatic: Inducing knowledge from a large scale lexicalized relation resource

J Fan, D Ferrucci, D Gondek, A Kalyanpur

Proceedings of the NAACL HLT 2010 First International Workshop on Formalisms and Methodology for Learning by Reading, pp. 122--127



Building Watson: An overview of the DeepQA project

D. Ferrucci, E. Brown, J. Chu-Carroll, J. Fan, D. Gondek, A.A. Kalyanpur, A. Lally, J.W. Murdock, E. Nyberg, J. Prager, et al.

AI Magazine 31(3), 59--79, American Association for Artificial Intelligence, 2010



2009

Automating commonsense reasoning using the event calculus

E T Mueller

Communications of the ACM 52(1), 113--117, 2009

Abstract: Commonsense reasoning is the human ability to make inferences about properties and events in the everyday world. The automation of commonsense reasoning, long a goal of the field of artificial intelligence and an area of active research in the last decade, is attaining a level of maturity. Automating commonsense reasoning allows us to build applications that are more user-friendly and more understanding of the world. Several major computational approaches to commonsense reasoning have been explored. Analogical processing implements the notion that people reason about novel situations by analogy to familiar ones. Probability theory allows us to reason given uncertain knowledge of the state of the world and how the world works. Qualitative reasoning focuses on reasoning about physical systems. Methods based on natural language make use of large textual corpora of commonsense knowledge. Society of mind approaches stress the use of multiple interacting methods and representations. One approach that has achieved a high degree of success because of its steadfast focus on hard benchmark problems of commonsense reasoning, is logic. One logic-based formalism that stands out as both comprehensive and easy to use is the event calculus.
