In this massively data-rich world, the equilibrium between information and knowledge has increasingly shifted from knowledge toward information. Advanced text and data mining (TDM) is not yet ubiquitous, and even if it were, not all content is structured enough to realize TDM's potential. In developing the supercomputer Watson, with its ability to process, analyze and extract information from natural language such as PLOS article text, IBM is beginning to shift the equilibrium back to knowledge.

Understanding Relationships

PLOS and IBM Watson are collaborating to bring quality Open Access biomedical literature to healthcare entrepreneurs and innovators, and to do so in a way that provides full article content and context including PubMed citation information from the National Library of Medicine.

The collaboration is “not just about PLOS or Open Access,” says PLOS Chief Technology Officer CJ Rayhill, “it’s about improved healthcare through immediate access to relevant clinical, translational and basic biomedical discoveries documented in the peer-reviewed literature.”

For the past year, IBM Watson has been ingesting PLOS article content directly from PubMed Central, beginning with PLOS Biology and PLOS Medicine. IBM Watson then uses this large collection of content in two ways, direct and indirect. In direct and immediate use, IBM Watson applies an algorithm to structured metadata and to concepts extracted from article text to obtain insights into relationships that might improve the way medicine is practiced in the clinic or hospital, at point of care. In indirect use, developers of software applications create programs that extract customized information and concepts from PLOS articles, or from any other large collection of text.

Accelerating Discovery

Watson’s ability to analyze massive amounts of information means it can keep up with advances reported in scientific journals, faster and better than any human brain. In the future, just as clinicians are beginning to use mobile apps to guide medical treatment decisions, scientists may use an app to help map their own research discovery pathways, based on the entire text of Open Access literature in PubMed Central. In this way, Open Access articles provide entrepreneurs with the reliable biomedical information they need to develop digital healthcare, translational medicine and even research breakthroughs. The conversation between PLOS and IBM Watson is one more example of PLOS accelerating research progress and transforming research communication through collaboration.

PLOS helped IBM Watson project leaders understand important differences in Open Access biomedical literature: not all Open Access is created equal. PLOS advised on training IBM Watson to understand research article content and recommended that, as IBM Watson moves to ingest additional scientific data from PubMed Central, key contextual information be maintained, including DOIs and links to cited literature. Importantly, through inclusion of Open Access articles, entrepreneurs will benefit from Watson's ability to provide a complete picture of research results as presented in an article, with those results kept in the context of the Open Access literature cross-referenced within the article. Because time and brainpower are limited, you and I might restrict our reading to an article’s abstract. Watson doesn’t have those constraints.

A collaboration between PLOS and IBM might seem out of place. But it’s important to appreciate, says Rayhill, that “information and insights contained in Open Access publications have value in commercial applications.” Those at the forefront of clinical care, biomedical research or policy development can access this knowledge and benefit from improved decision making.

Image Credit: parameter_bond, Flickr.com