Built-in data provenance

For advanced users, we offer optional data provenance through our Data Provenance Pipeline (DPP). The DPP rigorously controls data lineage, so you always know exactly where your data originated and how it was processed. This is a must-have for bioscience and financial institutions that rely on our datasets to make billion-dollar decisions every day.
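To make the idea of data lineage concrete, here is a minimal sketch of a lineage log in Python. It is not the DPP's actual implementation or API. The names `LineageLog`, `LineageStep`, `fingerprint`, and the `example-feed` source label are hypothetical, chosen only for illustration. Each processing step stores a SHA-256 hash of its output together with the hash of the step before it, so a record can be traced back to its origin and a broken chain can be detected.

```python
import hashlib
from dataclasses import dataclass, field
from typing import List, Optional


def fingerprint(data: bytes) -> str:
    """Return a SHA-256 hex digest that identifies this exact content."""
    return hashlib.sha256(data).hexdigest()


@dataclass(frozen=True)
class LineageStep:
    operation: str              # what was done, e.g. "ingest" or "normalize"
    output_hash: str            # fingerprint of the data after this step
    parent_hash: Optional[str]  # fingerprint of the previous step's output


@dataclass
class LineageLog:
    source: str                 # hypothetical label for where the data came from
    steps: List[LineageStep] = field(default_factory=list)

    def record(self, operation: str, output: bytes) -> str:
        """Append a step that links back to the previous step's output."""
        parent = self.steps[-1].output_hash if self.steps else None
        step = LineageStep(operation, fingerprint(output), parent)
        self.steps.append(step)
        return step.output_hash

    def verify(self) -> bool:
        """Check that every step points at the output of the step before it."""
        previous = None
        for step in self.steps:
            if step.parent_hash != previous:
                return False
            previous = step.output_hash
        return True


# Usage: record two processing steps for one piece of text.
log = LineageLog(source="example-feed")
raw = b"  BRCA1 interacts with TP53  "
log.record("ingest", raw)
cleaned = raw.strip().lower()
log.record("normalize", cleaned)
print(log.verify())
```

Storing hashes rather than the data itself keeps the log small while still letting anyone holding the original bytes confirm that they match a recorded step.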

State-of-the-art data pipeline

The Vectorspace data engineering pipeline takes unstructured text from any data source and applies state-of-the-art machine learning techniques, based on unsupervised learning and NLP/NLU, to find hidden relationships between entities (e.g., genes, proteins, diseases, and drug compounds). Surfacing these relationships can accelerate the process of discovery.
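As a toy illustration of what a "hidden relationship" can mean, the Python sketch below uses a deliberately simple unsupervised technique: sentence-level co-occurrence counts compared by cosine similarity. This is a swapped-in stand-in, not the Vectorspace pipeline's actual method. The entity vocabulary, the example sentences, and the function names are all assumptions made for illustration. Two entities that never appear in the same sentence can still score as related when they share the same neighbours.

```python
import math
import re
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical entity vocabulary; a real system would use a far larger one.
ENTITIES = {"brca1", "tp53", "olaparib", "breast cancer"}


def find_entities(sentence, vocabulary):
    """Return the vocabulary entries mentioned in a sentence (case-insensitive)."""
    text = sentence.lower()
    return {e for e in vocabulary
            if re.search(r"\b" + re.escape(e) + r"\b", text)}


def cooccurrence_vectors(sentences, vocabulary):
    """Count, for each entity, how often it shares a sentence with each other entity."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        found = find_entities(sentence, vocabulary)
        for a, b in combinations(sorted(found), 2):
            vectors[a][b] += 1
            vectors[b][a] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = (math.sqrt(sum(x * x for x in u.values()))
            * math.sqrt(sum(x * x for x in v.values())))
    return dot / norm if norm else 0.0


# Toy sentences written for this example, not drawn from any dataset.
sentences = [
    "BRCA1 mutations are linked to breast cancer.",
    "Olaparib is used to treat breast cancer.",
    "TP53 is frequently mutated in breast cancer.",
]
vectors = cooccurrence_vectors(sentences, ENTITIES)

# BRCA1 and olaparib never share a sentence, yet both co-occur with the same
# neighbour, so their context vectors are similar.
direct = vectors["brca1"]["olaparib"]
similarity = cosine(vectors["brca1"], vectors["olaparib"])
print(direct, similarity)
```

Production systems typically replace raw counts with learned embeddings, but the principle of comparing entities by the contexts they share is the same.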