MeTA is a modern C++ data sciences toolkit featuring:

text tokenization, including deep semantic features like parse trees

inverted and forward indexes with compression and various caching strategies

a collection of ranking functions for searching the indexes

topic models

classification algorithms

graph algorithms

language models

a CRF implementation (POS tagging, shallow parsing)

wrappers for liblinear and libsvm (including libsvm dataset parsers)

UTF-8 support for analyzing text in various languages

multithreaded algorithms

Documentation

Doxygen documentation can be found here.

Project setup

See the setup guide for installation instructions.

Tutorials

We have walkthroughs for the following parts of MeTA:

Users

Contact us through our GitHub issues page if you’d like your application of MeTA featured on our site!