At Lionbridge, we know that high-quality training data can be difficult to find. To help students, data scientists, and development teams get the data they need, we’ve posted a large number of dataset collections on our blog. Here, you can find all of those datasets in one convenient place and search for the data you need by use case or data type. This list is updated regularly to keep the curated library current.

The datasets are listed in alphabetical order by use case. Some datasets appear more than once because they belong to multiple categories.

Audio Datasets

Computer Vision Dataset Library

Data Analytics

Fintech and Financial Services Data

Language Dataset Library

NLP Datasets

Social Media Datasets

Miscellaneous Datasets

We will keep adding new curated lists of datasets for each category and use case. Subscribe to our newsletter to be notified of future updates and to keep up with the latest in machine learning.

Lionbridge Data Annotation Services

Still can’t find the data you need for your project? Get in touch to learn more about our services. With over 20 years of experience in translation, linguistics, and AI training data, Lionbridge is trusted by governments and large tech companies worldwide. We are a leader in NLP data outsourcing, image annotation, and more.