Find the best tool for you in this list of fast tools and frameworks for annotating or labeling images, videos, text (NLP), or audio.

I had trouble getting a good overview of all the tools and frameworks available for data annotation, so I created this list. I will try to keep it up to date.

Computer Vision: Images, Video, LiDAR, 3D

NLP: Text

Audio: Audio

Others: Time Series, MultiDomain

Open Source Tools and Frameworks

Here you will find a list of open-source projects grouped by data type.

Computer Vision

Images

Alturos.ImageAnnotation – A collaborative tool for labeling image data

Anno-Mage – A semi-automatic image annotation tool that helps you annotate images by suggesting annotations for 80 object classes using a pre-trained model

CATMAID – Collaborative Annotation Toolkit for Massive Amounts of Image Data

CVAT – Powerful and efficient Computer Vision Annotation Tool

deeplabel – A cross-platform image annotation tool for machine learning

imagetagger – An open-source online platform for collaborative image labeling

imglab – A web-based tool to label images for objects that can be used to train dlib or other object detectors

Labelbox – Labelbox is the fastest way to annotate data to build and ship computer vision applications

labelImg – LabelImg is a graphical image annotation tool for labeling object bounding boxes in images

labelme – Image Polygonal Annotation with Python

LOST – Design your own smart Image Annotation process in a web-based environment

make-sense – makesense.ai is a free-to-use online tool for labeling photos

MedTagger – A collaborative framework for annotating medical datasets using crowdsourcing.

OpenLabeler – OpenLabeler is an open-source desktop application for annotating objects for AI applications

OpenLabeling – Label images and video for Computer Vision applications

PixelAnnotationTool – Software that allows you to manually and quickly annotate images in directories

Pixie – Pixie is a GUI annotation tool that provides bounding box, polygon, free drawing, and semantic segmentation object labeling

turktool – A modern React app for scalable bounding box annotation of images

VoTT – An open-source annotation and labeling tool for image and video assets

Yolo_mark – GUI for marking bounding boxes of objects in images for training Yolo v3 and v2 neural networks

Video

Diffgram – Training Data Software for Teams Shipping Deep Learning AI Systems. Track objects through time.

UltimateLabeling – A multi-purpose Video Labeling GUI in Python with integrated SOTA detector and tracker

VATIC – VATIC is an online video annotation tool for computer vision research that crowdsources work to Amazon’s Mechanical Turk.

LiDAR

semantic-segmentation-editor – Web labeling tool for camera and LiDAR data

3D

KNOSSOS – KNOSSOS is a software tool for the visualization and annotation of 3D image data and was developed for the rapid reconstruction of neural morphology and connectivity.

NLP

Text

ML-Annotate – Label text data for machine learning purposes. ML-Annotate supports binary, multi-label and multi-class labeling.

SMART – Smarter Manual Annotation for Resource-constrained collection of Training data

TagEditor – Annotation tool for spaCy

YEDDA – A Lightweight Collaborative Text Span Annotation Tool (Chunking, NER, etc.). ACL best demo nomination.

Audio

Audio

audio-annotator – A JavaScript interface for annotating and labeling audio files.

audio-labeler – An in-browser app for labeling audio clips at random, using Docker and Flask.

EchoML – Play, visualize and annotate your audio files

peaks.js – Browser-based audio waveform visualization and UI component for interacting with audio waveforms, developed by the BBC.

wavesurfer.js – Simple annotation tool; check the project's example.

Others

Time Series

Curve – Curve is an open-source tool to help label anomalies on time-series data

TagAnomaly – Anomaly detection analysis and labeling tool, specifically for multiple time series (one time series per category)

time-series-annotator – The CrowdCurio Time Series Annotation Library implements classification tasks for time series.

WDK – The Wearables Development Toolkit (WDK) is a set of tools to facilitate the development of activity recognition applications with wearable devices.

MultiDomain

Dataturks – Dataturks supports end-to-end tagging of data items such as video, images (classification, segmentation, and labeling), and text (full-length document annotations for PDF, Doc, Text, etc.) for ML projects.

Label Studio – Label Studio is a configurable data annotation tool that works with different data types

If you're looking for data-labeling service providers, check out my other blog post.

Do you know a tool or framework and would like me to add it to the list? Just comment below or drop me an email at isusmelj at gmail.com!

Sources:

github.com/heartexlabs/awesome-data-labeling

github.com/taivop/awesome-data-annotation

github.com/jsbroks/awesome-dataset-tools