CVAT: Computer Vision Annotation Tool


Website · Docs · Changelog · Tutorials · Academy · Blog

What is CVAT Community?

CVAT Community is the free, self-hosted, open-source edition of CVAT, one of the most widely used data annotation platforms for building high-quality visual datasets for computer vision and visual AI. Since 2018, CVAT has built a large open-source community, accumulated millions of Docker pulls, and been adopted broadly across research and production AI teams.

CVAT Community supports image, video, and 3D annotation, dataset management, team collaboration, cloud storage integration, and developer-friendly SDKs and APIs, and it gives your team full control over your data and annotation infrastructure. The platform serves as the foundation of CVAT Online and CVAT Enterprise, and is actively maintained by the CVAT engineering team.

Why teams choose CVAT Community:

  • Own your data: Run entirely within your own infrastructure. No data leaves your environment.
  • AI-powered annotation: Connect your own ML models for detection, segmentation, and tracking to speed up labeling.
  • Team collaboration: Multi-user and multi-organization support with roles, task assignments, and review workflows.
  • MIT-licensed core: Use, modify, and distribute CVAT Community under the permissive MIT License. Some serverless assets and dependencies may have separate licenses.
  • Production-grade: The foundation of all CVAT commercial products — battle-tested at scale.
  • True open-source: Transparent development, active community, on GitHub since 2018.

This repository contains the source code and deployment assets for CVAT Community.

For a fully managed setup, annotation services, or enterprise features, see CVAT Online, CVAT Enterprise and CVAT Labeling Services.

Getting Started

💡 Want to explore CVAT before deploying anything? Try CVAT Online (Free plan) directly in your browser. Feature availability and usage limits vary by plan; see CVAT Online pricing for details.

Installation

Prerequisites: Git and Docker with the Docker Compose plugin, both of which the commands in step 1 rely on.

💡 CVAT is primarily tested with Chromium-based browsers (Google Chrome, Microsoft Edge). Firefox may work with some caveats; Safari/WebKit is not supported.

1. Start the default stack

Clone the repository and launch the services.

```bash
git clone https://github.com/cvat-ai/cvat
cd cvat

# Optional: set your IP or domain
# export CVAT_HOST=your-ip-or-domain

docker compose up -d
```
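After `docker compose up -d` returns, the containers may still need some time before the web server answers requests. The following is a minimal sketch of a readiness check, not part of the official setup. It assumes the server exposes an unauthenticated `/api/server/about` endpoint, and the names `CVAT_URL`, `about_url`, and `wait_for_cvat` are illustrative.

```bash
# Sketch: wait until a freshly started CVAT stack answers HTTP requests.
# Assumptions (not from this README): the /api/server/about endpoint and
# the CVAT_URL, about_url, and wait_for_cvat names.

CVAT_URL="${CVAT_URL:-http://localhost:8080}"

# Build the URL to probe, dropping one trailing slash from CVAT_URL.
about_url() {
  printf '%s/api/server/about\n' "${CVAT_URL%/}"
}

# Probe the URL up to $1 times (default 30), sleeping $2 seconds
# (default 2) between attempts. Returns 0 on the first successful reply.
wait_for_cvat() {
  local tries="${1:-30}"
  local delay="${2:-2}"
  local i=1
  while [ "$i" -le "$tries" ]; do
    if curl -fs -o /dev/null "$(about_url)"; then
      echo "CVAT is responding at ${CVAT_URL}"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "CVAT did not respond after ${tries} attempts" >&2
  return 1
}
```

Source the snippet and run `wait_for_cvat 30 2` before moving on to the next step. If you set `CVAT_HOST`, export `CVAT_URL` to match it first.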

2. Create an admin account

```bash
docker exec -it cvat_server bash -ic 'python3 ~/manage.py createsuperuser'
```

See the Installation Guide for full instructions and OS-specific setup.

3. Sign in and start labeling

  • Open http://localhost:8080 (or your CVAT_HOST) in your browser.
  • Log in with your superuser account.
  • Create a project or task, upload your data (images, videos, or point clouds), and define labels to start annotating.

Learn more about annotation tools and workflows in the CVAT Documentation, or take our free course, CVAT Academy.

For alternative deployments (AWS, Kubernetes, external PostgreSQL, backups, upgrades), see the Deployment Guides.

Key Capabilities

  • Manual & Auto-labeling: Annotate images, videos, and 3D point clouds with bounding boxes, polygons, masks, keypoints, cuboids, tags, and more. Speed up labeling by connecting your own models for automatic annotation.
  • Task Management: Organize datasets into projects, split them into tasks and jobs, assign work to annotators, and track progress in real time.
  • Collaboration: Create organizations, invite teammates, assign roles, and collaborate on annotations with comments and issues.
  • Quality Control: Review annotations, flag issues, compare results across annotators with consensus, and run Ground Truth and Honeypot checks through the server API.
  • Analytics: Monitor user activity, working time by job, events, and server logs with Grafana dashboards.
  • Data Ops & Integrations: Export/import in 20+ formats (COCO, YOLO, Pascal VOC, KITTI, etc.), connect to cloud storage (S3, Azure, Google Cloud), and automate via REST API and Python SDK.

Additional capabilities, including advanced project analytics, a quality control UI, built-in auto-labeling with SAM 2 and SAM 3, AI agents, and SSO, are available in CVAT Online paid plans (Solo, Team) and CVAT Enterprise.

Developer Tools

CVAT is designed for automation. Beyond the Web UI, you can integrate it into your pipelines using:

  • Python SDK: install with `pip install cvat-sdk` and automate task creation, uploads, and exports from Python.
  • Command line tool: install with `pip install cvat-cli` and script common CVAT workflows from the terminal.
  • REST API: full programmatic control over CVAT.
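As an illustration of the REST API route, here is a minimal sketch that lists a few tasks with `curl`. It assumes the `/api/tasks` endpoint with a `page_size` query parameter and that your server accepts HTTP basic auth (some deployments may require access tokens instead). `CVAT_URL`, `CVAT_USER`, `CVAT_PASS`, `tasks_url`, and `list_tasks` are hypothetical names chosen for this example.

```bash
# Sketch: list a few tasks through the REST API with curl.
# Assumptions (not from this README): the /api/tasks endpoint, its
# page_size parameter, HTTP basic auth, and all variable/function names.

CVAT_URL="${CVAT_URL:-http://localhost:8080}"

# Build the task-list URL for a given page size (default 10).
tasks_url() {
  printf '%s/api/tasks?page_size=%s\n' "${CVAT_URL%/}" "${1:-10}"
}

# Fetch the first page of tasks as JSON, authenticating with the
# credentials stored in CVAT_USER and CVAT_PASS.
list_tasks() {
  curl -fsS -u "${CVAT_USER}:${CVAT_PASS}" "$(tasks_url "$1")"
}
```

With `CVAT_USER` and `CVAT_PASS` exported, `list_tasks 5` prints the raw JSON response. For anything beyond simple reads, the Python SDK or `cvat-cli` is likely the more convenient choice.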

Data and Formats

CVAT Community supports image, video, and 3D (point cloud) annotation workflows. You can move data in and out using 20+ industry-standard formats: CVAT (XML), COCO (JSON), YOLO (TXT), Ultralytics YOLO (TXT/YAML), Pascal VOC (XML), KITTI (TXT), MOT (TXT), and more.

Full list of supported formats.
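The exact set of formats depends on the server version, so it can help to ask a running instance what it can export. The sketch below assumes the `/api/server/annotation/formats` endpoint, a JSON response with an `exporters` array whose entries carry a `name` field, HTTP basic auth, and an installed `jq`. The variable and function names are illustrative.

```bash
# Sketch: print the export format names reported by a running server.
# Assumptions (not from this README): the endpoint path, the
# {"exporters": [{"name": ...}]} response shape, basic auth, and jq.

CVAT_URL="${CVAT_URL:-http://localhost:8080}"

# Build the URL of the format-listing endpoint.
formats_url() {
  printf '%s/api/server/annotation/formats\n' "${CVAT_URL%/}"
}

# Fetch the format list and print one exporter name per line.
list_export_formats() {
  curl -fsS -u "${CVAT_USER}:${CVAT_PASS}" "$(formats_url)" \
    | jq -r '.exporters[].name'
}
```

The printed names are the strings to pass as the format when exporting through the SDK, CLI, or REST API.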

ML and AI Models

CVAT Community supports automatic annotation via pre-built serverless models powered by Nuclio, covering detection, segmentation, pose estimation, and tracking:

| Model | Framework | Type |
| --- | --- | --- |
| Segment Anything (SAM) | PyTorch | Interactor |
| Inside-Outside Guidance (IOG) | PyTorch | Interactor |
| RetinaNet R101 | PyTorch | Detector |
| HRNet32 Whole Body Pose | PyTorch | Pose Estimation |
| TransT | PyTorch | Tracker |
| YOLO v7 | ONNX | Detector |
| Mask RCNN Inception ResNet v2 | OpenVINO | Detector |
| Face Detection 0205 | OpenVINO | Detector |
| Faster RCNN Inception v2 | TensorFlow | Detector |

To enable automatic annotation, add the serverless component to your deployment:

```bash
docker compose -f docker-compose.yml -f components/serverless/docker-compose.serverless.yml up -d
```

This starts the serverless infrastructure. To make models available in CVAT, install nuctl and deploy the functions you need, for example SAM or YOLO, as described in the Automatic Annotation Guide.

Which CVAT edition should I choose?

  • CVAT Online: the fastest way to try CVAT and start labeling without deployment. Use it to evaluate CVAT in the browser, explore managed features, and move to cost-efficient paid plans when you need more capacity or team workflows.
  • CVAT Community: the MIT-licensed self-hosted edition for teams that want to run CVAT themselves, customize the stack, and control their infrastructure.
  • CVAT Enterprise: for organizations that need CVAT in their own cloud or internal environment, enterprise support, security controls such as SSO, paid platform features, and SLAs.
  • Labeling Services: for teams that want to outsource annotation work to CVAT.ai’s experienced labeling team instead of building an internal labeling operation. Customers get trial access to CVAT Online during the project.

For detailed plan limits and feature availability, see CVAT Online pricing, CVAT Enterprise, and Labeling Services.

Support

For dedicated support, SLAs, or advanced deployments, consider CVAT Enterprise.

Contributing

We welcome all contributions: bug reports, documentation fixes, integrations, and code.

Security

License

CVAT Community is released under the MIT License.

  • Code in /serverless is also MIT-licensed, but may use third-party assets under separate licenses (including non-commercial). Review those licenses before use.
  • This software uses FFmpeg libraries under LGPL/GPL. See the Dockerfile and FFmpeg legal info for details.

Additional Resources

For the latest product releases, feature walkthroughs, and all things CVAT see:

  • Blog: https://www.cvat.ai/resources/blog
  • Academy: https://www.cvat.ai/resources/academy
  • Case studies: https://www.cvat.ai/resources/case-studies
  • YouTube: https://www.youtube.com/@cvat-ai
  • LinkedIn: https://www.linkedin.com/company/cvat-ai