docs/how_to/count_in_zone.md
With supervision, you can count the number of objects in a zone in an image or video. In this guide, we will show how to count the number of cars in a traffic video.
View the notebook that accompanies this tutorial.
To make it easier to follow along with our tutorial, download the video we will use as an example. You can do this using the supervision.assets module:
```python
from supervision.assets import download_assets, VideoAssets

download_assets(VideoAssets.VEHICLES_2)
```
First, we need to initialize a model. Let's use a YOLOv8 model with the default COCO checkpoint. We also need to load a video on which to run inference.
Create a YOLO model instance and read the source video's metadata with supervision's VideoInfo helper. The model will process each frame during inference, while VideoInfo extracts the resolution and frame-rate metadata needed to set up the polygon zones. A shared color palette ensures consistent zone coloring throughout the output video.
```python
import numpy as np
import supervision as sv
import cv2
from ultralytics import YOLO

model = YOLO("yolov8s.pt")

# path to the downloaded video file
VIDEO = VideoAssets.VEHICLES_2.value

colors = sv.ColorPalette.default()
video_info = sv.VideoInfo.from_video_path(VIDEO)
```
To count objects in a zone, you need to know the coordinates where you want to draw the zone.
You can calculate coordinates using the PolygonZone web utility.
To use the PolygonZone website, you will need to upload an image or frame from a video. You can retrieve a frame using this code:
```python
generator = sv.get_video_frames_generator(VIDEO)
iterator = iter(generator)
frame = next(iterator)

cv2.imwrite("first_frame.png", frame)
```
PolygonZone will give you NumPy arrays that you can use with supervision to count objects in zones.
<video width="100%" loop muted autoplay> <source src="https://media.roboflow.com/polygonzone.mp4" type="video/mp4"> </video>

Save the coordinates in an array:
```python
polygons = [
    np.array([[718, 595], [927, 592], [851, 1062], [42, 1059]]),
    np.array([[987, 595], [1199, 595], [1893, 1056], [1015, 1062]]),
]
```
With the zone coordinates ready, we can set up our zones:
Instantiate a PolygonZone for each polygon array, pairing it with a PolygonZoneAnnotator for visual overlay and a BoxAnnotator for drawing detection boxes. Each zone will later trigger on incoming detections to determine which objects fall inside its boundaries, enabling per-zone counting in the inference callback.
```python
zones = [
    sv.PolygonZone(polygon=polygon, frame_resolution_wh=video_info.resolution_wh)
    for polygon in polygons
]
zone_annotators = [
    sv.PolygonZoneAnnotator(
        zone=zone,
        color=colors.by_idx(index),
        thickness=4,
        text_thickness=8,
        text_scale=4,
    )
    for index, zone in enumerate(zones)
]
box_annotators = [
    sv.BoxAnnotator(
        color=colors.by_idx(index),
        thickness=4,
        text_thickness=4,
        text_scale=2,
    )
    for index in range(len(polygons))
]
```
We can run inference on a video using the sv.process_video function. This function accepts a callback that runs inference on each frame and compiles the results into a video.
Below, we call our YOLOv8 model, annotate the predictions and zones, and save the results to a file called result.mp4.
```python
def process_frame(frame: np.ndarray, index: int) -> np.ndarray:
    results = model(frame, imgsz=1280, verbose=False)[0]
    detections = sv.Detections.from_ultralytics(results)

    for zone, zone_annotator, box_annotator in zip(
        zones, zone_annotators, box_annotators
    ):
        # keep only the detections that fall inside this zone
        mask = zone.trigger(detections=detections)
        detections_filtered = detections[mask]

        frame = box_annotator.annotate(
            scene=frame, detections=detections_filtered, skip_label=True
        )
        frame = zone_annotator.annotate(scene=frame)

    return frame


sv.process_video(source_path=VIDEO, target_path="result.mp4", callback=process_frame)
```
Here is an example of inference run on the video:
<video width="100%" loop muted autoplay> <source src="https://blog.roboflow.com/content/media/2023/03/trim-counting.mp4" type="video/mp4"> </video>

To recap: create an sv.PolygonZone with a polygon defining your region, then call zone.trigger(detections) on each frame; it returns a boolean mask of the detections that fall inside the zone.
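If you only need the raw counts rather than an annotated video, the same pieces can be reused directly. Here is a minimal sketch, assuming the `model`, `zones`, and `VIDEO` variables defined above, that prints the per-zone count for the first frame:

```python
# Minimal sketch: print how many detections fall inside each zone for a
# single frame, reusing `model`, `zones`, and `VIDEO` from the steps above.
frame = next(sv.get_video_frames_generator(VIDEO))

results = model(frame, imgsz=1280, verbose=False)[0]
detections = sv.Detections.from_ultralytics(results)

for index, zone in enumerate(zones):
    mask = zone.trigger(detections=detections)  # boolean mask: inside the zone or not
    print(f"zone {index}: {int(mask.sum())} objects inside")
```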
If you want to count objects crossing a line rather than objects inside a zone, use sv.LineZone: define a start and end point, and zone.trigger(detections) returns a tuple of two boolean arrays, (crossed_in, crossed_out), indicating which detections crossed the line in each direction. LineZone requires detections.tracker_id, so run a tracker first so that the same object can be matched across frames.
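Here is a rough sketch of that approach, assuming the `model` and `VIDEO` variables defined above; the line endpoints and the line_result.mp4 filename are placeholders you would replace for your own video:

```python
# Sketch: count objects crossing a horizontal line using a tracker.
# The line endpoints below are placeholders, not tuned to this video.
tracker = sv.ByteTrack()
line_zone = sv.LineZone(start=sv.Point(0, 600), end=sv.Point(1920, 600))


def count_crossings(frame: np.ndarray, index: int) -> np.ndarray:
    results = model(frame, imgsz=1280, verbose=False)[0]
    detections = sv.Detections.from_ultralytics(results)
    detections = tracker.update_with_detections(detections)  # assigns tracker_id
    crossed_in, crossed_out = line_zone.trigger(detections)
    # line_zone.in_count / line_zone.out_count accumulate totals across frames
    return frame


sv.process_video(source_path=VIDEO, target_path="line_result.mp4", callback=count_crossings)
print(line_zone.in_count, line_zone.out_count)
```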
You can also pass tracker IDs from sv.ByteTrack alongside your detections, but sv.PolygonZone still evaluates the zone on each frame and reports which objects are currently inside it. If you want to count each object only once when it first enters the zone, maintain a set of seen tracker_id values after filtering detections with zone.trigger(detections), or use a dedicated entry/crossing counting tool such as sv.LineZone when it better matches your use case.
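As a sketch of that bookkeeping, again assuming the `model`, `zones`, and `VIDEO` variables defined above:

```python
# Sketch: count each tracked object at most once per zone.
tracker = sv.ByteTrack()
seen_ids = [set() for _ in zones]  # one set of tracker IDs per zone

for frame in sv.get_video_frames_generator(VIDEO):
    results = model(frame, imgsz=1280, verbose=False)[0]
    detections = sv.Detections.from_ultralytics(results)
    detections = tracker.update_with_detections(detections)  # assigns tracker_id

    for index, zone in enumerate(zones):
        inside = detections[zone.trigger(detections=detections)]
        seen_ids[index].update(inside.tracker_id.tolist())

print([len(ids) for ids in seen_ids])  # unique objects that entered each zone
```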