demo.ipynb
We write your reusable computer vision tools. Whether you need to load your dataset from your hard drive, draw detections on an image or video, or count how many detections are in a zone, you can count on us! 🤝
We hope that the resources in this notebook will help you get the most out of Supervision. Please browse the Supervision Docs for details, raise an issue on GitHub for support, and join our discussions section for questions!
BoxAnnotator, LabelAnnotator, MaskAnnotator, class_id, confidence, VideoInfo, get_video_frames_generator, VideoSink, DetectionDataset.from_yolo, split, DetectionDataset.as_pascal_voc
NOTE: In this notebook, we aim to show, among other things, how simple it is to integrate supervision with popular object detection and instance segmentation libraries and frameworks. GPU access is optional but will certainly make the ride smoother.
Let's make sure that we have access to a GPU. We can use the nvidia-smi command to do that. In case of any problems, navigate to Edit -> Notebook settings -> Hardware accelerator, set it to GPU, and then click Save.
!nvidia-smi
NOTE: To make it easier for us to manage datasets, images and models we create a HOME constant.
import os
HOME = os.getcwd()
print(HOME)
NOTE: During our demo, we will need some example images.
!mkdir -p {HOME}/images
NOTE: Feel free to use your own images. Just make sure to put them into the images directory that we just created. ☝️
%cd {HOME}/images
!wget -q https://media.roboflow.com/notebooks/examples/dog.jpeg
!wget -q https://media.roboflow.com/notebooks/examples/dog-2.jpeg
!wget -q https://media.roboflow.com/notebooks/examples/dog-3.jpeg
!wget -q https://media.roboflow.com/notebooks/examples/dog-4.jpeg
!wget -q https://media.roboflow.com/notebooks/examples/dog-5.jpeg
!wget -q https://media.roboflow.com/notebooks/examples/dog-6.jpeg
!wget -q https://media.roboflow.com/notebooks/examples/dog-7.jpeg
!wget -q https://media.roboflow.com/notebooks/examples/dog-8.jpeg
!pip install -q supervision
import supervision as sv
print(sv.__version__)
xyxy (np.ndarray): An array of shape (n, 4) containing the bounding box coordinates in format [x1, y1, x2, y2].
mask (Optional[np.ndarray]): An array of shape (n, H, W) containing the segmentation masks.
confidence (Optional[np.ndarray]): An array of shape (n,) containing the confidence scores of the detections.
class_id (Optional[np.ndarray]): An array of shape (n,) containing the class ids of the detections.
tracker_id (Optional[np.ndarray]): An array of shape (n,) containing the tracker ids of the detections.
NOTE: In our example, we will focus only on integration with YOLO-NAS and YOLOv8. However, keep in mind that supervision allows seamless integration with many other models like SAM, Transformers, and YOLOv5. You can learn more from our documentation.
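The field shapes described above can be illustrated with plain NumPy arrays. This is only a sketch with made-up values: the image size, the box coordinates, and the scores are all assumptions, and the masks are laid out as (n, height, width) to match how NumPy images are usually indexed. Arrays shaped like these are the kind of input that sv.Detections(xyxy=..., mask=..., confidence=..., class_id=...) expects.

```python
import numpy as np

# Three hypothetical detections on a 480x640 (height x width) image.
n, height, width = 3, 480, 640

xyxy = np.array(
    [
        [10.0, 20.0, 110.0, 220.0],
        [200.0, 50.0, 400.0, 300.0],
        [320.0, 240.0, 600.0, 470.0],
    ]
)  # shape (n, 4), each row is [x1, y1, x2, y2]
mask = np.zeros((n, height, width), dtype=bool)  # one boolean mask per detection
confidence = np.array([0.91, 0.78, 0.45])  # shape (n,)
class_id = np.array([16, 0, 2])  # shape (n,)
tracker_id = None  # optional; only present once a tracker has assigned ids

# Box widths and heights follow directly from the corner format.
box_wh = xyxy[:, 2:] - xyxy[:, :2]
print(xyxy.shape, mask.shape, confidence.shape, class_id.shape)
print(box_wh)
```

Every per-detection array shares the same leading dimension n, which is what makes index-based and mask-based filtering possible later in the notebook.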
import cv2
IMAGE_PATH = f"{HOME}/images/dog.jpeg"
image = cv2.imread(IMAGE_PATH)
!pip install -q super-gradients
from super_gradients.training import models
model = models.get("yolo_nas_l", pretrained_weights="coco")
result = model.predict(image)
detections = sv.Detections.from_yolo_nas(result)
"detections", len(detections)
!pip install -q "ultralytics<=8.3.40"
from ultralytics import YOLO
model = YOLO("yolov8m.pt")
result = model(image, verbose=False)[0]
detections = sv.Detections.from_ultralytics(result)
model = YOLO("yolov8m-seg.pt")
result = model(image, verbose=False)[0]
detections_segmentation = sv.Detections.from_ultralytics(result)
"detections", len(detections)
box_annotator = sv.BoxAnnotator()
label_annotator = sv.LabelAnnotator()
annotated_image = box_annotator.annotate(image.copy(), detections=detections)
annotated_image = label_annotator.annotate(annotated_image, detections=detections)
sv.plot_image(image=annotated_image, size=(8, 8))
NOTE: By default, sv.LabelAnnotator uses the corresponding class_id as the label. However, labels can have any format you like.
box_annotator = sv.BoxAnnotator()
label_annotator = sv.LabelAnnotator()
labels = [
    f"{model.model.names[class_id]} {confidence:.2f}"
    for class_id, confidence in zip(detections.class_id, detections.confidence)
]
annotated_image = box_annotator.annotate(
    image.copy(),
    detections=detections,
)
annotated_image = label_annotator.annotate(
    annotated_image, detections=detections, labels=labels
)
sv.plot_image(image=annotated_image, size=(8, 8))
mask_annotator = sv.MaskAnnotator(color_lookup=sv.ColorLookup.INDEX)
annotated_image = mask_annotator.annotate(
    image.copy(), detections=detections_segmentation
)
sv.plot_image(image=annotated_image, size=(8, 8))
NOTE: The sv.Detections filter API allows you to access detections by index, index list, or index slice.
detections_index = detections[0]
detections_index_list = detections[[0, 1, 3]]
detections_index_slice = detections[:2]
box_annotator = sv.BoxAnnotator()
label_annotator = sv.LabelAnnotator()
images = []
for d in [detections_index, detections_index_list, detections_index_slice]:
    annotated_image = box_annotator.annotate(image.copy(), detections=d)
    annotated_image = label_annotator.annotate(annotated_image, detections=d)
    images.append(annotated_image)
titles = [
    "by index - detections[0]",
    "by index list - detections[[0, 1, 3]]",
    "by index slice - detections[:2]",
]
sv.plot_images_grid(images=images, titles=titles, grid_size=(1, 3))
NOTE: Let's use sv.Detections filter API to display only objects with class_id == 0
detections_filtered = detections[detections.class_id == 0]
box_annotator = sv.BoxAnnotator()
label_annotator = sv.LabelAnnotator()
annotated_image = box_annotator.annotate(image.copy(), detections=detections_filtered)
annotated_image = label_annotator.annotate(
    annotated_image, detections=detections_filtered
)
sv.plot_image(image=annotated_image, size=(8, 8))
NOTE: Let's use sv.Detections filter API to display only objects with confidence > 0.7
detections_filtered = detections[detections.confidence > 0.7]
box_annotator = sv.BoxAnnotator()
label_annotator = sv.LabelAnnotator()
labels = []
for class_id, confidence in zip(
    detections_filtered.class_id, detections_filtered.confidence
):
    labels.append(f"{model.model.names[class_id]} {confidence:.2f}")
annotated_image = box_annotator.annotate(
    image.copy(),
    detections=detections_filtered,
)
annotated_image = label_annotator.annotate(
    annotated_image, detections=detections_filtered, labels=labels
)
sv.plot_image(image=annotated_image, size=(8, 8))
NOTE: The sv.Detections filter API also allows you to build advanced logical conditions. Let's select only detections with class_id != 0 and confidence > 0.7.
detections_filtered = detections[
    (detections.class_id != 0) & (detections.confidence > 0.7)
]
box_annotator = sv.BoxAnnotator()
label_annotator = sv.LabelAnnotator()
labels = [
    f"{class_id} {confidence:.2f}"
    for class_id, confidence in zip(
        detections_filtered.class_id, detections_filtered.confidence
    )
]
annotated_image = box_annotator.annotate(
    image.copy(),
    detections=detections_filtered,
)
annotated_image = label_annotator.annotate(
    annotated_image, detections=detections_filtered, labels=labels
)
sv.plot_image(image=annotated_image, size=(8, 8))
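The combined condition used above is ordinary NumPy boolean masking. The sketch below reproduces it on made-up arrays standing in for detections.class_id and detections.confidence (the values are assumptions, not model output) to show what each comparison produces and why the parentheses matter.

```python
import numpy as np

# Hypothetical values standing in for detections.class_id and detections.confidence.
class_id = np.array([0, 16, 0, 2, 16])
confidence = np.array([0.95, 0.88, 0.40, 0.72, 0.65])

# Each comparison yields a boolean array with one entry per detection.
not_class_zero = class_id != 0
confident = confidence > 0.7

# `&` combines the two masks element-wise. When the condition is written inline,
# each comparison needs its own parentheses because `&` binds tighter than
# `!=` and `>` in Python.
keep = not_class_zero & confident

print(keep)
print(class_id[keep])
```

The same pattern extends to `|` for "or" and `~` for "not", as long as each comparison stays in its own parentheses.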
NOTE: supervision offers a lot of utils to make working with videos easier. Let's take a look at some of them.
NOTE: During our demo, we will need some example videos.
!pip install -q "supervision[assets]"
!mkdir -p {HOME}/videos
NOTE: Feel free to use your own videos. Just make sure to put them into the videos directory that we just created. ☝️
%cd {HOME}/videos
from supervision.assets import VideoAssets, download_assets
download_assets(VideoAssets.VEHICLES)
VIDEO_PATH = VideoAssets.VEHICLES.value
NOTE: VideoInfo allows us to easily retrieve information about video files, such as resolution, FPS and total number of frames.
sv.VideoInfo.from_video_path(video_path=VIDEO_PATH)
frame_generator = sv.get_video_frames_generator(source_path=VIDEO_PATH)
frame = next(iter(frame_generator))
sv.plot_image(image=frame, size=(8, 8))
RESULT_VIDEO_PATH = f"{HOME}/videos/vehicle-counting-result.mp4"
NOTE: This time we set the stride parameter to 2. As a result, get_video_frames_generator will yield every second video frame.
video_info = sv.VideoInfo.from_video_path(video_path=VIDEO_PATH)
with sv.VideoSink(target_path=RESULT_VIDEO_PATH, video_info=video_info) as sink:
    for frame in sv.get_video_frames_generator(source_path=VIDEO_PATH, stride=2):
        sink.write_frame(frame=frame)
NOTE: If we use VideoInfo once again, we will see that the resulting video has half as many frames.
sv.VideoInfo.from_video_path(video_path=RESULT_VIDEO_PATH)
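To see why the frame count roughly halves, it can help to list which frame indices survive a given stride. The helper below is a hypothetical illustration, not part of supervision; it assumes reading starts at frame 0 and keeps every stride-th frame, and the frame count is a made-up number.

```python
import math
from typing import List


def strided_indices(total_frames: int, stride: int) -> List[int]:
    """Indices of the frames kept when taking every `stride`-th frame from frame 0."""
    return list(range(0, total_frames, stride))


total_frames = 11  # hypothetical; the real value is reported by VideoInfo
kept = strided_indices(total_frames, stride=2)

print(kept)
# The number of kept frames is the total divided by the stride, rounded up.
print(len(kept), math.ceil(total_frames / 2))
```

Because the sink above reuses the source video's video_info, the FPS stays the same while the frame count drops, so the result should play back in about half the original duration.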
NOTE: In order to demonstrate the capabilities of the Dataset API, we need a dataset. Let's download one from Roboflow Universe. To do this we first need to install the roboflow pip package.
!pip install -q roboflow
!mkdir -p {HOME}/datasets
%cd {HOME}/datasets
import roboflow
from roboflow import Roboflow
roboflow.login()
rf = Roboflow()
project = rf.workspace("roboflow-jvuqo").project("fashion-assistant-segmentation")
dataset = project.version(5).download("yolov8")
NOTE: Currently, the Dataset API always loads images from the hard drive. In the future, we plan to add lazy loading.
ds = sv.DetectionDataset.from_yolo(
    images_directory_path=f"{dataset.location}/train/images",
    annotations_directory_path=f"{dataset.location}/train/labels",
    data_yaml_path=f"{dataset.location}/data.yaml",
)
len(ds)
ds.classes
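For context on what the YOLO label files contain, here is a minimal sketch of converting one bounding-box label line into pixel coordinates. It assumes the common "class_id cx cy w h" layout with values normalized to the image size; the function name and sample line are hypothetical, and this is not how supervision implements its loader. Segmentation datasets typically store polygon points instead of a box, which this sketch does not handle.

```python
def yolo_box_to_xyxy(line: str, image_width: int, image_height: int):
    """Convert one normalized 'class_id cx cy w h' label line to pixel [x1, y1, x2, y2]."""
    parts = line.split()
    class_id = int(parts[0])
    cx, cy, w, h = (float(value) for value in parts[1:5])
    # Move from center/size to corner coordinates, then scale to pixels.
    x1 = (cx - w / 2) * image_width
    y1 = (cy - h / 2) * image_height
    x2 = (cx + w / 2) * image_width
    y2 = (cy + h / 2) * image_height
    return class_id, (x1, y1, x2, y2)


class_id, box = yolo_box_to_xyxy("3 0.5 0.5 0.25 0.5", image_width=640, image_height=480)
print(class_id, box)
```

The class_id read from each line indexes into the class names listed in data.yaml, which is why the loader needs that file alongside the images and labels.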
IMAGE_NAME = next(iter(ds.images.keys()))
image = ds.images[IMAGE_NAME]
annotations = ds.annotations[IMAGE_NAME]
box_annotator = sv.BoxAnnotator()
label_annotator = sv.LabelAnnotator()
mask_annotator = sv.MaskAnnotator()
labels = [f"{ds.classes[class_id]}" for class_id in annotations.class_id]
annotated_image = mask_annotator.annotate(image.copy(), detections=annotations)
annotated_image = box_annotator.annotate(annotated_image, detections=annotations)
annotated_image = label_annotator.annotate(
    annotated_image, detections=annotations, labels=labels
)
sv.plot_image(image=annotated_image, size=(8, 8))
ds_train, ds_test = ds.split(split_ratio=0.8)
"ds_train", len(ds_train), "ds_test", len(ds_test)
ds_train.as_pascal_voc(
    images_directory_path=f"{HOME}/datasets/result/images",
    annotations_directory_path=f"{HOME}/datasets/result/labels",
)
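As a rough picture of what a Pascal VOC annotation looks like, the sketch below builds one XML document per image with the standard library. The helper name, the sample values, and the exact set of elements are assumptions for illustration; the files written by as_pascal_voc may include additional fields.

```python
import xml.etree.ElementTree as ET


def voc_annotation(filename, width, height, objects):
    """Build a minimal Pascal VOC style XML string for one image.

    `objects` is a list of (class_name, (x1, y1, x2, y2)) tuples in pixels.
    """
    root = ET.Element("annotation")
    ET.SubElement(root, "filename").text = filename
    size = ET.SubElement(root, "size")
    ET.SubElement(size, "width").text = str(width)
    ET.SubElement(size, "height").text = str(height)
    for name, (x1, y1, x2, y2) in objects:
        obj = ET.SubElement(root, "object")
        ET.SubElement(obj, "name").text = name
        bndbox = ET.SubElement(obj, "bndbox")
        for tag, value in zip(("xmin", "ymin", "xmax", "ymax"), (x1, y1, x2, y2)):
            ET.SubElement(bndbox, tag).text = str(int(value))
    return ET.tostring(root, encoding="unicode")


xml_text = voc_annotation("dog.jpeg", 640, 480, [("dog", (10, 20, 110, 220))])
print(xml_text)
```

Unlike YOLO labels, VOC stores absolute pixel corners and class names rather than normalized centers and class indices.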