Troubleshooting Common YOLO Issues

This guide covers the most common problems you'll hit when working with Ultralytics YOLO26, grouped by where they occur: installation, model training, prediction, and deployment. Jump to the category that matches your error, or scan the FAQ for quick answers. Each entry states the issue and a concrete fix you can apply directly.

<p align="center"> <iframe loading="lazy" width="720" height="405" src="https://www.youtube.com/embed/TG9exsBlkDE" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen> </iframe>

<strong>Watch:</strong> Ultralytics YOLO26 Common Issues | Installation Errors, Model Training Issues

</p>

Common Issues

Installation Errors

Installation errors usually come from incompatible versions, missing dependencies, or a misconfigured environment. Before digging deeper, confirm the basics:

  • Use Python 3.8 or later, as recommended.
  • Install a compatible version of PyTorch (1.8 or later).
  • Use a virtual environment to avoid package conflicts.
  • Follow the official installation guide step by step.
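
Part of this checklist can be automated. The following is a minimal sketch that uses only the standard library; `version_tuple` and `check_environment` are hypothetical helper names, and the minimum versions are the ones stated in this guide.

```python
import re
import sys
from importlib import metadata


def version_tuple(version: str) -> tuple:
    """Turn a version string such as '2.1.0+cu118' into (2, 1, 0)."""
    core = version.split("+")[0]  # drop local build tags like '+cu118'
    return tuple(int(part) for part in re.findall(r"\d+", core)[:3])


def check_environment(min_python=(3, 8), min_torch=(1, 8)) -> dict:
    """Report whether Python and PyTorch meet the minimums stated in this guide."""
    report = {
        "python": sys.version.split()[0],
        "python_ok": sys.version_info[:2] >= min_python,
        # True inside a venv/virtualenv; conda environments are not detected here
        "in_virtualenv": sys.prefix != sys.base_prefix,
    }
    for package in ("torch", "ultralytics"):
        try:
            report[package] = metadata.version(package)
        except metadata.PackageNotFoundError:
            report[package] = None  # package is not installed in this environment
    torch_version = report["torch"]
    report["torch_ok"] = torch_version is not None and version_tuple(torch_version) >= min_torch
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

A `None` entry for `torch` or `ultralytics` means the package is not installed in the active environment, which is often the real cause of an import error.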

Additionally, here are solutions to common installation issues.

!!! tip "Keep Ultralytics and its dependencies current"

Many import, GPU, and export errors are fixed simply by upgrading. Run `pip install -U ultralytics` and confirm your PyTorch and CUDA versions are compatible before deeper debugging.

Import Errors or Dependency Issues

If you get errors importing YOLO26 or run into dependency conflicts, try these steps:

  • Fresh Installation: Starting with a clean install can resolve unexpected issues, especially after updates that change the package structure or functionality.
  • Update Regularly: Use the latest version of the library, since older releases might not be compatible with recent updates.
  • Check Dependencies: Verify that all required dependencies are correctly installed and are of compatible versions.
  • Review Changes: If you installed an older version, refer to the official documentation or changelog to understand any major structural changes.

Running YOLO26 on GPU

If you're having trouble running YOLO26 on a GPU, consider the following troubleshooting steps:

  • Verify CUDA Compatibility and Installation: Ensure your GPU is CUDA compatible and that CUDA is correctly installed. Use the `nvidia-smi` command to check the status of your NVIDIA GPU and the highest CUDA version your driver supports.
  • Check PyTorch and CUDA Integration: Ensure PyTorch can use CUDA by running `import torch; print(torch.cuda.is_available())` in a Python terminal. If it prints `True`, PyTorch is set up to use CUDA.
  • Environment Activation: Ensure you're in the correct environment where all necessary packages are installed.
  • Update Your Packages: Outdated packages might not be compatible with your GPU. Keep them updated.
  • Program Configuration: Check whether your code requests GPU usage. You can set the `device` argument when running training or prediction (for example, `device=0`).
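
The PyTorch-side checks can be gathered into one small report. This is a sketch rather than an official diagnostic: `describe_cuda` and `pick_device` are hypothetical helpers, and the import is guarded so the script also runs on machines where PyTorch is missing.

```python
def describe_cuda() -> dict:
    """Summarize what PyTorch can see; safe to run on machines without a GPU."""
    try:
        import torch
    except ImportError:
        return {"torch_installed": False, "cuda_available": False, "devices": []}

    available = torch.cuda.is_available()
    count = torch.cuda.device_count() if available else 0
    return {
        "torch_installed": True,
        "cuda_available": available,
        "torch_cuda_build": torch.version.cuda,  # None for CPU-only builds
        "devices": [torch.cuda.get_device_name(i) for i in range(count)],
    }


def pick_device(info: dict):
    """Return a value for the `device` argument: GPU 0 if usable, else 'cpu'."""
    return 0 if info["cuda_available"] and info["devices"] else "cpu"


if __name__ == "__main__":
    info = describe_cuda()
    print(info)
    print("device =", pick_device(info))
```

If `cuda_available` is `False` while `nvidia-smi` lists a GPU, the installed PyTorch build is a likely suspect, for example a CPU-only wheel.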

!!! warning "Older GPUs and cuDNN 9.11.0+"

Support for GPU architectures earlier than Turing — compute capability (SM) below 7.5, such as the 1080 Ti — was dropped in cuDNN 9.11.0. On an older GPU you may need a build of PyTorch compiled against an earlier CUDA/cuDNN version. Check your setup with:

```python
import torch

cap = torch.cuda.get_device_capability(0) if torch.cuda.is_available() else (0, 0)
cudnn = torch.backends.cudnn.version() or 0
ok = "not compatible" if cudnn >= 91100 and (cap[0] < 7 or (cap[0] == 7 and cap[1] < 5)) else "should be ok"
print(f"Compute capability: SM {cap[0]}.{cap[1]}, cuDNN: {cudnn} => {ok}")
```

Model Training Issues

Common training problems and their fixes are covered below.

Verification of Configuration Settings

Issue: You are unsure whether the configuration settings in the .yaml file are being applied correctly during model training.

Solution: The configuration settings in the .yaml file should be applied when using the model.train() function. To ensure that these settings are correctly applied, follow these steps:

  • Confirm that the path to your .yaml configuration file is correct.

  • Make sure you pass the path to your .yaml file as the data argument when calling model.train(), as shown below:

    ```python
    model.train(data="/path/to/your/data.yaml", batch=4)
    ```
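
To rule out a wrong path before training starts, the file can be checked up front. This is a minimal sketch using only the standard library; `resolve_data_yaml` is a hypothetical helper, not part of Ultralytics.

```python
from pathlib import Path


def resolve_data_yaml(path: str) -> str:
    """Return the absolute path to a dataset YAML, or raise if it is missing."""
    yaml_path = Path(path).expanduser().resolve()
    if yaml_path.suffix.lower() not in {".yaml", ".yml"}:
        raise ValueError(f"Expected a .yaml/.yml file, got: {yaml_path.name}")
    if not yaml_path.is_file():
        raise FileNotFoundError(f"Dataset config not found: {yaml_path}")
    return str(yaml_path)


# Usage (requires the ultralytics package and a real dataset file):
# from ultralytics import YOLO
# model = YOLO("yolo26n.pt")
# model.train(data=resolve_data_yaml("/path/to/your/data.yaml"), batch=4)
```

Failing early with the resolved absolute path makes it clear which file was actually looked up, which helps when a relative path is resolved from an unexpected working directory.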
    

Accelerating Training with Multiple GPUs

Issue: Training is slow on a single GPU, and you want to speed up the process using multiple GPUs.

Solution: Increasing the batch size can accelerate training, but it's essential to consider GPU memory capacity. To speed up training with multiple GPUs, follow these steps:

  • Ensure that you have multiple GPUs available.

  • Set the device argument to a list of GPU indices, e.g., device=[0,1,2,3].

  • Increase the batch size accordingly to fully utilize the multiple GPUs without exceeding memory limits.

  • Modify your training command to utilize multiple GPUs:

    ```python
    # Adjust the batch size and other settings as needed to optimize training speed
    model.train(data="/path/to/your/data.yaml", batch=32, device=[0, 1, 2, 3])
    ```
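
When picking a batch size for several GPUs, it can help to choose a value that divides evenly by the number of devices. The sketch below assumes the total batch is split evenly across the listed GPUs, which is an assumption about the training setup and not something this guide states. `fit_batch_to_gpus` is a hypothetical helper.

```python
def fit_batch_to_gpus(batch: int, devices: list) -> int:
    """Round `batch` down to the nearest multiple of the GPU count (at least one image per GPU)."""
    n = len(devices)
    if n == 0:
        raise ValueError("devices must list at least one GPU index")
    return max(n, (batch // n) * n)


devices = [0, 1, 2, 3]
batch = fit_batch_to_gpus(30, devices)
print(f"batch={batch}, about {batch // len(devices)} images per GPU")
# prints "batch=28, about 7 images per GPU"
```

The result can then be passed as the `batch` argument together with `device=devices`. Memory limits still apply per GPU, so reduce the value if you run out of memory.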
    

Continuous Monitoring Parameters

Issue: You want to know which parameters should be continuously monitored during training, apart from loss.

Solution: Loss is a crucial metric, but you should also track metrics that reflect detection quality. Key metrics to monitor during training include:

  • Precision
  • Recall
  • Mean Average Precision (mAP)

You can access these metrics from the training logs or visualize them with tools like TensorBoard or wandb. Implementing early stopping based on these metrics can help you achieve better results.
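
One way to inspect these metrics is to read the per-epoch results file from the run folder. This is a minimal sketch: it assumes a `results.csv` file with a column named `metrics/mAP50-95(B)`, which may differ between versions and tasks, and `best_epoch` is a hypothetical helper.

```python
import csv


def best_epoch(csv_path: str, metric: str = "metrics/mAP50-95(B)") -> dict:
    """Return the row of a training results CSV with the highest value of `metric`."""
    with open(csv_path, newline="") as f:
        # Strip whitespace, since some versions pad column names and values
        rows = [
            {k.strip(): v.strip() for k, v in row.items() if k is not None and v is not None}
            for row in csv.DictReader(f)
        ]
    if not rows or metric not in rows[0]:
        raise KeyError(f"Column {metric!r} not found in {csv_path}")
    return max(rows, key=lambda row: float(row[metric]))


# Usage (the path is an assumption; use your own run folder):
# best = best_epoch("runs/detect/train/results.csv")
# print(best["epoch"], best["metrics/mAP50-95(B)"])
```

If the best epoch is far behind the last one, the later epochs added little. The `patience` training argument (for example, `patience=20`) is the usual way to stop such runs early.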

Tools for Tracking Training Progress

Issue: You are looking for recommendations on tools to track training progress.

Solution: To track and visualize training progress, you can consider using the following tools:

  • TensorBoard: TensorBoard is a popular choice for visualizing training metrics, including loss, accuracy, and more. You can integrate it with your YOLO26 training process.
  • Comet: Comet provides an extensive toolkit for experiment tracking and comparison. It allows you to track metrics, hyperparameters, and even model weights. Integration with YOLO models is also straightforward, providing you with a complete overview of your experiment cycle.
  • Ultralytics Platform: Ultralytics Platform offers a specialized environment for tracking YOLO models, giving you a one-stop platform to manage metrics, datasets, and even collaborate with your team. Given its tailored focus on YOLO, it offers more customized tracking options.

How to Check if Training is Happening on the GPU

Issue: The 'device' value in the training logs is 'null,' and you're unsure if training is happening on the GPU.

Solution: The 'device' value being 'null' typically means that the training process is set to automatically select an available GPU, which is the default behavior. To train on a specific GPU, set the device argument when you start training. device is a training argument, so setting it in your dataset .yaml has no effect:

=== "Python"

```python
from ultralytics import YOLO

# Load a model
model = YOLO("yolo26n.pt")

# Use GPU 0; device=[0, 1] for multiple GPUs, device="cpu" for CPU
model.train(data="path/to/data.yaml", device=0)
```

=== "CLI"

```bash
yolo train data=path/to/data.yaml device=0
```

Keep an eye on the 'runs' folder for logs and metrics to monitor training progress effectively.

Key Considerations for Effective Model Training

Keep the following in mind if you're facing issues with model training.

Dataset Format and Labels

  • Importance: The foundation of any machine learning model lies in the quality and format of the data it is trained on.
  • Recommendation: Ensure that your custom dataset and its associated labels adhere to the expected format. It's crucial to verify that annotations are accurate and of high quality. Incorrect or subpar annotations can derail the model's learning process, leading to unpredictable outcomes.
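
A quick format check can catch many labeling mistakes before training. The sketch below assumes detection labels in the YOLO text format, one object per line as `class x_center y_center width height` with coordinates normalized to 0..1. Segmentation or pose labels have more values and would need a different check. `check_label_line` is a hypothetical helper.

```python
def check_label_line(line: str, num_classes: int) -> list:
    """Return a list of problems found in one detection label line (empty if none)."""
    parts = line.split()
    if len(parts) != 5:
        return [f"expected 5 values, found {len(parts)}"]
    try:
        class_id = int(parts[0])
        coords = [float(p) for p in parts[1:]]
    except ValueError:
        return ["class must be an integer and coordinates must be numbers"]
    problems = []
    if not 0 <= class_id < num_classes:
        problems.append(f"class {class_id} outside 0..{num_classes - 1}")
    if any(not 0.0 <= c <= 1.0 for c in coords):
        problems.append("coordinates must be normalized to the range 0..1")
    if coords[2] <= 0 or coords[3] <= 0:
        problems.append("width and height must be positive")
    return problems


print(check_label_line("2 0.48 0.63 0.12 0.20", num_classes=80))  # well-formed line
print(check_label_line("2 310 220 64 128", num_classes=80))  # pixel values, not normalized
```

Running this over every line of every label file gives a fast first pass. It cannot tell whether a box is drawn around the right object, so visual spot checks are still needed.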

Model Convergence

  • Importance: Achieving model convergence ensures that the model has sufficiently learned from the training data.
  • Recommendation: When training a model 'from scratch', it's vital to ensure that the model reaches a satisfactory level of convergence. This might necessitate a longer training duration, with more epochs, compared to when you're fine-tuning an existing model.

Learning Rate and Batch Size

  • Importance: These hyperparameters play a pivotal role in determining how the model updates its weights during training.
  • Recommendation: Regularly evaluate if the chosen learning rate and batch size are optimal for your specific dataset. Parameters that are not in harmony with the dataset's characteristics can hinder the model's performance.

Class Distribution

  • Importance: The distribution of classes in your dataset can influence the model's prediction tendencies.
  • Recommendation: Regularly assess the distribution of classes within your dataset. If there's a class imbalance, there's a risk that the model will develop a bias towards the more prevalent class. This bias can be evident in the confusion matrix, where the model might predominantly predict the majority class.
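
Class balance can be measured directly from the label files. This is a minimal sketch, assuming YOLO-format `.txt` labels whose first value on each line is the class index. `count_classes` and `imbalance_ratio` are hypothetical helpers.

```python
from collections import Counter
from pathlib import Path


def count_classes(labels_dir: str) -> Counter:
    """Count labeled instances per class id across all .txt label files in a folder."""
    counts = Counter()
    for label_file in Path(labels_dir).glob("*.txt"):
        for line in label_file.read_text().splitlines():
            parts = line.split()
            if parts:  # skip blank lines
                counts[int(parts[0])] += 1
    return counts


def imbalance_ratio(counts: Counter) -> float:
    """Ratio of the most common class to the least common class (1.0 means balanced)."""
    if not counts:
        return 1.0
    return max(counts.values()) / min(counts.values())


# Usage (the folder path is an assumption; point it at your own labels):
# counts = count_classes("path/to/dataset/labels/train")
# print(counts.most_common(), imbalance_ratio(counts))
```

A large ratio suggests the majority class may dominate predictions, which is the pattern to look for in the confusion matrix.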

Cross-Check with Pretrained Weights

  • Importance: Leveraging pretrained weights can provide a solid starting point for model training, especially when data is limited.
  • Recommendation: As a diagnostic step, consider training your model using the same data but initializing it with pretrained weights. If this approach yields a well-formed confusion matrix, it could suggest that the 'from scratch' model might require further training or adjustments.

Model Prediction Issues

Common problems encountered during model prediction and their fixes are covered below.

Getting Bounding Box Predictions With Your YOLO26 Custom Model

Issue: When running predictions with a custom YOLO26 model, there are challenges with the format and visualization of the bounding box coordinates.

Solution:

  • Coordinate Format: YOLO26 provides bounding box coordinates in absolute pixel values. To convert these to relative coordinates (ranging from 0 to 1), you need to divide by the image dimensions. For example, let's say your image size is 640x640. Then you would do the following:

    ```python
    # Convert absolute coordinates to relative coordinates
    x1 = x1 / 640  # Divide x-coordinates by image width
    x2 = x2 / 640
    y1 = y1 / 640  # Divide y-coordinates by image height
    y2 = y2 / 640
    ```
    
  • File Name: To obtain the file name of the image you're predicting on, access the image file path directly from the result object within your prediction loop.
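
For images that are not square, width and height must be applied to the matching axes. The sketch below generalizes the conversion. `to_relative` is a hypothetical helper, and the commented lines show result attributes that are assumed to be available in your installed Ultralytics version.

```python
def to_relative(box, image_width: int, image_height: int) -> tuple:
    """Convert an absolute (x1, y1, x2, y2) box in pixels to 0..1 relative coordinates."""
    x1, y1, x2, y2 = box
    return (x1 / image_width, y1 / image_height, x2 / image_width, y2 / image_height)


# A box on a 1280x720 image (width x height)
print(to_relative((320, 180, 960, 540), image_width=1280, image_height=720))
# → (0.25, 0.25, 0.75, 0.75)

# With Ultralytics results (assumes the ultralytics package is installed):
# for result in model.predict(source):
#     height, width = result.orig_shape  # original image size as (height, width)
#     print(result.path)                 # file path of the source image
#     print(result.boxes.xyxyn)          # boxes already normalized to 0..1
```

Using the original image size rather than the inference size keeps the relative coordinates valid for the source image.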

Filtering Objects in YOLO26 Predictions

Issue: You want to filter the prediction results so that only specific objects are detected and displayed when running YOLO26 with the Ultralytics library.

Solution: To detect specific classes, use the `classes` argument to list the class indices you want in the output. For instance, to detect only cars (assuming 'car' has class index 2):

```bash
yolo task=segment mode=predict model=yolo26n-seg.pt source='path/to/car.mp4' show=True classes=2
```
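
The same idea can be expressed in Python. The sketch below illustrates the filtering logic on a simplified stand-in for prediction results, a list of `(class_id, confidence)` pairs. `keep_classes` is a hypothetical helper, and the commented lines show the assumed equivalent call in the Python API.

```python
def keep_classes(detections, wanted):
    """Keep only detections whose class id is in `wanted`.

    `detections` is a list of (class_id, confidence) pairs, a simplified
    stand-in for real prediction results.
    """
    wanted = set(wanted)
    return [d for d in detections if d[0] in wanted]


detections = [(2, 0.91), (0, 0.88), (2, 0.67), (7, 0.52)]
print(keep_classes(detections, wanted=[2]))  # → [(2, 0.91), (2, 0.67)]

# With the Ultralytics Python API, the filter is applied at prediction time:
# from ultralytics import YOLO
# model = YOLO("yolo26n-seg.pt")
# results = model.predict("path/to/car.mp4", classes=[2], show=True)
```

Filtering at prediction time is usually preferable to filtering afterwards, since unwanted classes never reach the output or the visualization.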

Understanding Precision Metrics in YOLO26

Issue: Confusion regarding the difference between box precision, mask precision, and confusion matrix precision in YOLO26.

Solution: Box precision measures the accuracy of predicted bounding boxes compared to the actual ground truth boxes using IoU (Intersection over Union) as the metric. Mask precision assesses the agreement between predicted segmentation masks and ground truth masks in pixel-wise object classification. Confusion matrix precision, on the other hand, focuses on overall classification accuracy across all classes and does not consider the geometric accuracy of predictions. It's important to note that a bounding box can be geometrically accurate (true positive) even if the class prediction is wrong, leading to differences between box precision and confusion matrix precision. These metrics evaluate distinct aspects of a model's performance, reflecting the need for different evaluation metrics in various tasks.
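
To make the role of IoU concrete, the sketch below computes IoU for two boxes and a simplified box precision at a single IoU threshold. It is an illustration under stated assumptions, not the evaluation code used by Ultralytics: it ignores class labels and confidence ordering, and `iou` and `precision` are hypothetical helpers.

```python
def iou(a, b) -> float:
    """Intersection over Union of two (x1, y1, x2, y2) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def precision(predictions, ground_truth, iou_threshold=0.5) -> float:
    """Fraction of predicted boxes that match an unused ground-truth box."""
    unmatched = list(ground_truth)
    true_positives = 0
    for pred in predictions:
        best = max(unmatched, key=lambda gt: iou(pred, gt), default=None)
        if best is not None and iou(pred, best) >= iou_threshold:
            true_positives += 1
            unmatched.remove(best)  # each ground-truth box can be matched once
    return true_positives / len(predictions) if predictions else 0.0


truth = [(0, 0, 10, 10), (20, 20, 30, 30)]
preds = [(0, 0, 10, 10), (21, 21, 31, 31), (50, 50, 60, 60)]
print(precision(preds, truth))  # two of the three predictions match a ground-truth box
```

Mask precision follows the same pattern with the overlap measured between pixel masks instead of boxes.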

Extracting Object Dimensions in YOLO26

Issue: You have difficulty retrieving the width and height of detected objects in YOLO26, especially when multiple objects are detected in an image.

Solution: To retrieve the bounding box dimensions, first use the Ultralytics YOLO26 model to predict objects in an image. Then, extract the width and height information of bounding boxes from the prediction results.

```python
from ultralytics import YOLO

# Load a pretrained YOLO26 model
model = YOLO("yolo26n.pt")

# Specify the source image
source = "https://ultralytics.com/images/bus.jpg"

# Make predictions
results = model.predict(source, save=True, imgsz=320, conf=0.25)

# Extract bounding box dimensions (xywh = center x, center y, width, height in pixels)
boxes = results[0].boxes.xywh.cpu()
for box in boxes:
    x, y, w, h = box.tolist()  # convert tensor values to plain floats
    print(f"Width of Box: {w:.1f}, Height of Box: {h:.1f}")
```

Deployment Challenges

GPU Deployment Issues

Issue: Deploying models in a multi-GPU environment can sometimes lead to unexpected behavior, such as unexpectedly high memory usage or inconsistent results across GPUs.

Solution: Check for default GPU initialization. Some frameworks, like PyTorch, might initialize CUDA operations on a default GPU before transitioning to the designated GPUs. To avoid unexpected default initializations, specify the GPU directly during deployment and prediction. Then monitor GPU utilization and memory usage (for example, with `nvidia-smi`) to identify anomalies in real time. Also, ensure you're using the latest version of the framework or library.
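
One common way to keep a process off the default GPU is to restrict which devices it can see before CUDA is initialized. This is a sketch of that general technique, not a step prescribed by this guide, and `pin_visible_gpus` is a hypothetical helper.

```python
import os


def pin_visible_gpus(indices: str) -> str:
    """Restrict this process to the given GPUs, e.g. '0' or '0,1'.

    Must run before the first CUDA call (ideally before importing torch),
    because the variable is read when CUDA is initialized.
    """
    os.environ["CUDA_VISIBLE_DEVICES"] = indices
    return os.environ["CUDA_VISIBLE_DEVICES"]


pin_visible_gpus("1")  # only physical GPU 1 is visible to this process

# Alternative within Ultralytics: skip the environment variable and pass the
# target GPU directly (use one approach or the other, not both):
# from ultralytics import YOLO
# model = YOLO("yolo26n.pt")
# results = model.predict("path/to/image.jpg", device=1)
```

After either approach, watching `nvidia-smi` while the model runs shows whether memory is allocated only on the intended GPU.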

Model Conversion/Exporting Issues

Issue: During the process of converting or exporting machine learning models to different formats or platforms, users might encounter errors or unexpected behaviors.

Solution: Review the supported formats and per-format options in the Export mode documentation, then work through these checks:

  • Compatibility Check: Ensure that you are using versions of libraries and frameworks that are compatible with each other. Mismatched versions can lead to unexpected errors during conversion.
  • Environment Reset: If you're using an interactive environment like Jupyter or Colab, consider restarting your environment after making significant changes or installations. A fresh start can sometimes resolve underlying issues.
  • Official Documentation: Always refer to the official documentation of the tool or library you are using for conversion. It often contains specific guidelines and best practices for model exporting.
  • Community Support: Check the library or framework's official repository for similar issues reported by other users. The maintainers or community might have provided solutions or workarounds in discussion threads.
  • Update Regularly: Ensure that you are using the latest version of the tool or library. Developers frequently release updates that fix known bugs or improve functionality.
  • Test Incrementally: Before performing a full conversion, test the process with a smaller model or dataset to identify potential issues early on.
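
An incremental test can be as simple as exporting a small model and loading the exported file back for one prediction. The following is a sketch assuming the `ultralytics` package is installed, the ONNX export dependencies are available, and the weights and sample image can be downloaded. `export_and_reload` is a hypothetical helper.

```python
def export_and_reload(weights: str = "yolo26n.pt", fmt: str = "onnx", imgsz: int = 320):
    """Export a small model, then load the exported file to confirm it runs."""
    from ultralytics import YOLO  # imported lazily so the helper can be defined anywhere

    exported_path = YOLO(weights).export(format=fmt, imgsz=imgsz)
    exported_model = YOLO(exported_path)  # reload the exported artifact
    # Use the same image size as the export, since exported models often have fixed inputs
    return exported_model.predict("https://ultralytics.com/images/bus.jpg", imgsz=imgsz)


# Usage (downloads weights and a sample image on first run):
# results = export_and_reload()
# print(len(results[0].boxes), "objects detected by the exported model")
```

If the small model exports and predicts correctly but your own model does not, the problem is more likely in the custom weights or export options than in the environment.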

Community and Support

Get help and share solutions through these channels and resources.

Forums and Channels for Getting Help

GitHub Issues: The Ultralytics repository on GitHub has an Issues tab where you can ask questions, report bugs, and suggest new features. The community and maintainers are active here, and it's a great place to get help with specific problems.

Ultralytics Discord Server: Ultralytics has a Discord server where you can interact with other users and the developers.

Official Documentation and Resources

Ultralytics YOLO26 Docs: The official documentation provides a comprehensive overview of YOLO26, along with guides on installation, usage, and troubleshooting.

Conclusion

Most YOLO26 issues trace back to a handful of causes: version mismatches, dataset formatting, and GPU configuration. When an error isn't covered here, search the GitHub Issues tab or ask on the Discord server — chances are someone has already solved it. For deeper training problems, see the Model Training Tips guide for practical advice on getting better results with your computer vision projects.

FAQ

How do I resolve installation errors with YOLO26?

Installation errors can often be due to compatibility issues or missing dependencies. Ensure you use Python 3.8 or later and have PyTorch 1.8 or later installed. It's beneficial to use virtual environments to avoid conflicts. For a step-by-step installation guide, follow our official installation guide. If you encounter import errors, try a fresh installation or update the library to the latest version.

Why is my YOLO26 model training slow on a single GPU?

Training on a single GPU might be slow due to large batch sizes or insufficient memory. To speed up training, use multiple GPUs. Ensure your system has multiple GPUs available and set the device argument, e.g., device=[0,1,2,3]. Increase the batch size accordingly to fully utilize the GPUs without exceeding memory limits. Example command:

```python
model.train(data="/path/to/your/data.yaml", batch=32, device=[0, 1, 2, 3])
```

How can I ensure my YOLO26 model is training on the GPU?

If the 'device' value shows 'null' in the training logs, it generally means the training process is set to automatically select an available GPU. To explicitly assign a specific GPU, pass the `device` argument when you start training, for example `yolo train data=path/to/data.yaml device=0` for the first GPU. Run `nvidia-smi` to confirm your CUDA setup.

How can I monitor and track my YOLO26 model training progress?

Tracking and visualizing training progress can be efficiently managed through tools like TensorBoard, Comet, and Ultralytics Platform. These tools allow you to log and visualize metrics such as loss, precision, recall, and mAP. Implementing early stopping based on these metrics can also help achieve better training outcomes.

What should I do if YOLO26 is not recognizing my dataset format?

Ensure your dataset and labels conform to the expected format. Verify that annotations are accurate and of high quality. If you face any issues, refer to the Data Collection and Annotation guide for best practices. For more dataset-specific guidance, check the Datasets section in the documentation.