Software To Help You Turn Your Data Into AI
Forget fragmented workflows, annotation tools, and Notebooks for building AI applications. Encord Data Engine accelerates every step of taking your model into production.
2023 is upon us! Encord is happy to be back in the office after recharging over the holidays, and charging forward to enhance the annotation and machine learning development experience of our partners. We're starting the year off with several strong improvements to the DICOM tool, balanced by universal improvements to the label editor and foundational changes to our workflow capabilities. Read on!
When making annotations, it's often important to have context beyond the image data itself, such as confirming the study identifier or which manufacturer it came from. You can now access the metadata for a given series by clicking the icon next to the view angle selection interface, allowing you to confirm any necessary details. Drag the metadata overview to a convenient location so you can monitor it while you annotate!
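For context on what this metadata contains, here is a short illustrative snippet reading the same kinds of tags with pydicom (a third-party library we use here purely for illustration; the file path is a placeholder):

```python
import pydicom

# Placeholder path; any DICOM file from the series would do.
ds = pydicom.dcmread("series/slice_001.dcm")

# The kinds of details the in-editor metadata overview surfaces:
print(ds.StudyInstanceUID)   # study identifier
print(ds.Manufacturer)       # scanner manufacturer
print(ds.SeriesDescription)  # human-readable series label
```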
Medical professionals often need to view corresponding slices from multiple series in the same study simultaneously. We already provide the ability to view slices from different series in one session, and you will soon be able to submit the entire study at once when done with your annotations. This will allow you to treat studies, not just series, as units of annotation work, making it easier to progress through your work in the way that makes sense to you.
Annotation tasks that require a specific arrangement of series and study data can be made much easier not only by custom window arrangements, but by pre-populating the view windows with the data you need. Encord is soon rolling out custom hanging protocol presets, which can automatically fill the view windows with the correct series and slice information based on the type of DICOM data being loaded. Contact us about setting up hanging protocol presets for your data type and modality, and get ready for the next level of productivity.
Now that you've got your view windows arranged just how you want, take full advantage by annotating in those views. The first version of multi-view annotation lets you use the 'Start annotating' button to set a main annotation window and annotate in that window going forward. Future iterations are in the works to allow seamless annotation in all views without switching. Don't hesitate to get in touch if you feel this is something you or your team would benefit from.
Determining the best scoring metric for automated benchmark QA functionality, whether in production or training projects, is often an iterative, empirical process. We’re introducing the ability to update benchmark scoring parameters in ongoing projects to help teams iterate and discover the system that works for them. Use the benchmark function editor in the project’s settings, and then hit ‘Recalculate scores’ in the performance dashboard to test the effects of your changes. As always, contact us if you’re interested in our benchmark QA or annotator training functionality.
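For intuition, here is a hypothetical sketch of the shape a benchmark scoring function can take; the mask-IoU formulation below is our own illustration, not Encord's actual scoring implementation:

```python
import numpy as np

def mask_iou(pred: np.ndarray, truth: np.ndarray) -> float:
    """IoU between an annotator's boolean mask and the benchmark mask."""
    union = np.logical_or(pred, truth).sum()
    if union == 0:
        return 1.0  # both masks empty: perfect agreement
    return np.logical_and(pred, truth).sum() / union

def benchmark_score(pairs) -> float:
    """Mean IoU over (annotator mask, benchmark mask) pairs.

    Changing this function (say, adding a class-agreement term or a
    minimum-IoU cutoff) is exactly the kind of iteration you would now
    recalculate scores for in an ongoing project.
    """
    return float(np.mean([mask_iou(p, t) for p, t in pairs]))
```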
Working with deep, complex ontologies can be a delicate operation, with lots of details to get right. To make annotating complex objects a more stress-free task, you can now clear your selection of radio-button ontology attributes as well. So go on, click around. No worries.
And for those working with special image or video data, we've added gamma correction alongside the standard brightness and contrast controls, making it easier to work with media whose annotation targets may be hard to distinguish from the background. We would love to hear how these tools amplify your annotation output!
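For intuition, gamma correction remaps pixel intensities nonlinearly, which is what lets it reveal detail that linear brightness or contrast shifts leave crushed. A minimal NumPy sketch of the standard formula (our illustration, not the editor's internal implementation):

```python
import numpy as np

def apply_gamma(image: np.ndarray, gamma: float) -> np.ndarray:
    """Apply gamma correction to an 8-bit image.

    gamma > 1 lifts shadows; gamma < 1 darkens them. Unlike a linear
    brightness shift, the full 0-255 range is preserved at both ends.
    """
    normalized = image.astype(np.float32) / 255.0
    corrected = np.power(normalized, 1.0 / gamma)
    return (corrected * 255.0).astype(np.uint8)
```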
Large data uploads can often be interrupted by unstable network conditions. Now you can use the new 'skip_duplicate_urls' parameter in the upload specification files to remove the hassle of de-duplicating data when retrying interrupted upload tasks! Details are in the documentation.
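For illustration, here is a hedged sketch of where the flag might sit in an upload specification. Only 'skip_duplicate_urls' itself is the parameter discussed above; the surrounding field names and URLs are placeholders, so check the documentation for the exact schema:

```python
# Illustrative upload specification; the field layout is an assumption,
# only 'skip_duplicate_urls' is the flag described above.
upload_spec = {
    "skip_duplicate_urls": True,  # re-running the upload skips URLs already ingested
    "videos": [
        {"objectUrl": "https://storage.example.com/bucket/video_001.mp4"},
        {"objectUrl": "https://storage.example.com/bucket/video_002.mp4"},
    ],
}
```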
We're winding down this month's update, but just winding up for the year. DICOM and related tools have seen a large bump in power and versatility over the last month, so don't hesitate to get in touch about how you can use these features, or if you're curious about the upcoming evolution of these powerful tools. We're looking forward to expanding more universal features such as enhanced workflow management and collaboration tools soon. Stay tuned and stay in touch. 2023, here we come!
Join the Encord Developers community to discuss the latest in computer vision, machine learning, and data-centric AI.

Related Blogs
In computer vision, where accurate training data is the lifeblood of successful models, video annotation plays an important role. However, annotating each frame individually is time-consuming and prone to inconsistencies. Nearby frames often exhibit visual similarities, and annotations made on one frame can be extrapolated to others. Enter automated polygon and bitmask tracking!

Automated segmentation tracking significantly reduces annotation time while simultaneously improving accuracy: gone are the days of tediously labeling every frame in a video. Polygon and bitmask tracking provide the tooling required to build labeled training data at scale and at speed. Polygon tracking meticulously outlines objects with a series of interconnected vertices, offering unparalleled precision and flexibility in video annotation. Conversely, bitmask tracking simplifies the annotation process by representing object masks as binary images, streamlining efficiency without compromising clarity. Join us as we explore these techniques, which are not just enhancing the process of video annotation but also paving the way for more accurate and efficient machine learning models. 🚀

Understanding Polygon and Bitmask Tracking

Polygon Tracking

A polygon is a geometric shape defined by a closed loop of straight-line segments. It can have three or more sides, forming a boundary around an area. In video annotation, polygons are used to outline objects of interest within frames. By connecting a series of vertices, we create a polygon that encapsulates the object's shape.

Advantages of Polygon-Based Tracking

Accurate Boundary Representation: Polygons provide a precise representation of an object's boundary. Unlike bounding boxes (which are rectangular and may not align perfectly with irregular shapes), polygons can closely follow the contours of complex objects.

Flexibility: Polygons are versatile. They can adapt to various object shapes, including non-rectangular ones. Whether you're tracking a car, a person, or an animal, polygons allow for flexibility in annotation.

Use Cases of Polygon Tracking

Object Segmentation: When segmenting objects from the background, polygons excel. For instance, in medical imaging, they help delineate tumors or organs.

Motion Analysis: Tracking moving objects often involves polygon-based annotation. Analyzing the trajectory of a soccer ball during a match or monitoring pedestrian movement in surveillance videos are examples.

Bitmask Tracking

A bitmask is a binary image where each pixel corresponds to a specific object label. Instead of outlining the object's boundary, bitmasks assign a unique value (usually an integer) to each pixel within the object region. These values act as identifiers, allowing pixel-level annotation.

Advantages of Bitmask-Based Tracking

Bitmasks enable precise delineation at the pixel level. By assigning values to individual pixels, we achieve accurate object boundaries. This is especially useful when dealing with intricate shapes or fine details.

Use Cases of Bitmask Tracking

Semantic Segmentation: In semantic segmentation tasks, where the goal is to classify each pixel into predefined classes (e.g., road, sky, trees), bitmasks play a vital role. They provide ground truth labels for training deep learning models.

Instance Segmentation: For scenarios where multiple instances of the same object class appear in a frame (e.g., identifying individual cars in a traffic scene), bitmask tracking ensures each instance is uniquely labeled.
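To make the two representations concrete, here is a minimal sketch that rasterizes a polygon annotation into an equivalent bitmask. OpenCV and NumPy are our choice of illustration, and the vertex coordinates and frame size are placeholders:

```python
import numpy as np
import cv2

# A polygon annotation: a closed loop of (x, y) vertices outlining the object.
polygon = np.array([[120, 80], [200, 60], [260, 140], [180, 220], [100, 160]],
                   dtype=np.int32)

# A bitmask annotation covers the same object as a binary image:
# every pixel inside the object region carries the object's label value.
height, width = 300, 320  # frame size (illustrative)
bitmask = np.zeros((height, width), dtype=np.uint8)
cv2.fillPoly(bitmask, [polygon], color=1)  # label value 1 marks the object

# The polygon stores five vertices; the bitmask stores a per-pixel label,
# which is why bitmasks capture fine detail at the cost of more storage.
print(polygon.shape)  # (5, 2)
print(bitmask.sum())  # number of pixels inside the object
```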
Temporal Consistency

Maintaining temporal consistency when annotating objects in a video is crucial. This means that the annotations for an object should be consistent from one frame to the next. Inconsistent annotations can lead to inaccurate results when the annotated data is used for training machine learning models.

Temporal smoothing and interpolation techniques can be used to improve the consistency of the tracking. Temporal smoothing involves averaging the annotations over several frames to reduce the impact of any sudden changes. Interpolation, on the other hand, involves estimating the annotations for missing frames based on the annotations of surrounding frames. Both techniques can greatly improve the quality and consistency of your annotations. Read the documentation to learn how to use interpolation in your annotation; a minimal sketch of keyframe interpolation follows below.
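As promised above, here is a minimal sketch of linear keyframe interpolation. It is NumPy-based and purely illustrative; the frame indices and vertex arrays are assumptions, not Encord's actual interpolation implementation:

```python
import numpy as np

def interpolate_polygon(poly_a: np.ndarray, poly_b: np.ndarray,
                        frame_a: int, frame_b: int, frame: int) -> np.ndarray:
    """Linearly interpolate matching polygon vertices between two keyframes."""
    t = (frame - frame_a) / (frame_b - frame_a)  # 0.0 at frame_a, 1.0 at frame_b
    return (1 - t) * poly_a + t * poly_b

# Keyframe annotations on frames 0 and 10; estimate the polygon on frame 4.
poly_f0 = np.array([[10.0, 10.0], [50.0, 12.0], [48.0, 60.0]])
poly_f10 = np.array([[20.0, 14.0], [60.0, 18.0], [58.0, 66.0]])
print(interpolate_polygon(poly_f0, poly_f10, 0, 10, 4))
```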
Applications of Polygon and Bitmask Tracking

Object Detection and Tracking

With polygon tracking, objects of any shape can be accurately annotated, making it particularly useful for tracking objects that have irregular shapes or change shape over time. Bitmask tracking takes this a step further by marking each individual pixel, capturing even the smallest details of the object. This level of precision is crucial for detecting and tracking objects accurately within a video.

Semantic Segmentation

In semantic segmentation, the goal is to classify each pixel in the image into a particular class, making it a highly detailed task. Bitmask tracking, with its ability to mark each individual pixel, is perfectly suited for this. It allows for the creation of highly accurate masks that can be used to train models for semantic segmentation. Polygon tracking can also be used for semantic segmentation, especially in scenarios where the objects being segmented have clear, defined boundaries.

Interactive Video Editing

Interactive video editing is a process where users can manipulate and modify video content, with tasks such as object removal, color grading, and adding special effects. Polygon and bitmask tracking can greatly enhance this process: objects within the video can be accurately tracked and annotated, making it easier to apply edits consistently across multiple frames. This leads to more seamless, higher-quality edits.

Semantic Context and Automation

Semantic Context

Scene Understanding: When placing polygons or bitmasks for video annotation, it's crucial to consider the context of the scene. The semantics of the scene can guide accurate annotations. For instance, understanding the environment, the objects present, and their spatial relationships can help in placing more accurate and meaningful annotations.

Object Relationships: The way objects interact within a scene significantly affects annotation choices. Interactions such as occlusion (where one object partially or fully hides another) and containment (where one object is inside another) need to be considered. Understanding these relationships can lead to more accurate and contextually relevant annotations.

Automated Annotation Tools

AI Assistance: With the advancement of machine learning models, we now have the capability to propose initial annotations automatically. These AI tools can significantly reduce the manual effort required in the annotation process. They can quickly analyze a video frame and suggest potential annotations based on learned patterns and features.

Human Refinement: While AI tools can propose initial annotations, human annotators are still needed to refine these automated results for precision. Annotators can correct any inaccuracies and add nuances that the AI might have missed. This combination of AI assistance and human refinement leads to a more efficient and accurate video annotation process. Read the blog The Full Guide to Automated Data Annotation for more information.

Real-World Applications

Polygon and bitmask tracking, along with the concepts of semantic context and automation, have a wide range of real-world applications. Here are a few key areas where they are making a significant impact:

Medical Imaging: In medical imaging, precise annotation can mean the difference between a correct and incorrect diagnosis. These techniques allow for highly accurate segmentation of medical images, which can aid in identifying and diagnosing a wide range of medical conditions.

Autonomous Vehicles: Polygon and bitmask tracking allow these vehicles to understand their environment in great detail, helping them make better driving decisions.

Video Surveillance: In video surveillance, tracking objects accurately over time is key to identifying potential security threats. These techniques can improve the accuracy and efficiency of video surveillance systems, making our environments safer.

These are just a few examples of the many possible applications of polygon and bitmask tracking. As these techniques continue to evolve, they are set to revolutionize numerous industries and fields.

In summary, polygon and bitmask tracking are transforming video annotation, paving the way for more precise machine learning models. As we continue to innovate in this space, we're excited to announce that Encord will be releasing new features soon. Stay tuned for these updates and join us in exploring the future of computer vision with Encord. 🚀
Model validation is a key machine learning (ML) lifecycle stage, ensuring models generalize well to new, unseen data. This process is critical for evaluating a model's predictions independently from its training dataset, thus testing its ability to perform reliably in the real world. Model validation helps identify overfitting (where a model learns noise rather than the signal in its training data) and underfitting (where a model is too simplistic to capture complex data patterns). Both are detrimental to model performance. Techniques like the holdout method, cross-validation, and bootstrapping are pivotal in validating model performance, offering insights into how models might perform on unseen data. These methods are integral to deploying AI and machine learning models that are both reliable and accurate.

This article covers two parts:

1. Key model validation techniques, the advantages of a data-centric approach, and how to select the most appropriate validation method for your project.
2. How to validate a pre-trained Mask R-CNN model that segments instances in COVID-19 scans using Encord Active, a data-centric platform for evaluating and validating computer vision (CV) models.

Ready to dive deeper into model validation and discover how Encord Active can enhance your ML projects? Let's dive in!

The Vital Role of a Data-Centric Approach in Model Validation

A data-centric approach to model validation places importance on the quality of the data used to train and deploy computer vision (CV) and artificial intelligence (AI) models. The approach recognizes that the foundation of any robust AI system lies not in the complexity of its algorithms but in the quality of the data it learns from. High-quality, accurately labeled data (with ground truth) ensures that models can truly understand and interpret the nuances of the tasks they are designed to perform, from predictive analytics to real-time decision-making processes.

Why Data Quality is Paramount

The quality of training data is directly proportional to a model's ability to generalize from training to real-world applications. Poor data quality (inaccuracies, biases, label errors, and incompleteness) leads to models that are unreliable, biased, or incapable of making accurate predictions. A data-centric approach prioritizes meticulous data preparation, including thorough data annotation, cleaning, and validation. This ensures the data distribution truly reflects the real world it aims to model and reduces label errors.

Improving Your Model's Reliability Through Data Quality

The reliability of CV models, and more recently foundation models, in critical applications such as healthcare imaging and autonomous driving cannot be overstated. A data-centric approach mitigates the risks associated with model failure by ensuring the data has high fidelity. It involves rigorous validation checks and balances, using your expertise and automated data quality tools to continually improve your label quality and datasets. Adopt a data-centric approach to your AI project and unlock its potential by downloading our whitepaper.

Key Computer Vision Model Validation Techniques

Validating computer vision models after training calls for a data-centric approach that looks at more than just performance and generalizability; it also needs to account for the unique challenges of visual data, like how image quality, lighting, and perspectives can vary.
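Before tailoring these techniques to vision specifically, here is what the most basic check, the holdout method, looks like in code. This is a minimal scikit-learn sketch with a synthetic dataset and model as placeholders; the train/validation gap is the overfitting signal discussed above:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

# Holdout method: keep 20% of the data unseen during training.
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=0)

model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

train_acc = model.score(X_train, y_train)
val_acc = model.score(X_val, y_val)
# A large train/validation gap suggests overfitting; low scores on both
# splits suggest underfitting.
print(f"train={train_acc:.3f} val={val_acc:.3f}")
```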
Tailoring the common validation techniques specifically for computer vision means robustly evaluating the model's ability to analyze visual information and embeddings across diverse scenarios:

Out-of-Sample Validation: Essential for verifying that a CV model can generalize from its training data to new, unseen images or video streams. This approach tests the model's ability to handle variations in image quality, lighting, and subject positioning that it hasn't encountered during training.

Cross-Validation and Stratified K-Fold: Particularly valuable in computer vision for ensuring that every aspect of the visual data is represented in both training and validation sets. Stratified K-Fold is beneficial when dealing with imbalanced datasets, common in computer vision tasks, to maintain an equal representation of classes across folds (a minimal sketch follows this list).

Leave-One-Out Cross-Validation (LOOCV): While computationally intensive, LOOCV can be particularly insightful for small image datasets where every data point's inclusion is crucial for assessing the model's performance on highly nuanced visual tasks.

Bootstrapping: Offers insights into the stability of model predictions across different visual contexts. This method helps you understand how changes to the training data subset can affect the model's performance, which is particularly relevant for models expected to operate in highly variable visual environments.

Adversarial Testing: Tests the model's resilience against slight, often invisible, image changes. This technique is critical to ensuring models are not easily perturbed by minor alterations that would not affect human perception.

Domain-Specific Benchmarks: Participating in domain-specific challenges offered by ImageNet, COCO, or PASCAL VOC can be a reliable validation technique. These benchmarks provide standardized datasets and metrics, allowing you to evaluate a model's performance against a wide range of visual tasks and conditions and ensure it meets industry standards.

Human-in-the-Loop: Involving domain experts in the validation process is invaluable, especially for tasks requiring fine-grained visual distinctions (e.g., medical imaging or facial recognition). This approach helps ensure that the model's interpretations align with human expertise and can handle the subtleties of real-world visual data.
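As referenced in the list above, here is a minimal stratified k-fold sketch with scikit-learn. The synthetic, imbalanced labels stand in for a real CV dataset, and the placeholder features could be image embeddings:

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

# Imbalanced labels: 90% class 0, 10% class 1 (common in CV datasets).
y = np.array([0] * 90 + [1] * 10)
X = np.random.rand(100, 8)  # placeholder features (e.g., image embeddings)

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
for fold, (train_idx, val_idx) in enumerate(skf.split(X, y)):
    # Each fold preserves the 90/10 class ratio in both splits.
    ratio = y[val_idx].mean()
    print(f"fold {fold}: val size={len(val_idx)}, class-1 ratio={ratio:.2f}")
```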
Ensuring a model can reliably interpret and analyze visual information across various conditions requires a careful balance between automated validation methods and human expertise. Choosing the right validation techniques for CV models involves considering the dataset's diversity, the computational resources available, and the application's specific requirements. Luckily, there are model validation tools that let you focus on validating the model while they do the heavy lifting of providing the insights necessary to assess your CV model's performance, including AI-assisted evaluation features. But before walking through Encord Active, let's understand the factors you need to consider when choosing the right tool.

How to Choose the Right Computer Vision Model Validation Tool

When choosing the right model validation tool for computer vision projects, several key factors come into play, each addressing the unique challenges and requirements of working with image data. These considerations ensure that the selected tool accurately evaluates the model's performance and aligns with the project's specific demands. Here's a streamlined guide to making an informed choice:

Data Specificity and Complexity: Opt for tools that cater to the variability and complexity inherent in image data. This means capabilities for handling image-specific metrics, such as Intersection over Union (IoU) for object detection and Mean Absolute Error (MAE) for tasks like classification and segmentation, are crucial.

Robust Data Validation: The tool should adeptly manage image data peculiarities, including potential discrepancies between image annotations and the actual images. Look for features that support comprehensive data validation across various stages of the model development cycle, including pre-training checks and ongoing training validations.

Comprehensive Evaluation Metrics: Essential for thoroughly assessing a computer vision model's performance. The tool should offer a wide array of metrics, including precision-recall curves, ROC curves, and confusion matrices for classification, alongside task-specific metrics like IoU for object detection. It should also support quality metrics for a more holistic, real-world evaluation.

Versatile Performance Evaluation: It should support a broad spectrum of evaluation techniques for deep insights into accuracy, the balance between precision and recall, and the model's ability to distinguish between different classes.

Dataset Management: The validation tool should help with efficient dataset handling for proper training-validation splits. For the sake of performance and scale, it should be able to manage large datasets.

Flexibility and Customization: The fast-paced nature of computer vision demands tools that allow for customization and flexibility. This includes introducing custom metrics, supporting various data types and model architectures, and adapting to specific preprocessing and integration needs.

Considering those factors, you can select a validation tool (open-source toolkits, platforms, etc.) that meets your project's requirements and contributes to developing reliable models.

Using Encord Active to Validate the Performance of Your Computer Vision Model

Encord Active (EA) is a data-centric model validation solution that enables you to curate valuable data that can truly validate your model's real-world generalizability through quality metrics. In this section, you will see how to analyze the performance of a pre-trained Mask R-CNN object detection model on COVID-19 predictions with Encord Active. From the analysis results, you will be able to validate and, if necessary, debug your model's performance.

This walkthrough uses Encord Annotate to create a project and import the dataset, and Encord Active Cloud to analyze the model's failure modes. We recommend you sign up for an Encord account to follow this guide.

Import Predictions

Import your predictions onto the platform. Learn how to import predictions in the documentation. Select the Prediction Set you just uploaded, and Encord Active will use data, label, and model quality metrics to evaluate the performance of your model.

Visualize Model Performance Summary on the Validation Set

Evaluate the model's performance by inspecting the Model Summary dashboard to get an overview of your model's performance on the validation set, with detailed error categorization (true positive vs. false positive vs. false negative), the F1 score, and mean average precision/recall based on an IoU threshold.
Manually Inspect the Model Results

Beyond visualizing a summary of the model's performance, it is more than helpful to use a tool that lets you manually dig in and inspect how your model works on real-world samples. Encord Active provides an Explorer tab that enables you to filter models by metrics to observe the impact of metrics on real-world samples. EA's data-centric build also lets you see how your model correctly or incorrectly makes predictions (detects, classifies, or segments) on the training, validation, and production samples. Let's see how you can achieve this:

On the Model Summary dashboard → click the True Positive Count metric to inspect the predictions your model got right. Click on one of the images using the expansion icon to see how well the model detects the class, the confidence score with which it predicts the object, other performance metric scores, and metadata.

Still under the Explorer tab → click Overview (the tab on the right) → click False Positive Count to inspect instances that the model failed to detect correctly. It seems most classes flagged as false positives are due to poor object classification quality (the annotations are not 100% accurate). Looking closely at one instance: the model correctly predicts that the object is 'Cardiomediastinum', but the second, overlapping annotation has a broken track, so Encord Active classifies the prediction as a false positive using a combination of Broken Object Track and other relevant quality metrics.

Under Filter → Add filter, you will see parameters and attributes for filtering your model's performance. For example, if you added your validation set to Active through Annotate, you can validate your model's performance on that set and, likewise, on the production set.

Visualize the Impact of Metrics on Model Performance

Evaluate the model outcome count to understand the distribution of correct and incorrect results for each class. Under the Model Evaluation tab → click Outcome to see the distribution chart. You should now see the count of predictions the model gets wrong. Using this chart, you can get a high-level perspective on the issues with your model. In this case, the model fails to correctly segment the 'Airways' object in the instances.

The Intersection over Union (IoU) threshold, set here at 0.5, determines how much a prediction must overlap a ground-truth label to count as correct. Use the IoU Threshold slider under the Overview tab to see the outcome count at a higher or lower threshold. You can also select specific classes to inspect under the Classes option.
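Since the IoU threshold drives which predictions count as true positives, here is a minimal bounding-box IoU sketch for reference. It is our illustration, not Encord Active's internal code:

```python
def box_iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

# A prediction counts as a true positive only if its IoU with a matching
# ground-truth box clears the threshold (0.5 in the dashboard above).
print(box_iou((0, 0, 10, 10), (5, 5, 15, 15)))  # ~0.143, below a 0.5 threshold
```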
Dig Deeper into the Metrics

Once you understand the model outcome count, you can dig deeper into specific metrics like precision, recall, and F1 scores if they are relevant to your targets. Notice the low precision, recall, and F1 scores per class! Also, group the scores by the model outcome count to understand how the model performs in each class. You could also use the precision-recall curve to analyze and highlight the classes that are harder for the model to detect with high confidence, and break down the model's precision and recall values for the predictions of each object over the relevant metrics you want to investigate. For example, if you want to see precision and recall by the Object Classification Quality metric, go to Metric Performance → select the Metric dropdown menu, then choose the metric you want to investigate the model's precision by.

Validate the Model's Performance on Business Criteria

Now it's time to see which metrics impact the model's performance the most and determine, based on your information, whether that is good or bad (needs debugging) for the business. For instance, if the confidence scores are the least performant metrics, you might worry that your vision model is naive in its predictions, given the previous consensus on the outcome count (false positives and negatives). Here is the case for this model under the Metric Performance dashboard (remember, you can use the IoU Threshold slider to check the metric impact at different thresholds): the Relative Area (the object's size) significantly influences our model's performance.

Considering the business environment in which you want to deploy the model, would this be a good or bad outcome? This is up to you to decide based on your technical and business requirements. If the model does not work, you can run more experiments and train more models until you find the optimal one.

Awesome! You have seen how Encord Active plays a key role in providing features for validating your model's performance with built-in metrics. In addition, it natively integrates with Encord Annotate, an annotation tool, to facilitate data quality improvements that can enhance the performance of your models.

Conclusion

Selecting the right model validation tools ensures that models perform accurately and efficiently. Validation involves assessing a model's performance through quantitative metrics such as IoU, mAP (mean Average Precision), and MAE, or qualitatively, by subject matter experts. The choice of evaluation metric should align with the business objectives the model aims to achieve. Furthermore, model selection hinges on comparing various models using these metrics within a carefully chosen evaluation schema, emphasizing the importance of a proper validation strategy to ensure robust model performance before deployment.

Validating model performance is particularly vital in sectors where inaccuracies could compromise safety. Check out our customer stories to learn from large and small teams that have improved their data quality and model performance with the help of Encord. Platforms like Encord, which specialize in improving data and model quality, are instrumental in this context. Encord Active, among others, provides features designed to refine data quality and bolster model accuracy, mitigating the risks associated with erroneous predictions or data analysis.
Even as foundation models gain popularity, advancements in object detection models remain significant. YOLO has consistently been the preferred choice in machine learning for object detection. Let's train the latest iterations of the YOLO series, YOLOv9 and YOLOv8, on a custom dataset and compare their performance.

In this blog, we will train YOLOv9 and YOLOv8 on the xView3 dataset. The xView3 dataset contains aerial imagery with annotations for maritime object detection, making it an ideal choice for evaluating the robustness and generalization capabilities of object detection models. If you wish to curate and annotate your own dataset for a direct comparison between the two models, you can create the dataset using Encord Annotate. Once annotated, you can seamlessly follow the provided code to train and evaluate both YOLOv9 and YOLOv8 on your custom dataset. Read the Encord Annotate Documentation to get started with your annotation project.

Prerequisites

We are going to run our experiment on Google Colab. If you are doing it on your local system, please bear in mind that the instructions and the code were made to run on a Colab notebook.

Make sure you have access to a GPU. You can either run the command below or navigate to Edit → Notebook settings → Hardware accelerator, set it to GPU, and then click Save.

```python
!nvidia-smi
```

To make it easier to manage datasets, images, and models, we create a HOME constant.

```python
import os
HOME = os.getcwd()
print(HOME)
```

Train YOLOv9 on Encord Dataset

Install YOLOv9

```python
!git clone https://github.com/SkalskiP/yolov9.git
%cd yolov9
!pip install -r requirements.txt -q
!pip install -q roboflow encord av

# This is a convenience class that holds the info about Encord projects and makes everything easier.
# The class supports bounding boxes and polygons across both images, image groups, and videos.
!wget 'https://gist.githubusercontent.com/frederik-encord/e3e469d4062a24589fcab4b816b0d6ec/raw/fa0bfb0f1c47db3497d281bd90dd2b8b471230d9/encord_to_roboflow_v1.py' -O encord_to_roboflow_v1.py
```

Imports

```python
from typing import Literal
from pathlib import Path
from IPython.display import Image

import roboflow
from encord import EncordUserClient
from encord_to_roboflow_v1 import ProjectConverter
```

Data Preparation

Set up access to the Encord platform by creating and using an SSH key.

```python
# Create ssh-key-path
key_path = Path("../colab_key.pub")
if not key_path.is_file():
    !ssh-keygen -t ed25519 -f ../colab_key -N "" -q
key_content = key_path.read_text()
```

We will now retrieve the data from Encord, convert it to the format required by YOLO, and store it on disk. It's important to note that for larger projects, this process may encounter difficulties related to disk space. The converter will automatically split your dataset into training, validation, and testing sets based on the specified sizes.

```python
# Directory for images
data_path = Path("../data")
data_path.mkdir(exist_ok=True)

client = EncordUserClient.create_with_ssh_private_key(
    Path("../colab_key").resolve().read_text()
)
project_hash = "9ca5fc34-d26f-450f-b657-89ccb4fe2027"  # xView3 tiny
encord_project = client.get_project(project_hash)

converter = ProjectConverter(
    encord_project,
    data_path,
)
dataset_yaml_file = converter.do_it(batch_size=500, splits={"train": 0.5, "val": 0.1, "test": 0.4})
encord_project_title = converter.title
```

Download Model Weights

We will download the YOLOv9-e and the gelan-c weights.
Although the YOLOv9 paper mentions versions yolov9-s and yolov9-m, it's worth noting that weights for these models are currently unavailable in the YOLOv9 repository.

```python
!mkdir -p {HOME}/weights
!wget -q https://github.com/WongKinYiu/yolov9/releases/download/v0.1/yolov9-e-converted.pt -O {HOME}/weights/yolov9-e.pt
!wget -P {HOME}/weights -q https://github.com/WongKinYiu/yolov9/releases/download/v0.1/gelan-c.pt
```

You can predict and evaluate the results of object detection with the YOLOv9 weights pre-trained on COCO. Check out the blog YOLOv9 Explained and How to Run it if you want to run object detection on pre-trained YOLOv9 weights.

Train Custom YOLOv9 Model for Object Detection

We train a custom YOLOv9 model from a pre-trained gelan-c model.

```python
!python train.py \
    --batch 8 --epochs 20 --img 640 --device 0 --min-items 0 --close-mosaic 15 \
    --data $dataset_yaml_file \
    --weights {HOME}/weights/gelan-c.pt \
    --cfg models/detect/gelan-c.yaml \
    --hyp hyp.scratch-high.yaml
```

You can examine and validate your training results. The code for validation and inference with the custom model is available in the Colab notebook. Here we will focus on comparing the model performances.

Converting Custom YOLOv9 Model Predictions to Encord Active Format

```python
pth = converter.create_encord_json_predictions(get_latest_exp("detect") / "labels", Path.cwd().parent)
print(f"Predictions exported to {pth}")
```

Download the predictions to your local computer and upload them via the UI to Encord Active for analysis of your results. Moving on to training YOLOv8!

Train YOLOv8 on Encord Dataset

Install YOLOv8

```python
!pip install ultralytics==8.0.196

from IPython import display
display.clear_output()

import ultralytics
ultralytics.checks()
```

Dataset Preparation

As we are doing a comparative analysis of the two models, we will use the same dataset to train YOLOv8.

Train Custom YOLOv8 Model for Object Detection

```python
from ultralytics import YOLO

model = YOLO('yolov8n.pt')  # load a pretrained YOLOv8n detection model
model.train(data=dataset_yaml_file.as_posix(), epochs=20)  # train the model
model.predict()  # run inference (see the Colab notebook for full usage)
```

The code for running inference on the test dataset is available in the Colab notebook shared below.

Converting Custom YOLOv8 Model Predictions to Encord Active Format

```python
pth = converter.create_encord_json_predictions(get_latest_exp("detect", ext="predict") / "labels", Path.cwd().parent)
print(f"Predictions exported to {pth}")
```

Download this JSON file and upload it to Encord Active via the UI.

Comparative Analysis on Encord Active

On Encord Active, under the Model Evaluation tab, you can compare both models' predictions. You can conveniently navigate to the Model Summary tab to view the Mean Average Precision (mAP), Mean Average Recall (mAR), and F1 score for both models. Additionally, you can compare the differences in predictions between YOLOv8 and YOLOv9.

Precision

YOLOv8 may excel in correctly identifying objects (high true positive count) but at the risk of also detecting objects that aren't present (high false positive count). On the other hand, YOLOv9 may be more conservative in its detections (lower false positive count) but could potentially miss some instances of objects (higher false negative count).

Recall

In terms of recall, YOLOv8 exhibits superior performance with a higher true positive count (101) compared to YOLOv9 (43), indicating its ability to correctly identify more instances of objects present in the dataset.
Both models, however, show an equal count of false positives (643), suggesting similar levels of incorrect identification of non-existent objects. YOLOv8 demonstrates a lower false negative count (1261) compared to YOLOv9 (1315), implying that YOLOv8 misses fewer instances of actual objects, highlighting its advantage in recall performance (see the quick check at the end of this post).

Precision-Recall Curve

Based on the observed precision-recall curves, it appears that YOLOv8 achieves a higher Area Under the Curve (AUC-PR) value compared to YOLOv9. This indicates that YOLOv8 generally performs better in terms of both precision and recall across different threshold values, capturing a higher proportion of true positives while minimizing false positives more effectively than YOLOv9. The precision-recall curve is not the only way to evaluate model performance; there are other metrics like F1 score, IoU distribution, etc. For more information on different quality metrics, read the blog Data, Label, & Model Quality Metrics in Encord.

Metric Correlation

The metric impact on performance in Encord refers to how specific metrics influence the performance of your model. Encord allows you to figure out which metrics have the most influence on your model's performance, telling you whether a positive change in a metric will lead to a positive change (positive correlation) or a negative change (negative correlation) in model performance. The dimensions of the labeled objects significantly influence the performance of both models, underscoring the importance of object size in the dataset. It's possible that the YOLOv9 model's performance is adversely affected by the presence of smaller objects in the dataset, leading to its comparatively poorer performance.

Metric Performance

The Metric Performance view in Encord's model evaluation provides a detailed view of how a specific metric affects the performance of your model, allowing you to understand the relationship between a particular metric and the model's performance.

In conclusion, the comparison between YOLOv8 and YOLOv9 on Encord Active highlights distinct performance characteristics in terms of precision and recall. YOLOv8 excels in correctly identifying objects with a higher true positive count, and since both models recorded the same false positive count in this experiment, that advantage comes without additional over-detection. YOLOv9, on the other hand, misses more instances of objects, as shown by its higher false negative count. If you want to improve your object detection model, read the blog How to Analyze Failure Modes of Object Detection Models for Debugging for more information.

The precision-recall curve analysis suggests that YOLOv8 generally outperforms YOLOv9, capturing a higher proportion of true positives while minimizing false positives more effectively. However, it's important to consider other metrics like F1 score and IoU distribution for a comprehensive evaluation of model performance. Moreover, understanding the impact of labeled object dimensions and specific metric correlations can provide valuable insights into improving model performance on Encord Active.
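As a closing sanity check, plugging the counts reported above into the standard precision and recall definitions makes the gap concrete (a quick computation using only the numbers from this comparison):

```python
def precision(tp: int, fp: int) -> float:
    return tp / (tp + fp)

def recall(tp: int, fn: int) -> float:
    return tp / (tp + fn)

# Counts reported above: equal false positives, differing TP/FN.
print(f"YOLOv8 recall:    {recall(101, 1261):.3f}")     # ~0.074
print(f"YOLOv9 recall:    {recall(43, 1315):.3f}")      # ~0.032
print(f"YOLOv8 precision: {precision(101, 643):.3f}")   # ~0.136
print(f"YOLOv9 precision: {precision(43, 643):.3f}")    # ~0.063
```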