Iteratively improving data quality and conducting experiments are vital in developing computer-vision models.
Encord Active is a data-centric platform that enables teams to curate visual datasets to improve data and model quality.
neptune.ai is a machine-learning experiment tracker that provides a central place for data scientists to log, analyze, and compare their computer-vision experiments.
Together, Encord Active and neptune.ai cover the entire computer-vision modeling process from data curation to delivering the final model to production.
Building robust computer vision models is a highly iterative process that depends on two main pillars: data quality and the ability to track and iterate on experiments.
Poor data quality can lead to models that fail to generalize well, wasted resources, and delayed (or even failed) projects. Insufficient experiment tracking leads to valuable insights lost, resource underutilization, and extended project timelines.
All machine learning engineers, data scientists, and teams developing computer vision models encounter these challenges at some point. They are amplified in collaborative environments where multiple stakeholders work on different aspects of the same project and in rapidly growing organizations that must move fast.
Many teams solve this challenge by developing their computer vision pipeline as a continuous iterative loop—continuously exploring and validating the quality of their data, then iterating on model-building to improve both the process and the speed at which they can develop and deploy their models.
The goal is to ensure models are trained on high-quality data and to track and analyze their performance. Achieving these two objectives paves the way for robust, reliable, high-performing models.
In this article, we’ll explore how teams achieve this with Encord Active and neptune.ai. We’ll use Encord Active to survey, validate, and evaluate data and Neptune to log, compare, and collaborate on experiments and models.
By the end of this article, you’ll have learned how to use both tools to streamline your workflow, improve your data quality, and analyze model performance.
What is Encord Active?
Encord Active is a data-centric computer vision (CV) platform for teams to curate visual data, find label errors, and evaluate data and model quality.
Encord Active enables you to explore and curate visual datasets, compute quality metrics and embeddings, surface label errors, and evaluate data and model quality.
Encord Active is part of the Encord data engine for AI that includes Annotate for data annotation and Index for data curation and management. For the experiments described in this article, we’ll use the open-source version of Active, available on GitHub.
What is neptune.ai?
Neptune is a machine-learning experiment tracker. The platform provides a single place to record, compare, store, and collaborate on your experiments and models.
This is particularly useful for projects involving multiple stages, such as data preprocessing, model training, and inference. With Neptune, you can keep an overview and analyze a project’s progress from start to finish.
Neptune supports a full range of ML tasks. For computer vision projects, the following features are essential:
1. Log, visualize, and retrieve image segmentation workflow metadata.
2. Compare images between runs and analyze a series of images through an easy-to-navigate gallery view.
3. Enable reproducibility through experiments and data versioning.
4. Share all the metadata and results with your team, managers, and other stakeholders.
Using Encord Active and neptune.ai to build high-performing computer vision models
In this walkthrough, you will:
- Set up Encord Active and Neptune in a Python environment.
- Download the Caltech 101 dataset and load it into an Encord Active project.
- Explore the dataset with Encord Active’s quality metrics and curate the training data.
- Train a simple image classifier based on the data and use Neptune to track the training performance.
- Analyze the experiment results in Neptune’s web app.
The Caltech 101 dataset is a popular dataset for CV tasks. Researchers at the California Institute of Technology (Caltech) created and released it in 2003. The dataset comprises 101 object categories, each containing about 40 to 800 images.
Step 1: Set up Encord Active and Neptune
Create a Python virtual environment to install and isolate the dependencies:
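```bash
python3 -m venv encord-neptune-env        # the environment name is arbitrary
source encord-neptune-env/bin/activate    # on Windows: encord-neptune-env\Scripts\activate
```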
Install Encord Active, Neptune, and the neptune-pytorch integration in your environment:
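```bash
pip install encord-active neptune neptune-pytorch
```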
Note that installing Encord Active will also install the deep learning framework PyTorch and the Torchvision library, which we’ll use to load and transform the dataset in this project.
With everything installed, launch Jupyter Notebook by running the following command in your terminal:
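```bash
jupyter notebook
```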
Step 2: Download the image classification dataset
First, we’ll download the Caltech101 dataset using the `torchvision.dataset` module. In a new Jupyter notebook, execute the following code snippet:
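```python
from torchvision import datasets

# Download and extract Caltech 101 (the archive is a few hundred MB)
caltech_dataset = datasets.Caltech101(root=".", download=True)
```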
This will download the Caltech101 dataset into a new folder called `caltech101` in the current directory. We’ll define this folder as our `data_dir`:
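```python
from pathlib import Path

data_dir = Path("./caltech101")  # the extracted images live in per-category subfolders below this directory
```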
Step 3: Create an Encord Active project
We need to create a local project before we can start working with our dataset in Encord Active. A project is a structured collection of data you intend to use to test, validate, and evaluate your models within Encord Active. The project contains the dataset, ontology (label-data relationship), configurations, quality metrics, and model predictions.
Here’s an example of the structure for an `animal_species` classification project that also includes model predictions:
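An illustrative layout looks like this (the exact file and folder names vary between Encord Active versions):

```
animal_species/
├── project_meta.yml     # project title, description, and hash
├── ontology.json        # label-data relationship (the classes)
├── data/                # the image files
├── embeddings/          # computed image embeddings
├── metrics/             # quality metric results
└── predictions/         # imported model predictions (optional)
```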
Create a project directory
First, we create a directory for our Encord Active project:
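```python
project_dir = Path("./ea-caltech101-project")  # any empty directory works
project_dir.mkdir(parents=True, exist_ok=True)
```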
Set up a data collector
Next, we define a helper function, `collect_all_images`, that takes a folder path as the input and returns a list of `Path` objects representing all image files within this folder:
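```python
from pathlib import Path
from typing import List


def collect_all_images(root_folder: Path) -> List[Path]:
    """Return Path objects for all image files found under root_folder."""
    image_extensions = {".jpg", ".jpeg", ".png", ".bmp", ".gif"}
    return [
        path
        for path in root_folder.glob("**/*")
        if path.suffix.lower() in image_extensions
    ]
```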
We’ll use this helper function to collect the Caltech101 image files:
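```python
image_files = collect_all_images(data_dir)
print(f"Collected {len(image_files)} image files")
```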
Prepare the label transformer
To train models, we must provide labels for our dataset. The images of the Caltech101 dataset we downloaded are organized into different folders by category. Thus, the folders’ names represent the category of the images they contain.
Encord Active allows us to create a custom label transformer to assign labels to image files. In our case, we’ll make use of the directory structure to assign the labels:
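```python
# Sketch of a classification label transformer: each image's class is the name
# of the folder that contains it. The import path and method signature follow
# Encord Active's LabelTransformer interface and may differ between versions.
from pathlib import Path
from typing import List

from encord_active.lib.labels.label_transformer import (
    ClassificationLabel,
    DataLabel,
    LabelTransformer,
)


class ClassificationTransformer(LabelTransformer):
    def from_custom_labels(self, _, data_files: List[Path]) -> List[DataLabel]:
        return [
            DataLabel(f, ClassificationLabel(class_=f.parent.name))
            for f in data_files
        ]
```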
Initialize the Encord Active project
To initialize the Encord Active project, we’ll first import the necessary modules:
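```python
# Indicative import; the module path may differ between Encord Active versions.
from encord_active.lib.project.local import init_local_project
```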
Next, we initialize a local project using Encord Active’s `init_local_project` function.
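Here’s a sketch of the call; the exact keyword arguments may differ between Encord Active versions:

```python
project_path = init_local_project(
    files=image_files,            # the Caltech 101 images collected above
    target=project_dir,           # where the Encord Active project is created
    project_name="caltech101-classification",  # assumed name; pick your own
    symlinks=False,               # copy the files instead of symlinking them
)
# Depending on the version, the ClassificationTransformer defined earlier is
# passed to the initializer or supplied via the CLI when initializing a project.
```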
We’ve now laid the groundwork for the Encord Active image classification workflow.
Step 4: Compute image embeddings and analyze them with metrics
Image embeddings transform visual data into a numerical format that models can process and learn from. By analyzing these embeddings with quality metrics, we gain insights into our dataset’s quality, diversity, and potential issues. These insights guide data curation and preprocessing so that the training data yields accurate and robust models.
Encord Active provides utility functions to run predefined subsets of metrics. The cloud-based version of Active allows you to use custom metrics. These metrics evaluate specific properties of the images (e.g., area, sharpness, or brightness).
To compute embeddings and metrics, we first need to import the required modules:
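```python
# Indicative imports; module paths may differ between Encord Active versions.
from encord_active.lib.metrics.execute import run_metrics_by_embedding_type
from encord_active.lib.metrics.types import EmbeddingType
```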
Then, we use the `run_metrics_by_embedding_type` function to calculate the predefined quality metrics for our images:
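```python
# Compute image embeddings and the predefined image-quality metrics for the
# project. The exact keyword arguments may vary between versions.
run_metrics_by_embedding_type(
    EmbeddingType.IMAGE,
    data_dir=project_path,
)
```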
Once you run that, Encord Active will take a few minutes to compute the embeddings on your images and store them as a pickle file in an “embeddings” folder.
Step 5: Explore the image dataset with Encord Active
To interact with our Encord Active project, we have to create a `Project` object pointing to our `project_path`:
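```python
# Indicative import; the module path and loading step may differ between versions.
from encord_active.lib.project import Project

ea_project = Project(project_path).load()
```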
Explore image characteristics with Encord Active
Now, we can explore the images in our dataset. To launch the Encord Active web app, run the following command:
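```
!encord-active start
```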
Your browser should open a new window with Encord Active. It should launch the following web page with all your projects:
If the terminal seems stuck and nothing happens in your browser, try visiting http://localhost:8080.
If you get the error `Error: Port 8080 already in use. Try changing the --port option.`, start Encord Active on another port: `!encord-active start --port 8081`
Encord Active UI overview
When you launch the open-source Encord Active app, you’ll see its tab-structured user interface:
Under the Data tab of the Summary dashboard, you can get an overview of your dataset’s quality and distribution based on the built-in metrics.
In the Explorer dashboard, you can explore your dataset, inspect images, and review the computed metrics and embeddings. This is where you’ll spend time understanding the nuances of your dataset and making decisions on how to improve it.
View the label class distribution
Exploring class distribution is crucial to avoid biased model predictions. A balanced dataset helps train a model that performs well across all categories and allows the model to weigh classes appropriately during training.
You can explore the label class distribution in the Summary tab by navigating to the Annotations tab and inspecting the Metric Distribution chart:
For the Caltech101 dataset, the “motorbikes” class appears most often, and the “inline_skate” class least often.
Identifying and understanding outliers
Understanding outliers and anomalies is beneficial for developing models resilient to variations in the data. It helps us decide whether to include certain outliers in our training set, potentially improving model generalization, or exclude them if they represent data collection errors or other irrelevant variations.
Image characteristics such as green channel intensity, blue channel intensity, area, image sharpness, and uniqueness provide valuable insights into the dataset’s diversity and potential challenges. Analyzing these metrics allows us to better prepare our data and models for the complexities of real-world applications.
Let’s focus on the Data tab to identify outliers within your dataset. Encord Active classifies outliers into two categories:
- Severe Outliers: These are highlighted in red and may indicate corrupted images or significant anomalies that could adversely affect model training.
- Moderate Outliers: Shown in orange, these outliers are less extreme but still warrant examination to decide if they should be included in the training dataset.
The bar chart showcasing the outliers across different image metrics gives us quantitative evidence of variation in our dataset. Observing many severe outliers, especially in image width and color channels like green, indicates a substantial deviation from typical image properties.
These deviations could be due to data collection inconsistencies, image corruption, or specific characteristics of the photographed objects. Investigating these extremes is crucial to ensuring our models’ robustness and accuracy.
Distortions and noise are two common issues that may affect the image quality for model training. Distortions may arise from the lens used to capture the image, leading to warping or blurring of the image content. Noise is random variations in brightness or color information in images. It’s caused by poor lighting conditions, sensor quality, or compression artifacts.
A low-quality image might be too blurry, too dark, or have distortions that would not be present in the model’s intended environment.
Check for blurry images
Depending on the specific application, images with blur can adversely affect your model’s accuracy. A model you train on high-resolution, sharp images may struggle to interpret and make correct predictions on less clear images.
Blurry images can result in misinterpretations and errors in the model’s output, which could be critical. Therefore, examining such images within your dataset is essential to determining whether to exclude them or improve their quality.
You can view one of the blurry images in the Explorer dashboard to get more insights:
You can also click “SIMILAR” next to an image to view others with comparable sharpness metrics. This feature is handy in the Caltech101 dataset, where clicking “SIMILAR” may reveal a set of images with a consistent level of blur, which can be a characteristic of a specific class or the result of how images were collected.
Tagging images
If you find such images, you can tag them in Explorer for review later on. Manually inspect the images you reckon are not suitable for the training data → Select them → Tag them as “Blurry”:
Identifying images with poor lighting
To identify images with poor lighting or low visibility:
- Change the “Data Metrics” dropdown selection to “Brightness.”
- To find the darkest images, sort the results in ascending order.
This surfaces images that might be underexposed or taken in poor lighting, affecting your model’s pattern recognition capabilities.
Change the “Data Metrics” dropdown selection to “Brightness” and sort in descending order:
The brightest images look reasonably good quality, but you can look through the Explorer to spot images that may not meet the training or data requirements.
To get the darkest images, sort the “Brightness” in ascending order:
The dark images do not appear to be problematic. You should explore further to decide which images to filter out. For example, you could look for low-contrast images if contrast matters for your project.
If manually spotting these issues is too time-consuming, you can use Encord Active Cloud’s ML-assisted features to detect data quality issues automatically.
Step 6: Curate the training data
Once you have a thorough understanding of your dataset’s characteristics, the next steps typically involve:
- Adjusting image quality, when possible, to resolve issues like blur or poor lighting.
- Re-annotating images if the current annotations are incorrect or incomplete.
- Selecting the best samples representing the diversity and realities of the environment where you will deploy the model.
In the Explorer tab, you can filter on the “Blur” and “Sharpness” metrics. You’ll set thresholds that include images within a specific quality range and exclude those that are too blurry or too sharp. This ensures that your dataset is balanced in terms of image clarity.
After filtering on “Blur” and “Sharpness,” the number of images should be down to about 5,900 instances. Encord Active provides image tagging that you can programmatically interact with.
Tagging the filtered images for training involves:
- Reviewing the filtered images and confirming they meet the project’s quality standards.
- Using Encord Active’s tagging feature to label these images as “train,” making them easy to retrieve programmatically for model training.
For this tutorial, we’ll conclude the filtering at this point. In a more involved project, you can iterate over this process, further refining the dataset to ensure it contains the most representative and high-quality samples.
We’ll use the `ActiveClassificationDataset` class to load the training and test sets for image classification with Encord Active.
Initializing the project generates a `project_meta.yml` file that contains the project title, description, and hash. Make sure to replace “<ENTER YOUR PROJECT HASH>” below with the actual project hash from `project_meta.yml`:
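```python
project_hash = "<ENTER YOUR PROJECT HASH>"  # copy the value from project_meta.yml
```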
The code below builds a dataset for model training. It applies the specified transformations to each image and includes only those tagged as “train”:
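```python
# Sketch — the exact constructor arguments of ActiveClassificationDataset may
# differ; the key idea is to read images and labels from the Encord Active
# project and keep only the items tagged "train".
import torch
from torch.utils.data import DataLoader, random_split
from torchvision import transforms

transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.Grayscale(num_output_channels=3),  # a few Caltech 101 images are grayscale
    transforms.ToTensor(),
])

dataset = ActiveClassificationDataset(
    project_path,              # the Encord Active project created earlier
    project_hash=project_hash,
    transform=transform,
    tag_name="train",          # only images tagged "train" are included
)

# Hold out a small test set for evaluation (an assumption made for this sketch)
train_size = int(0.8 * len(dataset))
train_set, test_set = random_split(dataset, [train_size, len(dataset) - train_size])
train_loader = DataLoader(train_set, batch_size=32, shuffle=True)
test_loader = DataLoader(test_set, batch_size=32)
```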
The dataset appears in the SQLite database `encord-active.sqlite` inside your project folder with the tag “train.”
With your curated dataset, you can advance to model building and training. As you progress, log your experiments with neptune.ai to track and manage your models effectively.
Step 7: Computer vision model training and experiment tracking with neptune.ai
After identifying and rectifying the quality issues in your image dataset, you have to move the data downstream to train your computer vision model. This is when having an experiment tracking system in place becomes handy.
Neptune is a highly scalable experiment tracker. Once you integrate Neptune’s client into your pipeline, it will send all relevant experiment metadata to the platform.
Set up Neptune
If you haven’t signed up for Neptune yet, create an account (it’s free).
To initialize a Neptune project for our environment, we’ll securely store our API token as an environment variable:
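```python
import os

import neptune

# Store the credentials as environment variables (replace the placeholders).
os.environ["NEPTUNE_API_TOKEN"] = "<YOUR_API_TOKEN>"
os.environ["NEPTUNE_PROJECT"] = "<WORKSPACE_NAME>/<PROJECT_NAME>"

# Start a run; Neptune picks up the credentials from the environment.
run = neptune.init_run()
```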
Next, define the neural network:
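```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SimpleCNN(nn.Module):
    """A small example classifier for the 101 Caltech categories."""

    def __init__(self, num_classes: int = 101):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 16, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(16, 32, kernel_size=3, padding=1)
        self.pool = nn.MaxPool2d(2, 2)
        self.fc1 = nn.Linear(32 * 56 * 56, 256)  # sized for 224x224 inputs
        self.fc2 = nn.Linear(256, num_classes)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))  # 224 -> 112
        x = self.pool(F.relu(self.conv2(x)))  # 112 -> 56
        x = torch.flatten(x, 1)
        x = F.relu(self.fc1(x))
        return self.fc2(x)


device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = SimpleCNN().to(device)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
```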
Log experiments to Neptune
To associate your training runs (`run`) with Neptune, instantiate the `NeptuneLogger`, a ready-made integration with PyTorch provided by Neptune. It will automatically log metadata related to the training process, such as the model architecture, gradients, artifacts, parameters, and metrics:
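```python
from neptune_pytorch import NeptuneLogger

# Argument names follow the neptune-pytorch documentation and may change
# between versions.
npt_logger = NeptuneLogger(
    run=run,
    model=model,
    log_model_diagram=True,
    log_gradients=True,
    log_parameters=True,
    log_freq=30,
)
```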
Define training and testing functions
Next, we’ll define `train()` and `test()` functions to handle the training process. These functions will also log training metrics like loss and accuracy to Neptune.
Using unique namespace paths (“batch/loss” and “batch/accuracy”) allows us to track performance issues (overfitting, underfitting, etc.) over batches and epochs.
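Here’s a minimal sketch of both functions, assuming the model, loss, optimizer, and data loaders defined above:

```python
def train(model, loader):
    model.train()
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        outputs = model(images)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        accuracy = (outputs.argmax(dim=1) == labels).float().mean().item()
        # Log per-batch metrics under the integration's base namespace.
        run[npt_logger.base_namespace]["batch/loss"].append(loss.item())
        run[npt_logger.base_namespace]["batch/accuracy"].append(accuracy)


def test(model, loader):
    model.eval()
    correct, total, total_loss = 0, 0, 0.0
    with torch.no_grad():
        for images, labels in loader:
            images, labels = images.to(device), labels.to(device)
            outputs = model(images)
            total_loss += criterion(outputs, labels).item() * labels.size(0)
            correct += (outputs.argmax(dim=1) == labels).sum().item()
            total += labels.size(0)

    run[npt_logger.base_namespace]["epoch/test_loss"].append(total_loss / total)
    run[npt_logger.base_namespace]["epoch/test_accuracy"].append(correct / total)
```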
Run the training loop
We initiate model training with a training loop set for ten epochs. The loop handles training and validation, with all details logged to Neptune:
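```python
num_epochs = 10

for epoch in range(1, num_epochs + 1):
    train(model, train_loader)
    test(model, test_loader)
    print(f"Finished epoch {epoch}/{num_epochs}")
```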
Neptune logs the loss and accuracy after each batch, providing real-time insight into your model’s performance.
You’ll have access to these metrics in Neptune’s dashboard, allowing you to compare and analyze them across different training runs.
Step 8: Analyze your experiments in Neptune’s web app
After training your computer vision model, head over to the Neptune web app. Open your project (or browse Neptune’s public example project) and select your run ID. Neptune generates the ID automatically (e.g., `TES-1` for the first run); you can add a custom name or tags to make runs easier to identify.
Explore experiment results
Within your Neptune project:
- Review performance charts: The dashboard contains metrics and performance graphs, such as accuracy and loss over time. These provide visual feedback on the training process and can reveal trends such as improvements or signs of overfitting.
- View metadata and artifacts: Access all logged information, including data versions, source code, and training artifacts. This comprehensive metadata collection supports reproducibility and deeper analysis.
- Compare experiments: Compare different runs side by side using tables, parallel coordinates plots, and charts. This helps identify which hyperparameters, model architectures, or datasets yield the best results.
For more guidance on using these tools, see Neptune’s user guide, which outlines the app’s capabilities.
Customization and collaboration
You can create and save custom views and dashboards, improving analysis and debugging processes. This feature is handy in collaborative settings where sharing insights and findings is crucial.
To create a dedicated dashboard for a project:
1. Click “Save view as new” after setting up your charts and tables as desired.
2. Label it meaningfully, for instance, “Caltech101-project”.
Assess the model’s behavior on the training and test sets (loss, accuracy, bias). Are there signs of over- or underfitting, or does the model generalize well to new data?
If the current architecture is shallow, consider experimenting with deeper networks or leveraging transfer learning with pre-trained models. Adjusting hyperparameters based on your findings may also be necessary.
Logging images for insight
Logging images to Neptune during training has several advantages:
- Visual Inspection: It allows for the direct visual assessment of what the model is being trained on and what it predicts.
- Quality Assurance: Spot-check images that are correctly or incorrectly classified to understand model behavior.
- Debugging: Identify potential issues with data quality or preprocessing steps that might affect training.
Here’s how you can log images to your dashboard:
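```python
from neptune.types import File

# Log a handful of training images; tensors are converted to HxWxC arrays
# with values in [0, 1], which File.as_image expects.
images, _ = next(iter(train_loader))
for img in images[:8]:
    run["images"].append(File.as_image(img.permute(1, 2, 0).numpy()))
```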
Head back to your Neptune dashboard and click “Images.” You should see the logged training images:
Uploading models to Neptune
When satisfied with the model’s performance, use `log_model()` to store the trained model directly in Neptune:
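```python
# From the neptune-pytorch integration; you can pass a custom model name.
npt_logger.log_model()
```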
This facilitates easy model versioning and comparison. After uploading the model, don’t forget to end the experiment to free up resources:
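```python
run.stop()
```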
That’s it! Now, you have a complete record of your experiment—its code, data, results, and the trained model—on Neptune, making it easier to track progress and collaborate with others.
Next steps
In this guide, we’ve developed a computer vision model using Encord Active and Neptune. We’ve seen how to explore a dataset, curate high-quality images, and track your models’ performance.
Encord Active addresses the crucial aspect of data quality by enabling thorough exploration, validation, and data evaluation, laying a solid foundation for model development.
Neptune provides a structured framework for logging, comparing, and collaborating on experiments. It is an essential part of the MLOps stack for assessing, maintaining, and improving model performance based on the insights gained during training.