The challenges of data annotation are far-reaching for companies in retail, media, consumer goods, transportation, e-commerce, and many other industries. To overcome them, these companies need to train their AI algorithms to identify objects in still images and videos at great speed and with high accuracy.
Sama Senior Product Manager Jerome Pasquero joined Episode 30 of the How AI Happens podcast to give his insight on where the industry is at today.
The Challenges and Solutions of Data Annotation in Computer Vision
Why Is Training Data Vital in Academia and Successful AI Deployments?
The AI and computer vision space has shifted over the last few years, from a race to build the best AI model architectures to a focus on the training and production data that feed those models. Many of the advances in the AI boom can be credited to firms investing in pure research published in academic venues and industry journals, by researchers like Yoshua Bengio, who heads the University of Montreal’s Institute for Learning Algorithms.
When these companies vie for attention in the academic world, they use publicly available data sets. That data works for benchmarking AI model performance, but it isn’t ideal for preparing models to tag images and videos in production. High-quality training and production data sets are highly valuable and must be protected. Comparing companies that fine-tune their architectures on public “dummy data” against others running proprietary training data is comparing apples to oranges, which isn’t helpful.
AI Architecture is to Data as Chicken is to Egg
Initially, the AI space put a lot of emphasis and effort into building and enhancing robust model architectures, along with the computational hardware, cloud services, and open-source platforms that support them. Recently, many researchers and experts have argued that data quality matters most to machine learning outcomes.
Whether training data will ever become commoditized because of the pace of AI evolution is yet to be seen, but data quality is undoubtedly a significant competitive differentiator. Access to computational resources like graphics processing units (GPUs) is another. Trying to establish which to focus on first is like the chicken-and-egg conundrum.
Prioritizing Data Annotations – To Specialize or Generalize?
Many AI models, like many people, are trained either as generalists, with broad but shallow understanding, or as specialists, with narrow but deep knowledge. Consider an AI model trained on annotated videos and images of pedestrian traffic from a single city intersection, compared to one trained on assets from multiple camera angles of pedestrians crossing many streets.
Companies should first set a filtering goal, such as labeling their top-priority data first – for example, the top-selling products in a retail store. Specializing in a prioritized subset of objects helps with near real-time identification, as in a self-serve grocery store. For slower-paced recognition, like identifying a footbridge on a city street, annotation agents with generalized experience can assist.
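One way to picture this prioritization is as a labeling queue ordered by business value. The sketch below is illustrative, not from the episode: `sales_rank` is a hypothetical priority score (lower rank = higher priority, like a retailer's best-selling products), and the file names are made up.

```python
import heapq


class AnnotationQueue:
    """Serve assets for labeling in priority order (lowest sales_rank first)."""

    def __init__(self):
        self._heap = []
        self._counter = 0  # tie-breaker: keeps insertion order stable for equal ranks

    def add(self, asset_id, sales_rank):
        heapq.heappush(self._heap, (sales_rank, self._counter, asset_id))
        self._counter += 1

    def next_to_label(self):
        # Pop the highest-priority asset still awaiting annotation
        return heapq.heappop(self._heap)[2]


q = AnnotationQueue()
q.add("shelf_cam_042.jpg", sales_rank=57)
q.add("checkout_cam_007.jpg", sales_rank=3)
q.add("aisle_cam_113.jpg", sales_rank=12)
first = q.next_to_label()  # the asset tied to the best-selling product
```

A heap keeps "what should be labeled next?" cheap to answer even as new assets stream in, which matters when annotation capacity is the bottleneck.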
The Value of Preprocessing Computer Vision Data
Preprocessing images and video lets algorithms identify the objects in a data set that have good lighting and clarity and are in recognizable positions. Later, in production processing, tools like segmentation and LiDAR – or, in challenging cases, a human annotator – can apply learnings from the preprocessed data.
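A simple quality gate of this kind can be sketched with plain NumPy. This is a minimal illustration, not Sama's pipeline: the thresholds are hypothetical, brightness is the mean pixel value of a grayscale image in [0, 1], and sharpness is the variance of a basic Laplacian response (a common focus measure).

```python
import numpy as np


def brightness(img):
    """Mean intensity of a grayscale image with values in [0, 1]."""
    return float(img.mean())


def sharpness(img):
    """Variance of a 4-neighbor Laplacian response; low values suggest blur."""
    lap = (-4 * img[1:-1, 1:-1]
           + img[:-2, 1:-1] + img[2:, 1:-1]
           + img[1:-1, :-2] + img[1:-1, 2:])
    return float(lap.var())


def passes_quality_gate(img, min_b=0.2, max_b=0.9, min_sharp=1e-4):
    """Keep frames that are neither too dark/bright nor blurry."""
    b = brightness(img)
    return min_b <= b <= max_b and sharpness(img) >= min_sharp
```

Frames that fail the gate can be routed to human annotators instead of the automated pipeline.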
They can match up two nearly identical images like a game of “Guess Who?” played in the Upside Down, identifying objects that are flipped, obscured, shadowy, or off-kilter.
Measuring Accuracy and Retraining with Fresh Data
Companies generally have only a limited number of data models, but lots of computer vision data to work with. Testing whether a production algorithm suffers from “data drift” – a loss of identification accuracy as incoming data changes – can take many forms.
Customers may object to inaccurate pricing for products with new packaging or other changes. Alternatively, you can randomly test samples of the latest data against the original training data set. When a data model drifts, it doesn’t mean it is being fed bad data – it could reflect an evolution in products, branding, or packaging. It helps to know when these changes are coming.
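The random-sampling check described above can be sketched as follows. This is an assumed setup, not from the episode: `model` stands in for any callable classifier, `baseline_accuracy` is the accuracy measured on the original training set, and the 5% tolerance is an arbitrary illustrative threshold.

```python
import random


def detect_drift(model, recent_samples, baseline_accuracy,
                 tolerance=0.05, sample_size=200, seed=0):
    """Flag drift when accuracy on a random sample of recent (image, label)
    pairs falls below the baseline by more than `tolerance`."""
    rng = random.Random(seed)
    batch = rng.sample(recent_samples, min(sample_size, len(recent_samples)))
    correct = sum(1 for image, label in batch if model(image) == label)
    accuracy = correct / len(batch)
    return accuracy < baseline_accuracy - tolerance, accuracy
```

When the flag fires, the remedy is usually retraining on fresh annotated data that reflects the new packaging or branding, not discarding the model.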
Empowering Annotation Agents to Be Data Model Teachers
The world is going to change in ways we don’t fully understand. The challenge will be in making data meaningful. But the fast pace of change is likely to require human annotators to assist computer vision algorithms far into the future.
Central to this development is the ethical use of AI to innovate and create business efficiency while providing opportunities to those who can’t access high standards of education.
Subscribe to the How AI Happens podcast on the How AI Happens website.