Introducing Object Tracking with Video Annotation

Matthew Landry

July 16, 2018

6 Minute Read

 

Today, Sama announces the availability of our latest image annotation toolset for advanced video object tracking. In the hands of our expert annotation workforce, these new tools speed up Sama’s object tracking work while maintaining the same high quality as the rest of our ground truth training data services.

What this means for our customers is an even more scalable approach to annotating the growing stream of video object tracking data. Faster training data production speeds your algorithm development and gets you to market faster.

Why focus on video object tracking?

Tesla, as an example, has over 250,000 cars on the road, each packed with high-quality cameras to capture the world around them. Video footage collected from a fleet of this size can feed an extremely sophisticated autonomous driving deep learning system. And it's not just Tesla. In the Bay Area, we've become accustomed to seeing data capture vehicles from just about every autonomous driving company out there -- and all of them are collecting video.

As the computer vision industry progresses from simple object identification (can the algorithm tell what an object is?) to object tracking (can the algorithm follow a specific object over time?), we need tools that can effortlessly annotate this video stream. Sama delivers.

What difference does a tool make?

The traditional approach to an object tracking project is to split the video into individual images and then annotate each image separately, taking care to keep a consistent identifier for each unique object across sequential images. It's very challenging work, as any Sama agent or quality analyst will tell you. It demands close attention to detail and exceeds the capabilities of many annotation services. (We had to build some supporting tools in our platform to make it tractable.)

Sama’s introduction of video annotation for object tracking completely changes the game. Now, an entire video sequence can be assessed as a whole, whether the clip contains 2 frames or 2,000 frames. This feature makes it much easier and faster to follow a single object -- even if it's moving -- from the beginning to the end of a video. If the object disappears from the camera view and reenters later (think: overtaking a cyclist in traffic, only to have them blow past you at the next intersection), we can accommodate it easily and accurately. The whole process is more efficient while maintaining the highest annotation quality, especially as the density of objects increases. And believe me, image complexity at the cutting edge of computer vision is getting up there.
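To make the "disappears and reenters" case concrete, here is a minimal sketch of how a single track might keep one identifier across a gap in visibility. The `Track` class and its field names are hypothetical, chosen for illustration only; they are not Sama's actual data format.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# (x, y, width, height) in pixels; a hypothetical box layout for this sketch.
BoxTuple = Tuple[float, float, float, float]


@dataclass
class Track:
    """One physical object followed through a clip under a single ID."""
    track_id: str
    label: str
    boxes: Dict[int, BoxTuple] = field(default_factory=dict)  # frame index -> box

    def visible_segments(self) -> List[Tuple[int, int]]:
        """Return inclusive (first_frame, last_frame) runs where a box exists."""
        segments: List[Tuple[int, int]] = []
        for frame in sorted(self.boxes):
            if segments and frame == segments[-1][1] + 1:
                # Frame directly follows the current run, so extend it.
                segments[-1] = (segments[-1][0], frame)
            else:
                # A gap (or the first frame) starts a new run.
                segments.append((frame, frame))
        return segments


# The cyclist is visible in frames 0-2, leaves the view, and returns in frames 7-8.
cyclist = Track(track_id="cyclist-1", label="cyclist")
for frame in (0, 1, 2, 7, 8):
    cyclist.boxes[frame] = (100.0 + 5 * frame, 200.0, 40.0, 80.0)

segments = cyclist.visible_segments()  # [(0, 2), (7, 8)]
```

Because the gap lives inside one track rather than across separately annotated images, the identifier cannot drift when the object comes back.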

No, really, why are you so excited?

One of the coolest aspects of the new tool is how it semi-automatically annotates frames, which makes for a more efficient workflow. If a user starts by drawing a bounding box around an object, the tool automatically estimates the object's location in subsequent or previous frames. Our expert annotation workforce carefully scrutinizes those estimates, and manually tweaks them as needed to get the tracking fully dialed in.
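This post does not say how the tool produces its estimates. As an illustration only, one common technique in video annotation tools is linear interpolation between boxes that a person has drawn on keyframes. The sketch below assumes that technique; the `Box` type and `interpolate_boxes` function are hypothetical names, not part of Sama's platform.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box: top-left corner plus size, in pixels."""
    x: float
    y: float
    w: float
    h: float


def interpolate_boxes(keyframes: Dict[int, Box]) -> Dict[int, Box]:
    """Estimate a box for every frame between consecutive keyframes.

    Assumption for this sketch: the object moves and resizes linearly
    between the manually drawn keyframes.
    """
    frames = sorted(keyframes)
    filled: Dict[int, Box] = {}
    for start, end in zip(frames, frames[1:]):
        a, b = keyframes[start], keyframes[end]
        span = end - start
        for f in range(start, end):
            t = (f - start) / span  # 0.0 at the start keyframe, approaching 1.0
            filled[f] = Box(
                a.x + t * (b.x - a.x),
                a.y + t * (b.y - a.y),
                a.w + t * (b.w - a.w),
                a.h + t * (b.h - a.h),
            )
    if frames:
        # The loop stops just short of the last keyframe, so add it explicitly.
        filled[frames[-1]] = keyframes[frames[-1]]
    return filled


# An annotator draws boxes on frames 0 and 4; frames 1-3 are estimated.
estimates = interpolate_boxes({0: Box(0, 0, 10, 10), 4: Box(40, 20, 10, 30)})
midpoint = estimates[2]  # Box(x=20.0, y=10.0, w=10.0, h=20.0)
```

In a workflow like the one described above, an annotator would review each estimated box and correct it where needed; a corrected box becomes a new keyframe for the frames around it.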

When we think about where to focus our platform development, we're always looking for ways to augment the capabilities of our human workforce. We think about how we can make our data services better by using technology to make our team more efficient and more accurate, even as annotation projects grow ever more complicated. Video annotation is a very visceral demonstration of this approach.


(By the way, the process couldn't be easier for customers. Hand over camera footage -- color, b&w, high frame rate, low frame rate, SD, UHD, whatever -- to our project team, and we manage the entire project from start to finish, delivering annotation results that you can immediately route into your training pipeline.)

That’s a wrap!

What really matters is the leveling up in speed -- with the highest accuracy -- of our ground truth training data annotation service. We work with many clients developing sophisticated vision algorithms, with very aggressive targets for annotation completeness and correctness. Ground truth training data is precious, and object tracking video sequences particularly so. Data scientists need confidence in the quality of that training data so that they can squeeze the maximum performance out of their deep learning models, focusing on the architecture and hyper-parameter tuning instead of grooming erroneous data.

Partnering with Sama, you can get the most from your object tracking projects. We’re proud of this production-ready video annotation tool, and have big plans for evolving it. If you have object tracking on your mind and would like to see a demo of our annotation platform in action, drop us a line!