Sama’s Experiment-Driven Approach to Solving for High-Quality Labels at Scale

Though data annotation is a critical part of the machine learning (ML) production lifecycle, access to accurate and scalable datasets often represents a significant bottleneck for ML engineers.

This is because many common approaches to labeling data, whether semi-supervised learning techniques, crowd-based methods, or automatic label generation, come with drawbacks in accuracy, cost, or time.

While each approach to labeling has its limitations, human-machine collaboration shows promise in solving this difficult problem. At Sama, a combination of skilled annotators and an AI-powered platform, an approach known as human-in-the-loop machine learning, allows us to efficiently deliver a high standard of label quality to our customers.

We use Experiment-Driven Development (EDD) to measure how new features and improvements to our annotation platform increase data annotation efficiency while maintaining the accuracy of human expert labeling. Our methods and results for implementing EDD are outlined in a recent paper, “Experiment-driven improvements in Human-in-the-loop Machine Learning Annotation via significance-based A/B testing,” by Rafael Alfaro-Flores, Juan Esquivel-Rodríguez, Jose Salas-Bonilla, and Loic Juillard of Sama.

Read the research paper for the full details, or read on for an overview of Experiment-Driven Development and how Sama and our customers benefit from it.

What is Human-in-the-Loop Machine Learning?

Crowd-sourced and automated annotation are two of the most common methods of data labeling, and both have serious drawbacks for most annotation requirements.

  • Crowd-sourced annotation is often accurate (although this can change depending on who’s in the crowd) but can be slow and expensive.
  • Automation, on the other hand, is fast but inaccurate, especially when applied to new or highly specialized datasets, such as those in medical applications.

EDD is a weaker fit for either of those setups because neither provides a constant person-to-person feedback loop that we can leverage to complement the experimental metrics.

Human-in-the-loop machine learning aims to solve these problems by involving experts to carry out enough manual labeling to properly train the model, while relying on automation on the back end to carry out the rest of the annotation process. This method is both faster than crowd-sourced annotation and more accurate than automated annotation.

So human-in-the-loop is a solid method of data annotation and machine learning model training, but how do we improve it?

Enter Experiment-Driven Development (EDD)

Experiment-Driven Development is a philosophy for introducing features and processes that relies on evidence-based results to inform adoption decisions. EDD is experiment-driven in that it embraces tried-and-true aspects of the scientific method to test whether new features or internal processes will increase effectiveness and efficiency.

A standard method to determine the impact of changes to a system when utilizing an EDD approach is A/B testing. A/B testing in EDD is not limited to programmatic or algorithmic experiments. It also opens up the opportunity to see how changes on the human side of the data labeling process impact efficiency.
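As a rough illustration of what a significance-based A/B comparison can look like, the sketch below runs a two-sided permutation test on the time per task recorded for two annotator groups. This is a minimal example under stated assumptions, not the procedure from the paper: the permutation test is one common choice of significance test, the function name `permutation_test` is hypothetical, and the timing data is invented purely for illustration.

```python
import random


def permutation_test(control, treatment, n_resamples=10_000, seed=0):
    """Two-sided permutation test on the difference in mean task time.

    Returns (observed_difference, p_value). Hypothetical helper for
    illustration only; not taken from the paper.
    """
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    n_control = len(control)

    def mean(values):
        return sum(values) / len(values)

    observed = mean(treatment) - mean(control)
    pooled = list(control) + list(treatment)

    extreme = 0
    for _ in range(n_resamples):
        # Under the null hypothesis the group labels are interchangeable,
        # so shuffle the pooled data and re-split it into two groups.
        rng.shuffle(pooled)
        diff = mean(pooled[n_control:]) - mean(pooled[:n_control])
        if abs(diff) >= abs(observed):
            extreme += 1

    # Add-one correction keeps the estimated p-value strictly above zero.
    p_value = (extreme + 1) / (n_resamples + 1)
    return observed, p_value


# Invented seconds-per-task samples: group A uses the current tool,
# group B uses the tool with a candidate feature enabled.
group_a = [62, 58, 65, 70, 61, 66, 59, 64, 68, 63]
group_b = [55, 52, 60, 57, 54, 58, 51, 56, 59, 53]

difference, p_value = permutation_test(group_a, group_b)
print(f"mean difference: {difference:.1f} s, p-value: {p_value:.4f}")
```

A small p-value suggests the observed difference is unlikely to be due to chance alone. Whether the change is worth adopting still depends on the size of the effect and on label quality, which this sketch does not measure.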

Why does Sama use EDD?

Since Sama has highly trained annotators who produce high-quality data for our clients, we are in a position to tailor our platform features to leverage this expertise. With EDD, we can experiment on improvements and verify that the features we hypothesize will help highly skilled annotators actually do so, rather than hurting their performance.

One case study presented in the research paper shows just that: the work of two roughly comparable annotation teams was compared to see how a change affected it. This is a powerful way to increase overall process efficiency.

The key here is a change of mindset. As Alfaro-Flores et al. say in their paper:

“Organizations that are successful in implementing a change between ‘having data’ to ‘being data-driven’ need to embark on a paradigm shift to be more experiment-driven, too.”

This is to say, don’t approach human-in-the-loop machine learning and data annotation purely passively. It isn’t a one-and-done deal. Just as software is iterated on and constantly improved, machine learning tactics can be experimented on, shifted, and molded to fit current organizational needs. Sama’s EDD approach is effective because it changes the lens through which we can see the potential of annotation features in a platform that aims to power ML model training.

Learn more about how Sama does EDD

Sama employs two advances in annotation methods to combine the advantages of automated annotation with human annotators who know the subject domain. The annotation instrumentation and the architecture of the Experiment-Driven Development flow provide specific metrics on key tasks in the annotation process, while A/B testing case studies measure the effectiveness of changes introduced in that process.
