When an organization sets out to build and deploy an AI model, a number of decisions have to be made. Decisions around building a robust data pipeline are among the most fundamental.
Choosing the right data partner will obviously impact how well a model performs. If we look a bit deeper, we also find that this choice has a real impact on the well-being of the whole data supply chain workforce. Sometimes collectively referred to as data enrichment professionals, all workers involved in the data supply chain—from labelers to quality analysts—are affected by an organization’s data procurement decisions.
The complexity of the data procurement process combined with a lack of standards around equitable data supply chains has several downstream implications for these essential but largely unseen workers. Despite their foundational role, data enrichment professionals can work under precarious conditions, facing wage uncertainty, low pay, and limited career growth opportunities.
Your data enrichment choices have a direct impact on workers’ well-being
There is, however, an opportunity to make a difference. The decisions organizations make while procuring data labeling services can have a meaningful impact on the working conditions of data enrichment professionals.
Partnership on AI (PAI) is a multistakeholder organization that brings together academics, researchers, civil society organizations, companies building and using AI technology, and other groups working to better understand AI’s lasting impacts. Its latest white paper, Responsible Sourcing of Data Enrichment Services, examines the working conditions of data enrichment professionals and seeks to:
- Critically evaluate the impact of the industry’s current practices on workers;
- Explore practices the industry can adopt to improve worker well-being; and
- Advance the discourse around the future of data enrichment work and the indispensable role it plays in AI development.
The paper gives AI practitioners and decision makers visibility into the impact of their decisions in procuring data enrichment services – from selecting providers, to running pilots, to conducting quality assurance and more.
Since data labeling is a key part of this equation, it was important for Sama to partner with PAI: to both share our lessons learned from several years of striving for more ethical supply chains, and to continue to learn from industry peers about responsible AI.
In the fall of 2020, Sama participated in the five-week series of Responsible Sourcing workshops held by PAI, the output of which was a set of strategic recommendations for this white paper. The workshops brought together more than 30 professionals from different areas of the data enrichment ecosystem, including representatives from data enrichment providers, researchers and product managers at AI companies, and leaders of civil society and labor organizations.
To fully understand the impact of AI on society, one must examine not only the biases and shortcomings of models, but also the means by which they are created.
While more work and research are needed, Responsible Sourcing of Data Enrichment Services shares actionable insights that practitioners can use as a starting point to raise important conversations with internal stakeholders. Our hope is that these conversations will be a step in the right direction: toward giving data enrichment professionals better recognition for the critical role they play in building AI.