Podcast
Upleveling Data Labeling with Sama’s Jerome Pasquero

As a Senior Product Manager at Sama, Jerome Pasquero understands the power of data, and he joins us today to share a wealth of knowledge on how better annotation ensures better models.

Key Points From This Episode:

  • Jerome’s background, interest in AI, and how he landed his role at Sama.
  • Social initiatives, training data, and what attracted Jerome to Sama.
  • The shift from focusing on AI models to the importance of data quality.
  • Why academia requires the use of a foundational dataset to compare models.
  • The reason for the early focus on building new AI models.
  • Whether datasets will become open source in the future as models have.
  • The role of annotation in making data meaningful and useful.
  • Challenges of annotating data and different approaches to doing so.
  • The three components of data annotation: models, filtering, and the annotation pipeline.
  • How to home in on goals for filtering data into valuable subsets that align with your desired outcomes.
  • How to measure a model’s accuracy by focusing on user experience and more.
  • What data drift is and how to counter it by monitoring for it and retraining models where necessary.
  • How to know that your training data is close enough to your production data.
  • What excites Jerome most about the world of data and annotation.

Stream the full episode below, or head here to select your favorite listening app and view the full transcript.

Related Resources

ML Pulse Report with Voxel51 CSO Jason Corso and Sama VP Duncan Curtis

15 Min Listen

GM for Amazon CodeWhisperer Doug Seven

15 Min Listen

Bell Senior Data Scientist Dalia Shanshal

15 Min Listen