How It's Made

7 steps to get started with large-scale labeling

Omar Alonso

Mar 12, 2021

How Instacart built a crowdsourced data labeling process (and how you can too!)

Organizations that develop technologies rooted in information retrieval, machine learning, recommender systems, and natural language processing depend on labels for modeling and experimentation. Humans provide these labels in the context of a specific task, and the data collected is used to construct training sets and evaluate the performance of different algorithms.
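As a rough illustration of the two uses mentioned above (building a training set and evaluating an algorithm), the sketch below collapses several workers' judgments per item into one label by majority vote, then scores a model's output against those labels. The item IDs, worker IDs, label names, and both helper functions are hypothetical, and majority voting is only one simple aggregation choice, not necessarily the approach used at Instacart.

```python
from collections import Counter

# Hypothetical raw judgments: (item_id, worker_id, label).
judgments = [
    ("q1", "w1", "relevant"),
    ("q1", "w2", "relevant"),
    ("q1", "w3", "not_relevant"),
    ("q2", "w1", "not_relevant"),
    ("q2", "w2", "not_relevant"),
    ("q2", "w3", "not_relevant"),
]


def aggregate_by_majority(judgments):
    """Collapse several workers' labels per item into one label by majority vote."""
    votes = {}
    for item_id, _worker_id, label in judgments:
        votes.setdefault(item_id, Counter())[label] += 1
    # most_common(1) yields the top label; ties fall back to first-seen order.
    return {item_id: counts.most_common(1)[0][0] for item_id, counts in votes.items()}


def accuracy(predictions, gold):
    """Fraction of labeled items where the prediction matches the aggregated label."""
    shared = [item for item in gold if item in predictions]
    if not shared:
        return 0.0
    return sum(predictions[item] == gold[item] for item in shared) / len(shared)


# Aggregated labels can serve as training targets or as an evaluation set.
gold = aggregate_by_majority(judgments)

# Hypothetical output of some algorithm under evaluation.
model_output = {"q1": "relevant", "q2": "relevant"}

print(gold)
print(accuracy(model_output, gold))
```

Collecting an odd number of judgments per item, as in this toy data, avoids ties for binary labels; with more label values or an even number of workers, an explicit tie-breaking rule would be needed.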

1. Assess the lay of the land

2. Identify your use cases

3. Understand your data

4. Design your Human Intelligence Task (HIT)

5. Determine your guidelines

6. Communicate your task

7. Maintain high quality

Ready for Takeoff!

Acknowledgments & Further Reading

Omar Alonso is a member of the Instacart team.
