New data classification feature transforms how enterprises build high-quality training data, delivering up to 80% higher throughput and up to 25% fewer annotation inconsistencies, without sacrificing quality

Sama, which delivers data certainty for enterprise AI through tech-enabled annotation, validation and evaluation services, today announced a major advancement in how AI training data is created. The company’s new classification product with Bulk Annotation eliminates one of the industry’s most persistent inefficiencies: the need to manually label nearly identical items over and over again. Sama’s Bulk Annotation capabilities reduce annotator effort while increasing efficiency, accuracy and dataset consistency. In early pilots, the feature increased throughput by up to 80% and reduced annotation inconsistencies by as much as 25%, while maintaining Sama’s industry-leading quality standards.

The repetitive process of data labeling is slow, expensive and prone to inconsistency, and it impacts every company building AI systems at scale. Whether categorizing thousands of product variants for e-commerce, validating outputs from large language models, or organizing vast document libraries, teams have traditionally needed to review and label each individual item, even when many items are essentially the same.

Sama’s Bulk Annotation feature uses advanced machine learning techniques within the platform to identify groups of similar items, including duplicates, variants and near-matches, and present them together so annotators can classify entire groups at once. A single annotation is then applied across all related items, dramatically reducing wasted effort while improving the consistency of the final dataset.
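The general idea of group-then-label can be illustrated with a minimal sketch. This is not Sama’s implementation, and the announcement does not describe the machine learning techniques the platform uses. The sketch assumes plain-text items and swaps in simple string similarity from Python’s standard library. The function names, the 0.8 threshold and the sample items are all hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity score between two strings, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def group_similar(items: list[str], threshold: float = 0.8) -> list[list[int]]:
    """Greedily group item indices (hypothetical helper).

    Each item joins the first group whose representative (its first member)
    is at least `threshold` similar; otherwise it starts a new group.
    """
    groups: list[list[int]] = []
    for idx, text in enumerate(items):
        for group in groups:
            if similarity(items[group[0]], text) >= threshold:
                group.append(idx)
                break
        else:
            groups.append([idx])
    return groups


def apply_bulk_labels(groups: list[list[int]], group_labels: list[str]) -> dict[int, str]:
    """Copy each group's single label to every item index in that group."""
    return {idx: label for group, label in zip(groups, group_labels) for idx in group}


# Illustrative data only: duplicates, variants and near-matches.
items = [
    "Blue cotton t-shirt, size M",
    "Blue cotton t-shirt, size L",
    "blue cotton t-shirt, size M",
    "Stainless steel water bottle 500ml",
    "Stainless steel water bottle 750ml",
]

groups = group_similar(items)
# An annotator makes one decision per group instead of one per item.
labels = apply_bulk_labels(groups, ["apparel", "drinkware"])
print(groups)
print(labels)
```

In this toy example, two annotation decisions cover five items. A production system would likely rely on learned representations rather than character-level matching, and would need a way for annotators to split a group that was merged incorrectly.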

“As AI becomes mission-critical for more enterprises, the quality and efficiency of training data is now a competitive advantage,” said Duncan Curtis, SVP of AI product and technology at Sama. “Companies can’t afford to build AI on inconsistent data, but they also can’t afford to spend months on repetitive labeling work. Our Bulk Annotation feature solves both problems at once.”

Unlike traditional annotation tools that simply speed up manual work, Sama’s solution rethinks the workflow entirely. The platform’s intelligence layer handles the complexity of grouping related items, so clients don’t need to prepare or restructure their data in advance. Quality assurance also becomes more efficient, with review teams able to validate at the group level rather than checking every individual item.

This technology addresses a growing need across industries. Retailers managing catalogs with thousands of product variations can now annotate entire product families in one step. Companies deploying generative AI can validate model outputs more efficiently and consistently. Financial services and healthcare organizations dealing with complex documents benefit from faster classification even as their data requirements evolve. Bulk Annotation is designed to work whether data structures are stable or constantly changing, giving enterprises flexibility as their AI initiatives mature.

“We created Bulk Annotation by listening to our workforce and clients,” said Karan Vasdev, product manager at Sama. “Following UX research and sampling from our R&D team that further validated frustrations we were hearing from both our annotators and our clients, we were able to design, build and deliver this new capability in under four months.”

While other solutions rely on generic annotation tooling or fragmented labor pools, Sama integrates platform innovation with a managed, expert, in-house workforce. This alignment allows the company to optimize workflows end to end, achieving throughput and quality gains that are difficult to replicate elsewhere in the market.

Bulk Annotation is available now to all Sama clients, and existing projects will be migrated seamlessly to the new version.

About Sama

Sama delivers data certainty for enterprise AI through tech-enabled annotation, validation and evaluation services. By combining advanced platforms with expert human judgment, Sama helps some of the world’s largest companies, including 30% of the Fortune 50, move AI models from development to production with confidence. With thousands of skilled data professionals and industry-leading quality guarantees, Sama tackles the critical challenge that over 63% of AI models fail to reach production due to poor data quality.

Founded in 2008, Sama has delivered more than 40 billion data points and created employment opportunities that have helped over 70,000 people lift themselves out of poverty. As a certified B Corporation, Sama is committed to advancing both technological innovation and social impact. Learn more at www.sama.com.

