Meta releases comprehensive dataset for AI practitioners

Meta’s dataset enables researchers to evaluate the fairness and robustness of certain types of AI models.

Meta's Casual Conversations v2 dataset was informed and shaped by a literature review of relevant demographic categories and was created in consultation with internal experts in fields such as civil rights. The dataset provides 11 self-provided and annotated categories for measuring algorithmic fairness and robustness in AI systems.

The dataset features 26,467 video monologues recorded in seven countries by 5,567 paid participants, who provided self-identified attributes such as age and gender. It is the next generation of the original consent-driven Casual Conversations dataset, which Meta released in 2021. In addition to offering an expanded list of categories, Casual Conversations v2 differs from the first version by including participant monologues recorded outside the United States.

The countries included in v2 are India, Brazil, Indonesia, Mexico, Vietnam, the Philippines, and the United States. Another difference in the latest dataset is that participants were given the chance to speak in both their primary and secondary languages. The types of monologues include both scripted and nonscripted speech.

The first dataset's labels included only age, three subcategories of gender (female, male, and other), apparent skin tone, and ambient lighting. Recognizing that there are numerous underrepresented communities of people, languages, and attributes, the new dataset digs deeper into subcategories to help identify potential gaps in model fairness and robustness.

Gender is one of the categories commonly used to assess fairness in computer vision tasks. The Casual Conversations v2 dataset invited participants to disclose their gender with the additional option of completing a freeform field. Along with Prof. Pascale Fung, director of the Centre for AI Research, and other researchers from the Hong Kong University of Science and Technology, Meta conducted a literature review of governmental and academic resources for potential categories and then published the findings so that other researchers can build upon this work. Internal civil rights experts and domain experts at Meta were also consulted.

In the dataset, participants self-labeled their spoken and native languages and were able to speak in more than one language or dialect across their video submissions. Of the 11 categories included in Casual Conversations v2, seven were provided by the participants, while the remaining four were manually labeled by annotators. The self-provided categories are age, gender, language/dialect, geolocation, disability, physical adornments, and physical attributes. For the remaining categories (voice timbre, apparent skin tones, recording setup, and activity), Meta trained vendors with detailed guidelines to enhance consistency and reduce the likelihood of subjective annotations during the labeling process.
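
For readers who want a concrete picture of how such categories might be used in a fairness check, the Python sketch below groups a model's predictions by one category and compares accuracy across groups. It is only an illustration under stated assumptions: the record layout (`labels`, `correct`), the helper names `accuracy_by_group` and `max_gap`, and the toy records are all hypothetical and do not reflect the dataset's actual file format or Meta's evaluation method. Only the category names are taken from the description above.

```python
from collections import defaultdict

# Category names as listed in the article, split the same way the text splits them.
SELF_PROVIDED = ["age", "gender", "language/dialect", "geolocation",
                 "disability", "physical adornments", "physical attributes"]
ANNOTATED = ["voice timbre", "apparent skin tone", "recording setup", "activity"]


def accuracy_by_group(records, category):
    """Return {group value: accuracy} for one category.

    Each record is a hypothetical dict with a "labels" mapping and a
    boolean "correct" flag saying whether the model got that video right.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for rec in records:
        group = rec["labels"][category]
        total[group] += 1
        correct[group] += int(rec["correct"])
    return {group: correct[group] / total[group] for group in total}


def max_gap(per_group):
    """Largest accuracy difference between any two groups."""
    values = list(per_group.values())
    return max(values) - min(values)


# Toy, made-up records for illustration only (not drawn from the dataset).
records = [
    {"labels": {"gender": "female"}, "correct": True},
    {"labels": {"gender": "female"}, "correct": False},
    {"labels": {"gender": "male"}, "correct": True},
    {"labels": {"gender": "male"}, "correct": True},
]

per_group = accuracy_by_group(records, "gender")
print(per_group, max_gap(per_group))
```

A large gap between groups would flag a category worth investigating further; a real evaluation would also need enough samples per group for the comparison to be meaningful.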