Technological progress can displace workers from existing work, as well as create new work. Combined with demographic changes and macroeconomic fluctuations, it has also spurred the growth of non-standard employment. Within these trends, Neha Arya shines the spotlight on “data workers” in India’s gig and platform economy. Noting the vulnerability of human labour behind artificial intelligence (AI) systems, she deliberates on the way forward for India as a major player in global AI supply chains.
Technological advancements reshape models of interaction among individuals, firms, governments, and across these groups. These changes often influence the fundamental nature of work, along with several dynamics of the labour market. Lin (2011) used historical US Census data for the period 1965-2000 (based on the Dictionary of Occupational Titles) to identify the novel job titles it captured, for example, “web developer”, “chat room host”, and “radiopharmacist”. Using Lin’s approach, Autor et al. (2021) found that over 60% of employment in 2018 was under job titles that did not exist in 1940. Thus, while technology, such as automation, can displace workers from existing jobs or tasks, it also creates new work, reinstating demand for workers with specific expertise (Acemoglu and Restrepo 2018). Identifying these new occupational titles is especially crucial for developing countries like India, which have a large, informal and vulnerable workforce and are experiencing rapid technological advancements.
Technological innovations, demographic changes, and macroeconomic fluctuations have enabled the proliferation of several forms of non-standard employment (NSE) worldwide. These include part-time/on-call work, temporary agency work/multi-party employment arrangements, and disguised employment/dependent self-employment[1] (International Labour Organization (ILO), 2016). The Covid-19 pandemic further accelerated this trend. Developing countries, marked by a high degree of casual employment, have also experienced this change in the nature of employment. A 2019 global survey found that 80% of respondents preferred flexible work opportunities, and 65% of businesses reported cost-efficiency gains from such flexibilisation (International Workplace Group (IWG), 2019). India’s flexi-staffing industry grew by 15.3% during 2023-24, driven mainly by the FMCG (fast-moving consumer goods), e-commerce, manufacturing, healthcare, retail, logistics, banking, and energy sectors (Indian Staffing Federation (ISF), 2024). This surge raises concerns about increasing informality and vulnerable employment, especially amid the rapid expansion of the gig and platform economy.
The human labour behind artificial intelligence
While broad issues around India’s gig and platform economy have gained prominence, the emerging category of “data workers” (new work that is vital for Artificial Intelligence (AI) systems) remains largely overlooked in the discourse. Since the term AI was coined by a group of computer scientists (McCarthy, Minsky, Rochester, and Shannon) in 1956, it has evoked a mix of hope, fear, and uncertainty about the future of work. After several waves of effort, the ongoing AI revolution has set off a global race for AI leadership. Generative AI (GenAI) models are swiftly becoming popular. OpenAI’s ChatGPT, for example, became the fastest-growing web application in history, reaching 100 million monthly active users within 2.5 months and subsequently 500 million users (Hadi and Najm 2023, Paris 2025). The term “generative” underscores the fact that these AI systems can create or generate new material autonomously without human input (Feuerriegel et al. 2023). However, a huge amount of human labour goes into the development of these AI systems. Many of them (including ChatGPT, Google’s Gemini, and DALL-E, among others) are based on a complex “human-in-the-loop” (HITL) model (Rani and Dhir 2024). HITL relies on the judgement of human data workers to annotate, label, and categorise raw data (like text files, images or videos) used to train machine learning models (IBM, 2025). Data curators, labellers, content moderators, validators, and human feedback providers work to ensure that AI does not perform poorly or dangerously (for instance, in autonomous cars). The accuracy of these data is crucial for the efficiency, predictability and performance of AI models. Thus, data workers are the backbone of AI systems, ensuring their functionality, accuracy, and safety – ironically while themselves working in precarious, fragmented, and often invisible conditions.
Why is this an important contemporary and future concern for India? To meet cost-efficiency goals, businesses increasingly rely on gig workers in the AI supply chain – often outsourcing tasks to crowd workers via digital labour platforms (DLPs) or to smaller firms employing data workers. For instance, at the announcement of Amazon’s Mechanical Turk or MTurk (a virtual labour marketplace/crowdwork platform) in 2006, Amazon’s CEO Jeff Bezos referred to it as “artificial artificial intelligence”. The phrase signified that the “Human Intelligence Tasks” (HITs) available on the platform were microtasks (often simple and repetitive) to be performed by a reserve army of cheap labour. A prominent outcome of data workers using MTurk (called ‘turkers’) for a project (by Jia Deng et al.) was the release of the ImageNet dataset, the largest labelled image dataset, in 2009. It was fuelled by the work of millions of workers across the globe, who manually labelled a million images for very low wages. A 2016 survey by the Pew Research Center of almost 3,000 turkers from the US revealed that over 50% of them reported hourly earnings below US$5 (Pew, 2016). Unsurprisingly, a large majority of data workers are in the Global South, where wages are significantly lower. For instance, in Kenya, data workers mostly receive hourly wages of only US$2, while in Argentina hourly wages go as low as US$1.7. In addition, workers are often bound by Non-Disclosure Agreements (NDAs) with companies, further invisibilising their contributions to AI systems (Dachwitz 2024). Besides low wages, concerns have also been raised about adverse mental health outcomes for data workers engaged in content moderation. Content moderators are regularly exposed to traumatising content, which has long-term psychological implications – sometimes even leading to drug dependency (Gebrekidan 2025).
India’s role in global AI supply chains
According to the European Commission, India registered one of the fastest rates of digitalisation (11%) during the period 2011-2019 – similar to China – making the National Industrial Classification (NIC) (2008) used by labour surveys too dated to capture most digitally-driven new work. Where, then, are gig, platform, and data workers captured? India’s annual Periodic Labour Force Survey (PLFS) uses the National Classification of Occupations (NCO) (2015), in which “data entry clerks” are captured by “Family 4132” (Figure 1). Essentially, these categories cover traditional clerical data input roles and do not explicitly include modern AI-related data work. Overall, therefore, these workers remain statistically invisible, as is the case for digital platform-based gig workers.
Platform gig and data work in AI supply chains are key present-day illustrations of the “reinstatement effect” (Acemoglu and Restrepo 2019) of technology. Job advertisements (see Figures 2, 3, and 4 below) for roles involving data work, like data validation and data annotation, list key competencies including data analysis, excellent written and verbal communication skills, and attention to detail, among others. Figure 5 illustrates the rising demand for data workers in India (now emerging as a key hub for data annotation), powered by a diverse workforce producing high-quality datasets for global use. In 2024, an estimated 50,000 Indian (freelance) annotators were present on international digital platforms, alongside 20,000 full-time annotators within India, according to an Economic Times report (citing data from TeamLease). The same report also states that the global market for data annotation is valued at an estimated US$8.22 billion, and is expected to grow at nearly 26.2% annually through 2028. From US$250 million in 2020-21, India is expected to service over US$7 billion of the global annotation market by 2030 (National Association of Software and Service Companies (NASSCOM), 2021). Even India’s ‘National Strategy for Artificial Intelligence’ identifies data annotation work as having the potential of “absorbing a large portion of the workforce that may find itself redundant due to increasing automation” (NITI Aayog, 2018). However, among other issues, there is a concern that the HITL model may de-skill workers who perform repetitive tasks to train or improve AI systems. Additionally, while location-based gig work has gained regulatory[2] attention through collectivisation efforts (Tiwari 2025, Jain 2025, Elizabeth 2024) – often supported by informal labour unions and widespread public discussions – AI data workers remain largely absent from mainstream discourse.
Figure 1. Occupations under ‘Family 4132- Data Entry Clerks’ group from NCO-2015
Group Code | Occupation Title | Description
4132.0401 | Data Entry Machine Operator | Enters alphabetic, numeric, or symbolic data into computer, and verifies it.
4132.0402 | Domestic Data Entry Operator | Electronically enters data (daily/hourly work reports) on client or office sites.
4132.0600 | Coding Machine Operator | Handles coding machines to print codes on different materials.
4132.0800 | Duplicating Machine Operator/Photocopier | Operates and monitors photocopying machines.
4132.0900 | Embossing Machine Operator | Operates power driven embossing machines.
4132.1000 | Addressing Machine Operator | Operates electrically-driven printing machines.
4132.1300 | Book Keeping Machine Operator | Records business transactions using computer software, and performs general clerical duties.
4132.1400 | Bill Processing Clerk | Prepares bills, statements, calculates payrolls and other amounts, using computer software.
4132.9900 | Data Entry Clerks, Other | Operates book-keeping and computing machines not elsewhere classified.
Figure 2. Data labelling – permanent
Figure 3. Data annotator – freelance

Figure 4. Classification data annotation – freelance
Policy directions
India ranks 14th in AI research globally, with a share of 1.4% during 2018-2023, compared to the US’s share of 30.4% and China’s share of 22.8%. However, it has already come into focus as a global market for AI technologies – recently emerging as the second largest, and among the fastest growing, markets globally for ChatGPT. As the future implications of the ongoing AI revolution remain obscure for all, India stands at a pivotal moment to shape its role in the global AI supply chain. To fully leverage AI’s economic and (decent) employment potential, a coordinated policy approach is needed. While a national AI strategy lays down a blueprint, it is also crucial to update the NCO to encompass AI data-related jobs (including crowdsourced microtask work), establish AI-focused skill development hubs, regulate gig work in the AI supply chain, and promote AI-related research and development in an equitable and inclusive manner. There is a need to identify such gig work via digital labour registries, promote the upskilling of workers, and ensure the accountability of platforms throughout the chain. The uncertain AI era needs proactive measures to avoid continued polarisation[3] of skills and jobs (Kuriakose and Iyer 2020) in India’s labour market. Moreover, the declining labour share of income (Karabarbounis and Neiman 2013), owing to technological advancements and the popularity of work fragmentation, calls for immediate regulatory, civil society, and legislative responses. Building a resilient, ethical AI workforce requires both innovation and inclusion. As a ‘hub’ for AI supply chain labour, India has an opportunity as well as a responsibility to improve labour market conditions for these data workers, who must not be disconnected from the wider benefits they generate.
Notes:
1. “Disguised employment” refers to arrangements where workers provide their labour while having contractual arrangements corresponding to self-employment. “Dependent self-employment” applies to persons who operate a business without employees but do not have complete control over their work.
2. Rajasthan Platform Based Gig Workers (Registration and Welfare) Act, 2023; Karnataka Platform-Based Gig Workers (Social Security and Welfare) Act, 2025; Bihar Platform Based Gig Workers (Registration, Social Security and Welfare) Act, 2025.
3. Job polarisation refers to the shrinking share of middle-skill jobs (typically involving routine tasks), while both high-skill and low-skill jobs grow within the economy.
Further Reading
- Aapti, A (2025), ‘The humans behind AI: Co-creating best practices for data workers’ well-being’, Aapti Institute, 26 April.
- Acemoglu, Daron and Pascual Restrepo (2019), “Automation and New Tasks: How Technology Displaces and Reinstates Labor”, Journal of Economic Perspectives, 33(2): 3–30.
- Chartouni, C, K Moheyddeen, R Zeid, R Naji and M Pallares-Miralles (2025), ‘Can Flexible Jobs Drive the Future of Work? Lessons from MENA’, World Bank Blogs.
- Dachwitz, I (2024), ‘Data Workers’ Inquiry: The hidden workers behind AI tell their stories’, Netzpolitik.Org, 8 July.
- Data Workers Inquiry (2025), ‘Fasica - Data Workers' inquiry’, 5 June.
- Eyck, KV (2003), ‘Flexibilizing Employment: An Overview’, International Labour Organization (ILO), SEED Working Paper No. 41, 1 April.
- Feuerriegel, Stefan, Jochen Hartmann, Christian Janiesch and Patrick Zschech (2023), “Generative AI”, Business & Information Systems Engineering, Vol. 66, 111-126.
- Hadi, Musadaq, Mohammed Najm and E Hasan (2023), “Introduction to ChatGPT: A new revolution of artificial intelligence with machine learning algorithms and cybersecurity”, Science Archives, Vol. 4, No. 4, 276-285.
- Kai-Hsin, H (2024), ‘A Case of Assessing the Working and Living Conditions of Data Workers in India's Global Artificial Intelligence Value Chains’, Available here.
- IBM (2025), ‘What is data labelling?’, IBM Think, 3 June.
- International Labour Office (2016), ‘Non-standard employment around the world: Understanding challenges, shaping prospects’.
- Institute for Human Development and International Labour Organization (2024), ‘India Employment Report 2024: Youth employment, education and skills’, 29 March.
- International Workplace Group (2019), ‘The IWG Global Workspace survey: Welcome to generation flex – the employee power shift’.
- Kuriakose, F and Kylasam Iyer, D (2020), ‘Job Polarisation in India: Structural Causes and Policy Implications’, 18 June. Available here.
- Lin, Jeffrey (2011), “Technological adaptation, cities, and new work”, The Review of Economics and Statistics, Vol. 93, No. 2, 554-574, The MIT Press.
- Nasscom (2021), ‘Data annotation – Billion dollar potential driving the AI revolution’.
- NITI Aayog (2018), ‘National Strategy for Artificial Intelligence’, June.
- International Labour Organization (2025), ‘Non-standard forms of employment’, 6 June.
- Pew Research Center (2016), ‘Research in the Crowdsourcing Age, a case study’, 11 July.
- Rani, U and RK Dhir (2024), ‘AI-enabled business model and human-in-the-loop (deceptive AI): implications for labor’, In Handbook of Artificial Intelligence at Work, Garcia-Murillo, M, I MacInnes and A Renda (eds.), Chapter 4, Edward Elgar Publishing.
- Rani, U and RK Dhir (2024), ‘The Artificial Intelligence illusion: How invisible workers fuel the “automated” economy’, 10 December, International Labour Organization.
- TechEquity Collaborative (2025), ‘The AI supply chain, explained’, 19 June.
- Williams, A, M Miceli and T Gebru (2025), ‘The exploited labor behind artificial intelligence’, Noema Magazine, Berggruen Institute, 8 April.
- Zucconi, A, OV Llave and M Consolini (2024), ‘Flexible work increases post-pandemic, but not for everyone’, European Foundation for the Improvement of Living and Working Conditions, 11 April.