The training of AI systems and humanoid robots increasingly relies on a distributed workforce of gig workers operating from home in countries like Nigeria. These workers, many holding other jobs, use smartphones and ring lights to record themselves performing physical movements and actions that feed into machine learning models. Companies developing humanoids need vast amounts of real-world human motion data, and outsourcing this task to remote workers in lower-cost countries has become standard practice. However, this arrangement raises critical questions about labor standards, fair compensation, and the transparency of how AI development is actually being conducted globally.
This trend highlights a growing disconnect between the public narrative around AI advancement and the actual human infrastructure supporting it. While headlines focus on technological breakthroughs and corporate milestones, the workers providing essential training data often operate in precarious conditions with minimal oversight or protections. Gig workers typically lack employment benefits, clear contractual terms, or recourse if companies exploit their contributions. The practice also obscures the true cost of AI development, which lies not just in computational resources but also in the human labor that subsidizes corporate profits.
The emergence of this labor model demands urgent policy attention. Regulators, companies, and industry bodies must establish minimum standards for AI training work, including fair wages tied to local economies, transparent data usage policies, and worker protections. As AI companies scale up their operations, the ethical implications of relying on dispersed, unprotected workers become increasingly critical. Without intervention, the AI industry risks building prosperity on the backs of vulnerable workers who have little say in how their contributions are used or compensated.
As companies race to develop humanoid robots and advanced AI models, a growing workforce of gig laborers in developing countries is providing the training data that makes these systems work, often without adequate compensation or worker protections.
