AI Models Give Conflicting Answers on Which Jobs Face Automation Risk

Three leading AI models gave substantially different answers when economists asked which jobs face the greatest automation exposure, according to a Wall Street Journal study published May 10. ChatGPT, Gemini, and Claude each ranked occupations differently, with no consistent agreement on which roles are most at risk. The finding puts a question mark over the reliability of AI-generated labor forecasts at the moment businesses and policymakers are using them most heavily.

What the Models Said

Economists posed identical prompts to all three models, asking each to rank jobs by AI exposure.

The results diverged in meaningful ways across categories including legal work, financial analysis, and creative professions. In some cases, one model rated an occupation as highly exposed while another placed the same role in a low-risk bracket.

The researchers did not release a single ranked list because the models produced no consensus ranking to report.

The disagreement is not trivial. Employers, governments, and workers increasingly rely on AI risk assessments when making hiring, training, and policy decisions.

When the models producing those assessments contradict each other, the downstream decisions rest on an unstable foundation.

How We Got Here

AI automation forecasts have circulated in mainstream economic research since at least 2013, when Oxford economists estimated that roughly 47% of U.S. jobs faced high automation risk over the following two decades. That estimate drew sustained criticism over its methodology.

Subsequent research from institutions including the McKinsey Global Institute and the OECD produced sharply different exposure percentages using different definitions of “task automation.” The new WSJ findings suggest the same inconsistency problem has migrated into the large language model era.

The cryptocurrency and technology sectors have tracked these forecasts closely. Discussions at Consensus Miami this month centered on whether AI agents would replace knowledge workers rapidly enough to reshape demand for digital payment infrastructure, according to event coverage published May 10.

What This Means for AI Confidence

The models’ disagreement matters beyond academic curiosity.

Businesses spending heavily on AI tools often justify the investment by projecting productivity gains from automating specific job functions. If the tools themselves cannot identify which functions are automatable, the investment calculus becomes harder to defend.

For policymakers drafting workforce retraining programs, conflicting AI guidance creates a real allocation problem.

There is no evidence the disagreement stems from deliberate design choices. It more likely reflects differences in training data, prompt sensitivity, and the inherent ambiguity of predicting labor substitution at the task level rather than the occupation level.

Researchers and labor economists said the study should prompt more rigorous disclosure from AI developers about how their models handle economic forecasting.

The WSJ report did not indicate that any of the three companies had commented before publication.

Assistant Editor

Mehjabeen is a journalist covering crypto news, DeFi, exchanges, trading, and market analysis. Over the past three years, she has focused on the trends and narratives shaping digital asset markets, having ghostwritten for several Tier 1 and Tier 2 outlets.
