Study: LLMs show human-like inflation biases, lack consistency
A Bank of England working paper finds that large language models can replicate key empirical regularities of households' inflation perceptions. However, the study also reveals that LLMs lack a consistent model of consumer price inflation, exhibiting unexplained kinks in component sensitivity.
GPT's inflation perception mirrors human biases
A Bank of England working paper explores the ability of large language models (LLMs), specifically 'GPT-3.5 Turbo', to form inflation perceptions and expectations from macroeconomic price signals.
The study compares LLM outputs with household survey data and official statistics, using prompts that replicate the characteristics of the Bank of England's Inflation Attitudes Survey (IAS).
A key methodological aspect is the quasi-experimental design, leveraging GPT's training cut-off in September 2021, which means it lacks knowledge of the subsequent UK inflation surge.
Because the model could not have seen the surge in its training data, its responses can be compared with aggregate survey results and official statistics for that period.
At a disaggregated level, GPT replicates empirical regularities of households' inflation perceptions, particularly for income, housing tenure, and social class.
The model also shows a heightened sensitivity to food inflation, akin to human respondents.
A novel Shapley value decomposition provides insights into the drivers of model outputs linked to prompt content.
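As a rough illustration of the general idea, the sketch below computes textbook Shapley values for a toy "response" function over three prompt components. It is not the paper's decomposition, which the authors describe as novel. The component names, the effect sizes, and the functions shapley_values and toy_response are all hypothetical stand-ins for averaged model answers.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley values for a set function `value` over `players`."""
    n = len(players)
    phi = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that does not contain p.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of p when added to coalition s.
                total += weight * (value(s | {p}) - value(s))
        phi[p] = total
    return phi


def toy_response(included):
    """Hypothetical stand-in for the mean model answer (in percent)
    when only the `included` price components appear in the prompt."""
    effects = {"food": 1.5, "energy": 0.8, "housing": 0.4}  # made-up numbers
    out = 2.0 + sum(effects[c] for c in included)
    if "food" in included and "energy" in included:
        out += 0.6  # made-up interaction between two components
    return out


components = ["food", "energy", "housing"]
phi = shapley_values(components, toy_response)
for name, contribution in phi.items():
    print(f"{name}: {contribution:.2f}")
```

In this toy setup the contributions sum to the gap between the response with all components included and the response with none, and the interaction term is split evenly between the two components involved. That additivity is what makes such a decomposition easy to read.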
Inconsistencies and ethical considerations
However, the study also reveals that the LLM lacks a consistent model of consumer price inflation: its sensitivity to individual price components shows unexplained kinks.
This points to a potential absence of coherent reasoning or a consistent 'world model' for the economic concepts studied.
The micro-level correspondence between LLM and human responses was found to be weak and unstable, possibly due to the rudimentary economic conditioning environment.
The authors discuss the broader use of LLMs in economic analysis, noting that results from AI experiments require empirical validation.
Ethical considerations, including potential biases related to demographic characteristics, are highlighted.
The general approach could be used to evaluate LLM behavior in the social sciences, to compare models, or to assist in survey design, provided the results are carefully validated and the inherent trade-offs are understood.
A cautious embrace of AI
This paper offers a crucial framework for understanding the strengths and weaknesses of large language models in economic analysis.
While LLMs show promise in replicating certain human economic perceptions, their inherent inconsistencies and potential biases demand rigorous empirical validation.
For central banks and researchers, this means LLMs can be powerful tools, but only when their specific limitations are fully understood and accounted for.
Source: Inflation attitudes of large language models