LLMs use Reddit data to improve US inflation forecasts

A Banca d'Italia paper demonstrates that large language models can transform Reddit discussions into timely predictors of US inflation. The study constructs monthly narrative indicators from economics-focused subreddits.

LLMs as economic sensors

The research shows that large language models (LLMs) can convert Reddit discussions into useful predictors of two US inflation measures: headline CPI and core PCE.

By analyzing inflation-related submissions and comments from major economics-focused subreddits, the authors construct monthly narrative indicators capturing perceived price dynamics.
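To make the aggregation step concrete, here is a minimal sketch of how post-level labels might be rolled up into a monthly indicator. The labelling scheme (+1 for "prices rising", -1 for "prices falling", 0 for neutral), the net-balance formula, and the function name `monthly_net_balance` are illustrative assumptions; the paper's exact construction is not described here, and the sample posts are made up.

```python
from collections import defaultdict
from datetime import date


def monthly_net_balance(posts):
    """Average post-level labels within each calendar month.

    posts: iterable of (date, label) pairs, where label is +1, 0 or -1.
    The labelling scheme is an assumption made for this sketch.
    Returns a dict mapping (year, month) to the mean label in that month.
    """
    totals = defaultdict(lambda: [0, 0])  # (year, month) -> [label sum, post count]
    for day, label in posts:
        key = (day.year, day.month)
        totals[key][0] += label
        totals[key][1] += 1
    return {month: s / n for month, (s, n) in sorted(totals.items())}


# Made-up labelled posts, for illustration only.
posts = [
    (date(2022, 1, 3), 1),
    (date(2022, 1, 17), 1),
    (date(2022, 1, 29), -1),
    (date(2022, 2, 5), 0),
    (date(2022, 2, 20), 1),
]
indicator = monthly_net_balance(posts)
```

A value near +1 would mean that most posts in that month describe rising prices; a value near -1 would mean the opposite.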

Signals are generated by fine-tuning pre-trained models (BERT, Qwen, LLaMA, Gemma architectures) using labels from human annotators and ChatGPT.

These Reddit-LLM models, and their forecast combinations, improve point and density forecasts relative to standard benchmarks, including autoregressive models augmented with Michigan survey expectations and inflation swaps.
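A point-forecast comparison of this kind is commonly summarized as a relative root mean squared error (RMSE), where a ratio below 1 favors the candidate over the benchmark. The sketch below assumes an equal-weighted forecast combination and uses made-up toy numbers; neither the weighting scheme nor the figures come from the paper.

```python
from math import sqrt


def rmse(forecasts, actuals):
    """Root mean squared forecast error."""
    return sqrt(sum((f - a) ** 2 for f, a in zip(forecasts, actuals)) / len(actuals))


def equal_weight_combination(*forecast_series):
    """Average several forecast series period by period (assumed weighting)."""
    return [sum(vals) / len(vals) for vals in zip(*forecast_series)]


# Toy numbers for illustration only; these are not results from the paper.
actuals = [3.0, 3.2, 3.1, 2.9]
benchmark = [2.6, 2.8, 3.5, 3.3]   # e.g. an autoregressive benchmark
model_a = [2.9, 3.4, 3.0, 3.0]     # hypothetical Reddit-LLM model forecasts
model_b = [3.2, 3.1, 3.3, 2.7]

combo = equal_weight_combination(model_a, model_b)
relative_rmse = rmse(combo, actuals) / rmse(benchmark, actuals)
```

In this toy case the individual models err in opposite directions, so averaging them cancels part of the error, which is the usual motivation for combining forecasts.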

The nowcasts, produced in pseudo-real time, are also competitive with the Cleveland Fed Inflation Nowcast, especially when they use early-month information.

All data transformations are strictly backward-looking: the value used at a given date depends only on information available by that date, which prevents look-ahead bias.
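One way to picture a backward-looking transformation is an expanding-window standardization, where each observation is scaled using only the observations that came before it. This is a sketch under that assumption; the function name `expanding_zscore` and the `min_obs` parameter are hypothetical, and the paper may use different transformations.

```python
from statistics import mean, pstdev


def expanding_zscore(series, min_obs=3):
    """Standardize each value using only the observations that precede it.

    Returns None until at least min_obs past observations are available.
    """
    out = []
    for t, x in enumerate(series):
        history = series[:t]  # strictly earlier observations; nothing from the future
        if len(history) < min_obs:
            out.append(None)
            continue
        sd = pstdev(history)
        out.append((x - mean(history)) / sd if sd > 0 else 0.0)
    return out


# Changing a later value leaves every earlier output untouched.
base = expanding_zscore([1.0, 2.0, 3.0, 4.0, 5.0])
shocked = expanding_zscore([1.0, 2.0, 3.0, 4.0, 100.0])
```

A full-sample z-score would fail this check, because the mean and standard deviation would absorb the final observation and shift every earlier value.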

Scalable and robust insights

The paper highlights three key contributions: a transparent methodological pipeline for converting raw Reddit content into time-series regressors, empirical evidence of systematic forecasting gains up to 18 months ahead (largest at short horizons), and a comprehensive set of robustness exercises.

The results are not sensitive to benchmark choices or alternative loss functions, and they hold for a pre-COVID sample.

Importantly, much of the predictive content can be captured with fine-tuned small language models (SLMs), which often deliver performance close to larger LLMs at a fraction of the computational cost.

This supports scalable and resource-efficient deployment, making the approach suitable for routine monitoring in resource-constrained environments.

Social media's predictive power

The research supports a real-time approach to inflation forecasting that draws on the spontaneous narratives of social media.

It underscores the significant, yet often overlooked, potential of unstructured data to complement traditional economic indicators.

The findings suggest a future where central banks could integrate such dynamic, cost-effective insights into their macroeconomic monitoring frameworks.