Transforming Statistical Data Research with StatGPT

Unlock the potential of Natural Language Processing to enhance access to official statistics.

90%

Quality rate of automated data search

2x

Increase in self-service research by business users

4x

Faster access to cross-dataset statistical information

Challenge

Statistical agencies often struggle to deliver quality data to consumers and internal researchers. The data can be difficult to access, requiring complex queries spanning several datasets to retrieve what researchers need.

Off-the-shelf AI solutions can answer some questions about these statistics, but they often hallucinate or fail to reference the source data. By adopting SDMX (Statistical Data and Metadata eXchange), a unified, AI-ready format for statistical data, entities like the International Monetary Fund (IMF) have been able to standardize how AI tools access their data.

Custom-built agentic AI tools optimized for the domain, like StatGPT, democratize access to this data.

Industry

Economy

Dev team size

10 developers

Duration

2 years

Solution

StatGPT is a multi-stage solution that begins with the ingestion of data sources and their descriptive metadata, forming a semantic layer powered by the QuantHub ecosystem.

Once the data is onboarded, users can interact with an agentic chatbot capable of retrieving statistical data by processing natural language queries. The StatGPT portal extends the user exploration journey by providing a convenient interface for reviewing charts and advanced data query editing.

StatGPT operates as an intelligent orchestration framework: it recognizes user intent, then forms and activates the corresponding agentic chains, which include a domain-specific knowledge-base agent, a query-constructing agent, a dataset-exploring agent, a general information agent, and several others. Importantly, before serving any knowledge or data request, StatGPT applies a guardrail mechanism to ensure that all user interactions comply with organizational guidelines.
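The guardrail-then-route flow described above can be pictured with a minimal Python sketch. Everything in it is an assumption made for illustration: the blocked-term list, the keyword-based `classify_intent` function (standing in for whatever intent recognition StatGPT actually uses), and the agent names in `AGENT_CHAINS` are hypothetical and do not reflect StatGPT's real implementation.

```python
from typing import Dict, List

# Hypothetical policy list; a real guardrail would apply organizational rules.
BLOCKED_TERMS = ("password", "credit card")

# Hypothetical mapping from a recognized intent to the chain of agents to run.
AGENT_CHAINS: Dict[str, List[str]] = {
    "knowledge_base": ["knowledge_base_agent"],
    "dataset_explorer": ["dataset_exploring_agent"],
    "query_builder": ["dataset_exploring_agent", "query_constructing_agent"],
    "general_info": ["general_information_agent"],
}


def passes_guardrail(message: str) -> bool:
    """Return False when the message contains a blocked term."""
    lowered = message.lower()
    return not any(term in lowered for term in BLOCKED_TERMS)


def classify_intent(message: str) -> str:
    """Toy keyword classifier standing in for real intent recognition."""
    lowered = message.lower()
    if any(cue in lowered for cue in ("what is", "define", "meaning of")):
        return "knowledge_base"
    if any(cue in lowered for cue in ("which datasets", "list datasets")):
        return "dataset_explorer"
    if any(cue in lowered for cue in ("show", "plot", "compare")):
        return "query_builder"
    return "general_info"


def handle_message(message: str) -> List[str]:
    """Check the guardrail first, then pick the agent chain for the intent."""
    if not passes_guardrail(message):
        return ["refusal"]
    return AGENT_CHAINS[classify_intent(message)]


if __name__ == "__main__":
    print(handle_message("Show GDP growth for France"))
    print(handle_message("What is my colleague's password?"))
```

The point of the sketch is the ordering: the guardrail check runs before any intent is classified or any agent is selected, so a rejected message never reaches a data or knowledge agent.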

The main purpose of the application is to help researchers access the data they need from multiple datasets by transforming natural language requests into specific SDMX queries that retrieve and visualize data, thereby overcoming the limitations of traditional faceted-search data explorers.
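To make the last step of that transformation concrete, the sketch below assembles a data query URL in the path style defined by the SDMX REST API (`/data/{flow}/{key}` with `startPeriod` and `endPeriod` parameters). It assumes the natural language request has already been parsed into dimension values; the function name `build_sdmx_data_url`, the base URL, the `CPI` dataflow ID, and the dimension order are hypothetical and are not taken from StatGPT or any specific agency's service.

```python
from typing import List, Optional
from urllib.parse import urlencode


def build_sdmx_data_url(
    base_url: str,
    dataflow: str,
    dimensions: List[List[str]],
    start_period: Optional[str] = None,
    end_period: Optional[str] = None,
) -> str:
    """Assemble an SDMX REST data query URL.

    Each inner list holds the selected codes for one dimension, in the order
    the dataflow defines. Codes within a dimension are joined with "+", and
    an empty list leaves the dimension unconstrained (wildcard).
    """
    key = ".".join("+".join(codes) for codes in dimensions)
    url = f"{base_url.rstrip('/')}/data/{dataflow}/{key}"

    params = {}
    if start_period:
        params["startPeriod"] = start_period
    if end_period:
        params["endPeriod"] = end_period
    if params:
        url += "?" + urlencode(params)
    return url


if __name__ == "__main__":
    # Hypothetical parse of: "annual consumer prices for France and Germany,
    # all indicators, 2015 to 2020".
    print(
        build_sdmx_data_url(
            "https://example.org/sdmx/rest",
            "CPI",
            [["A"], ["FR", "DE"], []],
            start_period="2015",
            end_period="2020",
        )
    )
```

A structured key of this kind is also what a user could reproduce by hand in a faceted-search explorer, which is one way an AI-constructed query can be checked against the source data.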

Multilingual Natural Language Interface

Researchers and non-technical users can retrieve statistical data in a conversational manner. In addition to English, StatGPT is customizable to support other languages.

Custom Interface for Advanced Query Editing

A hybrid workflow enables advanced users to further expand and refine data queries and dynamically review data charts, enhancing the AI data search.

Human-in-the-Loop Validation

The agentic system continuously requests feedback and asks the user for clarification to ensure each request is interpreted correctly, fill information gaps, and increase user satisfaction.

Results

Data retrieval without hallucinations

Chatbot answers are always grounded in the data and can be validated by a manual faceted search.

Collaborative ecosystem

StatGPT enables users to share their research conversations with colleagues.

Index of certified AI-ready statistics

StatGPT is the main engine behind the Global Trusted Data Commons initiative, which aims to build a single source of official statistics worldwide and democratize access to data for the general public.

Our Interface

StatGPT UI

Used Components

DIAL Ecosystem

A set of tools, agents, and applications for building AI-powered business solutions
