High-Value Datasets

Documentation

This guide provides the technical information required to access the High-Value Datasets published by the Hungarian Central Statistical Office (HCSO) through an application programming interface (API). The API gives access to current and archived statistical data in open, machine-readable formats.

The API currently cannot be called directly from client-side (browser-based) web applications: such requests are blocked by CORS policy.

Access URL

The base URL of the API is: https://data.ksh.hu/
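As an illustration, request URLs can be assembled from the base URL. The sketch below is not part of the official API: the helper name apiUrl is hypothetical, and the two paths shown are the ones used by the sample code later in this guide.

```javascript
// Base URL as stated in this guide.
const BASE_URL = 'https://data.ksh.hu/';

// Hypothetical helper: joins path segments onto the base URL,
// percent-encoding each segment so that an ID cannot alter the path.
function apiUrl(...segments) {
  const path = segments.map((s) => encodeURIComponent(String(s))).join('/');
  return new URL(path, BASE_URL).toString();
}

console.log(apiUrl('datasets.json'));
// 'XXXX' is a placeholder for a dataset ID, as in the samples below.
console.log(apiUrl('datasets', 'XXXX', 'metadata.rdf'));
```

Keeping the base URL in one constant makes it easier to adapt the code if the address changes.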

Authentication

No authentication is required for access.

Request format and endpoints

A detailed specification is available as a standard OpenAPI description, which you can browse online or download in machine-readable form.

Terms of Use

When using automated queries, please act responsibly to avoid excessive load or overuse of system resources.
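One possible way to keep automated queries gentle is to run them one at a time with a pause in between. The following is a minimal sketch, not an official requirement: the function names and the default delay of 1000 ms are assumptions, since this guide does not state a specific rate limit.

```javascript
// Resolves after the given number of milliseconds.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical helper: runs async tasks strictly one after another,
// waiting delayMs between consecutive tasks (assumed value, not an HCSO limit).
async function runSequentially(tasks, delayMs = 1000) {
  const results = [];
  for (let i = 0; i < tasks.length; i++) {
    if (i > 0) await sleep(delayMs);
    results.push(await tasks[i]());
  }
  return results;
}
```

Each task is a function that returns a promise, for example () => fetch(url), so that no request starts before the previous one has finished.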

License

By using the API, you accept the HCSO copyright policy, available at: www.ksh.hu/copyright_eng.

Further Information

If you have any questions, please contact us at: www.ksh.hu/contact

Sample code for programmatic access

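This section provides sample code for downloading data. The code is intended for the Node.js runtime. It relies on the built-in fetch function, which is available by default from Node.js 18, and on ES modules (import statements and top-level await). Steps 02 and 03 also use the fast-xml-parser package.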

01. Downloading the list of datasets

async function listDatasets() {
  const res = await fetch('https://data.ksh.hu/datasets.json');

  if (!res.ok) throw new Error(`datasets.json download error: HTTP ${res.status} ${res.statusText}`);

  const datasetsJson = await res.text();

  let datasets;
  try {
    datasets = JSON.parse(datasetsJson);
  } catch {
    throw new Error('The datasets.json response is not valid JSON.');
  }

  if (!Array.isArray(datasets) || datasets.length === 0) {
    throw new Error('The datasets.json response does not contain a processable dataset list.');
  }
  return datasets;
}

02. Downloading metadata for a dataset

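import { XMLParser } from 'fast-xml-parser';

// Parser settings chosen so that the resulting object is easy to navigate:
const XML = new XMLParser({
  removeNSPrefix: true,     // drop namespace prefixes, e.g. rdf:RDF becomes RDF
  ignoreAttributes: false,  // keep XML attributes in the output
  textNodeName: 'text',     // key under which element text is stored
  attributeNamePrefix: '',  // attributes appear as plain keys, e.g. rdf:resource becomes resource
});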

async function fetchDatasetMetadataRdfAsJson(datasetId) {
  const res = await fetch(`https://data.ksh.hu/datasets/${encodeURIComponent(datasetId)}/metadata.rdf`);

  if (!res.ok) {
    throw new Error(`metadata.rdf download error: HTTP ${res.status} ${res.statusText}`);
  }

  const rdfXml = await res.text();

  let rdfJson;
  try {
    rdfJson = XML.parse(rdfXml);
  } catch {
    throw new Error('The metadata.rdf response is not valid XML.');
  }

  return rdfJson;
}

03. Downloading the SDMX schema and data file for a dataset

import { createWriteStream } from 'node:fs';
import { pipeline } from 'node:stream/promises';
import { Readable } from 'node:stream';

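// 'XXXX' is a placeholder for a dataset ID.
// fetchDatasetMetadataRdfAsJson is the function defined in step 02.
const rdfJson = await fetchDatasetMetadataRdfAsJson('XXXX');
const distributionContainer = rdfJson.RDF.Dataset.distribution;
// The parser yields a single object for one distribution and an array for several,
// so normalize to an array; filter(Boolean) skips a missing distribution element.
const distributions = (Array.isArray(distributionContainer) ? distributionContainer : [distributionContainer])
  .filter(Boolean)
  .map(x => x.Distribution);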

for (const [index, dist] of distributions.entries()) {
  const sdmxSchemaUrl = dist.conformsTo.resource;
  const sdmxDataUrl = dist.downloadURL;

  // Download the schema.
  const schemaRes = await fetch(sdmxSchemaUrl);
  if (!schemaRes.ok) {
    throw new Error(`Schema download error: HTTP ${schemaRes.status} ${schemaRes.statusText}`);
  }
  const schemaText = await schemaRes.text();
  // Parse XML or JSON schema...
  // ...

  // Download the data.
  const dataRes = await fetch(sdmxDataUrl);
  if (!dataRes.ok) throw new Error(`Download error: HTTP ${dataRes.status} ${dataRes.statusText}`);
  // Save, e.g. to a file. The index in the file name keeps one distribution
  // from overwriting another.
  await pipeline(Readable.fromWeb(dataRes.body), createWriteStream(`data-${index}.csv`));
}
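The lookup of distribution URLs in step 03 can also be separated from the downloading, which makes it easier to check without network access. The sketch below assumes the same parsed structure as the sample above (RDF, Dataset, distribution, Distribution, conformsTo.resource, downloadURL). The function names and the example.org URLs are hypothetical.

```javascript
// Wraps a single value in an array, leaves arrays as they are,
// and turns null or undefined into an empty array.
function toArray(value) {
  if (value === undefined || value === null) return [];
  return Array.isArray(value) ? value : [value];
}

// Hypothetical helper: returns one { schemaUrl, dataUrl } pair per distribution.
// Fields that are absent from the metadata are reported as null.
function listDistributionUrls(rdfJson) {
  const dataset = rdfJson?.RDF?.Dataset;
  if (!dataset) {
    throw new Error('No RDF/Dataset element found in the parsed metadata.');
  }
  return toArray(dataset.distribution)
    .map((entry) => entry?.Distribution)
    .filter(Boolean)
    .map((dist) => ({
      schemaUrl: dist.conformsTo?.resource ?? null,
      dataUrl: dist.downloadURL ?? null,
    }));
}

// Hand-made sample shaped like the parser output assumed above (not real HCSO data).
const sample = {
  RDF: {
    Dataset: {
      distribution: {
        Distribution: {
          conformsTo: { resource: 'https://example.org/schema.xml' },
          downloadURL: 'https://example.org/data.csv',
        },
      },
    },
  },
};
console.log(listDistributionUrls(sample));
```

Returning null for missing fields lets the caller decide whether to skip a distribution or to stop with an error.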