Available Datasets

GARUDA provides access to three primary data domains (Climate, Carbon, and Finance) plus a Saka Calendar reference dataset. All datasets are available via API and as downloadable Parquet files.

Climate Data

Coverage: 100+ weather stations across Indonesia
Time Range: 2000–present (historical data available)
Update Frequency: Daily
Size: ~2.3 GB (2020–2024)

Schema:

  • station_id (String) — BMKG station identifier
  • timestamp (Timestamp) — Observation time (UTC)
  • temperature_c (Float64) — Temperature in Celsius
  • humidity_pct (Float64) — Relative humidity percentage
  • precipitation_mm (Float64) — Precipitation in millimeters
  • wind_speed_kmh (Float64) — Wind speed in kilometers per hour
  • province (String) — Indonesian province
  • island (String) — Island name (e.g., “Jawa”, “Sumatera”, “Kalimantan”)

Example Query:

SELECT
  province,
  AVG(temperature_c) AS avg_temp,
  SUM(precipitation_mm) AS total_precip
FROM climate_observations
WHERE timestamp > '2024-01-01'
GROUP BY province;
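
The aggregation above can be tried locally before touching the real dataset. The following is a minimal sketch, assuming SQLite (from Python's standard library) as a stand-in for whatever engine GARUDA exposes; the station IDs and measurements are made-up sample rows, and timestamps are stored as ISO-8601 text so that string comparison matches chronological order.

```python
import sqlite3

# Hypothetical sample rows in the documented column order; not real GARUDA data.
SAMPLE_ROWS = [
    ("ST001", "2024-01-05 00:00:00", 27.0, 80.0, 12.0, 10.0, "Sumatera Utara", "Sumatera"),
    ("ST001", "2024-01-06 00:00:00", 29.0, 75.0, 3.0, 8.0, "Sumatera Utara", "Sumatera"),
    ("ST002", "2024-02-01 00:00:00", 25.0, 85.0, 20.0, 5.0, "Jawa Barat", "Jawa"),
    ("ST002", "2023-12-31 00:00:00", 24.0, 90.0, 50.0, 6.0, "Jawa Barat", "Jawa"),
]


def province_summary(rows):
    """Run the example aggregation against an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE climate_observations (
               station_id TEXT, timestamp TEXT, temperature_c REAL,
               humidity_pct REAL, precipitation_mm REAL, wind_speed_kmh REAL,
               province TEXT, island TEXT)"""
    )
    conn.executemany(
        "INSERT INTO climate_observations VALUES (?,?,?,?,?,?,?,?)", rows
    )
    result = conn.execute(
        """SELECT province,
                  AVG(temperature_c) AS avg_temp,
                  SUM(precipitation_mm) AS total_precip
           FROM climate_observations
           WHERE timestamp > '2024-01-01'
           GROUP BY province
           ORDER BY province"""  # ORDER BY added only to make the output stable
    ).fetchall()
    conn.close()
    return result


print(province_summary(SAMPLE_ROWS))
# [('Jawa Barat', 25.0, 20.0), ('Sumatera Utara', 28.0, 15.0)]
```

The 2023 row is excluded by the WHERE clause, so Jawa Barat is summarized from a single observation.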

Carbon Data

Coverage: 50+ registered carbon projects in Indonesia
Time Range: 2015–present
Update Frequency: Monthly
Status: Available in v0.3.0

Schema (Preview):

  • project_id (String) — Unique project identifier
  • project_name (String) — Project name
  • project_type (String) — e.g., “forest conservation”, “renewable energy”
  • province (String) — Project location
  • vintage_start (Date) — Credit vintage start date
  • vintage_end (Date) — Credit vintage end date
  • credit_quantity (Int64) — Total credits issued
  • credit_price_usd (Float64) — Market price in USD
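
To illustrate how the preview schema might be consumed, here is a small sketch that models one record and totals the issued value per project type. It assumes credit_price_usd is a price per credit, which the schema does not state; the `CarbonProject` class, the `issued_value_by_type` helper, and the sample projects are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class CarbonProject:
    """One record shaped like the preview schema (field names from the docs)."""
    project_id: str
    project_name: str
    project_type: str
    province: str
    vintage_start: date
    vintage_end: date
    credit_quantity: int
    credit_price_usd: float  # assumed to be a per-credit price


def issued_value_by_type(projects):
    """Sum credit_quantity * credit_price_usd for each project_type."""
    totals = defaultdict(float)
    for p in projects:
        totals[p.project_type] += p.credit_quantity * p.credit_price_usd
    return dict(totals)


# Made-up records for illustration only.
SAMPLE_PROJECTS = [
    CarbonProject("P1", "Example Forest A", "forest conservation", "Kalimantan Timur",
                  date(2020, 1, 1), date(2020, 12, 31), 1000, 5.0),
    CarbonProject("P2", "Example Forest B", "forest conservation", "Riau",
                  date(2021, 1, 1), date(2021, 12, 31), 500, 4.0),
    CarbonProject("P3", "Example Solar", "renewable energy", "Jawa Barat",
                  date(2021, 1, 1), date(2021, 12, 31), 200, 10.0),
]

print(issued_value_by_type(SAMPLE_PROJECTS))
# {'forest conservation': 7000.0, 'renewable energy': 2000.0}
```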

Finance Data

Coverage: Stock exchange quotes, commodity prices, and currency rates
Time Range: 2010–present
Update Frequency: Near real-time (1-minute delay)
Status: Available in v0.3.0

Schema (Preview):

  • timestamp (Timestamp) — Quote time
  • symbol (String) — Ticker symbol
  • price_idr (Float64) — Price in IDR
  • volume (Int64) — Trading volume
  • market (String) — Market name (IDX, commodity exchange, etc.)
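
As a sketch of a typical query over this schema, the snippet below computes a volume-weighted average price (VWAP) per symbol. The table name `market_quotes` is an assumption (the documentation does not name the finance table), SQLite stands in for the real query engine, volume is assumed to be the volume traded at each quote, and the prices are invented.

```python
import sqlite3

# Invented quotes in the documented column order; prices are not real market data.
SAMPLE_QUOTES = [
    ("2024-01-02 02:00:00", "BBCA", 9000.0, 100, "IDX"),
    ("2024-01-02 02:01:00", "BBCA", 9100.0, 300, "IDX"),
    ("2024-01-02 02:00:00", "TLKM", 4000.0, 50, "IDX"),
]


def vwap_by_symbol(quotes):
    """Return (symbol, VWAP in IDR) pairs, skipping symbols with zero volume."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE market_quotes (
               timestamp TEXT, symbol TEXT, price_idr REAL,
               volume INTEGER, market TEXT)"""
    )
    conn.executemany("INSERT INTO market_quotes VALUES (?,?,?,?,?)", quotes)
    rows = conn.execute(
        """SELECT symbol,
                  SUM(price_idr * volume) / SUM(volume) AS vwap_idr
           FROM market_quotes
           GROUP BY symbol
           HAVING SUM(volume) > 0   -- avoid dividing by zero
           ORDER BY symbol"""
    ).fetchall()
    conn.close()
    return rows


print(vwap_by_symbol(SAMPLE_QUOTES))
# [('BBCA', 9075.0), ('TLKM', 4000.0)]
```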

Saka Calendar

Coverage: Complete Saka Calendar mapping
Time Range: 2000–2050
Update Frequency: Static (refreshed annually)
Size: ~5 MB

Schema:

  • gregorian_date (Date) — Gregorian calendar date
  • saka_sasih (String) — Saka month name
  • saka_pawukon (String) — Pawukon cycle
  • saka_eka (Int32) — Eka (day of cycle)

Example Query:

SELECT
  c.timestamp,
  c.temperature_c,
  s.saka_sasih
FROM climate_observations c
JOIN saka_calendar s ON DATE(c.timestamp) = s.gregorian_date
WHERE c.province = 'Bali'
LIMIT 100;
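
One detail worth checking in this join: observation timestamps are stored in UTC, while Bali's local time is UTC+8, so DATE(c.timestamp) yields the UTC date rather than the local one. The sketch below reproduces the join in SQLite (an assumed stand-in engine) and adds a hypothetical `utc_offset_hours` parameter to show the difference. The saka_sasih values are placeholders, not real calendar data, and both tables are trimmed to the columns the query uses.

```python
import sqlite3

# Made-up observations; only the columns used by the join are included.
OBSERVATIONS = [
    ("2024-03-10 18:30:00", 28.5, "Bali"),
    ("2024-03-10 02:00:00", 26.0, "Bali"),
    ("2024-03-10 02:00:00", 30.0, "Jawa Barat"),
]
# Placeholder month labels; real values come from the saka_calendar dataset.
CALENDAR = [
    ("2024-03-10", "PLACEHOLDER_A"),
    ("2024-03-11", "PLACEHOLDER_B"),
]


def join_saka(observations, calendar, utc_offset_hours=0):
    """Join Bali observations to the calendar on the (optionally shifted) date."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE climate_observations "
        "(timestamp TEXT, temperature_c REAL, province TEXT)"
    )
    conn.execute(
        "CREATE TABLE saka_calendar (gregorian_date TEXT, saka_sasih TEXT)"
    )
    conn.executemany("INSERT INTO climate_observations VALUES (?,?,?)", observations)
    conn.executemany("INSERT INTO saka_calendar VALUES (?,?)", calendar)
    # SQLite date modifier, e.g. "+8 hours"; "+0 hours" leaves the date unchanged.
    modifier = f"{utc_offset_hours:+d} hours"
    rows = conn.execute(
        """SELECT c.timestamp, c.temperature_c, s.saka_sasih
           FROM climate_observations c
           JOIN saka_calendar s ON DATE(c.timestamp, ?) = s.gregorian_date
           WHERE c.province = 'Bali'
           ORDER BY c.timestamp
           LIMIT 100""",
        (modifier,),
    ).fetchall()
    conn.close()
    return rows


print(join_saka(OBSERVATIONS, CALENDAR))                      # UTC dates, as in the query above
print(join_saka(OBSERVATIONS, CALENDAR, utc_offset_hours=8))  # local Bali dates
```

With an offset of 8 hours, the 18:30 UTC observation falls on the next local day and picks up the second calendar row instead of the first.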

All datasets are partitioned by:

  • Province — 34 Indonesian provinces
  • Island — Indonesia's 17,000+ islands, grouped under major islands and island groups (Jawa, Sumatera, Kalimantan, Sulawesi, Papua, Nusa Tenggara, Maluku)

Query by geographic hierarchy:

SELECT * FROM climate_observations
WHERE island = 'Jawa' AND province = 'Jawa Barat';

All datasets undergo quality checks:

  • Completeness: >95% non-null values per column
  • Consistency: Schema validation on ingestion
  • Timeliness: Updates within 24 hours of source publication

Quality metrics are available via the /data-quality endpoint (v0.3.0+).
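
The completeness criterion can also be checked locally on downloaded data. This is a minimal sketch, assuming rows are loaded as dictionaries with None for missing values; the helper names `completeness` and `failing_columns` are hypothetical and are not part of the GARUDA API.

```python
def completeness(rows, columns):
    """Return {column: fraction of rows where the value is not None}."""
    total = len(rows)
    if total == 0:
        return {c: 0.0 for c in columns}
    return {
        c: sum(1 for r in rows if r.get(c) is not None) / total
        for c in columns
    }


def failing_columns(rows, columns, threshold=0.95):
    """List columns that do not exceed the threshold (the docs require >95%)."""
    scores = completeness(rows, columns)
    return [c for c in columns if not scores[c] > threshold]


# 100 made-up rows: station_id always present, temperature_c missing in 10 of them.
rows = [
    {"station_id": f"ST{i:03d}", "temperature_c": None if i < 10 else 27.0}
    for i in range(100)
]

print(completeness(rows, ["station_id", "temperature_c"]))
# {'station_id': 1.0, 'temperature_c': 0.9}
print(failing_columns(rows, ["station_id", "temperature_c"]))
# ['temperature_c']
```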

Access tiers:

  • Free Tier: Open Parquet downloads (CC BY 4.0)
  • Developer+: API access with rate limits
  • Enterprise: Custom data ingestion and SLA

See Pricing for details.

  • Interactive data catalog with coverage maps
  • Data quality dashboards
  • Custom data ingestion for enterprise customers
  • Advanced analytics (reversal risk scoring, climate correlation)