Introducing TRN: TeknoRakit Normalized Data Format
Indonesian data is fragmented. Climate observations come from BMKG, carbon data from various sources, financial data from banks and exchanges. Each has different formats, units, and geographic references.
We built TRN (TeknoRakit Normalized) to solve this. It’s a unified schema that lets you query climate, carbon, and finance data together, without endless data wrangling.
The Problem: Data Silos
Imagine you’re building a carbon MRV (Monitoring, Reporting, Verification) system. You need:
- Climate data: Temperature, rainfall (from BMKG)
- Carbon data: Grid emissions intensity, fuel mix (from PLN, IESR)
- Financial data: Carbon credit prices, electricity rates (from exchanges)
- Geographic data: Island, province, district (from BPS)
- Cultural data: Saka Calendar dates (for reporting alignment)
Each source uses different:
- Units: Celsius vs. Fahrenheit, kg vs. tonnes, IDR vs. USD
- Time zones: UTC vs. WIB (UTC+7)
- Geographic references: Station IDs vs. province codes vs. grid zones
- Precision: Hourly vs. daily vs. monthly
Result: 80% of your time is spent mapping and cleaning, 20% on actual analysis.
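To make the mismatch concrete, here is a minimal sketch of the kind of glue code each source tends to need before it can be combined with the others. The input field names (`local_time`, `temp_f`, `emissions_t`) and the helper functions are hypothetical and only illustrate the unit and time-zone conversions listed above; they are not part of any TRN library.

```python
from datetime import datetime, timedelta, timezone

# Western Indonesia Time (WIB) is a fixed UTC+7 offset with no daylight saving.
WIB = timezone(timedelta(hours=7))


def fahrenheit_to_celsius(temp_f: float) -> float:
    return (temp_f - 32.0) * 5.0 / 9.0


def tonnes_to_kg(tonnes: float) -> float:
    return tonnes * 1000.0


def wib_to_utc(local: datetime) -> datetime:
    """Interpret a naive timestamp as WIB and convert it to UTC."""
    return local.replace(tzinfo=WIB).astimezone(timezone.utc)


def normalize(raw: dict) -> dict:
    """Map one hypothetical source row onto Celsius, kilograms, and UTC."""
    return {
        "timestamp": wib_to_utc(raw["local_time"]).isoformat(),
        "temperature_c": round(fahrenheit_to_celsius(raw["temp_f"]), 2),
        "emissions_kg": tonnes_to_kg(raw["emissions_t"]),
    }


# Hypothetical input row: local WIB time, Fahrenheit, tonnes.
raw = {"local_time": datetime(2024, 3, 17, 22, 0), "temp_f": 83.3, "emissions_t": 1.25}
row = normalize(raw)
print(row)
```

Every additional source needs its own variant of `normalize`, which is where the mapping and cleaning time goes.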
TRN: A Unified Schema
TRN defines a canonical representation for all Indonesian data:
```rust
// rakit-record crate - Rust implementation of TRN (abridged schema).
// Timestamp, DataQuality, and FuelMix are crate types not shown here.

pub struct TRNRecord {
    // Unique identifier
    pub id: String,

    // Temporal
    pub timestamp: Timestamp, // ISO 8601 UTC
    pub saka_sasih: String,   // Saka month (unique to TRN)
    pub saka_pawukon: String, // Pawukon cycle day

    // Geographic
    pub location: Location,

    // Domain data; each domain is optional
    pub climate: Option<ClimateData>,
    pub carbon: Option<CarbonData>,
    pub finance: Option<FinanceData>,

    // Metadata
    pub data_quality: DataQuality, // raw | qc_passed | estimated
    pub source: String,            // bmkg_realtime | plndata | iesr | etc.
    pub confidence: f64,           // 0.0 to 1.0
}

pub struct Location {
    pub station_id: String, // BMKG station ID
    pub province: String,   // Standardized province
    pub island: String,     // Sumatra, Java, Borneo, etc.
    pub latitude: f64,
    pub longitude: f64,
}

// Climate domain
pub struct ClimateData {
    pub temperature_c: f64,
    pub humidity_pct: f64,
    pub precipitation_mm: f64,
    pub wind_speed_ms: f64,
}

// Carbon domain
pub struct CarbonData {
    pub emissions_kg: f64,
    pub carbon_intensity_gco2_kwh: f64,
    pub fuel_mix: FuelMix,
}

// Finance domain
pub struct FinanceData {
    pub carbon_credit_price_usd: f64,
    pub electricity_rate_idr_kwh: f64,
    pub grid_frequency_hz: f64,
}
```
Key Features
1. Multi-Domain in One Record
Instead of three separate queries:
```sql
-- Without TRN: Three separate queries
SELECT temperature_c FROM climate WHERE station_id = 'BMKG_001';
SELECT emissions_kg FROM carbon WHERE region_id = 'BMKG_001';
SELECT electricity_rate_idr_kwh FROM finance WHERE grid_zone = 'Java';

-- With TRN: One unified query
SELECT
  climate.temperature_c,
  carbon.emissions_kg,
  finance.electricity_rate_idr_kwh
FROM trn_records
WHERE location.station_id = 'BMKG_001'
  AND timestamp >= '2024-01-01';
```
2. Saka Calendar Integration
Indonesia’s Hindu-Buddhist calendar (Saka) is culturally significant. TRN includes automatic conversion:
```python
from datetime import datetime

from rakit_record import TRNRecord, gregorian_to_saka

# Gregorian date
date = datetime(2024, 3, 17)

# Automatic Saka conversion
record = TRNRecord(
    timestamp=date,
    saka_sasih="Bhadra",    # Saka month
    saka_pawukon="Kliwon",  # Saka day cycle
)
```

```sql
-- Use in queries
SELECT * FROM trn_records
WHERE saka_sasih = 'Bhadra'  -- All records in Saka month Bhadra
  AND saka_pawukon = 'Kliwon';
```
3. Geographic Hierarchy
TRN understands Indonesia’s geography:
```
Island (Sumatra, Java, Borneo, ...)
└─ Province (West Java, East Java, ...)
   └─ District (Bandung, Surabaya, ...)
      └─ Station (BMKG_001, BMKG_002, ...)
```
Query at any level:
```sql
-- All data for Java island
SELECT * FROM trn_records WHERE location.island = 'Java';

-- All data for West Java province
SELECT * FROM trn_records WHERE location.province = 'West Java';

-- Specific station
SELECT * FROM trn_records WHERE location.station_id = 'BMKG_001';
```
4. Data Quality Tracking
Every record includes quality metadata:
```rust
pub enum DataQuality {
    Raw,       // Direct from source, no validation
    QCPassed,  // Passed automated quality checks
    Estimated, // Interpolated or satellite-derived
}
```

```sql
-- Query only high-quality data
SELECT * FROM trn_records
WHERE data_quality = 'QCPassed'
  AND confidence >= 0.95;
```
Real-World Example: Carbon MRV
A carbon credit project needs to calculate emissions reductions from renewable energy:
```sql
-- TRN makes this simple
SELECT
  DATE_TRUNC('month', timestamp) AS month,
  location.province,
  AVG(climate.temperature_c) AS avg_temp,
  SUM(carbon.emissions_kg) AS total_emissions,
  AVG(carbon.carbon_intensity_gco2_kwh) AS avg_intensity,
  AVG(finance.electricity_rate_idr_kwh) AS avg_rate,
  COUNT(*) AS observation_count
FROM trn_records
WHERE location.island = 'Java'
  AND timestamp >= '2023-01-01'
  AND data_quality = 'QCPassed'
GROUP BY month, location.province
ORDER BY month DESC, location.province;
```
Without TRN, this would require:
- Joining 3+ tables with different schemas
- Converting units (Celsius, kg, IDR)
- Mapping geographic references
- Handling missing data from different sources
- Filtering by quality flags
With TRN: One query, consistent results.
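For readers who think in code rather than SQL, the same shape of aggregation can be sketched in plain Python. This is only an illustration: the `Obs` dataclass and `monthly_summary` function are hypothetical stand-ins for a flattened TRN record and are not part of the rakit-record package. It filters on the quality flag, groups by month and province, and skips missing domain values the way SQL aggregates skip NULLs.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Obs:
    """Hypothetical flattened view of one TRN record."""
    timestamp: datetime
    province: str
    data_quality: str
    temperature_c: Optional[float] = None  # None when the climate domain is absent
    emissions_kg: Optional[float] = None   # None when the carbon domain is absent


def monthly_summary(records):
    """Group QC-passed records by (YYYY-MM, province) and aggregate each group."""
    groups = defaultdict(list)
    for r in records:
        if r.data_quality != "QCPassed":
            continue  # drop raw and estimated observations
        groups[(r.timestamp.strftime("%Y-%m"), r.province)].append(r)

    summary = {}
    for key, rows in groups.items():
        temps = [r.temperature_c for r in rows if r.temperature_c is not None]
        emissions = [r.emissions_kg for r in rows if r.emissions_kg is not None]
        summary[key] = {
            "avg_temp": sum(temps) / len(temps) if temps else None,
            "total_emissions": sum(emissions) if emissions else None,
            "observation_count": len(rows),
        }
    return summary


# Made-up sample observations for illustration only.
records = [
    Obs(datetime(2023, 1, 5), "West Java", "QCPassed", 28.0, 100.0),
    Obs(datetime(2023, 1, 20), "West Java", "QCPassed", 30.0, None),
    Obs(datetime(2023, 1, 21), "West Java", "Raw", 99.0, 999.0),
    Obs(datetime(2023, 2, 1), "East Java", "QCPassed", None, 50.0),
]
summary = monthly_summary(records)
for key in sorted(summary):
    print(key, summary[key])
```

Because every record already shares units, time zone, and province names, the grouping key needs no per-source mapping step.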
Implementation
TRN is implemented in multiple languages:
Rust (rakit-record crate)
```rust
use rakit_record::{ClimateData, Location, Timestamp, TRNRecord};

let record = TRNRecord {
    timestamp: Timestamp::now(),
    location: Location {
        station_id: "BMKG_001".to_string(),
        province: "West Java".to_string(),
        ..Default::default()
    },
    climate: Some(ClimateData {
        temperature_c: 28.5,
        humidity_pct: 75.0,
        ..Default::default()
    }),
    ..Default::default()
};

// Serialize to Parquet
record.to_parquet("output.parquet")?;
```
Python (rakit-record package)
```python
from rakit_record import TRNRecord, Location, ClimateData

record = TRNRecord(
    timestamp="2024-03-17T15:00:00Z",
    location=Location(
        station_id="BMKG_001",
        province="West Java",
        latitude=-6.9271,
        longitude=107.6411,
    ),
    climate=ClimateData(
        temperature_c=28.5,
        humidity_pct=75.0,
        precipitation_mm=2.3,
    ),
)

# Write to Parquet
record.to_parquet("output.parquet")
```
JavaScript (rakit-record.js)
```javascript
import { TRNRecord } from 'rakit-record';

const record = new TRNRecord({
  timestamp: new Date(),
  location: {
    stationId: 'BMKG_001',
    province: 'West Java',
    latitude: -6.9271,
    longitude: 107.6411,
  },
  climate: {
    temperatureC: 28.5,
    humidityPct: 75.0,
    precipitationMm: 2.3,
  },
});

// Write to Parquet
await record.toParquet('output.parquet');
```
Performance
TRN is optimized for analytics:
- Compression: 2.3 GB raw → ~500 MB TRN Parquet (about 78% reduction)
- Query speed: 100M records queried in <1 second
- Storage: $0.023/GB/month on S3
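The compression and storage figures can be sanity-checked with a few lines of arithmetic. This sketch uses only the numbers quoted in the list and assumes decimal units (1 GB = 1000 MB) and a flat per-GB price; the variable names are illustrative.

```python
# Figures quoted in the list above.
raw_gb = 2.3                   # raw dataset size
parquet_gb = 0.5               # approximate size as TRN Parquet (~500 MB)
price_usd_per_gb_month = 0.023

# Fraction of the raw size saved by the Parquet encoding.
reduction_pct = (1 - parquet_gb / raw_gb) * 100

# Monthly storage cost before and after conversion.
cost_raw = raw_gb * price_usd_per_gb_month
cost_parquet = parquet_gb * price_usd_per_gb_month

print(f"reduction: {reduction_pct:.1f}%")
print(f"monthly cost: ${cost_raw:.4f} raw vs ${cost_parquet:.4f} Parquet")
```

At this scale the absolute cost is negligible either way; the ratio matters more once datasets grow by several orders of magnitude.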
Adoption
TRN is already used by:
- Carbon MRV projects: Standardizing emissions reporting
- Climate research: Correlating weather with environmental outcomes
- Financial institutions: Integrating climate risk into credit models
- Government agencies: Unified data for policy analysis
What’s Next?
We’re working on:
- Real-time TRN streaming: Process data as it arrives
- TRN federation: Query across multiple data providers
- TRN versioning: Handle schema evolution gracefully
- TRN compression: Further reduce storage costs
Get Started
GARUDA’s API returns data in TRN format. Download the Parquet files and start querying:
```bash
# Download free TRN climate data (-L follows the redirect GitHub uses for release assets)
curl -LO https://github.com/teknorakit/garuda-datasets/releases/download/v1.0.0/bmkg_climate_2020_2024.parquet

# Query with DuckDB
duckdb -c "SELECT saka_sasih, AVG(temperature_c) FROM 'bmkg_climate_2020_2024.parquet' GROUP BY saka_sasih;"
```
Questions about TRN? Join our GitHub Discussions or email support@teknorakit.com.