Ensemble ML Methods for Predictive Analytics in South Asian Market Contexts

Abstract

Off-the-shelf forecasting models, trained predominantly on Western market data, systematically underperform in South Asian contexts where seasonality, informal economic activity, and data sparsity differ markedly. We develop ensemble methods tailored to these conditions and demonstrate consistent gains in demand forecasting across retail, agriculture, and financial sectors.

Problem

Emerging-market organizations are often sold generic predictive analytics products that assume clean, dense data distributed like that of Western markets. In practice, the data is sparse and irregular, festivals and monsoon cycles dominate seasonality, and informal-sector signals are missing entirely, so the models forecast poorly exactly where decisions matter.

Approach

Ensemble architectures that combine gradient-boosted trees with structured seasonal models to handle sparse, irregular series
Feature engineering grounded in regional seasonality (festival calendars, agricultural cycles) rather than generic date features
Evaluation across retail, agriculture, and finance datasets drawn from South and Southeast Asian markets
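
As a rough illustration of the first two points, the sketch below pairs a structured seasonal model with boosted trees trained on its residuals, using regional calendar features. It is a minimal stand-in under stated assumptions, not the implementation from the paper: the seasonal model is a per-month mean, the boosted trees are hand-rolled one-split stumps, the residual-stacking scheme is one of several ways to combine the two, and the festival dates, monsoon window, function names, and synthetic demand series are all hypothetical.

```python
from datetime import date, timedelta
from statistics import mean

# Hypothetical festival dates, for illustration only; a real calendar would
# come from a maintained regional source because these dates move each year.
FESTIVALS = [date(2022, 10, 24), date(2023, 11, 12)]
MONSOON_MONTHS = {6, 7, 8, 9}  # assumed monsoon window


def features(d):
    """Regional calendar features: capped days to next festival, monsoon flag."""
    ahead = [(f - d).days for f in FESTIVALS if f >= d]
    return [min(min(ahead), 60) if ahead else 60,
            1 if d.month in MONSOON_MONTHS else 0]


def fit_seasonal(dates, y):
    """Structured seasonal model: mean of observed values per calendar month."""
    overall = mean(v for v in y if v is not None)
    by_month = {}
    for d, v in zip(dates, y):
        if v is not None:  # gaps are skipped rather than imputed
            by_month.setdefault(d.month, []).append(v)
    profile = {m: mean(vs) for m, vs in by_month.items()}
    return lambda d: profile.get(d.month, overall)


def fit_stump(X, r):
    """Best single-split regression stump on residuals r (squared error)."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [ri for row, ri in zip(X, r) if row[j] <= t]
            right = [ri for row, ri in zip(X, r) if row[j] > t]
            if not left or not right:
                continue
            lm, rm = mean(left), mean(right)
            sse = (sum((ri - lm) ** 2 for ri in left)
                   + sum((ri - rm) ** 2 for ri in right))
            if best is None or sse < best[0]:
                best = (sse, j, t, lm, rm)
    return best[1:] if best else None


def fit_boosted(X, r, rounds=20, lr=0.3):
    """Gradient boosting for squared loss: each stump fits the current residual."""
    stumps, resid = [], list(r)
    for _ in range(rounds):
        s = fit_stump(X, resid)
        if s is None:
            break
        j, t, lm, rm = s
        stumps.append(s)
        resid = [ri - lr * (lm if row[j] <= t else rm)
                 for row, ri in zip(X, resid)]
    return lambda x: sum(lr * (lm if x[j] <= t else rm)
                         for j, t, lm, rm in stumps)


def fit_ensemble(dates, y):
    """Seasonal model plus boosted stumps on its residuals; None marks a gap."""
    seasonal = fit_seasonal(dates, y)
    obs = [(d, v) for d, v in zip(dates, y) if v is not None]
    boost = fit_boosted([features(d) for d, _ in obs],
                        [v - seasonal(d) for d, v in obs])
    return lambda d: seasonal(d) + boost(features(d))


# Synthetic weekly demand with a festival uplift, a monsoon uplift, and gaps.
dates = [date(2022, 1, 3) + timedelta(weeks=k) for k in range(52)]
y = []
for k, d in enumerate(dates):
    days_to_festival, monsoon = features(d)
    demand = 100.0 + 20.0 * monsoon + (50.0 if days_to_festival <= 7 else 0.0)
    y.append(None if k % 9 == 0 else demand)  # every ninth week is missing

model = fit_ensemble(dates, y)
festival_week = model(date(2022, 10, 24))
ordinary_week = model(date(2022, 10, 3))
print(round(festival_week, 1), round(ordinary_week, 1))
```

The per-month mean alone gives every October week the same forecast, so the festival spike can only come from the boosted stage, which sees the days-to-festival feature. A generic month or day-of-week feature would not carry that signal, because the festival falls on a different calendar date each year. In practice a library implementation of gradient-boosted trees and a richer seasonal model would replace both hand-rolled pieces.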

Results

Consistent forecasting improvements over off-the-shelf baselines across all three sectors
Largest gains on sparse and highly seasonal series, the cases where generic models fail most
Robustness to missing data, reducing the operational burden of pipeline gaps

Why It Matters

Better forecasts translate directly into less waste, smarter inventory, and fairer resource allocation in markets that are routinely underserved by mainstream ML tooling. The work underpins the applied case studies in my courses on practical machine learning.

Resources

Code, the curated benchmark datasets, and the published paper are available through the links above.