The sources discuss seven distinct problem statements for the "Bharatiya Antariksh Hackathon
2025." Here's a summary of each:
• Problem Statement 1: Forest Fire Prediction and Spread Simulation
o Goal: To develop a system that can predict forest fires for the next day and simulate
their spread dynamics (area and depth) for up to 12 hours if no action is taken.
o Why it's important: Forest fires cause significant biodiversity losses, carbon
emissions, and economic damages, and traditional detection systems are reactive,
not predictive.
o Objectives: Create a probability map of fire risk for the next day and then simulate
the fire's spread within 1 to 6 hours from high-probability areas.
o Methodology: Uses machine learning and deep learning for prediction maps, and
cellular automata (CA) or other ML techniques for spread simulation.
o Key Data: Burnt area, surface temperature, precipitation, weather variables (wind,
humidity), elevation, slope, land cover maps, and thermal anomalies/fire data.
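The cellular-automata option for spread simulation can be sketched as follows. This is a minimal illustration under stated assumptions, not the hackathon's reference method: the three cell states, the four-neighbour rule, the one-step burn-out, and the names `step` and `simulate` are all hypothetical. A real model would also weight spread by wind, slope, and land cover.

```python
import random

# Hypothetical cell states for the illustration.
UNBURNT, BURNING, BURNT = 0, 1, 2


def step(grid, risk, rng):
    """Advance the fire by one time step.

    grid: 2D list of cell states.
    risk: 2D list of per-cell ignition probabilities in [0, 1]
          (assumed here to come from the next-day probability map).
    """
    rows, cols = len(grid), len(grid[0])
    new = [row[:] for row in grid]
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != BURNING:
                continue
            new[r][c] = BURNT  # assumption: a cell burns out after one step
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                nr, nc = r + dr, c + dc
                inside = 0 <= nr < rows and 0 <= nc < cols
                if inside and grid[nr][nc] == UNBURNT:
                    if rng.random() < risk[nr][nc]:
                        new[nr][nc] = BURNING
    return new


def simulate(grid, risk, steps, seed=0):
    """Run `steps` updates and return the final grid (input is not modified)."""
    rng = random.Random(seed)
    for _ in range(steps):
        grid = step(grid, risk, rng)
    return grid
```

How many updates correspond to one hour depends on the cell size and the assumed rate of spread, so that mapping has to be calibrated separately.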
• Problem Statement 2: AI-Based Helpbot for Information Retrieval
o Goal: To create an AI-based helpbot capable of scanning and processing all
available information (static and dynamic) from the www.mosdac.gov.in web portal.
This portal archives and disseminates satellite data and related services.
o Objectives: Leverage natural language processing (NLP) and machine learning (ML)
to understand user queries (e.g., "difference between INSAT 3D and 3DS,"
"geospatial extent of a product") and provide accurate, context-aware answers in a
conversational mode, similar to ChatGPT. The solution should also be modular for
future expansion.
o Key Data: Documents (PDF, TXT), static web content, tables, and metadata tags
hosted on the website.
o Suggested Tools: Retrieval Augmented Generation (RAG), Large Language Models
(LLMs), spaCy, NLTK, Dialogflow, Node.js, Rasa.
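The retrieval half of a RAG pipeline can be illustrated with a small sketch. It assumes a plain bag-of-words cosine similarity in place of learned embeddings, and the passages in `PASSAGES` are invented placeholders, not actual portal content. The function names are hypothetical as well.

```python
import math
import re
from collections import Counter

# Invented placeholder passages; a real system would index portal documents.
PASSAGES = [
    "Product catalogue: geospatial extent and temporal coverage of each product.",
    "How to register an account and reset a password on the portal.",
    "Comparison notes for INSAT 3D and INSAT 3DS instruments.",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, passages, k=1):
    """Return up to k passages ranked by similarity to the query."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(p))), p) for p in passages]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [p for score, p in scored[:k] if score > 0]
```

In a full helpbot the retrieved passages would be placed in the LLM prompt as context, so that answers stay grounded in the portal's own documents.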
• Problem Statement 3: Monitoring Air Pollution from Space
o Goal: To monitor air pollution from space using an integrated approach that
combines satellite observations and AI/ML techniques.
o Why it's important: Air pollution is a growing global problem, and particulate matter
(PM1, PM2.5, PM4, PM10) interacts effectively with radiation.
o Objective: Estimate particulate matter (PM2.5 and PM10) concentration over the
Indian region using radiance measurements from INSAT 3D series satellites.
o Methodology: Establish a relationship between satellite-measured radiance and
surface-level particulate matter concentrations. This involves converting satellite
radiance to reflectance over clear-sky pixels and training AI/ML models (e.g.,
Random Forest) using satellite data, ground-based PM observations, and
atmospheric boundary layer parameters.
o Key Data: Radiance data from INSAT 3D series, PM concentration from ground-based
networks (e.g., CPCB), and boundary layer parameters from reanalysis datasets.
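The radiance-to-reflectance step can be sketched with the standard top-of-atmosphere reflectance formula, rho = pi * L * d^2 / (E_sun * cos(theta_s)). The function and parameter names below are hypothetical, and the band solar irradiance `esun` must come from the sensor documentation; nothing here is specific to the INSAT 3D calibration.

```python
import math


def toa_reflectance(radiance, esun, sun_zenith_deg, earth_sun_au=1.0):
    """Convert at-sensor radiance to top-of-atmosphere reflectance.

    radiance:       band radiance L (same spectral units as esun)
    esun:           band-mean solar irradiance E_sun (from sensor docs)
    sun_zenith_deg: solar zenith angle at the pixel, in degrees
    earth_sun_au:   Earth-Sun distance d in astronomical units
    """
    cos_sz = math.cos(math.radians(sun_zenith_deg))
    if cos_sz <= 0.0:
        raise ValueError("sun is at or below the horizon for this pixel")
    return math.pi * radiance * earth_sun_au ** 2 / (esun * cos_sz)
```

The resulting clear-sky reflectance, together with the ground PM observations and boundary layer parameters, would then form the feature table for a regression model such as Random Forest.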
• Problem Statement 4: Designing Chain of Thought Based LLM System for Spatial Analysis
o Goal: To design a chain-of-thought-based Large Language Model (LLM) system for
solving complex spatial analysis tasks through intelligent geoprocessing
orchestration.
o Why it's important: Standard Geographic Information System (GIS) workflows
require significant domain expertise, and the aim is to automate these using natural
language queries.
o Objectives: Develop a reasoning-enabled framework that combines LLMs and
geoprocessing APIs to automate multi-step GIS workflows from natural language
queries. This includes integrating data from heterogeneous sources, building a user
interface (chatbot and map-based), and demonstrating tasks like flood mapping or
site selection.
o Expected Output: A web-based or desktop application (e.g., QGIS plugin) where the
LLM can "reason" through the steps required to generate geospatial outputs.
o Key Data/Tools: Open-source satellite imagery (Bhuvan), vector data
(OpenStreetMap), open-source geoprocessing tools (QGIS, GDAL APIs), Google Earth
Engine APIs, and open LLM models (Llama, Mistral 7B). Retrieval Augmented
Generation (RAG) agents are also suggested.
• Problem Statement 5: Chase the Cloud: Leveraging Diffusion Models for Cloud Motion
Prediction
o Goal: To leverage diffusion models for predicting cloud motion using INSAT
3D/3DR/3DS imagery.
o Why it's important: Diffusion models show promise in generative tasks, and accurate
cloud motion prediction is crucial for "nowcasting" problems (e.g., lightning or
thunderstorm prediction) where real-time satellite imagery can be limited.
o Objective: Develop a generative model or diffusion network that uses a past
sequence of satellite images to predict future satellite images by conditioning noise
on past frames.
o Expected Solution: Given the last 3 hours (six frames) of imagery, predict a minimum
of two frames (1 hour ahead), with potential to predict up to 6-7 hours ahead.
o Key Data: Multi-channel INSAT satellite imagery (visible, shortwave infrared,
thermal infrared, water vapor) from the MOSDAC website, with 30-minute temporal
and 4km spatial resolution.
o Suggested Frameworks: PyTorch, TensorFlow, DDPM, DDIM, latent diffusion models.
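The data preparation implied above (six past frames in, two future frames out) and the forward noising used to train a DDPM-style model can be sketched as follows. This is an illustration under assumptions: the function names are hypothetical, frames are treated as flat lists of pixel values, and the linear noise schedule (1e-4 to 0.02 over 1000 steps) is a commonly used default rather than anything the problem statement prescribes.

```python
import math
import random


def make_windows(frames, n_past=6, n_future=2):
    """Split a frame sequence into (past, future) training pairs."""
    windows = []
    for start in range(len(frames) - n_past - n_future + 1):
        past = frames[start:start + n_past]
        future = frames[start + n_past:start + n_past + n_future]
        windows.append((past, future))
    return windows


def alpha_bar(t, num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta) over the first t steps (linear schedule)."""
    prod = 1.0
    for i in range(t):
        beta = beta_start + (beta_end - beta_start) * i / (num_steps - 1)
        prod *= 1.0 - beta
    return prod


def add_noise(pixels, t, rng):
    """Forward diffusion: x_t = sqrt(a_bar) * x_0 + sqrt(1 - a_bar) * noise."""
    ab = alpha_bar(t)
    return [math.sqrt(ab) * x + math.sqrt(1.0 - ab) * rng.gauss(0.0, 1.0)
            for x in pixels]
```

During training, the denoising network would receive the noised future frames together with the clean past frames as conditioning input, and learn to predict the added noise.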
• Problem Statement 6: AI/ML-Driven Automated Feature Detection and Change Analysis
This problem statement has three sub-problems, and participants are required to choose
only one:
o Sub-problem A: Glacial Lakes
▪ Goal: Develop an AI/ML model for automated detection and delineation of
glacial lakes in the Indian Himalayas and track their multi-temporal changes.
▪ Why it's important: Melting glaciers are forming and expanding glacial lakes,
posing a significant risk of glacial lake outburst floods (GLOFs) to
downstream communities.
▪ Challenges: Delineation is tricky due to shadows, cloud cover, snow, and ice.
▪ Key Data: Multisource satellite imagery (Sentinel-1, Sentinel-2, Landsat
5/7/8/9) available via Google Earth Engine, along with an existing glacial lake
inventory.
o Sub-problem B: Road Networks (National Highway Centerlines)
▪ Goal: Automatically extract center lines of national highways from high-
resolution LISS-4 satellite data.
▪ Challenges: Varying road widths (10-60m), shadows, occlusions, terrain
variations, and image resolution limitations.
▪ Key Data: LISS-4 imagery from the Bhuvan website and national highway
shapefiles from OpenStreetMap for reference.
o Sub-problem C: Urban Drainage Systems (Stream Centerlines)
▪ Goal: Extract center lines of streams from urban areas using LISS-4 data.
▪ Challenges: Similar to road networks – varying stream widths, shadows,
connectivity, occlusions, terrain variation, and image resolution limitations.
▪ Key Data: LISS-4 imagery, DEM data, and stream shapefiles from
OpenStreetMap for reference.
• Problem Statement 7: Air Quality Visualizer and Forecast App
o Goal: To develop an algorithm for a mobile application that provides granular, real-
time, and predictive air quality information, specifically for smaller cities or rural
areas.
o Why it's important: Current air quality forecasts primarily cover major metropolitan
cities, leaving a data gap for rural areas.
o Objectives: Display real-time AQI, visualize historical trends, forecast AQI for 24-72
hours, map pollution sources, and provide health recommendations.
o Expected Outcome: Real-time data visualization (e.g., AQI heat maps) and push
alerts based on pollution thresholds.
o Key Data: ISRO satellite data, CPCB and IMD data sets (historical data available on
Kaggle and ODC data portal).
o Suggested Tools: Mobile UI frameworks (React Native, Flutter), Google Firebase for
backend and notifications, time series predictive models (e.g., ARMA model), and
mapping APIs (Google Maps, Bhuvan Map).
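As a rough illustration of the time-series forecasting idea, the sketch below fits an AR(1) model, the simplest member of the ARMA family, by ordinary least squares and rolls it forward. The function names are hypothetical, and a production forecast would likely need higher-order terms, seasonality, and weather covariates.

```python
def fit_ar1(series):
    """Fit x[t] = c + phi * x[t-1] by least squares; return (c, phi)."""
    x = series[:-1]
    y = series[1:]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("series is constant; AR(1) slope is undefined")
    phi = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sxx
    c = mean_y - phi * mean_x
    return c, phi


def forecast(series, steps):
    """Forecast `steps` values ahead by iterating the fitted AR(1) model."""
    c, phi = fit_ar1(series)
    out = []
    last = series[-1]
    for _ in range(steps):
        last = c + phi * last
        out.append(last)
    return out
```

With hourly AQI readings as input, `forecast(series, 24)` through `forecast(series, 72)` would cover the 24-72 hour horizon.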
Here's a summary of all the problem statements discussed in "Problem Statement Explainer Session -
2" of the "Bharatiya Antariksh Hackathon 2025":
1. Problem Statement 8: Novel Approaches for Optimizing Deep Learning in Earth
Observation with Imbalanced Data
o Goal: To address the significant challenge of class imbalance in satellite imagery
datasets used for deep learning segmentation, where some classes (e.g., small water
bodies, wetlands) are vastly underrepresented compared to dominant ones (e.g.,
vegetation, urban areas). This imbalance often causes models to neglect the minority
classes.
o Objective: Design and develop novel optimizers, loss functions, or model
architectures that can effectively handle and overcome this class imbalance
problem. The solution should also aim to mitigate exploding and vanishing gradients
during training.
o Expected Outcome: Novel techniques that significantly improve the accurate
detection of rare classes in satellite imagery.
o Data: Two datasets are provided: a "roads" dataset with three classes (one making
up 88% of the data) and a "land cover" dataset with five classes (vegetation at 67%),
both exhibiting clear class imbalance.
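A common baseline against which novel methods could be compared is a class-weighted focal loss. The sketch below is only that baseline, not a novel technique: the inverse-frequency weighting and the focal term are established methods, the function names are hypothetical, and the pixel counts in the usage note are invented (only the 88% share comes from the text above).

```python
import math


def class_weights(counts):
    """Inverse-frequency weights: total / (num_classes * count)."""
    total = sum(counts)
    k = len(counts)
    return [total / (k * c) for c in counts]


def focal_loss(probs, target, weights, gamma=2.0):
    """Weighted focal loss for one pixel.

    probs:   predicted class probabilities for the pixel
    target:  index of the true class
    weights: per-class weights, e.g. from class_weights()
    gamma:   focusing parameter; gamma = 0 gives weighted cross-entropy
    """
    p_t = probs[target]
    return -weights[target] * (1.0 - p_t) ** gamma * math.log(p_t)
```

For example, `class_weights([88, 8, 4])` gives the dominant class a weight below 1 and the rarest class the largest weight, so errors on rare classes contribute more to the total loss.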
2. Problem Statement 9: AI/ML-Based Algorithm for Identifying Tropical Cloud Clusters (TCCs)
o Goal: To develop an AI/ML algorithm capable of automatically identifying and
tracking Tropical Cloud Clusters (TCCs) using half-hourly satellite data from INSAT
observations.
o Why it's important: TCCs are crucial for understanding rainfall distribution and are
vital indicators of cyclogenesis (the formation of cyclones), potentially leading to
severe storms.
o Objective: The algorithm must detect TCCs by applying specific infrared brightness
temperature thresholds (e.g., less than 218 Kelvin for the North Indian Ocean, less
than 221 Kelvin for the South Indian Ocean). It also needs to determine the size (in
km²), intensity (in brightness temperature), and independent nature of each cloud
cluster, and track their movement.
o Expected Outcome: Outputting physical parameters of TCCs, including position
(latitude/longitude), number of pixels, mean, and maximum temperature, in NetCDF
or HDF5 format. The algorithm should also manage persistence checks and distinguish
between closely located clusters.
o Data: INSAT 3D infrared brightness temperature data from the MOSDAC data portal.
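The thresholding and cluster-separation steps can be sketched with a simple connected-component search. This is a minimal illustration under assumptions: the function name and output keys are hypothetical, clusters are defined by four-neighbour connectivity, and positions are reported as grid row/column rather than latitude/longitude.

```python
from collections import deque


def find_clusters(bt, threshold_k=218.0):
    """Group pixels colder than threshold_k into connected clusters.

    bt: 2D list of infrared brightness temperatures in Kelvin.
    Returns one dict of summary statistics per cluster.
    """
    rows, cols = len(bt), len(bt[0])
    seen = [[False] * cols for _ in range(rows)]
    clusters = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or bt[r][c] >= threshold_k:
                continue
            # Breadth-first search over cold neighbours of this seed pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and bt[ny][nx] < threshold_k):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            temps = [bt[y][x] for y, x in pixels]
            n = len(pixels)
            clusters.append({
                "n_pixels": n,
                "centroid_rc": (sum(y for y, _ in pixels) / n,
                                sum(x for _, x in pixels) / n),
                "mean_bt": sum(temps) / n,
                "min_bt": min(temps),
                "max_bt": max(temps),
            })
    return clusters
```

Converting pixel counts to km² and centroids to latitude/longitude requires the image's geolocation grid, and tracking would match clusters between consecutive half-hourly frames.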
3. Problem Statement 10: Identifying Coronal Mass Ejection (CME) Events from Particle Data
o Goal: To develop an algorithm that can identify "halo CME events" using particle
data from the SWIS (Solar Wind Ion Spectrometer) instrument aboard India's Aditya-
L1 mission.
o Why it's important: The sun's emissions (charged particles) significantly impact
Earth's magnetosphere, causing space weather phenomena like geomagnetic
storms. Predicting these events, especially using solely plasma parameters, is a
current challenge.
o Objective: Create an algorithm that can detect sudden jumps or transient events
within the time series data of energy histograms for alpha particles and protons
originating from the sun.
o Expected Solution: An algorithm that analyzes time series data to identify event
occurrences, with results verifiable against existing CME catalogs like Richardson and
Cane, or CACTus.
o Data: Time series data containing energy histograms, proton velocity, number
density, temperature, and thermal speed of plasma from the SWIS instrument,
available via the ISSDC data dissemination center.
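One simple way to flag sudden jumps in a plasma time series is a rolling z-score against the preceding samples. The sketch below is an assumption-laden illustration, not the expected solution: the window length, threshold, and function name are hypothetical, and a real detector would combine several parameters and check event duration.

```python
from statistics import mean, stdev


def detect_jumps(values, window=5, z_threshold=3.0, min_sigma=1e-9):
    """Return indices where a value departs sharply from its recent baseline.

    Each value is compared with the mean and standard deviation of the
    `window` samples immediately before it.
    """
    events = []
    for i in range(window, len(values)):
        baseline = values[i - window:i]
        mu = mean(baseline)
        # Floor sigma so a perfectly flat baseline does not divide by zero.
        sigma = max(stdev(baseline), min_sigma)
        if abs(values[i] - mu) / sigma > z_threshold:
            events.append(i)
    return events
```

Because the baseline absorbs the new level after a jump, this flags the onset of a transient rather than its full duration. Flagged times could then be compared against the catalog entries.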
4. Problem Statement 11: Novel Method to Detect Landslides and Boulders on the Moon
o Goal: To devise a novel method for the detection of landslides and boulders on the
lunar surface using imagery from Chandrayaan missions.
o Why it's important: Boulders and landslides represent significant obstacles for
future robotic missions on the Moon, making their accurate detection crucial for safe
navigation and planning.
o Objective: Given Chandrayaan-1, -2, or -3 images, the method should accurately
detect and count the number of boulders and landslides. A key aspect is
distinguishing elevated boulders from landslides that occur on slopes and flow along
the surface. For landslides, identifying the entire feature, including its point of origin,
is highly valued.
o Expected Outcome: A solution that can precisely delineate and enumerate lunar
boulders and landslides, with validation based on maximizing true positive
detections.
o Data: Panchromatic images from Chandrayaan-1, -2, or -3 missions (taken at 4.4-2.7
micrometer wavelength) available through the ISSDC portal.
5. Problem Statement 12: Dual Image Super Resolution for High-Resolution Optical Satellite
Imagery and Blind Evaluation
o Goal: To generate super-resolved (higher resolution) images from two low-
resolution satellite images and, importantly, develop techniques for their blind visual
assessment (evaluation without a reference image).
o Why it's important: High-resolution satellite data acquisition is costly. Satellites often
use multiple, staggered low-resolution cameras that produce aliased signals, which
can be leveraged for computational super-resolution.
o Objectives:
▪ Produce a 2x super-resolved image from two given low-resolution images
(LR1, LR2) that are specifically simulated to be shifted by 0.5 pixels and
include degradation (MTF) and noise.
▪ Develop a blind assessment technique that evaluates the quality of super-
resolved images without requiring a reference image, ensuring sensitivity to
degradation and correlation with proxy scores like SSIM.
o Expected Solution: Solutions can employ either classical image processing
techniques (e.g., degradation function estimation, image registration, interpolation)
or modern deep learning methods.
o Data: WorldView-3 open-source dataset (0.5m resolution) from the SpaceNet AWS
platform will be provided for training and testing.
6. Problem Statement 13: Generation of High-Resolution Lunar Digital Elevation Model (DEM)
using Photoclinometry
o Goal: To generate a high-resolution lunar Digital Elevation Model (DEM) from single
(mono) lunar images, utilizing the technique of photoclinometry (shape from
shading).
o Why it's important: This addresses scenarios where stereo images are not available,
requiring the generation of DEMs solely from single images.
o Objective: The core objective is to produce a depth map from a single lunar image
and then convert it into an absolute digital elevation model. This process requires
additional data, including solar parameters (sun azimuth and elevation) and the
camera's viewing geometry, along with a reference DEM for comparison.
o Expected Outcome: A relative DEM derived from a single image, subsequently
transformed into an absolute DEM of the lunar surface, which can then be compared
against a reference DEM of the same area.
o Data: Chandrayaan-2 TMC2 data sets, including radiometrically corrected images
from both TMC1 and TMC2, accessible via the ISSDC web portal.
7. Problem Statement 14: Robust Change Detection Monitoring and Alert System
o Goal: To develop a robust system for change detection, monitoring, and alerts
based on user-defined Areas of Interest (AOI) and multi-temporal satellite imagery.
o Why it's important: Leveraging frequent satellite coverage, this system aims to
monitor wide areas for changes such as deforestation, encroachment, illegal land
occupation, or water body reclamation, and then alert relevant authorities or users.
o Objectives:
▪ Enable users to define AOIs on a map and subscribe to specific change alerts.
▪ Automate the processing of GIS and multi-temporal satellite data to identify
changes.
▪ Crucially, the system must include integrated cloud and shadow masking to
prevent false alerts caused by atmospheric conditions.
o Expected Outcomes: A fully automated, user-friendly, app-based platform with an
online interface for AOI definition. The system should automatically download and
process data, generate change events, and allow export of results in GIS-compatible
formats (GeoJSON, shapefiles). It should be scalable and flexible for future algorithm
and alert type integration.
o Data: Multi-temporal satellite imagery, with Google Earth Engine APIs suggested for
initial experimentation, data download, and pre-processing.
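The role of cloud and shadow masking in suppressing false alerts can be sketched with a per-pixel comparison of two index images (for example, a vegetation index). The function name, the 0.2 difference threshold, and the use of boolean masks are assumptions made for illustration only.

```python
def changed_fraction(before, after, mask_before, mask_after,
                     diff_threshold=0.2):
    """Fraction of clear pixels whose value changed by more than the threshold.

    before, after:           2D lists of index values for the two dates
    mask_before, mask_after: 2D lists of booleans, True where the pixel is
                             cloud or shadow and must be ignored
    Returns None when no pixel is clear on both dates.
    """
    valid = 0
    changed = 0
    for r in range(len(before)):
        for c in range(len(before[0])):
            if mask_before[r][c] or mask_after[r][c]:
                continue  # skip pixels obscured on either date
            valid += 1
            if abs(after[r][c] - before[r][c]) > diff_threshold:
                changed += 1
    return changed / valid if valid else None
```

An alert for an AOI would fire only when the returned fraction exceeds a user-chosen level, and a `None` result would signal that the scene was too cloudy to evaluate.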