Overview
A benchmark for traffic that is not vehicle-only.
The HetroD Challenge evaluates whether simulation models can reproduce heterogeneous urban traffic where vehicles, scooters, cyclists, and pedestrians interact in close proximity.
Participants submit rollout results for the hidden test split. The organizers compute the final leaderboard and publish it on September 1, 2026.
Goals and Expected Outcomes
Realistic, safe, and type-aware simulation.
- Advance closed-loop multi-agent simulation in dense heterogeneous traffic.
- Measure realism with WOSAC-style simulation metrics while avoiding vehicle-centric bias.
- Highlight VRU-aware interaction failures, including collisions and cross-type time-to-collision risks.
- Provide a clear public benchmark for future trajectory and traffic simulation research.
Task Definition
Generate 8-second closed-loop rollouts.
Given an initial heterogeneous traffic scene, participants must generate future multi-agent rollouts for vehicles, two-wheelers, and pedestrians. Rollouts should preserve realistic kinematics, plausible interactions, and valid map-region behavior.
- Output: closed-loop trajectories for all required agents.
- Horizon: 80 future steps at 10 Hz, equivalent to 8 seconds.
- Evaluation: realism, interaction safety, region validity, and trajectory accuracy.
Challenge Timeline
One development window and one final release.
Start
June 1, 2026
Dataset access, validation tools, and submission instructions become available.
Development
June 1 - September 1
Participants train on the train split, validate against the validation split's ground truth, and prepare one final rollout submission for the hidden test split.
Release
September 1, 2026
The challenge closes and the official leaderboard is displayed.
Dataset and Splits
HetroD train, validation, and test in ScenarioNet format.
We provide HetroD scenes in the ScenarioNet format, which defines a unified traffic scenario description with HD maps and object annotations.
- Train: includes ground-truth trajectories for model training.
- Validation: includes ground truth for local development and ablation.
- Test: ground truth is hidden. Participants submit rollout results in the required format.
Evaluation Metrics
WOSAC realism, adapted for heterogeneous traffic.
HetroD keeps the WOSAC simulation-realism structure, then adds VRU-aware and type-aware modifications so scooters, cyclists, and pedestrians are not diluted by vehicle-dominant distributions.
1. Kinematic Score
Balanced by agent type.
- Linear Speed
- Linear Acceleration
- Angular Speed
- Angular Acceleration
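As a rough illustration of the four kinematic features listed above, the sketch below derives them from a trajectory by finite differences at the 10 Hz step rate. This is not the official scoring code: the function name `kinematic_features`, the use of 3D displacement for speed, and the yaw wrapping are assumptions of this sketch, and the type-balanced aggregation is not shown.

```python
import numpy as np

DT = 0.1  # seconds per step at 10 Hz, as stated in the task definition


def kinematic_features(states: np.ndarray, dt: float = DT) -> dict:
    """Finite-difference kinematics for trajectories of shape [..., T, 4].

    The last dimension is assumed to be [x, y, z, yaw], matching the
    submission format. Hypothetical helper, not the organizers' code.
    """
    xyz = states[..., :3]
    yaw = states[..., 3]

    # Linear speed: displacement magnitude between consecutive steps.
    velocity = np.diff(xyz, axis=-2) / dt                 # [..., T-1, 3]
    linear_speed = np.linalg.norm(velocity, axis=-1)      # [..., T-1]
    linear_accel = np.diff(linear_speed, axis=-1) / dt    # [..., T-2]

    # Angular speed: yaw difference wrapped into [-pi, pi) so that a jump
    # across the +/- pi boundary is not read as a full rotation.
    dyaw = np.diff(yaw, axis=-1)
    dyaw = (dyaw + np.pi) % (2.0 * np.pi) - np.pi
    angular_speed = dyaw / dt                             # [..., T-1]
    angular_accel = np.diff(angular_speed, axis=-1) / dt  # [..., T-2]

    return {
        "linear_speed": linear_speed,
        "linear_acceleration": linear_accel,
        "angular_speed": angular_speed,
        "angular_acceleration": angular_accel,
    }


if __name__ == "__main__":
    # Two agents moving along x at a constant 0.5 m per step.
    demo = np.zeros((2, 80, 4), dtype=np.float32)
    demo[..., 0] = 0.5 * np.arange(80)
    feats = kinematic_features(demo)
    print({name: value.shape for name, value in feats.items()})
```

Because the function only touches the last two axes, it can be applied directly to a full `simulated_states` array of shape [32, A, 80, 4].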
2. Interaction Score
VRU-aware and cross-type safety.
- Distance to Nearest Object
- Overall + VRU Collision
- Overall + Cross-type TTC
3. Region / Map Score
Valid regions depend on agent type.
- Type-aware Distance to Valid Region
- Type-aware Invalid Region Rate
4. Trajectory Score
Simple accuracy indicators.
- ADE
- minADE
- Vehicle ADE
- Two-wheeler ADE
- Pedestrian ADE
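The trajectory indicators above can be sketched in a few lines of NumPy. The exact definitions used by the organizers are not given here, so this example assumes that ADE is the mean Euclidean xy error over rollouts, agents, and steps, and that minADE is the lowest scene-level ADE among the rollouts. The function name `displacement_metrics` and the `agent_mask` argument are hypothetical; passing a per-type mask is one way the vehicle, two-wheeler, and pedestrian variants could be obtained.

```python
import numpy as np


def displacement_metrics(sim_xy, gt_xy, agent_mask=None):
    """ADE and minADE under the assumptions stated in the lead-in.

    sim_xy:     [R, A, T, 2] simulated xy positions for R rollouts.
    gt_xy:      [A, T, 2] ground-truth xy positions.
    agent_mask: optional [A] boolean array selecting a subset of agents,
                for example all pedestrians (hypothetical argument).
    """
    sim_xy = np.asarray(sim_xy, dtype=np.float64)
    gt_xy = np.asarray(gt_xy, dtype=np.float64)
    if agent_mask is not None:
        agent_mask = np.asarray(agent_mask, dtype=bool)
        sim_xy = sim_xy[:, agent_mask]
        gt_xy = gt_xy[agent_mask]

    # Per-step Euclidean error for every rollout and agent: [R, A, T].
    dist = np.linalg.norm(sim_xy - gt_xy[None], axis=-1)
    if dist.size == 0:
        # No agents selected: the metrics are undefined.
        return {"ade": float("nan"), "min_ade": float("nan")}

    # Scene-level ADE of each rollout, averaged over agents and steps: [R].
    per_rollout = dist.mean(axis=(1, 2))
    return {
        "ade": float(per_rollout.mean()),     # average over all rollouts
        "min_ade": float(per_rollout.min()),  # best single rollout
    }


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    gt = rng.normal(size=(5, 80, 2))
    sim = gt[None] + rng.normal(scale=0.5, size=(32, 5, 80, 2))
    is_pedestrian = np.array([True, False, False, True, False])
    print(displacement_metrics(sim, gt))
    print(displacement_metrics(sim, gt, agent_mask=is_pedestrian))
```

An alternative convention takes the minimum per agent before averaging; which one applies to the leaderboard is not specified in this description.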
Submission
Submit one folder of scenario-level PKL files.
The test split has no public ground truth. Participants submit one folder containing generated 8-second closed-loop rollout results for all required scenarios.
- Only one final test submission is allowed per team.
- The leaderboard is hidden during the challenge period.
- Official scores and rankings are displayed on September 1.
HetroD WOSAC Submission Format
Upload one folder containing one .pkl file per ScenarioNet scenario. Each filename must match its scenario id.
<scenario_id>.pkl
Example:
00_loc18_seg10_ego_1.pkl
Sample file: 00_loc18_seg10_ego_1.pkl
{
    "agent_id": np.ndarray,          # shape: [A], dtype: int64 or int32
    "simulated_states": np.ndarray,  # shape: [32, A, 80, 4], dtype: float32
}
- agent_id must exactly match the provided per-scenario evaluation agent list. Do not add extra tracks or omit required agents.
- simulated_states[:, i, :, :] corresponds to agent_id[i].
- The last dimension is [x, y, z, yaw], with positions in meters and yaw in radians.
- Submit future only: 80 steps at 10 Hz, equivalent to GT steps 11..90.
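The format rules above can be checked locally before upload. The following is a minimal sketch, not an official tool: the helpers `validate_submission` and `write_scenario` are hypothetical names, and the check treats the agent list as a set because this description does not say whether the order of agent_id must also match the provided list.

```python
import pickle
from pathlib import Path

import numpy as np

# Leading dimension, step count, and state size from the format above.
NUM_LEADING, NUM_STEPS, STATE_DIM = 32, 80, 4


def validate_submission(sub: dict, expected_agent_ids) -> None:
    """Raise ValueError if `sub` violates the documented format."""
    agent_id = np.asarray(sub["agent_id"])
    states = np.asarray(sub["simulated_states"])
    expected = np.asarray(expected_agent_ids)

    if agent_id.ndim != 1:
        raise ValueError("agent_id must have shape [A]")
    if agent_id.dtype not in (np.dtype(np.int32), np.dtype(np.int64)):
        raise ValueError("agent_id must be int32 or int64")
    ids = agent_id.tolist()
    if len(set(ids)) != len(ids):
        raise ValueError("agent_id contains duplicates")
    # Assumption: only membership is checked, not ordering.
    if set(ids) != set(expected.tolist()):
        raise ValueError("agent_id does not match the evaluation agent list")

    want = (NUM_LEADING, agent_id.shape[0], NUM_STEPS, STATE_DIM)
    if states.shape != want:
        raise ValueError(f"simulated_states has shape {states.shape}, expected {want}")
    if states.dtype != np.float32:
        raise ValueError("simulated_states must be float32")
    if not np.isfinite(states).all():
        raise ValueError("simulated_states contains NaN or inf")


def write_scenario(out_dir, scenario_id, agent_id, simulated_states, expected_agent_ids):
    """Validate one scenario and write it to <out_dir>/<scenario_id>.pkl."""
    sub = {
        "agent_id": np.asarray(agent_id),
        "simulated_states": np.asarray(simulated_states, dtype=np.float32),
    }
    validate_submission(sub, expected_agent_ids)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / f"{scenario_id}.pkl"
    with open(path, "wb") as f:
        pickle.dump(sub, f)
    return path


if __name__ == "__main__":
    ids = np.array([3, 7, 11], dtype=np.int64)   # placeholder agent ids
    rollouts = np.zeros((32, 3, 80, 4), dtype=np.float32)
    print(write_scenario("submission", "00_loc18_seg10_ego_1", ids, rollouts, ids))
```

Running the validator over every file in the submission folder catches shape and dtype mistakes early, which matters because only one final test submission is allowed per team.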
Leaderboard
Hidden until September 1.
Rankings will be computed from the hidden test set after the submission deadline. The public leaderboard will display overall score and category scores.