LD-Scene: LLM-Guided Diffusion for Controllable Generation of Adversarial Safety-Critical Driving Scenarios

¹The Hong Kong University of Science and Technology (Guangzhou) ²Sun Yat-sen University ³The Hong Kong University of Science and Technology

Abstract

Ensuring the safety and robustness of autonomous driving systems requires comprehensive evaluation in safety-critical scenarios. However, such scenarios are rare and difficult to collect from real-world driving data, which makes it hard to assess the performance of autonomous vehicles effectively. Existing methods often offer limited controllability and are not user-friendly, because they require extensive expert knowledge. To address these challenges, we propose LD-Scene, a novel framework that integrates Large Language Models (LLMs) with Latent Diffusion Models (LDMs) for user-controllable adversarial scenario generation through natural language. Our approach comprises an LDM that captures realistic driving trajectory distributions and an LLM-based guidance module that translates user queries into adversarial guidance functions, so that the generated scenarios align with those queries. The guidance module integrates an LLM-based Chain-of-Thought (CoT) code generator and an LLM-based code debugger, which improve the controllability and robustness of guidance function generation. Extensive experiments on the nuScenes dataset demonstrate that LD-Scene achieves state-of-the-art performance in generating realistic and effective adversarial scenarios. Furthermore, our framework provides fine-grained control over adversarial behaviors, facilitating more effective testing tailored to specific driving scenarios.

Model Overview

Overall framework of LD-Scene. During the training stage, an LDM learns the distribution of realistic driving trajectories conditioned on the latent representation of historical scene input. During the inference stage, given a user query, an LLM-based code generator produces an adversarial loss function. This loss function is then validated by an LLM-based debugger through a closed-loop unit testing process and subsequently used to guide the diffusion model in generating safety-critical driving scenarios.
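
The generate, test, and repair cycle described in this caption can be sketched as follows. This is a minimal illustration under stated assumptions, not the LD-Scene implementation: the names `generate_guidance`, `unit_test`, `stub_generator`, and `stub_debugger` are hypothetical, the two "LLM" callables are deterministic stand-ins, and the toy loss operates on plain lists of 2D waypoints.

```python
from typing import Callable, Optional, Tuple

# Hypothetical stand-ins: the first draft forgets an import, the repair adds it.
BUGGY = (
    "def guidance_loss(adv_traj, ego_traj):\n"
    "    return min(math.dist(a, e) for a, e in zip(adv_traj, ego_traj))\n"
)
FIXED = "import math\n" + BUGGY


def stub_generator(query: str) -> str:
    """Stands in for the LLM code generator (assumption, not a real API)."""
    return BUGGY


def stub_debugger(source: str, error: str) -> str:
    """Stands in for the LLM debugger, which sees the code and the error."""
    return FIXED


def unit_test(source: str) -> Optional[str]:
    """Run the candidate on a toy input; return an error message or None."""
    namespace: dict = {}
    try:
        exec(source, namespace)  # untrusted code should be sandboxed in practice
        loss = namespace["guidance_loss"](
            [(0.0, 0.0), (1.0, 0.0)], [(0.0, 1.0), (1.0, 1.0)]
        )
        if not isinstance(loss, float):
            return f"expected a float loss, got {type(loss).__name__}"
    except Exception as exc:
        return f"{type(exc).__name__}: {exc}"
    return None


def generate_guidance(
    query: str,
    generator: Callable[[str], str],
    debugger: Callable[[str, str], str],
    max_rounds: int = 3,
) -> Tuple[str, int]:
    """Return validated source code and the number of repair rounds used."""
    source = generator(query)
    error = None
    for rounds_used in range(max_rounds + 1):
        error = unit_test(source)
        if error is None:
            return source, rounds_used
        if rounds_used < max_rounds:
            source = debugger(source, error)  # feed the failure back
    raise RuntimeError(f"no valid guidance function after {max_rounds} rounds: {error}")


source, rounds_used = generate_guidance(
    "make the adversary collide with the ego", stub_generator, stub_debugger
)
print(rounds_used)
```

Bounding the number of repair rounds keeps inference time predictable; a loss function that still fails its unit test is rejected rather than passed to the diffusion model.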

Safety-Critical Traffic Simulation

The ego vehicle faces a critical decision while merging at a ramp.

The ego vehicle encounters a critical situation as a stationary vehicle suddenly starts moving, requiring the ego vehicle to brake in time.

The ego vehicle encounters a safety-critical interaction with another vehicle at an intersection.

The ego vehicle encounters a safety-critical situation as the leading vehicle suddenly brakes, requiring the ego vehicle to avoid a rear-end collision.

The ego vehicle encounters a safety-critical scenario involving a sudden cut-in by an adjacent vehicle.

The ego vehicle encounters a challenging scenario with an oncoming vehicle driving against traffic, requiring a prompt response.

Overall performance comparison of LD-Scene with baseline methods on the nuScenes dataset

Method	Adversariality		Behavior Plausibility					Efficiency
Method	Adv-Ego Coll.	Adv Acc.	Adv Offroad	Other Offroad	Adv-Other Coll.	Other-Ego Coll.	Other-Other Coll.	Sim Time
AdvSim	24.72	0.90	15.60	14.85	0.56	0.91	0.11	338.35
Strive	22.69	0.88	18.94	16.64	0.90	1.08	0.05	609.72
DiffScene	15.06	0.98	19.71	19.65	8.03	2.60	1.67	199.01
Safe-Sim	27.81	1.09	21.79	18.12	7.52	3.21	0.66	193.59
LD-Scene	40.75	1.36	12.52	17.95	4.93	2.17	0.66	229.40

Model Comparison on Adversariality, Behavior Plausibility, and Efficiency

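In the table above, LD-Scene attains the highest Adv-Ego Coll. value (40.75, versus 27.81 for the next-best method, Safe-Sim) and the lowest Adv Offroad value (12.52), while its Other Offroad value (17.95) and simulation time (229.40) stay within the range spanned by the baselines. It does not lead on every metric: AdvSim and Strive report lower values on the remaining collision metrics, and Safe-Sim and DiffScene simulate faster. Overall, LD-Scene achieves the strongest adversarial effectiveness while maintaining competitive realism and generation efficiency.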
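
The per-metric ranking can be checked directly from the table values. The sketch below transcribes the table and reports the best method per column. The metric directions are assumptions made here, not statements from the table: a higher Adv-Ego Coll. value is treated as better, lower values are treated as better for the plausibility and efficiency columns, and Adv Acc. is skipped because its preferred direction is not given.

```python
# Values transcribed from the comparison table above.
METRICS = [
    "Adv-Ego Coll.", "Adv Acc.", "Adv Offroad", "Other Offroad",
    "Adv-Other Coll.", "Other-Ego Coll.", "Other-Other Coll.", "Sim Time",
]
RESULTS = {
    "AdvSim":    [24.72, 0.90, 15.60, 14.85, 0.56, 0.91, 0.11, 338.35],
    "Strive":    [22.69, 0.88, 18.94, 16.64, 0.90, 1.08, 0.05, 609.72],
    "DiffScene": [15.06, 0.98, 19.71, 19.65, 8.03, 2.60, 1.67, 199.01],
    "Safe-Sim":  [27.81, 1.09, 21.79, 18.12, 7.52, 3.21, 0.66, 193.59],
    "LD-Scene":  [40.75, 1.36, 12.52, 17.95, 4.93, 2.17, 0.66, 229.40],
}

# Assumed directions (not stated in the table itself).
HIGHER_IS_BETTER = {"Adv-Ego Coll."}
SKIPPED = {"Adv Acc."}  # direction unclear, so it is left out of the ranking


def best_per_metric(results):
    """Map each ranked metric to the method with the best value."""
    best = {}
    for idx, metric in enumerate(METRICS):
        if metric in SKIPPED:
            continue
        pick = max if metric in HIGHER_IS_BETTER else min
        best[metric] = pick(results, key=lambda name: results[name][idx])
    return best


best = best_per_metric(RESULTS)
for metric, method in best.items():
    print(f"{metric:18s} {method}")
```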

Controllable Adversarial Behavior

Case studies on adversarial safety-critical scenario generation based on different user queries, including normal collisions, high-speed overtaking collisions, and sharp-turn collisions. The results demonstrate the capability of our LD-Scene for controllable adversarial behavior, enabling the generation of diverse safety-critical scenarios.
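
To illustrate how a user query could map to different guidance terms, the sketch below combines a collision term with an optional speed term and applies gradient steps to an adversary trajectory. It is a toy under stated assumptions: it acts on raw 2D waypoints rather than inside a latent diffusion denoising loop, it uses finite differences instead of automatic differentiation, and all names (`collision_loss`, `speed_loss`, `guidance_loss`, `guided_step`) and parameter values are hypothetical.

```python
import math
from typing import List, Optional, Tuple

Point = Tuple[float, float]
Traj = List[Point]


def collision_loss(adv: Traj, ego: Traj, tau: float = 1.0) -> float:
    """Soft minimum of per-timestep distances; smaller means closer to a collision."""
    dists = [math.dist(a, e) for a, e in zip(adv, ego)]
    return -tau * math.log(sum(math.exp(-d / tau) for d in dists))


def speed_loss(adv: Traj, dt: float, target_speed: float) -> float:
    """Mean squared gap between per-segment speed and a target speed."""
    speeds = [math.dist(p, q) / dt for p, q in zip(adv, adv[1:])]
    return sum((s - target_speed) ** 2 for s in speeds) / len(speeds)


def guidance_loss(
    adv: Traj,
    ego: Traj,
    dt: float = 0.5,
    target_speed: Optional[float] = None,
    speed_weight: float = 0.1,
) -> float:
    """Collision term, plus a speed term when the query asks for one."""
    loss = collision_loss(adv, ego)
    if target_speed is not None:
        loss += speed_weight * speed_loss(adv, dt, target_speed)
    return loss


def guided_step(adv: Traj, ego: Traj, step_size: float = 1.0,
                eps: float = 1e-4, **loss_kwargs) -> Traj:
    """One finite-difference gradient descent step on the adversary waypoints."""
    base = guidance_loss(adv, ego, **loss_kwargs)
    updated = [adv[0]]  # the current adversary position is kept fixed
    for i in range(1, len(adv)):
        grad = []
        for axis in range(2):
            point = list(adv[i])
            point[axis] += eps
            shifted = list(adv)
            shifted[i] = (point[0], point[1])
            grad.append((guidance_loss(shifted, ego, **loss_kwargs) - base) / eps)
        updated.append((adv[i][0] - step_size * grad[0],
                        adv[i][1] - step_size * grad[1]))
    return updated


# Ego drives straight along y = 0; the adversary starts in a parallel lane at y = 3.
ego = [(2.0 * t, 0.0) for t in range(5)]
adv = [(2.0 * t, 3.0) for t in range(5)]

guided = adv
for _ in range(10):
    guided = guided_step(guided, ego)
print(collision_loss(adv, ego), collision_loss(guided, ego))
```

Passing `target_speed=8.0` to `guided_step` adds the speed term, which is one way a "high-speed" query could be expressed; a sharp-turn query would need a different term, such as one on heading change.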

Failure Cases

Collision with non-ego vehicle.

Rear-end collision with the ego vehicle.

Head-on collision with the ego vehicle.

Unreasonable selection of adversarial vehicle.

@article{peng2025ld,
  title={LD-Scene: LLM-Guided Diffusion for Controllable Generation of Adversarial Safety-Critical Driving Scenarios},
  author={Peng, Mingxing and Xie, Yuting and Guo, Xusen and Yao, Ruoyu and Yang, Hai and Ma, Jun},
  journal={arXiv preprint arXiv:2505.11247},
  year={2025}
}