Episode

Reinforcement Learning in Queue-Reactive Models: Application to Optimal Execution

Nov 19, 2025 · 11:10
Trading and Market Microstructure

Abstract

We investigate the use of Reinforcement Learning for the optimal execution of meta-orders, where the objective is to incrementally execute large orders while minimizing implementation shortfall and market impact over an extended period of time. Departing from traditional parametric approaches to price dynamics and impact modeling, we adopt a model-free, data-driven framework. Since policy optimization requires counterfactual feedback that historical data cannot provide, we employ the Queue-Reactive Model to generate realistic and tractable limit order book simulations that encompass transient price impact as well as nonlinear, dynamic order flow responses. Methodologically, we train a Double Deep Q-Network agent on a state space comprising time, inventory, price, and depth variables, and evaluate its performance against established benchmarks. Numerical simulation results show that the agent learns a policy that is both strategic and tactical, adapting effectively to order book conditions and outperforming standard approaches across multiple training configurations. These findings provide strong evidence that model-free Reinforcement Learning can yield adaptive and robust solutions to the optimal execution problem.
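
To make the training setup concrete, below is a minimal, hypothetical sketch of a Double DQN update step in PyTorch. The 4-dimensional state (time remaining, inventory left, price change, order book depth) and the discrete set of child-order sizes are assumptions made for illustration; the sketch shows the general Double DQN technique mentioned in the abstract, not the paper's exact network, reward, or simulator.

```python
# Illustrative Double DQN update for an execution agent (assumptions noted below).
import torch
import torch.nn as nn

STATE_DIM = 4   # assumed state: [time remaining, inventory left, price change, LOB depth]
N_ACTIONS = 5   # assumed discrete child-order sizes, e.g. {0, 1, 2, 5, 10} lots
GAMMA = 1.0     # assumed undiscounted episodic execution task

def make_q_net() -> nn.Module:
    """Small MLP mapping a state vector to one Q-value per action."""
    return nn.Sequential(
        nn.Linear(STATE_DIM, 64), nn.ReLU(),
        nn.Linear(64, 64), nn.ReLU(),
        nn.Linear(64, N_ACTIONS),
    )

online_net = make_q_net()
target_net = make_q_net()
target_net.load_state_dict(online_net.state_dict())
optimizer = torch.optim.Adam(online_net.parameters(), lr=1e-3)

def double_dqn_update(s, a, r, s_next, done):
    """One gradient step on a batch of (state, action, reward, next state, done) tensors."""
    # Q(s, a) under the online network for the actions actually taken.
    q_sa = online_net(s).gather(1, a.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        # Double DQN: the online net selects the next action,
        # the target net evaluates it (reduces overestimation bias).
        next_a = online_net(s_next).argmax(dim=1, keepdim=True)
        next_q = target_net(s_next).gather(1, next_a).squeeze(1)
        target = r + GAMMA * (1.0 - done) * next_q
    loss = nn.functional.smooth_l1_loss(q_sa, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

In practice the transitions would be sampled from a replay buffer filled by rolling out the agent in the Queue-Reactive Model simulator, with the target network periodically synchronized to the online network.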


Cite This Paper

Year: 2025
Category: q-fin.TR
APA

Espana, T., Hafsi, Y., Lillo, F., & Vittori, E. (2025). Reinforcement Learning in Queue-Reactive Models: Application to Optimal Execution. arXiv preprint arXiv:2511.15262.

MLA

Espana, Tomas, Yadh Hafsi, Fabrizio Lillo, and Edoardo Vittori. "Reinforcement Learning in Queue-Reactive Models: Application to Optimal Execution." arXiv preprint arXiv:2511.15262 (2025).