Truncated Counterfactual Learning for Anytime Multi-Agent Path Finding

Thomy Phan and Shao-Hung Chan and Sven Koenig.
Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2026.

@article{ phan1AAAI26,
    author      = "Thomy Phan and Shao-Hung Chan and Sven Koenig",
    title       = "Truncated Counterfactual Learning for Anytime Multi-Agent Path Finding",
    year        = "2026",
    abstract    = "Anytime multi-agent path finding (MAPF) is a promising approach to scalable and collision-free path optimization in multi-agent systems. MAPF-LNS, based on Large Neighborhood Search (LNS), is the current state-of-the-art approach where a fast initial solution is iteratively optimized by destroying and repairing selected paths, i.e., a neighborhood, of the solution. Delay-based MAPF-LNS has demonstrated particular effectiveness in generating promising neighborhoods via seed agents, according to their delays. Seed agents are selected using handcrafted strategies or online learning, where the former relies on human intuition about underlying structures, while the latter conducts black-box optimization, ignoring any structure. In this paper, we propose Truncated Adaptive Counterfactual K-ranked LEarning (TACKLE) to select seed agents via informed online learning by leveraging handcrafted strategies as human intuition. We show theoretically that TACKLE dominates its handcrafted and black-box learning counterparts in the limit. Our experiments demonstrate cost improvements of at least 60% in instances with one thousand agents, compared with state-of-the-art anytime solvers.",
    url         = "https://thomyphan.github.io/files/2026-aaai-preprint1.pdf",
    eprint      = "https://thomyphan.github.io/files/2026-aaai-preprint1.pdf",
    location    = "Singapore",
    journal     = "Proceedings of the AAAI Conference on Artificial Intelligence",
}