My PhD is set to end on the 30th of March, 2026.
I am open to fun stuggestions of what to do after.
Uppaal Coshy: Automatic Synthesis of Compact Shields for Hybrid Systems
Asger Horn Brorholt, Andreas Holck Høeg-Petersen, Peter Gjøl Jensen, Kim Guldstrand Larsen, Marius Mikučionis, Christian Schilling, and Andrzej Wąsowski
To appear in Reachability Problems, RP 2025
Abstract
We present Uppaal Coshy, a tool for automatic synthesis of
a safety strategy—or shield —for Markov decision processes over continu-
ous state spaces and complex hybrid dynamics. The general methodology
is to partition the state space and then solve a two-player safety game,
which entails a number of algorithmically hard problems such as reach-
ability for hybrid systems. The general philosophy of Uppaal Coshy is
to approximate hard-to-obtain solutions using simulations. Our imple-
mentation is fully automatic and supports the expressive formalism of
Uppaal models, which encompass stochastic hybrid automata.
The precision of our partition-based approach benefits from using finer
grids, which however are not efficient to store. We include an algorithm
called Caap to efficiently compute a compact representation of a shield
in the form of a decision tree, which yields significant reductions.
arxiv ·
download tool ·
documentation
Efficient Shield Synthesis via State-Space Transformation
Asger Horn Brorholt, Andreas Holck Høeg-Petersen, Kim Guldstrand Larsen, and Christian Schilling
Bridging the Gap Between AI and Reality, AISoLA 2024
Abstract
We consider the problem of synthesizing safety strategies for control systems, also known as
shields. Since the state space is infinite, shields are typically computed over a finite-state
abstraction, with the most common abstraction being a rectangular grid. However, for many systems,
such a grid does not align well with the safety property or the system dynamics. That is why a
coarse grid is rarely sufficient, but a fine grid is typically computationally infeasible to
obtain. In this paper, we show that appropriate state-space transformations can still allow to use
a coarse grid at almost no computational overhead. We demonstrate in three case studies that our
transformation-based synthesis outperforms a standard synthesis by several orders of magnitude. In
the first two case studies, we use domain knowledge to select a suitable transformation. In the
third case study, we instead report on results in engineering a transformation without domain
knowledge.
published version ·
arxiv ·
code
Shielded Reinforcement Learning for Hybrid Systems
Asger Horn Brorholt, Peter Gjøl Jensen, Kim Guldstrand Larsen, Florian Lorber, Christian Schilling
Bridging the Gap Between AI and Reality, AISoLA 2023
Abstract
Safe and optimal controller synthesis for switched-controlled hybrid systems, which combine
differential equations and discrete changes of the system’s state, is known to be intricately
hard. Reinforcement learning has been leveraged to construct near-optimal controllers, but their
behavior is not guaranteed to be safe, even when it is encouraged by reward engineering. One way
of imposing safety to a learned controller is to use a shield, which is correct by design.
However, obtaining a shield for non-linear and hybrid environments is itself intractable. In this
paper, we propose the construction of a shield using the so-called barbaric method, where an
approximate finite representation of an underlying partition-based two-player safety game is
extracted via systematically picked samples of the true transition function. While hard safety
guarantees are out of reach, we experimentally demonstrate strong statistical safety guarantees
with a prototype implementation and Uppaal Stratego. Furthermore, we study the impact of the
synthesized shield when applied as either a pre-shield (applied before learning a controller) or a
post-shield (only applied after learning a controller). We experimentally demonstrate superiority
of the pre-shielding approach. We apply our technique on a range of case studies, including two
industrial examples, and further study post-optimization of the post-shielding approach.
published version ·
arxiv ·
code ·
julia package