Advisor

A Simulation-First Case Study: Rethinking AI’s Role in Clinical Decision-Making

Posted October 1, 2025 | Technology |

This Advisor reviews how simulation is currently used to evaluate AI systems and introduces a simulation-first approach that places operational context at the center of model development. To illustrate the benefits of this perspective, it describes a project from the author’s research aimed at reducing blood-product waste at a major London hospital.

Case Study: Reducing Waste of Blood Products

Platelets (blood components essential for clotting) present a unique inventory challenge. Because they have a shelf life of at most five days, hospitals must carefully balance stock levels to ensure they have enough on hand to meet unpredictable demand while avoiding waste from expired units. At a large London teaching hospital, my research team observed that many platelet units requested by clinicians were returned from wards unused. The standard policy of issuing the oldest available unit is optimal when every issued unit is transfused. When units are returned, however, this practice often means they expire before they can be reissued.

This seemed like a good opportunity to use data to improve practice. If a machine learning (ML) model could predict which requests were likely to result in returns, it could support a new policy: issuing the oldest units when a transfusion is likely and the youngest when a return is expected. This approach would increase the chances that returned units remain usable.
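
To make the rule concrete, here is a minimal sketch in Python of how such a return-aware issuing decision could look. It is an illustration only, not the hospital’s implementation: the select_unit function is a hypothetical name, and each unit is represented simply by its remaining shelf life in days.

```python
from typing import Optional

def select_unit(stock: list[int], predicted_return: bool) -> Optional[int]:
    """Pick a platelet unit for a request under the return-aware issuing rule.

    Each unit is represented by its remaining shelf life in days. If the request
    is predicted to end in a transfusion, issue the oldest unit (fewest days left),
    as under the standard policy; if a return is predicted, issue the youngest unit
    so it is more likely to still be usable when it comes back.
    """
    if not stock:
        return None
    remaining = max(stock) if predicted_return else min(stock)
    stock.remove(remaining)   # the chosen unit leaves the blood bank's stock
    return remaining
```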

However, building the model would require patient-level data, which meant long approval processes, integration of data from multiple health systems, and a significant investment of analyst time.

We therefore began by building a simulator to model the workflow in the hospital blood bank, including placing a replenishment order in the morning, selecting a unit to meet each clinical request, and disposing of expired units at the end of each day. We then simulated predictions from models with various levels of performance and assessed how they would affect key outcomes like waste and service level.
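
In outline, such a simulator can be quite compact. The sketch below is a deliberately simplified illustration rather than our actual simulator: orders are assumed to arrive the same morning they are placed, daily demand is Poisson, and the return probability, delivered shelf life, and return delay are illustrative defaults; the names run_simulation, choose_unit, and predict_return are assumptions for this example.

```python
import numpy as np

def run_simulation(n_days: int, daily_order: int, mean_daily_requests: float,
                   choose_unit, predict_return,
                   return_prob: float = 0.15,      # illustrative fraction of issued units returned
                   delivered_shelf_life: int = 5,  # remaining shelf life (days) of delivered units
                   return_delay_days: int = 1,     # days a unit spends on the ward before return
                   seed: int = 0) -> dict:
    """Step through the daily blood bank workflow and tally waste and shortages.

    Units are represented by their remaining shelf life in days.
    choose_unit(stock, predicted_return) picks and removes a unit from stock;
    predict_return(actually_returned) stands in for the ML model's prediction.
    """
    rng = np.random.default_rng(seed)
    stock: list[int] = []                 # remaining shelf life of each unit in stock
    pending: list[tuple[int, int]] = []   # (day a returned unit re-enters stock, remaining life)
    waste = shortages = 0

    for day in range(n_days):
        # 1. Morning: receive the day's replenishment order and any returned units.
        stock += [delivered_shelf_life] * daily_order
        stock += [life for due, life in pending if due == day]
        pending = [(due, life) for due, life in pending if due != day]

        # 2. During the day: meet each clinical request with a unit from stock.
        for _ in range(rng.poisson(mean_daily_requests)):
            actually_returned = rng.random() < return_prob
            unit = choose_unit(stock, predict_return(actually_returned))
            if unit is None:
                shortages += 1                   # no stock available for this request
            elif actually_returned:              # unit will come back, aged by the delay
                pending.append((day + return_delay_days, unit - return_delay_days))

        # 3. End of day: age the stock and discard expired units.
        stock = [life - 1 for life in stock]
        waste += sum(life <= 0 for life in stock)
        stock = [life for life in stock if life > 0]
        waste += sum(life <= 0 for _, life in pending)   # returned units that expired on the ward
        pending = [(due, life) for due, life in pending if life > 0]

    return {"waste": waste, "shortages": shortages}
```

The choose_unit argument accepts an issuing rule such as the select_unit sketch above, while predict_return is a stand-in for the classifier, which is where simulating predictions at a chosen level of performance comes in.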

Model performance was controlled by adjusting the assumed sensitivity and specificity of the predictions. Sensitivity measures the proportion of requests that would actually be returned that the model correctly flags; specificity measures the proportion of requests that would be transfused that it correctly leaves unflagged (i.e., how well it avoids false alarms). Each ranges from 0% to 100%. By setting both values to 100%, we could test whether even perfect predictions would make a difference. By varying sensitivity and specificity, we explored how different levels of performance would translate into improvements.
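
A minimal way to emulate a classifier at a chosen operating point is to flip a biased coin for each request, conditioned on the true outcome. The helper below is an illustrative sketch; the name simulate_prediction and its signature are assumptions, not production code.

```python
import numpy as np

def simulate_prediction(actually_returned: bool, sensitivity: float,
                        specificity: float, rng: np.random.Generator) -> bool:
    """Emulate a classifier with the given sensitivity and specificity.

    A request that will truly be returned is flagged with probability equal to the
    sensitivity; a request that will be transfused is (wrongly) flagged with
    probability 1 - specificity. Setting both values to 1.0 emulates a perfect model.
    """
    if actually_returned:
        return rng.random() < sensitivity    # true positive with probability = sensitivity
    return rng.random() >= specificity       # false positive with probability = 1 - specificity
```

Passing, say, predict_return=lambda r: simulate_prediction(r, 1.0, 1.0, rng) into the workflow sketch above emulates a perfect model, while lowering either value shows how quickly the benefit degrades.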

The results showed that the model was worth building. A moderately accurate model would meaningfully reduce waste, provided its predictions were acted on. The improvement was much larger still when the new issuing policy was combined with optimized replenishment orders (how many units the blood bank should order from its supplier each day). This operational insight, showing how changes to multiple decision-making processes interact, would have been impossible to learn from predictive performance metrics alone.
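
To give a sense of how such interactions can be explored, the sketch below sweeps a simple fixed daily order quantity against the standard and return-aware issuing rules; it assumes the functions from the earlier sketches are in scope and emulates a model with, for example, 80% sensitivity and 90% specificity. The real analysis used a richer replenishment optimization, so treat this purely as an illustration of the experimental setup.

```python
import numpy as np

def issue_oldest(stock: list[int], _predicted_return: bool):
    """Baseline policy: always issue the oldest unit, ignoring the prediction."""
    return select_unit(stock, predicted_return=False)

# Compare the two issuing rules across candidate daily order quantities.
for policy_name, policy in [("standard", issue_oldest), ("return-aware", select_unit)]:
    for daily_order in range(6, 13):
        rng = np.random.default_rng(1)
        outcome = run_simulation(
            n_days=365, daily_order=daily_order, mean_daily_requests=8,
            choose_unit=policy,
            predict_return=lambda r: simulate_prediction(r, 0.8, 0.9, rng),
        )
        print(f"{policy_name:12s} order={daily_order:2d} {outcome}")
```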

The simulation results gave our team the confidence to proceed with model development, knowing that the time and effort required to secure and process clinical data would be worthwhile. Once the model was developed, the workflow simulator was used to help tune the hyperparameters of the ML model and estimate its real-world impact. We also explored how these benefits might vary across hospitals, finding that the model would be especially valuable in hospitals where a greater proportion of units are returned and where units tend to be older upon delivery. This demonstrates that a good simulator can support evaluation throughout the model development process.

Just as importantly, building the simulator prompted early engagement between stakeholders. Blood bank staff, clinicians, and data scientists worked together to define the decisions that mattered, the constraints that applied, and the metrics that should be used to determine success. This collaboration ensured that any model developed would be evaluated against practical criteria and shaped from the outset to fit the real-world context in which it would operate.

The simulation-first approach helped us identify whether ML would add value, how good the model would need to be to be “good enough,” and whether it would be useful in a workflow not originally designed for ML. The resulting policy is being tested further — not just because a model was built but because the system around it was understood, challenged, and carefully modeled.

[For more from the author on this topic, see: “A Simulation-First Approach to AI Development.”]

About The Author
Joseph Farrington
Joseph Farrington is a data scientist and chartered accountant. He recently earned a PhD in machine learning at University College London, UK, under the supervision of Ken Li, Wai Keong Wong, and Martin Utley. Dr. Farrington’s research was funded by UKRI training grant EP/S021612/1, the CDT in AI-enabled Healthcare Systems, and the NIHR University College London Hospitals Biomedical Research Centre. He can be reached at joseph.farrington.18@…