Queueing Analytics: Machine Learning, Causal Queueing, and SiMLQ for Data Driven Simulation
The objective of this talk is to expose researchers to the vast possibilities of using modern machinery and data to implement effective management analytics for queueing processes. Such processes are ubiquitous in modern economies: customers waiting for service, inventory waiting for processing or transportation, payments and invoices awaiting generation or clearance, and computing tasks waiting for resources. I will discuss recent developments in queueing analysis based on several papers, beginning by defining management analytics along descriptive, predictive, comparative (i.e., comparing performance indicators under different interventions), and prescriptive dimensions. I will briefly review a machine learning solution for the G/G/1 queue based on [2] and its extension to G(t)/G/1 from [3], then shift the main focus to structural causal queueing models (SCQM), based on [1]. While few organizations employ queueing theorists (QTs), many have well-trained data scientists (DS), which prompts the question: Can a DS use data and SCQM to provide accurate comparative analytics without deep expertise in queueing? We propose a data-driven approach to representing the system's building blocks, which enables the creation of a non-queueing simulator without prior knowledge of the system. This approach proves effective for comparative analytics, such as analyzing expected waits in a GI/M/1 system with speed-ups. We show that a DS can refine the parent sets of queueing variables from data using off-the-shelf algorithms, even with moderate sample sizes, and can apply machine learning to estimate the causal structure of queues (e.g., Lindley's recursion), using G-computation to derive counterfactual inferences.
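To make the role of Lindley's recursion in a comparative question concrete, the following is a minimal Python sketch, not the method of [1]. It assumes the structural equation W[n+1] = max(0, W[n] + S[n] - A[n+1]) is known and written by hand, whereas the SCQM approach described above estimates it from data. It also assumes, purely for illustration, that a "speed-up" means dividing every service time by a constant factor; the definition used in the talk may differ. The function names and the Gamma and exponential inputs are hypothetical.

```python
import numpy as np


def lindley_waits(interarrivals, services):
    """Waiting times in queue for a single-server FIFO queue.

    interarrivals[i] is the time between the arrivals of customers i-1 and i
    (interarrivals[0] is ignored); services[i] is customer i's service time.
    Lindley's recursion: W[i] = max(0, W[i-1] + S[i-1] - A[i]), with W[0] = 0.
    """
    interarrivals = np.asarray(interarrivals, dtype=float)
    services = np.asarray(services, dtype=float)
    waits = np.zeros(len(services))
    for i in range(1, len(services)):
        waits[i] = max(0.0, waits[i - 1] + services[i - 1] - interarrivals[i])
    return waits


def counterfactual_mean_wait(interarrivals, services, speedup):
    """Mean wait had every service been `speedup` times faster.

    Assumption: the intervention rescales service times uniformly. The same
    observed arrivals and service draws are replayed through the recursion,
    so baseline and counterfactual share their exogenous inputs.
    """
    services = np.asarray(services, dtype=float)
    return lindley_waits(interarrivals, services / speedup).mean()


# Hypothetical data: general (Gamma) interarrivals, exponential services.
rng = np.random.default_rng(0)
n = 50_000
interarrivals = rng.gamma(shape=2.0, scale=0.5, size=n)  # mean 1.0
services = rng.exponential(scale=0.8, size=n)            # mean 0.8

baseline = lindley_waits(interarrivals, services).mean()
faster = counterfactual_mean_wait(interarrivals, services, speedup=1.25)
print(f"mean wait, observed: {baseline:.3f}")
print(f"mean wait, 25% faster service: {faster:.3f}")
```

Replaying the same inputs under the intervention is what makes this a comparative rather than a merely predictive estimate. A data-driven variant would replace the hand-written recursion with a fitted model of each variable given its parents.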
For the GI/M/1 system with speed-ups, we compare the accuracy of a QT, who uses data-driven estimates of the queue primitives, with that of a DS, who uses either parametric (with known inter-arrival and service time distributions) or nonparametric (with unknown distributions) SCQM-based estimators. Surprisingly, the errors made by the DS, who requires no knowledge of the system dynamics, are comparable to those made by the QT, who does require such knowledge. These findings suggest that SCQM is effective in practical settings where even expert QTs cannot derive closed-form results. We conclude with a short demonstration of SiMLQ, a software tool that uses machine learning to automate the visualization, simulation, and optimization of queueing processes. SiMLQ builds data-driven simulation models from event-log data collected by standard information systems, empowering users to enhance resource management, improve efficiency, reduce costs, and manage risks. SiMLQ: from data to action.
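As an illustration of the QT side of the comparison above, here is a minimal Python sketch of a plug-in estimate for a plain GI/M/1 queue without speed-ups, a simplification of the system studied in the talk. It relies on the standard GI/M/1 result that the mean wait in queue is sigma / (mu (1 - sigma)), where sigma is the root in (0, 1) of sigma = A*(mu (1 - sigma)) and A* is the Laplace-Stieltjes transform of the inter-arrival distribution. The empirical transform, the fixed-point iteration, and the function names are assumptions of this sketch, not the estimator used in [1].

```python
import numpy as np


def solve_sigma(interarrivals, mu, tol=1e-12, max_iter=100_000):
    """Solve sigma = A*(mu * (1 - sigma)) using the empirical transform.

    A*(s) is approximated by the sample mean of exp(-s * a) over the observed
    inter-arrival times. Iterating from 0 increases monotonically to the
    smallest root, which lies in (0, 1) when the queue is stable.
    """
    a = np.asarray(interarrivals, dtype=float)
    if mu * a.mean() <= 1.0:
        raise ValueError("estimated utilization is at least 1; queue is unstable")
    sigma = 0.0
    for _ in range(max_iter):
        new_sigma = float(np.mean(np.exp(-mu * (1.0 - sigma) * a)))
        if abs(new_sigma - sigma) < tol:
            return new_sigma
        sigma = new_sigma
    return sigma


def plugin_mean_wait(interarrivals, services):
    """Plug-in estimate of the GI/M/1 mean wait in queue from raw samples."""
    mu = 1.0 / float(np.mean(services))  # exponential service rate estimate
    sigma = solve_sigma(interarrivals, mu)
    return sigma / (mu * (1.0 - sigma))


# Hypothetical samples: Gamma inter-arrivals (mean 1.0), exponential services.
rng = np.random.default_rng(1)
interarrival_sample = rng.gamma(shape=2.0, scale=0.5, size=20_000)
service_sample = rng.exponential(scale=0.8, size=20_000)
print(f"plug-in mean wait: {plugin_mean_wait(interarrival_sample, service_sample):.3f}")
```

This route requires knowing that the system is GI/M/1 and knowing the corresponding formula, which is exactly the expertise the DS in the comparison does not need.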