A central theme of networking is answering what-if questions: given recorded data from a deployed system, what would the performance impact be if we changed the design of the system (e.g., deployed a new video streaming algorithm), or if the environment changed relative to the conditions under which the data was collected (i.e., a shift in distribution)? Answering what-if questions of this nature is also known as causal reasoning, which considers the effect of events that did not occur while the data was being recorded.
Several widely used machine learning (ML) tools (e.g., off-the-shelf neural networks) are inadequate for causal reasoning because they merely capture correlations in the collected data. This limits them to predicting how a deployed system with an existing design, operating in an existing environment, will perform in the future. They incur biases when faced with what-if questions, or when aspects of the environment change. Other approaches, such as reinforcement learning and randomized controlled trials, can disrupt the performance of real users and are not designed to answer what-if questions about past sessions.
Causal reasoning is uniquely challenging in networking because many algorithms (e.g., adaptive bitrate algorithms, traffic engineering) are adaptive and make their decisions based on network state. Network state variables therefore often act as confounding variables, resulting in spurious correlations between an algorithm's decisions and the observed performance. Unfortunately, on the networking side, researchers are generally ill-equipped to deal with the hidden complexity that arises from applying ML to causal questions. On the ML side, researchers do not yet have effective tools to answer the types of complex real-world causal questions that the scale and adaptive nature of networking applications require.
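To make the confounding problem concrete, here is a minimal toy simulation. It is an illustrative sketch, not this project's method. All names and numbers (`simulate_sessions`, the 0.9/0.1 policy probabilities, the rebuffering values) are hypothetical assumptions chosen for the example. An adaptive policy picks a high bitrate mostly when the network is good, so a naive comparison of logged sessions suggests that high bitrate reduces rebuffering, even though by construction it increases it.

```python
import random


def simulate_sessions(n=20000, seed=0):
    """Generate toy sessions as (good_network, high_bitrate, rebuffer) tuples."""
    rng = random.Random(seed)
    sessions = []
    for _ in range(n):
        # Network state is the confounder: it drives both the decision and the outcome.
        good_network = rng.random() < 0.5
        # Hypothetical adaptive policy: prefers high bitrate when the network is good.
        high_bitrate = rng.random() < (0.9 if good_network else 0.1)
        # Toy rebuffering metric (seconds): a bad network adds 5 s,
        # and high bitrate causally adds 1 s, plus measurement noise.
        rebuffer = (
            (2.0 if good_network else 7.0)
            + (1.0 if high_bitrate else 0.0)
            + rng.gauss(0.0, 0.5)
        )
        sessions.append((good_network, high_bitrate, rebuffer))
    return sessions


def mean(values):
    return sum(values) / len(values)


def naive_effect(sessions):
    """Correlational estimate: mean rebuffering, high-bitrate minus low-bitrate sessions."""
    high = [r for _, h, r in sessions if h]
    low = [r for _, h, r in sessions if not h]
    return mean(high) - mean(low)


def adjusted_effect(sessions):
    """Stratify by network state, then average the per-stratum differences."""
    total = 0.0
    for state in (True, False):
        stratum = [(h, r) for g, h, r in sessions if g == state]
        high = [r for h, r in stratum if h]
        low = [r for h, r in stratum if not h]
        weight = len(stratum) / len(sessions)
        total += weight * (mean(high) - mean(low))
    return total


if __name__ == "__main__":
    data = simulate_sessions()
    print("naive estimate:   ", round(naive_effect(data), 2))
    print("adjusted estimate:", round(adjusted_effect(data), 2))
```

In this toy setup the naive estimate is strongly negative (close to -3 seconds in expectation), while the stratified estimate recovers the built-in effect of about +1 second. The adjustment works here only because the confounder is fully observed and binary; in real traces the network state is typically hidden or only partially observed, which is part of what makes the problem hard.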
In this project, we are exploring causal reasoning on passively collected network data, an approach that does not disrupt the performance of live users. Our project supports causal reasoning not only about how a proposed change would affect sessions in the future, but also about how it would have affected a given session in the past (also referred to as counterfactual reasoning).
