Projects

Ongoing Projects

Self-correcting ML-driven Resource Allocation in Networked Systems

Despite much progress, the goal of driving network design from higher-level intent remains elusive. The challenge is that while network architects often have informal knowledge of the conditions under which their network must operate, existing design tools (based on optimization and constraint solvers) need precise characterizations of these conditions. The goal of this project is to automate the process of designing networks from informal operator hints. The proposal will tackle this goal through recent developments in ML and ML-based techniques for network design.

Causal ML models for Data-Driven Optimization of Internet video

A central theme of data-driven networking is answering what-if questions: what would be the impact of changing the design of a networked system, given data obtained from a real-world deployment of an existing system? In this project, we are investigating the use of causal reasoning approaches to answer such questions using data collected from prior deployments of systems. A particular area of focus is Internet video, given that video accounts for over 80% of Internet traffic today and that it is critical to deliver high-quality Internet video over variable Internet environments. Our work extensively uses insights from real-world data sets of Internet video sessions.

Flexible Data Plane Programming

Data plane programming has been widely adopted by both academia and industry. As a prominent instance, the P4 programming language has been a key enabler of flexible data plane programming, with the support of compilers, formal semantics, verification frameworks, and testing systems, as well as commodity hardware such as Intel Tofino and Cisco Silicon One. Despite these promising developments, there is still a long way to go before average network architects can program data planes in a natural and efficient way. One unique challenge for data plane programming stems from the inherent tension between abstraction and customized optimization. The goal of this project is to reconcile the two by developing a radically new programming system for data plane programming, with which a user can naturally and flexibly describe her network design and optimization goals, and from which the corresponding optimal data plane can be automatically generated.

Next Generation Multi-Perspective Video Delivery at Internet Scale

The success of streaming video has generated interest in newer forms of multi-perspective video content, such as those generated by 360-degree cameras, multi-angle camera arrays, or light-field cameras. The immersive experience provided by these cameras can enhance user satisfaction. This project explores architectural enhancements, algorithms, and techniques to deliver multi-perspective video at Internet scale.

ML-Driven Online Traffic Analysis at Multi-Terabit Line Rates

Self-driving networks, i.e., networks driven by real-time analytics performed on data at line rate and guided by programmatic control, can help ensure better network security and aid performance diagnosis and repair. Realizing such a network often requires machine learning (ML) inference algorithms (e.g., to detect anomalous traffic). Unfortunately, as network bandwidth grows to hundreds of gigabits or even terabits per second, it is challenging to analyze network traffic at line rate today. Many production intrusion detection systems rely on out-of-band analysis, which results in slow reaction times (security issues may take on the order of minutes to resolve) and requires significant bandwidth to export data from routers. This project is tackling the challenge of inline traffic analysis using programmable switches and programmable hardware.

Past Projects

Synthesizing network designs with certifiable performance properties

Network design is ad hoc today, and validating the design normally comes as an afterthought. Unlike the chip and software industries, where design and verification tools form a multi-billion dollar industry, network design and verification are still at an early stage. We are exploring approaches to synthesizing network designs with formally certifiable performance properties under failures. The work may be viewed as an early step towards verifying quantitative network properties.

Adaptive Bit Rate algorithms for Internet Video delivery

Recent years have seen a tremendous increase in the popularity of Internet video, which forms a major fraction of Internet traffic today; a Cisco technical report projects that Internet video traffic will reach 30 exabytes in 2020. Given this trend, delivering high quality of experience (QoE) is critical since it correlates with user engagement and revenue. The project is investigating how to deliver high-quality video across diverse and variable network conditions. Most Internet video delivery uses adaptive bitrate (ABR) algorithms combined with HTTP chunk-based streaming protocols, and the ABR algorithm is one of the critical components in delivering high-QoE Internet video. However, today's ABR algorithms have fixed, closed-source implementations, which results in two problems: (1) content publishers cannot customize them according to their preferences, and (2) no single algorithm works well across the diverse range of bandwidth conditions in the wild. To solve these problems, we propose a novel video delivery pipeline. Our results show that our approach can improve the median QoE by 37% compared to a commercial ABR.

Towards Automated and Assurable Enterprise Network Migration

Enterprise network operators must frequently change the design of their networks to reflect new organizational needs (e.g., company mergers). Redesigning enterprise networks is challenging given the need to change hundreds of interdependent low-level configurations. Configuration errors can have catastrophic consequences (e.g., large-scale network outages). The project is investigating systematic frameworks to help operators redesign their networks to meet desired high-level objectives. Optimization problems are formulated that trade off the benefits of a redesign task against the reconfiguration costs involved. Algorithms for the redesign tasks are derived by exploring synergies with theoretical work in the operations research community. We are devising ways to map high-level network design to low-level configuration complexity metrics, and investigating algorithms to minimize the complexity of network designs. The techniques are being applied to important and unexplored problem domains such as migrating security policies from enterprise data centers to a cloud computing model, reorganizing routing designs on mergers, and service differentiation policies. The research, if successful, will change how operators manage their networks, leading to large cost savings for IT organizations and the creation of more reliable and secure networks. The research will foster innovation by lowering the risks in migration to new enterprise network architectures such as cloud computing, and to clean-slate architectures such as those based on Software-Defined Networks.

SmartEdge for low-latency Web applications

In recent years, owing to the dramatic increase in cellular users, it has become imperative for service providers to provide better quality of experience for their users. Web download is one of the key activities, contributing a larger fraction of mobile traffic than any other application excluding video streaming. The mobile Web experience is still not on par with desktops; in particular, page load times are on the order of a few seconds. This is because modern Web pages are complex, feature-rich, and customized for individual user preferences, with many objects fetched (as a result of parsing HTML and/or executing JavaScript) from many domains. This results in many HTTP request-response interactions over the high-latency cellular link. Thus, today's Web page download process is ill suited to cellular networks, resulting in high page load times. To tackle this challenge, we are developing systems and techniques to improve mobile Web performance by optimizing the last-mile network delays that dominate page load latencies in cellular settings. Specifically, we (1) design a proxy-based system that judiciously refactors browsing functionality between a proxy (inside the cellular network) and a client based on their respective strengths (figure below), and (2) develop techniques to make the proxy design scale to millions of users by reducing its computational overhead. Today's Web pages consist of hundreds of objects with complex dependencies, with some objects being more critical to page load latencies than others (e.g., JavaScript may determine which objects are needed). Yet current Content Delivery Networks (CDNs) are agnostic to object criticality, potentially impacting overall page latencies. We seek to achieve low latency for several tens of thousands of the most popular pages by exploring novel criticality-aware algorithms for object placement and caching.

Abstractions for Enterprise Network Management

The use of abstractions to simplify network design and configuration has been a long-cherished vision in the networking community. We posit that achieving more fundamental breakthroughs requires abstractions that go beyond merely modeling the underlying protocols and mechanisms. We investigate a new class of abstractions that are: (i) task-driven, i.e., capture the intended performance, security, manageability, or resilience of a network design; and (ii) network-wide, i.e., capture the requirements of the network as a whole rather than of individual devices. Our focus is on the management of enterprise networks. Despite their critical importance, and their striking differences and diversity compared to carrier networks, enterprise networks have been largely unexplored by networking researchers. We envision a three-pronged research process that involves: (i) capturing the goals operators have for their networks, through interactions with operators and bottom-up studies of actual network designs; (ii) elevating the design patterns we observe into abstractions; and (iii) demonstrating that abstractions can simplify both top-down network design and validation of network properties. A distinguishing feature of this research is its white-box methodology for studying network designs. Rather than infer network characteristics with limited support from network operators, as is common practice today, we will capitalize on our extensive ties with real network operators and conduct studies using data such as router configuration files obtained with their support, together with iterative interactions with them. We are currently designing abstractions in two areas that are critically important and widely prevalent in enterprises: (i) use of virtualization, in particular VLANs, to simplify management goals; and (ii) network evolution through planned maintenance.

Architecting Latency Sensitive Applications for the Cloud

Cloud computing offers IT organizations the ability to create geo-distributed and highly scalable applications while providing attractive cost-saving advantages. Yet architecting, configuring, and adapting cloud applications to meet their stringent performance requirements is a challenge given the rich set of configuration options, the shared multi-tenant nature of cloud platforms, and dynamics resulting from activities such as planned maintenance. A unique area of focus of our research is interactive multi-tier applications (e.g., enterprise applications, web applications), which have received limited attention from the community. We are developing novel methodologies and systems that can enable application architects to (1) judiciously architect their applications across multiple cloud data centers while considering application performance requirements, cost-saving objectives, and cloud pricing schemes, guided by performance and cost models of cloud components such as key-value datastores; and (2) create applications that can adapt to ongoing dynamics in cloud environments through transaction reassignment over shorter time scales. Our research, if successful, can enable IT organizations to significantly reduce costs by optimally moving their operations to the cloud. We are also working on creating benchmarks based on operationally deployed applications and collecting workload traces, which will be made available to the research community.

Dissecting State-of-the-Art Video Distribution Networks

We are conducting a detailed study of the YouTube CDN with a view to understanding the policies used to determine which data centers users download video from. Our analysis is conducted using unique week-long datasets simultaneously collected from the edge of five networks (two university campuses and three ISP networks) located in three different countries. Our analysis employs state-of-the-art delay-based geolocation techniques to find the geographical location of YouTube servers. Our results indicate that the RTT between users and data centers plays a prominent role in the video server selection process. More interestingly, however, our results reveal that a variety of factors besides RTT can influence server selection, including load balancing, diurnal effects, DNS misconfiguration, limited availability of rarely accessed video, and the need to alleviate hot spots that may arise due to popular video content.

Trustworthy Peer-to-Peer Networks

Peer-to-peer systems are rapidly maturing from being narrowly associated with copyright violations to a technology that offers tremendous potential to deploy new services over the Internet. In many ways, peer-to-peer systems are beginning to herald a paradigm shift in this decade, in much the same way as HTTP did in the 1990s. In this project, we are studying challenges in designing peer-to-peer systems in a safe, secure, and robust manner, and considering new issues for Internet management that arise from the proliferation of peer-to-peer systems.

Heterogeneity and Incentives for Peer-to-Peer Video Broadcasting

We propose the design of bandwidth-demanding broadcasting applications using overlays in environments characterized by hosts with limited and asymmetric bandwidth, and significant heterogeneity in outgoing bandwidth. Such environments are critical to consider in order to extend the applicability of overlay multicast to mainstream Internet environments, where insufficient bandwidth exists to support all hosts, yet they have not received adequate attention from the research community. We leverage the multi-tree framework and design heuristics that enable it to consider host contribution and operate in bandwidth-scarce environments. Our extensions seek to simultaneously achieve good utilization of system resources, performance to hosts commensurate with their contributions, and consistent performance. We have implemented the system and conducted an Internet evaluation on PlanetLab using real traces from previous operational deployments of an overlay broadcasting system.