Tutorial: Modeling and Optimization of Latency in Erasure-Coded Storage Systems

Session chair(s):
Jakob Hoydis

Presenters

Vaneet Aggarwal
Purdue University, USA
Tian Lan
George Washington University, USA
Parimal Parag
Indian Institute of Science, Bangalore, India

Abstract

As consumers increasingly engage in social networking and e-commerce, businesses come to rely on big data analytics for intelligence, and traditional IT infrastructure continues to migrate to the cloud and edge, demand for distributed data storage is rising at an unprecedented speed. Erasure coding has quickly emerged as a promising technique for reducing storage cost while providing reliability similar to that of replicated systems, and it has been widely adopted by companies such as Facebook, Microsoft, and Google. However, it also brings new challenges in characterizing and optimizing access latency when erasure codes are used in distributed storage. The aim of this tutorial is to review recent progress, both theoretical and practical, on systems that employ erasure codes for distributed storage.

In this tutorial, we will first identify the key challenges and present a taxonomy of the research problems, and then give an overview of the different approaches that have been developed to quantify and model the latency of erasure-coded storage. This includes recent work leveraging MDS-Reservation, Fork-Join, Probabilistic, and Delayed-Relaunch scheduling policies, as well as their application to characterizing the access latency (e.g., mean, tail, and asymptotic latency) of erasure-coded distributed storage systems. We bridge the gap between theory and practice and discuss lessons learned from prototype implementations. In particular, we will discuss exemplary implementations of erasure-coded storage and illuminate key design degrees of freedom and tradeoffs. Open problems for future research will also be discussed.

Biography

Vaneet Aggarwal
Vaneet Aggarwal received the B.Tech. degree from the Indian Institute of Technology, Kanpur, India, in 2005, and the M.A. and Ph.D. degrees in 2007 and 2010, respectively, from Princeton University, Princeton, NJ, USA, all in Electrical Engineering. He is currently an Associate Professor at Purdue University, West Lafayette, IN, where he has been since Jan 2015. He was a Senior Member of Technical Staff Research at AT&T Labs-Research, NJ (2010-2014), Adjunct Assistant Professor at Columbia University, NY (2013-2014), and VAJRA Adjunct Professor at IISc Bangalore (2018-2019). His current research interests are in communications and networking, cloud computing, and machine learning. Dr. Aggarwal received Princeton University’s Porter Ogden Jacobus Honorific Fellowship in 2009, the AT&T Vice President Excellence Award in 2012, the AT&T Key Contributor Award in 2013, the AT&T Senior Vice President Excellence Award in 2014, and Purdue University's Most Impactful Faculty Innovators Award in 2020. He also received the 2017 Jack Neubauer Memorial Award recognizing the Best Systems Paper published in the IEEE Transactions on Vehicular Technology, and the 2018 Infocom Workshop HotPOST Best Paper Award. He has over 220 peer-reviewed journal and conference publications in top-tier venues including IEEE Transactions on Information Theory, IEEE/ACM Transactions on Networking, IEEE JSAC, Journal of Machine Learning Research, ACM Transactions on Modeling and Performance Evaluation of Computing Systems, IEEE ISIT, IEEE ICDCS, ICCV, CVPR, AAAI, ALT, AISTATS, and ACM Multimedia. He was on the Editorial Board of IEEE Transactions on Green Communications and Networking from 2017 to 2020. He is currently on the Editorial Board of the IEEE Transactions on Communications and the IEEE/ACM Transactions on Networking. He has been on the TPC of top-tier conferences including ISIT, Sigmetrics, AAAI, and Mobihoc, and was PC co-chair for NAS 2019 and SDDCS 2017. He has also been a co-organizer of workshops at NeurIPS 2020, IJCAI 2020, and Infocom 2021.
Tian Lan
Tian Lan received his Ph.D. from the Department of Electrical Engineering at Princeton University in 2010, his M.S. from the Department of Electrical and Computer Engineering at the University of Toronto in 2005, and his B.A.Sc. in Electrical Engineering from Tsinghua University in 2003. He is currently an associate professor in the Department of Electrical and Computer Engineering at George Washington University, which he joined in 2010. His current research interests are in network optimization, machine learning, and network security. His work on the fairness of multi-resource allocation received the IEEE INFOCOM Best Paper Award in 2012. He also received the Securecomm Best Paper Award in 2019, the GWU SEAS Faculty Recognition Award in 2018, the GWU Hegarty Faculty Innovation Award in 2017, the AT&T VURI Award in 2014, the IEEE GLOBECOM Best Paper Award in 2009, and the IEEE Signal Processing Society Best Paper Award in 2008. He has over 100 peer-reviewed journal and conference publications in top-tier venues including IEEE Transactions on Parallel and Distributed Systems, IEEE/ACM Transactions on Networking, IEEE Transactions on Mobile Computing, ACM Transactions on Modeling and Performance Evaluation of Computing Systems, IEEE ISIT, IEEE ICDCS, IEEE INFOCOM, and IEEE MobiCom. He has been on the TPC of top-tier conferences including INFOCOM and ICDCS, and has served as a publications/web/industrial chair for conferences such as LANMAN, SECON, CCGrid, and PAC.
Parimal Parag
Parimal Parag received the B.Tech. and M.Tech. dual degree from the Indian Institute of Technology, Madras, India, in fall 2004, and the Ph.D. degree in fall 2011 from Texas A&M University, College Station, TX, USA, all in electrical engineering. He is currently an assistant professor in the department of electrical communication engineering at the Indian Institute of Science, Bangalore, KA, India, where he has been since Dec 2014. He is also a co-convenor of the Centre for networked intelligence, a faculty participant at the Robert Bosch centre for cyber-physical systems, and a member of the applied probability research group. He was a senior system engineer in corporate R&D at ASSIA Inc, Redwood City, CA, from Oct 2011 to Nov 2014. He was at Stanford University, CA, USA, and Los Alamos National Laboratory, NM, USA, in the autumn of 2010 and the summer of 2007, respectively. His current research interests are in network theory, applied probability, optimization methods, and their applications to distributed networked systems such as cloud storage and computation. His previous work includes performance evaluation, monitoring, and control of large broadband communication systems and networks. His other research interests lie in the areas of game theory, statistical signal processing, queueing theory, information theory, estimation & detection theory, combinatorics, and probability theory. Parimal received the SERB early career award in 2017, a graduate fellowship from Texas A&M University in 2014, a silver medal for academic excellence from IIT Madras in 2003, and the Indian National Talent Search Scholarship in 1996. He was a co-author of the student best paper award publication at the IEEE International Symposium on Information Theory 2018. He has several peer-reviewed journal and conference publications in top-tier venues including IEEE Transactions on Information Theory, IEEE/ACM Transactions on Networking, IEEE Journal on Selected Areas in Communications, IEEE Transactions on Communications, IEEE ISIT, IEEE ITW, IEEE INFOCOM, IEEE ICASSP, and IEEE VTC.