Tutorial: Private Computation with Applications to Machine Learning

Session chair(s): Qiaosheng Zhang

Presenters

Anoosheh Heidarzadeh, Texas A&M University

Alex Sprintson, Texas A&M University

Abstract

With the advent of cloud computing, many different types of our data, such as medical records, financial transactions, and access logs, are stored on remote servers across the globe. This data is constantly used by a broad range of applications that perform computation remotely for a variety of purposes, including financial reports, statistical analysis, and business analytics. Many of these applications leverage Machine Learning (ML) algorithms for data analysis and decision making. In such applications, it is critical to hide from the cloud operators the identity of the data items used in the computation. We refer to this notion of privacy as data access privacy. Data access privacy guarantees to the data owners that the cloud provider does not know whether their data is used by a given computing application. We refer to the process of computation with private data access as private computation. The objective of private computation is to minimize the amount of data that needs to be downloaded from the remote provider(s) while guaranteeing data access privacy. Private computation generalizes the classical Private Information Retrieval (PIR) problem by enabling applications to perform remote computation on the data, and it is generally more efficient than first retrieving the data privately (using a PIR scheme) and then performing the computation on the retrieved data.

Private computation is a fascinating topic, with many breakthrough results reported in recent years by the information and coding theory community. That said, many interesting problems in this field remain open, providing fertile ground for future research. The goal of this tutorial is to provide a comprehensive survey of the broad range of private computation problems. We discuss state-of-the-art achievability schemes and converse proof techniques, as well as open research problems. First, we will discuss the Private Linear Computation (PLC) problem and the Private Linear Transformation (PLT) problem, whose goals are to compute a single linear combination and multiple linear combinations of a subset of data items, respectively. These problems are motivated by several ML applications, including linear transformations for dimensionality reduction and the training of linear models for regression or classification. We will discuss several variations of the PLC and PLT problems, which include different types of data access privacy, the presence of side information, the single-provider and multiple-provider scenarios, and the presence of colluding providers. We will also discuss general private computation scenarios and the related open problems.
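To make the PLC setting concrete, the following is a minimal toy sketch and not a scheme taken from the tutorial. It assumes two non-colluding providers that each hold a full copy of the data, with each data item represented as a single symbol of a prime field. The user hides the coefficient vector (and therefore which items are used) by additively masking it with a uniformly random vector. The function names, the modulus, and the sample data are all hypothetical choices made for illustration. The sketch downloads two symbols to recover one, and it makes no attempt to minimize the download cost that the tutorial is concerned with.

```python
import secrets

# Illustrative prime modulus (2^31 - 1); the field size is an assumption.
P = 2_147_483_647


def make_queries(coeffs, p=P):
    """Split the private coefficient vector into two queries.

    Each query on its own is uniformly random, so a single provider learns
    nothing about the coefficients or about which items are used.
    """
    mask = [secrets.randbelow(p) for _ in coeffs]
    query_1 = mask
    query_2 = [(m + c) % p for m, c in zip(mask, coeffs)]
    return query_1, query_2


def provider_answer(data, query, p=P):
    """What each provider computes: the inner product of its data and the query."""
    return sum(d * q for d, q in zip(data, query)) % p


def recover(answer_1, answer_2, p=P):
    """The mask cancels, leaving the desired linear combination modulo p."""
    return (answer_2 - answer_1) % p


# Hypothetical data replicated at both providers.
data = [12, 7, 30, 5, 19]
# The user privately wants 3 * data[1] + 1 * data[3].
coeffs = [0, 3, 0, 1, 0]

query_1, query_2 = make_queries(coeffs)
result = recover(provider_answer(data, query_1), provider_answer(data, query_2))
```

Here `result` equals the requested combination of items 1 and 3, reduced modulo `P`. If the two providers compare their queries, the difference reveals the coefficients, which is why settings with colluding providers or a single provider call for different schemes.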

Biography

Anoosheh Heidarzadeh

Anoosheh Heidarzadeh is a Visiting Assistant Professor in the Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX, hosted by Prof. Alex Sprintson and Prof. Krishna Narayanan. Dr. Heidarzadeh received his Ph.D. degree in Electrical and Computer Engineering from Carleton University, Ottawa, ON, in 2012. He was a Postdoctoral Research Fellow at the California Institute of Technology, Pasadena, CA, from 2013 to 2014, working under the supervision of Prof. Michelle Effros and Prof. Tracey Ho, and from 2015 to 2017 he was an Associate Research Scientist at Texas A&M University, College Station, TX. His research interests include private computation, coded distributed computing, group testing, and compressed sensing.

Alex Sprintson

Alex Sprintson is a faculty member in the Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX. He is currently serving as a rotating program director at the US National Science Foundation (NSF). Dr. Sprintson received his Ph.D. degree in Electrical Engineering from the Technion in 2003. From 2003 to 2005, he was a Postdoctoral Research Fellow at the California Institute of Technology, Pasadena, CA. His research interests include communication networks, distributed storage systems, security and privacy, wireless communication and networking, network coding and its applications, and software-defined networking. Dr. Sprintson received the Wolf Award for Distinguished Ph.D. students, the Viterbi Postdoctoral Fellowship, the TAMU College of Engineering Outstanding Contribution Award, the NSF CAREER award, and the 2021 NSF Director’s Award for Superior Accomplishment. From 2013 to 2019, he served as an Associate Editor of the IEEE Transactions on Wireless Communications. He has been a member of the Technical Program Committee for IEEE INFOCOM 2006-2021.