Convertible Codes: Efficient Conversion of Coded Data in Large-scale Storage Systems

Abstract:
In large-scale data storage systems, failures are the norm in day-to-day operations. To protect data against such failures, erasure codes (a tool from coding theory) are used to store data redundantly. In this setting, a set of k data blocks is encoded using an [n, k] code to generate n blocks, which are then stored on distinct storage devices. In recent work, we showed that the failure rates of storage devices vary considerably over time, and that dynamically tuning the code parameters n and k provides a significant reduction in storage cost. However, traditional codes incur prohibitively high resource overheads when changing the code parameters of already encoded data.

Motivated by this application, in this talk, we:
1. Present a new theoretical framework to formalize the notion of “code conversion”: the process of converting data encoded using an [n, k] code into data encoded using a code with different parameters [n', k'], while maintaining desired decodability properties,
2. Introduce “convertible codes”, a new class of codes that enable resource-efficient conversion,
3. Prove tight bounds on two important metrics for code conversion: (a) the number of nodes accessed, and (b) the bandwidth consumed,
4. Present practical constructions of convertible codes for a broad range of parameters.
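
As a toy illustration of why conversion cost depends on the code, the sketch below merges two codewords of a [k+1, k] single-parity (XOR) code into one [2k+1, 2k] codeword by reading only the two old parity blocks, instead of re-reading all 2k data blocks. This is an assumed minimal example, not one of the constructions presented in the talk; the function names (xor_blocks, encode_single_parity, merge_convert, recover) are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the bytewise XOR of a list of equal-length byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def encode_single_parity(data_blocks):
    """Encode k data blocks with a [k+1, k] code: the data plus one XOR parity."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def merge_convert(codeword_a, codeword_b):
    """Merge two [k+1, k] codewords into one [2k+1, 2k] codeword.

    Only the two old parity blocks are read; the data blocks are reused as-is.
    """
    new_parity = xor_blocks([codeword_a[-1], codeword_b[-1]])
    return codeword_a[:-1] + codeword_b[:-1] + [new_parity]


def recover(codeword, lost_index):
    """Rebuild one lost block as the XOR of all surviving blocks."""
    survivors = [blk for i, blk in enumerate(codeword) if i != lost_index]
    return xor_blocks(survivors)


# Two [3, 2] codewords are converted into a single [5, 4] codeword.
first = encode_single_parity([b"ab", b"cd"])
second = encode_single_parity([b"ef", b"gh"])
merged = merge_convert(first, second)

# The converted codeword matches a from-scratch encoding of all four blocks,
# and any single lost block can still be rebuilt.
assert merged == encode_single_parity([b"ab", b"cd", b"ef", b"gh"])
assert recover(merged, 1) == b"cd"
```

Here the conversion accesses 2 blocks rather than 4. A naive approach that decodes and re-encodes would read every data block, which is the kind of overhead the access and bandwidth metrics above are meant to capture.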

Bio:
Rashmi Vinayak is an assistant professor in the Computer Science department at Carnegie Mellon University. Her research interests lie broadly in computer/networked systems and information/coding theory, and in the wide spectrum of intersections between the two areas. Her current focus is on fault tolerance and resource efficiency in data systems. Rashmi is a recipient of the NSF CAREER Award, the Tata Institute of Fundamental Research Memorial Lecture Award 2020, the Facebook Distributed Systems Research Award 2019, the Google Faculty Research Award 2018, the Facebook Communications and Networking Research Award 2017, and the UC Berkeley Eli Jury Award 2016 for “outstanding achievement in the area of systems, communications, control, or signal processing”. Her work has received the USENIX NSDI 2021 Community (Best Paper) Award, and the IEEE Data Storage Best Paper and Best Student Paper Awards for the years 2011/2012. Rashmi received her Ph.D. from UC Berkeley in 2016 and was a postdoctoral scholar at UC Berkeley’s AMPLab/RISELab from 2016 to 2017. During her Ph.D. studies, Rashmi was a recipient of the Facebook Fellowship 2012-13, the Microsoft Research PhD Fellowship 2013-15, and the Google Anita Borg Memorial Scholarship 2015-16.
Webpage: http://www.cs.cmu.edu/~rvinayak/

May 12, 2021

12:30 pm (1h)

Remote

Rashmi Vinayak
