The computing industry has a power problem: the days of ideal transistor scaling are over, and chips now contain more devices than can be powered simultaneously, limiting performance. New architecture-level solutions are needed to continue scaling performance, and specialized hardware accelerators are one such solution. While accelerators promise orders-of-magnitude gains in performance per watt, several challenges have limited their wide-scale adoption. Deep learning has emerged as a proving ground for hardware acceleration: with its extremely regular compute patterns and widespread use, if accelerators can't work here, there's little hope elsewhere. For accelerators to be a viable solution, they must enable computation that cannot be done today and demonstrate mechanisms for performance scaling, so that they are not a one-off solution.

This talk will present deep learning algorithm-hardware co-designs that address these challenges, and will identify the efficiency gap between standard hardware design practices and full-stack co-design, enabling deep learning to be used with few restrictions. To push the efficiency limits further, the talk will introduce principled unsafe optimizations: optimizations that change how a program executes without impacting accuracy. By breaking the contract between the algorithm, architecture, and circuits, efficiency can be greatly improved.

To conclude, future research directions centered on hardware specialization will be presented: accelerator-centric architectures and privacy-preserving cloud computing.
March 13 @ 12:30 pm (1h)
Discovery Building, Orchard View Room