Deep Learning has emerged as a singularly critical technology for enabling human-like intelligence in online services such as Azure, Office 365, Bing, Cortana, Skype, and other high-value scenarios at Microsoft. While Deep Neural Networks (DNNs) have enabled state-of-the-art accuracy in many intelligence tasks, they are notoriously expensive and difficult to deploy in hyperscale datacenters constrained by power, cost, and latency. Furthermore, the escalating (and insatiable) demand for DNNs arrives at an inopportune time, as ideal silicon scaling (Moore’s Law) is drawing to a close.
At Microsoft, we have developed a new cloud architecture enhanced with a post-CPU technology, the Field Programmable Gate Array (FPGA). FPGAs can be viewed as programmable silicon and are being deployed into every new server in Microsoft’s hyperscale infrastructure. The flexibility of FPGAs, combined with a novel Hardware-as-a-Service (HaaS) architecture, unlocks the full potential of a completely programmable hardware and software acceleration plane.
In this talk, I’ll give a history and overview of the project, discuss the key enabling technologies behind our enhanced cloud, present opportunities to harness this technology for accelerated deep learning, and conclude with directions for future work.
Discovery Building, Orchard View Room