Resource allocation for SGD via adaptive batch sizes
Stochastic gradient descent (SGD) approximates the objective function's gradient using a constant, typically small number of examples, known as the batch size in mini-batch SGD. Small batch sizes introduce significant gradient noise near the optimum. This work presents a method to grow the batch size adaptively with model …
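Since the abstract is truncated before the adaptive rule is stated, the following is only a minimal sketch of mini-batch SGD with a growing batch size. The geometric growth schedule, the function names, and all parameters (`b0`, `growth`, `b_max`) are placeholder assumptions, not the paper's method; they serve only to illustrate that larger batches reduce gradient noise late in training.

```python
import numpy as np

def minibatch_sgd_growing_batch(grad_fn, w0, n_examples, lr=0.1,
                                b0=8, growth=1.5, b_max=256, steps=50):
    """Mini-batch SGD whose batch size grows over training.

    `grad_fn(w, idx)` returns the average gradient over the examples
    indexed by `idx`. The geometric growth schedule below is a
    stand-in assumption, not the adaptive rule from the paper.
    """
    rng = np.random.default_rng(0)
    w = np.array(w0, dtype=float)
    b = float(b0)
    for _ in range(steps):
        # Sample a mini-batch; its size grows geometrically up to b_max.
        idx = rng.choice(n_examples, size=min(int(b), n_examples),
                         replace=False)
        w -= lr * grad_fn(w, idx)
        b = min(b * growth, b_max)  # larger batches -> less gradient noise
    return w

# Toy problem: minimize the mean of (w - x_i)^2, whose optimum is mean(x).
x = np.random.default_rng(1).normal(loc=3.0, scale=1.0, size=1000)
grad = lambda w, idx: 2.0 * np.mean(w - x[idx])
w_final = minibatch_sgd_growing_batch(grad, 0.0, len(x))
```

On this toy least-squares objective, `w_final` lands close to the data mean: early noisy small-batch steps make fast progress, and the growing batch damps the gradient noise near the optimum.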