Predictions in modern statistical inference problems can be increasingly understood in terms of discrete structures such as arrangements of objects in computer vision, phonemes in speech recognition, parses in natural language processing, or molecular structures in computational biology. For example, in image scene understanding one needs to jointly predict discrete semantic labels for every pixel, e.g., whether it describes a person, bicycle, bed, etc. In a fully probabilistic treatment, all possible alternative assignments are considered thus requiring to estimate exponentially many structures with their respective weights. To relax the exponential complexity we describe two different approaches: Dual decomposition (e.g., convex belief propagation) and predictions under random perturbations. The second approach leads us to a new approximate inference framework that is based on max-statistics which capture long-range interactions, contrasting the current framework of dual decomposition that relies on pseudo-probabilities. We demonstrate the effectiveness of our approaches on different complex tasks, outperforming the state-of-the-art results in scene understanding, depth estimation, semantic segmentation and phoneme recognition.
February 12, 2013
12:30 pm (1h)
Discovery Building, Orchard View Room
Tamir Hazan