The estimation of evolutionary trees (called phylogenies) is an essential step in biological research; however, large-scale phylogeny estimation remains computationally challenging. Many of the current leading methods are heuristics for NP-hard optimization problems, and these methods typically offer limited parallelism, which restricts their scalability to larger numbers of species. In this talk, I will present my recent work to address this challenge through the introduction of Disjoint Tree Merger (DTM) methods: NJMerge and its successor with improved parallel efficiency, TreeMerge. As I will show, both NJMerge and TreeMerge enable divide-and-conquer phylogeny estimation pipelines that are provably statistically consistent under stochastic models of evolution. Furthermore, in an experimental study, these divide-and-conquer pipelines achieved accuracy similar to that of the current leading methods while dramatically reducing memory usage and running time. I will conclude with a discussion of future research directions and related applications.
January 29, 12:30 pm (1h)
Discovery Building, Orchard View Room