Mixture models have found numerous applications in complex machine learning scenarios where a single model is insufficient to explain heterogeneous data or make meaningful predictions. The existing literature on algorithms with rigorous theoretical guarantees, however, has mostly focused on simple models, such as mixed linear regression or Gaussian mixture models. In this thesis, we contribute to this field by studying two richer mixture models.
We first study the problem of learning a mixture of low-rank models, i.e. reconstructing multiple low-rank matrices from unlabeled linear measurements of each. This problem enriches both mixed linear regression and low-rank matrix sensing by jointly accounting for latent variables (i.e. the unknown labels) and structural priors (i.e. the low-rank structures). We develop an algorithm that provably recovers the unknown matrices with near-optimal sample and computational complexities under Gaussian designs; in addition, the proposed method is stable against random noise, achieving near-optimal estimation error.
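To make the measurement model concrete, here is a minimal simulation sketch (not the thesis's algorithm) of unlabeled linear measurements of a mixture of low-rank matrices under a Gaussian design: each observation is an inner product of a random Gaussian matrix with one of K hidden rank-r matrices, plus noise, and the label of which matrix was measured is never revealed. All sizes and variable names below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative problem sizes (assumptions, not from the thesis)
d1, d2, r, K, n = 8, 6, 2, 3, 500

# K ground-truth rank-r matrices M_1, ..., M_K
M = [rng.standard_normal((d1, r)) @ rng.standard_normal((r, d2))
     for _ in range(K)]

# For each measurement i: draw a hidden label z_i and a Gaussian design A_i,
# then observe y_i = <A_i, M_{z_i}> + noise. The label z_i is NOT observed.
labels = rng.integers(0, K, size=n)       # latent: unknown to the learner
A = rng.standard_normal((n, d1, d2))      # Gaussian design matrices
noise = 0.01 * rng.standard_normal(n)
y = np.array([np.sum(A[i] * M[labels[i]]) for i in range(n)]) + noise

print(y.shape)                          # n unlabeled scalar measurements
print(np.linalg.matrix_rank(M[0]))      # each ground-truth matrix has rank r
```

The learner sees only the pairs (A_i, y_i); recovering all K matrices from such data is what distinguishes this problem from ordinary low-rank matrix sensing.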
The second problem we study is learning a mixture of multiple linear dynamical systems (LDSs) from unlabeled short sample trajectories, each generated by one of the LDS models. The technical challenges include (1) the presence of latent variables (i.e. the unknown labels of the trajectories), (2) the possibility that sample trajectories are much shorter than the dimension of the LDS models, and (3) the complicated temporal dependence inherent to time-series data. To tackle these challenges, we develop a two-stage meta-algorithm that provably and efficiently clusters the trajectories and recovers each ground-truth LDS model at the parametric rate. We validate our theory with numerical experiments, confirming the efficacy and practical relevance of the proposed algorithm.
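The data-generating setup described above can be sketched as follows: a small simulation (illustrative only, with assumed sizes and noise model) that draws short trajectories from K stable linear dynamical systems x_{t+1} = A_k x_t + w_t, with the system label of each trajectory hidden from the learner. Note the trajectory length T is deliberately smaller than the state dimension d, so no single trajectory can identify its own system.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative sizes: K systems of dimension d, trajectories of length T < d
K, d, T, n_traj = 2, 10, 5, 20

# K stable ground-truth state-transition matrices (spectral radius ~ 0.5)
A = [0.5 * rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(K)]

def sample_trajectory(Ak):
    """Roll out x_{t+1} = A_k x_t + w_t with standard Gaussian process noise."""
    x = np.zeros((T + 1, d))
    x[0] = rng.standard_normal(d)
    for t in range(T):
        x[t + 1] = Ak @ x[t] + rng.standard_normal(d)
    return x

labels = rng.integers(0, K, size=n_traj)   # latent: not seen by the learner
trajectories = [sample_trajectory(A[z]) for z in labels]

print(trajectories[0].shape)   # (T+1, d): too short to identify A_k alone
```

A two-stage approach of the kind described in the abstract would first group such trajectories by system and then pool each cluster's data to estimate its transition matrix; the sketch above only produces the unlabeled input to such a procedure.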
Zoom Meeting: https://princeton.zoom.us/j/8329071779