This dissertation develops novel machine learning algorithms and systems for six problems spanning finance, autonomous driving, reinforcement learning, and text summarization. The first problem is portfolio optimization. Practitioners have long sought useful signals from the market to improve their financial judgments, and deep learning techniques have recently gained considerable traction in this area. Here, I develop a deep-learning-based natural language processing model that improves the traditional mean-variance (MV) optimization framework by leveraging social media text. At a high level, the method provides more accurate and robust estimates of the two most important inputs to MV optimization: the mean and the covariance.
The second problem is to improve a scene-centric behavior prediction model for real-world autonomous vehicles. Here, I develop a first knowledge distillation framework that improves a student model, using an expert model, for multi-hypothesis trajectory prediction.
The third problem involves multi-agent zero-sum games in the context of reinforcement learning (RL). Here, I develop a self-supervised learning method (predicting the opponent’s next move) that enhances the representations learned by RL agents during self-play.
The fourth problem concerns a long-standing fundamental issue in Q-learning. The traditional Bellman update, when combined with large function approximators (e.g., deep neural networks), is vulnerable to delusional bias, which leads to learning instability and non-optimality. Here, I develop a novel and scalable algorithmic framework known as ConQUR to mitigate the effects of delusional bias in the deep Q-learning setting.
In the fifth problem, I develop a novel PoBRL (Policy Blending with RL) framework that blends two RL policies for multi-document summarization.
In the sixth problem, I consider deployment-constrained RL. This problem is valuable because neither purely online nor purely offline RL is practical in many real-world settings. I propose an uncertainty-regularized optimization method that biases the policy learning process toward high-confidence regions.
To summarize, this dissertation provides algorithmic solutions that make machine learning systems more efficient and effective for many important modern-day applications.
Join Zoom Meeting: https://princeton.zoom.us/j/98371893738