Posts by Collection


Tax Intern

Data Science Intern

Research Intern

Research Assistant

Research Assistant


Learning a Decision Module by Imitating Driver’s Control Behaviors
Junning Huang*, Sirui Xie*, Jiankai Xun, Qiurui Ma, Chunxiao Liu, Bolei Zhou
The Conference on Robot Learning (CoRL), 2020
[paper] [project page] [code]

We propose a hybrid framework to learn neural decisions in the classical modular pipeline through end-to-end imitation learning. This hybrid framework preserves the merits of the classical pipeline, such as the strict enforcement of physical and logical constraints, while learning complex driving decisions from data.

Evaluating Strategy Exploration in Empirical Game-Theoretic Analysis
Yongzhao Wang*, Qiurui Ma*, Michael Wellman
Working Paper

In Empirical Game-Theoretic Analysis (EGTA), game models are iteratively extended to include the Nash Equilibrium of the underlying true games. The Strategy Exploration process dictates which new strategies to add to the game models next, based on the currently available information. We investigate the methodological considerations in evaluating different strategy exploration processes in EGTA and highlight a consistency criterion that past literature violates.


Double Q Learning for Long-Short Derivatives Trading
Qiurui Ma


In this project, we apply double Q-learning to long and short trading on twenty years of oil derivatives data. My work involved first scraping 20 years of oil derivative data from Bloomberg and Yahoo Finance, then implementing a support-resistance line visualization tool to aid analysis and feature engineering, and finally implementing a Double DQN module that goes long or short on the derivative, with its performance beating the benchmark buy-and-hold strategy.
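As a rough illustration of the double Q-learning idea mentioned above, here is a minimal tabular sketch. It is not the project's Double DQN implementation: the two-action set, the dictionary-based Q-tables, the state labels, and the function name `double_q_update` are all assumptions made for illustration.

```python
import random

# Hypothetical action set for illustration: go long or go short.
ACTIONS = ("long", "short")


def double_q_update(q_a, q_b, state, action, reward, next_state,
                    alpha=0.1, gamma=0.99, rng=random):
    """Apply one tabular double Q-learning update in place and return the target.

    q_a and q_b are dicts mapping (state, action) to a value estimate.
    """
    # Randomly pick which table learns; the other one evaluates.
    if rng.random() < 0.5:
        learner, evaluator = q_a, q_b
    else:
        learner, evaluator = q_b, q_a

    # Select the greedy next action with the learner table...
    best_next = max(ACTIONS, key=lambda a: learner.get((next_state, a), 0.0))
    # ...but score it with the evaluator table, which reduces the
    # overestimation bias of ordinary Q-learning.
    target = reward + gamma * evaluator.get((next_state, best_next), 0.0)

    old = learner.get((state, action), 0.0)
    learner[(state, action)] = old + alpha * (target - old)
    return target
```

Decoupling action selection from action evaluation is the core of the technique; a Double DQN applies the same split using an online network and a target network instead of two tables.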

Uncertainty-Aware Model-Based Reinforcement Learning in Autonomous Driving using PILCO
Qiurui Ma*, Sirui Xie*

[For IP reasons, please contact me for the detailed design and implementation]

In this study, we bring uncertainty estimation to model-based RL for autonomous driving. The model is parameterized by a Bayesian neural network to approximate PILCO, and dropout is used to estimate the uncertainty. We further train a multilayer perceptron as a controller, whose gradients can flow through the model network. We demonstrate that our model can output the uncertainty of its predictions and can navigate safely in complex environments.
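Since the design itself is not public, the following is only a generic sketch of dropout-based uncertainty estimation (often called Monte Carlo dropout), not the method used in this study. The one-hidden-layer scalar network, its weights, and the function name `mc_dropout_predict` are hypothetical.

```python
import random
import statistics


def mc_dropout_predict(x, w_hidden, w_out, drop_prob=0.5, n_samples=200, seed=0):
    """Return (mean, std) of the output over repeated stochastic forward passes.

    The toy network maps a scalar x through ReLU hidden units to a scalar output.
    """
    rng = random.Random(seed)
    keep = 1.0 - drop_prob
    outputs = []
    for _ in range(n_samples):
        y = 0.0
        for w_h, w_o in zip(w_hidden, w_out):
            h = max(0.0, w_h * x)       # ReLU hidden unit
            if rng.random() < keep:     # keep the unit with probability `keep`
                y += w_o * h / keep     # inverted-dropout scaling
        outputs.append(y)
    # The spread across passes serves as an uncertainty estimate.
    return statistics.fmean(outputs), statistics.pstdev(outputs)
```

Keeping dropout active at prediction time turns a single network into an implicit ensemble: the mean over passes is the prediction, and the standard deviation indicates how much the sampled sub-networks disagree.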

TCA-TWAS: Identification of Cell-Type-Specific Genetic Regulation of Gene Expression for Transcriptome-Wide Association Studies
Qiurui Ma*, Duo Zhang*, Brandon Jew, Sriram Sankararaman

[code] [poster] [presentation] [report on data simulation]

In this study, we deconvolve bulk-level gene expression into cell-type-specific gene expression with cell-type weights using Bayesian models, circumventing the centrifugation that traditional methods require to acquire cell-type-specific gene expression. We then associate the cell-type-specific gene expression with phenotypes on UK Biobank blood tissue data.