The primary goal of my research is to develop statistical methods and computational tools to solve real biomedical problems. I am particularly interested in statistical methods for high-dimensional genomics data with cell-type heterogeneity, clustered structure, and missing values. My research spans cell type deconvolution, single-cell RNA sequencing data, mixed-effects models, machine learning methods, missing data, and proteomics data.
I have collaboration experience in areas such as Alzheimer's disease, psychiatry, asthma, and breast cancer. I am familiar with data analysis pipelines for whole genome/exome sequencing, bulk/single-cell RNA sequencing, DNA methylation, proteomics, etc.
2017 | University of Chicago, Chicago, IL | PhD, Biostatistics
2012 | Renmin University of China, Beijing, China | Master, Statistics
2010 | Renmin University of China, Beijing, China | Bachelor, Statistics
+: corresponding author
*: co-first author
_: PhD advisee
- Wang, J., Gamazon, E. R., Pierce, B. L., Stranger, B. E., Im, H. K., Gibbons, R. D., Cox, N. J., Nicolae, D. L., Chen, L. S. (2016). Imputing gene expression in uncollected tissues within and beyond GTEx. American Journal of Human Genetics, 98(4), 697-708.
- Chen, L. S., Wang, J., Wang, X., Wang, P. (2017). A mixed-effects model for incomplete data from labeling-based quantitative proteomics experiments. Annals of Applied Statistics, 11(1), 114-138.
- Yang, F., Wang, J., the GTEx consortium, Pierce, B. L., Chen, L. S. (2017). Identifying cis-mediators for trans-eQTLs across many human tissues using genomic mediation analysis. Genome Research, 27(11), 1859-1871.
- Wang, J.*, Liu, Q.*, Pierce, B. L., Huo, D., Olopade, O. I., Ahsan, H., Chen, L. S. (2018). A meta-analysis approach with filtering for identifying gene-level gene-environment interactions. Genetic Epidemiology, 42(5), 434-446.
- Wang, J., Wang, P., Hedeker, D., Chen, L. S. (2019). Using multivariate mixed-effects selection models for analyzing batch-processed proteomics data with non-ignorable missingness. Biostatistics, 20(4), 648-665.
- Gibbons, R. D., Hur, K., Lavigne, J., Wang, J., Mann, J. J. (2019). Medications and suicide: high dimensional empirical Bayes screening (iDEAS). Harvard Data Science Review, 1(2).
- Wang, J., Devlin, B., Roeder, K. (2020). Using multiple measurements of tissue to estimate subject- and cell-type-specific gene expression. Bioinformatics, 36(3), 782-788.
- Satterstrom, F. K.*, Kosmicki, J. A.*, Wang, J.*, ..., Devlin, B., Sanders, S. J., Roeder, K., Daly, M. J., Buxbaum, J. D. (2020). Large-scale exome sequencing study implicates both developmental and functional changes in the neurobiology of autism. Cell, 180(3), 568-584.
- Chen, S.*, Wang, J.*, Cicek, E., Roeder, K., Yu, H., Devlin, B. (2020). De novo missense variants disrupting protein-protein interactions affect risk for autism through gene co-expression and protein networks in neuronal cell types. Molecular Autism, 11(1), 76.
- Tian, J._, Wang, J., Roeder, K. (2021). ESCO: single cell expression simulation incorporating gene co-expression. Bioinformatics, 37(16), 2374-2381.
- Yang, F., Gleason, K. J., Wang, J., Duan, J., He, X., Pierce, B. L., Chen, L. S. (2021). CCmed: Cross-condition mediation analysis for identifying replicable trans-associations mediated by cis-gene expression. Bioinformatics, 37(17), 2513-2520.
- Wang, J.+, Roeder, K.+, Devlin, B.+ (2021). Bayesian estimation of cell type-specific gene expression with prior derived from single-cell data. Genome Research, 31(10), 1807-1818.
- Qiu, Y., Wang, J., Lei, J., Roeder, K. (2021). Identification of cell-type-specific marker genes from co-expression patterns in tissue samples. Bioinformatics, 37(19), 3228-3234.
- Cai, M._, Yue, M., Chen, T., Liu, J., Forno, E., Lu, X., Billiar, T., Celedon, J., McKennan, C., Chen, W.+, Wang, J.+ (2022). Robust and accurate estimation of cellular fraction from tissue omics data via ensemble deconvolution. Bioinformatics, 38(11), 3004-3010.