Teaching
- STAT 554, Categorical Data Analysis, Fall 2014/Fall 2015/Fall 2017
- STAT/IST 557, Data Mining, Fall 2011/Fall 2012/Spring 2014/Spring 2015/Fall 2016/Spring 2017
- STAT/MATH 415, Introduction to Mathematical Statistics, Fall 2013/Spring 2016/Fall 2017/Fall 2019/Fall 2020/Spring 2022/Fall 2022
- STAT 897D, Applied Data Mining, Fall 2012
Research Interests
- Bayesian methods;
- Data mining;
- Applications in health, environmental and social sciences.
- Current projects include studies of the HIV/AIDS epidemic, air pollution, size estimation of hard-to-reach populations, and text mining.
Refereed Publications [+PhD or Postdoc trainee] [*corresponding author]
- Bao, L.*, Niu, X., Mahy M. and Ghys P.D. (2023) Estimating HIV Epidemics for Sub-national Areas. To appear in Annals of Applied Statistics. arXiv:1508.06618.
- Laga, I.+, Bao, L., and Niu, X. (2023) A Correlated Network Scale-up Model: Finding the Connection Between Subpopulations. Journal of the American Statistical Association. https://doi.org/10.1080/01621459.2023.2165929
- Laga, I.+, Niu, X., and Bao, L.* (2023) Mapping the number of female sex workers in countries across sub-Saharan Africa. Proceedings of the National Academy of Sciences. 120 (2) e2200633120. https://doi.org/10.1073/pnas.2200633120
- Sheng B.+, Li, C.+, Bao, L.* & Li, R. (2022) Probabilistic HIV Recency Classification -- A Logistic Regression without Labeled Individual Level Training Data. To appear in Annals of Applied Statistics. arXiv:2104.05150.
- Bao, L.*, Zhang, Y.+ and Niu, X. (2022) What Can We Learn from the Travelers Data in Detecting Disease Outbreaks--A Case Study of the COVID-19 Epidemic. Annals of Epidemiology. 75: 67-72. https://doi.org/10.1016/j.annepidem.2022.09.005
- Parsons, J.+, Niu, X. & Bao, L.* (2022) A Bayesian hierarchical modeling approach to combining multiple data sources: A case study in size estimation. Annals of Applied Statistics. 16(3): 1550-1562.
- Li, X.+, Zhang, A.+, Al-Zaidy, R., Baral, S., Bao, L.* and Giles, C. L. (2022) Automating document classification with distant supervision to increase the efficiency of systematic reviews: A case study on identifying studies with HIV impacts on female sex workers. PLOS ONE. doi.org/10.1371/journal.pone.0270034
- Bao L., Li, C.+, Li, R., & Yang S.+ (2022) Causal Structural Learning on MPHIA Individual Dataset. Journal of the American Statistical Association. 117.540, 1642-1655; https://doi.org/10.1080/01621459.2022.2077209
- VanEvery H.+, Yang W., Su J., Olsen N., Bao L., Lu B., Wu S., Cui L., Gao X. (2022) Low density lipoprotein cholesterol and risk of rheumatoid arthritis: a prospective study. Nutrients, 14(6), 1240; https://doi.org/10.3390/nu14061240.
- Parsons, J.+ and Bao, L.* (2021) A Unified Approach for Outliers and Influential Data Detection – The Value of Information in Retrospect. Stat. doi.org/10.1002/sta4.442
- Laga I.+, Bao L., & Niu X. (2021) Thirty Years of The Network Scale-up Method. Journal of the American Statistical Association. 116:535, 1548-1559; https://doi.org/10.1080/01621459.2021.1935267
- Laga I.+, Niu X., & Bao L.* (2021) Modeling the Marked Presence-only Data: A Case Study of Estimating the Female Sex Worker Size in Malawi. Journal of the American Statistical Association. 117.537, 27-37; https://doi.org/10.1080/01621459.2021.1944873
- VanEvery H.+, Yang W., Olsen N., Bao L., Lu B., Wu S., Cui L., Gao X. (2021) Alcohol consumption and risk of rheumatoid arthritis: a prospective study. Nutrients, 13(7), 2231; https://doi.org/10.3390/nu13072231.
- Wu Z., Huang Z., Lichtenstein A., Liu Y., Chen S., Jin Y., Na M., Bao L., Wu S. and Gao X. (2021) The Risk of Ischemic Stroke and Hemorrhagic Stroke in Chinese adults with low density lipoprotein cholesterol concentrations < 70 mg/dL. BMC Medicine, 19(1):142. doi: 10.1186/s12916-021-02014-4.
- Niu, X. M., Rao, A., Chen, D.+, Sheng, B.+, Weir, S., Umar, E., ... & Bao, L.* (2020). Using factor analyses to estimate the number of female sex workers across Malawi from multiple regional sources. Annals of Epidemiology, 55, 34-40. https://doi.org/10.1016/j.annepidem.2020.12.001
- Parsons, J.+, Niu, X., & Bao, L.* (2020). Evaluating the relative contribution of data sources in a Bayesian analysis with the application of estimating the size of hard to reach populations. Statistical Communications in Infectious Diseases, 12(s1): 20190020; https://doi.org/10.1515/scid-2019-0020.
- Sheng B.+, Eaton J., Mahy M. and Bao L.* (2020). Comparison of HIV Prevalence Among Antenatal Clinic Attendees Estimated from Routine Testing and Unlinked Anonymous Testing, Statistics in Biosciences, 12: 279-294; https://doi.org/10.1007/s12561-020-09265-4
- Eaton J., Brown T., Puckett R., Glaubius R., Mutai K., Bao L., Salomon J., Stover J., Mahy M., Hallett T. (2019). The Estimation and Projection Package Age-Sex Model and the r-hybrid model: new tools for estimating HIV incidence trends in sub-Saharan Africa, AIDS. 33: S235-S244.
- Datta A., Lin W., Rao A., Diouf D., Edwards J., Bao L., Louis T. and Baral S. (2018). Bayesian estimation of MSM population size in Cote d'Ivoire, Statistics and Public Policy. 6(1): 1-13. doi: 10.1080/2330443X.2018.1546634
- Huang S.+, Li J., Wu Y., Ranjbar S., Xing A., Zhao H., Wang Y., Shearer G. C., Bao L., Lichtenstein A. H., Wu S. and Gao X. (2018). Tea consumption and longitudinal change in high-density lipoprotein cholesterol concentration in Chinese adults, Journal of the American Heart Association. 7, 13, e008814.
- Cheng F.W.+, Gao X., Bao L., Mitchell D.C., Wood C., Sliwinski M.J., Smiciklas-Wright H., Still C.D., Rolston D.D.K., and Jensen G.L. (2017). Obesity as a risk factor for developing functional limitation among older adults: A conditional inference tree analysis. Obesity (Silver Spring). 25(7):1263-1269.
- Wu Z., Su X., Sheng H., Chen Y., Gao X., Bao L., Jin W. (2017) Conditional Inference Tree for Multiple Gene-Environment Interactions on Myocardial Infarction Among Chinese Men. Archives of Medical Research. doi.org/10.1016/j.arcmed.2017.12.001
- Eaton J. and Bao L. (2017). Accounting for non-sampling error in estimates of HIV epidemic trends from antenatal clinic sentinel surveillance. AIDS 31: S61-S68.
- Niu X., Zhang A.+, Brown T., Puckett R., Mahy M., Bao L.* (2017). Incorporation of hierarchical structure into EPP fitting with examples of estimating sub-national HIV/AIDS dynamics. AIDS 31: S51-S59.
- Sheng B.+, Marsh K., Slavkovic A.B., Simon Gregson, Eaton J., Bao L.* (2017). Statistical Models for Incorporating Data from Routine HIV Testing of Pregnant Women at Antenatal Clinics into HIV/AIDS Epidemic Estimates. AIDS 31: S87-S94.
- Hunter D.R., Bao L., and Poss M. (2017). Assignment of Endogenous Retrovirus Integration Sites Using a Mixture Model. Annals of Applied Statistics 11(2): 751-770.
- Thomas J. and Bao L. (2016). Modeling the dynamics of an HIV epidemic. Dynamic Demographic Analysis. 91-144.
- Malhotra, R., Elleder, D., Bao, L., Hunter, D. R., Poss, M., Acharya, R. (2016). A pipeline for identifying integration sites of mobile elements in the genome using next-generation sequencing. Proceedings of the 8th International Conference on Bioinformatics and Computational Biology (BICOB). 63-69.
- Li R., Dudek S.M., Kim D., Hall M.A., Bradford Y., Peissig P.L., Brilliant M.H., Linneman J.G., McCarty C.A., Bao L., and Ritchie M.D. (2016) Identification of genetic interaction networks via an evolutionary algorithm evolved Bayesian Network. BioData Mining, 9(18) DOI: 10.1186/s13040-016-0094-4.
- Bao L.*, Raftery A.E., Reddy A. (2015) Estimating the sizes of populations at risk of HIV infection from multiple data sources using a Bayesian hierarchical model. Statistics and Its Interface. 8(2): 125-136.
- Bao L., Elleder D., Malhotra R., DeGiorgio M., Maravegias T., Horvath L., Carrel L., Gillin C., Hron T., Fabryova H., Hunter D. and Poss M. (2014) Computational and statistical analyses of insertional polymorphic endogenous retroviruses in a non-model organism. Computation. 2: 221-245.
- Bao L.*, Ye J., Hallett T.B. (2014) Incorporating incidence information within the UNAIDS estimation and projection Package framework: a study based on simulated incidence assay data. AIDS 28: S515-S522.
- Brown T., Bao L., Eaton J.W., Hogan D.R., Mahy M., Marsh K., Mathers B.M., Puckett R. (2014) Improvements in prevalence trend fitting and incidence estimation in EPP 2013. AIDS 28: S415-S425.
- Kamath P., Elleder D., Bao L., Cross P., Poss M. (2013) The population history of endogenous retroviral elements in mule deer (Odocoileus hemionus). Journal of Heredity, 105: 173-187.
- Bao L. (2012) A new infectious disease model for estimating and projecting HIV/AIDS epidemics. Sexually Transmitted Infections, 88: i58-i65.
- Bao L.*, Salomon J.A., Brown T., Raftery A.E., and Hogan D.R. (2012) Modelling national HIV/AIDS epidemics: revised approach in the UNAIDS estimation and projection package 2011. Sexually Transmitted Infections, 88: i3-i10.
- Clark S.J., Thomas J., and Bao L. (2012) Estimates of age-specific reductions in HIV Prevalence in Uganda: Bayesian melding estimation and probabilistic population forecast with an HIV-enabled cohort component projection model. Demographic Research 27: 743-774.
- Meila M.P. and Bao L. (2010) An exponential model for infinite rankings. Journal of Machine Learning Research, 11: 3481-3518.
Technical report 529, Technical report 524
- Raftery A.E. and Bao L. (2010) Estimating and projecting trends in HIV/AIDS generalized epidemics using incremental mixture importance sampling. Biometrics, 66: 1162-1173.
Technical report 560
- Bao L. and Raftery A.E. (2010) A stochastic infection rate model for estimating and projecting national HIV prevalence rates. Sexually Transmitted Infections. 86: ii93-ii99.
- Brown T., Bao L., Raftery A.E., Salomon J.A., Baggaley R.F., Stover J., and Gerland P. (2010) EPP 2009: bringing the UNAIDS estimation and projection package into the ART era. Sexually Transmitted Infections. 86: ii3-ii10.
- Bao L., Gneiting T., Grimit E., Guttorp P. and Raftery A.E. (2010) Bias correction and Bayesian model averaging for ensemble forecasts of surface wind direction. Monthly Weather Review. 138:1811-1821.
Technical report 557
- Bao L., Zhu, Z. and Ye, J. (2009) Modeling oncology gene pathways network with multiple genotypes and phenotypes via a copula method. IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology. 237-246.
- Meila M.P. and Bao L. (2008) Estimation and clustering with infinite rankings. Proceedings of the 24th Conference in Uncertainty in Artificial Intelligence. 393-402.
- Bao L., Gu H., Dunn, K.A. and Bielawski J. (2008) Likelihood Based Clustering (LiBaC) for codon models, a method for grouping sites according to similarities in the underlying process of evolution. Molecular Biology and Evolution. 25:1995-2007.
- Bao L., Gu H., Dunn K.A. and Bielawski J. (2007) Methods for selecting fixed-effect models for heterogeneous codon evolution, with comments on their application to gene and genome data. BMC Evolutionary Biology. 7 Suppl 1:S5.
- Mitnitski A, Bao L., and Rockwood K. (2007) A cross-national study of transitions in deficit counts in two birth cohorts: implications for modeling ageing. Experimental Gerontology. 42:241-246.
- Mitnitski A, Bao L., and Rockwood K. (2006) Going from bad to worse: a stochastic model of transitions in deficit accumulation, in relation to mortality. Mechanisms of Ageing and Development. 127: 490-493.
Contributed Software
- Spectrum/EPP: The Estimation and Projection Package is used to estimate and project adult HIV prevalence and incidence from surveillance data.
- IMIS: R package for Incremental Mixture Importance Sampling. Reference: Raftery and Bao (2010) Biometrics.
- SizeEstimation: R package for estimating the size of hidden populations from multiple data sources. Reference: Bao, Raftery and Reddy (2015) Statistics and Its Interface.
- Codeml_FE: A modified version of Ziheng Yang's codeml that implements 11 new fixed-effects codon models. Reference: Bao, Gu, Dunn and Bielawski (2007) BMC Evolutionary Biology.
- LiBaC: The primary use is to identify positively selected sites when the process of evolution is highly heterogeneous among sites. Reference: Bao, Gu, Dunn and Bielawski (2008) Molecular Biology and Evolution.
Working Groups
- Key technical advisor to the UNAIDS Reference Group on HIV Estimates, Modelling and Projections, which advises on the methods for calculating international AIDS statistics.
- Core project team leader of the Diagnostics Modelling Consortium, which aims to use modelling to guide the effective use of diagnostic technologies in resource-poor settings.
Current students in my research group:
- Ian Laga - PhD Student - Statistics
- Saname Sanei - PhD Student - Statistics
- Ying Zhang - PhD Student - Statistics
- Wenlong Yang - PhD Student - Statistics
- Grant Thomas Hopkins, BS, Statistics
Former students in my research group:
- Ruowang Li, PhD, Bioinformatics, 2016
- Jingyi Ye, PhD, Statistics, 2017
- Lu Ou, Human Development and Family Studies, 2018
- Jacob Lee Parsons, PhD, Statistics, 2019
- Ben Sheng, PhD, Statistics, 2021
- Amy Zhang, PhD, Statistics, 2021
- Changcheng Li, postdoctoral fellow, 2019-2021
- Xiaoxiao Li - PhD Student - Statistics
- -------------------------------------
- Yuan Tang, BS, Mathematics, 2015
- Bangze Chen, BS, Statistics, 2016
- Haici Tan, BS, Statistics, 2017
- Zhiyuan Zhao, BA, Business, 2017
- Kaiyi Wu, BS, Mathematics, 2018
- Shanglun Li, BS, Statistics, 2018
- Caihui Xiao, BS, Statistics, 2018
- Jiajun Gao, BS, Statistics, 2018
- Jinjun Wang, BS, Statistics, 2019
- Qianyi Zhao, BS, Statistics, 2020
- Xinyi Yang, BS, Statistics, 2020
- Yiyang Wang, BS, Statistics, 2020
- Zhiyang Liang, BS, Statistics, 2021
- David Chen, BS, Statistics, 2021
updated on Feb 28 2022