Statistical Data Analysis and Data Science Projects

  • Analyzing Longitudinal Growth Using Mixed-Effects Models
    Graph showing longitudinal growth analysis
    Investigated growth trajectories in a seven-week strength training study across three treatment groups: weights-increasing, repetitions-increasing, and control. Fitted random intercept and slope models in R to test linear and quadratic trends in strength improvement. Selected an AR(1) covariance structure after comparing it with Unstructured, Compound Symmetry, and Toeplitz alternatives using AIC, BIC, and likelihood ratio tests. Identified significant group-specific trends and visualized the results by treatment group.
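
    The covariance-structure comparison can be illustrated with a minimal Python sketch (the project itself used R). It counts covariance parameters for each candidate structure and defines the AIC and BIC formulas. The function names are hypothetical, the counts assume homogeneous variances and seven weekly measurements, and software packages differ on which sample size n enters BIC.

```python
import math


def n_cov_params(structure: str, t: int) -> int:
    """Covariance parameters for t repeated measures (homogeneous variances assumed)."""
    counts = {
        "unstructured": t * (t + 1) // 2,  # every variance and covariance is free
        "compound_symmetry": 2,            # one variance, one common covariance
        "ar1": 2,                          # one variance, one lag-1 correlation
        "toeplitz": t,                     # one variance plus one parameter per lag
    }
    return counts[structure]


def aic(loglik: float, k: int) -> float:
    """Akaike information criterion; smaller is better."""
    return 2 * k - 2 * loglik


def bic(loglik: float, k: int, n: int) -> float:
    """Bayesian information criterion; n is the sample size used by the software."""
    return k * math.log(n) - 2 * loglik


# Assumption: one measurement per week over the seven-week study.
for name in ("unstructured", "compound_symmetry", "ar1", "toeplitz"):
    print(name, n_cov_params(name, 7))
```

    AR(1) and Compound Symmetry use the fewest parameters, so an information criterion favours them whenever their fit is close to that of the richer structures.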

  • Spatiotemporal Modeling of Air Quality and Disease Cases
    Map showing air quality and disease cases
    Examined the spatiotemporal association between air quality (PM10 levels) and cases of an infectious disease across Arizona counties from 2000 to 2022. Used Moran's I to test for spatial autocorrelation, ARIMA and SARIMA models to capture seasonal patterns, and Bayesian spatiotemporal modeling with Integrated Nested Laplace Approximation (INLA). Identified spatial heterogeneity, temporal trends, and seasonality, providing insights into environmental factors influencing disease spread. Leveraged R for data analysis, visualization, and predictive modeling.
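
    As an illustration of the spatial autocorrelation step, the following is a minimal pure-Python sketch of global Moran's I (the project itself used R). The function name, the toy values, and the four-region adjacency matrix are hypothetical and are not the Arizona data.

```python
def morans_i(values, weights):
    """Global Moran's I for region values and a spatial weights matrix.

    values  : list of n observations, one per region
    weights : n x n nested list; weights[i][j] > 0 when regions i and j are neighbours
    """
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]                      # deviations from the mean
    w_total = sum(sum(row) for row in weights)          # sum of all spatial weights
    cross = sum(weights[i][j] * z[i] * z[j]
                for i in range(n) for j in range(n))    # weighted cross-products
    ss = sum(d * d for d in z)                          # total sum of squares
    return (n / w_total) * (cross / ss)


# Toy example (made-up values): four regions in a row, neighbours share a border.
toy_values = [1.0, 2.0, 3.0, 4.0]
toy_weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
print(morans_i(toy_values, toy_weights))
```

    Values above the null expectation of -1/(n-1) suggest that neighbouring regions tend to have similar values; a significance test would typically rely on permutation or a normal approximation.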

  • Clinical Trial Analysis of Anti-Epileptic Drug Efficacy
    Graph showing anti-epileptic drug trial analysis
    Assessed the efficacy of Progabide in a placebo-controlled, double-blind trial involving 59 participants. Modeled seizure count reductions over time using Generalized Estimating Equations (GEE) with an AR(1) working correlation structure. Conducted hypothesis testing in SAS and interpreted results to provide evidence of treatment efficacy.
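
    To show what an AR(1) working correlation assumes, here is a minimal Python sketch (the analysis itself was run in SAS). Under AR(1), the correlation between two visits decays as rho raised to the number of visits separating them. The function name, the four visits, and rho = 0.5 are hypothetical illustration values, not estimates from the trial.

```python
def ar1_correlation(n_times: int, rho: float):
    """Return the n_times x n_times AR(1) correlation matrix as nested lists."""
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    # Entry (i, j) is rho ** |i - j|: 1 on the diagonal, decaying with lag.
    return [[rho ** abs(i - j) for j in range(n_times)] for i in range(n_times)]


# Assumption: four repeated visits and rho = 0.5, chosen only for illustration.
for row in ar1_correlation(4, 0.5):
    print(row)
```

    A single parameter describes the whole matrix, which is why AR(1) is a common choice when measurements are equally spaced in time.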

  • Bacterial Population Dynamics Under Simulated Microgravity
    Graph showing bacterial population dynamics
    Analyzed population dynamics and interspecies interactions of Ralstonia and Sphingomonas in a 100-day mixed-culture study under simulated microgravity and standard gravity. Modeled temporal changes in colony size distribution using linear mixed-effects models in R. Investigated species-specific growth patterns and their environmental interactions, providing insights into microbial behavior under spaceflight conditions.

  • Gender Bias and Educational Opportunity Cost Analysis (Regression Modeling)
    Graph showing wage disparity regression analysis
    Conducted a comprehensive analysis of wage disparities among programmers and engineers in Silicon Valley using 2000 U.S. Census data. Developed and refined linear regression models to evaluate the effects of gender and educational attainment on income. Explored the economic trade-offs of pursuing advanced degrees, leveraging R for statistical modeling, data visualization, and diagnostics.

  • Bioinformatics Analysis of the Astronaut Salivary Microbiome (Team Work)
    Graph showing astronaut salivary microbiome analysis
    Analyzed the salivary microbiome of astronauts at three time points (pre-flight, in-flight, and post-flight) using 16S rRNA sequencing data. Performed differential abundance analysis, beta diversity analysis (weighted UniFrac), and alpha diversity analysis (Shannon index) to investigate the impact of spaceflight on microbial communities. Using QIIME2 and Python, identified microbial families potentially associated with mental health stressors. Results provided preliminary insights into spaceflight-induced microbiome changes and their implications for astronaut health.
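
    The alpha diversity measure mentioned above can be sketched in a few lines of plain Python. This is a minimal illustration, not the QIIME2 pipeline used in the project; the function name and the toy counts are hypothetical, and the base-2 logarithm is an assumption (natural log is also common).

```python
import math


def shannon_index(counts, base: float = 2.0) -> float:
    """Shannon diversity H = -sum(p_i * log(p_i)) over taxa with non-zero counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must contain at least one positive value")
    h = 0.0
    for c in counts:
        if c > 0:                      # zero-count taxa contribute nothing
            p = c / total              # relative abundance of this taxon
            h -= p * math.log(p, base)
    return h


# Toy samples (made-up counts): an even community versus a dominated one.
even_sample = [10, 10, 10, 10]
dominated_sample = [37, 1, 1, 1]
print(shannon_index(even_sample), shannon_index(dominated_sample))
```

    The even community scores higher than the dominated one, reflecting that the index rewards both richness and evenness.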