🧭 STA 9750 – Basic Software Tools for Data Analysis (Fall 2025)

Instructor: Prof. Michael Weylandt
Format: Flipped-classroom + experiential learning using R, Quarto, and GitHub

STA 9750 introduces graduate students to data analytics in R through a project-driven format. Instead of traditional exams, students complete a sequence of GitHub-hosted mini-projects and a final capstone project that mirror real-world analytical workflows. Across the semester, assignments emphasize reproducible research, data wrangling, visualization, and clear analytical storytelling using Quarto and GitHub-based collaboration.


📊 Mini-Projects (MP00 – MP04)

  • MP00 – Course Setup: Git/GitHub + Quarto site
  • MP01 – Netflix Top 10 Analysis: data wrangling, visualization, and press-style reporting
  • MP02 – Housing Affordability Index: ACS + BLS + building permits; index construction and policy visualization
  • MP03 – NYC Tree Canopy Analysis: geospatial joins, spatial aggregation, and thematic maps
  • MP04 – Just the Fact(-Check)s, Ma’am!: employment revisions and statistical fact-checking

Each mini-project includes a GitHub-based peer review cycle focused on code clarity, reproducibility, and rubric-based feedback.


🌃 Final Project — Nightlife Analytics (NYC)

Research Question:

How does nightlife activity shape urban mobility and safety, and how did COVID-19 change night-time travel behavior in New York City?

This final project examines New York City night-time mobility (8 PM–4 AM) by integrating TLC Yellow Taxi and FHV (Uber/Lyft) trip data with Yelp nightlife venue information.
The analysis compares pre-COVID (2019), COVID (2020), and post-COVID (2023) periods to assess how nightlife-dense and non-nightlife zones experienced uneven collapse and recovery in night-time travel.

The project emphasizes reproducible workflows, zone-level aggregation using DuckDB, and data visualization to support clear interpretation of spatial and temporal patterns in urban night-time activity.
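
The night-time window and the zone-level aggregation described above can be sketched in a few lines. This is a minimal illustration in plain Python, not the project's actual R/DuckDB pipeline: the sample rows, the zone IDs, the helper names `is_night` and `night_trips_by_zone_year`, and the choice to treat 4 AM as an exclusive upper bound are all assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime


def is_night(ts: datetime) -> bool:
    """Return True when the pickup hour falls in the 8 PM to 4 AM window.

    The window wraps past midnight, so it is the union of two ranges
    (hour >= 20 or hour < 4), not a single between-style comparison.
    Assumption: 4:00 AM itself is excluded.
    """
    return ts.hour >= 20 or ts.hour < 4


# Hypothetical sample rows: (pickup timestamp, taxi zone ID).
trips = [
    (datetime(2019, 6, 1, 21, 15), 79),   # night
    (datetime(2019, 6, 2, 2, 40), 79),    # night (after midnight)
    (datetime(2019, 6, 2, 9, 5), 79),     # daytime, dropped
    (datetime(2020, 6, 6, 23, 30), 79),   # night
    (datetime(2023, 6, 3, 3, 59), 148),   # night (last minute of window)
    (datetime(2023, 6, 3, 4, 0), 148),    # 4:00 AM, dropped
]


def night_trips_by_zone_year(rows):
    """Count night-time trips per (zone, calendar year)."""
    counts = defaultdict(int)
    for ts, zone in rows:
        if is_night(ts):
            counts[(zone, ts.year)] += 1
    return dict(counts)


counts = night_trips_by_zone_year(trips)
for (zone, year), n in sorted(counts.items()):
    print(f"zone {zone}, {year}: {n} night trips")
```

The same filter-then-group pattern maps onto a SQL `WHERE` clause on the pickup hour followed by `GROUP BY` zone and year, which is the kind of aggregation a DuckDB query would perform over the full trip files.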



🎯 Learning Objectives

  • Develop end-to-end analytical workflows in R / tidyverse
  • Produce reproducible Quarto reports integrating code, narrative, and visuals
  • Integrate multi-source public data (government sources, APIs, and web data)
  • Apply geospatial analysis and mapping techniques for spatial insights
  • Perform exploratory and comparative analysis across time and groups
  • Communicate results through clear data storytelling for technical and non-technical audiences
  • Practice professional GitHub workflows, including version control and peer review
  • Use AI tools responsibly and uphold standards of academic integrity

📚 Resources & Policies

  • R for Data Science (Wickham & Grolemund), Happy Git with R, Veridical Data Science
  • Software: R, RStudio, Quarto (all free)
  • AI tools are permitted only for small, targeted coding questions; full assignment generation is prohibited.