Title: Forecasting the Recovery: A Capstone Analysis of US Housing Data (2005-2013): Business Statistics and Analysis Capstone Project

 

Introduction: From Data to Dollars

For my Business Statistics and Analysis Capstone, I tackled the complex dynamics of the US housing market using HUD's THADS data (2005–2013). The goal wasn't just to report what happened, but to build a robust model capable of forecasting future market values. This project showcases the power of statistical rigor in transforming messy, skewed real-world data into actionable predictive insights.

Key Analysis Highlights

  1. Measuring the Crisis Impact on Rent: I ran paired t-tests on Fair Market Rent (FMR), comparing 2007 with 2009, and found that rents did not fall across the 2008 subprime crisis. Mean FMR was higher in 2009 than in 2007, which is consistent with foreclosures moving former owners into the rental market. This finding challenges the assumption that the crisis pushed rents down.

  2. Modeling Market Value Drivers: I developed a Multiple Linear Regression model for single-family home values, employing log transformations (LN(VALUE), LN(FMR), LN(UTILITY)) to correct for extreme skewness. The model confirmed that local FMR is the single strongest predictor of a home’s value, and quantified the penalty associated with a unit being vacant (approximately an 11.5% reduction in value).

  3. Building a Time-Lagged Forecasting Engine: The final step involved building a predictive model that uses 2011 features (Predictors_2011) to forecast 2013 values (VALUE_2013). This required merging the 2011 and 2013 datasets on the CONTROL variable, so that each housing unit's 2011 predictors line up with its 2013 value, and then estimating the regression's beta coefficients.

  4. Validation and Risk Quantification: I validated the model's performance on a 1,000-unit holdout sample. The final measure of accuracy, the Mean Absolute Deviation (MAD), came to $136,966. This single metric is the average absolute gap between forecast and actual value per unit, so a business analyst can read it directly as the typical dollar error to expect when using the model for strategic planning.
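For readers who want to see the mechanics behind the highlights above, here is a minimal Python sketch of three of the calculations: the paired t-statistic, the percent effect implied by a dummy coefficient in a log-outcome regression, and the MAD. It is an illustration under stated assumptions, not the code used in the project. The function names, the toy numbers, and the -0.12 coefficient are all hypothetical, and the post does not say which convention produced the 11.5% vacancy figure.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(before, after):
    """Return (t, degrees of freedom) for a paired t-test on after - before."""
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    # t = mean difference divided by its standard error (sample sd / sqrt(n))
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1


def dummy_effect_pct(beta):
    """Percent change in the outcome implied by a 0/1 dummy coefficient
    when the dependent variable is a natural log, e.g. LN(VALUE)."""
    return (math.exp(beta) - 1.0) * 100.0


def mean_absolute_deviation(actual, predicted):
    """Average absolute gap between actual and predicted values."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return mean(abs(a - p) for a, p in zip(actual, predicted))


# Toy data only (hypothetical, not taken from the HUD files).
fmr_2007 = [900, 1000, 1100, 1200]
fmr_2009 = [950, 1040, 1160, 1250]
t_stat, dof = paired_t_statistic(fmr_2007, fmr_2009)
print(f"paired t = {t_stat:.2f} on {dof} degrees of freedom")

# A hypothetical vacancy coefficient, chosen only to show the conversion.
print(f"vacancy effect = {dummy_effect_pct(-0.12):.1f}%")

actual_2013 = [200000, 150000, 300000]
forecast_2013 = [180000, 165000, 310000]
print(f"MAD = ${mean_absolute_deviation(actual_2013, forecast_2013):,.0f}")
```

A positive t-statistic here means the later year's rents are higher on average. If a forecasting model is fitted on LN(VALUE), its predictions need to be converted back to dollars with `math.exp` before the MAD is computed, otherwise the error is in log units rather than dollars.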

Conclusion

This capstone demonstrated end-to-end analytical capability: from cleaning and transforming complex real-estate data to running predictive validation. The result is a statistically grounded, highly interpretable model that quantifies market drivers and provides a measure of confidence for future forecasts.
