Keystone Homes: Student Housing Demand & Rent Analysis
Student housing markets rarely move in isolation. Shifts in enrollment, changes in ownership, and neighborhood-level signals all interact in ways that are difficult to see without pulling data together. This project began as part of market and customer research for Keystone Homes, with the goal of understanding how student population trends, especially international enrollment, show up in rent dynamics and housing conditions across student-heavy neighborhoods.
Focusing on Boston and surrounding areas, the resulting project, Scout, treats housing data as a living system rather than a static snapshot. By combining enrollment data, rental trends, property characteristics, and housing complaint signals, Scout surfaces patterns that help explain where demand translates into pricing pressure, where quality risks emerge, and how these signals can inform housing strategy and product positioning.
Link to GitHub Coding Project
-
The core challenge behind Scout was not simply understanding whether rents were rising, but identifying where demand pressure translates into pricing power and where it creates risk for tenant experience. Housing operators, investors, and product teams often have access to fragmented data but lack a clear way to synthesize it into localized, decision-ready insight.
This project frames housing data as an input to decision-making. The goal was to move from broad trends, such as enrollment growth, to more granular questions around neighborhood behavior, ownership patterns, and maintenance signals that directly affect pricing, acquisition, and product differentiation.
-
To explore these questions, I integrated multiple real-world datasets, including student enrollment figures, rental pricing data, property characteristics, and 311 housing-related complaints. The analysis began with exploratory data analysis to surface relationships and anomalies, followed by modeling approaches aligned to specific questions.
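As a rough sketch of what this integration step can look like in pandas (the file names, column layouts, and neighborhood-year granularity below are hypothetical stand-ins, not the project's actual schemas):

```python
import pandas as pd

# Hypothetical inputs -- names and columns are illustrative only.
enrollment = pd.read_csv("enrollment_by_year.csv")      # year, intl_students
rents = pd.read_csv("rents_by_neighborhood.csv")        # year, neighborhood, avg_rent
complaints = pd.read_csv("311_housing_complaints.csv")  # year, neighborhood, complaint_type

# Count housing complaints per neighborhood-year.
complaint_counts = (
    complaints.groupby(["neighborhood", "year"])
    .size()
    .rename("complaint_count")
    .reset_index()
)

# Build a neighborhood-year panel: rent trends joined with complaint volume,
# plus citywide enrollment attached by year for the EDA that follows.
panel = (
    rents.merge(complaint_counts, on=["neighborhood", "year"], how="left")
         .merge(enrollment, on="year", how="left")
         .fillna({"complaint_count": 0})
)
print(panel.describe())
```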
Linear regression was used to examine the relationship between international student enrollment and average rent over time. Random Forest models were applied to predict building condition and maintenance categories using features such as owner type, unit counts, assessed values, and complaint frequency. K-means clustering was used to identify neighborhoods with similar complaint profiles, revealing patterns that were not visible at the individual property level.
The emphasis throughout was on using models as tools to test assumptions and surface signals, rather than optimizing for predictive performance alone.
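A minimal sketch of the three modeling approaches in scikit-learn; the input files, feature names, target labels, and cluster count are illustrative assumptions rather than the project's actual configuration:

```python
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.cluster import KMeans
from sklearn.model_selection import train_test_split

# Hypothetical inputs, standing in for the merged datasets described above.
panel = pd.read_csv("panel.csv")            # year, intl_students, avg_rent
parcels = pd.read_csv("parcels.csv")        # owner_type_code, units, assessed_value,
                                            # complaint_count, condition_category
complaints = pd.read_csv("complaints.csv")  # neighborhood, complaint_type

# 1. Linear regression: does international enrollment track average rent over time?
reg = LinearRegression().fit(panel[["intl_students"]], panel["avg_rent"])
print("rent change per additional international student:", reg.coef_[0])

# 2. Random Forest: predict building-condition category from property features.
features = ["owner_type_code", "units", "assessed_value", "complaint_count"]
X_train, X_test, y_train, y_test = train_test_split(
    parcels[features], parcels["condition_category"], test_size=0.2, random_state=0
)
rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)
print("held-out accuracy:", rf.score(X_test, y_test))

# 3. K-means: cluster neighborhoods on their mix of complaint types.
profile = pd.crosstab(complaints["neighborhood"], complaints["complaint_type"])
clusters = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(profile)
print(dict(zip(profile.index, clusters)))
```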
-
The analysis showed that enrollment trends alone do not explain rent increases, but when combined with ownership structure and neighborhood-level indicators, they provide meaningful insight into pricing dynamics and maintenance risk. Certain neighborhoods exhibited strong demand signals alongside elevated complaint activity, suggesting opportunities for differentiated offerings or proactive investment in housing quality.
From a product and strategy perspective, these insights can inform pricing decisions, market segmentation, and acquisition strategy for student housing products. Scout demonstrates how analytical outputs can be translated into practical guidance, helping stakeholders move from raw data to clearer, more confident decisions.
Product & Technical Applications
Pricing and market segmentation for student housing products
Identifying neighborhoods with unmet demand or quality risk
Supporting acquisition and investment decisions
Informing differentiated offerings for international vs. domestic students
Fraud Review Prioritization Using Bayesian Networks
As AI-powered products scale, fraud detection becomes a core product responsibility, directly tied to customer trust, platform integrity, and risk management. Many fraud systems struggle not because they lack data, but because they force uncertain situations into rigid yes-or-no decisions that can harm both users and operations. This project explores how probabilistic reasoning can support fraud review workflows by surfacing confidence levels, enabling smarter prioritization, and keeping humans in the loop where judgment matters most.
By implementing Bayesian network inference, I examined how evolving evidence can be used to update risk assessments over time, helping review teams decide which cases warrant escalation rather than automating irreversible decisions. The project reframes fraud detection as a prioritization problem, focusing on how uncertainty can be surfaced and managed in trust-critical systems.
Link to GitHub Coding Project
-
Fraud review is fundamentally a decision-making problem under uncertainty. Automated systems must surface risk signals without overwhelming review teams or damaging user trust through excessive false positives. At the same time, missed fraud can carry significant financial and reputational consequences.
This project frames fraud detection not as a binary classification task, but as a prioritization and escalation problem. The goal is to design a system that helps human reviewers decide where to focus attention, making uncertainty visible rather than hiding it behind rigid thresholds. From a product perspective, this mirrors challenges faced by trust and safety teams, compliance operations, and platform integrity functions.
-
To explore this problem, I implemented Bayesian network inference from scratch, modeling relationships between latent variables and observed evidence. The system uses probabilistic dependencies to update beliefs as new signals are introduced, producing confidence-aware risk scores rather than deterministic outputs.
The modeling approach emphasizes transparency and interpretability. Each probability update reflects explicit assumptions encoded in the network structure, making it easier to reason about why a case is flagged and how additional evidence would change the outcome. Rather than optimizing for prediction accuracy alone, the focus was on understanding how uncertainty propagates through the system and how it can support human-in-the-loop workflows.
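The sketch below shows the flavor of this from-scratch belief updating on a deliberately tiny network; the structure, variable names, and probabilities are invented for illustration and are not the project's actual model:

```python
# Tiny illustrative network:  Fraud -> UnusualLocation,  Fraud -> HighAmount.
# All numbers below are assumptions chosen for the example.
P_FRAUD = 0.02                          # prior P(Fraud = True)
P_LOCATION = {True: 0.60, False: 0.05}  # P(UnusualLocation = True | Fraud)
P_AMOUNT = {True: 0.70, False: 0.10}    # P(HighAmount = True | Fraud)

def posterior_fraud(evidence):
    """P(Fraud = True | evidence), by enumerating both values of Fraud.

    `evidence` maps observed signal names ("location", "amount") to
    True/False; signals not yet observed are simply omitted.
    """
    def joint(fraud):
        p = P_FRAUD if fraud else 1 - P_FRAUD
        for name, table in (("location", P_LOCATION), ("amount", P_AMOUNT)):
            if name in evidence:
                p *= table[fraud] if evidence[name] else 1 - table[fraud]
        return p

    unnormalized = {f: joint(f) for f in (True, False)}
    return unnormalized[True] / sum(unnormalized.values())

# The belief moves as evidence accumulates, rather than snapping to yes/no.
print(posterior_fraud({}))                                  # prior: 0.02
print(posterior_fraud({"location": True}))                  # ~0.20
print(posterior_fraud({"location": True, "amount": True}))  # ~0.63
```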
-
This project highlights how probabilistic models can improve fraud review operations by shifting focus from automated decisions to informed prioritization. By ranking cases based on evolving confidence levels, review teams can allocate attention more effectively, reducing unnecessary interventions while maintaining strong fraud coverage.
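Reusing the hypothetical posterior_fraud sketch above, this prioritization can be as simple as sorting the review queue by posterior risk:

```python
# Hypothetical open cases with whatever evidence has arrived so far.
cases = {
    "case-001": {"location": True},
    "case-002": {"location": True, "amount": True},
    "case-003": {"amount": False},
}
# Highest-risk cases first; ambiguous ones wait for more evidence.
queue = sorted(cases, key=lambda c: posterior_fraud(cases[c]), reverse=True)
print(queue)  # ['case-002', 'case-001', 'case-003']
```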
From a product perspective, this approach supports scalable trust systems that respect both operational constraints and user experience. It demonstrates how explainable, uncertainty-aware models can serve as decision support tools rather than opaque gatekeepers, reinforcing trust in automated systems while keeping humans in control where it matters most.
Product Applications
Fraud detection and review queue prioritization
Trust & safety escalation workflows
Risk scoring and compliance review systems
Human-in-the-loop decision support tools
Customer Segmentation & Churn Prediction Using Support Vector Machines
Customer segmentation is a core product decision, shaping how teams prioritize retention efforts, allocate resources, and design user experiences at scale. This project examines how Support Vector Machines can be used to segment customers, such as identifying likely churners versus retained users, while making the tradeoffs behind those decisions visible and intentional rather than implicit.
Instead of treating segmentation as a static label, the work focuses on how decision boundaries are set, adjusted, and evaluated under uncertainty. By analyzing how different modeling choices influence who gets flagged and who does not, the project highlights how classification systems directly shape downstream product actions, from targeted outreach and incentives to support prioritization and lifecycle strategy.
Link to GitHub Coding Project
-
Product and growth teams rely on segmentation to decide which users receive attention, incentives, or intervention. These decisions are often made using models that appear objective on the surface but embed significant judgment through thresholds, features, and optimization choices.
This project frames customer segmentation as a decision-making problem, not just a modeling task. The central challenge is balancing sensitivity and precision: flagging enough at-risk users to meaningfully reduce churn without overwhelming teams or misallocating resources. From a product perspective, this mirrors real-world tradeoffs between coverage, cost, and customer experience.
-
To explore this problem, I implemented Support Vector Machine classifiers and analyzed how margin width (controlled through regularization) and feature representation affect segmentation outcomes. SVMs are particularly useful in this context because they make decision boundaries explicit, allowing for clear examination of how users are separated into different segments.
Rather than optimizing solely for accuracy, the analysis focused on how changing model constraints shifted false positives and false negatives. This approach emphasized interpretability and control, highlighting how seemingly small modeling decisions can materially change who is targeted for retention efforts and who is left unaddressed.
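A minimal sketch of this kind of sweep with scikit-learn's SVC on synthetic data; the feature construction and the particular C values are illustrative assumptions:

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix

# Synthetic stand-in for churn data: rows are users, y = 1 means churned.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 4))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=1.0, size=1000) > 0.8).astype(int)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# Sweep the regularization strength C: small C favors a wide margin and tolerates
# more training error; large C fits the boundary more tightly. The point is to
# watch how the false-positive / false-negative split moves, not raw accuracy.
for C in (0.01, 1.0, 100.0):
    clf = make_pipeline(StandardScaler(), SVC(C=C, kernel="rbf")).fit(X_train, y_train)
    tn, fp, fn, tp = confusion_matrix(y_test, clf.predict(X_test)).ravel()
    print(f"C={C:>6}: false positives={fp}, false negatives={fn}")
```

In practice the same sweep would extend to class weights and feature choices, since each of those knobs shifts which users land on the at-risk side of the boundary.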
-
This project demonstrates how segmentation models can serve as decision-support tools rather than rigid classifiers. By understanding where decision boundaries lie and how confident the model is in its classifications, product teams can design more intentional lifecycle strategies, such as graduated interventions, targeted experiments, or proactive support.
From a PM perspective, the key takeaway is that segmentation systems should align with business goals and customer experience outcomes, not just statistical performance. The work reinforces the importance of transparency and judgment in systems that directly influence how customers are treated.
Product Applications
Customer churn prediction and retention prioritization
User segmentation for targeted engagement and outreach
Lifecycle management and growth experimentation
Resource allocation across customer segments