CS 1710 - Official Ballot 2024
|
||
INSTRUCTIONS | TEAM MEMBERS | VOTE FOR ONE |
|
Max Bahar
Harvard University
MS Data Science 2026
|
Ted McCulloch
Harvard University
MS Data Science 2026
|
Stefan Chu
Harvard University
MS Data Science 2026
|
Anitej Thamma
Harvard University
MS Data Science 2026
|
QUESTION | CATEGORIES | SELECT UP TO THREE |
Which demographic categories do you think have the greatest impact on voter turnout, positively or negatively? |
Party Affiliation
Democrat, Republican, etc.
|
Average Income
High, Middle, Low
|
Gender
Male, Female, Unknown
|
Spoken Language
English, Spanish, Italian, etc.
|
|
Age
Younger, Older, etc.
|
Ethnicity
European, Hispanic, Asian, etc.
|
You have made no selections. Please return to the previous slide to vote.
The average 2020 presidential voter turnout in Massachusetts is
Click and drag the slider above to make your guess.
Voter turnout and demographics vary across Massachusetts.
On the map to the right, you can explore key information such as voter turnout, the number of registered voters, and average household income across the state.
Click on any area to dive deeper into its demographics. You can view detailed distributions of voter age, gender, party affiliation, ethnicity, or language.
Identifying the areas and demographic characteristics associated with low voter turnout may help government officials target their efforts.
These characteristics are associated with lower voter turnout:
by Geography: |
|
by Category: |
Kernel density plots allow you to visualize the density of data observations.
X-axis - the proportion of a block group's registered voting population that belongs to the category
Y-axis - voter turnout for the block group
Example Interpretation - The density for females lies to the right of the density for males because more women than men tend to be registered to vote.
Dashed lines are included to aid plot readability.
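To make the idea of a kernel density estimate concrete, here is a minimal sketch of a two-dimensional estimate using a product Gaussian kernel. The page does not say which kernel or bandwidths the plots use, so the kernel, the bandwidth values, the function names, and the sample points below are all assumptions for illustration.

```python
import math


def gaussian_kernel(u):
    """Standard normal density, used as the smoothing kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def kde_2d(points, x, y, bandwidth_x, bandwidth_y):
    """Estimate the density at (x, y) with a product Gaussian kernel."""
    total = sum(
        gaussian_kernel((x - px) / bandwidth_x) * gaussian_kernel((y - py) / bandwidth_y)
        for px, py in points
    )
    return total / (len(points) * bandwidth_x * bandwidth_y)


# Hypothetical block groups: (proportion of registered voters in a category, turnout).
block_groups = [(0.52, 0.80), (0.54, 0.82), (0.53, 0.79), (0.55, 0.84), (0.51, 0.78)]

# Evaluate the estimate inside the cluster and far away from it.
dense = kde_2d(block_groups, 0.53, 0.80, bandwidth_x=0.02, bandwidth_y=0.03)
sparse = kde_2d(block_groups, 0.30, 0.50, bandwidth_x=0.02, bandwidth_y=0.03)
print(dense > sparse)  # the estimate is higher where observations cluster
```

Larger bandwidths produce smoother, flatter contours; smaller ones follow individual observations more closely.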
We find that Hampden County has the lowest voter turnout, 10% lower than other counties.
On the other hand, Barnstable and Hampshire counties are tied for the highest voter turnout, 4% higher than other counties.
The diverging bar plot visualizes how voter turnout in Massachusetts counties deviates from the average 81% voter turnout.
Use the dropdowns to see the proportion of each party's voters in each county.
The bars are shaded dynamically, with lighter or darker hues indicating the proportion of the dominant party in each county.
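The quantity behind a diverging bar plot is simply each county's turnout minus the average. The sketch below computes those deviations and draws rough text bars. The 81% average comes from this page; the county names and turnout figures are placeholders, and the helper names are hypothetical.

```python
STATE_AVERAGE = 81.0  # average turnout (%) quoted on this page


def deviations_from_average(turnout_by_county, average):
    """Return (county, deviation) pairs sorted from most below to most above average."""
    deviations = {county: turnout - average for county, turnout in turnout_by_county.items()}
    return sorted(deviations.items(), key=lambda item: item[1])


def text_bar(deviation, scale=1.0):
    """Render one diverging bar: '-' to the left of the axis, '+' to the right."""
    length = round(abs(deviation) * scale)
    left = ("-" * length).rjust(12) if deviation < 0 else " " * 12
    right = "+" * length if deviation > 0 else ""
    return f"{left}|{right}"


# Placeholder turnout percentages, not real county figures.
example_turnout = {"County A": 71.0, "County B": 85.0, "County C": 80.0}

for county, deviation in deviations_from_average(example_turnout, STATE_AVERAGE):
    print(f"{county:10s} {deviation:+6.1f} {text_bar(deviation)}")
```

Sorting by deviation places the lowest-turnout counties first, which is one common way to order a diverging bar plot.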
Leveraging data and models to understand voter behavior in Massachusetts.
A Decision Tree predicts outcomes by splitting data into branches based on yes/no questions. While useful, single trees can overfit and perform poorly on unseen data.
Our team uses a Random Forest, which improves accuracy by combining decisions from multiple trees, capturing complex patterns and reducing overfitting.
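The idea of combining many trees can be sketched with a toy ensemble of one-split regression trees (stumps), each fit on a bootstrap sample with a randomly chosen feature, whose predictions are averaged. This is a deliberately simplified stand-in, not the team's model: a real Random Forest grows much deeper trees, and every name, parameter, and data point below is hypothetical.

```python
import random
from statistics import mean


def fit_stump(rows, targets, feature):
    """Fit a one-split regression tree: a single yes/no question on one feature."""
    best = None
    values = sorted({row[feature] for row in rows})
    # Try a split at every distinct value except the largest, so both sides are non-empty.
    for threshold in values[:-1]:
        left = [t for row, t in zip(rows, targets) if row[feature] <= threshold]
        right = [t for row, t in zip(rows, targets) if row[feature] > threshold]
        left_mean, right_mean = mean(left), mean(right)
        sse = sum((t - left_mean) ** 2 for t in left) + sum((t - right_mean) ** 2 for t in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    if best is None:  # only one distinct value: predict the overall mean
        overall = mean(targets)
        return (feature, float("inf"), overall, overall)
    return (feature, best[1], best[2], best[3])


def predict_stump(stump, row):
    feature, threshold, left_mean, right_mean = stump
    return left_mean if row[feature] <= threshold else right_mean


def fit_forest(rows, targets, n_trees=25, seed=0):
    """Fit stumps on bootstrap samples, each using one randomly chosen feature."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # sample rows with replacement
        feature = rng.randrange(len(rows[0]))           # random feature for this tree
        forest.append(fit_stump([rows[i] for i in idx], [targets[i] for i in idx], feature))
    return forest


def predict_forest(forest, row):
    """Average the predictions of all trees."""
    return mean(predict_stump(stump, row) for stump in forest)


# Hypothetical block groups: [mean income in $1000s, proportion Hispanic] -> turnout.
rows = [[40, 0.60], [55, 0.40], [70, 0.30], [90, 0.20], [120, 0.10], [150, 0.05]]
targets = [0.62, 0.70, 0.76, 0.82, 0.86, 0.90]

forest = fit_forest(rows, targets)
low = predict_forest(forest, [40, 0.60])
high = predict_forest(forest, [150, 0.05])
print(low < high)  # averaged trees still separate low- and high-turnout profiles
```

Because each tree sees a different resample and a different feature, their individual errors partly cancel when averaged, which is the mechanism behind the reduced overfitting described above.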
We identified the most important predictors by measuring feature importance. These include:
The Random Forest model predicts voter turnout in the 2020 presidential election for each Census Block Group in Massachusetts, based on demographic data. Key findings include:
The plot to the right shows SHAP values, measuring the impact each feature has on the model's predictions.
Features at the top are most influential, while those at the bottom are less impactful.
Mean household income, unknown language, and Hispanic ethnicity are the most impactful features on average.
High variable values are shown in red, while low values are shown in blue.
For example, high household income slightly increases turnout predictions, while low income decreases them significantly.
Similarly, block groups with high proportions of voters in the unknown language category (who are likely non-English speakers) or of Hispanic ethnicity tend to receive lower turnout predictions.
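Ranking features by how impactful they are "on average" is commonly done by averaging the absolute SHAP value of each feature across all observations. The sketch below assumes that convention; the SHAP values are made-up placeholders, and the feature names and function name are hypothetical labels rather than the model's actual columns.

```python
def rank_features_by_mean_abs_shap(shap_rows, feature_names):
    """Sort features by mean absolute SHAP value, most influential first."""
    n_rows = len(shap_rows)
    scores = {
        name: sum(abs(row[j]) for row in shap_rows) / n_rows
        for j, name in enumerate(feature_names)
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical feature labels and placeholder SHAP values (one row per block group).
feature_names = ["mean_household_income", "unknown_language", "hispanic_ethnicity", "age"]
shap_rows = [
    [2.0, -1.5, -1.0, 0.2],
    [-4.0, -0.5, -0.8, -0.1],
    [1.5, 0.9, 0.6, 0.3],
]

for name, score in rank_features_by_mean_abs_shap(shap_rows, feature_names):
    print(f"{name:24s} {score:.2f}")
```

Taking absolute values matters: a feature that pushes some predictions up and others down would otherwise average out to near zero despite being influential.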
by Geography: |
You predicted that had a voter turnout of .
Our model predicted that the county had a voter turnout of .
In reality, the county had a voter turnout of .
Click on any area to see how each demographic feature affected the turnout prediction for that area.
The tooltip shows how much each feature influences the model's local prediction for that area.
The prediction always starts at a base SHAP prediction of 78.74%. Each feature then either raises the voter turnout prediction (shown in red) or lowers it (shown in blue).
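The arithmetic behind the tooltip can be sketched as the base value plus the sum of the per-feature contributions. The 78.74% base value is taken from this page; the contribution values, feature labels, and function name are placeholders chosen for illustration.

```python
BASE_VALUE = 78.74  # base SHAP prediction (%) quoted on this page


def explain_prediction(base_value, contributions):
    """Add per-feature SHAP contributions to the base value and pick a color for each."""
    prediction = base_value + sum(contributions.values())
    # Positive contributions are shown in red; everything else is shown in blue here.
    colors = {name: ("red" if value > 0 else "blue") for name, value in contributions.items()}
    return prediction, colors


# Placeholder contributions in percentage points, not values from the real model.
example_contributions = {
    "mean_household_income": 2.10,
    "unknown_language": -1.35,
    "hispanic_ethnicity": -0.60,
}

prediction, colors = explain_prediction(BASE_VALUE, example_contributions)
print(f"{prediction:.2f}%", colors)
```

Because the contributions sum exactly to the gap between the base value and the local prediction, the tooltip accounts for the whole prediction rather than only part of it.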
Predictions closely align with actual turnout, validating the use of demographic data for understanding voter engagement.