
Community Voices and AI: Exploring Public Perceptions in School Rezoning Efforts

Image: a robot with a pencil redrawing school districts on a map on the wall. Image source: generated with ChatGPT to capture the blog post’s topic.

Dr. Nabeel Gillani provided mentorship, feedback, and methodological assistance for this project. In this exploratory project, Ph.D. candidate Johan Arango-Quiroga tested different local AI models to identify survey responses in which community members comment on the use of AI in a large-scale school redistricting effort.

About the Project

The Fostering Diverse Schools initiative seeks to modernize the Winston-Salem/Forsyth County Schools district’s residential school boundaries to enhance socioeconomic diversity, improve transportation efficiency, make better use of school space, and address under-enrollment in some schools. Using a computer program, the project redrew district boundaries that had remained largely unchanged for nearly three decades.

The redrawing of the boundaries is currently in process. As part of the proposal phase, an initial set of maps was released in February and March of 2025, and community members have been providing feedback on the Fostering Diverse Schools website. So far, thousands of comments have been submitted. These responses, along with comments from listening sessions, will be used to define community priorities and will be brought to the school board.

The goal of this collaboration was to better understand people’s perspectives on AI and to test the Msty app, a cross-platform AI tool that serves as a smart assistant and allows you to run different AI models on your local machine.

Data

My dataset included 2,615 survey responses submitted through the Fostering Diverse Schools website. The survey captured time spent completing the survey, map preferences, the role of the respondent (e.g., parent, student, staff), and associated high schools. Respondents also explained why they preferred one proposal over another and what changes they would suggest.

Analysis

I set up the Msty app and the AI models (DeepSeek-r1:1.5b, GPT, and Llama 3.2) that would be used to analyze the data. Once the local models were running, I drafted several questions to start the analysis, seeking to capture people’s perspectives on the use of AI and computing in producing first drafts of boundary proposals. I also conducted a qualitative analysis of the data, following an inductive approach to identify emerging themes about the use of AI and computing methods to redraw school district boundaries.

To start the analysis with the Msty app, I formulated a set of initial questions/commands, including:

  1. Identify rows that mention AI.
  2. How many people mention anything about AI?
  3. Based on the stakeholder column, what kind of stakeholder is making statements about AI?
  4. Identify whether the comments about AI are positive or negative.
  5. What are the emerging themes coming up about AI?
  6. What is the relation between stakeholder and positive and negative comments about AI?
  7. Based on the Associated/Attending schools column, what are the associated/attending schools that intersect with the AI comments?

Llama 3.2

In my first attempts, I asked Llama 3.2 to find any mentions of AI or computing, identify which stakeholders were commenting on AI, and determine whether these perspectives were positive or negative. Llama 3.2 reported that it did not find explicit mentions of AI in the provided text. From a previous scan of the data, I already knew that survey respondents had commented on AI and computing models and that some of those comments were negative. I tried converting the file between CSV, TXT, and XLS formats. Since the model still did not identify any mentions of AI or computing, it concluded that there were no statements about AI to analyze. I then simplified the commands and ran the model one question at a time rather than giving it several questions at once.

In one of the simplified commands, where I asked Llama 3.2 to identify any specific mentions of AI, Llama 3.2’s findings included a quotation, a summary of a response, and two interpretations attributed to “some” or “a few” responses:

  1. One respondent mentioned that “an ideal school should incorporate AI-powered tutoring tools to support students’ learning needs”
  2. A respondent expressed concern about the need for teachers to receive training on integrating technology, including AI, into their teaching practices.
  3. Some respondents suggested that schools should use AI-powered analytics to track students’ progress and identify areas where they may need support.
  4. A few respondents raised concerns about the potential biases in AI-driven admission processes and emphasized the need for more transparent decision-making. 

To verify the quote Llama 3.2 drew from the data, I went back to the original file and manually searched for mentions of ideal schools. I found only two mentions of ideal schools, and the quoted statement does not appear in the comments provided by respondents. This led me to suspect that the quote was something the model made up. I also checked the original file and found no comments that aligned with any of the four findings Llama 3.2 proposed. This is a clear example of AI hallucination, a phenomenon in which AI models can produce false predictions, false positives, or false negatives. Due to these results, I decided to test Deepseek-r1:1.5b.

Llama 3.2 and general themes

Before exploring Deepseek, I asked Llama 3.2 to identify general themes from the survey responses. This was Llama 3.2’s response:

Image 1: Llama 3.2 response to identifying general themes from survey responses

These themes do not seem entirely off from some of the comments I read in the original file. However, since I was focused on understanding survey respondents’ perspectives on the role of AI and computational techniques in redrawing school districts, the identification of more general themes is a potential next step for this project.

Deepseek-r1:1.5b

Using the Msty app, I asked the Deepseek model to analyze the dataset. More specifically, I asked the model to identify specific mentions of AI or computing in any of the columns. As part of its DeepThink analysis, the model explained how it went through each column to identify mentions of AI or computing:

Image 2: Deepseek model response to dataset analysis

After going through this process, the Deepseek model concluded that “while the platform is educational focusing on school-related topics, it might not explicitly use AI or computer-based systems beyond basic features like upvoting and comments for engagement. So maybe there’s no direct mention of AI implementation here.” 

Based on these results, I gave DeepSeek a more specific instruction: to look closely at the column labelled “what do you like about your preferred proposals?” I did this because a manual search of the original data had turned up several mentions of AI and computing, and most of the comments I skimmed expressed negative views of using AI to redraw district boundaries. The model’s response can be seen below. (The garbled results appear to be a formatting issue with how Msty, the front end for the language model, renders the model’s chain-of-thought output, rather than anything to do with the survey results or underlying data.)

Image 3: Model response to closer look at specific column

I repeated this process with each column of the spreadsheet where survey respondents included their feedback. For instance, I asked the model to look at the “anything else” column, where I knew there were only two comments about AI and computing. However, the model did not identify the comments I had previously located.

Given these results, I asked the model to perform other simple tasks: 1) identifying the rows where AI and/or the word computer was mentioned and 2) conducting a word count of AI and computer. From my manual check, I had already identified the location and number of mentions of the words AI and computer. However, the model’s results were not successful. For the first request, the model concluded that there were no AI mentions. For the second request, the model found zero mentions of AI (see image below).
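For comparison, these two tasks do not require a language model at all. The sketch below finds the rows and counts deterministically with word-boundary regular expressions; it assumes the survey is exported as a CSV file with a header row (the file layout here is a hypothetical stand-in for the actual survey export):

```python
import csv
import re

# Word-boundary patterns: case-sensitive for "AI" (so it does not match
# inside words like "said"), case-insensitive for "computer"/"computers".
AI_PAT = re.compile(r"\bAI\b")
COMP_PAT = re.compile(r"\bcomputers?\b", re.IGNORECASE)

def find_mentions(path):
    """Return (row_numbers, ai_count, computer_count) for a CSV of responses."""
    rows_with_hits = []
    ai_total = comp_total = 0
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        next(reader)  # skip the header row
        for row_num, row in enumerate(reader, start=2):  # row 2 = first response
            text = " ".join(row)
            ai_hits = len(AI_PAT.findall(text))
            comp_hits = len(COMP_PAT.findall(text))
            if ai_hits or comp_hits:
                rows_with_hits.append(row_num)
            ai_total += ai_hits
            comp_total += comp_hits
    return rows_with_hits, ai_total, comp_total
```

Exact matching like this is the baseline the models were being asked to reproduce; it cannot judge sentiment or themes, but it cannot hallucinate either.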

As part of the thinking Deepseek performs, the model reminded itself that “the user is specifically asking for counts of those words regardless of context. I should check through each session to see if either word appears.” The model also said that “after going through each session line by line,” it did not see any actual occurrence of “AI” or “computer.” Based on this analysis, the model concluded that “both words are mentioned zero times.”

I also created separate files that each included only one column: one with the responses to the “what do you like about the preferred proposal?” column and another with the responses to the “what would you change about your preferred proposal?” column. I used these files to test whether a simplified file would lead to different results. Despite these efforts, the model concluded that “after a thorough review without finding ‘AI’ or ‘computer,’ I can conclude that these words do not appear in the document.”

Image 4: Model response to different files
Image 5: Model response to simplified file

I also performed the same steps with Llama 3.2 and obtained similar results. Unlike Deepseek, Llama 3.2 does not provide an elaborate explanation of its thinking process.

Image 6: Llama 3.2 response to same steps
Image 7: Llama 3.2 response to same steps


Since Llama 3.2 and Deepseek did not identify mentions of AI and computer, the models logically concluded that they could not identify any perspectives about the use of AI to redraw the districts. I then decided to try GPT.

GPT

To get started with GPT, I asked the model to count occurrences of AI and computer, identify the rows in which they appeared, and come up with potential themes around AI and computing. Unlike Llama 3.2 and Deepseek, GPT provided answers to these questions. Here are the results from the file analysis:

Image 8: Results from file analysis

GPT did not, however, identify the number of occurrences correctly (see the qualitative analysis section below). GPT’s results state that the word computer appears in rows 140 and 203; a manual check showed that the words actually appear in different rows, 25 and 209.

Compared to Llama 3.2 and Deepseek, GPT successfully identified the respondents’ perspectives about AI and categorized them into four themes: “AI’s role in proposals, potential improvements, and concerns about AI implementation.” However, these themes provided a limited explanation of what survey respondents think and feel about the use of AI to redraw school districts. 

Qualitative Analysis 

I started my qualitative analysis by counting occurrences myself. Based on those occurrences, I went through the survey responses to familiarize myself with the feedback and identify emerging themes. After reading the feedback, I identified 24 responses that included comments about AI and computing. AI appears in 7 cells and is mentioned 7 times (GPT identified 17), whereas the word computer appears in 17 cells and is mentioned 20 times (GPT identified 2). Three major themes around the use of AI or computing methods to redraw school district boundaries emerged: 1) explicit rejection of AI (15 comments), 2) AI needs adjustment (8 comments), and 3) comments without a clear negative or positive reaction to AI (3 comments). See the table below for further explanation.

Rejecting AI (n = 15): Explicit rejection of, dislike of, or disagreement with using AI to redraw boundaries, and concern that the AI generates undesirable outcomes. Some respondents claimed this AI-based approach punishes economically successful people, or that the model was manipulated, disconnected from reality, or used outdated information.

AI needs adjustments (n = 8): AI needs human-based adjustments for factors such as students needing more support to go from a “struggling” school to a “high caliber” one, transportation efficiency/distance, not splitting neighborhoods (e.g., Brookberry), and not disrupting friendships.

Other (n = 3): One participant mentioned AI to acknowledge the school ranking results. Another mentioned the word computer to describe how her child had been “forced” to sit in front of a computer and wear a mask. A third mentioned that the maps were hard to see on her smartphone, as she does not use a computer to navigate the internet.
Image 9: Table with explanation of AI themes in survey responses
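The cells-versus-mentions distinction above (a single cell can contain a word more than once) is easy to make explicit in code. A minimal sketch, using invented example responses rather than the actual survey data:

```python
import re

def cells_and_mentions(cells, word):
    """Count cells containing `word` and total occurrences across all cells."""
    pat = re.compile(rf"\b{re.escape(word)}s?\b", re.IGNORECASE)
    per_cell = [len(pat.findall(cell)) for cell in cells]
    return sum(1 for n in per_cell if n > 0), sum(per_cell)

# Hypothetical responses: "computer" appears in 2 cells but 3 times in total.
responses = [
    "A computer drew these lines and a computer cannot know our neighborhood.",
    "The computer model ignores bus routes.",
    "Please keep Brookberry together.",
]
print(cells_and_mentions(responses, "computer"))  # prints (2, 3)
```

Reporting both numbers, as the qualitative analysis does, avoids the ambiguity that likely contributed to GPT’s mismatched counts.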

Conclusion

To identify perspectives about AI and computing, I conducted a combined analysis of survey responses using a qualitative approach and three AI models (Llama 3.2, DeepSeek-r1:1.5b, and GPT). In the qualitative analysis, I identified fifteen responses that expressed explicit rejection of, dislike of, or disagreement with using AI to redraw boundaries. Those rejecting AI included comments about how the AI was manipulated, punishes economically successful people, or is disconnected from reality. Eight responses focused on the need to adjust the AI-based approach, for example to account for the needs of students moving from a lower-ranking school to a higher-ranking one, increased transportation time, or disrupted friendships. Three additional responses that mentioned the words AI or computer did not provide any additional understanding of how respondents feel about using AI to redraw school district boundaries.

When Llama 3.2 and DeepSeek-r1:1.5b were asked to identify any explicit mention of the words “AI” or “computer,” the models did not find anything. The results were similar when I tried different file types (.TXT or .XLS). When Llama 3.2 did report mentions of AI, the emerging themes did not align with the themes identified in the qualitative analysis. Unlike the other two models, GPT counted the mentions of the words “AI” and “computer” and reported the rows in which they appear; however, both the counts and the row numbers were inaccurate. GPT also provided context, but these observations did not yield a good understanding of what respondents think about using AI or computer-based approaches to change school districts.

A potential explanation for why the AI models are providing wrong answers is hallucination: AI generating information that sounds plausible but is actually made up. A good example was the fake quote Llama 3.2 produced about how “an ideal school should incorporate AI-powered tutoring tools to support students’ learning needs.” AI models are trained to make predictions, which makes answers look good because they are based on patterns; however, these predictions are not necessarily accurate and depend on the quality of the training data. This problem persists even in OpenAI’s recently launched reasoning models. In other words, no matter how good the instructions are, the model has internalized certain high-importance themes about broader topics, in this case redrawing school district boundaries or public school needs, and it produces answers that fit those themes. To address this issue, I outline potential next steps in the next section.

Potential Next Steps

Some potential next steps that could help address the differences between the results obtained from the AI models and the manual qualitative research include 1) manually coding a small portion of the data (~5%), 2) developing a codebook, running the AI models with it, and applying it to the data, and 3) testing to improve intercoder reliability. Another approach is to develop a Python script and apply it to the over 2,000 responses, which might lead to more accurate results. This raises two distinct but related issues worth exploring further: the unreliability identified in the AIs themselves, and the analytical task that was originally delegated to them. Finally, another test would involve feeding the models a modified version of the file in which “computer” and “AI” are replaced by other domain terms and asking the models whether the new terms are present. That would help us understand whether the models fail at this task generally or only when the subject is AI.
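That last control test could be scripted by swapping the target words for neutral placeholders before handing the file back to the models. A sketch under the assumption that the responses live in a plain-text or CSV export; the placeholder terms here are arbitrary choices, not part of the project:

```python
import re

# Arbitrary placeholder terms; any words absent from the original data would do.
REPLACEMENTS = {
    r"\bAI\b": "GARDENING",
    r"\bcomputers?\b": "TELESCOPE",
}

def substitute_terms(text):
    """Swap 'AI' and 'computer(s)' for placeholders to build a control file."""
    for pattern, placeholder in REPLACEMENTS.items():
        text = re.sub(pattern, placeholder, text, flags=re.IGNORECASE)
    return text

print(substitute_terms("The AI model and the computer ignored our input."))
# prints "The GARDENING model and the TELESCOPE ignored our input."
```

If a model then finds the placeholder terms it previously could not find as “AI” or “computer,” that points to something about the subject matter rather than a general failure at keyword search.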
