K-12 schools generally have a wealth of data available on their students that can be leveraged toward increasing college and career readiness, with the right tools. Careful data collection and analysis enables researchers, administrators, teachers, and counselors to identify how certain variables may be more likely to trigger particular student outcomes. These analyses are only useful, however, if they inform interventions for students, either enabling them to get back on track when they start to slip or—even better—to prevent them from falling off track in the first place.
Schools are not the only keepers of valuable information to support students. Community-based organizations, from after-school program providers to college access organizations, also work to support youth academically, socially, emotionally, and financially on the path to postsecondary education. The data on a single student, gathered from multiple organizations and providers and tracked over the years, will provide a more complete picture of that student’s needs and effective responses than any single piece of information could alone. By reviewing a larger dataset of students, researchers can identify patterns and implement a variety of interventions to help students complete their coursework on time, score well on tests like the ACT, and graduate college-ready. Below are a few examples of tools that can be used to share K-12 student-level data:
FAFSA Completion Tool. A strong correlation exists between FAFSA completion and college enrollment. Does your community want to know if students are filing for student financial aid, especially Pell Grants? Having reliable data on FAFSA submissions can uncover those answers. In 2012, the U.S. Department of Education developed a FAFSA Completion Tool that provides high schools with real-time data to track their students’ FAFSA completion rates, helping communities to identify low rates and form a strategy to raise them.
Summer Melt Handbook. A high percentage of students who have every intention of attending college do not actually enroll the fall after graduation. Does your community want to know how to identify these students and support them in following through with their plans? The Strategic Data Project at Harvard University’s Center for Education Policy Research offers a guide for stakeholders who want to measure the extent of summer melt and design and implement a summer counseling initiative to mitigate the problem in their communities.
Student Achievement Predictive Data Model. Does your community want to identify at-risk students early and effectively? This data tool enables schools to harness the tremendous amount of data they possess to identify the relation of one variable to another, allowing schools and other student-serving organizations to identify the most important benchmarks students must meet in terms of college readiness. Once at-risk students are identified, school and community leaders can direct resources towards meeting those students’ needs.
This section of the guidebook explores this last tool in more depth and includes an interview with a lead researcher at Summit Education Initiative (SEI) in Akron, Ohio to shed light on how they have used predictive data modeling to support students across Summit County. We also include a data-sharing agreement template to demonstrate language you can use to share student-level data between a school district and community partners. Finally, this chapter ends with a list of additional resources where you can find more information on K-12 student-level data tools.
Akron, Ohio: How to Use a Student Achievement Predictive Model to Help At-Risk Students Graduate High School College- and Career-Ready
- Matt Deevers, Ph.D., Senior Research Associate, Summit Education Initiative
IHEP spoke with Matt Deevers, Senior Research Associate at Summit Education Initiative (SEI) in Akron, Ohio, to learn about the student achievement predictive model that SEI has built to identify K-12 students who are at risk of not graduating college and career-ready. Deevers describes how SEI developed a data-sharing agreement that allowed them access to student-level data without compromising student privacy. Read this interview to learn which data go into building such a model and how you can use its results to inform early and effective interventions for students.
IHEP: How did you decide that building a student achievement predictive model was the right data tool for your purpose?
Having worked in the K-12 space for 18 years, one thing I knew was that there is no shortage of data in schools. Instead, they are usually overrun with data, and they don’t have the staff available to conduct some more complex analytical research projects. I also knew that though there is an overwhelming amount of data in schools, there is sometimes a startling lack of information, and we define “information” as data that can direct action in positive ways to support student success.
What we were seeing was that there were a lot of distinct data initiatives, so we thought it was time to step in and say, “How do we connect all of these seemingly disconnected pieces of data in a way that can focus and organize school efforts?”
IHEP: How did you determine what student achievement meant in order to measure it?
Research shows us that an ACT composite score of 21 or better gives a student a 50 percent chance of earning a B or better in a first-year course, and a 75 percent probability of earning a C or better, so that is the outcome variable we use.
The Board of Regents in the state of Ohio also adopted remediation-free standards, which basically say that if you have a 22 math score and a 21 reading score, and you are accepted by a state school, then that state school cannot require you to take a remedial course—even if a test like COMPASS or ACCUPLACER recommends your placement into those courses. So the ACT composite score becomes an important common standard for judging college readiness.
IHEP: How were you able to gain access to the data you needed to build this tool?
We have a data access agreement. There are 17 public school districts in Summit County, Ohio, and there’s an annual agreement with two levels to it signed between the superintendent of the school district and Summit Education Initiative. The first level says that we have access to the district’s data for 12 months or until somebody opts out.
The second part of that agreement is a project-authorization form, approved on a project-by-project basis, where SEI promises not to actually access or pull any data unless we are specifically conducting a project on the district’s behalf.
This agreement can now be signed digitally, so that really speeds up the process.
IHEP: How did you develop this data-sharing agreement?
We actually borrowed a data agreement from another member of the StriveTogether network, Seattle. So we took that and tailored it to the nature of the work that we’d be doing.
Before we asked all the districts to consider signing it, we met with one partner district that was familiar with and supportive of our work. The school superintendent also happened to be on our board of directors, and we knew we needed buy-in at that level. Their most immediate concern was raising student achievement on a 10th grade graduation test. In Ohio, students start taking their graduation test in 10th grade, and if they pass it the first time, they don’t have to take it again. If students pass at the highest performance levels, it reflects positively on the school. So this early adopter partner district had targeted that as an area of need. They signed our agreement, and we did the work for them.
IHEP: How did you help this district address its concern of raising achievement on the 10th grade graduation test?
We built predictive models for them that specifically connected grade point average and some early college readiness test scores to the state test scores, so that they could pinpoint with much greater accuracy the students who were unlikely to meet with success at the highest performance levels, without additional intervention.
Once we had that successful relationship with one district, we were able to go to the other school districts. It helps that our early adopter partner district is willing to share its experiences with new districts.
IHEP: Did you encounter concerns about safeguarding students’ private information when sharing data?
The first thing I’ll say is that when you are conducting research to identify critical values that districts can act on, you only need personally identifiable information (PII) for a very brief moment, and only to connect disconnected datasets—like connecting attendance totals to grade point averages to state test scores to ACT scores to graduation lists. The PII is only used to link the data together, and then it’s eliminated. So by the time I start doing the work at the Summit Education Initiative, the data are completely anonymous, and I think that’s an important thing to keep in mind.
When you are in the research phase, stripping data of PII as early and quickly as possible is definitely a best practice. In the example I mentioned above, where I was linking the 10th grade fall assessment with the 11th grade first grading period GPA, I stripped the information from the students who had been through the pipeline. I study the trends in the data from deidentified data, so that I can understand what critical values can then be mapped onto current live data.
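The link-then-strip workflow described above can be sketched in a few lines. This is an illustrative example, not SEI's actual pipeline; the student IDs, measures, and field names are invented:

```python
# Minimal sketch of linking disconnected datasets via PII, then stripping it.
# A hypothetical student_id joins the records; it never reaches the analysis set.

attendance = {"S001": 0.95, "S002": 0.81, "S003": 0.99}  # student_id -> attendance rate
gpas       = {"S001": 3.4,  "S002": 2.1,  "S003": 3.8}   # student_id -> fall GPA
act_scores = {"S001": 24,   "S002": 17,   "S003": 28}    # student_id -> ACT composite

# Join on the identifier...
linked = [
    {"attendance": attendance[sid], "gpa": gpas[sid], "act": act_scores[sid]}
    for sid in attendance.keys() & gpas.keys() & act_scores.keys()
]

# ...and confirm the identifier was dropped before any trend analysis begins.
assert all("student_id" not in row for row in linked)
```

The same pattern scales to real student information systems: the ID column exists only inside the join step, and the deidentified rows are what the researcher actually studies.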
When we get into the phase where we are turning the data back to the schools in real time in a way that we want them to drive action, then we use a cloud-based database solution that allows us to have record-level security. When the current live data are extracted, because they have PII on them, they are immediately uploaded to this secure cloud-based database system that allows the principals then to log onto a web portal and see the results.
IHEP: Can you describe how the tool works?
The tool allows me to take two pieces of data and put them together. One project, for example, required connecting a 10th grade fall readiness assessment with the grade point average from the first grading period of 11th grade. These are two disparate data points, both of which are highly correlated with the ACT score that students earn in the spring of 11th grade.
I took those two datasets together so that I could identify the critical values that would lead to a student succeeding on the ACT or struggling on the ACT. This is how we get blood pressure, how we know 120 over 80 matters, because we measured it in thousands and thousands of people over time, so the same thing applies here. I study the trends in the data so that I can understand what variables impact student success.
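One simple way to derive such a critical value (a hypothetical method on invented data, not SEI's actual model) is to scan candidate GPA cutoffs over historical, deidentified records and keep the one that best separates students who reached the ACT benchmark of 21 from those who did not:

```python
# Illustrative only: find the GPA cutoff that best predicts ACT >= 21
# in a (tiny, invented) set of deidentified (gpa, act) pairs.

history = [(2.0, 16), (2.4, 18), (2.8, 19), (3.0, 21),
           (3.2, 22), (3.5, 25), (3.7, 27), (3.9, 30)]

def best_cutoff(pairs, benchmark=21):
    candidates = sorted({gpa for gpa, _ in pairs})
    def accuracy(cut):
        # Fraction of students for whom "GPA at or above the cutoff"
        # agrees with "met the ACT benchmark".
        return sum((gpa >= cut) == (act >= benchmark) for gpa, act in pairs) / len(pairs)
    return max(candidates, key=accuracy)

cutoff = best_cutoff(history)
print(f"Flag students below a {cutoff} GPA for early support.")
```

A production model would use far more data and a proper statistical fit, but the idea is the same as the blood-pressure analogy: thresholds earn their meaning from many measured cases.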
IHEP: How do you decide which data points will enable you to determine which students are most in danger of not meeting an ACT composite score of 21 or better?
Again, we began with the assumption that there are plenty of data out there and we don’t need to create something new, so we set up three criteria that just guided our own work. Data needed to be: 1) commonly collected, 2) easily understood, and 3) actionable.
After the No Child Left Behind Act of 2001, we had reading and math scores from students in grades 3 through 8 and at least once in high school. We could establish a correlation between those early indicators—8th grade mathematics and high school state test scores—and the ACT. We also know that the way most schools work, students get a grade point average every quarter, and so why not harness the use of this incredible amount of data that is already being collected to channel attention and energy towards something like graduating college-ready?
So the data points where we first focused our energy were state test scores, quarterly grade point averages and cumulative grade point averages, and also attendance data because our urban school partners feel students often have a hard time staying on track because they just can’t get to school.
IHEP: How long would you say it took SEI to develop this tool, including all planning, designing, and testing phases, as well as engaging the early adopter district?
We spent about 18-24 months getting to a place where if a school calls us, concerned about which ninth graders most need attention, we can turn around within a couple of days and give them web-based secure access into their own data that enables them to see the students who are in need of the greatest attention.
However, our work is still in its infancy because some of our partners use different sites to house their students’ data. There is nothing quite automated about the work just yet. We just continue to refine and enhance it until the value proposition is so clear that the districts themselves are willing to invest a little bit of their own human resources into helping us aggregate and report their own data into our cloud system.
We hope to get to a place where the schools would task someone with automating the creation of datasets, because then it could be pushed automatically to a cloud-based database straight from the school district. Then we’d just be managers of the database without actually ever pulling down the data. Ultimately our goal is to write ourselves out of the mechanical processes, which will free up more of our time to facilitate dialogue about what to do with the data once you have them.
IHEP: How do you believe school districts can use these data to inform interventions that would increase student success?
We don’t necessarily think that the data we collect can, on their own, identify a student at risk, or that they hold all the answers that could direct us to the appropriate interventions. So over the past two years we developed another data tool, a survey of social/emotional factors that can be used in schools.
By using this survey, we begin to create a more comprehensive understanding of each student. We believe if the data are robust enough, then paths to support that child will be made clear. We are very careful in our mission of “working with schools,” not “working on schools,” and so we try to put enough data into the school leaders’ hands in a way that we get them right to the edge of, “Okay, now we know what we need to do,” and then they can take it from there.
IHEP: Could you please provide an example of how these tools were used to identify at-risk students and the specific interventions the data informed?
In the spring of 2013, the early adopter partner district I mentioned earlier wanted to identify students who were at risk for not exceeding minimum standards on the state test. We combined some predictive modeling work with some of the other work that we do on the social/emotional development and positive psychology and gave these data to the district. The building principal then met with each of the department chairs and gave them the list of at-risk students who were not going to exceed minimum standards on the state tests, unless they took action.
Then the department chairs sat down with each student and said, “The good news is we have every reason to believe you are going to be successful on the state test in a few months, but we have studied these data, and we believe that with a little extra effort, you could blow the doors off this and be very, very successful and score at the higher performance levels. We have structured some time that we would like to invite you to. We will take care of your lunch. We are going to do group study sessions and review activities and really prepare you for success, so that when test day comes, you won’t need to worry about how prepared you are.”
So they took our predictive model, and combined it with their own approach to working with kids that said, “We are confident in your abilities, and we want to help enhance them.”
IHEP: Does SEI involve itself in training district teachers and administrators on what to do with the data you collect?
Our Ready High School Network is made up of teams that represent each one of the high schools in the urban school district and about 50 percent of the suburban high schools. This network meets quarterly and begins each meeting with what we call a “data dip,” where we focus on one particular data point.
So, for example, each high school will be given the FAFSA data from the government, which show the number of students who have started the FAFSA and the number who have completed it. We combine that with the total number of students in their 12th-grade class to say, “Here is your gap analysis.”
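The arithmetic behind such a gap analysis is simple to sketch; the counts below are invented for illustration:

```python
# Illustrative FAFSA gap analysis: combine federal started/completed counts
# with the school's own senior-class size.

def fafsa_gap(started, completed, seniors):
    return {
        "completion_rate": completed / seniors,       # share of seniors done
        "started_not_completed": started - completed, # stalled mid-form
        "never_started": seniors - started,           # not yet begun
    }

report = fafsa_gap(started=180, completed=150, seniors=250)
print(report)  # 60% completion, 30 stalled, 70 never started
```

The two gap groups suggest different responses: students who stalled mid-form may need help resolving a specific question, while students who never started may need outreach about financial aid in general.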
When one member of the network finds something that seems to be working, that member is invited to come back to the next meeting and actually showcase their work. For instance, one school felt they had a really great college visit model that included lessons for students before they go on the visit, tasks to complete while they are on the visit (e.g., a scavenger hunt), and reflective activities that followed the college visit. By sharing their model with other schools that felt their college visits were lacking, this networking experience starts to produce a more standard protocol for how to do college visits.
So it is really these network teams that we think drive the change. We don’t have the personnel to go out to every school and tell them what to do, nor do we want to, and we certainly don’t know what all the answers are. I believe that the schools know how to best meet the needs of the students in their schools. They just need a little help organizing the information and finding clarity of focus.
IHEP: Have you seen an increase in graduation rates over the past few years? Perhaps specifically graduation rates of college-ready students?
Yes, we have. I don’t claim responsibility for that, but I do share in its celebration. We have been collecting and aggregating the college-ready rates of our students at the county level for the past three years, and what we found was that college readiness (again, which we define as students earning a 21 or higher on the ACT) improved by five percent from 2012 to 2014. That means, in our county, there are 240 more students every year who are leaving high school prepared for success in a postsecondary program.
IHEP: Would you like to offer any last words of wisdom to CPA communities who are looking to use predictive data modeling to identify students most in need of intervention?
First, we understand that schools are facing very immediate needs in isolation from the big picture. Graduating ready for college is great, for example, but schools are worried about graduation rates, period. Schools can’t think about who will do well on the ACT, because it happens at the end of 11th grade, while they’re concerned about kids not passing their exams at the end of 9th grade. Now, often, the same kid who was at risk for not passing their 9th-grade test is the kid who is at risk for not graduating with a 21 on the ACT, and we really need to help people understand this so that our data can have more impact.
Secondly, there are such wild variations in some of the data collection habits of school districts that we’re struggling to find something that can be universally referenced, yet also still actionable.
I will give you a down-in-the-weeds example, but for people who are starting to work with school districts, this is valuable. At the end of the first nine weeks of school, kids get a grade point average. It is not their cumulative grade point average, but it is their grade point average for how they did in the last 45 days. We believe that is the most valuable of grade point averages because you get a fresh one every 45 days.
The limitation, however, is that School A might call that “quarter 1” in their data system. School B might call that “QPR1.” School C might call it “first quarter.” There are so many variations on the naming conventions for how the data get into the system, so coming up with a universal way to pull it out is quite a challenge.
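A small alias table is one common way to tame these naming variations. The district labels below are the ones mentioned above; the canonical keys are invented for illustration:

```python
# Illustrative normalization of district-specific quarter-GPA field names
# onto invented canonical keys (q1_gpa, q2_gpa, ...).

ALIASES = {
    "quarter 1": "q1_gpa",
    "qpr1": "q1_gpa",
    "first quarter": "q1_gpa",
    "quarter 2": "q2_gpa",
    "qpr2": "q2_gpa",
}

def normalize(record):
    # Map each known label to its canonical key; pass unknown keys through.
    return {ALIASES.get(k.strip().lower(), k): v for k, v in record.items()}

school_a = normalize({"Quarter 1": 3.1})  # -> {"q1_gpa": 3.1}
school_b = normalize({"QPR1": 2.7})       # -> {"q1_gpa": 2.7}
```

Once every district's records pass through the same table, a single query can pull first-quarter GPAs county-wide, which is exactly the universal reference the interview says is hard to find.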
So we start looking toward cumulative GPA because there’s much greater consistency in the way those data are encoded when they’re put into the student information system. However, one limitation of cumulative GPA is that it is considerably more stable, meaning it doesn’t move as much. You could have a kid go from a 2.0 to a 3.1 from one grading period to the next, and that is a huge victory we should all celebrate. But rarely does the cumulative GPA move that much. As we try to balance the expediency of data collection with the value that the data bring to the conversation, we should select the data we use carefully.
Strategic Data Project Summer Melt Handbook: A Guide to Investigating and Responding to Summer Melt (2013: Center for Education Policy Research at Harvard University)
This handbook explains how school administrators, high school counselors, and community-based organizations can reduce “summer melt,” the phenomenon in which students who enroll in a postsecondary institution in the spring fail to attend the following fall. It offers strategies for helping districts collect data on summer melt among their students and provides various examples of how a district can decrease its rate of summer melt depending on its resources, information, and connections with local colleges or college access organizations. The handbook includes five case studies of initiatives from community organizations and schools to implement summer melt interventions, detailing costs, timelines, and results.
Data Usage and Platforms for College Access and Success (2014: National College Access Network)
This brief details different platforms available for tracking student data that can be used by programs and practitioners seeking to advance college access and success through program improvement and scaling. This resource uses survey results to compare different platforms—including SalesForce, Naviance, College Greenlight, and others—by looking at what data they are collecting, how the data are stored, platform strengths and weaknesses, and how they can best be utilized for different purposes. NCAN also describes the experiences of its members in utilizing each of the platforms.
Privacy Technical Assistance Center (U.S. Department of Education)
The Privacy Technical Assistance Center is a “one-stop” resource for education stakeholders who use student-level data. These resources are especially important in light of the increasing use of K-12 and P-20W longitudinal data. The Center provides tools and assistance—both online and offline—to help organizations and institutions maximize the quality and usefulness of student-level data without compromising students’ privacy. Its Privacy Toolkit offers a collection of up-to-date information and resources about FERPA and other legal, statistical, and policy issues. The Center also offers site visits to state and local educational agencies, informational forums, and a support center with an interactive help desk.
Student Privacy Resource Center (FERPA|SHERPA)
This website provides materials for parents, school officials, policymakers, tech companies and education service providers on how to use student data responsibly under FERPA and other student privacy regulations. This resource also includes information on the privacy laws and data security, along with policy papers and other resources concerning student privacy.
A Stoplight for Student Data Use (2014: Data Quality Campaign)
This guide provides an overview of the Family Educational Rights and Privacy Act (FERPA) and describes the scenarios in which educators and policymakers can and cannot share students’ personally identifiable information under the law. This resource should be used to understand key provisions of the law and determine when it is necessary to consult the law or an expert.