Data, Deaths, and the Damn Prison System

Steal this (note)book

If you want to reproduce the charts from this notebook, clone this repository, which has everything you need, including the CSV files. The data comes from DoltHub's jails data bounty. Check it out. The data analysis was done in polars, with the visualizations in Altair.

"An open school is a closed prison"

If I want to know pretty much anything about the schools in my area, I can find the information online. The National Center for Education Statistics (NCES) runs an online database where I can search for any school by name or location. I can see at a glance how many students are enrolled in each grade, plus a breakdown by race and gender. I can see the student-to-teacher ratio. I can even see how many students get free lunch.

taylor-road

Screenshot of the NCES page for my middle school, Taylor Road.

On the other hand, if I want to know the same information about a local jail, it's not easy at all. No such database exists. More than likely, I'd have to visit the jail itself to get its paper records, or pray that they've been putting the information online in a format that's easy to search.

Prison deaths are even harder to track down. In 2020, Reuters News made 1500 FOIA requests to US prisons and jails to find out when and how inmates died. Later on we'll show how to use that data to make a map of the deadliest counties to be incarcerated in -- at least, for the ones we have data for.

jails-california

The deadliest counties for prisoners in California.

In the meantime, let's look at what we got.

What our bounty hunters did

jails-texas

An example of data from Texas.

In our jails data bounty we asked the question: how many jails publish information, and what precisely do they publish? We sent out our bounty hunters (paid volunteers) to find what they could.

jails-snapshots

You can see exactly where our bounty hunters found statistical inmate data (we mostly didn't scrape inmate registries). The more snapshots we found, i.e., the more inmate-demographic records, the darker the red.

The problem was, for many states, our bounty hunters didn't find anything. Thinking about why jails don't publish more records, I concluded that

  1. there's no incentive to do it and
  2. there's (probably) no law that says you have to.

(If you have any thoughts on this, please write me at alec@dolthub.com.)

Scoring the data

We scored the jails that provided statistics by completeness. All the colored counties below gave us some kind of stats, but many offered only a count of inmates and not much more. The more demographic information they offered, the higher the score.

jails-richness
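To make the scoring idea concrete, here's a minimal sketch in plain Python (the notebook itself used polars). It assumes a score equal to the fraction of demographic fields a snapshot fills in. The field names and the two sample rows are made up for illustration, and the exact formula behind the map may differ.

```python
# Hypothetical demographic fields; the real snapshot schema may differ.
DEMOGRAPHIC_FIELDS = ["total", "male", "female", "white", "black", "hispanic"]


def completeness_score(snapshot: dict) -> float:
    """Return the fraction of demographic fields that are filled in."""
    filled = sum(
        1 for field in DEMOGRAPHIC_FIELDS if snapshot.get(field) is not None
    )
    return filled / len(DEMOGRAPHIC_FIELDS)


# Made-up rows for illustration only.
count_only = {"total": 412}
detailed = {
    "total": 230, "male": 190, "female": 40,
    "white": 100, "black": 90, "hispanic": 40,
}

scores = {
    "count_only": completeness_score(count_only),
    "detailed": completeness_score(detailed),
}
```

A jail that reports only a headcount lands near the bottom of the scale, while one that breaks inmates down by gender and race scores at the top.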

A whole two-thirds of our records come from PDF files, which were painstakingly OCRed and tabulated. Jails make almost no effort to make their records standardized or machine-readable.

The deadliest counties (that we know about)

As mentioned earlier, Reuters's 1500 FOIA requests for records on prison deaths made it into our incidents table. We can combine this with the BJS's data on prison size (our jails table) to get a measure of the deadliest counties.

jails-deaths
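As a rough sketch of that combination, the snippet below joins death records to jail populations and rolls them up by county. It's plain Python rather than the polars used in the notebook. The field names (`jail_id`, `county`, `population`) and all the rows are assumptions, and deaths per 1,000 inmates is just one plausible measure, not necessarily the one behind the map.

```python
from collections import Counter

# Made-up rows; the real jails and incidents tables have their own schemas.
jails = [
    {"jail_id": "A", "county": "Alpha", "population": 500},
    {"jail_id": "B", "county": "Alpha", "population": 1500},
    {"jail_id": "C", "county": "Beta", "population": 200},
]
incidents = [  # one row per recorded death
    {"jail_id": "A"},
    {"jail_id": "B"},
    {"jail_id": "B"},
    {"jail_id": "C"},
]


def deaths_per_1000(jails: list, incidents: list) -> dict:
    """Join deaths to jails, then compute deaths per 1,000 inmates by county."""
    county_of = {jail["jail_id"]: jail["county"] for jail in jails}

    # Count deaths per county, skipping incidents with no matching jail.
    deaths = Counter(
        county_of[row["jail_id"]]
        for row in incidents
        if row["jail_id"] in county_of
    )

    # Sum the inmate population of every jail in each county.
    population = Counter()
    for jail in jails:
        population[jail["county"]] += jail["population"]

    return {
        county: 1000 * deaths[county] / total
        for county, total in population.items()
        if total > 0
    }


rates = deaths_per_1000(jails, incidents)
```

Normalizing by population matters here: in the toy data the small county has fewer deaths in total but a higher rate.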

We can zoom in on individual states to get a peek at the distribution. For example, Florida:

jails-florida

In Florida, the deadliest county in our dataset is the one containing Miami-Dade County Boot Camp.

A sample of our snapshot data

Finally, to get a sense of what the time series data we captured looks like, here's a sample of it from some of the jails that have the most records. On the more transparent side, some facilities have records going back to 2002,

random-sample

while others go back just a few months.
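For a sense of how you might pick out the best-covered jails, here's a small plain-Python sketch that counts snapshots per facility and finds the first and last date for each. The facility names and dates are invented, and the actual snapshot table stores more than a name and a date.

```python
from datetime import date

# Made-up (facility, snapshot date) pairs for illustration only.
snapshots = [
    ("jail_x", date(2002, 1, 1)),
    ("jail_x", date(2010, 6, 1)),
    ("jail_x", date(2021, 3, 1)),
    ("jail_y", date(2021, 1, 1)),
    ("jail_y", date(2021, 4, 1)),
]


def coverage(snapshots: list) -> list:
    """Return (facility, (count, first, last)) pairs, most records first."""
    spans = {}
    for jail, day in snapshots:
        count, first, last = spans.get(jail, (0, day, day))
        spans[jail] = (count + 1, min(first, day), max(last, day))
    return sorted(spans.items(), key=lambda item: item[1][0], reverse=True)


ranked = coverage(snapshots)
```

Sorting by record count puts the most transparent facilities first, and the first and last dates show how far back each one goes.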

A plea bargain

By publishing this data, we're hoping to get feedback from the broader community: what we did right, what we did wrong, and what we can do better next time to track down this data.

Jails don't have to be black holes for data. But they often are, with organizations relying on web scraping or FOIA requests to see what's going on inside.

Yet jails do a roll call every day. They have records on each inmate. And as a matter of spending, an inmate costs around $30k per year, while a K-12 pupil costs about a third of that.

We also know that the Bureau of Justice Statistics (BJS) does a jail census. But with this limited data only being updated every 5-6 years, and on a voluntary basis from the jails surveyed (!), it's hardly as useful as it could be.

This leaves me with some open questions:

  1. What are the obstacles stopping us from forcing jails to release their information to a central statistical agency? And why is getting this data from schools comparatively easier?
  2. Why is data collection so rare?
  3. What's stopping us from writing laws that force public jails to publish their data?

If you have answers you'd like to share, or just want to discuss, write me at alec@dolthub.com, or click this Discord invite link and ask for @spacelove.
