Crime Outcome Prediction

Motivation:

Problem

It is widely understood that marginalized groups are disproportionately harmed by the criminal justice system. Bias plays a significant part in how severely a crime is punished, which further widens societal inequality.

Solution

By analyzing crime data, underlying prejudices and biases in the outcomes of crime incidents can be identified. Using a dataset that describes each case (location, type of crime, FBI code, and whether an arrest was made), trends can be analyzed to understand what the standard outcome of a case should be. If a dataset with sentencing information is available, the number of years sentenced for a case could also be predicted. The goal of this project is to use previous crime data to predict and classify the outcome of a crime incident.

Impact

With the increasing use of AI in jury cases, it is essential to understand the biases and ethical considerations behind many arrests. Such a classifier could be used to create an unbiased, objective verdict on a criminal case that can be used as a baseline, helping improve the quality and efficiency of the criminal justice system.

A potential setback for this classifier is that the data itself may contain many discrepancies rooted in prejudice. For example, variables such as race, class, gender, and age all influence the outcome of a criminal case, and training on a biased dataset may produce biased outcomes.

Dataset

Detail

Link to dataset: https://www.kaggle.com/datasets/chicago/chicago-crime

Relevant columns:

  • Date
  • Address
  • Primary Type
  • Description of crime
  • Location description
  • Arrest (T/F)
  • FBI Code
  • Year
  • Community area

Sample rows:

| date | primary_type | description | location_description | arrest | community_area | fbi_code | year |
|---|---|---|---|---|---|---|---|
| 07/17/2012 11:30:00 | PUBLIC PEACE VIOLATION | RECKLESS CONDUCT | SIDEWALK | True | 50 | 26 | 2012 |
| 05/24/2002 11:47:42 | ROBBERY | ARMED: HANDGUN | PARKING LOT/GARAGE(NON.RESID.) | False | 50 | 03 | 2002 |
| 05/08/2005 09:20:00 | BATTERY | AGGRAVATED: OTHER DANG WEAPON | SIDEWALK | False | 49 | 04B | 2005 |
| 06/21/2007 11:30:00 | BURGLARY | FORCIBLE ENTRY | CHA APARTMENT | False | 49 | 05 | 2007 |
| 09/07/2010 09:41:00 | THEFT | OVER $500 | SCHOOL, PUBLIC, GROUNDS | True | 50 | 06 | 2010 |
| 12/15/2008 10:18:00 | HOMICIDE | FIRST DEGREE MURDER | STREET | False | 49 | 01A | 2008 |
| 04/21/2018 10:00:00 | CRIM SEXUAL ASSAULT | NON-AGGRAVATED | APARTMENT | False | 50 | 02 | 2018 |
| 09/05/2018 12:00:00 | CRIMINAL TRESPASS | TO LAND | CONSTRUCTION SITE | True | 49 | 26 | 2018 |
| 05/10/2007 03:15:00 | NARCOTICS | FORFEIT PROPERTY | STREET | True | 49 | 26 | 2007 |
| 02/11/2003 01:35:00 | OTHER OFFENSE | VIOLATE ORDER OF PROTECTION | RESIDENCE | False | 49 | 26 | 2003 |

This project seeks to use the features above to estimate the outcome of an inputted criminal case.
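As a rough illustration of how the columns above could be turned into model inputs, the sketch below parses a few of the sample rows into feature dictionaries with an arrest label. It is a minimal sketch under stated assumptions, not the project's actual pipeline: the CSV layout, the timestamp format (taken from the sample rows as printed), the derived `month` and `hour` features, and the helper name `row_to_example` are all hypothetical choices made for the example.

```python
import csv
import io
from datetime import datetime

# Three rows copied from the sample data above, written as CSV for illustration.
# The real export may use a different layout or timestamp format (assumption).
SAMPLE_CSV = """date,primary_type,description,location_description,arrest,community_area,fbi_code,year
07/17/2012 11:30:00,PUBLIC PEACE VIOLATION,RECKLESS CONDUCT,SIDEWALK,True,50,26,2012
05/24/2002 11:47:42,ROBBERY,ARMED: HANDGUN,PARKING LOT/GARAGE(NON.RESID.),False,50,03,2002
09/07/2010 09:41:00,THEFT,OVER $500,"SCHOOL, PUBLIC, GROUNDS",True,50,06,2010
"""


def row_to_example(row):
    """Turn one CSV row into (features, label); the feature choice is hypothetical."""
    when = datetime.strptime(row["date"], "%m/%d/%Y %H:%M:%S")
    features = {
        "primary_type": row["primary_type"],
        "description": row["description"],
        "location_description": row["location_description"],
        "community_area": row["community_area"],
        # Kept as a string: codes such as 04B and 01A are not numeric.
        "fbi_code": row["fbi_code"],
        "year": int(row["year"]),
        "month": when.month,
        "hour": when.hour,
    }
    label = row["arrest"] == "True"  # the prediction target
    return features, label


examples = [row_to_example(r) for r in csv.DictReader(io.StringIO(SAMPLE_CSV))]
```

Keeping the categorical columns as strings leaves the encoding decision (one-hot, target encoding, and so on) to whichever model is chosen later.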

Potential Problems

As stated before, a major problem is bias in the dataset. Existing biases in the criminal justice system influence whether an arrest is made, which undermines the credibility of this classifier. Furthermore, because the nuances of each criminal case must be considered, this classifier can only serve as a baseline judgment on whether a person should be arrested for their crime.
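One simple way to start inspecting the data for uneven outcomes is to compare arrest rates across groups. The sketch below does this for `community_area` using only the ten sample rows shown earlier; the helper name `arrest_rate_by_group` is hypothetical, and ten rows are far too few to support any conclusion. A gap in rates does not by itself demonstrate bias, since the mix of crime types differs between areas.

```python
from collections import defaultdict

# (community_area, arrest) pairs taken from the ten sample rows, in order.
SAMPLE = [
    ("50", True), ("50", False), ("49", False), ("49", False), ("50", True),
    ("49", False), ("50", False), ("49", True), ("49", True), ("49", False),
]


def arrest_rate_by_group(pairs):
    """Return the fraction of incidents that ended in an arrest for each group."""
    totals = defaultdict(int)
    arrests = defaultdict(int)
    for group, arrested in pairs:
        totals[group] += 1
        arrests[group] += int(arrested)
    return {group: arrests[group] / totals[group] for group in totals}


rates = arrest_rate_by_group(SAMPLE)
for area, rate in sorted(rates.items()):
    print(f"community_area {area}: arrest rate {rate:.2f}")
```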

Method:

This is a classification problem: the data above can be used to predict whether a person will be arrested for their crime. With a more thorough dataset that includes sentencing for convicted cases, it could also become a regression problem, where the number of years sentenced is predicted from earlier cases. This approach offers a less biased and more efficient method for determining arrest outcomes by eliminating confounding variables that typically harm minorities.
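To make the classification framing concrete, here is a minimal sketch of a baseline classifier. It is not a trained model and not necessarily the method this project will use: it swaps in a simple majority-vote rule that predicts, for each `primary_type`, the arrest outcome seen most often in the training data, and falls back to the overall majority for unseen types. The function names `fit_baseline` and `predict` are hypothetical, and the training pairs are the ten sample rows shown earlier.

```python
from collections import Counter, defaultdict

# (primary_type, arrest) pairs from the ten sample rows.
TRAIN = [
    ("PUBLIC PEACE VIOLATION", True), ("ROBBERY", False), ("BATTERY", False),
    ("BURGLARY", False), ("THEFT", True), ("HOMICIDE", False),
    ("CRIM SEXUAL ASSAULT", False), ("CRIMINAL TRESPASS", True),
    ("NARCOTICS", True), ("OTHER OFFENSE", False),
]


def fit_baseline(pairs):
    """Learn the most common arrest outcome per crime type, plus a global default."""
    per_type = defaultdict(Counter)
    overall = Counter()
    for crime_type, arrested in pairs:
        per_type[crime_type][arrested] += 1
        overall[arrested] += 1
    table = {t: c.most_common(1)[0][0] for t, c in per_type.items()}
    default = overall.most_common(1)[0][0]
    return table, default


def predict(model, crime_type):
    """Predict arrest (True/False); unseen crime types get the global default."""
    table, default = model
    return table.get(crime_type, default)


model = fit_baseline(TRAIN)
print(predict(model, "NARCOTICS"))  # a type seen in training
print(predict(model, "ARSON"))      # unseen type, uses the global default
```

A baseline like this is mainly useful as a reference point: any learned classifier built on the full feature set should be compared against it before its predictions are trusted.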