Crime is a major concern for cities and communities around the world. Crime rates are rising across the country, and it is becoming harder to combat different types of crime. Police departments have been looking for strategies and techniques that could help prevent it.
Police departments publish crime reports each year. These reports contain the necessary information about every crime reported. If the data are analyzed properly, artificial intelligence could help predict serious violent crimes. The goal of this project is to create a model that analyzes and identifies the relationship between neighborhoods, the type of crime, and the number of incidents reported.
If successful, this model could predict the crimes that are most likely to occur in certain neighborhoods in Boston. It could help officers identify the areas where crimes are more likely to occur. By compiling and analyzing data from multiple sources, predictive methods identify patterns and generate recommendations about where crimes are likely to occur.
We will use the Boston Police Department Crime Incident Reports (August 2015 - To Date) to observe the following factors of crime reports:
import pandas as pd
reader = pd.read_csv('crime2022.csv')
reader.drop(columns=['OFFENSE_CODE_GROUP','INCIDENT_NUMBER' ], inplace=True)
reader.head()
/var/folders/n_/qrrjhxxx3351chd0b7r6t2bc0000gn/T/ipykernel_1881/138618666.py:2: DtypeWarning: Columns (0) have mixed types. Specify dtype option on import or set low_memory=False.
  reader = pd.read_csv('crime2022.csv')
| | OFFENSE_CODE | OFFENSE_DESCRIPTION | DISTRICT | REPORTING_AREA | SHOOTING | OCCURRED_ON_DATE | YEAR | MONTH | DAY_OF_WEEK | HOUR | UCR_PART | STREET | Lat | Long | Location |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 619 | LARCENY ALL OTHERS | D4 | 167 | 0 | 2022-01-01 00:00:00 | 2022 | 1 | Saturday | 0 | NaN | HARRISON AVE | 42.339542 | -71.069409 | (42.33954198983014, -71.06940876967543) |
| 1 | 2670 | HARASSMENT/ CRIMINAL HARASSMENT | A7 | | 0 | 2022-01-01 00:00:00 | 2022 | 1 | Saturday | 0 | NaN | BENNINGTON ST | 42.377246 | -71.032597 | (42.37724638479816, -71.0325970804128) |
| 2 | 3201 | PROPERTY - LOST/ MISSING | D14 | 778 | 0 | 2022-01-01 00:00:00 | 2022 | 1 | Saturday | 0 | NaN | WASHINGTON ST | 42.349056 | -71.150498 | (42.34905600030506, -71.15049849975023) |
| 3 | 3201 | PROPERTY - LOST/ MISSING | B3 | 465 | 0 | 2022-01-01 00:00:00 | 2022 | 1 | Saturday | 0 | NaN | BLUE HILL AVE | 42.284826 | -71.091374 | (42.28482576580488, -71.09137368938802) |
| 4 | 3201 | PROPERTY - LOST/ MISSING | B3 | 465 | 0 | 2022-01-01 00:00:00 | 2022 | 1 | Saturday | 0 | NaN | BLUE HILL AVE | 42.284826 | -71.091374 | (42.28482576580488, -71.09137368938802) |
Our project seeks to use the features above to identify the areas where different crimes are more likely to occur. District, reporting area, date (year, month, day of week, hour), street, latitude, longitude, and location provide detailed data on where and when the crimes occurred. Offense description, shooting, UCR part, and offense code provide detailed data on what crimes occurred.
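As a minimal sketch of how these two feature groups might be separated, the snippet below rebuilds a few columns of the five rows shown above as a stand-in for `reader`, splits them into "where and when" and "what" groups, and reports the share of missing values per column. The names `sample`, `where_when_cols`, `what_cols`, and `missing_share` are illustrative and not part of the original notebook.

```python
import pandas as pd

# Illustrative stand-in for `reader`: a subset of columns from the five rows shown above.
sample = pd.DataFrame({
    "OFFENSE_CODE": [619, 2670, 3201, 3201, 3201],
    "OFFENSE_DESCRIPTION": [
        "LARCENY ALL OTHERS",
        "HARASSMENT/ CRIMINAL HARASSMENT",
        "PROPERTY - LOST/ MISSING",
        "PROPERTY - LOST/ MISSING",
        "PROPERTY - LOST/ MISSING",
    ],
    "DISTRICT": ["D4", "A7", "D14", "B3", "B3"],
    "SHOOTING": [0, 0, 0, 0, 0],
    "OCCURRED_ON_DATE": ["2022-01-01 00:00:00"] * 5,
    "HOUR": [0, 0, 0, 0, 0],
    "UCR_PART": [None] * 5,  # shown as NaN in the preview
    "STREET": ["HARRISON AVE", "BENNINGTON ST", "WASHINGTON ST",
               "BLUE HILL AVE", "BLUE HILL AVE"],
    "Lat": [42.339542, 42.377246, 42.349056, 42.284826, 42.284826],
    "Long": [-71.069409, -71.032597, -71.150498, -71.091374, -71.091374],
})

# Assumed grouping of columns, following the description in the text.
where_when_cols = ["DISTRICT", "OCCURRED_ON_DATE", "HOUR", "STREET", "Lat", "Long"]
what_cols = ["OFFENSE_CODE", "OFFENSE_DESCRIPTION", "SHOOTING", "UCR_PART"]

# Parse the timestamp so date parts can be derived from it later.
sample["OCCURRED_ON_DATE"] = pd.to_datetime(sample["OCCURRED_ON_DATE"])

where_when = sample[where_when_cols]
what = sample[what_cols]

# Fraction of missing values per column; only columns with gaps are printed.
missing_share = sample.isna().mean()
print(missing_share[missing_share > 0])
```

On the full data set the same check could be run on `reader` directly; columns that are mostly empty, such as `UCR_PART` in the preview, may be of limited use as features.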
Our assumption is that everything in the crime reports is accurate. However, police descriptions in the reports are sometimes a source of inaccuracy when constructing statistical crime records. The time at which the incident was reported could be wrong or the individual filling
We will categorize the crimes based on offense code and offense description. Doing this allows us to discover where and which crimes occurred the most. This way we can see whether crimes of a similar level of severity occur in the same places. The model would group the most committed crimes with the locations where they were committed. Police could then be on the lookout and be prepared for a certain type of crime in areas with high numbers of incidents.
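The grouping described above could be sketched as follows. This is only one possible approach, assuming `DISTRICT` stands in for the neighborhood and using a small hand-built frame from the rows shown earlier in place of `reader`. The names `sample`, `counts`, and `top_per_district` are illustrative.

```python
import pandas as pd

# Illustrative stand-in for `reader`, built from the five preview rows.
sample = pd.DataFrame({
    "OFFENSE_CODE": [619, 2670, 3201, 3201, 3201],
    "OFFENSE_DESCRIPTION": [
        "LARCENY ALL OTHERS",
        "HARASSMENT/ CRIMINAL HARASSMENT",
        "PROPERTY - LOST/ MISSING",
        "PROPERTY - LOST/ MISSING",
        "PROPERTY - LOST/ MISSING",
    ],
    "DISTRICT": ["D4", "A7", "D14", "B3", "B3"],
})

# Count incidents for each (district, offense) pair, most frequent first.
counts = (
    sample.groupby(["DISTRICT", "OFFENSE_CODE", "OFFENSE_DESCRIPTION"])
    .size()
    .reset_index(name="INCIDENTS")
    .sort_values("INCIDENTS", ascending=False, kind="stable")
)

# Keep the first (most frequent) offense for each district.
top_per_district = (
    counts.drop_duplicates(subset="DISTRICT", keep="first")
    .set_index("DISTRICT")
)

print(counts)
print(top_per_district)
```

A plain count like this is a descriptive baseline rather than a predictive model, but it gives the per-district ranking of offenses that any later model would need to improve on.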