Using this CSV data set of animals at the Bloomington Animal Shelter, I would like to learn which types of dog are more likely to be adopted, based on a range of features.
import pandas as pd
'''animal shelter data set
Attributes:
intakedate: Date of admission to the shelter. (Date)
intakereason: Reason for admission to the shelter. (String)
istransfer: Whether the animal was transferred from another shelter. (Boolean)
sheltercode: Unique code assigned to the animal. (String)
animalname: Name of the animal. (String)
breedname: Breed of the animal. (String)
basecolour: Color of the animal. (String)
speciesname: Species of the animal. (String)
animalage: Age of the animal. (Integer)
sexname: Sex of the animal. (String)
location: Location of the animal. (String)
movementdate: Date of movement. (Date)
movementtype: Type of movement. (String)
istrial: Whether the animal was on trial. (Boolean)
returndate: Date of return. (Date)
returnedreason: Reason for return. (String)
deceaseddate: Date of death. (Date)
deceasedreason: Reason for death. (String)
diedoffshelter: Whether the animal died offsite. (Boolean)
puttosleep: Whether the animal was euthanized. (Boolean)
'''
# Load the CSV file
df = pd.read_csv('animal-data-1.csv')
# View the first few rows of the data
df.head()
| | index | id | intakedate | intakereason | istransfer | sheltercode | identichipnumber | animalname | breedname | basecolour | ... | movementdate | movementtype | istrial | returndate | returnedreason | deceaseddate | deceasedreason | diedoffshelter | puttosleep | isdoa |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 0 | 15801 | 2009-11-28 00:00:00 | Moving | 0 | C09115463 | 0A115D7358 | Jadzia | Domestic Short Hair | Tortie | ... | 2017-05-13 00:00:00 | Adoption | 0.0 | NaN | Stray | NaN | Died in care | 0 | 0 | 0 |
1 | 1 | 15932 | 2009-12-08 00:00:00 | Moving | 0 | D09125594 | 0A11675477 | Gonzo | German Shepherd Dog/Mix | Tan | ... | 2017-04-24 00:00:00 | Adoption | 0.0 | NaN | Stray | NaN | Died in care | 0 | 0 | 0 |
2 | 2 | 28859 | 2012-08-10 00:00:00 | Abandoned | 0 | D12082309 | 0A13253C7B | Maggie | Shep Mix/Siberian Husky | Various | ... | 2017-04-15 00:00:00 | Adoption | 0.0 | NaN | Stray | NaN | Died in care | 0 | 0 | 0 |
3 | 3 | 30812 | 2013-01-11 00:00:00 | Abandoned | 0 | C1301091 | 0A13403D4D | Pretty Girl | Domestic Short Hair | Dilute tortoiseshell | ... | 2017-04-18 00:00:00 | Foster | 0.0 | 2018-05-29 00:00:00 | Stray | NaN | Died in care | 0 | 0 | 0 |
4 | 4 | 30812 | 2013-01-11 00:00:00 | Abandoned | 0 | C1301091 | 0A13403D4D | Pretty Girl | Domestic Short Hair | Dilute tortoiseshell | ... | 2018-05-29 00:00:00 | Adoption | 0.0 | NaN | Stray | NaN | Died in care | 0 | 0 | 0 |
5 rows × 24 columns
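The preview shows the same id (30812) on two rows: one Foster movement and a later Adoption movement. If each animal should be counted once, one option is to keep only its most recent movement. The following is a minimal sketch of that idea, not a required step; the helper name `latest_movement` is hypothetical, and the small `demo` frame only copies the `id`, `movementdate`, and `movementtype` values visible in the preview above.

```python
import pandas as pd

def latest_movement(df):
    """Keep one row per animal id: the row with the latest movementdate."""
    out = df.copy()
    # Parse the date strings; unparseable values become NaT instead of raising.
    out["movementdate"] = pd.to_datetime(out["movementdate"], errors="coerce")
    # Sort oldest to newest with missing dates first, so that keep="last"
    # picks the newest dated movement for each id.
    out = out.sort_values("movementdate", na_position="first")
    return out.drop_duplicates(subset="id", keep="last")

# Values copied from the df.head() preview (three of the five rows).
demo = pd.DataFrame({
    "id": [30812, 30812, 15801],
    "movementdate": ["2017-04-18 00:00:00", "2018-05-29 00:00:00", "2017-05-13 00:00:00"],
    "movementtype": ["Foster", "Adoption", "Adoption"],
})
final = latest_movement(demo)
print(final)
```

Whether to deduplicate this way is a judgment call: keeping every movement instead would let a fostered-then-adopted animal contribute to both outcomes.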
I will mainly use the attributes animalage, sexname, speciesname, basecolour, breedname, istransfer, intakereason, and intakedate as inputs to machine learning methods, to test whether there are natural groupings of attributes associated with a higher likelihood of adoption. Based on those results, I will build visualizations showing how attributes such as age, species, and intake reason relate to whether an animal is adopted or stays in the shelter. Summary statistics such as the mean and median of certain traits should also help provide evidence for hypotheses about why animals with those traits are more likely to be adopted.
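A simple starting point for this comparison is the adoption rate within each value of a single attribute. The sketch below makes two assumptions that should be checked against the real file: that a row counts as adopted when `movementtype` equals `"Adoption"`, and that dogs are labeled `"Dog"` in `speciesname`. The helper name `adoption_rate_by` is hypothetical, and the `demo` rows are made-up toy values for illustration, not shelter records.

```python
import pandas as pd

def adoption_rate_by(df, column, species="Dog"):
    """Adoption rate and row count for each value of `column`, one species only."""
    subset = df[df["speciesname"] == species].copy()
    # Assumption: an adopted animal is one whose movementtype is "Adoption".
    subset["adopted"] = subset["movementtype"].eq("Adoption")
    return (
        subset.groupby(column)["adopted"]
        .agg(adoption_rate="mean", n="size")  # mean of booleans = share adopted
        .sort_values("adoption_rate", ascending=False)
    )

# Hypothetical toy rows for illustration only.
demo = pd.DataFrame({
    "speciesname": ["Dog", "Dog", "Dog", "Dog", "Cat"],
    "sexname": ["Male", "Male", "Female", "Female", "Female"],
    "movementtype": ["Adoption", "Foster", "Adoption", "Adoption", "Adoption"],
})
rates = adoption_rate_by(demo, "sexname")
print(rates)
```

On the real data the same call could be repeated for breedname, basecolour, or intakereason. The `n` column matters there, because a rate computed from only a handful of rows is weak evidence.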