Analyzing Patient No-Shows at Medical Appointments

Patient no-shows at medical appointments are a significant problem: they waste resources and delay necessary care, both for the patients who miss their visits and for other patients waiting for a slot. An estimated 20% of patients miss their scheduled appointments, causing lost revenue and potentially worse health outcomes (Lacy et al., 2014).

Identifying the factors that contribute to no-shows can help medical facilities improve patient attendance rates and reduce costs.

Reference: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1466756/

Data Set

  • Dataset: Medical Appointment No Shows
  • Source: Kaggle
  • Link: https://www.kaggle.com/joniarroba/noshowappointments

The data come from public institutions in a Brazilian city. The appointments took place over a roughly 6-week period in 2016, from 29 April 2016 to 8 June 2016 (27 days with appointments). The appointments were scheduled between 10 November 2015 and 8 June 2016. The data cover about 62,000 patients and 81 neighborhoods.
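The coverage figures quoted here (date ranges, patients, neighborhoods, appointment days) can be checked once the file is loaded. The following is a minimal sketch, assuming the Kaggle column names PatientId, ScheduledDay, AppointmentDay and Neighbourhood. The helper name summarize_coverage is hypothetical, and the five sample rows only mirror the file's layout, so the full CSV has to be passed in to reproduce the real numbers.

```python
import pandas as pd


def summarize_coverage(df: pd.DataFrame) -> dict:
    """Return basic coverage statistics for an appointments table."""
    # The timestamps are ISO 8601 strings ending in "Z" (UTC).
    scheduled = pd.to_datetime(df["ScheduledDay"])
    appointment = pd.to_datetime(df["AppointmentDay"])
    return {
        "n_appointments": len(df),
        "n_patients": df["PatientId"].nunique(),
        "n_neighbourhoods": df["Neighbourhood"].nunique(),
        # Count distinct calendar days on which appointments took place.
        "n_appointment_days": appointment.dt.normalize().nunique(),
        "first_appointment": appointment.min().date(),
        "last_appointment": appointment.max().date(),
        "first_scheduled": scheduled.min().date(),
        "last_scheduled": scheduled.max().date(),
    }


# Illustrative rows only (PatientId values are rounded as pandas displays them).
sample = pd.DataFrame({
    "PatientId": [2.987250e13, 5.589978e14, 4.262962e12, 8.679512e11, 8.841186e12],
    "ScheduledDay": [
        "2016-04-29T18:38:08Z", "2016-04-29T16:08:27Z", "2016-04-29T16:19:04Z",
        "2016-04-29T17:29:31Z", "2016-04-29T16:07:23Z",
    ],
    "AppointmentDay": ["2016-04-29T00:00:00Z"] * 5,
    "Neighbourhood": [
        "JARDIM DA PENHA", "JARDIM DA PENHA", "MATA DA PRAIA",
        "PONTAL DE CAMBURI", "JARDIM DA PENHA",
    ],
})

summary = summarize_coverage(sample)
for key, value in summary.items():
    print(f"{key}: {value}")
```

On the full file, the same call would be `summarize_coverage(pd.read_csv(path))`, where `path` points to the downloaded CSV.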

The dataset is well suited to analyzing patient no-shows at medical appointments: it contains a large sample of over 100,000 appointments from about 62,000 patients, providing a comprehensive view of the problem. It also includes a variety of patient characteristics such as age, gender, health conditions, and social factors, all of which have been shown to be associated with appointment attendance. Additionally, the dataset contains information on appointment reminders and scheduling details, allowing for a more nuanced analysis of the factors that influence attendance.

Data Dictionary:

  • PatientId: Identification of a patient
  • AppointmentID: Identification of each appointment
  • Gender: Male or Female (stored as M or F in the raw file; coded as Male = 0, Female = 1)
  • ScheduledDay: The day the patient registered for the appointment
  • AppointmentDay: The day of the actual appointment
  • Age: Patient Age
  • Neighbourhood: Location of the appointment
  • Scholarship: True or False (indicates whether the patient is enrolled in a Brazilian welfare program)
  • Hipertension: True or False
  • Diabetes: True or False
  • Alcoholism: True or False
  • Handcap: True or False
  • SMS_received: Whether 1 or more reminder messages were sent to the patient for the appointment
  • No-show: Whether the patient missed the appointment: No (showed up, coded 0) or Yes (no-show, coded 1)
In [11]:
import pandas as pd

# Load the Kaggle CSV and preview the first five rows
pd.read_csv('Patient No Show Brazil 5:2016.csv').head()
Out[11]:
      PatientId  AppointmentID  Gender          ScheduledDay        AppointmentDay  Age      Neighbourhood  Scholarship  Hipertension  Diabetes  Alcoholism  Handcap  SMS_received  No-show
0  2.987250e+13        5642903       F  2016-04-29T18:38:08Z  2016-04-29T00:00:00Z   62    JARDIM DA PENHA            0             1         0           0        0             0       No
1  5.589978e+14        5642503       M  2016-04-29T16:08:27Z  2016-04-29T00:00:00Z   56    JARDIM DA PENHA            0             0         0           0        0             0       No
2  4.262962e+12        5642549       F  2016-04-29T16:19:04Z  2016-04-29T00:00:00Z   62      MATA DA PRAIA            0             0         0           0        0             0       No
3  8.679512e+11        5642828       F  2016-04-29T17:29:31Z  2016-04-29T00:00:00Z    8  PONTAL DE CAMBURI            0             0         0           0        0             0       No
4  8.841186e+12        5642494       F  2016-04-29T16:07:23Z  2016-04-29T00:00:00Z   56    JARDIM DA PENHA            0             1         1           0        0             0       No
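The raw file stores Gender as M/F and No-show as text, while the data dictionary describes numeric codes. One way to bridge the two is sketched below. The function name encode_columns is hypothetical, and the "Yes" label for missed appointments is an assumption, since only "No" appears in the preview. The sample rows reuse a few columns from the preview.

```python
import pandas as pd


def encode_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with Gender and No-show coded as in the data dictionary."""
    out = df.copy()
    # Data dictionary coding: Male = 0, Female = 1.
    out["Gender"] = out["Gender"].map({"M": 0, "F": 1})
    # Data dictionary coding: showed up = 0, no-show = 1.
    # Assumption: missed appointments are labeled "Yes" in the raw file.
    out["No-show"] = out["No-show"].map({"No": 0, "Yes": 1})
    # Parse the ISO 8601 strings into datetimes for later date arithmetic.
    for col in ("ScheduledDay", "AppointmentDay"):
        out[col] = pd.to_datetime(out[col])
    return out


# A few columns from the five preview rows.
preview = pd.DataFrame({
    "Gender": ["F", "M", "F", "F", "F"],
    "ScheduledDay": [
        "2016-04-29T18:38:08Z", "2016-04-29T16:08:27Z", "2016-04-29T16:19:04Z",
        "2016-04-29T17:29:31Z", "2016-04-29T16:07:23Z",
    ],
    "AppointmentDay": ["2016-04-29T00:00:00Z"] * 5,
    "No-show": ["No"] * 5,
})

encoded = encode_columns(preview)
print(encoded.dtypes)
```

Any label not listed in a mapping becomes NaN after `map`, so a quick `encoded.isna().sum()` on the full file would reveal unexpected values.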

I plan on using the data to analyze the factors that influence patient no-shows at medical appointments and to develop predictive models to identify patients who are at risk of not showing up. Doing so will enable healthcare providers to take proactive measures to reduce the number of missed appointments and improve patient outcomes.
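As a first step toward the factor analysis, the no-show rate can be compared across the levels of a single column. The sketch below is one possible starting point, not the final method. The helper name no_show_rate_by is hypothetical, the "Yes" label for a missed appointment is an assumption, and the toy rows are made up purely to show the mechanics, so they say nothing about the real data.

```python
import pandas as pd


def no_show_rate_by(df: pd.DataFrame, factor: str, target: str = "No-show") -> pd.Series:
    """Share of missed appointments for each level of `factor`."""
    # Assumption: a missed appointment is labeled "Yes" in the target column.
    missed = df[target].eq("Yes")
    # The mean of a boolean series is the fraction of True values.
    return missed.groupby(df[factor]).mean()


# Made-up toy rows, NOT taken from the dataset.
toy = pd.DataFrame({
    "SMS_received": [0, 0, 1, 1, 1, 0],
    "No-show": ["No", "Yes", "No", "No", "Yes", "Yes"],
})

rates = no_show_rate_by(toy, "SMS_received")
print(rates)
```

The same helper could be called with other columns from the data dictionary, such as Scholarship or Neighbourhood, before moving on to a predictive model.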