Sleep Efficiency

Project Overview

Sleep is a key player in every human's life, and the quality of each night's rest can affect day-to-day mood, health, and productivity. According to Harvard Medical School's Division of Sleep Medicine, sleep plays a vital role in learning and memory consolidation, and lack of sleep negatively impacts attention, sound decision-making, and neural functioning [1]. However, the U.S. Centers for Disease Control and Prevention reports that 35% of U.S. adults are not getting the recommended 7 hours of sleep, which may be related to the increasing percentage of citizens developing heart disease or obesity [2]. While all sleep is important, the most restful and beneficial is stage 3, or deep sleep [4]. During this stage the body repairs itself by regrowing tissue and building bone and muscle, and memory consolidation occurs.

So if sleep is so important, why are so many adults consistently sleep deprived?

While many external factors affect how long and how well a person sleeps each night, it is important to find out which lifestyle changes can improve sleep. In this project, we hope to analyze what constitutes the best quality sleep, as well as how various factors, such as timing, caffeine or alcohol consumption, and exercise frequency, work together to determine quality of sleep. In the end, we hope to use machine learning to predict a person's quality of sleep from their applicable external factors (bedtime, wakeup time, caffeine, etc.). It would also be interesting to suggest how that person could improve their quality of sleep, such as by consuming less caffeine before bed. To do all of this, we will train on a dataset found on Kaggle, published by user Equilibriumm, which covers a sleep study measuring sleep efficiency [3].

We will determine what conditions create the best sleep by using visualization plots to analyze which factors correspond with higher sleep efficiency scores and deep sleep percentages. Using k-nearest neighbors with scikit-learn, we will predict which category of sleep quality a person can expect, given their values for these variables. The quality will be expressed on a simple grading scale (A-F), determined by the efficiency score and deep sleep percentage found in the dataset. After generating this predicted grade, we will make a suggestion by comparing sleep duration, caffeine and alcohol consumption, and exercise frequency to see which factor can be modified to produce the greatest predicted increase in sleep efficiency.
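The grading and classification steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's final method: the cutoffs in `GRADE_CUTOFFS`, the averaging rule inside `sleep_grade`, the choice of feature columns, and `n_neighbors=1` are all hypothetical, and the five rows are the first five rows of the dataset shown in the Data section rather than the full file.

```python
import pandas as pd
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical cutoffs: the write-up does not specify the grading rule.
# The combined score averages sleep efficiency (0-1) with the
# deep sleep percentage rescaled to 0-1.
GRADE_CUTOFFS = [(0.80, "A"), (0.70, "B"), (0.60, "C"), (0.50, "D")]


def sleep_grade(efficiency: float, deep_pct: float) -> str:
    """Map efficiency and deep sleep percentage to a letter grade (assumed rule)."""
    score = (efficiency + deep_pct / 100) / 2
    for cutoff, grade in GRADE_CUTOFFS:
        if score >= cutoff:
            return grade
    return "F"


# Assumed feature set: the factors named in the project overview.
FEATURES = ["Sleep duration", "Caffeine consumption",
            "Alcohol consumption", "Exercise frequency"]

# The first five rows of the dataset; a real run would use all of df_sleep.
sample = pd.DataFrame({
    "Sleep duration": [6.0, 7.0, 8.0, 6.0, 8.0],
    "Sleep efficiency": [0.88, 0.66, 0.89, 0.51, 0.76],
    "Deep sleep percentage": [70, 28, 70, 25, 55],
    "Caffeine consumption": [0.0, 0.0, 0.0, 50.0, 0.0],
    "Alcohol consumption": [0.0, 3.0, 0.0, 5.0, 3.0],
    "Exercise frequency": [3.0, 3.0, 3.0, 1.0, 3.0],
})

# Label every row with its grade.
sample["Grade"] = [
    sleep_grade(eff, deep)
    for eff, deep in zip(sample["Sleep efficiency"], sample["Deep sleep percentage"])
]

# Scale the features so milligrams of caffeine do not dominate the distance.
# n_neighbors=1 is used only because this sample has five rows.
knn = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=1))
knn.fit(sample[FEATURES], sample["Grade"])

# Predict the grade for a hypothetical new person.
new_person = pd.DataFrame([[7.5, 0.0, 0.0, 3.0]], columns=FEATURES)
predicted_grade = knn.predict(new_person)[0]
print(predicted_grade)
```

On the full dataset, a held-out test split and a larger, tuned value of `n_neighbors` would be needed before the predicted grades could be trusted.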

Data

In [1]:
# reading in data and showing it
import pandas as pd

df_sleep = pd.read_csv('sleep_efficiency.csv')

df_sleep.head(5)
Out[1]:
| | ID | Age | Gender | Bedtime | Wakeup time | Sleep duration | Sleep efficiency | REM sleep percentage | Deep sleep percentage | Light sleep percentage | Awakenings | Caffeine consumption | Alcohol consumption | Smoking status | Exercise frequency |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 65 | Female | 2021-03-06 01:00:00 | 2021-03-06 07:00:00 | 6.0 | 0.88 | 18 | 70 | 12 | 0.0 | 0.0 | 0.0 | Yes | 3.0 |
| 1 | 2 | 69 | Male | 2021-12-05 02:00:00 | 2021-12-05 09:00:00 | 7.0 | 0.66 | 19 | 28 | 53 | 3.0 | 0.0 | 3.0 | Yes | 3.0 |
| 2 | 3 | 40 | Female | 2021-05-25 21:30:00 | 2021-05-25 05:30:00 | 8.0 | 0.89 | 20 | 70 | 10 | 1.0 | 0.0 | 0.0 | No | 3.0 |
| 3 | 4 | 40 | Female | 2021-11-03 02:30:00 | 2021-11-03 08:30:00 | 6.0 | 0.51 | 23 | 25 | 52 | 3.0 | 50.0 | 5.0 | Yes | 1.0 |
| 4 | 5 | 57 | Male | 2021-03-13 01:00:00 | 2021-03-13 09:00:00 | 8.0 | 0.76 | 27 | 55 | 18 | 3.0 | 0.0 | 3.0 | No | 3.0 |
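Because timing is one of the factors we want to study, the `Bedtime` and `Wakeup time` columns need a numeric form before they can feed a distance-based model. The sketch below is one possible encoding, not a settled design: it keeps only the clock time, since in row 2 the bedtime (21:30) and wakeup time (05:30) carry the same calendar date, and it measures bedtime in hours after noon so that 21:30 sorts before 01:00. The helper name `hours_after_noon` and the new column names are hypothetical.

```python
import pandas as pd

# Bedtime and wakeup strings copied from the five rows displayed above.
times = pd.DataFrame({
    "Bedtime": ["2021-03-06 01:00:00", "2021-12-05 02:00:00",
                "2021-05-25 21:30:00", "2021-11-03 02:30:00",
                "2021-03-13 01:00:00"],
    "Wakeup time": ["2021-03-06 07:00:00", "2021-12-05 09:00:00",
                    "2021-05-25 05:30:00", "2021-11-03 08:30:00",
                    "2021-03-13 09:00:00"],
})


def clock_hours(timestamps: pd.Series) -> pd.Series:
    """Return the time of day as a decimal hour (e.g. 21:30 -> 21.5)."""
    parsed = pd.to_datetime(timestamps)
    return parsed.dt.hour + parsed.dt.minute / 60


def hours_after_noon(timestamps: pd.Series) -> pd.Series:
    """Shift the clock so noon is 0; bedtimes past midnight then sort last."""
    return (clock_hours(timestamps) - 12) % 24


times["Bedtime (hours after noon)"] = hours_after_noon(times["Bedtime"])
times["Wakeup hour"] = clock_hours(times["Wakeup time"])
print(times[["Bedtime (hours after noon)", "Wakeup hour"]])
```

With this encoding a 21:30 bedtime becomes 9.5 and a 01:00 bedtime becomes 13.0, so later bedtimes always map to larger numbers.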
In [2]:
# data dictionary: keys match the column names in df_sleep
data_dict = {"ID": "ID number associated with study participant",
             "Age": "Age of study participant",
             "Gender": "Gender of study participant",
             "Bedtime": "Time when participant went to bed",
             "Wakeup time": "Time when participant woke up",
             "Sleep duration": "Total hours participant slept",
             "Sleep efficiency": "Proportion of time in bed that the participant was actually asleep",
             "REM sleep percentage": "Percentage of total sleep spent in Rapid Eye Movement phase",
             "Deep sleep percentage": "Percentage of total sleep spent in deep sleep phase",
             "Light sleep percentage": "Percentage of total sleep spent in light sleep phase",
             "Awakenings": "Total times participant woke up in the middle of sleep cycle",
             "Caffeine consumption": "Participant's caffeine consumption in the 24 hours prior to bedtime, in milligrams",
             "Alcohol consumption": "Participant's alcohol consumption in the 24 hours prior to bedtime, in ounces",
             "Smoking status": "Whether or not the participant smokes",
             "Exercise frequency": "Number of times the participant exercised in the past 7 days"}

data_dict
Out[2]:
{'ID': 'ID number associated with study participant',
 'Age': 'Age of study participant',
 'Gender': 'Gender of study participant',
 'Bedtime': 'Time when participant went to bed',
 'Wakeup time': 'Time when participant woke up',
 'Sleep duration': 'Total hours participant slept',
 'Sleep efficiency': 'Proportion of time in bed that the participant was actually asleep',
 'REM sleep percentage': 'Percentage of total sleep spent in Rapid Eye Movement phase',
 'Deep sleep percentage': 'Percentage of total sleep spent in deep sleep phase',
 'Light sleep percentage': 'Percentage of total sleep spent in light sleep phase',
 'Awakenings': 'Total times participant woke up in the middle of sleep cycle',
 'Caffeine consumption': "Participant's caffeine consumption in the 24 hours prior to bedtime, in milligrams",
 'Alcohol consumption': "Participant's alcohol consumption in the 24 hours prior to bedtime, in ounces",
 'Smoking status': 'Whether or not the participant smokes',
 'Exercise frequency': 'Number of times the participant exercised in the past 7 days'}

Why this data works

The dataset is sufficient for our analysis because it records both the percentage of time spent in deep sleep, which is the most restorative stage, and the sleep efficiency score, which measures how much of the time in bed is spent actually asleep. We treat these as the two most important factors for determining whether sleep quality was good. The inclusion of caffeine and alcohol consumption, smoking status, and exercise frequency will help us explore a few of the external factors that influence sleep. These are also factors that are mostly within the participant's control, which means it could be possible to improve sleep quality by modifying the corresponding behaviors.
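As a first look at how the controllable factors relate to the two quality measures, one option is a simple correlation table. The sketch below is an assumption about how that exploration might start, not a result: it uses only the five rows displayed in the Data section, encodes `Smoking status` as 1/0 (an illustrative choice), and uses Pearson correlation, which captures only linear association.

```python
import pandas as pd

# The five rows displayed in the Data section; a real run would use df_sleep.
sample = pd.DataFrame({
    "Sleep efficiency": [0.88, 0.66, 0.89, 0.51, 0.76],
    "Deep sleep percentage": [70, 28, 70, 25, 55],
    "Caffeine consumption": [0.0, 0.0, 0.0, 50.0, 0.0],
    "Alcohol consumption": [0.0, 3.0, 0.0, 5.0, 3.0],
    "Smoking status": ["Yes", "Yes", "No", "Yes", "No"],
    "Exercise frequency": [3.0, 3.0, 3.0, 1.0, 3.0],
})

CONTROLLABLE = ["Caffeine consumption", "Alcohol consumption",
                "Smoking status", "Exercise frequency"]
TARGETS = ["Sleep efficiency", "Deep sleep percentage"]

# Encode the Yes/No column numerically so it can be correlated.
factors = sample[CONTROLLABLE].copy()
factors["Smoking status"] = factors["Smoking status"].map({"Yes": 1, "No": 0})

# Pearson correlation of each controllable factor with each target.
correlations = pd.DataFrame(
    {target: factors.corrwith(sample[target]) for target in TARGETS}
)
print(correlations.round(2))
```

Five rows are far too few to draw conclusions from; the same code applied to the full dataset would give a more meaningful starting point for the visualizations.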

Sources

  1. “Sleep, Learning, and Memory.” Healthy Sleep, Harvard Medical School, https://healthysleep.med.harvard.edu/healthy/matters/benefits-of-sleep/learning-memory.

  2. Centers for Disease Control and Prevention. “1 in 3 Adults Don’t Get Enough Sleep.” CDC, 1 Jan. 2016, www.cdc.gov/media/releases/2016/p0215-enough-sleep.html.

  3. Equilibriumm. “Sleep Efficiency Dataset.” Kaggle, 21 Feb. 2023, www.kaggle.com/datasets/equilibriumm/sleep-efficiency?select=Sleep_Efficiency.csv.

  4. Ghacibeh, Georges. “Which Sleep Stage Is Most Important?” Hackensack Meridian Health, 8 Feb. 2023, www.hackensackmeridianhealth.org/en/HealthU/2023/02/08/Which-Sleep-Stage-is-Most-Important#.Y_u16HbMK3A.