Healthy Eating at Dining Halls¶

Motivation¶


Problem¶

With dining halls, the seemingly unlimited food makes it hard for students to make healthy choices. Further, the lack of nutritional information can misinform students about how healthily they are actually eating. High-fat, oily foods in the dining hall are known to cause upset stomachs. I struggled with this myself and was forced to manually search the menu page for low-fat, high-protein options. No one should have to work this hard to find healthy, filling foods.

Solution¶

Determining which healthy choices are available before a trip to the dining hall will make it easier for students to make healthy decisions and feel better. The goal of this project is to use the Dine On Campus website to identify healthy options for students in the dining halls.

Impact¶

If this project is successful, it will significantly help students struggling with their weight (the "freshman 15", for example) and students with dietary restrictions or sensitive stomachs find healthy foods in the dining halls of Northeastern. Because "What's on the menu" is a service used by many other college dining halls, this project could also be extended to other campuses.

One possible negative effect of this project, as with all campaigns for healthy eating, is that it could contribute to body dysmorphia and certain eating disorders. I hope to minimize this as much as possible by prioritizing foods with good caloric ratios, meaning a low calorie-to-protein ratio (few calories per gram of protein).

Dataset¶


Detail¶

I will use HTTP GET requests to the Dine On Campus API to retrieve JSON describing all of the foods on the menu for a given day at Stetson East, and parse it into a pandas DataFrame. This is done in the code below. The data covers over 400 foods for a single day and includes the following fields for each food:

  • Calcium (mg)
  • Calories
  • Calories From Fat
  • Cholesterol (mg)
  • Dietary Fiber (g)
  • Iron (mg)
  • Potassium (mg)
  • Protein (g)
  • Saturated Fat (g)
  • Saturated Fat + Trans Fat (g)
  • Sodium (mg)
  • Sugar (g)
  • Total Carbohydrates (g)
  • Total Fat (g)
  • Trans Fat (g)
  • Vitamin A (RE)
  • Vitamin C (mg)
  • Vitamin D (IU)
  • Serving Size
In [2]:
import requests
import pandas as pd
In [3]:
def find_periods(date):
    '''This function takes a date as input and returns the periods associated 
    with the date from the Dine On Campus API. We need to do this because the
    API changes the period id frequently. Periods represent the time of day
    (breakfast, lunch, dinner, etc.).
    
    Args:
        date (str): The date to search for in the format YYYY-M-DD
        
    Returns:
        periods_dict (dict): The dict of periods associated with the date; 
        empty if no periods found for date
    '''

    # create an empty dictionary to store the periods
    periods_dict = dict()

    # call the API to get the json information.
    url = "https://api.dineoncampus.com/v1/location/586d05e4ee596f6e6c04b527/periods?platform=0&date=" + date
    req = requests.get(url)

    # format the response as json
    data = req.json()

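    # get the periods from the json; fall back to an empty list so that a
    # date with no periods returns an empty dict, as the docstring promises
    periods = data.get("periods") or []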

    # for each of the periods found
    for period in periods:
        # add the name and id to the dictionary
        periods_dict[period.get("name")] = period.get("id")

    # return the final dictionary
    return periods_dict
In [9]:
def get_menu(date):
    '''Returns a dataframe containing the menu for a given date
    and all of its nutritional information.
    
    Args:
        date (str): The date to search for in the format YYYY-M-DD

    Returns:
        DataFrame: A dataframe containing the menu for the given date;
        empty if no menu found for date provided
    '''
    periods = find_periods(date)

    final_df = pd.DataFrame(columns=['food'])
    final_df = final_df.set_index('food')

    for period in periods.keys():
        url = "https://api.dineoncampus.com/v1/location/586d05e4ee596f6e6c04b527/periods/" + periods[period] + "?platform=0&date=" + date
        req = requests.get(url)
        # get req as json
        data = req.json()
        # get the categories of food
        categories = data.get('menu').get('periods').get('categories')

        pd_food = pd.DataFrame(columns=['food'])
        pd_food = pd_food.set_index('food')

        # for each category
        for category in categories:
            items = category.get('items')
            # for each food in the category
            for food in items:
                name = food.get('name')
                nutrients = food.get('nutrients')
                portion = food.get('portion')
                for nutrient in nutrients:
                    nutrient_name = nutrient.get('name')
                    nutrient_value = nutrient.get('value_numeric')
                    pd_food.at[name, nutrient_name] = nutrient_value
                pd_food.at[name, 'Serving Size'] = portion
        
        final_df = pd.concat([final_df, pd_food], sort=True)

    return final_df
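
# The nested-loop parsing in get_menu can be tried without calling the API.
# What follows is a minimal sketch, not the notebook's own method:
# flatten_period and sample_period are hypothetical names, the payload is
# hand-written to mirror only the keys that get_menu reads, and the nutrient
# values (copied from the sample output for two foods) are assumed to arrive
# as strings.

```python
import pandas as pd

# Hypothetical payload shaped like the keys get_menu reads
# (menu -> periods -> categories -> items -> nutrients).
sample_period = {
    "menu": {"periods": {"categories": [
        {"items": [
            {"name": "Scrambled Eggs", "portion": "1/2 cup",
             "nutrients": [{"name": "Calories", "value_numeric": "200"},
                           {"name": "Protein (g)", "value_numeric": "14"}]},
            {"name": "Tater Tots", "portion": "1/2 cup",
             "nutrients": [{"name": "Calories", "value_numeric": "140"},
                           {"name": "Protein (g)", "value_numeric": "1"}]},
        ]},
    ]}}
}


def flatten_period(data):
    """Collect one dict per food, then build the DataFrame in one step."""
    rows = []
    menu = data.get("menu") or {}
    periods = menu.get("periods") or {}
    for category in periods.get("categories") or []:
        for food in category.get("items") or []:
            row = {"food": food.get("name"),
                   "Serving Size": food.get("portion")}
            for nutrient in food.get("nutrients") or []:
                row[nutrient.get("name")] = nutrient.get("value_numeric")
            rows.append(row)
    if not rows:
        # Same empty shape that get_menu starts from
        return pd.DataFrame(columns=["food"]).set_index("food")
    return pd.DataFrame(rows).set_index("food")


sample_df = flatten_period(sample_period)
print(sample_df)
```

# The `or {}` / `or []` fallbacks mean a payload with missing keys gives an
# empty frame rather than an AttributeError.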
    
In [10]:
# get the menu for Sunday, February 26, 2023 (the date passed below)
# this request takes a long time because the Dine On Campus API is slow :(
food_data = get_menu("2023-2-26")
food_data
Out[10]:
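| food | Calcium (mg) | Calories | Calories From Fat | Cholesterol (mg) | Dietary Fiber (g) | Iron (mg) | Potassium (mg) | Protein (g) | Saturated Fat (g) | Saturated Fat + Trans Fat (g) | Serving Size | Sodium (mg) | Sugar (g) | Total Carbohydrates (g) | Total Fat (g) | Trans Fat (g) | Vitamin A (RE) | Vitamin C (mg) | Vitamin D (IU) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Ham, Egg, Cheese Breakfast Pizza | 190 | 330 | 120 | 90 | 1 | 2.6 | 250 | 18 | 6 | 5+ | 1 slice | 760 | 3+ | 35 | 13 | - | 60+ | 2 | 50 |
| Cheese Pizza | 210 | 200 | 90 | 25 | 1 | 1.3 | 120 | 10 | 4.5 | 5+ | 1 slice | 610 | 1 | 18 | 10 | - | 20+ | 3 | 15 |
| Scrambled Eggs | 60 | 200 | 140 | 420 | 0 | 2 | 160 | 14 | 4 | 5 | 1/2 cup | 160 | 0 | 1 | 15 | 0 | 120 | 0 | 90 |
| Pork Sausage Patty | 0 | 360 | 300 | 50 | 0 | 0 | - | 10 | 12 | 10 | 2 each | 630 | 0 | 2 | 34 | 0 | - | - | - |
| Tater Tots | 0 | 140 | 70 | 0 | 1 | 0 | 270 | 1 | 1 | 0 | 1/2 cup | 400 | 0 | 18 | 8 | 0 | - | - | 0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| Lucky Charms | 130 | 140 | 15 | 0 | 2 | - | 80 | 3 | 0 | 0 | 1 cup | 230 | 12 | 30 | 1.5 | 0 | 80 | 9 | 90 |
| Cinnamon Toast Crunch | 170 | 170 | 35 | 0 | 3 | 6 | 70 | 1 | 0.5 | 0 | 1 cup | 240 | 12 | 33 | 4 | 0 | - | 12 | 110 |
| Total Raisin Bran | 40 | 180 | 5 | 0 | 5 | 14.5 | 210 | 3 | 0 | 0 | 1 cup | 220 | 18 | 45 | 1 | 0 | - | 7 | 60 |
| Peanut Butter, .75 oz, Jif | 10 | 130 | 100 | 0 | 1 | 0.4 | 120 | 5 | 2 | 0 | 1 each | 90 | 2 | 5 | 11 | 0 | - | 0 | 0 |
| Jelly, Assorted | 0 | 40 | 0 | 0 | 0 | 0 | 10 | 0 | 0 | 0+ | 1 each | 0 | 7 | 10 | 0 | - | - | 0 | 0 |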

459 rows × 19 columns

This project aims to use the information above to estimate "healthy" choices for students to make in the dining halls.

Potential Problems¶

One problem with this dataset is that we need to strictly define what makes a food healthy and what doesn't. Also, some nutrient values are missing for some foods (shown as "-" in the output above), so we will need to make assumptions about which nutrients are important and which are not.
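
Before any analysis, the missing values need to be made explicit. The output above suggests the nutrient columns hold strings: "-" where a value is missing and a trailing "+" on some values (for example "5+"). The following is a minimal sketch of one way to convert them, under two assumptions that the source does not confirm: "-" means "not reported", and "5+" can be read as 5. The helper name `to_numeric_nutrients` is hypothetical, and the three sample rows are copied from the output above.

```python
import pandas as pd

# Three rows and four columns copied from the sample output above.
raw = pd.DataFrame(
    {
        "Calories": ["330", "360", "140"],
        "Potassium (mg)": ["250", "-", "270"],
        "Sugar (g)": ["3+", "0", "0"],
        "Serving Size": ["1 slice", "2 each", "1/2 cup"],
    },
    index=pd.Index(
        ["Ham, Egg, Cheese Breakfast Pizza", "Pork Sausage Patty", "Tater Tots"],
        name="food",
    ),
)


def to_numeric_nutrients(df, text_columns=("Serving Size",)):
    """Return a copy with nutrient columns converted to numbers.

    Assumptions: a trailing "+" is dropped ("3+" becomes 3) and anything
    that still is not a number (such as "-") becomes NaN.
    """
    cleaned = df.copy()
    for column in cleaned.columns:
        if column in text_columns:
            continue  # leave free-text columns untouched
        as_text = cleaned[column].astype(str).str.strip().str.rstrip("+")
        cleaned[column] = pd.to_numeric(as_text, errors="coerce")
    return cleaned


clean = to_numeric_nutrients(raw)
print(clean)
print(clean.isna().sum())  # number of missing values per column
```

Counting the NaN values per column is one way to see which nutrients are reported too rarely to rely on.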

The healthiness of a food is subjective. To provide a point of comparison, I will research industry-standard definitions of "healthy" foods and compare the nutrients of the dining hall foods against those standards.

Method¶

This problem could be posed as a classification question: which foods are considered "healthy" and which are not? It could also be posed as a regression question, since we could score each food by its healthiness and rank the foods to find the healthiest. A combination of both sounds like the most impactful solution.
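
To make the two framings concrete, here is a minimal sketch that uses the calorie-to-protein ratio as a stand-in health score. It is an illustration only, not a settled definition of "healthy": the function name `score_foods`, the column names it adds, and the cutoff of 20 calories per gram of protein are all hypothetical. The four sample rows are taken from the output above.

```python
import pandas as pd

# Calories and protein for four foods from the sample output above.
foods = pd.DataFrame(
    {"Calories": [200, 360, 140, 200], "Protein (g)": [14, 10, 1, 10]},
    index=pd.Index(
        ["Scrambled Eggs", "Pork Sausage Patty", "Tater Tots", "Cheese Pizza"],
        name="food",
    ),
)

# Hypothetical cutoff, chosen only for illustration.
MAX_CALORIES_PER_GRAM_PROTEIN = 20.0


def score_foods(df, cutoff=MAX_CALORIES_PER_GRAM_PROTEIN):
    """Add a continuous score (for ranking) and a boolean label (for classification)."""
    scored = df.copy()
    # Treat zero protein as missing so the division never produces infinity
    protein = scored["Protein (g)"].where(scored["Protein (g)"] > 0)
    scored["calories_per_g_protein"] = scored["Calories"] / protein
    # NaN scores compare as False, so foods without protein are not labeled healthy
    scored["healthy"] = scored["calories_per_g_protein"] <= cutoff
    # Lowest ratio first; NaN scores sort to the end
    return scored.sort_values("calories_per_g_protein")


ranked = score_foods(foods)
print(ranked)
```

The sorted score column corresponds to the ranking (regression) view and the boolean column to the classification view, which is one way the two could be combined.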