
A real-world problem that I hope to provide input on is how best to determine the genre of any given song. With more genres of popular music than ever, and more cross-influence between genres, it can be difficult to determine exactly how a song fits into a given genre, or whether a new genre needs to be constructed outside of pre-existing definitions. These genre definitions affect musical artists at both small and large scales. At the small scale, music genre classification affects new artists who hope to be noticed online, as algorithms on platforms such as Spotify take into account both artist-determined genres and computer-generated genres when recommending songs that a user may like.[1] At the large scale, awards like the Grammys currently differentiate genres based on committees of executives, but offer awards in only a limited number of genres, including pop, rock, and metal. Relying only on these genres may exclude talented, worthy artists who do not get recognition because their music does not fit one specific definition of a traditional genre.[2] Together, these issues show that genre classification is a problem that must be investigated for the future of the music industry.

In [1]:

# song genre

In [4]:

import pandas as pd

# Load the per-clip audio features (one row per 30-second clip)
df = pd.read_csv(r'C:\Users\rdela\Code Directory\DS2500\archive\Data\features_30_sec.csv')
# Preview the first five rows
df.head()

Out[4]:

	filename	length	chroma_stft_mean	chroma_stft_var	rms_mean	rms_var	spectral_centroid_mean	spectral_centroid_var	spectral_bandwidth_mean	spectral_bandwidth_var	...	mfcc16_var	mfcc17_mean	mfcc17_var	mfcc18_mean	mfcc18_var	mfcc19_mean	mfcc19_var	mfcc20_mean	mfcc20_var	label
0	blues.00000.wav	661794	0.350088	0.088757	0.130228	0.002827	1784.165850	129774.064525	2002.449060	85882.761315	...	52.420910	-1.690215	36.524071	-0.408979	41.597103	-2.303523	55.062923	1.221291	46.936035	blues
1	blues.00001.wav	661794	0.340914	0.094980	0.095948	0.002373	1530.176679	375850.073649	2039.036516	213843.755497	...	55.356403	-0.731125	60.314529	0.295073	48.120598	-0.283518	51.106190	0.531217	45.786282	blues
2	blues.00002.wav	661794	0.363637	0.085275	0.175570	0.002746	1552.811865	156467.643368	1747.702312	76254.192257	...	40.598766	-7.729093	47.639427	-1.816407	52.382141	-3.439720	46.639660	-2.231258	30.573025	blues
3	blues.00003.wav	661794	0.404785	0.093999	0.141093	0.006346	1070.106615	184355.942417	1596.412872	166441.494769	...	44.427753	-3.319597	50.206673	0.636965	37.319130	-0.619121	37.259739	-3.407448	31.949339	blues
4	blues.00004.wav	661794	0.308526	0.087841	0.091529	0.002303	1835.004266	343399.939274	1748.172116	88445.209036	...	86.099236	-5.454034	75.269707	-0.916874	53.613918	-4.404827	62.910812	-11.703234	55.195160	blues

5 rows × 60 columns

Data descriptions sourced from Andrada Olteanu's explainer. Each row summarizes one 30-second audio clip as a set of numeric audio features, which can show how different clips of audio correspond to different genres.

Label	Explanation
filename	name of .wav file in dataset
length	length of audio in number of samples
chroma_stft_mean	mean of the chroma features (energy per pitch class) computed from the short-time Fourier transform
chroma_stft_var	variance of the chroma features (energy per pitch class) computed from the short-time Fourier transform
rms_mean	mean of the root-mean-square (RMS) energy of the signal
rms_var	variance of the root-mean-square (RMS) energy of the signal
spectral_centroid_mean	mean of weighted mean of frequencies present in sound
spectral_centroid_var	variance of weighted mean of frequencies present in sound
spectral_bandwidth_mean	mean of the spread of frequencies around the spectral centroid
spectral_bandwidth_var	variance of the spread of frequencies around the spectral centroid
rolloff_mean	mean of the frequency below which a specified percentage of total spectral energy lies
rolloff_var	variance of the frequency below which a specified percentage of total spectral energy lies
zero_crossing_rate_mean	mean of the rate at which the signal changes sign (positive to negative or back)
zero_crossing_rate_var	variance of the rate at which the signal changes sign (positive to negative or back)
harmony_mean	mean of the harmonic component of the signal (sound color)
harmony_var	variance of the harmonic component of the signal (sound color)
perceptr_mean	mean of the percussive component of the signal (sound rhythm and emotion)
perceptr_var	variance of the percussive component of the signal (sound rhythm and emotion)
tempo	beats per minute of audio
mfcc[1-20]_mean	mean of each Mel-frequency cepstral coefficient, a small set of features which describe the overall shape of the spectral envelope
mfcc[1-20]_var	variance of each Mel-frequency cepstral coefficient, a small set of features which describe the overall shape of the spectral envelope
label	determined genre
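To make the `_mean` and `_var` columns more concrete, here is a minimal sketch of how two of these features (RMS energy and zero crossing rate) could be computed frame by frame and then summarized. It is not the code that produced the dataset: the synthetic 440 Hz tone, the sample rate, the frame and hop sizes, and the helper names (`frame_signal`, `rms_per_frame`, `zcr_per_frame`) are all assumptions chosen for illustration.

```python
import numpy as np


def frame_signal(y, frame_length=2048, hop_length=512):
    """Split a 1-D signal into overlapping frames, one frame per row.

    frame_length and hop_length are assumed values, not taken from the dataset.
    """
    n_frames = 1 + (len(y) - frame_length) // hop_length
    starts = hop_length * np.arange(n_frames)[:, None]
    offsets = np.arange(frame_length)[None, :]
    return y[starts + offsets]


def rms_per_frame(frames):
    """Root-mean-square energy of each frame."""
    return np.sqrt(np.mean(frames ** 2, axis=1))


def zcr_per_frame(frames):
    """Fraction of adjacent sample pairs in each frame whose signs differ."""
    signs = np.signbit(frames)
    return np.mean(signs[:, 1:] != signs[:, :-1], axis=1)


# Synthetic stand-in for an audio clip: one second of a 440 Hz sine wave.
sr = 22050  # assumed sample rate
t = np.arange(sr) / sr
y = 0.5 * np.sin(2 * np.pi * 440.0 * t)

frames = frame_signal(y)
rms = rms_per_frame(frames)
zcr = zcr_per_frame(frames)

# Each per-frame series collapses to a mean and a variance,
# which is the shape of the *_mean / *_var columns in the table.
summary = {
    "rms_mean": rms.mean(),
    "rms_var": rms.var(),
    "zero_crossing_rate_mean": zcr.mean(),
    "zero_crossing_rate_var": zcr.var(),
}
print(summary)
```

For a steady tone like this one, both variances come out close to zero; a real song, whose loudness and brightness change over time, would give larger variances.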

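The machine learning method I hope to use is k-nearest neighbors, a supervised classification method, to see if these variables correspond to the labeled genre. To see if new or unexpected genres emerge, an unsupervised clustering method such as k-means would be needed, since k-nearest neighbors can only predict genres that already appear in the labels.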
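As a rough sketch of what the k-nearest neighbors step might look like, the example below uses scikit-learn on a small synthetic table rather than the real `features_30_sec.csv`. The genre centers, the choice of only two features, `n_neighbors=5`, and the 75/25 split are all made-up assumptions for illustration, not results or settings from this project.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Hypothetical (tempo, rms_mean) centers per genre; invented for this demo only.
centers = {
    "blues": (100.0, 0.10),
    "classical": (80.0, 0.04),
    "metal": (140.0, 0.20),
}

rows = []
for genre, (tempo_center, rms_center) in centers.items():
    for _ in range(100):
        rows.append({
            "tempo": rng.normal(tempo_center, 5.0),
            "rms_mean": rng.normal(rms_center, 0.01),
            "label": genre,
        })
demo = pd.DataFrame(rows)

# Features are every column except the genre label.
X = demo.drop(columns=["label"])
y = demo["label"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Scale first: kNN relies on distances, so a large-valued feature such as
# tempo would otherwise dominate a small-valued one such as rms_mean.
model = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))
model.fit(X_train, y_train)

accuracy = model.score(X_test, y_test)
print(f"held-out accuracy on synthetic data: {accuracy:.2f}")
```

With the real data, `X` would presumably be `df.drop(columns=["filename", "length", "label"])` and `y` would be `df["label"]`. The accuracy printed here only reflects how well separated the synthetic genres are.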