Spotify is a music, podcast, and video streaming service with 100 million active users. The company curates playlists that are followed by millions of users. These playlists are created through a combination of algorithmic and human-driven processes. The aim of our project is to use machine learning algorithms to improve the effectiveness of algorithmically curated playlists and to analyze which audio features contribute to the popularity of playlists.
Spotify attempts to direct the most relevant songs to users based on their preferences, moods, and similar signals. An enhanced version of our project would generate user-specific playlists based on attributes such as genre and mood. The success of a playlist depends on certain features, and determining those features is part of the project. We are using two datasets. The key dataset is the set of audio features of tracks and playlists obtained from the Spotify API. Additional features can be added from the Million Song Dataset, a freely available collection of audio features and metadata for one million popular tracks. Some of the important audio features are loudness, energy, danceability, and beats per minute. An extended, time-series version of these audio features can be obtained by processing the 30-second raw audio clips available through the Spotify API; this will be targeted in a later phase of the project.
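As a rough illustration of the analysis described above, the following sketch averages track-level audio features into a playlist-level profile and then computes the Pearson correlation between each feature and the playlist's follower count. It is only one possible starting point, not the project's actual method. The input layout (a list of `(tracks, followers)` pairs), the function names, and the use of `tempo` to stand for beats per minute are assumptions made for this example; no Spotify API calls are made.

```python
from math import sqrt
from statistics import mean

# Feature names are assumptions chosen to match the features named in the text;
# "tempo" stands in for beats per minute.
FEATURES = ("loudness", "energy", "danceability", "tempo")


def playlist_profile(tracks):
    """Average each audio feature over the tracks of one playlist."""
    return {f: mean(t[f] for t in tracks) for f in FEATURES}


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences.

    Returns 0.0 when either sequence has no variance, since the
    correlation is undefined in that case.
    """
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0


def feature_popularity_correlation(playlists):
    """Correlate each playlist-level mean feature with follower count.

    `playlists` is a hypothetical list of (tracks, followers) pairs, where
    `tracks` is a list of dicts keyed by the names in FEATURES.
    """
    profiles = [playlist_profile(tracks) for tracks, _ in playlists]
    followers = [count for _, count in playlists]
    return {
        f: pearson([p[f] for p in profiles], followers) for f in FEATURES
    }
```

A correlation close to 1 or -1 would suggest that a feature moves with playlist popularity, though correlation alone does not show that the feature causes a playlist to succeed; a fitted model over many playlists would be needed for stronger claims.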