Creating a better revenue model for MBTA

The Massachusetts Bay Transportation Authority (MBTA) is the public agency that operates most transit in the Greater Boston area, including buses, subways, and trains. The MBTA works with high-level averages of revenue data but does not have a detailed model of fares across routes, times and dates, modes of transit, passenger profiles, and other characteristics. The goal of this project is to create a more granular revenue model using existing passenger transaction data.
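As a rough illustration of what "more granular" could mean here, the sketch below sums fares per (mode, route, hour) segment instead of reporting a single average. The record layout, route names, and fare amounts are all made up for the example and are not taken from MBTA data.

```python
from collections import defaultdict

# Hypothetical transaction records: (mode, route, hour_of_day, fare_paid).
# Every value here is invented for illustration.
transactions = [
    ("bus", "route_1", 8, 2.0),
    ("bus", "route_1", 8, 2.0),
    ("subway", "line_a", 8, 2.5),
    ("subway", "line_a", 17, 2.5),
    ("train", "line_b", 17, 7.0),
]

def revenue_by_segment(records):
    """Sum fares for each (mode, route, hour) segment."""
    totals = defaultdict(float)
    for mode, route, hour, fare in records:
        totals[(mode, route, hour)] += fare
    return dict(totals)

segment_revenue = revenue_by_segment(transactions)
for segment, total in sorted(segment_revenue.items()):
    print(segment, total)
```

The same grouping key could be extended with date or passenger profile if those fields are present in the transaction data.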

Power of Words: Lyric-based music recommendation

The goal of this project is to use the rich content of song lyrics to connect each song with relatable concepts such as moods, occasions, and themes. A direct application of this automatic tagging system would be to produce playlists that are associated with different emotions or that serve specific purposes (post-breakup songs, holiday music, party mixes, and so on). An initial target for the final product is a collection of moods and topics from which a user can select to retrieve an associated list of songs.
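The select-a-mood, get-a-list-of-songs idea can be pictured as an index from tags to songs. The sketch below is only a toy: it assumes a hand-written keyword lexicon (MOOD_KEYWORDS) and invented song titles and lyrics, whereas the project itself may rely on a quite different tagging method.

```python
from collections import defaultdict

# Hypothetical mood lexicon; a real system would likely learn tags from data.
MOOD_KEYWORDS = {
    "heartbreak": {"goodbye", "tears", "alone"},
    "party": {"dance", "tonight"},
}

# Invented titles and lyrics, for illustration only.
songs = {
    "Song A": "tears fall now that you said goodbye",
    "Song B": "we dance all night tonight",
    "Song C": "dance away the tears",
}

def tag_song(lyrics):
    """Return every mood whose keywords appear in the lyrics."""
    words = set(lyrics.lower().split())
    return {mood for mood, keywords in MOOD_KEYWORDS.items() if words & keywords}

def build_index(song_lyrics):
    """Map each mood to the list of song titles tagged with it."""
    index = defaultdict(list)
    for title, lyrics in song_lyrics.items():
        for mood in sorted(tag_song(lyrics)):
            index[mood].append(title)
    return dict(index)

playlists = build_index(songs)
print(playlists["party"])
```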

Spotify playlist prediction

Spotify is a music, podcast, and video streaming service with 100 million active users. The company curates playlists that are followed by millions of users. These playlists are created by a combination of algorithmic and human-driven processes. The aim of our project is to use machine learning algorithms to improve the effectiveness of algorithmically curated playlists and to analyze which audio features contribute to playlist popularity.

Social media engagement for cosmetic brands

Tribe Dynamics is a San Francisco-based startup that measures social media engagement for cosmetic brands. Online content creation led by beauty bloggers is one of the key predictors of offline revenue in this industry. This project investigates how hashtag usage spreads across a social network of Instagram users who post about beauty products. The goal of the project is to build a probabilistic model of each person’s propensity to use a hashtag based on whether their friends also use it, and to determine the characteristics of a successful hashtag-based marketing campaign.
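One very simple way to look at this propensity is to compare how often users adopt a hashtag when at least one friend has used it versus when none has. The sketch below computes those two empirical rates. The friendship graph, the user names, and the set of adopters are invented, and this is a crude stand-in for a probabilistic model rather than the project's actual method.

```python
# Hypothetical friendship graph: user -> set of friends.
friends = {
    "ana": {"bea", "cy"},
    "bea": {"ana"},
    "cy": {"ana", "dee"},
    "dee": {"cy"},
    "eli": set(),
}

# Hypothetical set of users who posted with the hashtag.
adopters = {"ana", "bea", "cy"}

def adoption_rate_by_exposure(friend_graph, hashtag_users):
    """Estimate P(user uses hashtag | at least one friend uses it) and
    P(user uses hashtag | no friend uses it) from raw counts."""
    counts = {True: [0, 0], False: [0, 0]}  # exposed -> [adopters, total]
    for user, user_friends in friend_graph.items():
        exposed = bool(user_friends & hashtag_users)
        counts[exposed][1] += 1
        if user in hashtag_users:
            counts[exposed][0] += 1
    return {
        exposed: (used / total if total else None)
        for exposed, (used, total) in counts.items()
    }

rates = adoption_rate_by_exposure(friends, adopters)
print(rates)
```

Counts like these cannot separate influence from similarity between friends, so they are best read as a descriptive starting point.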

Data Collection, Management and Cleaning

The City of Como project is a collaboration with Fluxedo, an Italian startup working in partnership with the municipality of Como to model the dynamic flow of people in the city. The overall aim of the project is to integrate multiple, diverse data sources to build a picture of the way people live and move around the city. Using historical telecom and social media data along with other geolocated data, the team will form a coherent picture of the daily movements of different demographic groups throughout Como, depending on the day, time, and other factors such as weather and events.

Sentiment Analysis and Predictive Models

Moleskine’s philosophy centers on culture, travel, memory, imagination, and personal identity. The goal of this project is to find influencers by looking at users' interactions and to target them across different social platforms. For example, we will look at how people connect on Twitter and create a weighted graph using both follower counts and @mentions. We will look at all platforms and cluster groups of posts by trending topic using latent Dirichlet allocation (LDA), which can be applied to all sources of media. We will then try to determine whether trending topics and influencers are common across social platforms.
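The weighted-graph step might look something like the sketch below, which combines follow edges and @mention edges into one weight per directed pair and then ranks users by total incoming weight. The edge lists, the user names, and the two weight constants are assumptions made for illustration; the source does not say how follows and mentions are actually weighted.

```python
from collections import Counter

# Assumed weights; the real weighting scheme is not specified in the text.
FOLLOW_WEIGHT = 1.0
MENTION_WEIGHT = 0.5

# Hypothetical (source, target) edges.
follows = [("u1", "u2"), ("u3", "u2"), ("u2", "u1")]
mentions = [("u1", "u2"), ("u1", "u2"), ("u3", "u1")]

def build_weighted_edges(follow_edges, mention_edges):
    """Combine follows and mentions into one weight per directed edge."""
    weights = Counter()
    for edge in set(follow_edges):  # a follow counts once per pair
        weights[edge] += FOLLOW_WEIGHT
    for edge in mention_edges:      # every mention adds weight
        weights[edge] += MENTION_WEIGHT
    return weights

def rank_by_incoming_weight(weights):
    """Rank users by the total weight of edges pointing at them."""
    incoming = Counter()
    for (_, target), weight in weights.items():
        incoming[target] += weight
    return incoming.most_common()

edge_weights = build_weighted_edges(follows, mentions)
ranking = rank_by_incoming_weight(edge_weights)
print(ranking)
```

Incoming weight is only one possible influence score; a centrality measure such as PageRank could be computed on the same edge weights.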