Student Research Projects

Spring 2015 Projects

MBTA CAPSTONE - An attempt to improve the MBTA through Data.

The MBTA serves 4.8 million people throughout the Boston metro area and facilitates approximately 1.3 million trips each weekday. Aggregated entry and exit data is collected for each rail station at 15-minute intervals. Since commuting is one of the most habitual acts a metropolitan citizen performs, this data provides an excellent means of predicting ridership throughout the week.

Boston Globe Subscriber Conversion

The typical cyber-life of a Boston Globe user progresses from anonymous, casual visits to the site to, ultimately, a subscription. The Boston Globe would like to understand the idiosyncrasies and patterns of a subscriber and use that knowledge to increase subscription conversion rates.

Spring 2014 Projects

LAMMPS Simulation of a Microgravity Shear Cell

Taiyo Wilson | Advisor: Chris Rycroft

Granular materials exhibit interesting properties that lie somewhere between the solid and liquid phases. Low density granular materials flow freely like fluids, while high density materials 'jam' like solids. This work exploits these novel properties of granular materials to create more comfortable, adaptable, less expensive prosthetic sockets for amputees.

Predictive Models of Crop Yield in the Corn Belt

Benjamin Cook, Charles Hornbaker | Advisors: Luke Bornn, Pavlos Protopapas

Unprecedented amounts of data are available on modern farms. From yield monitors to weather sensors to infrared imaging, farmers are able to keep track of every detail on their farms. However, most farmers are not taking advantage of this data. In this project we developed and assessed stochastic procedures for estimating crop yield in a maize field with an eye toward making in-season forecasts.

A Century of Corn: Harvesting US Crop Yield Data

Benjamin Cook, Ryan King, Charles Hornbaker, Conor Myhrvold | Advisor: Hanspeter Pfister

A visual exploration of corn yield trends. The tool allows users to see trends at the national, state, or county level using US agricultural statistics from 1910 to 2013.

Driven Data: Data Science Competitions to Save the World

Isaac Slavit, Peter Bull | Advisor: Pavlos Protopapas

A website that runs Kaggle-style competitions for non-profits. Non-profits have all sorts of technical issues and predictive questions that never get answered because they lack the funds or the expertise to do the work.

Finch: A library for local search and stochastic optimization in Go.

Daniel Newman | Advisor: Pavlos Protopapas

Finch is a stochastic optimization library written in Go (version 1.2), a language initially developed at Google. It includes methods such as hill climbing and simulated annealing. All of the functions were built to be as general as possible, so that they are useful for a wide variety of problems.
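
To illustrate one of the methods named above, here is a minimal sketch of simulated annealing. It is written in Python for brevity and is not Finch's API: the function names, the geometric cooling schedule, and the toy quadratic cost are all assumptions made for this example.

```python
import math
import random


def simulated_annealing(cost, neighbor, start, steps=5000,
                        t_start=1.0, t_end=1e-3, seed=0):
    """Minimize `cost` starting from `start`; return the best state seen and its cost.

    Hypothetical helper for illustration only (not part of Finch).
    """
    rng = random.Random(seed)
    current, current_cost = start, cost(start)
    best, best_cost = current, current_cost
    for step in range(steps):
        # Geometric cooling from t_start down to t_end (an assumed schedule).
        fraction = step / max(steps - 1, 1)
        temperature = t_start * (t_end / t_start) ** fraction
        candidate = neighbor(current, rng)
        candidate_cost = cost(candidate)
        delta = candidate_cost - current_cost
        # Always accept improvements; accept a worse move with
        # probability exp(-delta / temperature).
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_cost = candidate, candidate_cost
            if current_cost < best_cost:
                best, best_cost = current, current_cost
    return best, best_cost


def quadratic(x):
    # Toy cost function with its minimum at x = 3.
    return (x - 3.0) ** 2


def random_step(x, rng):
    # Propose a nearby point by a small uniform perturbation.
    return x + rng.uniform(-0.5, 0.5)


best_x, best_cost = simulated_annealing(quadratic, random_step, start=-10.0)
print(best_x, best_cost)
```

Because the cost and neighbor functions are passed in as arguments, the same loop can be reused for other problems, which mirrors the generality the project description emphasizes. Hill climbing corresponds to the special case in which worse moves are never accepted.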

Structural Models With Optimization

Li Xiang | Advisor: Yaron Singer

A study of the estimation of structural models in general, optimization problems with ambiguous boundary conditions, and random matrix theory. In recent years, many market equilibrium problems have been modeled with these ambiguous boundary conditions, and optimization techniques have been developed to solve them.


Binary Recommendation

Aymen Jaffry | Advisors: David Parkes, Pavlos Protopapas

Matrix Factorization is a Collaborative Filtering technique that is widely used to implement Recommender Systems. It predicts how much a user may like an item based on a set of known item ratings. Ratings can be binary (i.e., 1 if the user likes the item, 0 otherwise) or discrete, as in the Netflix Prize (movies are graded on a 1-5 scale). This technique can be combined with external information that measures the similarity between users and items. This project presents a hybrid recommender system that combines Matrix Factorization with item similarities, applied to binary “ratings.”
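
As a rough illustration of matrix factorization on binary ratings, the sketch below fits low-rank user and item vectors with stochastic gradient descent on a logistic loss. It covers only the plain factorization step, not the hybrid item-similarity component, and it is not the project's implementation: the function names, hyperparameters, and toy data are all assumptions.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dot(p, q):
    return sum(a * b for a, b in zip(p, q))


def factorize(ratings, n_users, n_items, rank=2, epochs=500,
              lr=0.1, reg=0.01, seed=0):
    """Fit user/item factors to (user, item, 0-or-1) triples.

    Hypothetical helper: SGD on the logistic loss with L2 regularization.
    """
    rng = random.Random(seed)
    users = [[rng.gauss(0, 0.1) for _ in range(rank)] for _ in range(n_users)]
    items = [[rng.gauss(0, 0.1) for _ in range(rank)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, y in ratings:
            p, q = users[u], items[i]
            # Gradient of the log loss with respect to the score is (prediction - y).
            error = y - sigmoid(dot(p, q))
            for k in range(rank):
                pk, qk = p[k], q[k]
                p[k] += lr * (error * qk - reg * pk)
                q[k] += lr * (error * pk - reg * qk)
    return users, items


def predict(users, items, u, i):
    """Estimated probability that user u likes item i."""
    return sigmoid(dot(users[u], items[i]))


# Toy data (made up): users 0-1 like items 0-1, users 2-3 like items 2-3.
# The pairs (0, 1) and (3, 2) are deliberately left unobserved.
observed = [
    (u, i, 1 if (u < 2) == (i < 2) else 0)
    for u in range(4) for i in range(4)
    if (u, i) not in {(0, 1), (3, 2)}
]
users, items = factorize(observed, n_users=4, n_items=4)
print(predict(users, items, 0, 0), predict(users, items, 0, 2))
```

Calling predict on an unobserved pair such as (0, 1) gives the model's estimate for a rating it never saw, which is the quantity a recommender would rank items by.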

Fall 2013 Projects

A Modifiable University Ranking System in D3

Conor Myhrvold | Advisor: Hanspeter Pfister

A university ranking program that re-weights and sorts universities according to user-supplied weights for each ranking component, using the QS / US News & World Report 2012 World Universities ranking list.

Parallel Cellular Dynamics Evaluation

Ryan King | Advisors: Lance Munn, Pavlos Protopapas

This project uses CUDA to run real-time simulations of tumor growth, providing accurate information about the size and extent of tumor growth that can help physicians save lives.

Market Modeling and Computation Based on Random Utility Model

Muxi Li | Advisor: Professor David Parkes

Demand and supply analysis is a fundamental topic in Economics. By establishing proper mathematical models, one can have the flexibility to do market analysis including market share prediction, price elasticity estimation, new product introduction analysis, etc.


This is Equity Forecast: Analyzing Global Capital Markets

Brandon Sim and Rajiv Tarigopula | Advisor: Pavlos Protopapas

This report draws on several famous studies in financial theory and economics and attempts to accurately forecast prices and volatility of various securities in the U.S. stock market. 

Indian Online Matrimony Data Exploration

Nikhil Sud | Advisor: Pavlos Protopapas

India has a long tradition of arranged marriage, in which parents or extended family help pick a spouse for eligible children. With the advent of the Internet in India, Internet entrepreneurs founded matrimonial services that are similar to American dating sites but focus on educational and family background rather than personality-based matching criteria. Downloading and analyzing aggregate data from these services can provide insight into the sociological biases reflected in those criteria. The data provided answers to interesting questions with associated policy ramifications, such as whether biases diminish among more educated or more urban users.



Sentiment Analysis for Finance

Aymen Jaffry | Advisor: Pavlos Protopapas

This project applies natural language processing techniques to text data in order to extract meaningful information. A sentiment analysis tool was created to improve some simple financial strategies.