“Amyloid positivity” is a key risk indicator of Alzheimer’s disease. Amyloid status is considered positive when Amyloid Beta (Aβ) protein, also referred to as amyloid plaque, has accumulated in the brain with sufficient density to meet a threshold. The goal of this capstone project is to use machine learning and other advanced analytics approaches to construct a model that predicts whether a single individual is amyloid positive or negative. This project has the potential for your deliverables to be integrated into Biogen’s Alzheimer’s treatment pipeline.
Spring 2017 Projects
The Como project will focus on the city of Como, a small medieval town beautifully located on Lake Como in Northern Italy, with a large walking area in the downtown district and along the lakeshore. The project consists of collecting and analyzing data about the city and the way people live and move in it by integrating multiple and diverse data sources.
The problems to be addressed are:
- Providing a reliable estimate of the overall picture of people density
- Predicting the impact of future events positioned in time and space
- Given a constrained budget and a cost model for sensor deployment
The project will explore the popularity and success of different Moleskine products co-branded with other famous brands (also known as special editions) and launched during specific periods of time. The main field of analysis is measuring the impact of different products on social media channels and correlating that to sales.
Potentially hazardous objects (PHOs) are currently defined based on parameters that measure the object's potential to make threatening and close approaches to the Earth. To be considered a PHO, objects generally have an Earth minimum orbit intersection distance (MOID) of 0.05 AU or less and an absolute magnitude (H) of 22.0 or brighter (a rough indicator of large size). In this project students will develop the full pipeline which includes data management, algorithmic development and probabilistic predictions of impact.
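The two thresholds quoted above can be expressed as a simple filter. The following is a minimal sketch only: the `NearEarthObject` record and the three example objects are hypothetical and made up for illustration, and the sketch covers the PHO definition alone, not the impact-probability part of the pipeline.

```python
from dataclasses import dataclass

# Thresholds taken from the PHO definition quoted in the text.
MOID_LIMIT_AU = 0.05   # Earth minimum orbit intersection distance, in AU
H_LIMIT = 22.0         # absolute magnitude; smaller H means a brighter object


@dataclass
class NearEarthObject:
    """Hypothetical record type, introduced only for this illustration."""
    name: str
    moid_au: float
    abs_magnitude: float


def is_potentially_hazardous(obj: NearEarthObject) -> bool:
    """Apply both PHO criteria: a close orbit and a bright (large) object."""
    return obj.moid_au <= MOID_LIMIT_AU and obj.abs_magnitude <= H_LIMIT


# Made-up objects, not real catalogue entries.
objects = [
    NearEarthObject("example-A", moid_au=0.03, abs_magnitude=19.5),  # close and bright
    NearEarthObject("example-B", moid_au=0.03, abs_magnitude=24.0),  # close but too faint
    NearEarthObject("example-C", moid_au=0.20, abs_magnitude=18.0),  # bright but too distant
]

pho_names = [o.name for o in objects if is_potentially_hazardous(o)]
print(pho_names)
```

Note that "22.0 or brighter" translates to `abs_magnitude <= 22.0`, because the magnitude scale is inverted: lower values correspond to brighter objects.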
Legendary is a leading film production company, with 43 feature films released, 6 films currently in production, and 13 billion in box office through 2015. Identifying the correct search terms to find social media posts about an entity or concept is a highly challenging task. For instance, the word Fargo may refer to a place (in North Dakota), a TV show, a movie, or a bank (Wells Fargo). The student team analysed 4 million tweets to produce a text-query generation and optimization system. The search index query, constructed from combinations of text tokens constrained to simple logical operators, returns a highly pure set of text documents relevant to a property, such as a film, and also provides a characterization of the query bias.
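To make the idea of query purity concrete, here is a minimal sketch, not the student team's system. It assumes a query is a set of required tokens (AND) plus a set of excluded tokens (NOT), and it measures purity as the fraction of matched posts that are labelled relevant. The function names, the whitespace tokenization, and the four labelled posts are all hypothetical.

```python
def matches(tokens, all_of=(), none_of=()):
    """True if the post contains every required token and no excluded token."""
    return all(t in tokens for t in all_of) and not any(t in tokens for t in none_of)


def purity(posts, all_of=(), none_of=()):
    """Fraction of matched posts labelled relevant (0.0 when nothing matches)."""
    hits = [
        relevant
        for text, relevant in posts
        if matches(set(text.lower().split()), all_of, none_of)
    ]
    return sum(hits) / len(hits) if hits else 0.0


# Made-up posts, labelled by whether they are about the film.
posts = [
    ("fargo is my favorite coen brothers film", True),
    ("rewatched the fargo film last night", True),
    ("wells fargo closed my account", False),
    ("driving through fargo north dakota", False),
]

broad = purity(posts, all_of=("fargo",))
no_bank = purity(posts, all_of=("fargo",), none_of=("wells",))
narrow = purity(posts, all_of=("fargo", "film"))
print(broad, no_bank, narrow)
```

On this toy data the bare token matches all four posts and only half are relevant, while adding a second required token or excluding a confounding one raises the purity. A real system would also need to track how many relevant posts each tighter query misses.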
TripAdvisor users write reviews and upload photos from their various restaurant visits. These photos can be categorized and analysed to reveal information about the restaurant's menu, dishes, pricing, etc. The first step in this analysis is the classification of photos into simple, broad groups: food, drinks, menus, and inside and outside photos of the establishment. The students' goal for this project was to build an image classifier using Convolutional Neural Networks and images acquired by the students themselves.
Airbnb is a global marketplace for apartment rentals that reaches 190 countries and 34,000 cities. On Airbnb, citizens post rental offers and rent their own apartments to other citizens, thereby defining a market parallel to traditional hotel-based offers. We propose to integrate data from Airbnb with data from other sources, including open data, census information, real estate, information about the district and the house interiors, and social sources such as Instagram and Twitter, so as to develop a new scoring system for Airbnb offers, similar to the hotel star system.
Nester is a platform where companies can find the best designs for a project, using Kaggle. Kaggle is a platform that hosts machine learning competitions where companies and researchers post data and pose challenges. Data scientists from all over the world compete to answer the questions and to produce the best results, in effect, crowdsourcing the most efficient technique or solution to the questions.
Through Nester, companies post brief design challenges. Designers then propose solutions and vote for other people's projects. Experts refine projects. Companies give feedback, refine, and select the best ideas. Finally, users pledge for their favorite product and we have a winner!
Market exposure is a key concept in quantitative finance. It is classically measured by estimating the beta coefficient in a linear regression of an asset's returns on the market's returns, where beta (the exposure) expresses how strongly those returns move with the market. Returns with low exposure to the market are desirable, as they are less affected by market downturns. This exposure modeling can be generalized to multiple factors, and the exposures to those factors are used to determine whether a strategy or asset is sufficiently protected from changes in certain risk factors, and to purchase hedges that cancel out this risk exposure.
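The single-factor beta estimate described above can be sketched with ordinary least squares. This is an illustration under stated assumptions: the model is taken to be `asset_return = alpha + beta * market_return + noise`, the function name `estimate_alpha_beta` is hypothetical, and the return series are made up so that the true beta is 0.5.

```python
def estimate_alpha_beta(asset_returns, market_returns):
    """Least-squares fit of asset_return = alpha + beta * market_return.

    beta is the covariance of the two series divided by the variance of
    the market series; alpha is the intercept implied by the means.
    """
    if len(asset_returns) != len(market_returns):
        raise ValueError("return series must have the same length")
    n = len(market_returns)
    mean_a = sum(asset_returns) / n
    mean_m = sum(market_returns) / n
    cov = sum((a - mean_a) * (m - mean_m)
              for a, m in zip(asset_returns, market_returns))
    var = sum((m - mean_m) ** 2 for m in market_returns)
    beta = cov / var
    alpha = mean_a - beta * mean_m
    return alpha, beta


# Made-up period returns: the asset moves half as much as the market,
# plus a constant 0.1% per period.
market = [0.010, -0.020, 0.015, 0.005]
asset = [0.001 + 0.5 * m for m in market]

alpha, beta = estimate_alpha_beta(asset, market)
print(round(alpha, 6), round(beta, 6))
```

The multi-factor generalization mentioned in the text replaces the single market series with several factor series and solves the corresponding multiple regression, giving one exposure per factor.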
The MBTA serves 4.8 million people throughout the Boston metro area and facilitates approximately 1.3 million trips each weekday. Aggregated entry and exit data is collected for each rail station at 15-minute intervals. Since commuting is one of the most habitual acts a metropolitan citizen performs, this data provides an excellent means of predicting ridership throughout the week.
The typical cyber-life of a Boston Globe user starts with anonymous visits and progresses from casually visiting the site to ultimately becoming a subscriber. The Boston Globe would like to understand the idiosyncrasies and patterns of a subscriber and use that knowledge to increase subscription conversion rates.