IACS seminars are generally held every other Friday during the academic year and are free and open to the public. No registration is required. Unless otherwise noted, all seminars will take place at Harvard University, Maxwell Dworkin G115.
Speaker: Michael Bronstein, Radcliffe Institute; Università della Svizzera italiana (Switzerland); and Tel Aviv University (Israel)
In the past decade, deep learning methods have achieved unprecedented performance on a broad range of problems in fields from computer vision to speech recognition. So far, research has mainly focused on developing deep learning methods for Euclidean-structured data. However, many important applications have to deal with non-Euclidean structured data, such as graphs and manifolds. Such data are becoming increasingly important in computer graphics and 3D vision, sensor networks, drug design, biomedicine, high-energy physics, recommendation systems, and web applications. The adoption of deep learning in these fields lagged behind until recently, primarily because the non-Euclidean nature of the objects involved makes the very definition of basic operations used in deep networks rather elusive. In this talk, Dr. Bronstein will introduce the emerging field of geometric deep learning on graphs and manifolds, overview existing solutions and applications, and outline the key difficulties and future research directions.
February 16, 2018
Speaker: Isabelle Stanton, Software Engineer, Google
Google Search is one of the most widely used data products in the world. While it has been in constant development since 1997, Search is by no means a solved problem. Even the question of what makes a good set of search results for a query has a constantly evolving answer. Dr. Stanton's talk will focus on some of the challenges Search faces: defining quality metrics, dealing with noisy and sparse data, design choices for a data system at this scale, and what to do when you just don't like the output of a system.
March 2, 2018
Speaker: Ken Koedinger, Professor of Human–Computer Interaction and Psychology, Carnegie Mellon University
Big data and machine learning appear to be revolutionizing many fields. Is education one of them? Unlike our universe or the quantum structure of particles, how people learn is a question that seems much closer to our direct observation. So close, one might wonder why data is needed and whether self-reflection is sufficient to understand learning. Koedinger's first goal is to convince you that self-reflection is not sufficient. His second is to provide you with examples of educational data mining and how it has provided insights into how people learn (e.g., slowly and incrementally) and fostered improvements in human learning outcomes (e.g., 2x more effective learning). Koedinger will emphasize that explanatory models of data are critical for such insights and outcomes and that disciplinary expertise, not just data science, must be brought to bear. He will illustrate the role of disciplinary expertise in the psychology of learning and in the educational subject-matter domain, and the role of explanatory models in the form of symbolic computational models of learning that can be taught competencies like algebra, grammar, and chemistry.
March 23, 2018
Speaker: Francesca Dominici, Professor of Biostatistics, HSPH & Co-Director of the Harvard Data Science Initiative (HDSI)
What if I told you I had evidence of a serious threat to American national security: a terrorist attack in which a jumbo jet will be hijacked and crashed every 12 days? Thousands will continue to die unless we act now. This is the question before us today, but the threat doesn't come from terrorists. The threat comes from climate change and air pollution.
Researchers have developed an artificial neural network model that uses on-the-ground air-monitoring data and satellite-based measurements to estimate daily pollution levels across the continental U.S., breaking the country up into 1-square-kilometer zones. They have paired that information with health data contained in Medicare claims records spanning the last 12 years and covering 97% of the population aged 65 or older. They have also developed statistical methods and computationally efficient algorithms for the analysis of over 460 million health records.
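The linkage step described above, attaching a daily exposure estimate for each 1-km grid cell to individual health records, can be sketched roughly as follows. This is a minimal illustration only: the grid-cell identifiers, field names, and values are hypothetical and do not reflect the researchers' actual data or pipeline.

```python
# Hypothetical sketch: link daily pollution estimates on a 1 km grid
# to individual health records by (grid cell, date). All identifiers
# and numbers here are illustrative, not real data.

# Predicted PM2.5 (micrograms per cubic meter) per (grid_cell, date),
# e.g. from a model combining ground monitors and satellite measurements.
pollution = {
    ("cell_001", "2012-07-01"): 14.2,
    ("cell_001", "2012-07-02"): 9.8,
    ("cell_002", "2012-07-01"): 21.5,
}

# Simplified claims records: each person is assigned to the grid cell
# containing their residence.
claims = [
    {"person": "A", "cell": "cell_001", "date": "2012-07-01", "admitted": True},
    {"person": "B", "cell": "cell_002", "date": "2012-07-01", "admitted": False},
]

def link_exposure(claims, pollution):
    """Attach the matching daily exposure estimate to each claims record."""
    linked = []
    for rec in claims:
        exposure = pollution.get((rec["cell"], rec["date"]))
        if exposure is not None:
            linked.append({**rec, "pm25": exposure})
    return linked

linked = link_exposure(claims, pollution)
print(linked)
```

At the scale described in the talk (hundreds of millions of records), this join would of course be done with distributed or database tooling rather than in-memory dictionaries, but the logical structure of the linkage is the same.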
Their research shows that short- and long-term exposure to air pollution is killing thousands of senior citizens each year. Their data science platform is telling us that federal limits on the nation's most widespread air pollutants are not stringent enough.
This type of data signals a new era for the role of data science in public health, and also for the associated methodological challenges. For example, with enormous amounts of data, the threat of unmeasured confounding bias is amplified, and causality is even harder to assess with observational studies. Dr. Dominici will discuss these and other challenges.
Dean's Lecture on Computational Science & Engineering (CSE)
Speaker: David Spergel, Charles Young Professor of Astronomy, Princeton University & Founding Director, CCA, Flatiron Institute
Images of the cosmic microwave background, the leftover heat from the big bang, are the universe's baby picture. Embedded in this picture is information about the universe's age, origin, composition, and fate. Our observations have revealed a remarkably simple, yet strange universe. A simple model with only five basic numbers can describe the basic statistical properties of the universe, capturing the positions and properties of billions of galaxies and millions of independent points on the sky. While the model is simple, it implies that atoms make up only 5% of the universe and that the bulk of the universe is composed of mysterious dark matter and dark energy. Dr. Spergel will review past measurements and look forward to future observations that could determine the properties of the dark energy and deepen our understanding of the universe's beginnings and ultimate fate.
April 20, 2018
Speaker: Ben Shneiderman, Professor of Computer Science, University of Maryland, College Park
Event Analytics is rapidly emerging as a new topic to extract insights from the growing set of temporal event sequences that come from medical histories, e-commerce patterns, social media log analysis, cybersecurity threats, sensor nets, online education, sports, etc. This talk reviews a decade of research on visualizing and exploring temporal event sequences to view compact summaries of thousands of patient histories represented as time-stamped events, such as strokes, vaccinations, or admission to an emergency room.
Dr. Shneiderman’s work on EventFlow supports point events, such as heart attacks or vaccinations, and interval events, such as medication episodes or long hospitalizations. Demonstrations cover visual interfaces that support hospital quality-control analysts, who ensure that required procedures were carried out, and clinical researchers, who study treatment patterns that lead to successful outcomes. He will show how domain-specific knowledge and problem-specific insights can sharpen the analytic focus so as to enable more successful pattern and anomaly detection.
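The distinction between point events and interval events described above can be sketched with a minimal data model. This is an illustrative assumption of how such events might be represented, not EventFlow's actual data format or API; all class and field names are hypothetical.

```python
# Hypothetical sketch of the point-vs-interval event distinction;
# class and field names are illustrative, not EventFlow's API.
from dataclasses import dataclass
from datetime import date

@dataclass
class PointEvent:
    """An event at a single timestamp, e.g. a stroke or a vaccination."""
    kind: str
    when: date

@dataclass
class IntervalEvent:
    """An event with duration, e.g. a medication episode or hospitalization."""
    kind: str
    start: date
    end: date

    def contains(self, point: PointEvent) -> bool:
        # True if the point event falls within this interval.
        return self.start <= point.when <= self.end

stroke = PointEvent("stroke", date(2017, 3, 5))
meds = IntervalEvent("anticoagulant", date(2017, 3, 1), date(2017, 6, 1))
print(meds.contains(stroke))
```

Queries like "did this outcome occur during that treatment episode" reduce to simple containment and overlap tests on such records, which is one reason representing durations explicitly, rather than as pairs of unrelated point events, pays off in this kind of analysis.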
Co-sponsored with the Harvard Data Science Initiative (HDSI).