Courses across Harvard
NOTE: Students enrolling in the Secondary Field or Master's degree programs should not assume that courses listed here will qualify as "domain electives." Any course not carrying an asterisk requires faculty approval to be included in a CSE plan of study.
See IACS courses.
Complex numbers. Multivariate calculus: partial differentiation, directional derivatives, techniques of integration and multiple integration. Vectors: dot and cross products, parameterized curves, line and surface integrals. Vector calculus: gradient, divergence and curl, Green’s, Stokes’ and Gauss’ theorems, including orthogonal curvilinear coordinates. Applications in electrical and mechanical engineering.
Linear algebra: matrices, determinants, eigenvalues, eigenvectors, Markov processes. Optimization and least-squares analysis. Ordinary differential equations. Infinite series and Fourier series. Orthogonality and completeness. Introduction to partial differential equations. Applications in electrical and mechanical engineering.
Introductory statistical methods for students in the applied sciences and engineering. Random variables and probability distributions; the concept of random sampling, including random samples, statistics, and sampling distributions; the Central Limit Theorem and its role in statistical inference; parameter estimation, including point estimation and maximum likelihood methods; confidence intervals; hypothesis testing; simple linear regression; and multiple linear regression. Introduction to more advanced techniques as time permits.
Complex Analysis: complex numbers, functions, mapping, differentiation, integration, branch cuts, series expansions, residue theory. Fourier Analysis: Fourier series, Fourier and Laplace transforms, applications to differential equations and data analysis.
Ordinary differential equations: power series solutions; special functions; eigenfunction expansions. Review of vector calculus. Elementary partial differential equations: separation of variables and series solutions; comparison of elliptic, parabolic and hyperbolic systems. Brief introduction to nonlinear dynamical systems and to numerical methods.
Many complex physical problems defy simple analytical solutions or even accurate analytical approximations. Scientific computing can address certain of these problems successfully, providing unique insight. This course introduces some of the widely used techniques in scientific computing through examples chosen from physics, chemistry, and biology. The purpose of the course is to introduce methods that are useful in applications and research and to give the students hands-on experience with these methods.
Abstracting the essential components and mechanisms from a natural system to produce a mathematical model, which can be analyzed with a variety of formal mathematical methods, is perhaps the most important, but least understood, task in applied mathematics. This course approaches a number of problems without the prejudice of trying to apply a particular method of solution. Topics drawn from mechanics, biology, economics and the behavioral sciences.
Introduction to basic mathematical ideas and computational methods for solving deterministic and stochastic optimization problems. Topics covered: linear programming, integer programming, branch-and-bound, branch-and-cut, Markov chains, Markov decision processes. Emphasis on modeling. Examples from business, society, engineering, sports, e-commerce. Exercises in AMPL, complemented by Maple or Matlab.
Introduction to methods for developing accurate approximate solutions for problems in the sciences that cannot be solved exactly, and integration with numerical methods and solutions. Topics include: approximate solution of integrals, algebraic equations, nonlinear ordinary differential equations and their stochastic counterparts, and partial differential equations. Introduction to "sophisticated" uses of Matlab.
- AM 202 Physical Mathematics II*
Theory and techniques for finding exact and approximate analytical solutions of partial differential equations: eigenfunction expansions, Green functions, variational calculus, transform techniques, perturbation methods, characteristics, selected nonlinear PDEs, introduction to numerical methods.
See IACS courses.
The course will introduce Bayesian analysis, maximum entropy principles, hidden Markov models and pattern theory. These concepts will be used to understand information processing in biology. The relevant biological background will be covered in depth.
Advanced techniques for modeling and solving large and difficult optimization problems as well as the core theory and geometry of linear inequalities, integer programming and combinatorial optimization. Topics covered: geometry and theory of linear programming, solving large scale optimization problems using column and constraint generation, network flows, computational complexity, basic integer programming models and algorithms, paths and trees, matchings, integrality of polyhedra, and matroids. Emphasis will be on developing an understanding of the core theory and solution methods. Exercises and the class project will involve developing and implementing optimization algorithms, possibly using standard solvers such as AMPL.
ASTRO 193. Noise and Data Analysis in Astrophysics
How to design experiments and get the most information from noisy, incomplete, flawed, and biased data sets. Basics of probability theory; Bernoulli trials; Bayes’ theorem; random variables; distributions; functions of random variables; moments and characteristic functions; Fourier transform analysis; stochastic processes; estimation of power spectra: sampling theorem, filtering; fast Fourier transform; spectrum of quantized data sets. Weighted least mean squares analysis and nonlinear parameter estimation. Bootstrap methods. Noise processes in periodic phenomena. Image processing and restoration techniques. The course will emphasize a Bayesian approach to problem solving and the analysis of real data sets.
This course will assess the relationships between sequence, structure and function in complex biological networks as well as progress in realistic modeling of quantitative, comprehensive functional-genomics analyses. Topics will include algorithmic, statistical, database, and simulation approaches and practical applications to biotechnology, drug discovery and genetic engineering. Future opportunities and current limitations will be critically assessed. Problem sets and a course project will emphasize creative, hands-on analyses using these concepts.
In-depth study of genomics: models of evolution and population genetics; comparative genomics: analysis and comparison; structural genomics: protein structure, evolution and interactions; functional genomics, gene expression, structure and dynamics of regulatory networks.
An introduction to modern theories of the structure of matter, including the principles of quantum mechanics, the electronic structure of atoms and molecules, chemical bonding, and atomic and molecular spectra. The course will offer an introduction to the practical aspects of modern computational quantum chemistry methods such as density functional theory.
- CHEM 253. Modeling Matter at Nanoscale: An Introduction to Theoretical and Computational Approaches
Essentials of modeling the structure of matter at the nanoscale. Material properties and connections to the mesoscale. Intended for advanced undergraduate students or beginning graduate students in Chemistry, Physics, Applied Physics and the Life Sciences. Prerequisite: Applied Mathematics 21a and 21b; Mathematics 21a and 21b, or equivalent preparation in calculus and differential equations; Physical Sciences 1 or equivalent preparation in chemical bonding and fundamental principles; Physical Sciences 2 or Physics 11a, and Physical Sciences 3 or Physics 11b.
Design and analysis of efficient algorithms and data structures. Algorithm design methods, graph algorithms, approximation algorithms, and randomized algorithms are covered.
- CS 146. Advanced Computer Architecture
Review of the fundamental structures in modern processor design. Topics include computer organization, memory system design, pipelining, and other techniques to exploit parallelism. Emphasis on a quantitative evaluation of design alternatives and an understanding of timing issues.
Covers the fundamental concepts of database and information management. Data models: relational, object-oriented, and other; implementation techniques of database management systems, such as indexing structures, concurrency control, recovery, and query processing; management of unstructured data; terabyte-scale databases.
Introduction to key design principles and techniques for visualizing data. Covers design practices, data and image models, visual perception, interaction principles, tools from various fields, and applications. Introduces programming of interactive visualizations.
Introduction to artificial intelligence, focusing on problems of perception, reasoning under uncertainty, and especially machine learning. Supervised learning algorithms. Decision trees. Ensemble learning and boosting. Neural networks, multi-layer perceptrons and applications. Support vector machines and kernel methods. Clustering and unsupervised learning. Probabilistic methods, parametric and non-parametric density estimation, maximum likelihood and maximum a posteriori estimates. Bayesian networks and graphical models: representation, inference and learning. Hidden Markov models. Markov decision processes and reinforcement learning. Computational learning theory.
The interplay between economic thinking and computational thinking as it relates to electronic commerce, social networks, collective intelligence and networked systems. Topics covered include: game theory, peer production, reputation and recommender systems, prediction markets, crowd sourcing, network influence and dynamics, auctions and mechanisms, privacy and security, matching and allocation problems, computational social choice and behavioral game theory. Emphasis will be given to core methodologies, with students engaged in theoretical, computational and empirical exercises.
A quantitative theory of the resources needed for computing and the impediments to efficient computation. The models of computation considered include ones that are finite or infinite, deterministic, randomized, quantum or nondeterministic, discrete or algebraic, sequential or parallel.
Covers topics related to algorithms for big data, especially related to networks. Themes include compression, cryptography, coding, and information retrieval related to the World Wide Web. Requires a major final project.
- CS 226r. Efficient Algorithms*
Important algorithms and their real-life applications. Topics include combinatorics, string matching, wavelets, FFT, computational algebra, number theory and geometry, randomized algorithms, search engines, page rankings, maximal flows, error correcting codes, cryptography, parallel algorithms.
- CS 228. Computational Learning Theory*
Possibilities of and limitations to performing learning by computational agents. Topics include computational models, polynomial time learnability, learning from examples and learning from queries to oracles. Applications to Boolean functions, automata and geometric functions.
- CS 246. Advanced Computer Architecture*
Similar to CS 146, with the exception that students enrolled in CS 246 are expected to undertake a substantial course project.
- CS 262. Introduction to Distributed Computing*
Examination of the special problems associated with distributed computing such as partial failure, lack of global knowledge and protocols that function in the face of these problems. Emphasis on causal ordering, event and RPC-based systems.
This course is an introduction to several modern parallel computing approaches and languages. Covers programming models, hardware architectures, multi-threaded programming, GPU programming with CUDA, cluster computing with MPI, cloud computing, and map-reduce using Hadoop and Amazon’s EC2. Students will complete readings, programming assignments, and a final project.
Advanced statistical machine learning and probabilistic data analysis. Topics include: Markov chain Monte Carlo, variational inference, Bayesian nonparametrics, text topic modeling, unsupervised learning, dimensionality reduction and visualization. Requires a major final project.
An overview of modern computational tools with applications to the Earth Sciences. Introduction to the MATLAB programming and visualization environment. Topics include: statistical and time series analysis, visualization of two- and three-dimensional data sets, tools for solving linear/differential equations, parameter estimation methods. Labs emphasize applications of the methods and tools to a wide range of data in Earth Sciences.
Chemical transport models: principles, numerical methods. Inverse models: Bayes’ theorem, optimal estimation, Kalman filter, adjoint methods. Analysis of environmental data: visualization, time series analysis, Monte Carlo methods, statistical assessment. Students prepare projects and presentations.
- EPS 231. Climate Dynamics
Climate and climate variability phenomena and mechanisms using a hierarchical modeling approach. Basics: El Niño and thermohaline circulation, abrupt, millennial and glacial-interglacial variability, equable climates.
- EPS 270r. Structural Interpretation of Seismic Data
Methods of interpreting complex geologic structures imaged in 2- and 3-dimensional seismic reflection data. Methods of integrating geologic and remote sensing data will be described. Students will complete independent projects analyzing seismic data on workstations.
An introduction to multiple regression techniques with focus on economic applications. Discusses extensions to discrete response, panel data, and time series models, as well as issues such as omitted variables, missing data, sample selection, randomized and quasi-experiments, and instrumental variables. Aims to provide students with an understanding of and ability to apply econometric and statistical methods using computer packages.
Topics include elements of statistical decision theory and related experimental evidence; some game theory and related experimental evidence; maximum likelihood; logit, normal, probit, and ordered probit regression models; panel data models with random effects; omitted variable bias and random assignment; incidental parameters and conditional likelihood; demand and supply.
This course explores discontinuous changes in the economic position of groups and countries and presents mathematical and computer simulation models designed to illuminate these changes. Among the examples are: growth/decline of trade unions; segregation of groups; changes in corporate work culture; growth of social pathologies in neighborhoods; Malthusian concerns about the environment. Among the models are: nonlinear simulations; neural networks; finite automata; evolutionary stable strategies; causal conjunctures; agent-based simulations; genetic algorithms. Primary emphasis is on using models and computer programs to analyze the substantive examples rather than on mathematics.
Introduces theories of inference underlying most statistical methods and how new approaches are developed. Examples include discrete choice, event counts, durations, missing data, ecological inference, time-series cross sectional analysis, compositional data, causal inference, and others.
This course introduces Geographical Information Systems and their applications. GIS is a combination of software and hardware with capabilities for manipulating, analyzing and displaying spatially referenced information. The course will meet two times a week. Every week, there will be a lecture and discussion as well as a laboratory exercise where students will work with GIS software on the computer.
Graduate-level version of Gov. 1002. Meets with Gov. 1002 and introduces theories of inference underlying most statistical methods and how new approaches are developed.
Computational approaches and algorithms for contemporary problems in systems biology, with a focus on models of biological systems, including regulatory network discovery and validation. Topics include (1) genotypes, regulatory factor binding and motif discovery, whole genome RNA expression; (2) Regulatory networks: discovery, validation, data integration, protein-protein interactions, signaling, whole genome chromatin immunoprecipitation analysis; (3) Experimental design: model validation, interpretation of interventions. Computational methods discussed include directed and undirected graphical models such as Bayesian networks, factor graphs, Dirichlet processes, and topic models. Multidisciplinary team oriented final research project.
Covers the algorithmic and machine learning foundations of computational biology, combining theory with practice. Principles of algorithm design, influential problems and techniques, and analysis of large-scale biological datasets. Topics include (a) genomes: sequence analysis, gene finding, RNA folding, genome alignment and assembly, database search; (b) networks: gene expression analysis, regulatory motifs, biological network analysis; (c) evolution: comparative genomics, phylogenetics, genome duplication, genome rearrangements, evolutionary theory. These are coupled with fundamental algorithmic techniques including: dynamic programming, hashing, Gibbs sampling, expectation maximization, hidden Markov models, stochastic context-free grammars, graph clustering, dimensionality reduction, Bayesian networks.
Experimental functional genomics, computational prediction of gene function, and properties and models of complex biological systems. Primarily critical reading and discussion.
Provides in-depth quantitative understanding of evolutionary and population genetics, comparative genomics, and structural genomics and proteomics. Each module consists of a series of lectures, a journal club discussion of high impact publications, and a lecture providing clinical correlates. Homework assignments and final projects aim to develop understanding of genomic data from evolutionary principles.
Introduction to the basic concepts underlying dynamical simulations of proteins and nucleic acids. Basic definitions of components that form biological systems used to develop physical models that describe the dynamics of biomolecules. Topics include classical statistical thermodynamics for calculation of macroscopic observables, normal-mode analyses of protein dynamics, and thermodynamic perturbation theory. Emphasizes actual techniques and algorithms used for such calculations.
Applies analysis of signals and noise in linear systems, sampling, and Fourier properties to magnetic resonance (MR) imaging acquisition and reconstruction. Provides adequate foundation for MR physics to enable study of RF excitation design, efficient Fourier sampling, parallel encoding, reconstruction of non-uniformly sampled data, and the impact of hardware imperfections on reconstruction performance. Surveys active areas of MR research. Assignments include Matlab-based work with real data. Includes visit to a scan site for human MR studies.
Fundamentals of digital signal processing with particular emphasis on problems in biomedical research and clinical medicine. Basic principles and algorithms for data acquisition, imaging, filtering, and feature extraction. Laboratory projects provide practical experience in processing physiological data, with examples from cardiology, speech processing, and medical imaging.
Introduction to bioinformatics, the collection of principles and computational methods used to upgrade the information content of biological data generated by genome sequencing, proteomics, and cell-wide physiological measurements of gene expression and metabolic fluxes. Fundamentals from systems theory presented to define modeling philosophies and simulation methodologies for the integration of genomic and physiological data in the analysis of complex biological processes. Various computational methods address a broad spectrum of problems in functional genomics and cell physiology. Application of bioinformatics to metabolic engineering, drug design, and biotechnology also discussed.
Explores and illustrates theory underlying computational approaches to solving problems in evolutionary biology. Begins with components of evolutionary theory and inferential logic of evolution by natural selection. Emphasizes development of analytical skills needed to judge the computational and algorithmic implications and requirements of evolutionary models. Examples drawn from current research in evolutionary biology: whole-genome species comparison, phylogenetic tree construction, molecular evolution, homology and development, optimization and evolvability, heritability, disease evolution, detecting selection in human populations, and evolution of language. Extensive laboratory exercises in model-building and analyzing evolutionary data.
Follows trends in modern brain theory, focusing on local neuronal circuits as basic computational modules. Explores the relation between network architecture, dynamics, and function. Introduces tools from information theory, statistical inference, and the learning theory for the study of experience-dependent neural codes. Specific topics: computational principles of early sensory systems; adaptation and gain control in vision, dynamics of recurrent networks; feature selectivity in cortical circuits; memory; learning and synaptic plasticity; noise and chaos in neuronal systems.
- OEB 125. Molecular Ecology and Evolution
A survey of theory and applications of DNA technologies to the study of evolutionary, ecological and behavioral processes in natural populations. Topics to be covered will span a variety of hierarchical levels, timescales, and taxonomic groups, and will include the evolution of genes, genomes and proteins; the neutral theory of molecular evolution and molecular clocks; population genomics and phylogenetic principles of speciation and phylogeography; metagenomics of microbial communities; relatedness and behavioral ecology; molecular ecology of infectious disease; and conservation genetics.
Theory and practice of systematics, emphasizing issues associated with homology statements and alignments, methods of tree reconstruction, and hypothesis evaluation. The course combines theoretical considerations, paying special attention to algorithmic aspects of phylogenetics, with the use of different computer programs for conducting evolutionary and phylogenetic analyses.
Mathematical theory, experimental data, and history of ideas in the field, including analytical methods to study genetic variation with applications to evolution, demographic history, agriculture, health and disease. Includes lectures, problem sets, and student presentations.
- OEB 252. Coalescent Theory
The mathematics and computation of ancestral inference in population genetics. Theory relates observable genetic data to factors of evolution such as mutation, genetic drift, migration, natural selection, and population structure.
- BIO 111. Introduction to Programming in SAS
Provides an overview in the use of SAS to prepare data for statistical analysis. The focus is on database management and programming problems. Basic issues in each of these areas are discussed in the context of introducing the specific skills required to use SAS effectively.
- BIO 113. Introduction to Data Management and Programming in SAS
Provides intensive instruction in the use of SAS to prepare data for statistical analysis. The focus is on database management and programming problems. Basic issues in each of these areas are discussed in the context of teaching the specific skills required to use SAS effectively.
- BIO 503. Programming and Statistical Modeling in R
This course is an introduction to R, a powerful and flexible statistical language and environment that also provides more flexible graphics capabilities than other popular statistical packages. The course will introduce students to the basics of using R for statistical programming, computation, graphics, and modeling. We will start with a basic introduction to the R language, reading and writing data, and graphics. We then discuss writing functions in R and tips on programming in R. Finally, the latter part of the course will focus on using R to fit some important statistical models, including basic linear regression, generalized linear models and survival analysis. We can provide an introduction to analysis of genomics data in Bioconductor should there be interest among students. Our goal is to get students up and running with R such that they can use R in their research and are in a good position to expand their knowledge of R on their own. Course notes are written such that they provide students with a useful reference manual on R. Basic knowledge of statistics at the level of a basic understanding of linear regression is required.
- BIO 504. Geographical Information Using ArcGIS
This course introduces Geographic Information Systems (GIS) and their applications. GIS is a combination of software and hardware with capabilities for manipulating, analyzing and displaying spatially referenced information. Emphasis on learning practical skills using ArcGIS software.
- BIO 505. Database Design and Use for Health Research
Essential concepts needed to design, implement, and use a database using Oracle Express. Principles of relational database structures and objects, Structured Query Language (SQL), security concepts, schema design, referential integrity, and basic database administration. Students will learn to produce reports and datasets that can be imported into a statistical analysis package. Special emphasis on studies that incorporate high dimensional genetics and genomic data.
- BIO 508. Genomic Data Manipulation
Introduction to genomic data, computational methods for interpreting these data, and a survey of current functional genomics research. Covers biological data processing, programming for large datasets, high-throughput data (sequencing, proteomics, expression, etc.), and related publications. This course is targeted at students in experimental biology programs with an interest in understanding how available genomic techniques and resources can be applied in their research.
- BIO 509. Introduction to Statistical Computing Environments
Acquaints students with statistical computing environments under Windows and Linux systems. Taught in a computing lab, the course consists of lectures, demonstrations and hands-on exercises. Example topics include R, SAS, LaTeX, and library resources.
- BIO 510, 511. Programming I and II
Programming I introduces general computer programming to students with little prior programming experience. Taught in a computing lab, the course consists of lectures, demonstrations and hands-on exercises. Example topics include language syntax, flow control, and basic data structures. Programming II introduces advanced computer programming topics to students who have taken Programming I or have basic programming experience. Taught in a computing lab, the course consists of lectures, demonstrations and hands-on exercises. Example topics include algorithm design, parsing text data, object-oriented programming, and extension libraries.
- BIO 512, 513. Introductory and Advanced Computational Biology and Bioinformatics
Basic problems, technology platforms, algorithms and data analysis approaches in computational biology. Algorithms covered include dynamic programming, hidden Markov model, Gibbs sampler, clustering and classification methods. Students will explore current topics in computational biology in a seminar format with a focus on interpretation of -omics data. They will develop skills necessary for independent research using computational biology.
- BIO 514. Introduction to Data Structures and Algorithms
Introduction to the data structures and computer algorithms that are relevant to statistical computing. The implementation of data structures and algorithms for data management and numerical computations are discussed.
The course will cover basic technology platforms, data analysis problems and algorithms in computational biology. Topics include sequence alignment and search, high throughput experiments for gene expression, transcription factor binding and epigenetic profiling, motif finding, RNA/protein structure prediction, proteomics and genome-wide association studies. Computational algorithms covered include hidden Markov model, Gibbs sampler, clustering and classification methods.
- STAT 123. Applied Quantitative Finance on Wall Street
An introduction to modern financial derivative markets and the probabilistic and statistical techniques used to navigate them. Methodology will largely be motivated by real problems from the financial industry. Topics include: interest rates; forward and futures contracts; option markets and probabilistic valuation methods; interest-rate derivatives and structured notes; electronic trading and performance evaluation. Designed for those seeking an understanding of the quantitative challenges on Wall Street and the probabilistic toolkit developed to address them.
An introduction to time series models and associated methods of data analysis and inference. Autoregressive (AR), moving average (MA), ARMA, and ARIMA processes, stationary and non-stationary processes, seasonal processes, autocorrelation and partial autocorrelation functions, identification of models, estimation of parameters, diagnostic checking of fitted models, forecasting, spectral analysis, and transfer function models.
An introduction to major statistics packages used in academics and industry (SAS and R). Will discuss data entry and manipulation, implementing standard analyses and graphics, exploratory data analysis, simulation-based methods, and new programming methods.
An introductory course in stochastic processes. Topics include Markov chains, branching processes, Poisson processes, birth and death processes, Brownian motion, martingales, introduction to stochastic integrals, and their applications.
Random variables, measure, representations. Families of distributions: Multivariate Normal, conjugate, marginals, mixtures. Conditional distributions and expectation. Convergence, laws of large numbers, central limit theorems, and martingales.
Inference: frequency, Bayes, decision analysis, foundations. Likelihood, sufficiency, and information measures. Models: Normal, exponential families, multilevel, and non-parametric. Point, interval and set estimation; hypothesis tests. Computational strategies, large and moderate sample approximations.
Brownian motion, martingales, central limit theorems and Stein’s method, Poisson random measures, approximations (Delta method, Edgeworth, etc.), inequalities, elements of stochastic integrals.
Basic Bayesian models, followed by more complicated hierarchical and mixture models with nonstandard solutions. Includes methods for monitoring adequacy of models and examining sensitivity of models.
This is a graduate-level class that aims to equip PhD, Master's, and motivated undergraduate students with practical distributed computing and visualization tools for scientifically rigorous work on data-intensive problems. In this class, we will build a way of thinking about quantitative problems regardless of their seeming complexity. The class covers all stages of a data analysis problem: (1) setting up the problem, (2) designing a method to solve it, (3) implementing the related computation, (4) presenting the findings. We will uncover the components of each stage, demonstrate the interaction between all stages (1-4), and discuss their practical implementation.
Hands-on introduction to network statistics, with applications to social, biological and communication networks. Topics in sampling designs and inference. Modeling network evolution. Processes on networks. Critical literature review, in-class presentations, and final projects.