Who Should Take this Course?

The course is required for students in the Master's Program in Medical Informatics, but is open to other graduate students and advanced undergraduates.


Below are some examples of projects developed in this course, by domain:

  1. Anonymization of Databases
    • Using Boolean Reasoning to Anonymize Databases
  2. Diagnostic Models
    • Using Patient-Reportable Clinical History Factors to Predict Myocardial Infarction
    • A Genetic Algorithm to Select Variables in Logistic Regression: Example in the Domain of Myocardial Infarction
  3. Prognostic Models
    • Development and Evaluation of Models to Predict Death and Myocardial Infarction Following Coronary Angioplasty and Stenting
    • Major Complications after Angioplasty in Patients with Chronic Renal Failure: A Comparison of Predictive Models

There is no required textbook for this course; however, the following textbooks are recommended for reference:

Hastie, T., R. Tibshirani, and J. H. Friedman. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. (Springer Series in Statistics.) New York, NY: Springer Verlag, July 30, 2003. ISBN: 9780387952840.

Duda, Richard O., Peter E. Hart, and David G. Stork. Pattern Classification. 2nd ed. New York, NY: John Wiley & Sons, October 2000. ISBN: 9780471056690.


Familiarity with SAS and MATLAB® will be helpful, and consulting the respective user manuals may be necessary.


30% Homework

Homework assignments are due at the end of each module. They may require programming, and all code must be included. They are to be solved individually. Homework received after the deadline may be subject to a substantial grade penalty. No homework will be accepted after the solutions have been handed out.

30% Midterm

The midterm will cover topics from decision analysis and machine learning. There will be a strict time limit of 1.5 hours. Students should bring class notes, homework solutions, and readings.

40% Final Project

The final project has to be developed for this course and should be done individually. Although the project may contain parts developed previously for another purpose, it is essential that substantial effort be demonstrated in developing a project specifically for this class. A final report in the form of a 5+ page paper is expected, as well as a demonstration of the implementation in the form of a 15-minute presentation. Students are advised to discuss their projects with the instructors ahead of time. Examples of final reports will be handed out during the course.

MATLAB® is a trademark of The MathWorks, Inc.