Announcements
Feb 22nd, 2018: Course site is up. Potential students please come back and check
…..
Course Instructor
Course Assistants
Class Time and Location
Office hours: right after class
Course Description
This is a graduate-level course that is primarily project-oriented. It teaches how to build Artificial Intelligence for IT Operations (AIOps) systems for the Internet. The high-level objectives of these systems, for the targeted networks and applications in the Internet, are:
1) What happened in the past can be reconstructed automatically and accurately;
2) What’s going on now can be detected/inferred accurately to trigger automatic mitigation or suggest immediate actions to the operators;
3) What will happen in the future can be predicted with high confidence.
The figure below illustrates how AIOps can help IT Operations:
AIOps is at the intersection of AI, Business, and IT Operations, as illustrated in the figure below.
Through case studies based on recent research papers from top conferences, this course covers the latest research progress in AIOps and shows how current techniques (time series analysis, machine learning, deep learning, big data systems, streaming systems) can be applied to the field.
Grading Policies
Attendance: 10%; Two Assignments: 30%; One Project: 60%
Course Information
Statistical Data Mining Tutorials, by Andrew Moore
《Deep Learning》, by Ian Goodfellow, Yoshua Bengio, and Aaron Courville
《Site Reliability Engineering: How Google Runs Production Systems》, by Betsy Beyer, Chris Jones, Jennifer Petoff, and Niall Richard Murphy