Anomaly Detection
Summary
Learn how to use statistics and machine learning to detect anomalies in data. Identifying abnormal data is a fundamental part of data science and AI, with applications in supervised learning, data analytics, financial prediction, and many other areas. Understanding the theory and intuition behind these methods is an essential part of the modern developer's and researcher's toolkit.
This course gives you practical experience with the following skills:
- Understand the theory and methods used for anomaly detection, from beginner to advanced levels
- Derive depth-based and proximity-based detection models
- Work with many types of data, from real-time streams to high-dimensional abstractions
- Implement these types of models in a collection of Python* labs
The course is structured around eight weeks of lectures and exercises. Each week requires approximately two hours to complete.
Prerequisites
- Python programming
- Calculus
- Linear algebra
- Statistics
Week 1
Get started with understanding why and how to detect anomalies in data.
- Define various types of anomalies
- Discuss the applications of anomaly detection
- Explain the statistics and mathematics required
Week 2
Learn how to build upon probability theory and geometry to identify anomalies.
- Describe probabilistic models for anomaly detection
- Apply extreme value analysis and angle-based techniques
- Use Python to perform anomaly detection on one- and two-dimensional data
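As an illustration of the kind of one-dimensional detection covered this week, the following is a minimal sketch of extreme value analysis using z-scores. It is not taken from the course labs: the function name `zscore_outliers`, the sample `readings`, and the threshold of 2.5 are assumptions chosen for the example.

```python
import statistics


def zscore_outliers(values, threshold=3.0):
    """Return the indices of values whose absolute z-score exceeds threshold.

    Hypothetical helper for illustration; assumes roughly normal data.
    """
    mean = statistics.fmean(values)
    stdev = statistics.pstdev(values)  # population standard deviation
    if stdev == 0:
        return []  # all values identical, so nothing stands out
    return [i for i, v in enumerate(values) if abs(v - mean) / stdev > threshold]


# Made-up sensor readings with one obviously extreme value at the end.
readings = [10.1, 9.8, 10.0, 10.2, 9.9, 10.0, 10.1, 9.9, 10.0, 25.0]
print(zscore_outliers(readings, threshold=2.5))
```

In small samples a single extreme value inflates the standard deviation and so lowers its own z-score, which is why the example uses a threshold below the common default of 3.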
Week 3
See how to use linear models instead of probabilistic and geometric models.
- Apply linear regression models and principal component analysis
- Use support vector machines (SVMs) for anomaly detection
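One way to picture linear-model detection is to fit a line and flag the points it explains poorly. The sketch below fits ordinary least squares by hand and scores each point by its standardized residual. The name `residual_outliers`, the threshold of 2.0, and the sample data are assumptions for illustration, not the course's own implementation.

```python
import statistics


def residual_outliers(xs, ys, threshold=2.0):
    """Fit y = slope * x + intercept and flag points with large residuals.

    Hypothetical helper; assumes the xs are not all identical.
    """
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Residual: vertical distance between each observation and the fitted line.
    residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
    scale = statistics.pstdev(residuals)
    if scale == 0:
        return []  # perfect fit, so no point deviates from the line
    return [i for i, r in enumerate(residuals) if abs(r) / scale > threshold]


# Points on y = 2x + 1, except index 5, which is shifted up by 8.
xs = list(range(10))
ys = [1, 3, 5, 7, 9, 19, 13, 15, 17, 19]
print(residual_outliers(xs, ys))
```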
Week 4
Explore how to use additional methods based on distance to identify abnormal data.
- Describe proximity-based methods and the local outlier factor (LOF)
- Apply the k-nearest neighbors (KNN) algorithm and k-means clustering
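A simple proximity-based score is the distance from each point to its k-th nearest neighbor: isolated points get large scores. The following sketch uses only the standard library; the function name `knn_scores`, the choice of k=2, and the sample points are assumptions for illustration. It is a brute-force version and is not the local outlier factor, which additionally compares local densities.

```python
import math


def knn_scores(points, k=2):
    """Return, for each point, the distance to its k-th nearest neighbor.

    Hypothetical helper; requires k <= len(points) - 1.
    """
    scores = []
    for i, p in enumerate(points):
        # Distances from p to every other point, smallest first.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores


# Four points in a tight square plus one point far away.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (8, 8)]
scores = knn_scores(points, k=2)
most_anomalous = max(range(len(points)), key=scores.__getitem__)
print(most_anomalous, scores)
```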
Week 5
Learn how to work with difficult problems that involve high-dimensional data.
- Understand the difficulties with high-dimensional problems
- Apply the subspace method with feature bagging and the isolation forest algorithm
Week 6
Find out how to use supervised learning models and classification to detect anomalies.
- Implement cost-sensitive learning algorithms
- Apply adaptive resampling and boosting methods
Week 7
Explore how to classify temporal and streaming data.
- Implement statistical process control
- Apply streaming anomaly detection using autoregressive models
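Statistical process control can be sketched as a Shewhart-style control chart: estimate a center line and control limits from data known to be in control, then flag incoming values that fall outside those limits. The function names, the three-sigma limits, and the sample values below are assumptions for illustration only.

```python
import statistics


def control_limits(baseline, sigmas=3.0):
    """Compute (lower, upper) control limits from in-control baseline data."""
    center = statistics.fmean(baseline)
    spread = statistics.stdev(baseline)  # sample standard deviation
    return center - sigmas * spread, center + sigmas * spread


def out_of_control(stream, limits):
    """Yield (time step, value) for each value outside the control limits."""
    lower, upper = limits
    for t, value in enumerate(stream):
        if value < lower or value > upper:
            yield t, value


# Made-up baseline measurements, followed by a stream to monitor.
baseline = [50.1, 49.8, 50.0, 50.2, 49.9, 50.0, 50.1, 49.9]
stream = [50.0, 50.3, 49.7, 51.2, 50.1]

limits = control_limits(baseline)
print(list(out_of_control(stream, limits)))
```

Because `out_of_control` is a generator, it can consume values one at a time, which suits streaming data where the full series is never held in memory.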
Week 8
Measure the performance of an anomaly detection system.
- Evaluate different techniques and types of anomaly detection
- Analyze detection performance across a wide variety of data
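Because anomalies are rare, plain accuracy can look good even when a detector misses most of them, so precision, recall, and F1 are common choices for evaluation. The sketch below computes them from binary labels, where 1 marks an anomaly. The function name `precision_recall_f1` and the labels are made up for illustration.

```python
def precision_recall_f1(y_true, y_pred):
    """Return (precision, recall, F1) for binary labels, where 1 = anomaly."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # anomalies caught
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # anomalies missed

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Ten observations, two true anomalies; the detector finds one and
# raises one false alarm.
y_true = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]
y_pred = [0, 0, 1, 0, 0, 0, 0, 1, 0, 0]
print(precision_recall_f1(y_true, y_pred))
```

In this example the detector is right on 8 of 10 observations, yet it catches only half of the true anomalies, which is the gap these metrics expose.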