Introduction to Machine Learning
- Applications of Machine Learning
- Supervised Versus Unsupervised Learning
- Machine Learning Algorithms
- Regression
- Classification
- Clustering
- Recommender System
- Anomaly Detection
- Reinforcement Learning
Regression
- Simple & Multiple Regression
- Least Squares Method
- Estimating the Coefficients
- Assessing the Accuracy of the Coefficient Estimates
- Assessing the Accuracy of the Model
- Post Estimation Analysis
- Other Considerations in the Regression Models
- Qualitative Predictors
- Extensions of the Linear Models
- Potential Problems
- Bias-Variance Trade-Off [Under-fitting/Over-fitting] for Regression Models
Resampling Methods
- Cross-Validation
- The Validation Set Approach
- Leave-One-Out Cross-Validation
- k-Fold Cross-Validation
- Bias-Variance Trade-Off for k-Fold
- The Bootstrap
Model Selection and Regularization
- Subset Selection [Best Subset Selection, Stepwise Selection, Choosing the Optimal Model]
- Shrinkage Methods/Regularization [Ridge Regression, Lasso & Elastic Net]
- Selecting the Tuning Parameter
- Dimension Reduction Methods
- Principal Components Regression
- Partial Least Squares
Classification
- Logistic Regression
- The Logistic Model cost function
- Estimating the Coefficients
- Making Predictions
- Odds Ratio
- Performance Evaluation Metrics [Sensitivity/Specificity/PPV/NPV, Precision, ROC curve, etc.]
- Multiple Logistic Regression
- Logistic Regression for >2 Response Classes
- Regularized Logistic Regression
- Linear Discriminant Analysis
- Using Bayes’ Theorem for Classification
- Linear Discriminant Analysis for p = 1
- Linear Discriminant Analysis for p > 1
- Quadratic Discriminant Analysis
- K-Nearest Neighbors
- Classification with Non-linear Decision Boundaries
- Support Vector Machines
- Optimization Objective
- The Maximal Margin Classifier
- Kernels
- One-Versus-One Classification
- One-Versus-All Classification
- Comparison of Classification Methods
ANN Structure
- Biological neurons and artificial neurons
- Non-linear Hypothesis
- Model Representation
- Examples & Intuitions
- Transfer Function/Activation Functions
- Typical classes of network architectures
Feed-forward ANN
- Structures of Multi-layer feed forward networks
- Back propagation algorithm
- Back propagation - training and convergence
- Functional approximation with back propagation
- Practical and design issues of back propagation learning
Deep Learning
- Artificial Intelligence & Deep Learning
- Softmax Regression
- Self-Taught Learning
- Deep Networks
- Demos and Applications
Getting Started with R
- Introduction to R
- Basic Commands & Libraries
- Data Manipulation
- Importing & Exporting data
- Graphical and Numerical Summaries
- Writing functions
Regression
- Simple & Multiple Linear Regression
- Interaction Terms
- Non-linear Transformations
- Dummy variable regression
- Cross-Validation and the Bootstrap
- Subset selection methods
- Penalization [Ridge, Lasso, Elastic Net]
Classification
- Logistic Regression, LDA, QDA, and KNN
- Resampling & Regularization
- Support Vector Machine
- Resampling & Regularization
Note:
- For ML algorithms, case studies will be used to discuss their applications, advantages, and potential issues.
- Analysis of different data sets will be performed using R.
Basic knowledge of statistical concepts is desirable.
21 hours (usually 3 days including breaks)