Machine Learning Training
Machine Learning Internals
- Machine learning is the science of getting computers to act without being explicitly programmed. In the past decade, machine learning has given us self-driving cars, practical speech recognition, effective web search, and a vastly improved understanding of the human genome. Machine learning is so pervasive today that you probably use it dozens of times a day without knowing it. Many researchers also think it is the best way to make progress towards human-level AI. This training session provides a deep dive into machine learning, data mining, and statistical pattern recognition. Topics include: (i) supervised learning (parametric/non-parametric algorithms, support vector machines, kernels, neural networks); (ii) unsupervised learning (clustering, dimensionality reduction, recommender systems, deep learning); (iii) best practices in machine learning (bias/variance theory; the innovation process in machine learning and AI). The course also draws on numerous case studies and applications, so you will learn how to apply learning algorithms to building smart robots (perception, control), text understanding (web search, anti-spam), computer vision, medical informatics, audio, database mining, and other areas.
- Measuring and Tuning performance of ML algorithms
- You'll not only learn the theoretical underpinnings of learning but also gain the practical know-how needed to quickly and powerfully apply these techniques to new problems.
- Most effective machine learning techniques
- You will learn how to prototype and then productionize models
- Best practices in innovation as it pertains to machine learning and AI
- Experience in Programming
- An understanding of introductory statistics would be helpful.
- Familiarity with probability theory, calculus, linear algebra, and statistics is required
- Model selection
- Supervised learning
- Discovering graph structure
- Types of machine learning
- Machine learning: what and why?
- Parametric vs non-parametric models
- No free lunch theorem
- Linear regression
- Some basic concepts in machine learning
- Discovering clusters
- Classification
- Regression
- Matrix completion
- Logistic regression
- Parametric models for classification and regression
- The curse of dimensionality
- Overfitting
- Unsupervised learning
- Discovering latent factors
- A simple non-parametric classifier: K-nearest neighbors
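To make the K-nearest neighbors item above concrete, here is a minimal NumPy sketch of a KNN classifier; the function names, the toy dataset, and the choice of k are illustrative assumptions, not part of the course material.

```python
# Minimal K-nearest-neighbors classifier sketch (NumPy only).
# The toy data and the value of k below are illustrative assumptions.
import numpy as np

def knn_predict(X_train, y_train, X_query, k=3):
    """Predict a class label for each query point by majority vote
    among its k nearest training points (Euclidean distance)."""
    preds = []
    for x in X_query:
        dists = np.linalg.norm(X_train - x, axis=1)      # distance to every training point
        nearest = np.argsort(dists)[:k]                   # indices of the k closest points
        labels, counts = np.unique(y_train[nearest], return_counts=True)
        preds.append(labels[np.argmax(counts)])           # majority vote
    return np.array(preds)

# Toy usage: two well-separated 2-D clusters
X = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]])
y = np.array([0, 0, 1, 1])
print(knn_predict(X, y, np.array([[0.05, 0.1], [0.95, 1.0]]), k=3))  # -> [0 1]
```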
- Predictive Data Analytics Tools
- How Does Machine Learning Work?
- The Road Ahead
- What Can Go Wrong with Machine Learning?
- The Predictive Data Analytics Project Lifecycle: CRISP-DM
- What Is Machine Learning?
- What Is Predictive Data Analytics?
- Different Types of Data
- Different Types of Features
- Designing the Analytics Base Table
- Designing and Implementing Features
- Assessing Feasibility
- Converting Business Problems into Analytics Solutions
- Case Study: Motor Insurance Fraud
- Implementing Features
- Handling Time
- Outliers
- Handling Missing Values
- Handling Outliers
- Missing Values
- Irregular Cardinality
- Handling Data Quality Issues
- The Data Quality Report
- The Normal Distribution
- Identifying Data Quality Issues
- Getting to Know the Data
- Advanced Data Exploration
- Measuring Covariance and Correlation
- Visualizing Relationships Between Features
- Binning
- Data Preparation
- Normalization
- Shannon’s Entropy Model
- Handling Continuous Descriptive Features
- Decision Trees
- Predicting Continuous Targets
- Extensions and Variations
- Fundamentals
- Information Gain
- Big Idea
- Standard Approach: The ID3 Algorithm
- Tree Pruning
- Alternative Feature Selection and Impurity Metrics
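As a small illustration of the Shannon entropy and information-gain topics above (the quantities an ID3-style tree learner uses to pick splits), here is a hedged NumPy sketch; the weather-style feature and target values are made-up example data.

```python
# Illustrative computation of Shannon entropy and information gain for a
# categorical split, as used by ID3-style decision tree induction.
import numpy as np

def entropy(labels):
    """Shannon entropy H(t) = -sum_i p_i log2 p_i of a label vector."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def information_gain(feature, labels):
    """Entropy of the target minus the weighted entropy that remains
    after splitting on each level of a categorical descriptive feature."""
    remainder = 0.0
    for level in np.unique(feature):
        mask = feature == level
        remainder += mask.mean() * entropy(labels[mask])
    return entropy(labels) - remainder

feature = np.array(["sunny", "sunny", "rain", "rain", "rain"])
target  = np.array(["no",    "no",    "yes",  "yes",  "no"])
print(round(information_gain(feature, target), 3))   # ~0.42 for this toy data
```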
- Standard Approach: The Nearest Neighbor Algorithm
- Predicting Continuous Targets
- Fundamentals
- Other Measures of Similarity
- Extensions and Variations
- Data Normalization
- Feature Space
- Big Idea
- Measuring Similarity Using Distance Metrics
- Feature Selection
- Handling Noisy Data
- Efficient Memory Search
- Big Idea
- Smoothing
- Extensions and Variations
- Bayes’ Theorem
- Bayesian Networks
- Continuous Features: Probability Density Functions
- Continuous Features: Binning
- Bayesian Prediction
- Conditional Independence and Factorization
- Fundamentals
- Standard Approach: The Naive Bayes Model
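The following sketch illustrates the naive Bayes topics above: class priors, per-feature conditional probabilities with Laplace-style smoothing, and prediction by maximising the posterior. The toy spam-flavoured dataset, the function names, and the smoothing constant are assumptions made for illustration.

```python
# Minimal categorical naive Bayes sketch: P(t|d) is proportional to
# P(t) * prod_k P(d_k|t), with Laplace smoothing. Toy data is an assumption.
import numpy as np

def fit_naive_bayes(X, y, alpha=1.0):
    """Return classes, log-priors, and per-feature conditional log-prob tables."""
    classes = np.unique(y)
    log_prior = {c: np.log(np.mean(y == c)) for c in classes}
    cond = {}  # (feature index, class) -> {level: log P(level | class)}
    for j in range(X.shape[1]):
        levels = np.unique(X[:, j])
        for c in classes:
            col = X[y == c, j]
            cond[(j, c)] = {v: np.log((np.sum(col == v) + alpha) /
                                      (len(col) + alpha * len(levels)))
                            for v in levels}
    return classes, log_prior, cond

def predict(x, classes, log_prior, cond):
    """Pick the class with the largest posterior score for one example x."""
    scores = {c: log_prior[c] + sum(cond[(j, c)][v] for j, v in enumerate(x))
              for c in classes}
    return max(scores, key=scores.get)

X = np.array([["free", "yes"], ["free", "no"], ["work", "no"], ["work", "no"]])
y = np.array(["spam", "spam", "ham", "ham"])
model = fit_naive_bayes(X, y)
print(predict(["free", "yes"], *model))   # -> "spam"
```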
- Setting the Learning Rate Using Weight Decay
- Error Surfaces
- Multinomial Logistic Regression
- Modeling Non-linear Relationships
- Handling Categorical Descriptive Features
- Interpreting Multivariable Linear Regression Models
- Simple Linear Regression
- Big Idea
- Handling Categorical Target Features: Logistic Regression
- Extensions and Variations
- Fundamentals
- Choosing Learning Rates and Initial Weights
- Standard Approach: Multivariable Linear Regression with Gradient Descent
- Gradient Descent
- Multivariable Linear Regression
- Measuring Error
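Below is a minimal sketch of multivariable linear regression fitted with batch gradient descent on the sum-of-squared-errors surface, tying together several of the items above. The learning rate, iteration count, and synthetic data are illustrative choices only.

```python
# Multivariable linear regression via batch gradient descent (NumPy only).
# Hyperparameters and synthetic data are example values, not prescriptions.
import numpy as np

def fit_linear_regression(X, y, learning_rate=0.1, n_iters=2000):
    """Minimise L(w) = (1/2n) * ||Xw - y||^2 with a bias column prepended."""
    Xb = np.hstack([np.ones((X.shape[0], 1)), X])    # add intercept feature
    w = np.zeros(Xb.shape[1])
    for _ in range(n_iters):
        grad = Xb.T @ (Xb @ w - y) / len(y)          # gradient on the error surface
        w -= learning_rate * grad                    # step downhill
    return w

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(100, 2))
y = 3.0 + 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(0, 0.05, size=100)
print(fit_linear_regression(X, y))   # approximately [3.0, 2.0, -1.0]
```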
- Performance Measures: Prediction Scores
- Designing Evaluation Experiments
- Evaluating Models after Deployment
- Performance Measures: Multinomial Targets
- Extensions and Variations
- Fundamentals
- Performance Measures: Continuous Targets
- Performance Measures: Categorical Targets
- Big Idea
- Standard Approach: Misclassification Rate on a Hold-out Test Set
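A short sketch of the evaluation idea listed above, misclassification rate on a hold-out test set: split the data, fit on the training part, and measure the error rate on the unseen part. The 70/30 split and the random seed are arbitrary example values.

```python
# Hold-out evaluation sketch: train/test split plus misclassification rate.
import numpy as np

def train_test_split(X, y, test_fraction=0.3, seed=0):
    """Randomly partition the data into training and hold-out test sets."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    n_test = int(len(y) * test_fraction)
    test, train = idx[:n_test], idx[n_test:]
    return X[train], y[train], X[test], y[test]

def misclassification_rate(y_true, y_pred):
    """Fraction of hold-out examples the model labels incorrectly."""
    return np.mean(y_true != y_pred)
```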
- MATLAB
- H2O
- Spark ML/MLlib
- Octave
- Regularization effects of big data
- Bayesian inference when σ² is unknown *
- Model specification
- Numerically stable computation *
- Computing the posterior
- Geometric interpretation
- Convexity
- Connection with PCA *
- Maximum likelihood estimation (least squares)
- Bayesian linear regression
- Derivation of the MLE
- Computing the posterior predictive
- EB for linear regression (evidence procedure)
- Ridge regression
- Basic idea
- Robust linear regression *
- Introduction
- Residual analysis (outlier detection) *
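As a concrete instance of the ridge regression topic above, here is the closed-form penalised least-squares estimate w = (XᵀX + λI)⁻¹Xᵀy; leaving the bias term unpenalised and the value of the penalty strength are illustrative conventions, not requirements.

```python
# Ridge regression sketch: closed-form penalised least squares.
# The penalty strength lam is an example value.
import numpy as np

def ridge_regression(X, y, lam=1.0):
    """Solve (Xb' Xb + lam * I) w = Xb' y, with an unpenalised intercept."""
    Xb = np.hstack([np.ones((X.shape[0], 1)), X])
    penalty = lam * np.eye(Xb.shape[1])
    penalty[0, 0] = 0.0                      # leave the bias term unregularised
    return np.linalg.solve(Xb.T @ Xb + penalty, Xb.T @ y)
```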
- Generative vs discriminative classifier
- Multi-class logistic regression
- Online learning and regret minimization
- Iteratively reweighted least squares (IRLS)
- Quasi-Newton (variable metric) methods
- Newton’s method
- Bayesian logistic regression
- A Bayesian view
- Laplace approximation
- l2 regularization
- Gaussian approximation for logistic regression
- Approximating the posterior predictive
- Derivation of the BIC
- Steepest descent
- Introduction
- MLE
- Model specification
- Online learning and stochastic optimization
- Dealing with missing data
- Fisher’s linear discriminant analysis (FLDA) *
- Model fitting
- Stochastic optimization and risk minimization
- Pros and cons of each approach
- The LMS algorithm
- Logistic regression
- The perceptron algorithm
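The sketch below fits binary logistic regression by steepest descent on the negative log-likelihood, the simplest of the optimisation schemes listed above (IRLS and Newton's method converge faster in practice). The function names, step size, and iteration count are example choices.

```python
# Binary logistic regression fitted by steepest descent on the NLL.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic_regression(X, y, lr=0.5, n_iters=5000):
    """y must be coded 0/1. Returns weights including a bias term."""
    Xb = np.hstack([np.ones((X.shape[0], 1)), X])
    w = np.zeros(Xb.shape[1])
    for _ in range(n_iters):
        mu = sigmoid(Xb @ w)                 # predicted P(y=1 | x)
        grad = Xb.T @ (mu - y) / len(y)      # gradient of the negative log-likelihood
        w -= lr * grad
    return w

def predict_proba(X, w):
    """Class-1 probabilities for new inputs under the fitted weights."""
    Xb = np.hstack([np.ones((X.shape[0], 1)), X])
    return sigmoid(Xb @ w)
```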
- Introduction
- Maximum entropy derivation of the exponential family *
- Ordinal probit regression *
- Generalized linear mixed models *
- Examples
- Semi-parametric GLMMs for medical data
- The pointwise approach
- Computational issues
- Application to domain adaptation
- ML and MAP estimation
- Learning to rank *
- ML/MAP estimation using gradient-based optimization
- Multinomial probit models *
- Probit regression
- Bayesian inference
- Generalized linear models (GLMs)
- Other kinds of prior
- Basics
- The exponential family
- The pairwise approach
- Log partition function
- Loss functions for ranking
- Bayes for the exponential family *
- Definition
- Hierarchical Bayes for multi-task learning
- Application to personalized email spam filtering
- The listwise approach
- Latent variable interpretation
- MLE for the exponential family
- Multi-task learning
- Chain rule
- Markov and hidden Markov models
- Introduction
- Graph terminology
- Naive Bayes classifiers
- d-separation and the Bayes Ball algorithm (global Markov properties)
- Learning with missing and/or latent variables
- Conditional independence
- Other Markov properties of DGMs
- Inference
- Genetic linkage analysis *
- Directed Gaussian graphical models *
- Influence (decision) diagrams *
- Learning
- Graphical models
- Directed graphical models
- Markov blanket and full conditionals
- Plate notation
- Learning from complete data
- Conditional independence properties of DGMs
- Other estimation principles *
- Principal components analysis (PCA)
- Probabilistic PCA
- Using EM
- The FastICA algorithm
- FA is a low rank parameterization of an MVN
- Fitting FA models with missing data
- Unidentifiability
- Choosing the number of latent dimensions
- Partial least squares
- Singular value decomposition (SVD)
- EM for factor analysis models
- Mixtures of factor analysers
- EM algorithm for PCA
- Model selection for FA/PPCA
- Supervised PCA (latent factor regression)
- Canonical correlation analysis
- PCA for categorical data
- Maximum likelihood estimation
- PCA for paired and multi-view data
- Inference of the latent factors
- Classical PCA: statement of the theorem
- Model selection for PCA
- Independent Component Analysis (ICA)
- Factor analysis
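To illustrate the classical PCA material above, here is a small sketch that computes principal components via the SVD of the centred data matrix; the number of retained components is an example choice.

```python
# Classical PCA via the singular value decomposition of the centred data.
import numpy as np

def pca(X, n_components=2):
    """Return latent factor scores, the principal directions, and the
    variance explained along each retained direction."""
    X_centered = X - X.mean(axis=0)
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]           # principal directions (rows)
    scores = X_centered @ components.T       # projections onto those directions
    explained_var = (S**2) / (len(X) - 1)    # variance along each direction
    return scores, components, explained_var[:n_components]
```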
- Smoothing kernels
- Kernels for comparing documents
- The kernel trick
- SVMs for classification
- Kernelized ridge regression
- SVMs for regression
- Linear kernels
- Kernel machines
- Introduction
- Comparison of discriminative kernel methods
- Kernelized nearest neighbor classification
- Using kernels inside GLMs
- A probabilistic interpretation of SVMs
- Kernel functions
- RBF kernels
- Mercer (positive definite) kernels
- Kernel density estimation (KDE)
- Kernel PCA
- String kernels
- L1VMs, RVMs, and other sparse vector machines
- Kernels for building generative models
- Choosing C
- Kernelized K-medoids clustering
- Pyramid match kernels
- Kernel regression
- Kernels derived from probabilistic generative models
- Locally weighted regression
- Summary of key points
- Support vector machines (SVMs)
- From KDE to KNN
- Matern kernels
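The following sketch combines two of the kernel topics above: an RBF (Gaussian) kernel and kernelised ridge regression in its dual form, alpha = (K + λI)⁻¹y. The bandwidth and regularisation values are illustrative assumptions.

```python
# RBF kernel plus kernelised ridge regression (dual form), NumPy only.
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """K[i, j] = exp(-gamma * ||a_i - b_j||^2)."""
    sq_dists = (np.sum(A**2, axis=1)[:, None]
                + np.sum(B**2, axis=1)[None, :]
                - 2 * A @ B.T)
    return np.exp(-gamma * sq_dists)

def kernel_ridge_fit(X, y, lam=0.1, gamma=1.0):
    """Solve (K + lam * I) alpha = y for the dual coefficients."""
    K = rbf_kernel(X, X, gamma)
    return np.linalg.solve(K + lam * np.eye(len(y)), y)

def kernel_ridge_predict(X_train, alpha, X_new, gamma=1.0):
    """Predict with f(x) = sum_i alpha_i * k(x, x_i)."""
    return rbf_kernel(X_new, X_train, gamma) @ alpha
```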
- Agglomerative clustering
- The Dirichlet process
- Clustering datapoints and features
- Graph Laplacian
- Evaluating the output of clustering methods *
- Dirichlet process mixture models
- Multi-view clustering
- Spectral clustering
- Applying Dirichlet processes to mixture modeling
- Biclustering
- Fitting a DP mixture model
- Choosing the number of clusters
- Measuring (dis)similarity
- Affinity propagation
- Introduction
- From finite to infinite mixture models
- Bayesian hierarchical clustering
- Normalized graph Laplacian
- Hierarchical clustering
- Divisive clustering
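As a minimal illustration of agglomerative (bottom-up) clustering with single linkage from the list above, here is a deliberately naive NumPy sketch that merges the two closest clusters until the requested number remains; the stopping count is an example parameter.

```python
# Agglomerative clustering with single linkage, written for clarity rather
# than efficiency (it rescans all cluster pairs on every merge).
import numpy as np

def agglomerative_clustering(X, n_clusters=2):
    clusters = [[i] for i in range(len(X))]          # start: one point per cluster
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    while len(clusters) > n_clusters:
        best = (None, None, np.inf)
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # single linkage: distance between the two closest members
                d = min(D[i, j] for i in clusters[a] for j in clusters[b])
                if d < best[2]:
                    best = (a, b, d)
        a, b, _ = best
        clusters[a] += clusters[b]                    # merge the closest pair
        del clusters[b]
    labels = np.empty(len(X), dtype=int)
    for k, members in enumerate(clusters):
        labels[members] = k
    return labels
```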
- Learning image features using 2d convolutional DBNs
- Deep generative models
- Information retrieval using deep auto-encoders (semantic hashing)
- Deep directed networks
- Data visualization and feature discovery using deep auto-encoders
- Deep Boltzmann machines
- Applications of deep networks
- Stacked denoising auto-encoders
- Learning audio features using 1d convolutional DBNs
- Greedy layer-wise learning of DBNs
- Deep neural networks
- Deep belief networks
- Deep multi-layer perceptrons
- Deep auto-encoders
- Introduction
- Handwritten digit classification using DBNs
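To ground the auto-encoder items above, here is a tiny single-hidden-layer auto-encoder trained by plain gradient descent; real deep auto-encoders stack several such layers, often with greedy layer-wise pre-training, and the layer sizes and learning rate here are illustrative assumptions.

```python
# Tiny auto-encoder sketch: tanh encoder, linear decoder, squared-error
# reconstruction loss, trained by full-batch gradient descent.
import numpy as np

def train_autoencoder(X, n_hidden=2, lr=0.1, n_iters=5000, seed=0):
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(0, 0.1, (d, n_hidden));  b1 = np.zeros(n_hidden)
    W2 = rng.normal(0, 0.1, (n_hidden, d));  b2 = np.zeros(d)
    for _ in range(n_iters):
        H = np.tanh(X @ W1 + b1)             # encoder: compress to n_hidden units
        X_hat = H @ W2 + b2                  # decoder: linear reconstruction
        err = (X_hat - X) / n                # gradient of 0.5 * mean squared error
        dW2 = H.T @ err;   db2 = err.sum(axis=0)
        dH = err @ W2.T * (1 - H**2)         # backpropagate through tanh
        dW1 = X.T @ dH;    db1 = dH.sum(axis=0)
        W1 -= lr * dW1; b1 -= lr * db1
        W2 -= lr * dW2; b2 -= lr * db2
    return W1, b1, W2, b2
```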