Master statistics & machine learning: intuition, math, code

BY
Udemy

Develop a deep understanding of Python and MATLAB for operations involving statistics and machine learning.

Mode

Online

Fees

₹ 3,499

Quick Facts

  • Medium of instructions: English
  • Mode of learning: Self study
  • Mode of delivery: Video and text based

Course overview

The Master statistics & machine learning: intuition, math, code certification course is created by Mike X Cohen, a neuroscientist, writer, and instructor, and is made available through Udemy for students who want to learn the core principles of machine learning and statistics. The Master statistics & machine learning: intuition, math, code online course by Udemy focuses on how to use tools such as Python, Octave, and MATLAB for machine learning and mathematical operations, including statistics.

The Master statistics & machine learning: intuition, math, code online classes aim to teach the concepts of statistics and machine learning, which act as the foundational principles of A.I. and business intelligence. The course comprises more than 38.5 hours of prerecorded lectures, as well as four downloadable resources and three articles, covering topics such as correlation, regression, clustering, classification, ANOVA, black box statistics, descriptive statistics, predictive analysis, signal detection theory, probability theory, data cleaning, data visualization, and data normalization.

The highlights

  • Certificate of completion
  • Self-paced course
  • 38.5 hours of pre-recorded video content
  • 3 articles 
  • 4 downloadable resources

Program offerings

  • Online course
  • Learning resources
  • 30-day money-back guarantee
  • Unlimited access
  • Accessible on mobile devices and TV

Course and certificate fees

Fees information
₹ 3,499
Certificate availability

Yes

Certificate providing authority

Udemy

What you will learn

  • Mathematical skill
  • Machine learning
  • Knowledge of Python
  • Knowledge of MATLAB
  • Knowledge of data visualization

After completing the Master statistics & machine learning: intuition, math, code online certification, students will understand how Python, MATLAB, and Octave are used for math, statistics, and machine learning. They will learn machine learning techniques such as predictive analysis, clustering, data cleaning, classification, data normalization, and data visualization; testing procedures including ANOVA, correlation, t-tests, regression, and hypothesis testing; and statistical principles such as black box statistics, descriptive statistics, and inferential statistics.

The syllabus

Introductions

  • [Important] Getting the most out of this course
  • About using MATLAB or Python
  • Statistics guessing game!
  • Using the Q&A forum
  • (optional) Entering time-stamped notes in the Udemy video player

Math prerequisites

  • Should you memorize statistical formulas?
  • Arithmetic and exponents
  • Scientific notation
  • Summation notation
  • Absolute value
  • Natural exponent and logarithm
  • The logistic function
  • Rank and tied-rank
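
As a taste of the math prerequisites listed above, here is a minimal Python sketch (not part of the course materials) of the natural exponent, logarithm, and logistic function; the sample values are arbitrary.

    import numpy as np

    x = np.linspace(-5, 5, 11)            # arbitrary sample points

    # Natural exponent and logarithm are inverses of each other
    assert np.allclose(np.log(np.exp(x)), x)

    # The logistic function maps any real number into (0, 1)
    logistic = 1 / (1 + np.exp(-x))
    print(logistic)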

Important: Download course materials

  • Download materials for the entire course!

What are (is?) data?

  • Is "data" singular or plural?!?!!?!
  • Where do data come from and what do they mean?
  • Types of data: categorical, numerical, etc
  • Code: representing types of data on computers
  • Sample vs. population data
  • Samples, case reports, and anecdotes
  • The ethics of making up data

Visualizing data

  • Bar plots
  • Code: bar plots
  • Box-and-whisker plots
  • Code: box plots
  • "Unsupervised learning": Boxplots of normal and uniform noise
  • Histograms
  • Code: histograms
  • "Unsupervised learning": Histogram proportion
  • Pie charts
  • Code: pie charts
  • When to use lines instead of bars
  • Linear vs. logarithmic axis scaling
  • Code: line plots
  • "Unsupervised learning": log-scaled plots

Descriptive statistics

  • Descriptive vs. inferential statistics
  • Accuracy, precision, resolution
  • Data distributions
  • Code: data from different distributions
  • "Unsupervised learning": histograms of distributions
  • The beauty and simplicity of Normal
  • Measures of central tendency (mean)
  • Measures of central tendency (median, mode)
  • Code: computing central tendency
  • "Unsupervised learning": central tendencies with outliers
  • Measures of dispersion (variance, standard deviation)
  • Code: Computing dispersion
  • Interquartile range (IQR)
  • Code: IQR
  • QQ plots
  • Code: QQ plots
  • Statistical "moments"
  • Histograms part 2: Number of bins
  • Code: Histogram bins
  • Violin plots
  • Code: violin plots
  • "Unsupervised learning": asymmetric violin plots
  • Shannon entropy
  • Code: entropy
  • "Unsupervised learning": entropy and number of bins

Data normalizations and outliers

  • Garbage in, garbage out (GIGO)
  • Z-score standardization
  • Code: z-score
  • Min-max scaling
  • Code: min-max scaling
  • "Unsupervised learning": Invert the min-max scaling
  • What are outliers and why are they dangerous?
  • Removing outliers: z-score method
  • The modified z-score method
  • Code: z-score for outlier removal
  • "Unsupervised learning": z vs. modified-z
  • Multivariate outlier detection
  • Code: Euclidean distance for outlier removal
  • Removing outliers by data trimming
  • Code: Data trimming to remove outliers
  • Non-parametric solutions to outliers
  • Nonlinear data transformations
  • An outlier lecture on personal accountability
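
An illustrative sketch of the normalization and outlier topics above, assuming NumPy; the |z| > 3 cutoff is a common convention, not necessarily the one used in the lectures.

    import numpy as np

    x = np.append(np.random.randn(100), 8.0)      # sample with one planted outlier

    # Z-score standardization: zero mean, unit standard deviation
    z = (x - x.mean()) / x.std(ddof=1)

    # Min-max scaling: squeeze values into [0, 1]
    x_minmax = (x - x.min()) / (x.max() - x.min())

    # Flag outliers with the z-score method (|z| > 3 is a common cutoff)
    outliers = np.abs(z) > 3
    print("flagged outliers:", x[outliers])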

Probability theory

  • What is probability?
  • Probability vs. proportion
  • Computing probabilities
  • Code: compute probabilities
  • Probability and odds
  • "Unsupervised learning": probabilities of odds-space
  • Probability mass vs. density
  • Code: compute probability mass functions
  • Cumulative distribution functions
  • Code: cdfs and pdfs
  • "Unsupervised learning": cdf's for various distributions
  • Creating sample estimate distributions
  • Monte Carlo sampling
  • Sampling variability, noise, and other annoyances
  • Code: sampling variability
  • Expected value
  • Conditional probability
  • Code: conditional probabilities
  • Tree diagrams for conditional probabilities
  • The Law of Large Numbers
  • Code: Law of Large Numbers in action
  • The Central Limit Theorem
  • Code: the CLT in action
  • "Unsupervised learning": Averaging pairs of numbers

Hypothesis testing

  • IVs, DVs, models, and other stats lingo
  • What is an hypothesis and how do you specify one?
  • Sample distributions under null and alternative hypotheses
  • P-values: definition, tails, and misinterpretations
  • P-z combinations that you should memorize
  • Degrees of freedom
  • Type 1 and Type 2 errors
  • Parametric vs. non-parametric tests
  • Multiple comparisons and Bonferroni correction
  • Statistical vs. theoretical vs. clinical significance
  • Cross-validation
  • Statistical significance vs. classification accuracy

The t-test family

  • Purpose and interpretation of the t-test
  • One-sample t-test
  • Code: One-sample t-test
  • "Unsupervised learning": The role of variance
  • Two-samples t-test
  • Code: Two-samples t-test
  • "Unsupervised learning": Importance of N for t-test
  • Wilcoxon signed-rank (nonparametric t-test)
  • Code: Signed-rank test
  • Mann-Whitney U test (nonparametric t-test)
  • Code: Mann-Whitney U test
  • Permutation testing for t-test significance
  • Code: permutation testing
  • "Unsupervised learning": How many permutations?

Confidence intervals on parameters

  • What are confidence intervals and why do we need them?
  • Computing confidence intervals via formula
  • Code: compute confidence intervals by formula
  • Confidence intervals via bootstrapping (resampling)
  • Code: bootstrapping confidence intervals
  • "Unsupervised learning:" Confidence intervals for variance
  • Misconceptions about confidence intervals

Correlation

  • Motivation and description of correlation
  • Covariance and correlation: formulas
  • Code: correlation coefficient
  • Code: Simulate data with specified correlation
  • Correlation matrix
  • Code: correlation matrix
  • "Unsupervised learning": average correlation matrices
  • "Unsupervised learning": correlation to covariance matrix
  • Partial correlation
  • Code: partial correlation
  • The problem with Pearson
  • Nonparametric correlation: Spearman rank
  • Fisher-Z transformation for correlations
  • Code: Spearman correlation and Fisher-Z
  • "Unsupervised learning": Spearman correlation
  • "Unsupervised learning": confidence interval on correlation
  • Kendall's correlation for ordinal data
  • Code: Kendall correlation
  • "Unsupervised learning": Does Kendall vs. Pearson matter?
  • The subgroups correlation paradox
  • Cosine similarity
  • Code: Cosine similarity vs. Pearson correlation
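
An illustrative sketch of Pearson and Spearman correlation plus the Fisher-Z transform mentioned above, using SciPy and NumPy on simulated data.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(3)
    x = rng.normal(size=100)
    y = 0.6 * x + rng.normal(size=100)             # correlated by construction

    r_pearson, p_pearson = stats.pearsonr(x, y)
    r_spearman, p_spearman = stats.spearmanr(x, y)
    print("Pearson r:", r_pearson, "Spearman rho:", r_spearman)

    # Fisher-Z transform: arctanh(r) makes correlation values approximately normal
    print("Fisher-Z of Pearson r:", np.arctanh(r_pearson))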

Analysis of Variance (ANOVA)

  • ANOVA intro, part 1
  • ANOVA intro, part 2
  • Sum of squares
  • The F-test and the ANOVA table
  • The omnibus F-test and post-hoc comparisons
  • The two-way ANOVA
  • One-way ANOVA example
  • Code: One-way ANOVA (independent samples)
  • Code: One-way repeated-measures ANOVA
  • Two-way ANOVA example
  • Code: Two-way mixed ANOVA
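
A small sketch of a one-way ANOVA on three simulated groups, corresponding to the topics above (SciPy; illustrative data only).

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(4)
    g1 = rng.normal(0.0, 1, 30)
    g2 = rng.normal(0.5, 1, 30)
    g3 = rng.normal(1.0, 1, 30)

    # One-way ANOVA (independent samples): F statistic and p-value
    f_stat, p_value = stats.f_oneway(g1, g2, g3)
    print(f"F = {f_stat:.2f}, p = {p_value:.4f}")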

Regression

  • Introduction to GLM / regression
  • Least-squares solution to the GLM
  • Evaluating regression models: R2 and F
  • Simple regression
  • Code: simple regression
  • "Unsupervised learning": Compute R2 and F
  • Multiple regression
  • Standardizing regression coefficients
  • Code: Multiple regression
  • Polynomial regression models
  • Code: polynomial modeling
  • "Unsupervised learning": Polynomial design matrix
  • Logistic regression
  • Code: Logistic regression
  • Under- and over-fitting
  • "Unsupervised learning": Overfit data
  • Comparing "nested" models
  • What to do about missing data
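
A minimal sketch of simple regression and R-squared in Python, matching the regression topics above (NumPy only; simulated data).

    import numpy as np

    rng = np.random.default_rng(5)
    x = np.linspace(0, 10, 50)
    y = 2.0 * x + 1.0 + rng.normal(scale=2, size=x.size)   # y = 2x + 1 plus noise

    # Least-squares fit of a straight line (the simplest GLM)
    slope, intercept = np.polyfit(x, y, deg=1)
    y_hat = slope * x + intercept

    # R^2: proportion of variance explained by the model
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    r_squared = 1 - ss_res / ss_tot
    print(f"slope={slope:.2f}, intercept={intercept:.2f}, R^2={r_squared:.3f}")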

Statistical power and sample sizes

  • What is statistical power and why is it important?
  • Estimating statistical power and sample size
  • Compute power and sample size using G*Power

Clustering and dimension-reduction

  • K-means clustering
  • Code: k-means clustering
  • "Unsupervised learning:" K-means and normalization
  • "Unsupervised learning:" K-means on a Gauss blur
  • Clustering via dbscan
  • Code: dbscan
  • "Unsupervised learning": dbscan vs. k-means
  • K-nearest neighbor classification
  • Code: KNN
  • Principal components analysis (PCA)
  • Code: PCA
  • "Unsupervised learning:" K-means on PC data
  • Independent components analysis (ICA)
  • Code: ICA
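
A hedged sketch of k-means and PCA using scikit-learn (a common choice, though the course demonstrates its own Python and MATLAB implementations); the data are simulated 2-D blobs.

    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.decomposition import PCA

    rng = np.random.default_rng(6)
    # Two illustrative 2-D clusters
    X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(5, 1, (50, 2))])

    # K-means with k=2 clusters
    labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

    # PCA: project onto the first principal component
    X_pc = PCA(n_components=1).fit_transform(X)

    print("cluster sizes:", np.bincount(labels))
    print("variance along PC1:", X_pc.var())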

Signal detection theory

  • The two perspectives of the world
  • d-prime
  • Code: d-prime
  • Response bias
  • Code: Response bias
  • F-score
  • Receiver operating characteristics (ROC)
  • Code: ROC curves
  • "Unsupervised learning": Make this plot look nicer!

A real-world data journey

  • Note about the code for this section
  • Introduction
  • MATLAB: Import and clean the marriage data
  • MATLAB: Import the divorce data
  • MATLAB: More data visualizations
  • MATLAB: Inferential statistics
  • Python: Import and clean the marriage data
  • Python: Import the divorce data
  • Python: Inferential statistics
  • Take-home messages

Bonus section

  • About deep learning
  • Bonus content

Instructors

Mr Mike X Cohen
Associate Professor
Freelancer
