Reinforcement Learning beginner to master AI in Python

Udemy

Beginner

Online

₹ 3099

Medium of instructions: English
Mode of learning: Self study
Mode of Delivery: Video and Text Based

Welcome module

[IMPORTANT] English captions available for sections 1-4
Welcome
Course Structure
Environment setup [Important]
Setup - Mac

The Markov decision process (MDP)

The Markov decision process (MDP)
Types of Markov decision process
Trajectory vs episode
Reward vs Return
Discount factor
Policy
State values v(s) and action values q(s,a)
Bellman equations
Solving a Markov decision process
Setup - MDP in code
MDP in code - Part 1
MDP in code - Part 2

Dynamic Programming

Introduction to Dynamic Programming
Value iteration
Setup - Value iteration
Coding - Value iteration 1
Coding - Value iteration 2
Coding - Value iteration 3
Coding - Value iteration 4
Coding - Value iteration 5
Policy iteration
Setup - Policy iteration
Coding - Policy iteration 1
Policy evaluation
Coding - Policy iteration 2
Policy Improvement
Coding - Policy iteration 3
Coding - Policy iteration 4
Policy iteration in practice
Generalized Policy Iteration (GPI)

Monte Carlo methods

Monte Carlo methods
Solving control tasks with Monte Carlo methods
On-policy Monte Carlo control
Setup - On-policy Monte Carlo control
Coding - On-policy Monte Carlo control 1
Coding - On-policy Monte Carlo control 2
Coding - On-policy Monte Carlo control 3
Setup - Constant alpha Monte Carlo
Coding - Constant alpha Monte Carlo
Off-policy Monte Carlo control
Setup - Off-policy Monte Carlo control
Coding - Off-policy Monte Carlo 1
Coding - Off-policy Monte Carlo 2
Coding - Off-policy Monte Carlo 3

Temporal difference methods

Temporal difference methods
Solving control tasks with temporal difference methods
Monte Carlo vs temporal difference methods
SARSA
Setup - SARSA
Coding - SARSA 1
Coding - SARSA 2
Q-Learning
Setup - Q-Learning
Coding - Q-Learning 1
Coding - Q-Learning 2
Advantages of temporal difference methods

N-step bootstrapping

N-step temporal difference methods
Where do n-step methods fit?
Effect of changing n
N-step SARSA
N-step SARSA in action
Setup - n-step SARSA
Coding - n-step SARSA

Continuous state spaces

Setup - Classic control tasks
Coding - Classic control tasks
Working with continuous state spaces
State aggregation
Setup - Continuous state spaces
Coding - State aggregation 1
Coding - State aggregation 2
Coding - State aggregation 3
Tile coding
Coding - Tile coding 1
Coding - Tile coding 2
Coding - Tile coding 3

Brief introduction to neural networks

Function approximators
Artificial Neural Networks
Artificial Neurons
How to represent a Neural Network
Stochastic Gradient Descent
Neural Network optimization

Deep SARSA

Deep SARSA
Neural Network optimization (Deep Q-Network)
Experience Replay
Target Network
Coding - Deep SARSA 1
Coding - Deep SARSA 2
Coding - Deep SARSA 3
Coding - Deep SARSA 4
Coding - Deep SARSA 5
Coding - Deep SARSA 6
Coding - Deep SARSA 7
Coding - Deep SARSA 8
Coding - Deep SARSA 9
Coding - Deep SARSA 10

Deep Q-Learning

Deep Q-Learning
Setup - Deep Q-Learning
Coding - Deep Q-Learning 1
Coding - Deep Q-Learning 2
Coding - Deep Q-Learning 3

REINFORCE

Policy gradient methods
Representing policies using neural networks
Policy performance
The policy gradient theorem
REINFORCE
Parallel learning
Entropy regularization
REINFORCE 2
Coding - REINFORCE 1
Coding - REINFORCE 2
Coding - REINFORCE 3
Coding - REINFORCE 4
Coding - REINFORCE 5

Advantage Actor-Critic (A2C)

A2C
Setup - A2C
Coding - A2C 1
Coding - A2C 2
Coding - A2C 3
Coding - A2C 4

Outro

Looking back
Next steps
