DeepMind x UCL RL Lecture Series – Policy-Gradient and Actor-Critic methods [9/13]



Research Scientist Hado van Hasselt covers policy algorithms that can learn policies directly and actor-critic algorithms that …

