Practical_RL: Reinforcement learning for seq2seq (pytorch, tensorflow, theano)

Jan. 14, 2018, 3:59 a.m. By: Kirti Bakshi


Over the last few decades, Machine Learning methods have gone through a steep ascent. Given enough labelled data, one can teach an algorithm to comprehend, find objects in images, translate natural language, generate text and voice, or retrieve information from the internet at or near superhuman levels. The catch is that not every problem can be framed as learning an X -> y mapping that approximates some reference labels.
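To make the X -> y framing concrete, here is a minimal supervised-learning sketch; the data is made up, and sklearn is used only because the course lists it among its prerequisites:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical labelled data: learn a mapping X -> y from reference labels.
X = np.array([[0.0], [1.0], [2.0], [3.0]])  # inputs
y = np.array([0, 0, 1, 1])                  # reference labels

model = LogisticRegression()
model.fit(X, y)                # approximate the X -> y transition
print(model.predict([[2.5]]))  # predicts the label for an unseen input
```

Reinforcement learning drops exactly this ingredient: there are no reference labels to fit, only feedback on the actions the agent tries.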

About This Course:

Whether you are learning to play a new game, navigate urban surroundings, design landing pages, ride a bicycle, or even build reinforcement learning agents, you don't memorize a textbook with examples of the optimal muscle contractions for every possible situation. The common thread across these problems is that they can be solved by trial and error: trying out ideas and sticking with the ones that work.

Another common trait is that these problems can, to varying extents, be solved automatically. That is exactly what this course is about: training machines to carry out this creative search for solutions.
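A minimal sketch of this trial-and-error idea is an epsilon-greedy bandit; the reward probabilities below are invented for illustration and are not from the course materials:

```python
import numpy as np

rng = np.random.default_rng(0)
true_rewards = [0.2, 0.5, 0.8]  # hidden payoff of each "idea" (assumed)
estimates = np.zeros(3)         # running value estimates
counts = np.zeros(3)

for step in range(2000):
    # Occasionally try a random idea; otherwise stick with the best so far.
    if rng.random() < 0.1:
        action = int(rng.integers(3))
    else:
        action = int(np.argmax(estimates))
    reward = float(rng.random() < true_rewards[action])  # noisy feedback
    counts[action] += 1
    estimates[action] += (reward - estimates[action]) / counts[action]

print(int(np.argmax(estimates)))  # the agent settles on the best idea
```

No one tells the agent which arm is correct; it discovers that purely by trying and keeping what pays off, which is the core of the RL setting.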

The main focus of the MOOC is the practical side of training such "machines", known as Reinforcement Learning (RL) algorithms, on life-size problems.

On the menu:

  • foundations of RL,

  • practical algorithms,

  • engineering “hacks”,

  • case studies,

  • fresh & crunchy articles.

The schedule features a variety of applications ranging from robotics and games through chatbots to finance. The course is taught on-campus at HSE (in Russian) and is maintained to be friendly to online students (in both English and Russian).

What does one need to know/have in order to benefit from this course?

The course assumes that the learner knows the following:

  • algebra, calculus (vectors, matrices, basic integrals)

  • probability (Bayes theorem, expectation, variance)

  • optimization (gradient descent)

  • basic machine learning (linear models, decision trees)

  • coding (python, numpy, sklearn)
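As a yardstick for the optimization prerequisite, this is roughly the level assumed: a few lines of gradient descent on a toy one-variable function (example invented here, not from the course):

```python
# Minimize f(w) = (w - 3)^2; its gradient is 2 * (w - 3).
w, lr = 0.0, 0.1
for _ in range(100):
    w -= lr * 2 * (w - 3)  # step against the gradient

print(round(w, 4))  # w converges to the minimum at 3.0
```

If both the loop and why it converges feel familiar, the optimization prerequisite is covered.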

One more thing to know is that this course is tightly connected to deep learning methods. There is no strict requirement for prior experience with neural networks, as the course gives a crash course on them using Theano and Lasagne, but knowing your way around neural networks will definitely come in handy from time to time.

The goal is to introduce students to a prominent area of modern artificial intelligence research: reinforcement learning. Reinforcement learning is closer to how humans actually learn, and it differs substantially from both supervised and unsupervised learning.
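To see the contrast with supervised learning in code: instead of reference labels, the agent only receives rewards and must work out a good policy on its own. Below is a minimal tabular Q-learning sketch on a made-up 5-state chain environment (an illustration, not an assignment from the course):

```python
import numpy as np

# Toy chain: states 0..4; action 1 moves right, action 0 moves left.
# Reaching state 4 ends the episode and pays a reward of 1.
n_states, n_actions = 5, 2
Q = np.zeros((n_states, n_actions))
rng = np.random.default_rng(42)
alpha, gamma, eps = 0.5, 0.9, 0.2

for episode in range(200):
    s = 0
    while s != 4:
        # Epsilon-greedy action selection.
        a = int(rng.integers(n_actions)) if rng.random() < eps else int(np.argmax(Q[s]))
        s_next = min(s + 1, 4) if a == 1 else max(s - 1, 0)
        r = 1.0 if s_next == 4 else 0.0  # a reward signal, not a label
        # Q-learning update: move Q[s, a] toward r + gamma * max_a' Q[s', a'].
        Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
        s = s_next

print(np.argmax(Q[:4], axis=1))  # learned policy: go right in every state
```

No labelled dataset ever says "go right"; the behavior emerges because the update propagates the terminal reward backwards through the value estimates.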


Optimized for the curious: for all the materials that aren't covered in detail, there are links to more information and related materials.

Practicality first: everything essential to solving reinforcement learning problems is worth mentioning, and the course doesn't shy away from covering tricks and heuristics.

Git-course: noticed a typo in a formula? Made the code more readable? Made a version for an alternative framework? Found a useful link? Know a way to make the course better? Submit a pull request.

Apart from taking the course, you can also contribute to it. There are many ways to do so; a few of them are:

  • Pull-request URLs of great learning materials to the ./week*/ files;

  • Spot bugs and create issues, or better, resolve them and submit pull-requests;

  • Translate assignments to different frameworks and versions (tensorflow, pytorch, rllab, py2/3 compatibility, etc) via pull-requests;

  • Answer questions and give advice in the chatroom if you happen to know the answer;

For More Information: GitHub

Link To The Lecture Slides: Click Here

The Online Student Survival Guide: Click Here