Sebastian Ruder
Sep 13, 2018
David Silver on Principles for Reinforcement Learning at the #DLIndaba2018. Important principles that are not only applicable to RL, but to ML research in general. E.g. leaderboard-driven research vs. hypothesis-driven research (see the slides below).
Principle 2. How an algorithm scales is more important than its starting point. Avoid performance ceilings. Deep Learning is successful because it scales so effectively.
Principles are meant to be controversial. I would argue that sample efficiency is at least as important.
Principle 3. Generality (how your algorithm performs on other tasks) is super important. Key is to design a diverse set of challenging tasks.
This. We should evaluate on out-of-distribution data and new tasks.
Principle 4. Use agent's experience rather than human expertise. Don't rely on engineered features or heuristics.
Hmm. Maybe true in the setting where you can sample an infinite number of experiences but domain expertise and inductive biases are important when data is limited.
Principle 5. State should be built as the state of the model, i.e. an RNN's hidden state and not defined in terms of environment. No constraint on state. Only agent's subjective view of world matters. Should not reason about external reality, which is limiting.
Principle 6. Agents influence the stream of data and experiences. Agents should have access to features to control the stream of the environment. Focus should not just be on maximizing reward but also building control of the stream.
Principle 7. Value functions efficiently summarize the state of the world and the future. Multiple value functions allow us to model multiple aspects of the world. Can help to control the stream.
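One way to picture "multiple value functions" is to learn several predictions from the same stream of experience, each with its own target signal. The sketch below is only an illustration under my own assumptions, not something shown in the talk: tabular TD(0), a hypothetical 3-state chain, and two made-up signals ("reward" and "steps").

```python
def td0_multi(transitions, cumulants, n_states, alpha=0.1, gamma=0.9, sweeps=200):
    """Learn one tabular value function per signal from the same transitions."""
    V = {name: [0.0] * n_states for name in cumulants}
    for _ in range(sweeps):
        for s, s2, done in transitions:
            for name, signal in cumulants.items():
                # TD(0) target: immediate signal plus discounted estimate of the next state.
                target = signal(s, s2) + (0.0 if done else gamma * V[name][s2])
                V[name][s] += alpha * (target - V[name][s])
    return V


# Hypothetical chain 0 -> 1 -> 2, where state 2 is terminal.
transitions = [(0, 1, False), (1, 2, True)]
cumulants = {
    "reward": lambda s, s2: 1.0 if s2 == 2 else 0.0,  # reward on reaching the goal
    "steps": lambda s, s2: 1.0,                        # discounted number of steps left
}
V = td0_multi(transitions, cumulants, n_states=3)
```

Both tables are learned from the same two transitions; they just summarize different aspects of the future.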
Principle 8. Imagining the future (what will happen next) can be used for planning. Same RL algorithms can be applied to learn from imagined experience (as in AlphaGo using MCTS and the value function).
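A small sketch of "same algorithm, imagined experience", using tabular Dyna-Q rather than anything AlphaGo-specific. Everything here is my own assumption for illustration: the 5-state chain environment, the function names, and the hyperparameters. The point is only that `q_update` is called unchanged on both real transitions and transitions replayed from a learned model.

```python
import random


def q_update(Q, s, a, r, s2, n_actions, alpha=0.5, gamma=0.9):
    """One Q-learning update, used for real and imagined transitions alike."""
    best_next = max(Q.get((s2, b), 0.0) for b in range(n_actions))
    old = Q.get((s, a), 0.0)
    Q[(s, a)] = old + alpha * (r + gamma * best_next - old)


def chain_step(s, a, n_states=5):
    """Toy chain: action 1 moves right, action 0 moves left; reward 1 at the last state."""
    s2 = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
    r = 1.0 if s2 == n_states - 1 else 0.0
    return r, s2


def dyna_q(episodes=50, planning_steps=10, n_states=5, n_actions=2, seed=0):
    rng = random.Random(seed)
    Q, model = {}, {}
    for _ in range(episodes):
        s = 0
        while s != n_states - 1:  # the last state is terminal
            a = rng.randrange(n_actions)  # random behaviour; Q-learning is off-policy
            r, s2 = chain_step(s, a, n_states)
            q_update(Q, s, a, r, s2, n_actions)  # learn from real experience
            model[(s, a)] = (r, s2)  # remember what the environment did
            for _ in range(planning_steps):
                # Imagined experience: replay a remembered transition from the model.
                ps, pa = rng.choice(sorted(model))
                pr, ps2 = model[(ps, pa)]
                q_update(Q, ps, pa, pr, ps2, n_actions)
            s = s2
    return Q


Q = dyna_q()
```

After training, moving right should be valued higher than moving left in every non-terminal state.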
Principle 9. Leverage strong function approximators. Algorithmic complexity can be pushed into neural network architecture (even MCTS, hierarchical control, etc. can be modeled with a NN).
This necessitates more tools to understand what our models actually learn.
Principle 10. Meta learning is the way to go. Not even the architecture is handcrafted anymore. Everything is learned end to end. The neural network takes care of everything with as little human input as possible.
Inductive bias should still be useful, though.

