
Marasović 2018 – NLP’s generalization problem, and how researchers are tackling it

Notes for MarasovicGradient2018NLP

1 questions

  1. How should we measure how well our models perform on unseen inputs? Testing on a held-out set drawn from the same distribution as the training data is not enough (see the sketch after this list).
  2. How should we modify our models so that they generalize better?
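
  A minimal sketch of the point in question 1, assuming scikit-learn and purely synthetic data (the data generator and the amount of shift are illustrative, not from the article): a model that scores well on a held-out split from its own training distribution can fall to roughly chance on a shifted one.

  # In-distribution vs. out-of-distribution evaluation on synthetic data.
  # In practice the OOD split would come from a different domain, genre,
  # or annotation source than the training set.
  import numpy as np
  from sklearn.linear_model import LogisticRegression
  from sklearn.metrics import accuracy_score

  rng = np.random.default_rng(0)

  def make_data(n, shift=0.0):
      """Two-class Gaussian features; `shift` moves both class means."""
      y = rng.integers(0, 2, size=n)
      X = rng.normal(loc=y[:, None] * 2.0 + shift, scale=1.0, size=(n, 5))
      return X, y

  X_train, y_train = make_data(1000)        # training distribution
  X_iid, y_iid = make_data(500)             # same distribution, unseen samples
  X_ood, y_ood = make_data(500, shift=3.0)  # shifted distribution

  model = LogisticRegression().fit(X_train, y_train)
  print("in-distribution accuracy :", accuracy_score(y_iid, model.predict(X_iid)))
  print("out-of-distribution acc. :", accuracy_score(y_ood, model.predict(X_ood)))

  The gap between the two printed accuracies, rather than the first number alone, is what question 1 asks us to track.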

2 direction 1: inductive bias

  • what priors on structure should we build into our models?
  • models fail because they:
    • learn language passively
    • do not learn anything about the underlying world that language is used to describe

2.1 potential solutions

  • use RL to directly optimize non-differentiable evaluation metrics such as METEOR and CIDEr (a sketch follows this list)
  • use human-in-the-loop training
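
  The RL bullet above refers to REINFORCE-style training that treats the evaluation metric as a reward. Below is a minimal sketch assuming PyTorch, a toy GRU decoder, and a stand-in reward function in place of a real METEOR/CIDEr scorer; all three are illustrative assumptions, not the setup described in the article.

  # REINFORCE-style sketch for optimizing a non-differentiable sequence
  # metric directly, instead of token-level cross-entropy on gold outputs.
  import torch
  import torch.nn as nn

  VOCAB, HIDDEN, MAX_LEN = 50, 32, 8

  class TinyDecoder(nn.Module):
      def __init__(self):
          super().__init__()
          self.emb = nn.Embedding(VOCAB, HIDDEN)
          self.rnn = nn.GRUCell(HIDDEN, HIDDEN)
          self.out = nn.Linear(HIDDEN, VOCAB)

      def forward(self, batch_size):
          h = torch.zeros(batch_size, HIDDEN)
          tok = torch.zeros(batch_size, dtype=torch.long)    # <bos> token = 0
          log_probs, tokens = [], []
          for _ in range(MAX_LEN):
              h = self.rnn(self.emb(tok), h)
              dist = torch.distributions.Categorical(logits=self.out(h))
              tok = dist.sample()                            # sample, not argmax
              log_probs.append(dist.log_prob(tok))
              tokens.append(tok)
          return torch.stack(tokens, 1), torch.stack(log_probs, 1)

  def sequence_reward(tokens):
      # Stand-in for METEOR/CIDEr against reference texts: rewards sequences
      # that use many distinct tokens (purely illustrative).
      return torch.tensor([len(set(seq.tolist())) / MAX_LEN for seq in tokens])

  model = TinyDecoder()
  opt = torch.optim.Adam(model.parameters(), lr=1e-3)

  for step in range(100):
      tokens, log_probs = model(batch_size=16)
      reward = sequence_reward(tokens)                       # not differentiable
      baseline = reward.mean()                               # variance reduction
      # REINFORCE: raise log-prob of samples that beat the baseline reward.
      loss = -((reward - baseline).unsqueeze(1) * log_probs).mean()
      opt.zero_grad()
      loss.backward()
      opt.step()

  Subtracting the batch-mean reward is one common baseline choice; self-critical sequence training instead uses the reward of the greedily decoded sequence as the baseline.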

3 direction 2: common sense

  • models lack social and physical common sense

4 direction 3: generalizing to unseen distributions and tasks

  • we are concerned with how well a model is able to extrapolate
  • that is, when trained on one task, how well can the model perform another, unrelated task?
  • this matters because we will never have enough annotated data for all the tasks in the world

Bibliography

  • [MarasovicGradient2018NLP] Marasović, NLP’s generalization problem, and how researchers are tackling it, The Gradient (2018).

Created: 2021-09-14 Tue 21:43