[WIP] LAMB optimizer by francoishernandez · Pull Request #1460 · OpenNMT/OpenNMT-py

francoishernandez · 2019-06-05T14:55:24Z

[DO NOT MERGE]

This is a WIP implementation of the LAMB optimizer from BERT. It reportedly allows training to scale to very large batches. There are some ambiguities: the algorithm differs between v1 and v2/v3 of the paper, some definitions are vague, there is no official implementation yet (a few unofficial ones exist, but they differ on a few points), and the paper gives no clear learning_rate schedule despite its detailed experiments.
Significant tuning may also be needed to find appropriate values for our tasks.
I am opening this PR for future work, once we have more information.

The current version here is based on https://github.com/cybertronai/pytorch-lamb, which is itself based on torch.optim.Adam.
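For readers unfamiliar with LAMB, here is a minimal pure-Python sketch of one update step for a single layer, under one common reading of the algorithm: Adam-style moment estimates followed by a layer-wise trust ratio `||w|| / ||update||`. It is not the code in this PR. The function names (`l2_norm`, `lamb_step`), the default hyperparameters, the omission of bias correction, and the fallback to a trust ratio of 1.0 when a norm is zero are all assumptions made for illustration; these are exactly the kinds of details on which the paper versions and existing implementations differ.

```python
import math


def l2_norm(values):
    """Euclidean norm of a flat list of floats."""
    return math.sqrt(sum(v * v for v in values))


def lamb_step(weights, grads, exp_avg, exp_avg_sq, lr=1e-3,
              beta1=0.9, beta2=0.999, eps=1e-6, weight_decay=0.0):
    """One LAMB-style update for a single layer (flat lists of floats).

    Assumptions: no bias correction, and the trust ratio falls back to 1.0
    when either norm is zero. Published variants differ on both points.
    Returns (new_weights, new_exp_avg, new_exp_avg_sq, trust_ratio).
    """
    # Adam-style exponential moving averages of the gradient and its square.
    new_m = [beta1 * m + (1 - beta1) * g for m, g in zip(exp_avg, grads)]
    new_v = [beta2 * v + (1 - beta2) * g * g
             for v, g in zip(exp_avg_sq, grads)]

    # Adam direction plus a weight decay term added to the update.
    update = [m / (math.sqrt(v) + eps) + weight_decay * w
              for m, v, w in zip(new_m, new_v, weights)]

    # Layer-wise trust ratio: rescale the step by ||w|| / ||update||.
    w_norm = l2_norm(weights)
    u_norm = l2_norm(update)
    trust_ratio = w_norm / u_norm if w_norm > 0 and u_norm > 0 else 1.0

    new_w = [w - lr * trust_ratio * u for w, u in zip(weights, update)]
    return new_w, new_m, new_v, trust_ratio
```

With a non-zero trust ratio, the step taken by each layer has norm `lr * ||w||` regardless of the gradient scale, which is the property usually credited for LAMB's stability with large batches. In a real optimizer the function would be applied per parameter tensor, with `exp_avg` and `exp_avg_sq` kept in the optimizer state between steps.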

alphadl · 2019-07-16T00:48:02Z

LGTM

francoishernandez added 3 commits May 23, 2019 14:52

WIP LAMB Optimizer

8e1b645

add a few comments

c2ee86b

fix flake

9dd286f

francoishernandez wants to merge 3 commits into OpenNMT:master from francoishernandez:lamb_optimizer

2 participants