Speaking of Adafactor: why can't I select the Adafactor scheduler when the optimizer is Adafactor? I get an error, and it suggests using the constant with warmup scheduler instead.