Optimization package definition.
Modules
adafactor_optimizer module: Adafactor optimizer.
base_config module: Base configurations to standardize experiments.
configs module
ema_optimizer module: Exponential moving average optimizer.
lamb module: Layer-wise Adaptive Moments (LAMB) optimizer.
lars module: Layer-wise adaptive rate scaling optimizer.
legacy_adamw module: Adam optimizer with weight decay that exactly matches the original BERT implementation.
lr_cfg module: Dataclasses for learning rate schedule config.
lr_schedule module: Learning rate schedule classes; see the warmup sketch after this list.
math module: The Python standard-library math module, re-exported here; provides access to the mathematical functions defined by the C standard.
oneof module: Config class that supports oneof functionality.
opt_cfg module: Dataclasses for optimizer configs.
optimizer_factory module: Optimizer factory class.
slide_optimizer module: SLIDE optimizer.
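As a quick usage sketch for the lr_schedule module, the snippet below wraps a standard Keras cosine decay in this package's LinearWarmup schedule. The import path official.modeling.optimization and the constructor arguments shown are assumptions based on the TensorFlow Model Garden layout; check the module source for the exact signatures.

```python
import tensorflow as tf
from official.modeling.optimization import lr_schedule

# Standard Keras cosine decay over 10,000 steps.
decay = tf.keras.optimizers.schedules.CosineDecay(
    initial_learning_rate=0.1, decay_steps=10_000)

# Ramp linearly from 0.0 up to the decay schedule's value over the first
# 500 steps, then follow the cosine decay.
warmup_then_decay = lr_schedule.LinearWarmup(
    after_warmup_lr_sched=decay,
    warmup_steps=500,
    warmup_learning_rate=0.0)

optimizer = tf.keras.optimizers.SGD(learning_rate=warmup_then_decay)
```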
Classes
class AdafactorConfig: Configuration for Adafactor optimizer.
class AdagradConfig: Configuration for Adagrad optimizer.
class AdamConfig: Configuration for Adam optimizer.
class AdamExperimentalConfig: Configuration for experimental Adam optimizer.
class AdamWeightDecayConfig: Configuration for Adam optimizer with weight decay.
class AdamWeightDecayExperimentalConfig: Configuration for the experimental Adam optimizer with weight decay.
class BaseOptimizerConfig: Base optimizer config.
class ConstantLrConfig: Configuration for constant learning rate.
class CosineDecayWithOffset: A LearningRateSchedule that uses a cosine decay with optional warmup.
class CosineLrConfig: Configuration for Cosine learning rate decay.
class DirectPowerDecay: Learning rate schedule that follows lr * (step)^power.
class DirectPowerLrConfig: Configuration for DirectPower learning rate decay.
class EMAConfig: Exponential moving average optimizer config.
class ExponentialDecayWithOffset: A LearningRateSchedule that uses an exponential decay schedule.
class ExponentialLrConfig: Configuration for exponential learning rate decay.
class ExponentialMovingAverage: Optimizer that computes an exponential moving average of the variables.
class LAMBConfig: Configuration for LAMB optimizer.
class LARSConfig: Layer-wise adaptive rate scaling config.
class LinearWarmup: Linear warmup schedule.
class LinearWarmupConfig: Configuration for linear warmup schedule.
class LrConfig: Configuration for lr schedule.
class OptimizationConfig: Configuration for optimizer and learning rate schedule.
class OptimizerConfig: Configuration for optimizer.
class OptimizerFactory: Optimizer factory class; see the usage sketch after this list.
class PiecewiseConstantDecayWithOffset: A LearningRateSchedule that uses a piecewise constant decay schedule.
class PolynomialDecayWithOffset: A LearningRateSchedule that uses a polynomial decay schedule.
class PolynomialLrConfig: Configuration for polynomial learning rate decay.
class PolynomialWarmUp: Applies polynomial warmup schedule on a given learning rate decay schedule.
class PolynomialWarmupConfig: Configuration for polynomial warmup schedule.
class PowerAndLinearDecay: Power learning rate schedule multiplied by a linear decay at the end.
class PowerAndLinearDecayLrConfig: Configuration for PowerAndLinearDecay learning rate decay.
class PowerDecayWithOffset: Power learning rate decay with offset.
class PowerDecayWithOffsetLrConfig: Configuration for power learning rate decay with step offset.
class RMSPropConfig: Configuration for RMSProp optimizer.
class SGDConfig: Configuration for SGD optimizer.
class SGDExperimentalConfig: Configuration for the experimental SGD optimizer.
class SLIDEConfig: Configuration for SLIDE optimizer.
class StepCosineDecayWithOffset: Stepwise cosine learning rate decay with offset.
class StepCosineLrConfig: Configuration for stepwise cosine learning rate decay.
class StepwiseLrConfig: Configuration for stepwise learning rate decay.
class WarmupConfig: Configuration for warmup schedule.
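The classes above are typically used together: OptimizationConfig groups oneof-style sub-configs for the optimizer, learning rate, and warmup, and OptimizerFactory builds the concrete objects from them. A minimal sketch, assuming the Model Garden import path official.modeling.optimization and the build_learning_rate / build_optimizer factory methods:

```python
from official.modeling import optimization

# Oneof-style config: each 'type' selects a variant, and the matching key
# holds that variant's parameters (SGDConfig, CosineLrConfig,
# LinearWarmupConfig above).
config = optimization.OptimizationConfig({
    'optimizer': {'type': 'sgd', 'sgd': {'momentum': 0.9}},
    'learning_rate': {'type': 'cosine',
                      'cosine': {'initial_learning_rate': 0.1,
                                 'decay_steps': 10_000}},
    'warmup': {'type': 'linear', 'linear': {'warmup_steps': 500}},
})

factory = optimization.OptimizerFactory(config)
lr = factory.build_learning_rate()       # schedule with warmup applied
optimizer = factory.build_optimizer(lr)  # concrete tf.keras optimizer
```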
Functions
register_optimizer_cls(...): Registers a custom optimizer class.
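A minimal sketch of registering a custom optimizer so a config can select it by key. The two-argument form register_optimizer_cls(key, optimizer_cls) and the MySGD class are assumptions for illustration; verify the signature in optimizer_factory before relying on it.

```python
import tensorflow as tf
from official.modeling import optimization

# Hypothetical optimizer subclass, used only for illustration.
class MySGD(tf.keras.optimizers.SGD):
  pass

# Assumed signature: register_optimizer_cls(key, optimizer_cls).
# After registration, an OptimizationConfig can request
# {'optimizer': {'type': 'my_sgd', ...}}.
optimization.register_optimizer_cls('my_sgd', MySGD)
```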