Batch normalization: specifies whether batch normalization should be used. During training, the distribution of each layer's inputs changes as the parameters of earlier layers are updated. This phenomenon, known as internal covariate shift, can slow training by requiring lower learning rates and careful initialization, and can push activations into the saturating regions of nonlinearities. Batch normalization addresses this by normalizing each layer's inputs over each mini-batch to zero mean and unit variance, then applying a learned scale and shift; this can permit higher learning rates and also acts as a mild regularizer.
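As a rough illustration, here is a minimal NumPy sketch of the training-time forward pass. The function name `batch_norm_forward` and the parameters `gamma` and `beta` are illustrative, and the running statistics used at inference time are omitted for brevity.

```python
import numpy as np

def batch_norm_forward(x, gamma, beta, eps=1e-5):
    # Normalize each feature over the mini-batch: zero mean, unit variance.
    mu = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mu) / np.sqrt(var + eps)
    # Learned scale (gamma) and shift (beta) restore representational power.
    return gamma * x_hat + beta

# Example: a mini-batch of 4 samples with 3 features each.
rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=3.0, size=(4, 3))
out = batch_norm_forward(x, gamma=np.ones(3), beta=np.zeros(3))
print(out.mean(axis=0))  # approximately 0 for each feature
print(out.std(axis=0))   # approximately 1 for each feature
```

With `gamma=1` and `beta=0`, the output is simply the standardized mini-batch; in practice both are trained alongside the network's other parameters, so the layer can recover the identity transform if that is optimal.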