Model Training Settings
The Model Training Settings popup is where you define the settings used to train your model as part of the overall workflow:
This popup appears as the last step of the Data Wizard and when you click Run in the Modeling Tool.
The main elements of this screen are as follows:
Name and Model Path: allows you to specify a unique name and path to store your model. PerceptiLabs will generate a subdirectory in the model's location using the specified model name. The model will be saved to a model.json file within that directory every time you save the model.
Epochs: sets the number of epochs to perform. One epoch corresponds to one complete pass through the entire training dataset. More epochs give the model more opportunity to learn from your training data. Note that training for too long may overfit your model to that data. (The sketch after this list shows how this and the related settings map onto a typical training configuration.)
Batch size: the number of samples that the algorithm should train on at a time before updating the weights in the model. Higher values can speed up training and may help your model generalize better; however, values that are too high may prevent your model from learning the data.
Loss: specifies which loss function to apply.
Learning rate: sets the learning rate for the algorithm. The value must be between 0 and 1 (default is 0.001). The higher the value, the quicker your model will learn. If the value is too high, training can skip over good local minima; if the value is too low, training can get stuck in a poor local minimum.
Save checkpoint every epoch: when enabled, saves a training checkpoint every epoch.
Optimizer: specifies which optimization algorithm to use for the model. During training, the optimizer iteratively adjusts the model's weights and biases in search of the values that let the model make accurate predictions. Optimizers available in PerceptiLabs' Training components include: ADAM, Stochastic gradient descent (SGD), Adagrad, Momentum, and RMSprop.
Beta1: optimizer-specific parameter. See the TensorFlow Optimizers page for optimizer-specific definitions.
Beta2: optimizer-specific parameter. See the TensorFlow Optimizers page for optimizer-specific definitions.
Shuffle: randomizes the order in which the training data is presented, to make the model more robust.
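As a rough illustration of how these settings relate to the training code PerceptiLabs generates, here is a minimal TensorFlow/Keras sketch. The data and model below are placeholders standing in for what your workflow provides; this is not PerceptiLabs' exact generated code:

```python
import numpy as np
import tensorflow as tf

# Placeholder data standing in for the dataset your workflow provides.
x_train = np.random.rand(100, 10).astype("float32")
y_train = tf.keras.utils.to_categorical(np.random.randint(0, 3, size=100), 3)

# Placeholder model standing in for the architecture built in the Modeling Tool.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu", input_shape=(10,)),
    tf.keras.layers.Dense(3, activation="softmax"),
])

# "Optimizer", "Learning rate", "Beta1", and "Beta2" settings: ADAM with the
# default learning rate of 0.001 and the usual beta defaults.
optimizer = tf.keras.optimizers.Adam(
    learning_rate=0.001, beta_1=0.9, beta_2=0.999)

# "Loss" setting: cross entropy for a classification model.
model.compile(optimizer=optimizer,
              loss="categorical_crossentropy",
              metrics=["accuracy"])

# "Save checkpoint every epoch" setting: a per-epoch checkpoint callback.
checkpoint = tf.keras.callbacks.ModelCheckpoint(
    filepath="checkpoints/epoch-{epoch:02d}.h5", save_freq="epoch")

# "Epochs", "Batch size", and "Shuffle" settings map directly onto fit().
model.fit(x_train, y_train,
          epochs=10,
          batch_size=32,
          shuffle=True,
          callbacks=[checkpoint])
```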
After configuring these settings, click one of the following:
Run model: starts training the model and displays the Statistics View where you can see how training is progressing.
Customize: displays the Modeling Tool where you can view and edit the model's architecture.
The following loss functions are available in the Model Training Settings popup (a short sketch illustrating each follows the list):
Cross entropy: Cross-entropy loss, or log loss, measures the performance of a classification model whose output is a probability value between 0 and 1. Cross-entropy loss increases as the predicted probability diverges from the actual label. So predicting a probability of 0.012 when the actual observation label is 1 would be bad and result in a high loss value. A perfect model would have a log loss of 0. See https://ml-cheatsheet.readthedocs.io/en/latest/loss_functions.html for more information.
Quadratic: Also known as Mean Squared Error (MSE), this loss is often used for regression tasks, where the loss is based on the mean squared difference between the predicted value(s) and the label(s).
Dice: Dice loss is often used for binary segmentation tasks, where it measures how much the predicted segmentation overlaps with the actual items/objects in the image. Compared with pixel accuracy, this makes it robust to class imbalance in cases where the objects you are trying to segment are over- or under-represented compared to the background. For this exact reason, PerceptiLabs automatically ignores the background channel when using the Dice loss.
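PerceptiLabs wires the selected loss into the generated training code for you; the sketch below is only meant to illustrate how each loss behaves. The cross-entropy and MSE calls use standard TensorFlow APIs, while the Dice function is a common textbook formulation, not PerceptiLabs' exact implementation:

```python
import tensorflow as tf

# Cross entropy: loss grows as the predicted probability diverges from the
# true label. Predicting p = 0.012 when the label is 1 gives
# -log(0.012) ≈ 4.42, a high loss; a perfect prediction gives ~0.
bce = tf.keras.losses.BinaryCrossentropy()
print(bce([[1.0]], [[0.012]]).numpy())  # ~4.42
print(bce([[1.0]], [[1.0]]).numpy())    # ~0.0

# Quadratic (Mean Squared Error): mean of the squared differences between
# predictions and labels, typically used for regression.
mse = tf.keras.losses.MeanSquaredError()
print(mse([[3.0]], [[2.5]]).numpy())    # (3.0 - 2.5)^2 = 0.25

# Dice loss, a common formulation for binary segmentation:
# 1 - 2|A ∩ B| / (|A| + |B|), computed on flattened masks.
def dice_loss(y_true, y_pred, smooth=1e-6):
    y_true = tf.reshape(y_true, [-1])
    y_pred = tf.reshape(y_pred, [-1])
    intersection = tf.reduce_sum(y_true * y_pred)
    return 1.0 - (2.0 * intersection + smooth) / (
        tf.reduce_sum(y_true) + tf.reduce_sum(y_pred) + smooth)

# Strong overlap between prediction and ground truth yields a low loss.
y_true = tf.constant([[1.0, 1.0, 0.0, 0.0]])
y_pred = tf.constant([[0.9, 0.8, 0.1, 0.0]])
print(dice_loss(y_true, y_pred).numpy())  # ~0.11
```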