Real-time monitoring of training jobs using built-in and custom rules
In this section, you will use Debugger capabilities to monitor a job with built-in and custom rules to detect sub-optimal training conditions such as LossNotDecreasing and ExplodingGradients.
SageMaker provides a set of built-in rules to identify common training issues such as class_imbalance, loss_not_decreasing, and overfitting.
Note
The complete list of SageMaker built-in rules can be accessed here: https://docs.aws.amazon.com/sagemaker/latest/dg/debugger-built-in-rules.html.
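To make concrete what a rule such as loss_not_decreasing looks for, here is a minimal, self-contained sketch of the underlying idea in plain Python. It is not the Debugger implementation and does not use the SageMaker SDK; the function name and the num_steps and min_pct_decrease parameters are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def loss_not_decreasing(
    losses: Sequence[float],
    num_steps: int = 3,             # hypothetical: how many steps back to compare against
    min_pct_decrease: float = 1.0,  # hypothetical: required drop, in percent
) -> bool:
    """Return True if the latest loss has not dropped by at least
    min_pct_decrease percent relative to the loss num_steps earlier."""
    # Not enough history yet, so there is nothing to flag.
    if len(losses) <= num_steps:
        return False

    earlier = losses[-num_steps - 1]
    latest = losses[-1]

    # Smallest absolute improvement that counts as "still decreasing".
    required_drop = abs(earlier) * min_pct_decrease / 100.0
    return (earlier - latest) < required_drop


healthy_run = [1.0, 0.8, 0.6, 0.5, 0.4]
stalled_run = [1.0, 0.8, 0.7, 0.7, 0.7, 0.7]

print(loss_not_decreasing(healthy_run))  # loss keeps falling
print(loss_not_decreasing(stalled_run))  # loss has plateaued
```

A monitoring loop could call a check like this after each recorded step and stop the job once it returns True, which is the kind of action a built-in rule can trigger.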
The following code sample shows how to configure built-in rules with SageMaker Debugger:
# Specify the rules you want to run
built_in_rules=[
    # Check for loss not decreasing during training and stop the training job.
    Rule.sagemaker(
        rule_configs.loss_not_decreasing...