The loss functions in PyTorch

*My post explains optimizers in PyTorch.

A loss function is a function which measures the losses(differences) between a model’s predictions and true values(train or test data), usually as the mean(average) over all samples, to optimize a model during training or to evaluate how good a model is during testing. *A loss function is also called a cost function or an error function.
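
As a rough illustration of the two roles above, the sketch below uses one loss function both to train a tiny model and to score it afterwards. The toy data, the model size, the learning rate and the step count are assumptions made up for this example.

```python
import torch
from torch import nn

torch.manual_seed(0)  # make the random initial weights repeatable

# Hypothetical toy data: y = 2x.
x = torch.tensor([[1.0], [2.0], [3.0], [4.0]])
y = 2 * x

model = nn.Linear(1, 1)
loss_fn = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# Score the untrained model first, for comparison.
with torch.no_grad():
    initial_loss = loss_fn(model(x), y).item()

# Training: the loss drives the weight updates.
for _ in range(100):
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()

# Testing: the same loss only scores the model, so no gradients are needed.
# (x and y are reused here only to keep the sketch short.)
with torch.no_grad():
    final_loss = loss_fn(model(x), y).item()

print(initial_loss, final_loss)
```

In real use the testing step would run on data the model has not seen during training.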

Popular loss functions are shown below:

(1) L1 Loss:

can compute the mean(average) of the absolute losses(differences) between a model’s predictions and true values(train and test data).
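's formula: $\frac{1}{n}\sum_{i=1}^{n} |y_i - \hat{y}_i|$ *$y_i$ is a true value, $\hat{y}_i$ is a prediction and $n$ is the number of samples.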
is used for a regression model.
is also called Mean Absolute Error(MAE).
is L1Loss() in PyTorch.
's pros:
- It’s less sensitive to outliers and anomalies than L2 Loss.
- The losses can be easily compared because the errors are only made absolute, not squared, so their range is not big.
's cons:
- The absolute loss is not differentiable where the difference is exactly 0.
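
A minimal sketch of L1 Loss, checking `L1Loss()` against the mean of the absolute differences computed by hand. The prediction and target values are made up for illustration.

```python
import torch
from torch import nn

# Hypothetical predictions and true values of a regression model.
preds = torch.tensor([2.5, 0.0, 2.0, 8.0])
targets = torch.tensor([3.0, -0.5, 2.0, 7.0])

l1_loss = nn.L1Loss()(preds, targets)  # reduction="mean" by default

# The same value by hand: the mean of the absolute differences.
manual = (preds - targets).abs().mean()

print(l1_loss.item(), manual.item())  # both are 0.5
```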

(2) L2 Loss:

can compute the mean(average) of the squared losses(differences) between a model’s predictions and true values(train and test data).
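's formula: $\frac{1}{n}\sum_{i=1}^{n} (y_i - \hat{y}_i)^2$ *$y_i$ is a true value, $\hat{y}_i$ is a prediction and $n$ is the number of samples.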
is used for a regression model.
is also called Mean Squared Error(MSE).
is MSELoss() in PyTorch.
's pros:
- The squared loss is differentiable everywhere.
's cons:
- It’s sensitive to outliers and anomalies.
- The losses cannot be easily compared because the errors are squared, so their range is big.
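
The sensitivity to outliers can be seen in a small sketch that puts one large error into otherwise good predictions and compares `MSELoss()` with `L1Loss()`. The numbers are invented for illustration only.

```python
import torch
from torch import nn

targets = torch.tensor([3.0, -0.5, 2.0, 7.0])

# Hypothetical predictions; the last one of `outlier_preds` is off by 10.
clean_preds = torch.tensor([2.5, 0.0, 2.0, 8.0])
outlier_preds = torch.tensor([2.5, 0.0, 2.0, 17.0])

mse = nn.MSELoss()
mae = nn.L1Loss()

mse_clean = mse(clean_preds, targets)      # (0.25 + 0.25 + 0 + 1) / 4
mse_outlier = mse(outlier_preds, targets)  # (0.25 + 0.25 + 0 + 100) / 4
mae_clean = mae(clean_preds, targets)      # (0.5 + 0.5 + 0 + 1) / 4
mae_outlier = mae(outlier_preds, targets)  # (0.5 + 0.5 + 0 + 10) / 4

print(mse_clean.item(), mse_outlier.item())  # 0.375 and 25.125
print(mae_clean.item(), mae_outlier.item())  # 0.5 and 2.75
```

Here the single outlier multiplies the L2 Loss by 67 but the L1 Loss by only 5.5.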

(3) Huber Loss:

can do a computation similar to either L1 Loss or L2 Loss, depending on how each absolute loss(difference) between a model’s predictions and true values(train and test data) compares with delta, which you set.
*Memos:
- delta is 1.0 by default.
- Be careful, the computation is not exactly the same as L1 Loss or L2 Loss according to the formulas below.
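's formula. *The 1st one is the L2 Loss-like one, the 2nd one is the L1 Loss-like one, and the mean(average) of these per-sample losses is taken:
$0.5(y_i - \hat{y}_i)^2$ if $|y_i - \hat{y}_i| < \delta$
$\delta(|y_i - \hat{y}_i| - 0.5\delta)$ otherwise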
is used for a regression model.
is HuberLoss() in PyTorch.
with delta of 1.0 is the same as Smooth L1 Loss with its default beta of 1.0, which is SmoothL1Loss() in PyTorch.
's pros:
- It’s less sensitive to outliers and anomalies than L2 Loss.
- The loss is differentiable everywhere.
- The losses can be more easily compared than with L2 Loss because only small losses are squared, so their range is smaller than with L2 Loss.
's cons:
- It needs more computation than L1 Loss and L2 Loss because the formula is more complex than theirs.
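
A small sketch of Huber Loss, assuming the default delta of 1.0: it compares `HuberLoss()` with `SmoothL1Loss()` and with the piecewise computation done by hand. The sample values are made up, with one large error so that both branches are used.

```python
import torch
from torch import nn

# Hypothetical predictions and true values; the last error is 10.
preds = torch.tensor([2.5, 0.0, 2.0, 17.0])
targets = torch.tensor([3.0, -0.5, 2.0, 7.0])

delta = 1.0
huber = nn.HuberLoss(delta=delta)(preds, targets)
smooth_l1 = nn.SmoothL1Loss()(preds, targets)  # beta=1.0 by default

# By hand: squared branch for small errors, linear branch for large ones.
err = (preds - targets).abs()
per_sample = torch.where(err < delta,
                         0.5 * err ** 2,
                         delta * (err - 0.5 * delta))
manual = per_sample.mean()

print(huber.item(), smooth_l1.item(), manual.item())  # all are 2.4375
```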

(4) BCE(Binary Cross Entropy) Loss:

can compute the mean(average) of the losses(differences) between a model’s predicted probabilities and true binary values(train and test data).
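's formula: $-\frac{1}{n}\sum_{i=1}^{n}\left[y_i\log(\hat{y}_i) + (1 - y_i)\log(1 - \hat{y}_i)\right]$ *$y_i$ is a true value(0 or 1), $\hat{y}_i$ is a predicted probability and $n$ is the number of samples.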
is used for Binary Classification. *Binary Classification is the task of classifying data into two classes.
is also called Binary Cross Entropy or Log(Logarithmic) Loss.
is BCELoss() in PyTorch.
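*Memos:
- BCELoss() expects predictions which are already probabilities between 0 and 1, e.g. the output of a sigmoid.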
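
A minimal sketch of BCE Loss, assuming the model outputs raw scores (logits) which are turned into probabilities with a sigmoid before `BCELoss()`. It also checks the result against `BCEWithLogitsLoss()`, which applies the sigmoid internally, and against the computation done by hand. The values are invented for illustration.

```python
import torch
from torch import nn

# Hypothetical raw model outputs (logits) and true binary values.
logits = torch.tensor([2.0, -1.0, 0.5])
targets = torch.tensor([1.0, 0.0, 1.0])

probs = torch.sigmoid(logits)  # probabilities between 0 and 1

bce = nn.BCELoss()(probs, targets)
bce_with_logits = nn.BCEWithLogitsLoss()(logits, targets)

# By hand: the mean of the per-sample binary cross entropy.
manual = -(targets * probs.log()
           + (1 - targets) * (1 - probs).log()).mean()

print(bce.item(), bce_with_logits.item(), manual.item())
```

All three printed values should agree up to floating-point error.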

(5) Cross Entropy Loss:

can compute the mean(average) of the losses(differences) between a model’s predicted class probabilities and true classes(train and test data). *A loss is 0 or more with no upper limit, so it can be more than 1.
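's formula: $-\frac{1}{n}\sum_{i=1}^{n}\sum_{c=1}^{C} y_{i,c}\log(\hat{y}_{i,c})$ *$y_{i,c}$ is 1 if sample $i$ belongs to class $c$ and 0 otherwise, $\hat{y}_{i,c}$ is the predicted probability of class $c$, $C$ is the number of classes and $n$ is the number of samples.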
is used for Multiclass Classification and Computer Vision.
*Memos:
- Multiclass Classification is the task of classifying data into more than two classes.
- Computer Vision is the technology which enables a computer to understand images and the objects in them.
is CrossEntropyLoss() in PyTorch.
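
A minimal sketch of Cross Entropy Loss. `CrossEntropyLoss()` takes raw scores (logits) and class indices and applies log-softmax internally, so the sketch checks it against that computation done by hand. The scores and labels are made up for illustration, and the last case is a deliberately wrong, very confident prediction.

```python
import torch
from torch import nn
import torch.nn.functional as F

# Hypothetical logits for 2 samples and 3 classes, with true class indices.
logits = torch.tensor([[2.0, 0.5, -1.0],
                       [0.1, 0.2, 3.0]])
targets = torch.tensor([0, 2])

loss_fn = nn.CrossEntropyLoss()
ce = loss_fn(logits, targets)

# By hand: negative log-probability of the true class, averaged.
log_probs = F.log_softmax(logits, dim=1)
manual = -log_probs[torch.arange(len(targets)), targets].mean()

# A confident but wrong prediction: the true class is 1, the model bets on 0.
wrong = loss_fn(torch.tensor([[10.0, 0.0, 0.0]]), torch.tensor([1]))

print(ce.item(), manual.item(), wrong.item())
```

The last value is roughly 10, which shows that a single loss is not limited to 1.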

By stp2y
