The activation functions in PyTorch (5)


*Memos:

  • My post explains Step function, Identity and ReLU.
  • My post explains Leaky ReLU, PReLU and FReLU.
  • My post explains ELU, SELU and CELU.
  • My post explains GELU, Mish, SiLU and Softplus.
  • My post explains Vanishing Gradient Problem, Exploding Gradient Problem and Dying ReLU Problem.

(1) Tanh:

  • can convert an input value (x) to an output value between -1 and 1. *-1 and 1 are exclusive.
  • 's formula is y = (e^x - e^(-x)) / (e^x + e^(-x)).
  • is also called Hyperbolic Tangent Function.
  • is Tanh() in PyTorch. *A minimal usage sketch is shown below.
  • is used in:
    • RNN (Recurrent Neural Network). *RNN() in PyTorch.
    • LSTM (Long Short-Term Memory). *LSTM() in PyTorch.
    • GRU (Gated Recurrent Unit). *GRU() in PyTorch.
    • GAN (Generative Adversarial Network).
  • 's pros:
    • It normalizes input values.
    • The convergence is stable.
    • It mitigates Exploding Gradient Problem.
    • It mitigates Dying ReLU Problem. *0 is still produced for the input value 0, so Dying ReLU Problem is not completely avoided.
  • 's cons:
    • It causes Vanishing Gradient Problem.
    • It's computationally expensive because of the exponential and other complex operations.
  • ‘s graph in Desmos:

[Tanh graph in Desmos]
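As a minimal sketch, Tanh() can be applied like this (the input tensor below is just an example; torch.tanh() is the equivalent functional form):

```python
import torch
import torch.nn as nn

tanh = nn.Tanh()                               # Tanh() module
x = torch.tensor([-2.0, -1.0, 0.0, 1.0, 2.0])  # example input values

print(tanh(x))        # tensor([-0.9640, -0.7616,  0.0000,  0.7616,  0.9640])
print(torch.tanh(x))  # the functional form gives the same result
```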

(2) Softsign:

  • can convert an input value (x) to an output value between -1 and 1. *-1 and 1 are exclusive.
  • ‘s formula is y = x / (1 + |x|).
  • is Softsign() in PyTorch. *A minimal usage sketch is shown below.
  • ‘s pros:
    • It normalizes input values.
    • The convergence is stable.
    • It mitigates Exploding Gradient Problem.
    • It mitigates Dying ReLU Problem. *0 is still produced for the input value 0, so Dying ReLU Problem is not completely avoided.
  • ‘s cons:
    • It causes Vanishing Gradient Problem.
  • ‘s graph in Desmos:

[Softsign graph in Desmos]
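As a minimal sketch, Softsign() can be applied like this (the input tensor below is just an example; F.softsign() is the equivalent functional form):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

softsign = nn.Softsign()                       # Softsign() module: y = x / (1 + |x|)
x = torch.tensor([-2.0, -1.0, 0.0, 1.0, 2.0])  # example input values

print(softsign(x))    # tensor([-0.6667, -0.5000,  0.0000,  0.5000,  0.6667])
print(F.softsign(x))  # the functional form gives the same result
```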

(3) Sigmoid:

  • can convert an input value (x) to an output value between 0 and 1. *0 and 1 are exclusive.
  • 's formula is y = 1 / (1 + e^(-x)).
  • is Sigmoid() in PyTorch. *A minimal usage sketch is shown below.
  • is used in:
    • Binary Classification Model.
    • Logistic Regression.
    • LSTM.
    • GRU.
    • GAN.
  • ‘s pros:
    • It normalizes input values.
    • The convergence is stable.
    • It mitigates Exploding Gradient Problem.
    • It avoids Dying ReLU Problem.
  • ‘s cons:
    • It causes Vanishing Gradient Problem.
    • It's computationally expensive because of the exponential operation.
  • ‘s graph in Desmos:

[Sigmoid graph in Desmos]
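As a minimal sketch, Sigmoid() can be applied like this (the input tensor below is just an example; torch.sigmoid() is the equivalent functional form):

```python
import torch
import torch.nn as nn

sigmoid = nn.Sigmoid()                         # Sigmoid() module: y = 1 / (1 + e^(-x))
x = torch.tensor([-2.0, -1.0, 0.0, 1.0, 2.0])  # example input values

print(sigmoid(x))        # tensor([0.1192, 0.2689, 0.5000, 0.7311, 0.8808])
print(torch.sigmoid(x))  # the functional form gives the same result
```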

(4) Softmax:

  • can convert input values (xs) to output values which are each between 0 and 1 and whose sum is 1 (100%):
    *Memos:

    • 0 and 1 are exclusive.
    • If the input values are [5, 4, -1], then the output values are [0.730, 0.268, 0.002], which sum to 0.730 (73%) + 0.268 (26.8%) + 0.002 (0.2%) = 1 (100%).
  • 's formula is y_i = e^(x_i) / (e^(x_1) + e^(x_2) + ... + e^(x_n)).
  • is Softmax() in PyTorch. *A minimal usage sketch is shown below.
  • is used in:
    • Multi-Class Classification Model.
  • ‘s pros:
    • It normalizes input values.
    • The convergence is stable.
    • It mitigates Exploding Gradient Problem.
    • It avoids Dying ReLU Problem.
  • ‘s cons:
    • It causes Vanishing Gradient Problem.
    • It's computationally expensive because of the exponential and other complex operations.
  • ‘s graph in Desmos:

[Softmax graph in Desmos]
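As a minimal sketch, Softmax() can be applied to the example input values above like this (dim tells Softmax() which dimension to normalize over):

```python
import torch
import torch.nn as nn

softmax = nn.Softmax(dim=-1)        # Softmax() module; dim is the dimension to normalize over
x = torch.tensor([5.0, 4.0, -1.0])  # the example input values from above

y = softmax(x)
print(y)        # ≈ tensor([0.7297, 0.2685, 0.0018]), i.e. roughly [0.730, 0.268, 0.002]
print(y.sum())  # the output values sum to 1 (100%)
```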
