Use tanh instead of exp in sigmoid function


If the input has a large magnitude, the exponential in the sigmoid function overflows: for large negative \(x\), \(e^{-ax}\) exceeds the floating-point range.

\(f(x) = \dfrac{1}{1 + e^{-ax}} \ (a>0)\)

A simple trick to avoid the overflow is to rewrite the sigmoid in terms of tanh, which stays within \([-1, 1]\) for any input.

\(\dfrac{1}{1+e^{-ax}}=\dfrac{1}{2}\dfrac{2e^{\frac{1}{2}ax}}{e^{\frac{1}{2}ax}+e^{-\frac{1}{2}ax}} =\dfrac{1}{2}(1+\tanh(\frac{1}{2}ax))\)
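
The last equality may be easier to see with one intermediate step written out: the numerator is split into the sum and the difference of the two exponentials, and the difference over the sum is the definition of tanh.

\(\dfrac{2e^{\frac{1}{2}ax}}{e^{\frac{1}{2}ax}+e^{-\frac{1}{2}ax}} =\dfrac{(e^{\frac{1}{2}ax}+e^{-\frac{1}{2}ax})+(e^{\frac{1}{2}ax}-e^{-\frac{1}{2}ax})}{e^{\frac{1}{2}ax}+e^{-\frac{1}{2}ax}} =1+\tanh(\frac{1}{2}ax)\)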

In your code (here with \(a = 1\)),

y = 1 / (1 + numpy.exp(-x))

will become

y = numpy.tanh(x * 0.5) * 0.5 + 0.5
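
To see the difference in practice, here is a minimal sketch comparing the two forms. The function names `sigmoid_exp` and `sigmoid_tanh`, the optional `a` parameter, and the test inputs are illustrative choices, not part of the original snippet. It assumes NumPy's default float64 arrays, and uses `np.errstate(over="raise")` only to turn the overflow warning into an exception that is easy to detect.

```python
import numpy as np


def sigmoid_exp(x, a=1.0):
    """Direct formula; np.exp(-a * x) overflows for large negative x."""
    return 1.0 / (1.0 + np.exp(-a * x))


def sigmoid_tanh(x, a=1.0):
    """Same function written with tanh, which saturates at -1 and 1."""
    return 0.5 * np.tanh(0.5 * a * x) + 0.5


# Inputs far outside the range where exp() fits in float64.
x_big = np.array([-1000.0, 0.0, 1000.0])

# Promote the overflow warning to an exception so we can detect it.
with np.errstate(over="raise"):
    try:
        sigmoid_exp(x_big)
        exp_overflowed = False
    except FloatingPointError:
        exp_overflowed = True

# The tanh form handles the same inputs without overflow.
y_big = sigmoid_tanh(x_big)

# On a moderate range both forms agree up to rounding error.
x_small = np.linspace(-10.0, 10.0, 41)
max_diff = np.max(np.abs(sigmoid_exp(x_small) - sigmoid_tanh(x_small)))

print("exp form overflowed:", exp_overflowed)
print("tanh form on large inputs:", y_big)
print("max difference on [-10, 10]:", max_diff)
```

With NumPy's default settings the direct formula only emits a RuntimeWarning and still returns the limiting value, so the tanh form mainly buys you warning-free code rather than different results.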

