# Softmax without Overflow

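Overflow is a common problem in neural networks and similar models. Softmax is a typical case: it exponentiates its inputs, and \(e^{x}\) exceeds the floating-point range once \(x\) is large (in float64, beyond roughly 709). The usual remedy is to shift the inputs by a constant \(K\) before exponentiating: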

\(S_j = \dfrac{e^{x_j - K}}{\sum_i e^{x_i - K}}\)

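The result is invariant when we add or subtract a constant \(K\), because softmax normalizes by the sum of the exponentials, so the common factor \(e^{-K}\) cancels. We need to choose \(K\). The example below uses \(K = \max(x)\). Mathematically any constant gives the same result, but the maximum is the practical choice: it makes every exponent non-positive, so no term can overflow, and the largest term is exactly \(e^{0} = 1\), so the denominator is never zero.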
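To make the cancellation explicit, here is the one-line derivation. The indices \(i\) and \(j\) over the entries of \(x\) are notation added here for clarity:

\[
\dfrac{e^{x_j - K}}{\sum_i e^{x_i - K}}
= \dfrac{e^{-K}\, e^{x_j}}{e^{-K} \sum_i e^{x_i}}
= \dfrac{e^{x_j}}{\sum_i e^{x_i}}
\]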

```
import numpy as np

def softmax(x):
    # Naive version: np.exp overflows to inf for large entries of x
    exp_x = np.exp(x)
    return exp_x / np.sum(exp_x, axis=1, keepdims=True)
```

With the shift applied, this becomes

```
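import numpy as np

def softmax(x):
    # Subtracting the maximum makes every exponent <= 0, so np.exp cannot overflow
    e = np.exp(x - np.max(x))
    if e.ndim == 1:
        return e / np.sum(e, axis=0)
    else:  # ndim == 2: normalize each row
        return e / np.sum(e, axis=1, keepdims=True)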
```
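A quick way to see the difference is to run both versions on a large input. The sketch below is illustrative only: the names `softmax_naive` and `softmax_shifted` are hypothetical, and the input values are chosen just to force an overflow in float64.

```python
import numpy as np

def softmax_naive(x):
    # Direct translation of the definition; overflows for large inputs
    e = np.exp(x)
    return e / np.sum(e, axis=-1, keepdims=True)

def softmax_shifted(x):
    # Same computation after subtracting K = max(x)
    e = np.exp(x - np.max(x))
    return e / np.sum(e, axis=-1, keepdims=True)

x = np.array([1000.0, 1001.0, 1002.0])

# Silence the overflow/invalid warnings that the naive version triggers
with np.errstate(over="ignore", invalid="ignore"):
    naive = softmax_naive(x)    # exp(1000) is inf in float64, and inf / inf is nan

shifted = softmax_shifted(x)    # exponents are now -2, -1, 0

print(naive)    # all entries are nan
print(shifted)  # same values as the softmax of [0, 1, 2]
```

The shifted result matches what the naive function returns for `[0, 1, 2]`, which is the invariance under \(K\) in action.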

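For 2-D input, `np.max(x)` is the maximum over the whole array. If the rows differ greatly in scale, a row far below the global maximum can underflow to all zeros and produce `0/0`. In that case, subtract each row's own maximum instead: `e = np.exp(x - np.max(x, axis=1)[:, np.newaxis])`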
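As a rough sketch of the per-row variant, the function below shifts along the last axis with `keepdims=True`, which handles 1-D and 2-D input without branching on `ndim`. The name `softmax_rowwise` and the test matrix are assumptions made for illustration, not part of the original code.

```python
import numpy as np

def softmax_rowwise(x):
    # Hypothetical helper: softmax over the last axis,
    # shifting each row by its own maximum
    x = np.asarray(x, dtype=float)
    shifted = x - np.max(x, axis=-1, keepdims=True)
    e = np.exp(shifted)
    return e / np.sum(e, axis=-1, keepdims=True)

# Two rows with the same relative values but very different scales
x = np.array([[0.0, 1.0, 2.0],
              [1000.0, 1001.0, 1002.0]])

# Global shift: the first row becomes exp(-1002), exp(-1001), exp(-1000),
# which all underflow to 0 in float64, so its normalization is 0 / 0
with np.errstate(invalid="ignore"):
    e = np.exp(x - np.max(x))
    global_shift = e / np.sum(e, axis=1, keepdims=True)

row_shift = softmax_rowwise(x)

print(global_shift)  # first row is nan, second row is valid
print(row_shift)     # both rows are valid and identical
```

Using `axis=-1` with `keepdims=True` is a design choice for this sketch: the shifted array broadcasts against the input in the same way as the `[:, np.newaxis]` form for 2-D input, and it also works unchanged for a 1-D vector.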
