------------------------------------------------------------------------------------------
function    advantages                                disadvantages
------------------------------------------------------------------------------------------
sigmoid     smooth                                    saturates
                                                      not zero-centered
                                                      exp() is computationally expensive
------------------------------------------------------------------------------------------
tanh        smooth                                    saturates
            zero-centered                             exp() is computationally expensive
------------------------------------------------------------------------------------------
ReLU        does not saturate in the positive regime  not zero-centered
            computationally efficient                 units can 'die' (dead ReLU)
            speeds up convergence
------------------------------------------------------------------------------------------
Leaky-ReLU  inherits the benefits of ReLU             not zero-centered
            does not 'die'
------------------------------------------------------------------------------------------
maxout      can approximate any convex function       doubles the parameters per neuron
            generalizes ReLU and Leaky ReLU
            does not saturate, does not die
------------------------------------------------------------------------------------------
ELU         all benefits of ReLU                      exp() is computationally expensive
            closer to zero-mean outputs
            adds some robustness to noise
------------------------------------------------------------------------------------------
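
The functions in the table can be sketched in a few lines of plain Python. This is a minimal scalar illustration, not a reference implementation: the function names, the default slopes (alpha=0.01 for Leaky-ReLU, alpha=1.0 for ELU) and the representation of maxout as a list of hypothetical (w, b) pairs are assumptions made for this example only.

```python
import math

def sigmoid(x):
    # Branch on the sign so exp() never receives a large positive argument.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def tanh(x):
    # Zero-centered, but still saturates for large |x|.
    return math.tanh(x)

def relu(x):
    # No exp() call; output and gradient are zero for x <= 0 (dead ReLU risk).
    return max(0.0, x)

def leaky_relu(x, alpha=0.01):
    # alpha is an assumed default; a small negative slope keeps the unit alive.
    return x if x > 0 else alpha * x

def elu(x, alpha=1.0):
    # alpha is an assumed default; the negative branch needs exp().
    return x if x > 0 else alpha * (math.exp(x) - 1.0)

def maxout(x, pieces):
    # pieces: list of (w, b) pairs (hypothetical layout for scalar input).
    # Each pair is one affine function, so k pieces cost k times the parameters.
    return max(w * x + b for w, b in pieces)
```

With two pieces, maxout reduces to the simpler functions: `maxout(x, [(1.0, 0.0), (0.0, 0.0)])` matches `relu(x)`, and `maxout(x, [(1.0, 0.0), (0.01, 0.0)])` matches `leaky_relu(x)`. That is the sense in which maxout generalizes ReLU and Leaky-ReLU, at the price of doubling the parameters.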