Diffstat (limited to 'microposts/neural-nets-activation.org')
-rw-r--r--  microposts/neural-nets-activation.org | 24
1 file changed, 24 insertions, 0 deletions
diff --git a/microposts/neural-nets-activation.org b/microposts/neural-nets-activation.org
new file mode 100644
index 0000000..aee7c2d
--- /dev/null
+++ b/microposts/neural-nets-activation.org
@@ -0,0 +1,24 @@
+#+title: neural-nets-activation
+
+#+date: <2018-05-09>
+
+#+begin_quote
+ What makes the rectified linear activation function better than the
+ sigmoid or tanh functions? At present, we have a poor understanding of
+ the answer to this question. Indeed, rectified linear units have only
+ begun to be widely used in the past few years. The reason for that
+ recent adoption is empirical: a few people tried rectified linear
+ units, often on the basis of hunches or heuristic arguments. They got
+ good results classifying benchmark data sets, and the practice has
+ spread. In an ideal world we'd have a theory telling us which
+ activation function to pick for which application. But at present
+ we're a long way from such a world. I should not be at all surprised
+ if further major improvements can be obtained by an even better choice
+ of activation function. And I also expect that in coming decades a
+ powerful theory of activation functions will be developed. Today, we
+ still have to rely on poorly understood rules of thumb and experience.
+#+end_quote
+
+Michael Nielsen,
+[[http://neuralnetworksanddeeplearning.com/chap6.html#convolutional_neural_networks_in_practice][Neural
+networks and deep learning]]
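
For reference only (not part of the committed file or of Nielsen's text): a minimal sketch, assuming NumPy, of the three activation functions the quoted passage compares; the variable names and sample inputs are illustrative.

#+begin_src python
import numpy as np

def relu(z):
    # Rectified linear unit: passes positive inputs through, zeroes out the rest.
    return np.maximum(0.0, z)

def sigmoid(z):
    # Logistic sigmoid: squashes any real input into (0, 1); saturates for large |z|.
    return 1.0 / (1.0 + np.exp(-z))

def tanh(z):
    # Hyperbolic tangent: squashes into (-1, 1); also saturates for large |z|.
    return np.tanh(z)

# Compare the three on a small range of pre-activation values.
z = np.linspace(-3.0, 3.0, 7)
for name, f in [("relu", relu), ("sigmoid", sigmoid), ("tanh", tanh)]:
    print(name, np.round(f(z), 3))
#+end_src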