| author | Yuchen Pei <me@ypei.me> | 2021-06-18 12:58:44 +1000 |
|---|---|---|
| committer | Yuchen Pei <me@ypei.me> | 2021-06-18 12:58:44 +1000 |
| commit | 147a19e84a743f1379f05bf2f444143b4afd7bd6 (patch) | |
| tree | 3127395250cb958f06a98b86f73e77658150b43c /microposts/neural-nets-activation.org | |
| parent | 4fa26fec8b7e978955e5630d3f820ba9c53be72c (diff) | |
Updated.
Diffstat (limited to 'microposts/neural-nets-activation.org')
| -rw-r--r-- | microposts/neural-nets-activation.org | 24 |
1 file changed, 24 insertions(+), 0 deletions(-)
diff --git a/microposts/neural-nets-activation.org b/microposts/neural-nets-activation.org
new file mode 100644
index 0000000..aee7c2d
--- /dev/null
+++ b/microposts/neural-nets-activation.org
@@ -0,0 +1,24 @@
+#+title: neural-nets-activation
+
+#+date: <2018-05-09>
+
+#+begin_quote
+  What makes the rectified linear activation function better than the
+  sigmoid or tanh functions? At present, we have a poor understanding of
+  the answer to this question. Indeed, rectified linear units have only
+  begun to be widely used in the past few years. The reason for that
+  recent adoption is empirical: a few people tried rectified linear
+  units, often on the basis of hunches or heuristic arguments. They got
+  good results classifying benchmark data sets, and the practice has
+  spread. In an ideal world we'd have a theory telling us which
+  activation function to pick for which application. But at present
+  we're a long way from such a world. I should not be at all surprised
+  if further major improvements can be obtained by an even better choice
+  of activation function. And I also expect that in coming decades a
+  powerful theory of activation functions will be developed. Today, we
+  still have to rely on poorly understood rules of thumb and experience.
+#+end_quote
+
+Michael Nielsen,
+[[http://neuralnetworksanddeeplearning.com/chap6.html#convolutional_neural_networks_in_practice][Neural
+networks and deep learning]]
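The quote compares the rectified linear, sigmoid, and tanh activation functions. As a minimal illustration (not part of the commit or the quoted book), the three functions and the saturation behaviour the heuristic arguments appeal to can be sketched as:

```python
import math

def sigmoid(x):
    """Logistic sigmoid: squashes input into (0, 1); saturates for large |x|."""
    return 1.0 / (1.0 + math.exp(-x))

def tanh(x):
    """Hyperbolic tangent: squashes input into (-1, 1); also saturates."""
    return math.tanh(x)

def relu(x):
    """Rectified linear unit: identity for positive input, zero otherwise.
    Its derivative stays 1 for all x > 0, one common heuristic argument
    for why it trains deep networks better than saturating functions."""
    return max(0.0, x)

# For a large positive input, sigmoid and tanh flatten out near their
# asymptotes (derivative near 0), while relu keeps growing linearly.
print(sigmoid(10.0), tanh(10.0), relu(10.0))
```

This is only a sketch of the function definitions; in practice these are applied elementwise to layer pre-activations by the framework in use.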