From ff0ab387f61ea0d35a73d599356794a41d694abb Mon Sep 17 00:00:00 2001
From: Yuchen Pei
Date: Wed, 9 May 2018 14:16:03 +0200
Subject: added a mpost

---
 microposts/neural-nets-activation.md | 6 ++++++
 1 file changed, 6 insertions(+)
 create mode 100644 microposts/neural-nets-activation.md

diff --git a/microposts/neural-nets-activation.md b/microposts/neural-nets-activation.md
new file mode 100644
index 0000000..a0d7a20
--- /dev/null
+++ b/microposts/neural-nets-activation.md
@@ -0,0 +1,6 @@
+---
+date: 2018-05-09
+---
+> What makes the rectified linear activation function better than the sigmoid or tanh functions? At present, we have a poor understanding of the answer to this question. Indeed, rectified linear units have only begun to be widely used in the past few years. The reason for that recent adoption is empirical: a few people tried rectified linear units, often on the basis of hunches or heuristic arguments. They got good results classifying benchmark data sets, and the practice has spread. In an ideal world we'd have a theory telling us which activation function to pick for which application. But at present we're a long way from such a world. I should not be at all surprised if further major improvements can be obtained by an even better choice of activation function. And I also expect that in coming decades a powerful theory of activation functions will be developed. Today, we still have to rely on poorly understood rules of thumb and experience.
+
+Michael Nielsen, [Neural networks and deep learning](http://neuralnetworksanddeeplearning.com/chap6.html#convolutional_neural_networks_in_practice)
\ No newline at end of file
--
cgit v1.2.3