#+title: neural-nets-activation
#+date: <2018-05-09>
#+begin_quote
What makes the rectified linear activation function better than the
sigmoid or tanh functions? At present, we have a poor understanding of
the answer to this question. Indeed, rectified linear units have only
begun to be widely used in the past few years. The reason for that
recent adoption is empirical: a few people tried rectified linear
units, often on the basis of hunches or heuristic arguments. They got
good results classifying benchmark data sets, and the practice has
spread. In an ideal world we'd have a theory telling us which
activation function to pick for which application. But at present
we're a long way from such a world. I should not be at all surprised
if further major improvements can be obtained by an even better choice
of activation function. And I also expect that in coming decades a
powerful theory of activation functions will be developed. Today, we
still have to rely on poorly understood rules of thumb and experience.
#+end_quote
Michael Nielsen,
[[http://neuralnetworksanddeeplearning.com/chap6.html#convolutional_neural_networks_in_practice][Neural Networks and Deep Learning]]
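
For reference, a minimal NumPy sketch of the three activation functions Nielsen compares; the code and names below are my own illustration, not taken from the book.

#+begin_src python
import numpy as np

# Standard definitions of the activation functions named in the quote.

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))  # squashes input into (0, 1)

def tanh(z):
    return np.tanh(z)                # squashes input into (-1, 1)

def relu(z):
    return np.maximum(0.0, z)        # max(0, z): zero for negative input, identity otherwise
#+end_src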