Diffstat (limited to 'microposts/random-forests.org')
-rw-r--r--  microposts/random-forests.org  24
1 file changed, 24 insertions, 0 deletions
diff --git a/microposts/random-forests.org b/microposts/random-forests.org
new file mode 100644
index 0000000..f52c176
--- /dev/null
+++ b/microposts/random-forests.org
@@ -0,0 +1,24 @@
+#+title: random-forests
+
+#+date: <2018-05-15>
+
+[[https://lagunita.stanford.edu/courses/HumanitiesSciences/StatLearning/Winter2016/info][Stanford
+Lagunita's statistical learning course]] has some excellent lectures on
+random forests. It starts with explanations of decision trees, followed
+by bagged trees and random forests, and ends with boosting. From these
+lectures it seems that:
+
+1. The term "predictors" in statistical learning = "features" in machine
+   learning.
+2. The main idea of random forests (dropping predictors for individual
+   trees and aggregating by majority vote or averaging) is the same as
+   the idea of dropout in neural networks, where a proportion of neurons
+   in the hidden layers is dropped temporarily during different
+   minibatches of training, effectively averaging over an ensemble of
+   subnetworks. Both are regularisation tricks, i.e. they reduce the
+   variance. The only difference is that random forests drop all but
+   roughly the square root of the total number of predictors at each
+   split, whereas the dropout ratio in neural networks is usually a half.
+
+By the way, here's a comparison between statistical learning and machine
+learning from the slides of the Statistical Learning course:
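The analogy in point 2 can be made concrete with a small sketch (not part of the original post; the function names and the toy sizes are illustrative). A random forest keeps roughly sqrt(p) of the p predictors as candidates at each split; dropout keeps each hidden unit with probability 1 - rate (usually a half) for each minibatch:

```python
import math
import random

def rf_split_candidates(features, seed=None):
    """Randomly keep ~sqrt(p) of the p predictors, as a random forest
    does at each split; the rest are 'dropped' for that split."""
    rng = random.Random(seed)
    m = max(1, round(math.sqrt(len(features))))
    return rng.sample(features, m)

def dropout_mask(n_units, rate=0.5, seed=None):
    """Keep each hidden unit with probability 1 - rate, as dropout
    does independently for each minibatch."""
    rng = random.Random(seed)
    return [rng.random() >= rate for _ in range(n_units)]

features = [f"x{i}" for i in range(16)]
# Random forest: 4 of the 16 predictors are candidates at this split.
print(rf_split_candidates(features, seed=0))
# Dropout: roughly half of the units survive this minibatch.
mask = dropout_mask(10, rate=0.5, seed=0)
print(sum(mask), "of 10 units kept this minibatch")
```

Both mechanisms average predictions over many randomly thinned models, which is why both act as variance-reducing regularisation.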