Diffstat (limited to 'microposts/random-forests.org')
 microposts/random-forests.org | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+), 0 deletions(-)
diff --git a/microposts/random-forests.org b/microposts/random-forests.org
new file mode 100644
index 0000000..f52c176
--- /dev/null
+++ b/microposts/random-forests.org
@@ -0,0 +1,24 @@
+#+title: random-forests
+
+#+date: <2018-05-15>
+
+[[https://lagunita.stanford.edu/courses/HumanitiesSciences/StatLearning/Winter2016/info][Stanford
+Lagunita's statistical learning course]] has some excellent lectures on
+random forests. It starts with explanations of decision trees, followed
+by bagged trees and random forests, and ends with boosting. From these
+lectures it seems that:
+
+1. The term "predictors" in statistical learning = "features" in machine
+   learning.
+2. The main idea of random forests, dropping predictors for individual
+   trees and aggregating by majority vote or average, is the same as
+   the idea of dropout in neural networks, where a proportion of neurons
+   in the hidden layers is dropped temporarily during different
+   minibatches of training, effectively averaging over an ensemble of
+   subnetworks. Both tricks are used as regularisation, i.e. to reduce
+   the variance. The only difference is that a random forest considers
+   only about the square root of the total number of features at each
+   split, whereas the dropout ratio in neural networks is usually a half.
+
+By the way, here's a comparison between statistical learning and machine
+learning from the slides of the Statistical Learning course:
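As a rough illustration of the analogy in point 2, here is a minimal sketch assuming scikit-learn and NumPy are available; the synthetic dataset, layer sizes and seeds are made up for the example. It pairs a random forest that considers roughly the square root of the features at each split with an inverted-dropout mask that temporarily zeroes about half of a hidden layer's activations.

#+begin_src python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic classification data: 500 samples, 16 features.
X, y = make_classification(n_samples=500, n_features=16, random_state=0)

# Random forest: each split considers a random subset of sqrt(16) = 4
# features; the trees' predictions are aggregated by majority vote.
forest = RandomForestClassifier(n_estimators=100, max_features="sqrt",
                                random_state=0)
forest.fit(X, y)
print("training accuracy:", forest.score(X, y))

# Dropout (inverted form): during a minibatch, each hidden unit is kept
# with probability 0.5; surviving activations are rescaled so their
# expected value is unchanged, averaging over an ensemble of subnetworks.
rng = np.random.default_rng(0)
hidden = rng.normal(size=(32, 64))   # stand-in for a minibatch of activations
keep_prob = 0.5
mask = rng.random(size=hidden.shape) < keep_prob
hidden_dropped = hidden * mask / keep_prob
#+end_src

In scikit-learn, ~max_features="sqrt"~ is the knob corresponding to the "keep about √p of the p predictors per split" rule described in the lectures; the dropout half is shown with a plain NumPy mask rather than a deep-learning framework only to keep the sketch self-contained.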
