Diffstat (limited to 'posts/2019-01-03-discriminant-analysis.org')
-rw-r--r--  posts/2019-01-03-discriminant-analysis.org | 9 +++++++++
1 file changed, 9 insertions(+), 0 deletions(-)
diff --git a/posts/2019-01-03-discriminant-analysis.org b/posts/2019-01-03-discriminant-analysis.org
index 34c16bf..a0ada73 100644
--- a/posts/2019-01-03-discriminant-analysis.org
+++ b/posts/2019-01-03-discriminant-analysis.org
@@ -23,6 +23,7 @@ under CC BY-SA and GNU FDL./
 ** Theory
 :PROPERTIES:
 :CUSTOM_ID: theory
+:ID: 69be3baf-7f60-42f2-9184-ee8840eea554
 :END:
 Quadratic discriminant analysis (QDA) is a classical classification
 algorithm. It assumes that the data is generated by Gaussian
@@ -69,6 +70,7 @@ be independent.
 *** QDA
 :PROPERTIES:
 :CUSTOM_ID: qda
+:ID: f6e95892-01cf-4569-b01e-22ed238d0577
 :END:
 
 We look at QDA.
@@ -94,6 +96,7 @@ sample for each class.
 *** Vanilla LDA
 :PROPERTIES:
 :CUSTOM_ID: vanilla-lda
+:ID: 5a6ca0ca-f385-4054-9b19-9cac69b1a59a
 :END:
 
 Now let us look at LDA.
@@ -127,6 +130,7 @@ nearest neighbour classifier.
 *** Nearest neighbour classifier
 :PROPERTIES:
 :CUSTOM_ID: nearest-neighbour-classifier
+:ID: 8880764c-6fbe-4023-97dd-9711c7c50ea9
 :END:
 More specifically, we want to transform the first term of (0) to a norm
 to get a classifier based on nearest neighbour modulo $\log \pi_i$:
@@ -160,6 +164,7 @@ $A \mu_i$ (again, modulo $\log \pi_i$) and label the input with $i$.
 *** Dimensionality reduction
 :PROPERTIES:
 :CUSTOM_ID: dimensionality-reduction
+:ID: 70e1afc1-9c45-4a35-a842-48573e077b36
 :END:
 We can further simplify the prediction by dimensionality reduction.
 Assume $n_c \le n$. Then the centroid spans an affine space of dimension
@@ -195,6 +200,7 @@ words, the prediction does not change regardless of =n_components=.
 *** Fisher discriminant analysis
 :PROPERTIES:
 :CUSTOM_ID: fisher-discriminant-analysis
+:ID: 05ff25da-8c52-4f20-a0ac-4422f19e10ce
 :END:
 The Fisher discriminant analysis involves finding an $n$-dimensional
 vector $a$ that maximises between-class covariance with respect to
@@ -232,6 +238,7 @@ $a = c V_x D_x^{-1} V_m$ with $p = 1$.
 *** Linear model
 :PROPERTIES:
 :CUSTOM_ID: linear-model
+:ID: feb827b6-0064-4192-b96b-86a942c8839e
 :END:
 The model is called linear discriminant analysis because it is a linear
 model. To see this, let $B = V_m^T D_x^{-1} V_x^T$ be the matrix of
@@ -256,6 +263,7 @@ This is how scikit-learn implements LDA, by inheriting from
 ** Implementation
 :PROPERTIES:
 :CUSTOM_ID: implementation
+:ID: b567283c-20ee-41a8-8216-7392066a5ac5
 :END:
 This is where things get interesting. How do I validate my
 understanding of the theory? By implementing and testing the algorithm.
@@ -279,6 +287,7 @@ The result is
 *** Fun facts about LDA
 :PROPERTIES:
 :CUSTOM_ID: fun-facts-about-lda
+:ID: f1d47f43-27f6-49dd-bd0d-2e685c38e241
 :END:
 One property that can be used to test the LDA implementation is the
 fact that the scatter matrix $B(X - \bar x)^T (X - \bar X) B^T$ of the
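
The only change in this commit is the addition of a UUID =:ID:= property to each heading of the Org post. The commit itself does not say how the IDs were produced; a minimal sketch, assuming they come from Emacs' org-id library, would look like this:

#+begin_src emacs-lisp
;; Sketch only: assumes the IDs were generated with org-id, which the
;; commit does not confirm.
(require 'org-id)

;; With point on an Org heading, add (or return) a UUID :ID: property
;; for that entry:
(org-id-get-create)

;; To stamp every heading in the current buffer at once:
(org-map-entries #'org-id-get-create)
#+end_src

Such IDs give each heading a link target that stays stable even if the heading text or =:CUSTOM_ID:= changes.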