minor edit

author: Yuchen Pei <me@ypei.me> 2018-12-02 16:31:49 +0100
committer: Yuchen Pei <me@ypei.me> 2018-12-02 16:31:49 +0100
commit: 530ec9b303cf4689bd09c84c9cec44239eb8ae70 (patch)
tree: 66790ed0b6db5754cd1220ad7c6f2df337b9eefb /posts
parent: 2f1e508e6e15bdd2e45b2e03d14845f5f46bc261 (diff)
1 files changed, 14 insertions, 14 deletions
diff --git a/posts/2018-12-02-lime-shapley.md b/posts/2018-12-02-lime-shapley.md
index 88b1b20..add6881 100644
--- a/posts/2018-12-02-lime-shapley.md
+++ b/posts/2018-12-02-lime-shapley.md
@@ -8,7 +8,7 @@ comments: true
 In this post I explain LIME (Ribeiro et. al. 2016), the Shapley values
 (Shapley, 1953) and the SHAP values (Lundberg-Lee, 2017).
 
-Shapley values {#Shapley values}
+Shapley values
 --------------
 
 A coalitional game $(v, N)$ of $n$ players involves
@@ -37,14 +37,14 @@ $i$.
 The Shapley values satisfy some nice properties which are readily
 verified, including:
 
--   []{#Efficiency}**Efficiency**.
+-   **Efficiency**.
     $\sum_i \phi_i(v) = v(N) - v(\emptyset)$.
--   []{#Symmetry}**Symmetry**. If for some $i, j \in N$, for all
+-   **Symmetry**. If for some $i, j \in N$, for all
     $S \subset N$, we have $v(S + i) = v(S + j)$, then
     $\phi_i(v) = \phi_j(v)$.
--   []{#Null player}**Null player**. If for some $i \in N$, for all
+-   **Null player**. If for some $i \in N$, for all
     $S \subset N$, we have $v(S + i) = v(S)$, then $\phi_i(v) = 0$.
--   []{#Linearity}**Linearity**. $\phi_i$ is linear in games. That is
+-   **Linearity**. $\phi_i$ is linear in games. That is
     $\phi_i(v) + \phi_i(w) = \phi_i(v + w)$, where $v + w$ is defined by
     $(v + w)(S) := v(S) + w(S)$.
 
@@ -61,7 +61,7 @@ Conversely, according to Strumbelj-Kononenko (2010), it was shown in
 Shapley\'s original paper (Shapley, 1953) that these four properties
 together with $v(\emptyset) = 0$ defines the Shapley values.
 
-LIME {#LIME}
+LIME
 ----
 
 LIME (Ribeiro et. al. 2016) is a model that offers a way to explain
@@ -100,7 +100,7 @@ of perturbed samples $x_i = h_x(S_i)$ in the $X$ space, and then fits a
 linear model $g$ using $1_{S_i}$ as the input samples and $f(h_x(S_i))$
 as the output samples:
 
-[]{#Problem}**Problem**(LIME). Find $w = (w_1, w_2, ..., w_n)$ that
+**Problem**(LIME). Find $w = (w_1, w_2, ..., w_n)$ that
 minimises
 
 $$\sum_i (w \cdot 1_{S_i} - f(h_x(S_i)))^2 \pi_x(h_x(S_i))$$
@@ -142,11 +142,11 @@ $\sum_{i = 1}^n w_i = f(x) - f(h_x(\emptyset)) = v(N) - v(\emptyset)$.
 
 Then the problem becomes a weighted linear regression:
 
-[]{#Problem}**Problem**. minimises
+**Problem**. minimises
 $\sum_{S \subset N} (w(S) - v(S))^2 q(s)$ over $w$ subject to
 $w(N) = v(N) - v(\emptyset)$.
 
-[]{#Claim}**Claim** (Charnes et. al. 1988). The solution to this problem
+**Claim** (Charnes et. al. 1988). The solution to this problem
 is
 
 $$w_i = {1 \over n} (v(N) - v(\emptyset)) + \left(\sum_{s = 1}^{n - 1} {n - 2 \choose s - 1} q(s)\right)^{-1} \sum_{S \subset N: i \in S} \left({n - s \over n} q(s) v(S) - {s - 1 \over n} q(s - 1) v(S - i)\right). \qquad (-1)$$
@@ -157,7 +157,7 @@ $$q(s) = c {n - 2 \choose s - 1}^{-1}$$
 
 for any constant $c$, then $w_i = \phi_i(v)$ are the Shapley values.
 
-[]{#Remark}**Remark**. Don\'t worry about this specific choice of $q(s)$
+**Remark**. Don\'t worry about this specific choice of $q(s)$
 when $s = 0$ or $n$, because $q(0)$ and $q(n)$ do not appear on the
 right hand side of (-1). Therefore they can be defined to be of any
 value. A common convention of the binomial coefficients is to set
@@ -167,7 +167,7 @@ $q(0) = q(n) = \infty$.
 In Lundberg-Lee (2017), $c$ is chosen to be $1 / n$, see Theorem 2
 there.
 
-[]{#Proof}**Proof**. The Lagrangian is
+**Proof**. The Lagrangian is
 
 $$L(w, \lambda) = \sum_{S \subset N} (v(S) - w(S))^2 q(s) - \lambda(w(N) - v(N) + v(\emptyset)).$$
 
@@ -219,7 +219,7 @@ $$\sum_{S \subset N} s q(s) v(S) = \sum_{S: i \in S} (s q(s) v(S) + (s - 1) q(s
 
 Plugging this back into (6) we get the desired result. $\square$
 
-SHAP {#SHAP}
+SHAP
 ----
 
 The SHAP paper (Lundberg-Lee 2017) is not clear in its definition of the
@@ -266,12 +266,12 @@ x_i, & \text{if }i \in S; \\
 \mathbb E_{\mu_i} z_i, & \text{otherwise.}
 \end{cases}$$
 
-How about DeepLIFT? {#How about DeepLIFT?}
+How about DeepLIFT?
 -------------------
 
 TODO
 
-References {#References}
+References
 ----------
 
 -   Charnes, A., B. Golany, M. Keane, and J. Rousseau. "Extremal
author	Yuchen Pei <me@ypei.me>	2018-12-02 16:31:49 +0100
committer	Yuchen Pei <me@ypei.me>	2018-12-02 16:31:49 +0100
commit	530ec9b303cf4689bd09c84c9cec44239eb8ae70 (patch)
tree	66790ed0b6db5754cd1220ad7c6f2df337b9eefb /posts
parent	2f1e508e6e15bdd2e45b2e03d14845f5f46bc261 (diff)