patched lime model diff from paper

author: Yuchen Pei <me@ypei.me> 2018-12-03 09:21:29 +0100
committer: Yuchen Pei <me@ypei.me> 2018-12-03 09:21:29 +0100
commit: bfb344527a1628a43fd10b71a4f034fd11c818d7 (patch)
tree: 8e2e19e891a3fc3d1e6317e685660515e951468f /posts
parent: 52fbfc6b9bf7aae292466e2e7b67d3d419da59dd (diff)
1 files changed, 12 insertions, 0 deletions
diff --git a/posts/2018-12-02-lime-shapley.md b/posts/2018-12-02-lime-shapley.md
index 5ccf701..0e80c88 100644
--- a/posts/2018-12-02-lime-shapley.md
+++ b/posts/2018-12-02-lime-shapley.md
@@ -123,6 +123,18 @@ The LIME model has a more general framework, but the specific model
 considered in the paper is the one described above, with a Lasso for
 feature selection.
 
+One difference between our account here and the one in the LIME paper
+is: the dimension of the data space may differ from $n$ (see Section 3.1 of that paper).
+But in the case of text data, they do use bag-of-words (our $X$) for an "intermediate"
+representation. So my understanding is, in their context, there is an
+"original" data space (let's call it $X'$). And there is a one-to-one correspondence
+between $X'$ and $X$ (let's call it $r: X' \to X$), so that given a 
+sample $x' \in X'$, we can compute the output of $S$ in the local model 
+with $f(r^{-1}(h_{r(x')}(S)))$.
+For example, when $X$ is the bag of words, $X'$ may be
+the embedding vector space, so that $r(x') = A^{-1} x'$, where $A$ 
+is the word embedding matrix.
+
 Shapley values and LIME
 -----------------------
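
A minimal Python sketch of the composition $f(r^{-1}(h_{r(x')}(S)))$ described in the added paragraph, under toy assumptions that are not from the post or the LIME paper: $A$ is a hypothetical invertible 2x2 matrix (so $X$ and $X'$ are both $\mathbb R^2$), `f` is an arbitrary stand-in for the black-box model on $X'$, and `h(x, subset)` is assumed to keep the coordinates of $x$ indexed by $S$ and zero the rest. Indices are 0-based here.

```python
# Hypothetical invertible "embedding" matrix A and its inverse (det A = 1).
A = [[2, 1],
     [1, 1]]
A_INV = [[1, -1],
         [-1, 2]]


def matvec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def r(x_prime):
    """Map the original space X' to the interpretable space X: r(x') = A^{-1} x'."""
    return matvec(A_INV, x_prime)


def r_inv(x):
    """Map the interpretable space X back to X': r^{-1}(x) = A x."""
    return matvec(A, x)


def h(x, subset):
    """Assumed form of h_x(S): keep coordinates in subset, zero out the others."""
    return [xk if k in subset else 0 for k, xk in enumerate(x)]


def f(x_prime):
    """Stand-in black-box model on X' (an arbitrary linear function)."""
    return x_prime[0] + 3 * x_prime[1]


def local_output(x_prime, subset):
    """Evaluate f(r^{-1}(h_{r(x')}(S))) for a sample x' and feature subset S."""
    return f(r_inv(h(r(x_prime), subset)))


x_prime = [3, 2]
full = local_output(x_prime, {0, 1})   # all features kept: recovers f(x')
only_first = local_output(x_prime, {0})
```

With every feature kept, `local_output` reduces to `f(x_prime)` because `r_inv` undoes `r`. Dropping features changes the point in $X$ before it is mapped back to $X'$.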