Importance Sampling

1 Problem statement:

We want to draw samples from \(p(x)\), but this is hard.
We have the ability to query \(p(x)\), for any \(x\)
We have the ability to sample from \(q(x)\)

2 Solution outline:

Sample from \(q(x)\) and weight the samples by how important they are \(p(x)\)

3 Solution:

Sample \(q(x)\) \(n\) times
Assign each sample the weight \(p(x)/q(x)\)
Normalize weights across samples to sum to 1. Take from these samples (with replacement) according to their weights

4 Comments:

Consider a small slice \(\mathbf{x} = [a, a + da]\). About \(q(\mathbf{x})\) samples in step 2 will have value in \(\mathbf{x}\). If, in step 3, we took uniformly at random from the step \(2\) samples, we would have \(q(\mathbf{x})\) samples with value \(\mathbf{x}\). Instead, we should weight the value \(\mathbf{x}\) by \(p(\mathbf{x})/q(\mathbf{x})\). Then, the weighted proportion of samples with value \(\mathbf{x}\) is: \[ \frac{\text{# samples} = \mathbf{x}}{n} \times \frac{p(\mathbf{x})}{q(\mathbf{x})} = q(\mathbf{x}) \times \frac{p(\mathbf{x})}{q(\mathbf{x})} = p(\mathbf{x}) \]

Note that this is also useful when \(p(x)\) is very sparse in an area. Then, sampling directly from \(p(x)\) might result in 0 samples for a sparse region. So instead, we should sample from \(q(x)\) and re-weight.

5 computing expectation

For a random variable \(X\) with pdf \(p(x)\), we can compute an expectation over \(p(x)\) by sampling over \(q(x)\) and weighting by \(p(x)\). This is written: \[ \mathbb{E}[X] = \mathbb{E}_{X\sim q} \left[X \frac{p(X)}{q(X)}\right] \]

5.1 example

Say that we are trying to estimate the expectation of \(f(X)\), where \(X\) is a discrete r.v. with pdf \(p(x)\). Then, we draw samples from the proposal distribution \(q(x)\). \[ \hat{\mathbb{E}}[X] = \frac{1}{n}\sum_{i=1}^{n} \frac{p(x_i)}{q(x_i)} f(x_i) \approx \sum_{x\in X} \frac{p(x)}{q(x)} f(x) q(x) \] where \(\frac{p(x)}{q(x)}\) is called the likelihood ratio and the approximation holds when the number of samples \(n\) is large so that \(\frac{|\{x_i = x\}|}{n} \approx q(x)\)