Randomized LASSO
This selection algorithm allows the researcher to form a model after observing the subgradient of this optimization problem

\[
\widehat{\beta}_{\lambda}(\omega) = \mathop{\mathrm{argmin}}_{\beta} \; \frac{1}{2}\|y-X\beta\|^2_2 + \sum_j \lambda_j |\beta_j| - \omega^T\beta + \frac{\epsilon}{2}\|\beta\|^2_2,
\]

where \(\omega \sim N(0,\Sigma)\) is a Gaussian randomization with a covariance specified by the user and \(\epsilon\) is a small ridge term included for numerical stability. Data splitting is (asymptotically) a special case of this randomization mechanism.
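To make the perturbation concrete, here is a minimal numpy sketch (not the selectinf implementation; the function name and arguments are ours) that evaluates this randomized objective for a user-drawn \(\omega\):

import numpy as np

def randomized_lasso_objective(beta, X, y, lam, omega, ridge_eps):
    # 1/2 ||y - X beta||_2^2 + sum_j lam_j |beta_j| - omega^T beta + ridge_eps/2 ||beta||_2^2
    resid = y - X.dot(beta)
    return (0.5 * resid.dot(resid)
            + np.sum(lam * np.abs(beta))
            - omega.dot(beta)
            + 0.5 * ridge_eps * beta.dot(beta))

# an isotropic example of the randomization, omega ~ N(0, scale**2 * I):
# omega = scale * np.random.standard_normal(X.shape[1])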
[1]:
import numpy as np
from selectinf.randomized.api import lasso
from selectinf.tests.instance import gaussian_instance
np.random.seed(0) # for reproducibility
X, y = gaussian_instance(n=100,
                         p=20,
                         s=5,
                         signal=3,
                         equicorrelated=False,
                         rho=0.4,
                         random_signs=True)[:2]
n, p = X.shape
n, p
[1]:
(100, 20)
Randomization mechanism
By default, an isotropic Gaussian randomization is used, with variance chosen based on the mean diagonal of \(X^TX\) and the standard deviation of \(y\).
[2]:
L = lasso.gaussian(X, y, 2 * np.diag(X.T.dot(X)) * np.std(y))
signs = L.fit()
active_set = np.nonzero(signs != 0)[0]
active_set
[2]:
array([ 1, 6, 17, 18])
We see that variables [1,6,17,18] are chosen here.
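As a rough numerical illustration of the default scaling described above (the variable name below is ours, and the exact constant selectinf uses may differ), a quantity of this form can be computed directly from the data:

approx_randomizer_scale = np.sqrt(np.mean(np.diag(X.T.dot(X)))) * np.std(y)
approx_randomizer_scale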
Inference
For inference, the user can in principle choose any target jointly normal with \(\nabla \ell(\beta^*;X,y) = X^T(X\beta^*-y)\) where \(\beta^*\) is the population minimizer under the model \((X_i,y_i) \overset{IID}{\sim} F\).
For convenience, we have provided some targets, though our functions expect a boolean representation of the active set.
[3]:
from selectinf.randomized.lasso import selected_targets
active_bool = np.zeros(p, bool)
active_bool[active_set] = True
(observed_target,
 cov_target,
 cov_target_score,
 alternatives) = selected_targets(L.loglike, np.ones(n), active_bool)
Given our target \(\widehat{\theta}\) and its estimated covariance \(\Sigma\), as well as its joint covariance \(\tilde{\Gamma}\) with \(\nabla \ell(\beta^*; X,y)\), we use the linear decomposition

\[
\nabla \ell(\beta^*; X,y) = \tilde{\Gamma}\Sigma^{-1}\widehat{\theta} + \left(\nabla \ell(\beta^*; X,y) - \tilde{\Gamma}\Sigma^{-1}\widehat{\theta}\right) = \tilde{\Gamma}\Sigma^{-1}\widehat{\theta} + N.
\]
We have arranged things so that (pre-selection) \(N\) is uncorrelated with (and asymptotically independent of) \(\widehat{\theta}\), since \(\mathrm{Cov}(N, \widehat{\theta}) = \tilde{\Gamma} - \tilde{\Gamma}\Sigma^{-1}\Sigma = 0\).
We can then form univariate tests of \(H_{0,j}:\theta_j=0\) based on the distribution of \(\widehat{\theta}\) conditional on the selection event. As this conditional distribution is not available in closed form, we approximate it using MCMC with ndraw steps after a burn-in of burnin steps.
[4]:
observed_target.shape
[4]:
(4,)
[5]:
Xsel_inv = np.linalg.pinv(X[:, active_set])
np.testing.assert_allclose(observed_target, Xsel_inv.dot(y))
[6]:
dispersion = np.linalg.norm(y - X[:, active_set].dot(Xsel_inv.dot(y)))**2 / (n - len(active_set))
np.testing.assert_allclose(cov_target, dispersion * Xsel_inv.dot(Xsel_inv.T))
np.testing.assert_allclose(cov_target_score, - X.T.dot(X)[:,active_set].dot(cov_target).T, rtol=np.inf, atol=1.e-10) # rtol disabled because some entries are exactly zero
[7]:
pivots, pvals, intervals = L.summary(observed_target,
                                     cov_target,        # \Sigma
                                     cov_target_score,  # \tilde{\Gamma}
                                     alternatives,
                                     ndraw=10000,
                                     burnin=2000,
                                     compute_intervals=True)
[8]:
pvals
[8]:
array([0.04349979, 0.0516205 , 0.00783708, 0.44920772])
[9]:
intervals
[9]:
array([[-29.28009003, -15.89881596],
[ 30.51031835, 37.58445178],
[-25.87190582, -8.52983107],
[ -3.73877411, 7.06250691]])
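To read these results off variable by variable, the arrays above can be zipped together with the active set; this is plain Python and not selectinf-specific:

for var, pval, (lower, upper) in zip(active_set, pvals, intervals):
    print(f"variable {var}: p-value {pval:.3f}, interval [{lower:.2f}, {upper:.2f}]")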