randomized.lasso

Module: randomized.lasso

Inheritance diagram for selectinf.randomized.lasso:

[Inheritance diagram: randomized.query.query → randomized.query.gaussian_query → randomized.lasso.lasso → randomized.lasso.split_lasso]

Classes

lasso

class selectinf.randomized.lasso.lasso(loglike, feature_weights, ridge_term, randomizer, perturb=None)[source]

Bases: selectinf.randomized.query.gaussian_query

A class for the randomized LASSO for post-selection inference. The problem solved is

\[\text{minimize}_{\beta} \ell(\beta) + \sum_{i=1}^p \lambda_i |\beta_i| - \omega^T\beta + \frac{\epsilon}{2} \|\beta\|^2_2\]

where \(\lambda\) is feature_weights, \(\omega\) is a randomization generated below, and the last term is a small ridge penalty with coefficient \(\epsilon\) (ridge_term). Each static method forms \(\ell\) as well as the \(\ell_1\) penalty. The generic class forms the remaining two terms in the objective.
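To make the four terms of the objective concrete, the sketch below evaluates them with NumPy for a squared-error loss. The helper name randomized_lasso_objective and the choice of loss are assumptions made for this illustration; they are not part of selectinf.

```python
import numpy as np

def randomized_lasso_objective(beta, X, Y, lam, omega, epsilon):
    """Evaluate loss + l1 penalty - randomization + ridge (hypothetical helper)."""
    beta = np.asarray(beta, dtype=float)
    lam = np.broadcast_to(np.asarray(lam, dtype=float), beta.shape)
    loss = 0.5 * np.sum((Y - X @ beta) ** 2)     # ell(beta); squared error assumed
    penalty = np.sum(lam * np.abs(beta))         # sum_i lambda_i |beta_i|
    linear = omega @ beta                        # omega^T beta, subtracted below
    ridge = 0.5 * epsilon * np.sum(beta ** 2)    # (epsilon / 2) ||beta||_2^2
    return loss + penalty - linear + ridge

X = np.eye(2)
Y = np.array([1.0, -2.0])
beta = np.array([0.5, -1.0])
omega = np.array([0.1, 0.2])
value = randomized_lasso_objective(beta, X, Y, lam=1.0, omega=omega, epsilon=0.5)
```

Here the loss contributes 0.625, the penalty 1.5, the randomization term 0.15 and the ridge term 0.3125.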

__init__(loglike, feature_weights, ridge_term, randomizer, perturb=None)[source]

Create a new post-selection object for the LASSO problem

Parameters

loglike : regreg.smooth.glm.glm

A (negative) log-likelihood as implemented in regreg.

feature_weights : np.ndarray

Feature weights for the L-1 penalty. If a float, it is broadcast to all features.

ridge_term : float

How big a ridge term to add?

randomizer : object

Randomizer – contains representation of randomization density.

perturb : np.ndarray

Random perturbation subtracted as a linear term in the objective function.

fit(solve_args={'min_its': 50, 'tol': 1e-12}, perturb=None)[source]

Fit the randomized lasso using regreg.

Parameters

solve_args : keyword args

Passed to regreg.problems.simple_problem.solve.

Returns

signs : np.float

Support and non-zero signs of randomized lasso solution.
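As an illustration of what a sign vector encodes, the sketch below derives the signs and the support from a coefficient vector. The values of beta_hat are made up, and the assumption that signs is the elementwise sign of the solution is ours rather than something stated on this page.

```python
import numpy as np

beta_hat = np.array([0.0, 1.3, -0.4, 0.0])   # made-up solution for illustration
signs = np.sign(beta_hat)                    # -1, 0 or +1 for each feature
support = signs != 0                         # boolean mask of selected features
selected = np.nonzero(support)[0]            # integer indices of selected features
```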

static gaussian(X, Y, feature_weights, sigma=1.0, quadratic=None, ridge_term=None, randomizer_scale=None)[source]

Squared-error LASSO with feature weights. Objective function is (before randomization)

\[\beta \mapsto \frac{1}{2} \|Y-X\beta\|^2_2 + \sum_{i=1}^p \lambda_i |\beta_i|\]

where \(\lambda\) is feature_weights. The ridge term is determined by the Hessian and np.std(Y) by default, as is the randomizer scale.
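The library solves this problem with regreg. Purely as an illustration, the sketch below minimizes the randomized squared-error objective (with sigma = 1) by proximal gradient descent in NumPy. The function names, the fixed iteration count and the step size rule are assumptions of this example, not the solver selectinf uses.

```python
import numpy as np

def soft_threshold(z, t):
    """Proximal operator of the weighted l1 penalty."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def solve_randomized_lasso(X, Y, lam, omega, epsilon, n_iter=500):
    """Proximal gradient sketch for
    0.5 * ||Y - X b||^2 + sum(lam * |b|) - omega^T b + 0.5 * epsilon * ||b||^2."""
    n, p = X.shape
    lam = np.broadcast_to(np.asarray(lam, dtype=float), (p,))
    # 1 / Lipschitz constant of the smooth part: largest eigenvalue of X^T X plus epsilon
    step = 1.0 / (np.linalg.norm(X, 2) ** 2 + epsilon)
    beta = np.zeros(p)
    for _ in range(n_iter):
        grad = -X.T @ (Y - X @ beta) - omega + epsilon * beta
        beta = soft_threshold(beta - step * grad, step * lam)
    return beta

# With an identity design the minimizer has a closed form:
# soft_threshold(Y + omega, lam) / (1 + epsilon).
X = np.eye(3)
Y = np.array([3.0, -0.5, -2.0])
omega = np.array([0.5, 0.2, -1.0])
beta_hat = solve_randomized_lasso(X, Y, lam=1.0, omega=omega, epsilon=0.25)
closed_form = soft_threshold(Y + omega, 1.0) / 1.25
```

The identity design is chosen only so that the result can be checked against the closed form; a general design needs the full iteration.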

Parameters

X : ndarray

Shape (n,p) – the design matrix.

Y : ndarray

Shape (n,) – the response.

feature_weights: [float, sequence]

Penalty weights. An intercept, or other unpenalized features are handled by setting those entries of feature_weights to 0. If feature_weights is a float, then all parameters are penalized equally.

sigma : float (optional)

Noise standard deviation. Set to 1 if covariance_estimator is not None. This scales the log-likelihood by sigma**(-2).

quadratic : regreg.identity_quadratic.identity_quadratic (optional)

An optional quadratic term to be added to the objective. Can also be a linear term by setting quadratic coefficient to 0.

ridge_term : float

How big a ridge term to add?

randomizer_scale : float

Scale for IID components of randomizer.

randomizer : str

One of [‘laplace’, ‘logistic’, ‘gaussian’]

Returns

L : selectinf.randomized.lasso.lasso
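The feature_weights convention described under Parameters can be sketched in NumPy as follows. The example assumes, for illustration only, that column 0 of the design is an intercept that should stay unpenalized.

```python
import numpy as np

p = 4
lam = 2.5                                # a single float penalty level
feature_weights = np.ones(p) * lam       # broadcast the float to every feature
feature_weights[0] = 0.0                 # column 0 assumed to be the intercept: unpenalized

beta = np.array([10.0, 1.0, -2.0, 0.0])
penalty = np.sum(feature_weights * np.abs(beta))   # the intercept contributes nothing
```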

static logistic(X, successes, feature_weights, trials=None, quadratic=None, ridge_term=None, randomizer_scale=None)[source]

Logistic LASSO with feature weights (before randomization)

\[\beta \mapsto \ell(X\beta) + \sum_{i=1}^p \lambda_i |\beta_i|\]

where \(\ell\) is the negative of the logistic log-likelihood (half the logistic deviance) and \(\lambda\) is feature_weights.
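A minimal NumPy sketch of the loss \(\ell\) for binomial responses, dropping the binomial coefficient (which does not depend on \(\beta\)). The function name logistic_negloglik is hypothetical. The example also shows converting proportions to integer successes, as the successes parameter requires.

```python
import numpy as np

def logistic_negloglik(beta, X, successes, trials=None):
    """Negative binomial log-likelihood with logit link, up to a constant."""
    eta = X @ beta
    if trials is None:
        trials = np.ones_like(eta)
    # log(1 + exp(eta)) computed stably with logaddexp
    return np.sum(trials * np.logaddexp(0.0, eta) - successes * eta)

X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
trials = np.array([1.0, 1.0, 3.0])
proportions = np.array([1.0, 0.0, 2.0 / 3.0])
successes = np.round(proportions * trials)     # proportions -> integer counts

nll_at_zero = logistic_negloglik(np.zeros(2), X, successes, trials)
```

At \(\beta = 0\) every linear predictor is zero, so the value is the total number of trials times log 2.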

Parameters

X : ndarray

Shape (n,p) – the design matrix.

successes : ndarray

Shape (n,) – response vector. An integer number of successes. For data that is proportions, multiply the proportions by the number of trials first.

feature_weights: [float, sequence]

Penalty weights. An intercept, or other unpenalized features are handled by setting those entries of feature_weights to 0. If feature_weights is a float, then all parameters are penalized equally.

trials : ndarray (optional)

Number of trials per response; defaults to ones with the same shape as successes.

quadratic : regreg.identity_quadratic.identity_quadratic (optional)

An optional quadratic term to be added to the objective. Can also be a linear term by setting quadratic coefficient to 0.

ridge_term : float

How big a ridge term to add?

randomizer_scale : float

Scale for IID components of randomizer.

randomizer : str

One of [‘laplace’, ‘logistic’, ‘gaussian’]

Returns

L : selectinf.randomized.lasso.lasso

static coxph(X, times, status, feature_weights, quadratic=None, ridge_term=None, randomizer_scale=None)[source]

Cox proportional hazards LASSO with feature weights. Objective function is (before randomization)

\[\beta \mapsto \ell^{\text{Cox}}(\beta) + \sum_{i=1}^p \lambda_i |\beta_i|\]

where \(\ell^{\text{Cox}}\) is the negative of the log of the Cox partial likelihood and \(\lambda\) is feature_weights. Uses Efron’s tie breaking method.
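For reference, the negative log partial likelihood with Efron's handling of ties can be sketched in NumPy as below. The function name is hypothetical, and the encoding status == 1 for an observed event (0 for censored) is an assumption of this example, since this page does not state the convention.

```python
import numpy as np

def cox_negloglik_efron(beta, X, times, status):
    """Negative log Cox partial likelihood using Efron's tie correction."""
    eta = X @ beta
    risk = np.exp(eta)
    total = 0.0
    for t in np.unique(times[status == 1]):
        at_risk = times >= t                      # subjects still under observation at t
        events = (times == t) & (status == 1)     # subjects with an event at t
        d = int(events.sum())
        risk_sum = risk[at_risk].sum()
        tied_sum = risk[events].sum()
        total -= eta[events].sum()
        # Efron: remove a growing fraction of the tied risk from the denominator
        for k in range(d):
            total += np.log(risk_sum - (k / d) * tied_sum)
    return total

X = np.array([[1.0], [0.0], [2.0]])
times = np.array([1.0, 1.0, 2.0])     # two events tied at time 1
status = np.array([1, 1, 1])
value = cox_negloglik_efron(np.zeros(1), X, times, status)
```

With \(\beta = 0\) the tied pair contributes log 3 + log 2 under Efron's method, whereas Breslow's method would give 2 log 3.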

Parameters

X : ndarray

Shape (n,p) – the design matrix.

times : ndarray

Shape (n,) – the survival times.

status : ndarray

Shape (n,) – the censoring status.

feature_weights: [float, sequence]

Penalty weights. An intercept, or other unpenalized features are handled by setting those entries of feature_weights to 0. If feature_weights is a float, then all parameters are penalized equally.

covariance_estimator : optional

If None, use the parametric covariance estimate of the selected model.

quadratic : regreg.identity_quadratic.identity_quadratic (optional)

An optional quadratic term to be added to the objective. Can also be a linear term by setting quadratic coefficient to 0.

ridge_term : float

How big a ridge term to add?

randomizer_scale : float

Scale for IID components of randomizer.

randomizer : str

One of [‘laplace’, ‘logistic’, ‘gaussian’]

Returns

L : selectinf.randomized.lasso.lasso

static poisson(X, counts, feature_weights, quadratic=None, ridge_term=None, randomizer_scale=None)[source]

Poisson log-linear LASSO with feature weights. Objective function is (before randomization)

\[\beta \mapsto \ell^{\text{Poisson}}(\beta) + \sum_{i=1}^p \lambda_i |\beta_i|\]

where \(\ell^{\text{Poisson}}\) is the negative of the log of the Poisson likelihood (half the deviance) and \(\lambda\) is feature_weights.
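A short NumPy sketch of the Poisson loss with log link and its gradient, dropping the log(counts!) term, which does not depend on \(\beta\). The function names are hypothetical.

```python
import numpy as np

def poisson_negloglik(beta, X, counts):
    """Negative Poisson log-likelihood with log link, up to a constant."""
    eta = X @ beta
    return np.sum(np.exp(eta) - counts * eta)

def poisson_negloglik_grad(beta, X, counts):
    """Gradient: X^T (fitted mean - counts)."""
    return X.T @ (np.exp(X @ beta) - counts)

X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
counts = np.array([0.0, 2.0, 4.0])
value = poisson_negloglik(np.zeros(2), X, counts)
grad = poisson_negloglik_grad(np.zeros(2), X, counts)
```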

Parameters

X : ndarray

Shape (n,p) – the design matrix.

counts : ndarray

Shape (n,) – the response.

feature_weights: [float, sequence]

Penalty weights. An intercept, or other unpenalized features are handled by setting those entries of feature_weights to 0. If feature_weights is a float, then all parameters are penalized equally.

quadratic : regreg.identity_quadratic.identity_quadratic (optional)

An optional quadratic term to be added to the objective. Can also be a linear term by setting quadratic coefficient to 0.

ridge_term : float

How big a ridge term to add?

randomizer_scale : float

Scale for IID components of randomizer.

randomizer : str

One of [‘laplace’, ‘logistic’, ‘gaussian’]

Returns

L : selectinf.randomized.lasso.lasso

static sqrt_lasso(X, Y, feature_weights, quadratic=None, ridge_term=None, randomizer_scale=None, solve_args={'min_its': 200}, perturb=None)[source]

Use sqrt-LASSO to choose variables. Objective function is (before randomization)

\[\beta \mapsto \|Y-X\beta\|_2 + \sum_{i=1}^p \lambda_i |\beta_i|\]

where \(\lambda\) is feature_weights. After solving the problem, the result is treated as a Gaussian LASSO with the implied variance and choice of multiplier. See arxiv.org/abs/1504.08031 for details.
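One property that distinguishes this objective from the squared-error LASSO is that it is scale equivariant: multiplying Y and \(\beta\) by the same positive constant multiplies the whole objective by that constant, so the penalty level does not need to track the noise level. The sketch below checks this numerically; the helper name and the simulated data are assumptions of the example.

```python
import numpy as np

def sqrt_lasso_objective(beta, X, Y, lam):
    """||Y - X beta||_2 + sum_i lam_i |beta_i| (no randomization)."""
    return np.linalg.norm(Y - X @ beta) + np.sum(lam * np.abs(beta))

rng = np.random.default_rng(0)
X = rng.standard_normal((30, 4))
Y = rng.standard_normal(30)
beta = rng.standard_normal(4)
lam = np.array([0.0, 1.0, 1.0, 2.0])   # first feature left unpenalized

c = 7.0
base = sqrt_lasso_objective(beta, X, Y, lam)
scaled = sqrt_lasso_objective(c * beta, X, c * Y, lam)
```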

Parameters

X : ndarray

Shape (n,p) – the design matrix.

Y : ndarray

Shape (n,) – the response.

feature_weights: [float, sequence]

Penalty weights. An intercept, or other unpenalized features are handled by setting those entries of feature_weights to 0. If feature_weights is a float, then all parameters are penalized equally.

quadratic : regreg.identity_quadratic.identity_quadratic (optional)

An optional quadratic term to be added to the objective. Can also be a linear term by setting quadratic coefficient to 0.

covariance : str

One of ‘parametric’ or ‘sandwich’. Method used to estimate covariance for inference in second stage.

solve_args : dict

Arguments passed to solver.

ridge_term : float

How big a ridge term to add?

randomizer_scale : float

Scale for IID components of randomizer.

randomizer : str

One of [‘laplace’, ‘logistic’, ‘gaussian’]

Returns

L : selectinf.randomized.lasso.lasso

Notes

Unlike other variants of LASSO, this solves the problem on construction, as the active set is needed to find the equivalent Gaussian LASSO. Assumes the parametric model is correct for inference, i.e. does not accept a covariance estimator.

get_sampler()

randomize(perturb=None)

The actual randomization step.

Parameters

perturb : ndarray, optional

Value of randomization vector, an instance of \(\omega\).
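To show how \(\omega\) enters the problem, the sketch below draws an isotropic Gaussian perturbation and checks that, for a squared-error loss, subtracting \(\omega^T\beta\) from the objective is the same as shifting \(X^TY\) by \(\omega\) in the gradient. The Gaussian draw, the scale value and the simulated data are assumptions of this example.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 20, 5
X = rng.standard_normal((n, p))
Y = rng.standard_normal(n)
beta = rng.standard_normal(p)

randomizer_scale = 0.7                                # illustrative value
omega = rng.normal(scale=randomizer_scale, size=p)    # one instance of omega

# Gradient of 0.5 * ||Y - X beta||^2 - omega^T beta, written two ways
grad_randomized = -X.T @ (Y - X @ beta) - omega
grad_shifted = -(X.T @ Y + omega) + X.T @ X @ beta
```

Passing a fixed perturb makes the selection step reproducible, since the same \(\omega\) is used each time.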

property sampler

Sampler of optimization (augmented) variables.

selective_MLE(observed_target, target_cov, target_score_cov, level=0.9, solve_args={'tol': 1e-12})

Parameters

observed_target : ndarray

Observed estimate of target.

target_cov : ndarray

Estimated covariance of target.

target_score_cov : ndarray

Estimated covariance of target and score of randomized query.

level : float, optional

Confidence level.

solve_args : dict, optional

Arguments passed to solver.
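As a reminder of what a confidence level of 0.9 means for a normal-based interval, the sketch below forms two-sided Wald intervals from an estimate and its covariance. Whether selective_MLE builds its intervals in exactly this way is an assumption; the helper wald_intervals is hypothetical and only illustrates the role of level.

```python
from statistics import NormalDist

import numpy as np

def wald_intervals(estimate, cov, level=0.9):
    """Two-sided normal intervals: estimate +/- z * standard error."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)    # about 1.645 for level=0.9
    se = np.sqrt(np.diag(cov))
    return np.column_stack([estimate - z * se, estimate + z * se])

estimate = np.array([1.0, -2.0])
cov = np.diag([4.0, 1.0])
intervals = wald_intervals(estimate, cov, level=0.9)
width_first = intervals[0, 1] - intervals[0, 0]
```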

set_sampler(sampler)

setup_sampler()

Set up the query to prepare for sampling. Should set a few key attributes:

  • observed_score_state

  • observed_opt_state

  • opt_transform

solve()

summary(observed_target, target_cov, target_score_cov, alternatives, opt_sample=None, target_sample=None, parameter=None, level=0.9, ndraw=10000, burnin=2000, compute_intervals=False)

Produce p-values and confidence intervals for targets of model including selected features

Parameters

target : one of [‘selected’, ‘full’]

features : np.bool

Binary encoding of which features to use in final model and targets.

parameter : np.array

Hypothesized value for parameter – defaults to 0.

level : float

Confidence level.

ndraw : int (optional)

Defaults to 10000.

burnin : int (optional)

Defaults to 2000.

compute_intervals : bool

Compute confidence intervals?

dispersion : float (optional)

Use a known value for dispersion, or Pearson’s X^2?
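For context, Pearson's X^2 estimate of dispersion divides the sum of squared Pearson residuals by the residual degrees of freedom. The sketch below is a generic NumPy version; the function name and the example numbers are made up, and how selectinf computes the estimate internally is not shown on this page.

```python
import numpy as np

def pearson_dispersion(y, mu, variance, n_params):
    """Pearson X^2 / (n - number of fitted parameters)."""
    pearson_x2 = np.sum((y - mu) ** 2 / variance)
    return pearson_x2 / (len(y) - n_params)

y = np.array([1.0, 2.0, 4.0, 3.0])
mu = np.array([1.5, 2.0, 3.0, 3.5])     # fitted means from some model
variance = np.ones(4)                   # Gaussian case: variance function is 1
dispersion = pearson_dispersion(y, mu, variance, n_params=2)
```

In the Gaussian case this reduces to the residual sum of squares divided by n minus the number of parameters.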

useC = True

split_lasso

class selectinf.randomized.lasso.split_lasso(loglike, feature_weights, proportion_select, ridge_term=0, perturb=None)[source]

Bases: selectinf.randomized.lasso.lasso

Data split, then LASSO (i.e. data carving)
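In data carving, a random subset of the observations is used for selection, and inference then uses all of the data while accounting for that selection. The sketch below draws such a subset as a boolean mask. The variable names and the rounding rule are assumptions of this example, not the library's implementation.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10
proportion_select = 0.6                          # fraction of rows used for selection

n_select = int(round(proportion_select * n))
selection_idx = np.zeros(n, dtype=bool)
selection_idx[:n_select] = True
rng.shuffle(selection_idx)                       # random subset of the rows

held_out_idx = ~selection_idx                    # rows not used in the selection step
```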

__init__(loglike, feature_weights, proportion_select, ridge_term=0, perturb=None)[source]

Create a new post-selection object for the LASSO problem

Parameters

loglike : regreg.smooth.glm.glm

A (negative) log-likelihood as implemented in regreg.

feature_weights : np.ndarray

Feature weights for the L-1 penalty. If a float, it is broadcast to all features.

ridge_term : float

How big a ridge term to add?

randomizer : object

Randomizer – contains representation of randomization density.

perturb : np.ndarray

Random perturbation subtracted as a linear term in the objective function.

fit(solve_args={'min_its': 50, 'tol': 1e-12}, perturb=None, estimate_dispersion=True)[source]

Fit the randomized lasso using regreg.

Parameters

solve_args : keyword args

Passed to regreg.problems.simple_problem.solve.

Returns

signs : np.float

Support and non-zero signs of randomized lasso solution.

static gaussian(X, Y, feature_weights, proportion, sigma=1.0, quadratic=None, ridge_term=0)[source]

Squared-error LASSO with feature weights. Objective function is (before randomization)

\[ \beta \mapsto \frac{1}{2} \|Y-X\beta\|^2_2 + \sum_{i=1}^p \lambda_i |\beta_i|\]

where \(\lambda\) is feature_weights. The ridge term is determined by the Hessian and np.std(Y) by default.

Parameters

X : ndarray

Shape (n,p) – the design matrix.

Y : ndarray

Shape (n,) – the response.

feature_weights: [float, sequence]

Penalty weights. An intercept, or other unpenalized features are handled by setting those entries of feature_weights to 0. If feature_weights is a float, then all parameters are penalized equally.

sigma : float (optional)

Noise standard deviation. Set to 1 if covariance_estimator is not None. This scales the log-likelihood by sigma**(-2).

quadratic : regreg.identity_quadratic.identity_quadratic (optional)

An optional quadratic term to be added to the objective. Can also be a linear term by setting quadratic coefficient to 0.

randomizer_scale : float

Scale for IID components of randomizer.

randomizer : str

One of [‘laplace’, ‘logistic’, ‘gaussian’]

Returns

L : selectinf.randomized.lasso.lasso

static coxph(X, times, status, feature_weights, quadratic=None, ridge_term=None, randomizer_scale=None)

Cox proportional hazards LASSO with feature weights. Objective function is (before randomization)

\[\beta \mapsto \ell^{\text{Cox}}(\beta) + \sum_{i=1}^p \lambda_i |\beta_i|\]

where \(\ell^{\text{Cox}}\) is the negative of the log of the Cox partial likelihood and \(\lambda\) is feature_weights. Uses Efron’s tie breaking method.

Parameters

X : ndarray

Shape (n,p) – the design matrix.

times : ndarray

Shape (n,) – the survival times.

status : ndarray

Shape (n,) – the censoring status.

feature_weights: [float, sequence]

Penalty weights. An intercept, or other unpenalized features are handled by setting those entries of feature_weights to 0. If feature_weights is a float, then all parameters are penalized equally.

covariance_estimator : optional

If None, use the parametric covariance estimate of the selected model.

quadratic : regreg.identity_quadratic.identity_quadratic (optional)

An optional quadratic term to be added to the objective. Can also be a linear term by setting quadratic coefficient to 0.

ridge_term : float

How big a ridge term to add?

randomizer_scale : float

Scale for IID components of randomizer.

randomizer : str

One of [‘laplace’, ‘logistic’, ‘gaussian’]

Returns

L : selectinf.randomized.lasso.lasso

get_sampler()

static logistic(X, successes, feature_weights, trials=None, quadratic=None, ridge_term=None, randomizer_scale=None)

Logistic LASSO with feature weights (before randomization)

\[\beta \mapsto \ell(X\beta) + \sum_{i=1}^p \lambda_i |\beta_i|\]

where \(\ell\) is the negative of the logistic log-likelihood (half the logistic deviance) and \(\lambda\) is feature_weights.

Parameters

X : ndarray

Shape (n,p) – the design matrix.

successes : ndarray

Shape (n,) – response vector. An integer number of successes. For data that is proportions, multiply the proportions by the number of trials first.

feature_weights: [float, sequence]

Penalty weights. An intercept, or other unpenalized features are handled by setting those entries of feature_weights to 0. If feature_weights is a float, then all parameters are penalized equally.

trials : ndarray (optional)

Number of trials per response; defaults to ones with the same shape as successes.

quadratic : regreg.identity_quadratic.identity_quadratic (optional)

An optional quadratic term to be added to the objective. Can also be a linear term by setting quadratic coefficient to 0.

ridge_term : float

How big a ridge term to add?

randomizer_scale : float

Scale for IID components of randomizer.

randomizer : str

One of [‘laplace’, ‘logistic’, ‘gaussian’]

Returns

L : selectinf.randomized.lasso.lasso

static poisson(X, counts, feature_weights, quadratic=None, ridge_term=None, randomizer_scale=None)

Poisson log-linear LASSO with feature weights. Objective function is (before randomization)

\[\beta \mapsto \ell^{\text{Poisson}}(\beta) + \sum_{i=1}^p \lambda_i |\beta_i|\]

where \(\ell^{\text{Poisson}}\) is the negative of the log of the Poisson likelihood (half the deviance) and \(\lambda\) is feature_weights.

Parameters

X : ndarray

Shape (n,p) – the design matrix.

counts : ndarray

Shape (n,) – the response.

feature_weights: [float, sequence]

Penalty weights. An intercept, or other unpenalized features are handled by setting those entries of feature_weights to 0. If feature_weights is a float, then all parameters are penalized equally.

quadratic : regreg.identity_quadratic.identity_quadratic (optional)

An optional quadratic term to be added to the objective. Can also be a linear term by setting quadratic coefficient to 0.

ridge_term : float

How big a ridge term to add?

randomizer_scale : float

Scale for IID components of randomizer.

randomizer : str

One of [‘laplace’, ‘logistic’, ‘gaussian’]

Returns

L : selectinf.randomized.lasso.lasso

randomize(perturb=None)

The actual randomization step.

Parameters

perturb : ndarray, optional

Value of randomization vector, an instance of \(\omega\).

property sampler

Sampler of optimization (augmented) variables.

selective_MLE(observed_target, target_cov, target_score_cov, level=0.9, solve_args={'tol': 1e-12})

Parameters

observed_target : ndarray

Observed estimate of target.

target_cov : ndarray

Estimated covariance of target.

target_score_cov : ndarray

Estimated covariance of target and score of randomized query.

level : float, optional

Confidence level.

solve_args : dict, optional

Arguments passed to solver.

set_sampler(sampler)

setup_sampler()

Set up the query to prepare for sampling. Should set a few key attributes:

  • observed_score_state

  • observed_opt_state

  • opt_transform

solve()

static sqrt_lasso(X, Y, feature_weights, quadratic=None, ridge_term=None, randomizer_scale=None, solve_args={'min_its': 200}, perturb=None)

Use sqrt-LASSO to choose variables. Objective function is (before randomization)

\[\beta \mapsto \|Y-X\beta\|_2 + \sum_{i=1}^p \lambda_i |\beta_i|\]

where \(\lambda\) is feature_weights. After solving the problem, the result is treated as a Gaussian LASSO with the implied variance and choice of multiplier. See arxiv.org/abs/1504.08031 for details.

Parameters

X : ndarray

Shape (n,p) – the design matrix.

Y : ndarray

Shape (n,) – the response.

feature_weights: [float, sequence]

Penalty weights. An intercept, or other unpenalized features are handled by setting those entries of feature_weights to 0. If feature_weights is a float, then all parameters are penalized equally.

quadratic : regreg.identity_quadratic.identity_quadratic (optional)

An optional quadratic term to be added to the objective. Can also be a linear term by setting quadratic coefficient to 0.

covariance : str

One of ‘parametric’ or ‘sandwich’. Method used to estimate covariance for inference in second stage.

solve_args : dict

Arguments passed to solver.

ridge_term : float

How big a ridge term to add?

randomizer_scale : float

Scale for IID components of randomizer.

randomizer : str

One of [‘laplace’, ‘logistic’, ‘gaussian’]

Returns

L : selectinf.randomized.lasso.lasso

Notes

Unlike other variants of LASSO, this solves the problem on construction, as the active set is needed to find the equivalent Gaussian LASSO. Assumes the parametric model is correct for inference, i.e. does not accept a covariance estimator.

summary(observed_target, target_cov, target_score_cov, alternatives, opt_sample=None, target_sample=None, parameter=None, level=0.9, ndraw=10000, burnin=2000, compute_intervals=False)

Produce p-values and confidence intervals for targets of model including selected features

Parameters

target : one of [‘selected’, ‘full’]

features : np.bool

Binary encoding of which features to use in final model and targets.

parameter : np.array

Hypothesized value for parameter – defaults to 0.

level : float

Confidence level.

ndraw : int (optional)

Defaults to 10000.

burnin : int (optional)

Defaults to 2000.

compute_intervals : bool

Compute confidence intervals?

dispersion : float (optional)

Use a known value for dispersion, or Pearson’s X^2?

useC = True

Functions

selectinf.randomized.lasso.debiased_targets(loglike, W, features, sign_info={}, penalty=None, dispersion=None, approximate_inverse='JM', debiasing_args={})[source]
selectinf.randomized.lasso.form_targets(target, loglike, W, features, **kwargs)[source]
selectinf.randomized.lasso.full_targets(loglike, W, features, dispersion=None, solve_args={'min_its': 50, 'tol': 1e-12})[source]
selectinf.randomized.lasso.selected_targets(loglike, W, features, sign_info={}, dispersion=None, solve_args={'min_its': 50, 'tol': 1e-12})[source]
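These functions are listed without descriptions. As a rough guide to the kind of quantities a "selected" target involves in the Gaussian case, the sketch below computes the least-squares estimate restricted to the selected columns together with its covariance. That selected_targets reduces to this for a squared-error loss is an assumption; the helper selected_ols_target is hypothetical and is not the library's implementation.

```python
import numpy as np

def selected_ols_target(X, Y, features, dispersion=None):
    """OLS on the selected columns and its estimated covariance."""
    X_E = X[:, features]                       # design restricted to selected features
    n, k = X_E.shape
    gram_inv = np.linalg.inv(X_E.T @ X_E)
    observed_target = gram_inv @ X_E.T @ Y
    if dispersion is None:
        resid = Y - X_E @ observed_target
        dispersion = np.sum(resid ** 2) / (n - k)   # residual variance estimate
    target_cov = dispersion * gram_inv
    return observed_target, target_cov

X = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, -1.0],
              [0.0, 1.0, -1.0]])
Y = np.array([1.0, 2.0, 3.0, 4.0])
features = np.array([True, True, False])       # boolean encoding of selected features

observed_target, target_cov = selected_ols_target(X, Y, features)
```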