algorithms.lasso
Module: algorithms.lasso
[Figure: inheritance diagram for selectinf.algorithms.lasso]
This module contains a class lasso that implements post-selection inference for the lasso as described in post selection LASSO.
- covTest: http://arxiv.org/abs/1301.7161
- Kac Rice: http://arxiv.org/abs/1308.3020
- Spacings: http://arxiv.org/abs/1401.3889
- post selection LASSO: http://arxiv.org/abs/1311.6238
- sample carving: http://arxiv.org/abs/1410.2597
Classes
ROSI
-
class selectinf.algorithms.lasso.ROSI(loglike, feature_weights, approximate_inverse='BN')
Bases: selectinf.algorithms.lasso.lasso
A class for the LASSO for post-selection inference. The problem solved is
$$ \text{minimize}_{\beta} \frac{1}{2n} \|y-X\beta\|^2_2 + \lambda \|\beta\|_1 $$
where \(\lambda\) is lam.
Notes
In solving the debiasing problem to approximate the inverse of (X^TWX) in a GLM, this class makes the implicit assumption that the scaling of X is such that diag(X^TWX) is O(n) with n=X.shape[0]. That is, X’s are similar to IID samples from a population that does not depend on n.
-
__init__(loglike, feature_weights, approximate_inverse='BN')
Create a new post-selection object for the LASSO problem.
- Parameters
loglike : regreg.smooth.glm.glm
A (negative) log-likelihood as implemented in regreg.
feature_weights : np.ndarray
Feature weights for the L-1 penalty. If a float, it is broadcast to all features.
approximate_inverse : str (optional)
One of “JM” (Javanmard, Montanari), “BN” (Boot, Nibbering) or None. The form of approximate inverse to use when p is close to (or larger than) n.
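The broadcasting rule for feature_weights can be illustrated without regreg. The helper names below (broadcast_weights, weighted_l1) are hypothetical and not part of selectinf; this is a minimal sketch in plain Python of how a scalar weight could be expanded to one weight per feature, and how a zero entry leaves a coefficient unpenalized.

```python
def broadcast_weights(feature_weights, p):
    """Return a list of p penalty weights from a float or a sequence."""
    if isinstance(feature_weights, (int, float)):
        # A scalar is repeated for every feature.
        return [float(feature_weights)] * p
    weights = [float(w) for w in feature_weights]
    if len(weights) != p:
        raise ValueError("expected %d weights, got %d" % (p, len(weights)))
    return weights


def weighted_l1(beta, feature_weights):
    """Weighted L-1 penalty: sum_i lambda_i * |beta_i|."""
    weights = broadcast_weights(feature_weights, len(beta))
    return sum(w * abs(b) for w, b in zip(weights, beta))


beta = [1.5, -2.0, 0.0]
# Scalar weight: every coordinate is penalized equally.
print(weighted_l1(beta, 0.5))
# Zero weight on the first coordinate (for example an intercept).
print(weighted_l1(beta, [0.0, 0.5, 0.5]))
```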
-
fit(lasso_solution=None, solve_args={'min_its': 50, 'tol': 1e-12}, debiasing_args={})
Fit the lasso using regreg. This sets the attributes soln and onestep, and forms the constraints necessary for post-selection inference by calling form_constraints().
- Parameters
lasso_solution : optional
If not None, this is taken to be the solution of the optimization problem. No checks are done, though the implied affine constraints will generally not be satisfied.
solve_args : keyword args
Passed to regreg.problems.simple_problem.solve.
debiasing_args : dict
Arguments passed to .debiased_lasso.debiasing_matrix or .debiased_lasso.pseudoinverse_debiasing_matrix depending on self.approximate_inverse.
- Returns
soln : np.float
Solution to lasso.
Notes
If self already has an attribute lasso_solution this will be taken to be the solution and no optimization problem will be solved. Supplying the optional argument lasso_solution will overwrite self’s lasso_solution.
-
summary(level=0.95, compute_intervals=False, dispersion=None, truth=None)
Summary table for inference adjusted for selection.
- Parameters
level : float
Form level*100% selective confidence intervals.
compute_intervals : bool
Should we compute confidence intervals?
dispersion : float
Estimate of dispersion. Defaults to a Pearson’s X^2 estimate in the relaxed model.
truth : np.array
True values of each beta for selected variables. If not None, the ‘pval’ column holds p-values computed under the corresponding null hypotheses.
- Returns
pval_summary : np.recarray
Array with one entry per active variable. Columns are ‘variable’, ‘pval’, ‘lasso’, ‘onestep’, ‘lower_trunc’, ‘upper_trunc’, ‘sd’.
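The lower_trunc, upper_trunc and sd columns suggest a truncated Gaussian reference distribution, as in the post selection LASSO paper. The sketch below is an assumption about how such columns could be turned into a two-sided p-value; truncated_gaussian_pvalue is a hypothetical helper, not the library's implementation, and the naive CDF differences lose accuracy far in the tails.

```python
import math


def norm_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def truncated_gaussian_pvalue(observed, lower, upper, sd, null_mean=0.0):
    """Two-sided p-value for N(null_mean, sd^2) truncated to [lower, upper]."""
    a = norm_cdf((lower - null_mean) / sd)
    b = norm_cdf((upper - null_mean) / sd)
    x = norm_cdf((observed - null_mean) / sd)
    # CDF of the truncated distribution evaluated at the observed value.
    cdf = (x - a) / (b - a)
    return 2.0 * min(cdf, 1.0 - cdf)


inf = float("inf")
# Without truncation this is the usual two-sided z-test.
print(truncated_gaussian_pvalue(1.96, -inf, inf, 1.0))
# Conditioning on observed >= 1.0 makes the same value less surprising.
print(truncated_gaussian_pvalue(1.96, 1.0, inf, 1.0))
```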
-
property soln
Solution to the lasso problem, set by fit method.
-
classmethod gaussian(X, Y, feature_weights, sigma=1.0, covariance_estimator=None, quadratic=None, approximate_inverse=None)
Squared-error LASSO with feature weights. Objective function is
$$ \beta \mapsto \frac{1}{2} \|Y-X\beta\|^2_2 + \sum_{i=1}^p \lambda_i |\beta_i| $$
where \(\lambda\) is feature_weights.
- Parameters
X : ndarray
Shape (n,p) – the design matrix.
Y : ndarray
Shape (n,) – the response.
feature_weights : [float, sequence]
Penalty weights. An intercept, or other unpenalized features, are handled by setting those entries of feature_weights to 0. If feature_weights is a float, then all parameters are penalized equally.
sigma : float (optional)
Noise standard deviation. Set to 1 if covariance_estimator is not None. This scales the loglikelihood by sigma**(-2).
covariance_estimator : callable (optional)
If None, use the parametric covariance estimate of the selected model.
quadratic : regreg.identity_quadratic.identity_quadratic (optional)
An optional quadratic term to be added to the objective. Can also be a linear term by setting the quadratic coefficient to 0.
- Returns
L : selectinf.algorithms.lasso.lasso
Notes
If not None, covariance_estimator should take arguments (beta, active, inactive) and return an estimate of some of the rows and columns of the covariance of \((\bar{\beta}_E, \nabla \ell(\bar{\beta}_E)_{-E})\), the unpenalized estimator and the inactive coordinates of the gradient of the likelihood at the unpenalized estimator.
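To make the squared-error objective concrete, here is a minimal sketch in plain Python. It reads sigma as a standard deviation, which is an assumption based on the stated sigma**(-2) scaling of the log-likelihood; gaussian_lasso_objective is a hypothetical name and not part of selectinf.

```python
def gaussian_lasso_objective(X, Y, beta, feature_weights, sigma=1.0):
    """Squared-error loss scaled by sigma**(-2) plus a weighted L1 penalty."""
    # Fitted values X @ beta, computed row by row.
    fitted = [sum(x * b for x, b in zip(row, beta)) for row in X]
    rss = sum((y - f) ** 2 for y, f in zip(Y, fitted))
    loss = 0.5 * rss / sigma ** 2
    penalty = sum(w * abs(b) for w, b in zip(feature_weights, beta))
    return loss + penalty


X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]  # first column is an intercept
Y = [1.0, 2.0, 4.0]
beta = [1.0, 1.0]
weights = [0.0, 0.5]  # intercept left unpenalized

print(gaussian_lasso_objective(X, Y, beta, weights))
print(gaussian_lasso_objective(X, Y, beta, weights, sigma=2.0))
```

A larger sigma shrinks the loss term relative to the penalty, so the same feature_weights penalize more heavily in relative terms.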
-
classmethod logistic(X, successes, feature_weights, trials=None, covariance_estimator=None, quadratic=None, approximate_inverse=None)
Logistic LASSO with feature weights. Objective function is
$$ \beta \mapsto \ell(X\beta) + \sum_{i=1}^p \lambda_i |\beta_i| $$
where \(\ell\) is the negative of the logistic log-likelihood (half the logistic deviance) and \(\lambda\) is feature_weights.
- Parameters
X : ndarray
Shape (n,p) – the design matrix.
successes : ndarray
Shape (n,) – response vector. An integer number of successes. For data that is proportions, multiply the proportions by the number of trials first.
feature_weights : [float, sequence]
Penalty weights. An intercept, or other unpenalized features, are handled by setting those entries of feature_weights to 0. If feature_weights is a float, then all parameters are penalized equally.
trials : ndarray (optional)
Number of trials per response, defaults to ones the same shape as successes.
covariance_estimator : optional
If None, use the parametric covariance estimate of the selected model.
quadratic : regreg.identity_quadratic.identity_quadratic (optional)
An optional quadratic term to be added to the objective. Can also be a linear term by setting the quadratic coefficient to 0.
- Returns
L : selectinf.algorithms.lasso.lasso
Notes
If not None, covariance_estimator should take arguments (beta, active, inactive) and return an estimate of the covariance of \((\bar{\beta}_E, \nabla \ell(\bar{\beta}_E)_{-E})\), the unpenalized estimator and the inactive coordinates of the gradient of the likelihood at the unpenalized estimator.
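The logistic objective with trials can be sketched as follows. This is an illustration in plain Python, not the regreg implementation: logistic_lasso_objective is a hypothetical name, and the loss is written up to an additive constant that does not depend on beta (the binomial coefficients).

```python
import math


def softplus(eta):
    """Numerically stable log(1 + exp(eta))."""
    return max(eta, 0.0) + math.log1p(math.exp(-abs(eta)))


def logistic_lasso_objective(X, successes, beta, feature_weights, trials=None):
    """Negative binomial log-likelihood (up to a constant) plus weighted L1."""
    if trials is None:
        # One trial per response, matching the documented default.
        trials = [1.0] * len(successes)
    eta = [sum(x * b for x, b in zip(row, beta)) for row in X]
    loss = sum(t * softplus(e) - s * e
               for t, s, e in zip(trials, successes, eta))
    penalty = sum(w * abs(b) for w, b in zip(feature_weights, beta))
    return loss + penalty


X = [[1.0, 0.0], [1.0, 1.0]]
trials = [2.0, 4.0]
proportions = [0.5, 0.75]
# Proportions are converted to success counts before fitting.
successes = [p * t for p, t in zip(proportions, trials)]
print(logistic_lasso_objective(X, successes, [0.0, 0.0], [0.0, 0.5], trials))
```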
-
classmethod poisson(X, counts, feature_weights, covariance_estimator=None, quadratic=None, approximate_inverse=None)
Poisson log-linear LASSO with feature weights. Objective function is
$$ \beta \mapsto \ell^{\text{Poisson}}(\beta) + \sum_{i=1}^p \lambda_i |\beta_i| $$
where \(\ell^{\text{Poisson}}\) is the negative of the log of the Poisson likelihood (half the deviance) and \(\lambda\) is feature_weights.
- Parameters
X : ndarray
Shape (n,p) – the design matrix.
counts : ndarray
Shape (n,) – the response.
feature_weights : [float, sequence]
Penalty weights. An intercept, or other unpenalized features, are handled by setting those entries of feature_weights to 0. If feature_weights is a float, then all parameters are penalized equally.
covariance_estimator : optional
If None, use the parametric covariance estimate of the selected model.
quadratic : regreg.identity_quadratic.identity_quadratic (optional)
An optional quadratic term to be added to the objective. Can also be a linear term by setting the quadratic coefficient to 0.
- Returns
L : selectinf.algorithms.lasso.lasso
Notes
If not None, covariance_estimator should take arguments (beta, active, inactive) and return an estimate of the covariance of \((\bar{\beta}_E, \nabla \ell(\bar{\beta}_E)_{-E})\), the unpenalized estimator and the inactive coordinates of the gradient of the likelihood at the unpenalized estimator.
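For the Poisson log-linear case, the objective can be sketched in the same way. The function name poisson_lasso_objective is hypothetical, and the loss omits the log(counts!) term, which does not depend on beta; this is an illustration rather than the regreg code.

```python
import math


def poisson_lasso_objective(X, counts, beta, feature_weights):
    """Negative Poisson log-likelihood (up to a constant) plus weighted L1."""
    eta = [sum(x * b for x, b in zip(row, beta)) for row in X]
    # With a log link the mean is exp(eta); the loss is mean - count * eta.
    loss = sum(math.exp(e) - y * e for e, y in zip(eta, counts))
    penalty = sum(w * abs(b) for w, b in zip(feature_weights, beta))
    return loss + penalty


X = [[1.0], [1.0]]
counts = [1, 3]
beta = [math.log(2.0)]  # fitted mean of 2 for both observations
print(poisson_lasso_objective(X, counts, beta, [0.5]))
```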
-
property constraints
Affine constraints for this LASSO problem. These are the constraints determined only by the active block.
-
classmethod coxph(X, times, status, feature_weights, covariance_estimator=None, quadratic=None)
Cox proportional hazards LASSO with feature weights. Objective function is
$$ \beta \mapsto \ell^{\text{Cox}}(\beta) + \sum_{i=1}^p \lambda_i |\beta_i| $$
where \(\ell^{\text{Cox}}\) is the negative of the log of the Cox partial likelihood and \(\lambda\) is feature_weights. Uses Efron’s tie breaking method.
- Parameters
X : ndarray
Shape (n,p) – the design matrix.
times : ndarray
Shape (n,) – the survival times.
status : ndarray
Shape (n,) – the censoring status.
feature_weights : [float, sequence]
Penalty weights. An intercept, or other unpenalized features, are handled by setting those entries of feature_weights to 0. If feature_weights is a float, then all parameters are penalized equally.
covariance_estimator : optional
If None, use the parametric covariance estimate of the selected model.
quadratic : regreg.identity_quadratic.identity_quadratic (optional)
An optional quadratic term to be added to the objective. Can also be a linear term by setting the quadratic coefficient to 0.
- Returns
L : selectinf.algorithms.lasso.lasso
Notes
If not None, covariance_estimator should take arguments (beta, active, inactive) and return an estimate of the covariance of \((\bar{\beta}_E, \nabla \ell(\bar{\beta}_E)_{-E})\), the unpenalized estimator and the inactive coordinates of the gradient of the likelihood at the unpenalized estimator.
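As a rough illustration of the Cox loss term, the sketch below evaluates the negative log partial likelihood when all event times are distinct, so no tie breaking is needed. It does not implement Efron's method. It also assumes status is 1 for an observed event and 0 for a censored observation, which the parameter description does not spell out; the function name is hypothetical.

```python
import math


def cox_neg_log_partial_likelihood(eta, times, status):
    """Negative log partial likelihood, assuming no tied event times."""
    total = 0.0
    for i, (t_i, d_i) in enumerate(zip(times, status)):
        if d_i:
            # Risk set: everyone still under observation at time t_i.
            risk = sum(math.exp(e_j)
                       for e_j, t_j in zip(eta, times) if t_j >= t_i)
            total -= eta[i] - math.log(risk)
    return total


times = [1.0, 2.0, 3.0]
status = [1, 0, 1]        # second observation is censored
eta = [0.0, 0.0, 0.0]     # linear predictor X @ beta with beta = 0
print(cox_neg_log_partial_likelihood(eta, times, status))
```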
-
classmethod sqrt_lasso(X, Y, feature_weights, quadratic=None, covariance='parametric', sigma_estimate='truncated', solve_args={'min_its': 200})
Use sqrt-LASSO to choose variables. Objective function is
$$ \beta \mapsto \|Y-X\beta\|_2 + \sum_{i=1}^p \lambda_i |\beta_i| $$
where \(\lambda\) is feature_weights. After solving the problem, inference treats it as a Gaussian LASSO with the implied variance and choice of multiplier. See arxiv.org/abs/1504.08031 for details.
- Parameters
X : ndarray
Shape (n,p) – the design matrix.
Y : ndarray
Shape (n,) – the response.
feature_weights : [float, sequence]
Penalty weights. An intercept, or other unpenalized features, are handled by setting those entries of feature_weights to 0. If feature_weights is a float, then all parameters are penalized equally.
quadratic : regreg.identity_quadratic.identity_quadratic (optional)
An optional quadratic term to be added to the objective. Can also be a linear term by setting the quadratic coefficient to 0.
covariance : str
One of ‘parametric’ or ‘sandwich’. Method used to estimate covariance for inference in second stage.
sigma_estimate : str
One of ‘truncated’ or ‘OLS’. Method used to estimate \(\sigma\) when using parametric covariance.
solve_args : dict
Arguments passed to solver.
- Returns
L : selectinf.algorithms.lasso.lasso
Notes
Unlike other variants of LASSO, this solves the problem on construction as the active set is needed to find equivalent gaussian LASSO. Assumes parametric model is correct for inference, i.e. does not accept a covariance estimator.
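One property visible from the sqrt-LASSO objective as written is that rescaling Y and beta by the same positive constant rescales the whole objective, so the support of the minimizer does not depend on the noise scale. The sketch below checks this numerically; sqrt_lasso_objective is a hypothetical helper, and this is a property of the formula rather than a description of the library's solver.

```python
import math


def sqrt_lasso_objective(X, Y, beta, feature_weights):
    """Unsquared residual norm plus a weighted L1 penalty."""
    fitted = [sum(x * b for x, b in zip(row, beta)) for row in X]
    resid_norm = math.sqrt(sum((y - f) ** 2 for y, f in zip(Y, fitted)))
    penalty = sum(w * abs(b) for w, b in zip(feature_weights, beta))
    return resid_norm + penalty


X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
Y = [1.0, 2.0, 4.0]
beta = [1.0, 1.0]
weights = [0.0, 0.5]

c = 3.0
base = sqrt_lasso_objective(X, Y, beta, weights)
scaled = sqrt_lasso_objective(X, [c * y for y in Y],
                              [c * b for b in beta], weights)
# The scaled objective equals c times the original objective.
print(base, scaled)
```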
-
ROSI_modelQ
-
class selectinf.algorithms.lasso.ROSI_modelQ(Q, X, y, feature_weights)
Bases: selectinf.algorithms.lasso.lasso
A class for the LASSO for post-selection inference. The problem solved is
$$ \text{minimize}_{\beta} -(X\beta)^Ty + \frac{1}{2} \beta^TQ\beta + \sum_i \lambda_i |\beta_i| $$
where \(\lambda\) is feature_weights.
Notes
In solving the debiasing problem to approximate the inverse of (X^TWX) in a GLM, this class makes the implicit assumption that the scaling of X is such that diag(X^TWX) is O(n) with n=X.shape[0]. That is, X’s are similar to IID samples from a population that does not depend on n.
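The Q form of the objective relates to the squared-error LASSO in a simple way: taking Q = X^TX gives the squared-error objective minus the constant \(\frac{1}{2}\|y\|^2_2\). The sketch below verifies this on a small example; the function names are hypothetical and nothing here is taken from the selectinf source.

```python
def gram(X):
    """Return X^T X for a matrix stored as a list of rows."""
    p = len(X[0])
    return [[sum(row[j] * row[k] for row in X) for k in range(p)]
            for j in range(p)]


def model_q_objective(Q, X, y, beta, feature_weights):
    """-(X beta)^T y + 0.5 * beta^T Q beta + weighted L1 penalty."""
    fitted = [sum(x * b for x, b in zip(row, beta)) for row in X]
    linear = -sum(f * yi for f, yi in zip(fitted, y))
    Q_beta = [sum(q * b for q, b in zip(row, beta)) for row in Q]
    quad = 0.5 * sum(b * qb for b, qb in zip(beta, Q_beta))
    penalty = sum(w * abs(b) for w, b in zip(feature_weights, beta))
    return linear + quad + penalty


X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
y = [1.0, 2.0, 4.0]
beta = [1.0, 1.0]
weights = [0.0, 0.5]

q_value = model_q_objective(gram(X), X, y, beta, weights)

# Squared-error form minus the constant 0.5 * ||y||^2.
fitted = [sum(x * b for x, b in zip(row, beta)) for row in X]
rss = sum((yi - f) ** 2 for yi, f in zip(y, fitted))
penalty = sum(w * abs(b) for w, b in zip(weights, beta))
squared_error_value = 0.5 * rss - 0.5 * sum(yi ** 2 for yi in y) + penalty
print(q_value, squared_error_value)
```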
-
__init__(Q, X, y, feature_weights)
Create a new post-selection object for the LASSO problem.
- Parameters
Q : np.ndarray((p,p))
X : np.ndarray((n, p))
y : np.ndarray(n)
feature_weights : np.ndarray
Feature weights for the L-1 penalty. If a float, it is broadcast to all features.
-
fit(solve_args={'min_its': 50, 'tol': 1e-12}, debiasing_args={})
Fit the lasso using regreg. This sets the attributes soln and onestep, and forms the constraints necessary for post-selection inference by calling form_constraints().
- Parameters
lasso_solution : optional
If not None, this is taken to be the solution of the optimization problem. No checks are done, though the implied affine constraints will generally not be satisfied.
solve_args : keyword args
Passed to regreg.problems.simple_problem.solve.
- Returns
soln : np.float
Solution to lasso.
Notes
If self already has an attribute lasso_solution this will be taken to be the solution and no optimization problem will be solved. Supplying the optional argument lasso_solution will overwrite self’s lasso_solution.
-
summary(level=0.05, compute_intervals=False, dispersion=None)
Summary table for inference adjusted for selection.
- Parameters
level : float
Form level*100% selective confidence intervals.
compute_intervals : bool
Should we compute confidence intervals?
dispersion : float
Estimate of dispersion. Defaults to a Pearson’s X^2 estimate in the relaxed model.
- Returns
pval_summary : np.recarray
Array with one entry per active variable. Columns are ‘variable’, ‘pval’, ‘lasso’, ‘onestep’, ‘lower_trunc’, ‘upper_trunc’, ‘sd’.
-
property constraints
Affine constraints for this LASSO problem. These are the constraints determined only by the active block.
-
classmethod coxph(X, times, status, feature_weights, covariance_estimator=None, quadratic=None)
Cox proportional hazards LASSO with feature weights. Objective function is
$$ \beta \mapsto \ell^{\text{Cox}}(\beta) + \sum_{i=1}^p \lambda_i |\beta_i| $$
where \(\ell^{\text{Cox}}\) is the negative of the log of the Cox partial likelihood and \(\lambda\) is feature_weights. Uses Efron’s tie breaking method.
- Parameters
X : ndarray
Shape (n,p) – the design matrix.
times : ndarray
Shape (n,) – the survival times.
status : ndarray
Shape (n,) – the censoring status.
feature_weights : [float, sequence]
Penalty weights. An intercept, or other unpenalized features, are handled by setting those entries of feature_weights to 0. If feature_weights is a float, then all parameters are penalized equally.
covariance_estimator : optional
If None, use the parametric covariance estimate of the selected model.
quadratic : regreg.identity_quadratic.identity_quadratic (optional)
An optional quadratic term to be added to the objective. Can also be a linear term by setting the quadratic coefficient to 0.
- Returns
L : selectinf.algorithms.lasso.lasso
Notes
If not None, covariance_estimator should take arguments (beta, active, inactive) and return an estimate of the covariance of \((\bar{\beta}_E, \nabla \ell(\bar{\beta}_E)_{-E})\), the unpenalized estimator and the inactive coordinates of the gradient of the likelihood at the unpenalized estimator.
-
classmethod gaussian(X, Y, feature_weights, sigma=1.0, covariance_estimator=None, quadratic=None)
Squared-error LASSO with feature weights. Objective function is
$$ \beta \mapsto \frac{1}{2} \|Y-X\beta\|^2_2 + \sum_{i=1}^p \lambda_i |\beta_i| $$
where \(\lambda\) is feature_weights.
- Parameters
X : ndarray
Shape (n,p) – the design matrix.
Y : ndarray
Shape (n,) – the response.
feature_weights : [float, sequence]
Penalty weights. An intercept, or other unpenalized features, are handled by setting those entries of feature_weights to 0. If feature_weights is a float, then all parameters are penalized equally.
sigma : float (optional)
Noise standard deviation. Set to 1 if covariance_estimator is not None. This scales the loglikelihood by sigma**(-2).
covariance_estimator : callable (optional)
If None, use the parametric covariance estimate of the selected model.
quadratic : regreg.identity_quadratic.identity_quadratic (optional)
An optional quadratic term to be added to the objective. Can also be a linear term by setting the quadratic coefficient to 0.
- Returns
L : selectinf.algorithms.lasso.lasso
Notes
If not None, covariance_estimator should take arguments (beta, active, inactive) and return an estimate of some of the rows and columns of the covariance of \((\bar{\beta}_E, \nabla \ell(\bar{\beta}_E)_{-E})\), the unpenalized estimator and the inactive coordinates of the gradient of the likelihood at the unpenalized estimator.
-
classmethod logistic(X, successes, feature_weights, trials=None, covariance_estimator=None, quadratic=None)
Logistic LASSO with feature weights. Objective function is
$$ \beta \mapsto \ell(X\beta) + \sum_{i=1}^p \lambda_i |\beta_i| $$
where \(\ell\) is the negative of the logistic log-likelihood (half the logistic deviance) and \(\lambda\) is feature_weights.
- Parameters
X : ndarray
Shape (n,p) – the design matrix.
successes : ndarray
Shape (n,) – response vector. An integer number of successes. For data that is proportions, multiply the proportions by the number of trials first.
feature_weights : [float, sequence]
Penalty weights. An intercept, or other unpenalized features, are handled by setting those entries of feature_weights to 0. If feature_weights is a float, then all parameters are penalized equally.
trials : ndarray (optional)
Number of trials per response, defaults to ones the same shape as successes.
covariance_estimator : optional
If None, use the parametric covariance estimate of the selected model.
quadratic : regreg.identity_quadratic.identity_quadratic (optional)
An optional quadratic term to be added to the objective. Can also be a linear term by setting the quadratic coefficient to 0.
- Returns
L : selectinf.algorithms.lasso.lasso
Notes
If not None, covariance_estimator should take arguments (beta, active, inactive) and return an estimate of the covariance of \((\bar{\beta}_E, \nabla \ell(\bar{\beta}_E)_{-E})\), the unpenalized estimator and the inactive coordinates of the gradient of the likelihood at the unpenalized estimator.
-
classmethod poisson(X, counts, feature_weights, covariance_estimator=None, quadratic=None)
Poisson log-linear LASSO with feature weights. Objective function is
$$ \beta \mapsto \ell^{\text{Poisson}}(\beta) + \sum_{i=1}^p \lambda_i |\beta_i| $$
where \(\ell^{\text{Poisson}}\) is the negative of the log of the Poisson likelihood (half the deviance) and \(\lambda\) is feature_weights.
- Parameters
X : ndarray
Shape (n,p) – the design matrix.
counts : ndarray
Shape (n,) – the response.
feature_weights : [float, sequence]
Penalty weights. An intercept, or other unpenalized features, are handled by setting those entries of feature_weights to 0. If feature_weights is a float, then all parameters are penalized equally.
covariance_estimator : optional
If None, use the parametric covariance estimate of the selected model.
quadratic : regreg.identity_quadratic.identity_quadratic (optional)
An optional quadratic term to be added to the objective. Can also be a linear term by setting the quadratic coefficient to 0.
- Returns
L : selectinf.algorithms.lasso.lasso
Notes
If not None, covariance_estimator should take arguments (beta, active, inactive) and return an estimate of the covariance of \((\bar{\beta}_E, \nabla \ell(\bar{\beta}_E)_{-E})\), the unpenalized estimator and the inactive coordinates of the gradient of the likelihood at the unpenalized estimator.
-
property soln
Solution to the lasso problem, set by fit method.
-
classmethod sqrt_lasso(X, Y, feature_weights, quadratic=None, covariance='parametric', sigma_estimate='truncated', solve_args={'min_its': 200})
Use sqrt-LASSO to choose variables. Objective function is
$$ \beta \mapsto \|Y-X\beta\|_2 + \sum_{i=1}^p \lambda_i |\beta_i| $$
where \(\lambda\) is feature_weights. After solving the problem, inference treats it as a Gaussian LASSO with the implied variance and choice of multiplier. See arxiv.org/abs/1504.08031 for details.
- Parameters
X : ndarray
Shape (n,p) – the design matrix.
Y : ndarray
Shape (n,) – the response.
feature_weights : [float, sequence]
Penalty weights. An intercept, or other unpenalized features, are handled by setting those entries of feature_weights to 0. If feature_weights is a float, then all parameters are penalized equally.
quadratic : regreg.identity_quadratic.identity_quadratic (optional)
An optional quadratic term to be added to the objective. Can also be a linear term by setting the quadratic coefficient to 0.
covariance : str
One of ‘parametric’ or ‘sandwich’. Method used to estimate covariance for inference in second stage.
sigma_estimate : str
One of ‘truncated’ or ‘OLS’. Method used to estimate \(\sigma\) when using parametric covariance.
solve_args : dict
Arguments passed to solver.
- Returns
L : selectinf.algorithms.lasso.lasso
Notes
Unlike other variants of LASSO, this solves the problem on construction as the active set is needed to find equivalent gaussian LASSO. Assumes parametric model is correct for inference, i.e. does not accept a covariance estimator.
-
data_carving
-
class selectinf.algorithms.lasso.data_carving(loglike_select, loglike_inference, loglike_full, feature_weights, covariance_estimator=None)
Bases: selectinf.algorithms.lasso.lasso
Notes
Even if a covariance estimator is supplied, we assume that we can drop inactive constraints, i.e. the same (asymptotic) independence that holds for parametric model is assumed to hold here as well.
-
__init__(loglike_select, loglike_inference, loglike_full, feature_weights, covariance_estimator=None)
Create a new post-selection object for the LASSO problem.
- Parameters
loglike : regreg.smooth.glm.glm
A (negative) log-likelihood as implemented in regreg.
feature_weights : np.ndarray
Feature weights for the L-1 penalty. If a float, it is broadcast to all features.
covariance_estimator : callable (optional)
If None, use the parametric covariance estimate of the selected model.
If not None, covariance_estimator should take arguments (beta, active, inactive) and return an estimate of the covariance of \((\bar{\beta}_E, \nabla \ell(\bar{\beta}_E)_{-E})\), the unpenalized estimator and the inactive coordinates of the gradient of the likelihood at the unpenalized estimator.
-
classmethod gaussian(X, Y, feature_weights, split_frac=0.9, sigma=1.0, stage_one=None)
Squared-error LASSO with feature weights. Objective function is
$$ \beta \mapsto \frac{1}{2} \|Y-X\beta\|^2_2 + \sum_{i=1}^p \lambda_i |\beta_i| $$
where \(\lambda\) is feature_weights.
- Parameters
X : ndarray
Shape (n,p) – the design matrix.
Y : ndarray
Shape (n,) – the response.
feature_weights : [float, sequence]
Penalty weights. An intercept, or other unpenalized features, are handled by setting those entries of feature_weights to 0. If feature_weights is a float, then all parameters are penalized equally.
sigma : float (optional)
Noise standard deviation. Set to 1 if covariance_estimator is not None. This scales the loglikelihood by sigma**(-2).
covariance_estimator : callable (optional)
If None, use the parametric covariance estimate of the selected model.
quadratic : regreg.identity_quadratic.identity_quadratic (optional)
An optional quadratic term to be added to the objective. Can also be a linear term by setting the quadratic coefficient to 0.
- Returns
L : selectinf.algorithms.lasso.lasso
Notes
If not None, covariance_estimator should take arguments (beta, active, inactive) and return an estimate of some of the rows and columns of the covariance of \((\bar{\beta}_E, \nabla \ell(\bar{\beta}_E)_{-E})\), the unpenalized estimator and the inactive coordinates of the gradient of the likelihood at the unpenalized estimator.
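The data_carving signatures take split_frac and stage_one, which the parameter lists do not describe. One plausible reading, stated here as an assumption rather than documented behavior, is that a fraction split_frac of the rows is drawn at random for the selection stage and the remaining rows are held out. The helper split_rows below is hypothetical and only illustrates that reading.

```python
import random


def split_rows(n, split_frac=0.9, seed=0):
    """Randomly split row indices 0..n-1 into two disjoint stages."""
    rng = random.Random(seed)
    n_select = int(split_frac * n)
    # Rows used to select the model (assumed meaning of stage_one).
    stage_one = sorted(rng.sample(range(n), n_select))
    chosen = set(stage_one)
    # Rows held out from the selection stage.
    stage_two = [i for i in range(n) if i not in chosen]
    return stage_one, stage_two


stage_one, stage_two = split_rows(10, split_frac=0.9)
print(len(stage_one), len(stage_two))
```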
-
classmethod logistic(X, successes, feature_weights, trials=None, split_frac=0.9, sigma=1.0, stage_one=None)
Logistic LASSO with feature weights. Objective function is
$$ \beta \mapsto \ell(X\beta) + \sum_{i=1}^p \lambda_i |\beta_i| $$
where \(\ell\) is the negative of the logistic log-likelihood (half the logistic deviance) and \(\lambda\) is feature_weights.
- Parameters
X : ndarray
Shape (n,p) – the design matrix.
successes : ndarray
Shape (n,) – response vector. An integer number of successes. For data that is proportions, multiply the proportions by the number of trials first.
feature_weights : [float, sequence]
Penalty weights. An intercept, or other unpenalized features, are handled by setting those entries of feature_weights to 0. If feature_weights is a float, then all parameters are penalized equally.
trials : ndarray (optional)
Number of trials per response, defaults to ones the same shape as successes.
covariance_estimator : optional
If None, use the parametric covariance estimate of the selected model.
quadratic : regreg.identity_quadratic.identity_quadratic (optional)
An optional quadratic term to be added to the objective. Can also be a linear term by setting the quadratic coefficient to 0.
- Returns
L : selectinf.algorithms.lasso.lasso
Notes
If not None, covariance_estimator should take arguments (beta, active, inactive) and return an estimate of the covariance of \((\bar{\beta}_E, \nabla \ell(\bar{\beta}_E)_{-E})\), the unpenalized estimator and the inactive coordinates of the gradient of the likelihood at the unpenalized estimator.
-
classmethod poisson(X, counts, feature_weights, split_frac=0.9, sigma=1.0, stage_one=None)
Poisson log-linear LASSO with feature weights. Objective function is
$$ \beta \mapsto \ell^{\text{Poisson}}(\beta) + \sum_{i=1}^p \lambda_i |\beta_i| $$
where \(\ell^{\text{Poisson}}\) is the negative of the log of the Poisson likelihood (half the deviance) and \(\lambda\) is feature_weights.
- Parameters
X : ndarray
Shape (n,p) – the design matrix.
counts : ndarray
Shape (n,) – the response.
feature_weights : [float, sequence]
Penalty weights. An intercept, or other unpenalized features, are handled by setting those entries of feature_weights to 0. If feature_weights is a float, then all parameters are penalized equally.
covariance_estimator : optional
If None, use the parametric covariance estimate of the selected model.
quadratic : regreg.identity_quadratic.identity_quadratic (optional)
An optional quadratic term to be added to the objective. Can also be a linear term by setting the quadratic coefficient to 0.
- Returns
L : selectinf.algorithms.lasso.lasso
Notes
If not None, covariance_estimator should take arguments (beta, active, inactive) and return an estimate of the covariance of \((\bar{\beta}_E, \nabla \ell(\bar{\beta}_E)_{-E})\), the unpenalized estimator and the inactive coordinates of the gradient of the likelihood at the unpenalized estimator.
-
classmethod coxph(X, times, status, feature_weights, split_frac=0.9, sigma=1.0, stage_one=None)
Cox proportional hazards LASSO with feature weights. Objective function is
$$ \beta \mapsto \ell^{\text{Cox}}(\beta) + \sum_{i=1}^p \lambda_i |\beta_i| $$
where \(\ell^{\text{Cox}}\) is the negative of the log of the Cox partial likelihood and \(\lambda\) is feature_weights. Uses Efron’s tie breaking method.
- Parameters
X : ndarray
Shape (n,p) – the design matrix.
times : ndarray
Shape (n,) – the survival times.
status : ndarray
Shape (n,) – the censoring status.
feature_weights : [float, sequence]
Penalty weights. An intercept, or other unpenalized features, are handled by setting those entries of feature_weights to 0. If feature_weights is a float, then all parameters are penalized equally.
covariance_estimator : optional
If None, use the parametric covariance estimate of the selected model.
quadratic : regreg.identity_quadratic.identity_quadratic (optional)
An optional quadratic term to be added to the objective. Can also be a linear term by setting the quadratic coefficient to 0.
- Returns
L : selectinf.algorithms.lasso.lasso
Notes
If not None, covariance_estimator should take arguments (beta, active, inactive) and return an estimate of the covariance of \((\bar{\beta}_E, \nabla \ell(\bar{\beta}_E)_{-E})\), the unpenalized estimator and the inactive coordinates of the gradient of the likelihood at the unpenalized estimator.
-
classmethod sqrt_lasso(X, Y, feature_weights, split_frac=0.9, stage_one=None, solve_args={'min_its': 200})
Use sqrt-LASSO to choose variables. Objective function is
$$ \beta \mapsto \|Y-X\beta\|_2 + \sum_{i=1}^p \lambda_i |\beta_i| $$
where \(\lambda\) is feature_weights. After solving the problem, inference treats it as a Gaussian LASSO with the implied variance and choice of multiplier. See arxiv.org/abs/1504.08031 for details.
- Parameters
X : ndarray
Shape (n,p) – the design matrix.
Y : ndarray
Shape (n,) – the response.
feature_weights : [float, sequence]
Penalty weights. An intercept, or other unpenalized features, are handled by setting those entries of feature_weights to 0. If feature_weights is a float, then all parameters are penalized equally.
quadratic : regreg.identity_quadratic.identity_quadratic (optional)
An optional quadratic term to be added to the objective. Can also be a linear term by setting the quadratic coefficient to 0.
covariance : str
One of ‘parametric’ or ‘sandwich’. Method used to estimate covariance for inference in second stage.
sigma_estimate : str
One of ‘truncated’ or ‘OLS’. Method used to estimate \(\sigma\) when using parametric covariance.
solve_args : dict
Arguments passed to solver.
- Returns
L : selectinf.algorithms.lasso.lasso
Notes
Unlike other variants of LASSO, this solves the problem on construction as the active set is needed to find equivalent gaussian LASSO. Assumes parametric model is correct for inference, i.e. does not accept a covariance estimator.
-
fit(solve_args={'min_its': 50, 'tol': 1e-12})
Fit the lasso using regreg. This sets the attributes soln and onestep, and forms the constraints necessary for post-selection inference by calling form_constraints().
- Parameters
lasso_solution : optional
If not None, this is taken to be the solution of the optimization problem. No checks are done, though the implied affine constraints will generally not be satisfied.
solve_args : keyword args
Passed to regreg.problems.simple_problem.solve.
- Returns
soln : np.float
Solution to lasso.
Notes
If self already has an attribute lasso_solution this will be taken to be the solution and no optimization problem will be solved. Supplying the optional argument lasso_solution will overwrite self’s lasso_solution.
-
property constraints
Affine constraints for this LASSO problem. These are the constraints determined only by the active block.
-
property soln
Solution to the lasso problem, set by fit method.
-
summary(alternative='twosided', level=0.95, compute_intervals=False, truth=None)
Summary table for inference adjusted for selection.
- Parameters
alternative : str
One of [“twosided”, “onesided”]
level : float
Form level*100% selective confidence intervals.
compute_intervals : bool
Should we compute confidence intervals?
truth : np.array
True values of each beta for selected variables. If not None, the ‘pval’ column holds p-values computed under the corresponding null hypotheses.
- Returns
pval_summary : np.recarray
Array with one entry per active variable. Columns are ‘variable’, ‘pval’, ‘lasso’, ‘onestep’, ‘lower_trunc’, ‘upper_trunc’, ‘sd’.
-
data_splitting
-
class selectinf.algorithms.lasso.data_splitting(loglike_select, loglike_inference, loglike_full, feature_weights, covariance_estimator=None)
Bases: selectinf.algorithms.lasso.data_carving
-
__init__(loglike_select, loglike_inference, loglike_full, feature_weights, covariance_estimator=None)
Create a new post-selection object for the LASSO problem.
- Parameters
loglike : regreg.smooth.glm.glm
A (negative) log-likelihood as implemented in regreg.
feature_weights : np.ndarray
Feature weights for the L-1 penalty. If a float, it is broadcast to all features.
covariance_estimator : callable (optional)
If None, use the parametric covariance estimate of the selected model.
If not None, covariance_estimator should take arguments (beta, active, inactive) and return an estimate of the covariance of \((\bar{\beta}_E, \nabla \ell(\bar{\beta}_E)_{-E})\), the unpenalized estimator and the inactive coordinates of the gradient of the likelihood at the unpenalized estimator.
-
fit(solve_args={'min_its': 500, 'tol': 1e-12}, use_full_cov=True)
Fit the lasso using regreg. This sets the attributes soln and onestep, and forms the constraints necessary for post-selection inference by calling form_constraints().
- Parameters
lasso_solution : optional
If not None, this is taken to be the solution of the optimization problem. No checks are done, though the implied affine constraints will generally not be satisfied.
solve_args : keyword args
Passed to regreg.problems.simple_problem.solve.
- Returns
soln : np.float
Solution to lasso.
Notes
If self already has an attribute lasso_solution this will be taken to be the solution and no optimization problem will be solved. Supplying the optional argument lasso_solution will overwrite self’s lasso_solution.
-
property
constraints
¶ Affine constraints for this LASSO problem. These are the constraints determined only by the active block.
-
classmethod
coxph
(X, times, status, feature_weights, split_frac=0.9, sigma=1.0, stage_one=None)¶ Cox proportional hazards LASSO with feature weights. Objective function is
\[\beta \mapsto \ell^{\text{Cox}}(\beta) + \sum_{i=1}^p \lambda_i |\beta_i|\]
where \(\ell^{\text{Cox}}\) is the negative of the log of the Cox partial likelihood and \(\lambda\) is feature_weights. Uses Efron’s tie breaking method.
- Parameters
X : ndarray
Shape (n,p) – the design matrix.
times : ndarray
Shape (n,) – the survival times.
status : ndarray
Shape (n,) – the censoring status.
feature_weights : [float, sequence]
Penalty weights. An intercept or other unpenalized features are handled by setting those entries of feature_weights to 0. If feature_weights is a float, then all parameters are penalized equally.
covariance_estimator : optional
If None, use the parametric covariance estimate of the selected model.
quadratic : regreg.identity_quadratic.identity_quadratic (optional)
An optional quadratic term to be added to the objective. Can also be a linear term by setting quadratic coefficient to 0.
- Returns
L : selection.algorithms.lasso.lasso
Notes
If not None, covariance_estimator should take arguments (beta, active, inactive) and return an estimate of the covariance of \((\bar{\beta}_E, \nabla \ell(\bar{\beta}_E)_{-E})\), the unpenalized estimator and the inactive coordinates of the gradient of the likelihood at the unpenalized estimator.
-
classmethod
gaussian
(X, Y, feature_weights, split_frac=0.9, sigma=1.0, stage_one=None)¶ Squared-error LASSO with feature weights. Objective function is
\[\beta \mapsto \frac{1}{2} \|Y-X\beta\|^2_2 + \sum_{i=1}^p \lambda_i |\beta_i|\]
where \(\lambda\) is feature_weights.
- Parameters
X : ndarray
Shape (n,p) – the design matrix.
Y : ndarray
Shape (n,) – the response.
feature_weights : [float, sequence]
Penalty weights. An intercept or other unpenalized features are handled by setting those entries of feature_weights to 0. If feature_weights is a float, then all parameters are penalized equally.
sigma : float (optional)
Noise variance. Set to 1 if covariance_estimator is not None. This scales the loglikelihood by sigma**(-2).
covariance_estimator : callable (optional)
If None, use the parametric covariance estimate of the selected model.
quadratic : regreg.identity_quadratic.identity_quadratic (optional)
An optional quadratic term to be added to the objective. Can also be a linear term by setting quadratic coefficient to 0.
- Returns
L : selection.algorithms.lasso.lasso
Notes
If not None, covariance_estimator should take arguments (beta, active, inactive) and return an estimate of some of the rows and columns of the covariance of \((\bar{\beta}_E, \nabla \ell(\bar{\beta}_E)_{-E})\), the unpenalized estimator and the inactive coordinates of the gradient of the likelihood at the unpenalized estimator.
-
classmethod
logistic
(X, successes, feature_weights, trials=None, split_frac=0.9, sigma=1.0, stage_one=None)¶ Logistic LASSO with feature weights. Objective function is
\[\beta \mapsto \ell(X\beta) + \sum_{i=1}^p \lambda_i |\beta_i|\]
where \(\ell\) is the negative of the logistic log-likelihood (half the logistic deviance) and \(\lambda\) is feature_weights.
- Parameters
X : ndarray
Shape (n,p) – the design matrix.
successes : ndarray
Shape (n,) – response vector. An integer number of successes. For data that is proportions, multiply the proportions by the number of trials first.
feature_weights : [float, sequence]
Penalty weights. An intercept or other unpenalized features are handled by setting those entries of feature_weights to 0. If feature_weights is a float, then all parameters are penalized equally.
trials : ndarray (optional)
Number of trials per response. Defaults to an array of ones with the same shape as successes.
covariance_estimator : optional
If None, use the parametric covariance estimate of the selected model.
quadratic : regreg.identity_quadratic.identity_quadratic (optional)
An optional quadratic term to be added to the objective. Can also be a linear term by setting quadratic coefficient to 0.
- Returns
L : selection.algorithms.lasso.lasso
Notes
If not None, covariance_estimator should take arguments (beta, active, inactive) and return an estimate of the covariance of \((\bar{\beta}_E, \nabla \ell(\bar{\beta}_E)_{-E})\), the unpenalized estimator and the inactive coordinates of the gradient of the likelihood at the unpenalized estimator.
-
classmethod
poisson
(X, counts, feature_weights, split_frac=0.9, sigma=1.0, stage_one=None)¶ Poisson log-linear LASSO with feature weights. Objective function is
\[\beta \mapsto \ell^{\text{Poisson}}(\beta) + \sum_{i=1}^p \lambda_i |\beta_i|\]
where \(\ell^{\text{Poisson}}\) is the negative of the log of the Poisson likelihood (half the deviance) and \(\lambda\) is feature_weights.
- Parameters
X : ndarray
Shape (n,p) – the design matrix.
counts : ndarray
Shape (n,) – the response.
feature_weights : [float, sequence]
Penalty weights. An intercept or other unpenalized features are handled by setting those entries of feature_weights to 0. If feature_weights is a float, then all parameters are penalized equally.
covariance_estimator : optional
If None, use the parametric covariance estimate of the selected model.
quadratic : regreg.identity_quadratic.identity_quadratic (optional)
An optional quadratic term to be added to the objective. Can also be a linear term by setting quadratic coefficient to 0.
- Returns
L : selection.algorithms.lasso.lasso
Notes
If not None, covariance_estimator should take arguments (beta, active, inactive) and return an estimate of the covariance of \((\bar{\beta}_E, \nabla \ell(\bar{\beta}_E)_{-E})\), the unpenalized estimator and the inactive coordinates of the gradient of the likelihood at the unpenalized estimator.
-
property
soln
¶ Solution to the lasso problem, set by fit method.
-
classmethod
sqrt_lasso
(X, Y, feature_weights, split_frac=0.9, stage_one=None, solve_args={'min_its': 200})¶ Use sqrt-LASSO to choose variables. Objective function is
\[\beta \mapsto \|Y-X\beta\|_2 + \sum_{i=1}^p \lambda_i |\beta_i|\]
where \(\lambda\) is feature_weights. After solving the problem, treat it as if Gaussian with implied variance and choice of multiplier. See arxiv.org/abs/1504.08031 for details.
- Parameters
X : ndarray
Shape (n,p) – the design matrix.
Y : ndarray
Shape (n,) – the response.
feature_weights : [float, sequence]
Penalty weights. An intercept or other unpenalized features are handled by setting those entries of feature_weights to 0. If feature_weights is a float, then all parameters are penalized equally.
quadratic : regreg.identity_quadratic.identity_quadratic (optional)
An optional quadratic term to be added to the objective. Can also be a linear term by setting quadratic coefficient to 0.
covariance : str
One of ‘parametric’ or ‘sandwich’. Method used to estimate covariance for inference in second stage.
sigma_estimate : str
One of ‘truncated’ or ‘OLS’. Method used to estimate \(\sigma\) when using parametric covariance.
solve_args : dict
Arguments passed to solver.
- Returns
L : selection.algorithms.lasso.lasso
Notes
Unlike other variants of LASSO, this solves the problem on construction as the active set is needed to find equivalent gaussian LASSO. Assumes parametric model is correct for inference, i.e. does not accept a covariance estimator.
-
summary
(alternative='twosided', level=0.95, compute_intervals=False, truth=None)¶ Summary table for inference adjusted for selection.
- Parameters
alternative : str
One of [“twosided”, “onesided”]
level : float
Form level*100% selective confidence intervals.
compute_intervals : bool
Should we compute confidence intervals?
truth : np.array
True values of each beta for selected variables. If not None, the ‘pval’ column contains p-values computed under these corresponding null hypotheses.
- Returns
pval_summary : np.recarray
Array with one entry per active variable. Columns are ‘variable’, ‘pval’, ‘lasso’, ‘onestep’, ‘lower_trunc’, ‘upper_trunc’, ‘sd’.
-
lasso
¶
-
class
selectinf.algorithms.lasso.
lasso
(loglike, feature_weights, covariance_estimator=None, ignore_inactive_constraints=False)[source]¶ Bases:
object
A class for the LASSO for post-selection inference. The problem solved is
\[\text{minimize}_{\beta} \frac{1}{2n} \|y-X\beta\|^2_2 + \lambda \|\beta\|_1\]
where \(\lambda\) is lam.
-
__init__
(loglike, feature_weights, covariance_estimator=None, ignore_inactive_constraints=False)[source]¶ Create a new post-selection inference object for the LASSO problem.
- Parameters
loglike : regreg.smooth.glm.glm
A (negative) log-likelihood as implemented in regreg.
feature_weights : np.ndarray
Feature weights for L-1 penalty. If a float, it is broadcast to all features.
covariance_estimator : callable (optional)
If None, use the parametric covariance estimate of the selected model.
If not None, covariance_estimator should take arguments (beta, active, inactive) and return an estimate of the covariance of \((\bar{\beta}_E, \nabla \ell(\bar{\beta}_E)_{-E})\), the unpenalized estimator and the inactive coordinates of the gradient of the likelihood at the unpenalized estimator.
-
fit
(lasso_solution=None, solve_args={'min_its': 50, 'tol': 1e-12})[source]¶ Fit the lasso using regreg. This sets the attributes soln, onestep and forms the constraints necessary for post-selection inference by calling form_constraints().
- Parameters
lasso_solution : optional
If not None, this is taken to be the solution of the optimization problem. No checks are done, though the implied affine constraints will generally not be satisfied.
solve_args : keyword args
Passed to regreg.problems.simple_problem.solve.
- Returns
soln : np.float
Solution to lasso.
Notes
If self already has an attribute lasso_solution this will be taken to be the solution and no optimization problem will be solved. Supplying the optional argument lasso_solution will overwrite self’s lasso_solution.
-
summary
(alternative='twosided', level=0.95, compute_intervals=False, truth=None)[source]¶ Summary table for inference adjusted for selection.
- Parameters
alternative : str
One of [“twosided”, “onesided”]
level : float
Form level*100% selective confidence intervals.
compute_intervals : bool
Should we compute confidence intervals?
truth : np.array
True values of each beta for selected variables. If not None, the ‘pval’ column contains p-values computed under these corresponding null hypotheses.
- Returns
pval_summary : np.recarray
Array with one entry per active variable. Columns are ‘variable’, ‘pval’, ‘lasso’, ‘onestep’, ‘lower_trunc’, ‘upper_trunc’, ‘sd’.
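The ‘lower_trunc’, ‘upper_trunc’ and ‘sd’ columns describe a Gaussian restricted to an interval. As a rough sketch of how a p-value can be formed from such quantities, one can compare the observed statistic with a normal law of standard deviation sd conditioned to lie in [lower_trunc, upper_trunc]. The function names below are hypothetical, the direction of the one-sided test is an assumption, and this is not claimed to be how summary fills its ‘pval’ column.

```python
import math

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def truncated_gaussian_pvalue(observed, lower_trunc, upper_trunc, sd,
                              truth=0.0, alternative="twosided"):
    """P-value for `observed` under N(truth, sd^2) truncated to an interval."""
    a = norm_cdf((lower_trunc - truth) / sd)
    b = norm_cdf((upper_trunc - truth) / sd)
    z = norm_cdf((observed - truth) / sd)
    # P(Z >= observed | lower_trunc <= Z <= upper_trunc)
    upper_tail = (b - z) / (b - a)
    if alternative == "onesided":
        return upper_tail
    return 2.0 * min(upper_tail, 1.0 - upper_tail)

# Without truncation this reduces to the usual two-sided z-test.
p_free = truncated_gaussian_pvalue(1.96, -math.inf, math.inf, 1.0)
# Conditioning on the statistic exceeding 1.5 makes 1.96 far less surprising.
p_trunc = truncated_gaussian_pvalue(1.96, 1.5, math.inf, 1.0)
```

The difference of CDF values used here loses precision far in the tails, so a production implementation would need more careful numerics.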
-
property
soln
¶ Solution to the lasso problem, set by fit method.
-
property
constraints
¶ Affine constraints for this LASSO problem. These are the constraints determined only by the active block.
-
classmethod
gaussian
(X, Y, feature_weights, sigma=1.0, covariance_estimator=None, quadratic=None)[source]¶ Squared-error LASSO with feature weights. Objective function is
\[\beta \mapsto \frac{1}{2} \|Y-X\beta\|^2_2 + \sum_{i=1}^p \lambda_i |\beta_i|\]
where \(\lambda\) is feature_weights.
- Parameters
X : ndarray
Shape (n,p) – the design matrix.
Y : ndarray
Shape (n,) – the response.
feature_weights : [float, sequence]
Penalty weights. An intercept or other unpenalized features are handled by setting those entries of feature_weights to 0. If feature_weights is a float, then all parameters are penalized equally.
sigma : float (optional)
Noise variance. Set to 1 if covariance_estimator is not None. This scales the loglikelihood by sigma**(-2).
covariance_estimator : callable (optional)
If None, use the parametric covariance estimate of the selected model.
quadratic : regreg.identity_quadratic.identity_quadratic (optional)
An optional quadratic term to be added to the objective. Can also be a linear term by setting quadratic coefficient to 0.
- Returns
L : selection.algorithms.lasso.lasso
Notes
If not None, covariance_estimator should take arguments (beta, active, inactive) and return an estimate of some of the rows and columns of the covariance of \((\bar{\beta}_E, \nabla \ell(\bar{\beta}_E)_{-E})\), the unpenalized estimator and the inactive coordinates of the gradient of the likelihood at the unpenalized estimator.
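To make the role of feature_weights concrete, here is a minimal NumPy sketch that evaluates the squared-error objective for a given beta. It illustrates how a scalar weight is broadcast to every feature and how a zero weight leaves a coefficient such as an intercept unpenalized. The helper lasso_objective is hypothetical and is not part of selectinf.

```python
import numpy as np

def lasso_objective(X, Y, beta, feature_weights):
    """Evaluate 0.5 * ||Y - X beta||_2^2 + sum_i lambda_i * |beta_i|."""
    p = X.shape[1]
    # A float is broadcast to one weight per feature; an array is used as is.
    weights = np.ones(p) * feature_weights
    resid = Y - X @ beta
    return 0.5 * resid @ resid + np.sum(weights * np.abs(beta))

rng = np.random.default_rng(0)
# First column is an intercept.
X = np.column_stack([np.ones(20), rng.standard_normal((20, 3))])
Y = rng.standard_normal(20)
beta = np.array([0.5, 1.0, 0.0, -2.0])

# Same penalty on every coefficient, intercept included.
equal = lasso_objective(X, Y, beta, 1.5)
# A zero weight in position 0 leaves the intercept unpenalized.
unpenalized_intercept = lasso_objective(X, Y, beta, np.array([0.0, 1.5, 1.5, 1.5]))
```

The two values differ by exactly 1.5 * |0.5|, the penalty that the intercept no longer pays.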
-
classmethod
logistic
(X, successes, feature_weights, trials=None, covariance_estimator=None, quadratic=None)[source]¶ Logistic LASSO with feature weights. Objective function is
\[\beta \mapsto \ell(X\beta) + \sum_{i=1}^p \lambda_i |\beta_i|\]
where \(\ell\) is the negative of the logistic log-likelihood (half the logistic deviance) and \(\lambda\) is feature_weights.
- Parameters
X : ndarray
Shape (n,p) – the design matrix.
successes : ndarray
Shape (n,) – response vector. An integer number of successes. For data that is proportions, multiply the proportions by the number of trials first.
feature_weights : [float, sequence]
Penalty weights. An intercept or other unpenalized features are handled by setting those entries of feature_weights to 0. If feature_weights is a float, then all parameters are penalized equally.
trials : ndarray (optional)
Number of trials per response. Defaults to an array of ones with the same shape as successes.
covariance_estimator : optional
If None, use the parametric covariance estimate of the selected model.
quadratic : regreg.identity_quadratic.identity_quadratic (optional)
An optional quadratic term to be added to the objective. Can also be a linear term by setting quadratic coefficient to 0.
- Returns
L : selection.algorithms.lasso.lasso
Notes
If not None, covariance_estimator should take arguments (beta, active, inactive) and return an estimate of the covariance of \((\bar{\beta}_E, \nabla \ell(\bar{\beta}_E)_{-E})\), the unpenalized estimator and the inactive coordinates of the gradient of the likelihood at the unpenalized estimator.
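The smooth part of the logistic objective can be sketched directly, which also shows why proportions must be converted to success counts before they are passed in. The function logistic_negloglike below is a hypothetical helper, not the regreg implementation, and it drops the binomial coefficient because that term does not depend on beta.

```python
import numpy as np

def logistic_negloglike(X, successes, beta, trials=None):
    """Negative of the binomial log-likelihood in beta, up to a constant."""
    successes = np.asarray(successes, dtype=float)
    if trials is None:
        # One trial per response, matching the documented default.
        trials = np.ones_like(successes)
    eta = X @ beta  # linear predictor (log-odds)
    # trials * log(1 + exp(eta)) - successes * eta, computed stably
    return np.sum(trials * np.logaddexp(0.0, eta) - successes * eta)

# Proportions are turned into counts by multiplying by the number of trials.
proportions = np.array([0.25, 0.5, 1.0])
trials = np.array([4.0, 2.0, 3.0])
successes = proportions * trials
X = np.array([[1.0, 0.2], [1.0, -1.0], [1.0, 0.7]])

value_at_zero = logistic_negloglike(X, successes, np.zeros(2), trials)
```

At beta = 0 every success probability is 1/2, so the value equals the total number of trials times log 2.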
-
classmethod
coxph
(X, times, status, feature_weights, covariance_estimator=None, quadratic=None)[source]¶ Cox proportional hazards LASSO with feature weights. Objective function is
\[\beta \mapsto \ell^{\text{Cox}}(\beta) + \sum_{i=1}^p \lambda_i |\beta_i|\]
where \(\ell^{\text{Cox}}\) is the negative of the log of the Cox partial likelihood and \(\lambda\) is feature_weights. Uses Efron’s tie breaking method.
- Parameters
X : ndarray
Shape (n,p) – the design matrix.
times : ndarray
Shape (n,) – the survival times.
status : ndarray
Shape (n,) – the censoring status.
feature_weights : [float, sequence]
Penalty weights. An intercept or other unpenalized features are handled by setting those entries of feature_weights to 0. If feature_weights is a float, then all parameters are penalized equally.
covariance_estimator : optional
If None, use the parametric covariance estimate of the selected model.
quadratic : regreg.identity_quadratic.identity_quadratic (optional)
An optional quadratic term to be added to the objective. Can also be a linear term by setting quadratic coefficient to 0.
- Returns
L : selection.algorithms.lasso.lasso
Notes
If not None, covariance_estimator should take arguments (beta, active, inactive) and return an estimate of the covariance of \((\bar{\beta}_E, \nabla \ell(\bar{\beta}_E)_{-E})\), the unpenalized estimator and the inactive coordinates of the gradient of the likelihood at the unpenalized estimator.
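For readers unfamiliar with the Cox partial likelihood, the following sketch evaluates its negative logarithm in the simplest setting. It assumes there are no tied event times, in which case Efron's correction plays no role, and it assumes that status is coded 1 for an observed event and 0 for a censored observation. The helper cox_negloglike is hypothetical and is not the routine used by selectinf.

```python
import numpy as np

def cox_negloglike(X, times, status, beta):
    """Negative log partial likelihood, assuming no tied event times."""
    eta = X @ beta
    total = 0.0
    for i in np.flatnonzero(status == 1):   # loop over observed events only
        at_risk = times >= times[i]         # subjects still under observation
        total -= eta[i] - np.log(np.sum(np.exp(eta[at_risk])))
    return total

X = np.array([[0.5], [-1.0], [0.3], [1.2]])
times = np.array([1.0, 2.0, 3.0, 4.0])
status = np.array([1, 0, 1, 1])             # assumed coding: 1 = event, 0 = censored

value_at_zero = cox_negloglike(X, times, status, np.zeros(1))
```

At beta = 0 each event contributes the log of the size of its risk set, here log 4 + log 2 + log 1.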
-
classmethod
poisson
(X, counts, feature_weights, covariance_estimator=None, quadratic=None)[source]¶ Poisson log-linear LASSO with feature weights. Objective function is
\[\beta \mapsto \ell^{\text{Poisson}}(\beta) + \sum_{i=1}^p \lambda_i |\beta_i|\]
where \(\ell^{\text{Poisson}}\) is the negative of the log of the Poisson likelihood (half the deviance) and \(\lambda\) is feature_weights.
- Parameters
X : ndarray
Shape (n,p) – the design matrix.
counts : ndarray
Shape (n,) – the response.
feature_weights : [float, sequence]
Penalty weights. An intercept or other unpenalized features are handled by setting those entries of feature_weights to 0. If feature_weights is a float, then all parameters are penalized equally.
covariance_estimator : optional
If None, use the parametric covariance estimate of the selected model.
quadratic : regreg.identity_quadratic.identity_quadratic (optional)
An optional quadratic term to be added to the objective. Can also be a linear term by setting quadratic coefficient to 0.
- Returns
L : selection.algorithms.lasso.lasso
Notes
If not None, covariance_estimator should take arguments (beta, active, inactive) and return an estimate of the covariance of \((\bar{\beta}_E, \nabla \ell(\bar{\beta}_E)_{-E})\), the unpenalized estimator and the inactive coordinates of the gradient of the likelihood at the unpenalized estimator.
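The Notes refer to the gradient of the likelihood, so it may help to see the Poisson case written out. The sketch below uses a log link and drops the log(counts!) term, which does not depend on beta. Both function names are hypothetical and the code is an illustration rather than the regreg implementation.

```python
import numpy as np

def poisson_negloglike(X, counts, beta):
    """Negative Poisson log-likelihood in beta, without the log(counts!) term."""
    eta = X @ beta                      # log of the mean
    return np.sum(np.exp(eta) - counts * eta)

def poisson_gradient(X, counts, beta):
    """Gradient of the function above: X^T (exp(X beta) - counts)."""
    return X.T @ (np.exp(X @ beta) - counts)

rng = np.random.default_rng(1)
X = rng.standard_normal((30, 2))
counts = rng.poisson(2.0, size=30)
beta = np.array([0.1, -0.3])

# Central finite difference in the first coordinate, to check the gradient.
step = 1e-6
e0 = np.array([step, 0.0])
numeric = (poisson_negloglike(X, counts, beta + e0)
           - poisson_negloglike(X, counts, beta - e0)) / (2 * step)
analytic = poisson_gradient(X, counts, beta)[0]
```

Restricting such a gradient to the inactive columns of X gives the kind of quantity that appears as \(\nabla \ell(\bar{\beta}_E)_{-E}\) above.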
-
classmethod
sqrt_lasso
(X, Y, feature_weights, quadratic=None, covariance='parametric', sigma_estimate='truncated', solve_args={'min_its': 200})[source]¶ Use sqrt-LASSO to choose variables. Objective function is
\[\beta \mapsto \|Y-X\beta\|_2 + \sum_{i=1}^p \lambda_i |\beta_i|\]
where \(\lambda\) is feature_weights. After solving the problem, treat it as if Gaussian with implied variance and choice of multiplier. See arxiv.org/abs/1504.08031 for details.
- Parameters
X : ndarray
Shape (n,p) – the design matrix.
Y : ndarray
Shape (n,) – the response.
feature_weights : [float, sequence]
Penalty weights. An intercept or other unpenalized features are handled by setting those entries of feature_weights to 0. If feature_weights is a float, then all parameters are penalized equally.
quadratic : regreg.identity_quadratic.identity_quadratic (optional)
An optional quadratic term to be added to the objective. Can also be a linear term by setting quadratic coefficient to 0.
covariance : str
One of ‘parametric’ or ‘sandwich’. Method used to estimate covariance for inference in second stage.
sigma_estimate : str
One of ‘truncated’ or ‘OLS’. Method used to estimate \(\sigma\) when using parametric covariance.
solve_args : dict
Arguments passed to solver.
- Returns
L : selection.algorithms.lasso.lasso
Notes
Unlike other variants of LASSO, this solves the problem on construction as the active set is needed to find equivalent gaussian LASSO. Assumes parametric model is correct for inference, i.e. does not accept a covariance estimator.
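A small numerical check, offered as an illustration rather than a description of the library's internals: because the residual norm in the sqrt-LASSO objective is not squared, rescaling Y and beta by the same positive constant rescales the whole objective by that constant. This is one way to see why feature_weights can be chosen without knowing the noise level. The helper sqrt_lasso_objective is hypothetical.

```python
import numpy as np

def sqrt_lasso_objective(X, Y, beta, feature_weights):
    """Evaluate ||Y - X beta||_2 + sum_i lambda_i * |beta_i|."""
    weights = np.ones(X.shape[1]) * feature_weights
    return np.linalg.norm(Y - X @ beta) + np.sum(weights * np.abs(beta))

rng = np.random.default_rng(2)
X = rng.standard_normal((25, 4))
Y = rng.standard_normal(25)
beta = np.array([1.0, 0.0, -0.5, 2.0])

base = sqrt_lasso_objective(X, Y, beta, 0.8)
# Multiply the response and the coefficients by 10: the objective scales by 10.
scaled = sqrt_lasso_objective(X, 10.0 * Y, 10.0 * beta, 0.8)
```

The same experiment with the squared-error objective would not scale this way, since the loss would grow by a factor of 100 and the penalty by a factor of 10.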
-
Functions¶
-
selectinf.algorithms.lasso.
additive_noise
(X, y, sigma, lam_frac=1.0, perturb_frac=0.2, y_star=None, coverage=0.95, ndraw=8000, compute_intervals=True, burnin=2000)[source]¶ Additive noise LASSO.
- Parameters
y : np.float
Response vector
X : np.float
Design matrix
sigma : np.float
Noise variance
lam_frac : float (optional)
Multiplier for choice of \(\lambda\). Defaults to 1.0.
perturb_frac : float (optional)
How much noise to add. The added noise has variance proportional to the existing variance.
coverage : float
Coverage for selective intervals. Defaults to 0.95.
ndraw : int (optional)
How many draws to keep from Gibbs hit-and-run sampler. Defaults to 8000.
burnin : int (optional)
Defaults to 2000.
compute_intervals : bool (optional)
Compute selective intervals?
- Returns
results : [(variable, pvalue, interval)]
Indices of active variables, selected (twosided) pvalue and selective interval. If splitting, then each entry also includes a (split_pvalue, split_interval) using stage_two for inference.
randomized_lasso : lasso
Results of fitting LASSO to randomized data.
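The randomization step can be sketched in a few lines. The code below assumes that sigma acts as a standard deviation and that the proportionality constant for the added variance is perturb_frac itself. Both are assumptions made for illustration and are not confirmed by the text above.

```python
import numpy as np

rng = np.random.default_rng(0)
n, sigma, perturb_frac = 10000, 2.0, 0.2

# Stand-in response with noise level sigma.
y = sigma * rng.standard_normal(n)

# Added noise has variance perturb_frac * sigma^2 (assumed scaling).
gamma = np.sqrt(perturb_frac) * sigma
y_star = y + gamma * rng.standard_normal(n)

# Noise level of the randomized response that selection would be run on.
sigma_star = np.sqrt(sigma**2 + gamma**2)
```

Under these assumptions the LASSO is fit to y_star, while inference still concerns the original y.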
-
selectinf.algorithms.lasso.
glm_parametric_estimator
(loglike, dispersion=None)[source]¶ Parametric estimator of covariance of
\[(\bar{\beta}_E, X_{-E}^T(y-\nabla \ell(X_E\bar{\beta}_E)))\]
the OLS estimator of population regression coefficients and inactive correlation with the OLS residuals. If dispersion is None, it computes the usual unbiased estimate of variance in the Gaussian model and plugs it in, assuming the parametric form is correct.
- Returns
estimator : callable
Takes arguments (beta, active, inactive)
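As an illustration of the callable interface, here is a minimal sketch for the Gaussian case. It returns only the block of the covariance belonging to the refitted coefficients \(\bar{\beta}_E\), namely the unbiased variance estimate times the inverse Gram matrix of the active columns, and it omits the inactive gradient coordinates. The factory name gaussian_parametric_estimator is hypothetical, and treating active as a boolean mask is an assumption.

```python
import numpy as np

def gaussian_parametric_estimator(X, Y):
    """Hypothetical factory returning a callable (beta, active, inactive)."""
    def estimator(beta, active, inactive):
        X_E = X[:, active]
        n, k = X_E.shape
        # The penalized solution `beta` is not needed: refit OLS on active columns.
        beta_bar = np.linalg.lstsq(X_E, Y, rcond=None)[0]
        resid = Y - X_E @ beta_bar
        sigma_sq = resid @ resid / (n - k)   # usual unbiased variance estimate
        return sigma_sq * np.linalg.inv(X_E.T @ X_E)
    return estimator

rng = np.random.default_rng(3)
X = rng.standard_normal((50, 5))
Y = X[:, 0] - X[:, 2] + rng.standard_normal(50)
active = np.array([True, False, True, False, False])

estimator = gaussian_parametric_estimator(X, Y)
cov = estimator(None, active, ~active)       # 2 x 2 covariance estimate
```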
-
selectinf.algorithms.lasso.
glm_sandwich_estimator
(loss, B=1000)[source]¶ Bootstrap estimator of covariance of
\[(\bar{\beta}_E, X_{-E}^T(y-X_E\bar{\beta}_E))\]
the OLS estimator of population regression coefficients and inactive correlation with the OLS residuals.
- Returns
estimator : callable
Takes arguments (beta, active, inactive)
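A bootstrap covariance estimate of this kind can be sketched with a pairs bootstrap: resample rows with replacement, refit OLS on the active columns, and take the empirical covariance of the refitted coefficients. The sketch covers only the \(\bar{\beta}_E\) block and uses a hypothetical helper name. The resampling scheme is an assumption for illustration and may differ from what glm_sandwich_estimator does.

```python
import numpy as np

def pairs_bootstrap_cov(X_E, Y, B=1000, seed=0):
    """Covariance of OLS coefficients estimated from B bootstrap resamples."""
    rng = np.random.default_rng(seed)
    n, k = X_E.shape
    draws = np.empty((B, k))
    for b in range(B):
        idx = rng.integers(0, n, size=n)     # resample rows with replacement
        draws[b] = np.linalg.lstsq(X_E[idx], Y[idx], rcond=None)[0]
    # Rows are bootstrap replicates, columns are coefficients.
    return np.cov(draws, rowvar=False)

rng = np.random.default_rng(4)
X_E = rng.standard_normal((60, 2))
Y = X_E @ np.array([1.0, -1.0]) + rng.standard_normal(60)

cov = pairs_bootstrap_cov(X_E, Y, B=200)
```

Unlike the parametric estimate, this does not rely on the noise having constant variance, at the cost of B extra least-squares fits.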
-
selectinf.algorithms.lasso.
nominal_intervals
(lasso_obj, level=0.95)[source]¶ Intervals for OLS parameters of active variables that have not been adjusted for selection.
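For comparison with the selective intervals, unadjusted intervals are the textbook OLS intervals on the active columns. The sketch below assumes a known noise standard deviation sigma and uses a hypothetical helper name. It shows the standard formula and is not the body of nominal_intervals.

```python
import numpy as np
from statistics import NormalDist

def naive_intervals(X_E, Y, sigma=1.0, level=0.95):
    """OLS estimate +/- z * standard error, ignoring how X_E was chosen."""
    beta_bar = np.linalg.lstsq(X_E, Y, rcond=None)[0]
    se = sigma * np.sqrt(np.diag(np.linalg.inv(X_E.T @ X_E)))
    z = NormalDist().inv_cdf(0.5 + level / 2)   # about 1.96 for level=0.95
    return np.column_stack([beta_bar - z * se, beta_bar + z * se])

rng = np.random.default_rng(5)
X_E = rng.standard_normal((40, 2))
Y = X_E @ np.array([2.0, 0.0]) + rng.standard_normal(40)

intervals = naive_intervals(X_E, Y)             # one (lower, upper) row per variable
```

Because the active set was chosen using the same data, such intervals generally do not have their nominal coverage, which is the motivation for the adjusted summary.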
-
selectinf.algorithms.lasso.
split_model
(X, y, sigma=1, lam_frac=1.0, split_frac=0.9, stage_one=None)[source]¶ Fit a LASSO with a default choice of Lagrange parameter equal to lam_frac times \(\sigma \cdot E(|X^T\epsilon|)\) with \(\epsilon\) IID N(0,1) on a proportion (split_frac) of the data.
- Parameters
y : np.float
Response vector
X : np.float
Design matrix
sigma : np.float
Noise variance
lam_frac : float (optional)
Multiplier for choice of \(\lambda\). Defaults to 1.0.
split_frac : float (optional)
What proportion of the data to use in the first stage? Defaults to 0.9.
stage_one : [np.array(np.int), None] (optional)
Index of data points to be used in first stage. If None, a randomly chosen set of entries is used based on split_frac.
- Returns
first_stage : lasso
Lasso object from stage one.
stage_one : np.array(int)
Indices used for stage one.
stage_two : np.array(int)
Indices used for stage two.
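The split itself is simple to sketch: either the caller supplies the first-stage indices, or a random subset of the requested size is drawn, and the second stage is whatever remains. The helper split_indices is hypothetical, and rounding the first-stage size down with int() is an assumption.

```python
import numpy as np

def split_indices(n, split_frac=0.9, stage_one=None, seed=0):
    """Return (stage_one, stage_two) index arrays that partition range(n)."""
    if stage_one is None:
        rng = np.random.default_rng(seed)
        n_one = int(split_frac * n)          # size of the selection sample
        stage_one = np.sort(rng.choice(n, size=n_one, replace=False))
    stage_one = np.asarray(stage_one)
    # Everything not used for selection is kept for inference.
    stage_two = np.setdiff1d(np.arange(n), stage_one)
    return stage_one, stage_two

one, two = split_indices(100)                        # random 90/10 split
given_one, given_two = split_indices(6, stage_one=[0, 2, 4])
```

Selection would then use the rows X[one], y[one], and inference the rows X[two], y[two].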
-
selectinf.algorithms.lasso.
standard_lasso
(X, y, sigma=1, lam_frac=1.0, **solve_args)[source]¶ Fit a LASSO with a default choice of Lagrange parameter equal to lam_frac times \(\sigma \cdot E(|X^T\epsilon|)\) with \(\epsilon\) IID N(0,1).
- Parameters
y : np.float
Response vector
X : np.float
Design matrix
sigma : np.float
Noise variance
lam_frac : float
Multiplier for choice of \(\lambda\)
solve_args : keyword args
Passed to regreg.problems.simple_problem.solve.
- Returns
lasso_selection : lasso
Instance of lasso after fitting.
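The default Lagrange parameter described above can be approximated by simulation. The sketch below reads \(|X^T\epsilon|\) as the largest absolute entry of \(X^T\epsilon\), which is an assumption about the intended norm, and averages it over Gaussian draws. The helper default_lagrange and its ndraw argument are hypothetical.

```python
import numpy as np

def default_lagrange(X, sigma=1.0, lam_frac=1.0, ndraw=5000, seed=0):
    """Monte Carlo estimate of lam_frac * sigma * E(max_j |X_j^T eps|)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    eps = rng.standard_normal((n, ndraw))        # one noise vector per column
    sup_norms = np.abs(X.T @ eps).max(axis=0)    # largest |X_j^T eps| per draw
    return lam_frac * sigma * sup_norms.mean()

rng = np.random.default_rng(6)
X = rng.standard_normal((50, 10))

lam = default_lagrange(X, sigma=1.0, lam_frac=1.0)
# Sanity check on a 1 x 1 design, where the target is E|N(0,1)| = sqrt(2/pi).
lam_one = default_lagrange(np.ones((1, 1)))
```

The result scales linearly in both sigma and lam_frac, so doubling either doubles the penalty.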