constraints.quasi_affine

Module: constraints.quasi_affine

Inheritance diagram for selectinf.constraints.quasi_affine:

digraph inheritance984bc71617 { rankdir=LR; size="8.0, 12.0"; "constraints.quasi_affine.constraints" [URL="#selectinf.constraints.quasi_affine.constraints",fontname="Vera Sans, DejaVu Sans, Liberation Sans, Arial, Helvetica, sans",fontsize=10,height=0.25,shape=box,style="setlinewidth(0.5)",target="_top",tooltip="This class is the core object for quasiaffine selection procedures."]; "constraints.quasi_affine.orthogonal" [URL="#selectinf.constraints.quasi_affine.orthogonal",fontname="Vera Sans, DejaVu Sans, Liberation Sans, Arial, Helvetica, sans",fontsize=10,height=0.25,shape=box,style="setlinewidth(0.5)",target="_top",tooltip="This class is the core object for quasiaffine selection procedures."]; "constraints.quasi_affine.constraints" -> "constraints.quasi_affine.orthogonal" [arrowsize=0.5,style="setlinewidth(0.5)"]; }

This module contains the core code needed for post selection inference based on affine selection procedures as described in the papers Kac Rice, Spacings, covTest and post selection LASSO.

Classes

constraints

class selectinf.constraints.quasi_affine.constraints(linear_part, LHS_offset, RHS_offset, residual_projector, covariance=None, mean=None)[source]

Bases: object

This class is the core object for quasiaffine selection procedures. It is meant to describe sets of the form \(C\) where

\[C = \left\{z: Az + u \leq \|Pz\|_2b \right \}\]

where \(u\) is LHS_offset, \(b\) is RHS_offset, \(P\) is a projection assumed to satisfy \(AP=0\), and \(A\) is linear_part, some fixed matrix.

Notes

In this parameterization, the parameter self.mean corresponds to the reference measure that is being truncated. It is not the mean of the truncated Gaussian.

__init__(linear_part, LHS_offset, RHS_offset, residual_projector, covariance=None, mean=None)[source]

Create a new inequality.

Parameters

linear_part : np.float((q,p))

The linear part, \(A\) of the quasi-affine constraint \(C\).

LHS_offset: np.float(q)

The value of \(u\) in the quasi-affine constraint C.

RHS_offset: np.float(q)

The value of \(b\) in the quasi-affine constraint C.

residual_projector: np.float((p,p))

The matrix \(P\) above. C. If covariance is not identity, then \(\|Pz\|_2\) should be interpreted as a Mahalanobis distance.

covariance : np.float((p,p))

Covariance matrix of Gaussian distribution to be truncated. Defaults to np.identity(self.dim).

mean : np.float(p)

Mean vector of Gaussian distribution to be truncated. Defaults to np.zeros(self.dim).

conditional(linear_part, value)[source]

Return an equivalent constraint after having conditioned on a linear equality.

Let the inequality constraints be specified by (A,b) and the equality constraints be specified by (C,d). We form equivalent inequality constraints by considering the residual

Parameters

linear_part : np.float((k,q))

Linear part of equality constraint, C above.

value : np.float(k)

Value of equality constraint, b above.

.. math::

AZ - E(AZ|CZ=d)

Returns

conditional_con : constraints

Quasi-affine constraints having applied equality constraint.

bounds(direction_of_interest, Y)[source]

For a realization \(Y\) of the random variable \(N(\mu,\Sigma)\) truncated to \(C\) specified by self.constraints compute the slice of the inequality constraints in a given direction \(\eta\).

Parameters

direction_of_interest: np.float

A direction \(\eta\) for which we may want to form selection intervals or a test.

Y : np.float

A realization of \(N(\mu,\Sigma)\) where \(\Sigma\) is self.covariance.

Returns

intervals : []

Set of truncation intervals for the \(T\) statistic.

Tobs : np.float

The observed \(T\) statistic.

pivot(direction_of_interest, Y, alternative='greater')[source]

For a realization \(Y\) of the random variable \(N(\mu,\Sigma)\) truncated to \(C\) specified by self.constraints compute the slice of the inequality constraints in a given direction \(\eta\) and test whether \(\eta^T\mu\) is greater then 0, less than 0 or equal to 0.

Parameters

direction_of_interest: np.float

A direction \(\eta\) for which we may want to form selection intervals or a test.

Y : np.float

A realization of \(N(0,\Sigma)\) where \(\Sigma\) is self.covariance.

alternative : [‘greater’, ‘less’, ‘twosided’]

What alternative to use.

Returns

P : np.float

\(p\)-value of corresponding test.

Notes

All of the tests are based on the exact pivot \(F\) given by the truncated T distribution for the given direction \(\eta\). If the alternative is ‘greater’ then we return \(1-F\); if it is ‘less’ we return \(F\) and if it is ‘twosided’ we return \(2 \min(F,1-F)\).

whiten()[source]
Parameters

Return a whitened version of constraints in a different

basis, and a change of basis matrix.

If `self.covariance` is rank deficient, the change-of

basis matrix will not be square.

orthogonal

class selectinf.constraints.quasi_affine.orthogonal(linear_part, LHS_offset, RHS_offset, RSS, RSS_df, covariance=None, mean=None)[source]

Bases: selectinf.constraints.quasi_affine.constraints

This class is the core object for quasiaffine selection procedures. It is meant to describe sets of the form \(C\) where

\[C = \left\{z: Az + u \leq \|Pz\|_2b \right \}\]

where \(u\) is LHS_offset, \(b\) is RHS_offset, \(P\) is a projection assumed to satisfy \(AP=0\), and \(A\) is linear_part, some fixed matrix.

The condition \(AP=0\) is why this class is called orthogonal.

Notes

In this parameterization, the parameter self.mean corresponds to the reference measure that is being truncated. It is not the mean of the truncated Gaussian.

__init__(linear_part, LHS_offset, RHS_offset, RSS, RSS_df, covariance=None, mean=None)[source]

Create a new inequality.

Parameters

linear_part : np.float((q,p))

The linear part, \(A\) of the quasi-affine constraint \(C\).

LHS_offset: np.float(q)

The value of \(u\) in the quasi-affine constraint C.

RHS_offset: np.float(q)

The value of \(b\) in the quasi-affine constraint C.

RSS : float

The value of \(\|Pz\|_2\) above. If covariance is not identity, then \(\|Pz\|_2\) should be interpreted as a Mahalanobis distance relative to self.covariance.

RSS_df : int

Degrees of freedom in \(\|Pz\|_2\), when covariance is a multiple of identity, then this should be trace(P).

covariance : np.float((p,p))

Covariance matrix of Gaussian distribution to be truncated. Defaults to np.identity(self.dim).

mean : np.float(p)

Mean vector of Gaussian distribution to be truncated. Defaults to np.zeros(self.dim).

conditional(linear_part, value)[source]

Return an equivalent constraint after having conditioned on a linear equality.

Let the inequality constraints be specified by (A,b) and the equality constraints be specified by (C,d). We form equivalent inequality constraints by considering the residual

Parameters

linear_part : np.float((k,q))

Linear part of equality constraint, C above.

value : np.float(k)

Value of equality constraint, b above.

.. math::

AZ - E(AZ|CZ=d)

Returns

conditional_con : orthogonal

Quasi-affine constraints having applied equality constraint.

Notes

The calculations here assume that \(CZ\) is independent of \(PZ\).

bounds(direction_of_interest, Y)[source]

For a realization \(Y\) of the random variable \(N(\mu,\Sigma)\) truncated to \(C\) specified by self.constraints compute the slice of the inequality constraints in a given direction \(\eta\).

Parameters

direction_of_interest: np.float

A direction \(\eta\) for which we may want to form selection intervals or a test.

Y : np.float

A realization of \(N(\mu,\Sigma)\) where \(\Sigma\) is self.covariance.

Returns

intervals : []

Set of truncation intervals for the \(T\) statistic.

Tobs : np.float

The observed \(T\) statistic.

pivot(direction_of_interest, Y, alternative='greater')[source]

For a realization \(Y\) of the random variable \(N(\mu,\Sigma)\) truncated to \(C\) specified by self.constraints compute the slice of the inequality constraints in a given direction \(\eta\) and test whether \(\eta^T\mu\) is greater then 0, less than 0 or equal to 0.

Parameters

direction_of_interest: np.float

A direction \(\eta\) for which we may want to form selection intervals or a test.

Y : np.float

A realization of \(N(0,\Sigma)\) where \(\Sigma\) is self.covariance.

alternative : [‘greater’, ‘less’, ‘twosided’]

What alternative to use.

Returns

P : np.float

\(p\)-value of corresponding test.

Notes

All of the tests are based on the exact pivot \(F\) given by the truncated T distribution for the given direction \(\eta\). If the alternative is ‘greater’ then we return \(1-F\); if it is ‘less’ we return \(F\) and if it is ‘twosided’ we return \(2 \min(F,1-F)\).

whiten()[source]
Parameters

Return a whitened version of constraints in a different

basis, and a change of basis matrix.

If `self.covariance` is rank deficient, the change-of

basis matrix will not be square.

Functions

selectinf.constraints.quasi_affine.constraints_unknown_sigma(support_directions, RHS_offsets, LHS_offsets, observed_data, direction_of_interest, RSS, RSS_df, value_under_null=0.0, tol=0.0001, DEBUG=False)[source]

Given a quasi-affine constraint \(\{z:Az+u \leq \hat{\sigma}b\}\) (elementwise) specified with \(A\) as support_directions and \(b\) as support_offset, a new direction of interest \(\eta\), and an observed_data is Gaussian vector \(Z \sim N(\mu,\sigma^2 I)\) with \(\sigma\) unknown, this function returns \(\eta^TZ\) as well as a set bounding this value. The value of \(\hat{\sigma}\) is taken to be sqrt(RSS/RSS_df)

The interval constructed is such that the endpoints are independent of \(\eta^TZ\), hence the selective \(T\) distribution of of sample carving can be used to form an exact pivot.

To construct the interval, we are in effect conditioning on all randomness perpendicular to the direction of interest, i.e. \(P_{\eta}^{\perp}X\) where \(X\) is the Gaussian data vector.

Parameters

support_directions : np.float

Matrix specifying constraint, \(A\).

RHS : np.float

Offset in constraint, \(b\).

LHS_offsets : np.float

Offset in LHS of constraint, \(u\).

observed_data : np.float

Observations.

direction_of_interest : np.float

Direction in which we’re interested for the contrast.

RSS : float

Residual sum of squares.

RSS_df : int

Degrees of freedom of RSS.

tol : float

Relative tolerance parameter for deciding sign of \(Az-b\).

Returns

lower_bound : float

observed : float

upper_bound : float

sigma : float

Notes

Covariance is assumed to be an unknown multiple of the identity.

selectinf.constraints.quasi_affine.gibbs_test(quasi_affine_con, Y, direction_of_interest, how_often=-1, ndraw=5000, burnin=2000, white=False, alternative='twosided', UMPU=True, sigma_known=False, alpha=0.05, use_constraint_directions=False)[source]

A Monte Carlo significance test for a given function of con.mean.

Parameters

quasi_affine_con : orthogonal

Y : np.float

Point satisfying the constraint.

direction_of_interest: np.float

Which linear function of con.mean is of interest? (a.k.a. \(\eta\) in many of related papers)

how_often : int (optional)

How often should the sampler make a move along direction_of_interest? If negative, defaults to ndraw+burnin (so it will never be used).

ndraw : int (optional)

Defaults to 1000.

burnin : int (optional)

Defaults to 1000.

white : bool (optional)

Is con.covariance equal to identity?

alternative : str

One of [‘greater’, ‘less’, ‘twosided’]

UMPU : bool

Perform the UMPU test?

sigma_known : bool

Is \(\sigma\) assumed known?

alpha :

Level for UMPU test.

use_constraint_directions : bool (optional)

Use the directions formed by the constraints as in the Gibbs scheme?

Returns

pvalue : float

P-value (using importance weights) for specified hypothesis test.

Z : np.float((ndraw, n))

Sample from the sphere intersect the constraints.

weights : np.float(ndraw)

Importance weights for the sample.

selectinf.constraints.quasi_affine.intersection(I1, I2)[source]
selectinf.constraints.quasi_affine.quadratic_inequality_solver(a, b, c, direction='less than')[source]

solves a * x**2 + b * x + c leq 0, if direction is “less than”, solves a * x**2 + b * x + c geq 0, if direction is “greater than”,

returns: the truancated interval, may include [-infty, + infty] the returned interval(s) is a list of disjoint intervals indicating the union. when the left endpoint of the interval is equal to the right, return empty list

selectinf.constraints.quasi_affine.sample_from_constraints(con, Y, direction_of_interest=None, how_often=-1, ndraw=1000, burnin=1000, white=False, use_constraint_directions=True)[source]

Use Gibbs sampler to simulate from con.

Parameters

con : `selection.affine.constraints`_

Y : np.float

Point satisfying the constraint.

direction_of_interest : np.float (optional)

Which projection is of most interest?

how_often : int (optional)

How often should the sampler make a move along direction_of_interest? If negative, defaults to ndraw+burnin (so it will never be used).

ndraw : int (optional)

Defaults to 1000.

burnin : int (optional)

Defaults to 1000.

white : bool (optional)

Is con.covariance equal to identity?

use_constraint_directions : bool (optional)

Use the directions formed by the constraints as in the Gibbs scheme?

Returns

Z : np.float((ndraw, n))

Sample from the sphere intersect the constraints.

selectinf.constraints.quasi_affine.sample_from_sphere(con, Y, direction_of_interest=None, how_often=-1, ndraw=1000, burnin=1000, white=False)[source]

Use Gibbs sampler to simulate from con intersected with (whitened) sphere of radius np.linalg.norm(Y). When con.covariance is not \(I\), it samples from the ellipse of constant Mahalanobis distance from con.mean.

Parameters

con : `selection.affine.constraints`_

Y : np.float

Point satisfying the constraint.

direction_of_interest : np.float (optional)

Which projection is of most interest?

how_often : int (optional)

How often should the sampler make a move along direction_of_interest? If negative, defaults to ndraw+burnin (so it will never be used).

ndraw : int (optional)

Defaults to 1000.

burnin : int (optional)

Defaults to 1000.

white : bool (optional)

Is con.covariance equal to identity?

Returns

Z : np.float((ndraw, n))

Sample from the sphere intersect the constraints.

weights : np.float(ndraw)

Importance weights for the sample.

selectinf.constraints.quasi_affine.sqrt_inequality_solver(a, b, c, n)[source]

find the intervals for t such that, a*t + b*sqrt(n + t**2) leq c

returns: should return a single interval

selectinf.constraints.quasi_affine.stack(*cons)[source]

Combine constraints into a large constaint by intersection.

Parameters

cons : [`selection.affine.constraints`_]

A sequence of constraints.

Returns

intersection : `selection.quasi_affine.constraints`_

Notes

Resulting constraint will have mean 0 and covariance \(I\). If each is of type constraints, then quietly assumes that all residual projectors are the same, so it uses the first residual projector in the stack. If they are of type orthogonal then quietly assumes that all RSS and RSS_df are the same.

If they are of mixed type, raises an exception.