distributions.pvalue

Module: distributions.pvalue

Inheritance diagram for selectinf.distributions.pvalue: a single class, SelectionInterval.

This module contains functions for evaluating post-selection p-values for non-polyhedral selection procedures by several methods.

These p-values appear for the group LASSO global null test as well as the nuclear norm p-value test.

They are described in the Kac-Rice paper.

Class

SelectionInterval

class selectinf.distributions.pvalue.SelectionInterval(lower_bound, observed, upper_bound, sigma)[source]

Bases: object

Compute a selection interval for a Gaussian truncated to an interval.

__init__(lower_bound, observed, upper_bound, sigma)[source]

Initialize self. See help(type(self)) for accurate signature.

pivot(exp)[source]

conf_int(lb, ub, alpha=0.05)[source]

Functions

selectinf.distributions.pvalue.chi_pvalue(observed, lower_bound, upper_bound, sd, df, method='MC', nsim=1000)[source]

Compute a truncated \(\chi\) p-value based on the conditional survival function.

Parameters

observed : float

lower_bound : float

upper_bound : float

sd : float

Standard deviation.

df : float

Degrees of freedom.

method : string

One of 'MC', 'cdf' or 'sf'.

Returns

pvalue : float

Notes

Let \(T\) be observed, \(L\) be lower_bound, \(U\) be upper_bound, \(\sigma\) be sd and \(k\) be df. The p-value, for \(L \leq T \leq U\), is

\[\frac{P(\sigma^2 \chi^2_k \geq T^2) - P(\sigma^2 \chi^2_k \geq U^2)} {P(\sigma^2 \chi^2_k \geq L^2) - P(\sigma^2 \chi^2_k \geq U^2)}\]

It can be computed with scipy.stats.chi, using either its cdf (distribution function) or its sf (survival function), or estimated by Monte Carlo when method is 'MC'.
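The cdf and sf routes described in the Notes can be sketched as follows. This is a minimal illustration, not the library's implementation: the function name chi_pvalue_sketch is hypothetical, and it assumes that sd acts as a scale on the statistic, so each bound is divided by sd before being passed to scipy.stats.chi. The Monte Carlo route is omitted.

```python
from scipy.stats import chi


def chi_pvalue_sketch(observed, lower_bound, upper_bound, sd, df, method="sf"):
    """Truncated chi p-value as a ratio of tail differences (illustrative only)."""
    # Assumption: sd rescales the statistic, so work with T/sd, L/sd, U/sd.
    T, L, U = observed / sd, lower_bound / sd, upper_bound / sd
    if method == "sf":
        # Differences of survival functions: P(chi >= T) - P(chi >= U), etc.
        num = chi.sf(T, df) - chi.sf(U, df)
        den = chi.sf(L, df) - chi.sf(U, df)
    elif method == "cdf":
        # The same quantities written with the distribution function.
        num = chi.cdf(U, df) - chi.cdf(T, df)
        den = chi.cdf(U, df) - chi.cdf(L, df)
    else:
        raise ValueError("this sketch supports only 'sf' and 'cdf'")
    return float(num / den)


if __name__ == "__main__":
    print(chi_pvalue_sketch(2.0, 1.0, 4.0, sd=1.5, df=3, method="sf"))
    print(chi_pvalue_sketch(2.0, 1.0, 4.0, sd=1.5, df=3, method="cdf"))
```

The two routes agree in exact arithmetic. When the interval lies far in the upper tail, the sf form is usually the safer choice, because differences of cdf values close to 1 lose precision.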

selectinf.distributions.pvalue.gauss_poly(lower_bound, upper_bound, curvature, nsim=100)[source]

Computes the integral of a polynomial times the standard Gaussian density over an interval.

Introduced in the Kac-Rice paper, display (33) of v2.

Parameters

lower_bound : float

upper_bound : float

curvature : np.array

A diagonal matrix related to curvature. It is assumed that curvature + lower_bound I is non-negative definite.

nsim : int

Number of draws from \(N(0,1)\) to use.

Returns

integral : float

Notes

The return value is a Monte Carlo estimate of

\[\int_{L}^{U} \det(\Lambda + z I) \frac{e^{-z^2/2}}{\sqrt{2\pi}} \, dz\]

where \(L\) is lower_bound, \(U\) is upper_bound and \(\Lambda\) is the diagonal matrix curvature.
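To make the integral concrete, the sketch below evaluates it in two ways: by deterministic quadrature and by a naive Monte Carlo average. It is an illustration under stated assumptions, not the library's estimator: the density is taken to be standard Gaussian, the diagonal of \(\Lambda\) is passed as a 1-D sequence (curvature_diag), and the names gauss_poly_quad and gauss_poly_mc are hypothetical.

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm


def det_poly(z, curvature_diag):
    """det(Lambda + z I) for diagonal Lambda is the product of (lambda_i + z)."""
    return np.prod(np.asarray(curvature_diag, dtype=float) + z)


def gauss_poly_quad(lower_bound, upper_bound, curvature_diag):
    """Quadrature value of the integral of det(Lambda + z I) * phi(z) over [L, U]."""
    integrand = lambda z: det_poly(z, curvature_diag) * norm.pdf(z)
    value, _ = quad(integrand, lower_bound, upper_bound)
    return value


def gauss_poly_mc(lower_bound, upper_bound, curvature_diag, nsim=100_000, seed=0):
    """Naive Monte Carlo: average det(Lambda + Z I) * 1{L <= Z <= U}, Z ~ N(0,1)."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(nsim)
    inside = (z >= lower_bound) & (z <= upper_bound)
    lam = np.asarray(curvature_diag, dtype=float)
    # Row i holds (lambda_1 + z_i, ..., lambda_d + z_i); the row product is the determinant.
    dets = np.prod(lam[None, :] + z[:, None], axis=1)
    return float(np.mean(dets * inside))


if __name__ == "__main__":
    print(gauss_poly_quad(0.0, 1.0, [1.0]))
    print(gauss_poly_mc(0.0, 1.0, [1.0]))
```

With an empty curvature the determinant is 1 and the integral reduces to \(\Phi(U) - \Phi(L)\), which gives a quick sanity check.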

selectinf.distributions.pvalue.general_pvalue(observed, lower_bound, upper_bound, curvature, nsim=100)[source]

Computes a p-value as the ratio of two integrals of a polynomial times the standard Gaussian density, one over \([T, U]\) and one over \([L, U]\).

Introduced in the Kac-Rice paper, display (35) of v2.

Parameters

observed : float

lower_bound : float

upper_bound : float

curvature : np.array

A diagonal matrix related to curvature. It is assumed that curvature + lower_bound I is non-negative definite.

nsim : int

Number of draws from \(N(0,1)\) to use.

Returns

integral : float

Notes

The return value is a Monte Carlo estimate of

\[\frac{\int_{T}^{U} \det(\Lambda + z I) \frac{e^{-z^2/2}}{\sqrt{2\pi}} \, dz} {\int_{L}^{U} \det(\Lambda + z I) \frac{e^{-z^2/2}}{\sqrt{2\pi}} \, dz}\]

where \(T\) is observed, \(L\) is lower_bound, \(U\) is upper_bound and \(\Lambda\) is the diagonal matrix curvature.
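The ratio can be checked numerically with deterministic quadrature in place of Monte Carlo. The sketch below is only an illustration: it assumes a standard Gaussian density, takes the diagonal of \(\Lambda\) as a 1-D sequence (curvature_diag), and uses the hypothetical names poly_gauss_integral and general_pvalue_quad.

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm


def poly_gauss_integral(a, b, curvature_diag):
    """Integral over [a, b] of det(Lambda + z I) * phi(z) for diagonal Lambda."""
    lam = np.asarray(curvature_diag, dtype=float)
    # For diagonal Lambda the determinant is the product of (lambda_i + z).
    integrand = lambda z: np.prod(lam + z) * norm.pdf(z)
    return quad(integrand, a, b)[0]


def general_pvalue_quad(observed, lower_bound, upper_bound, curvature_diag):
    """Ratio of the integral over [T, U] to the integral over [L, U]."""
    num = poly_gauss_integral(observed, upper_bound, curvature_diag)
    den = poly_gauss_integral(lower_bound, upper_bound, curvature_diag)
    return num / den


if __name__ == "__main__":
    print(general_pvalue_quad(1.0, 0.0, 3.0, [0.5, 2.0]))
```

With an empty curvature the determinant is 1, so the ratio collapses to \((\Phi(U) - \Phi(T)) / (\Phi(U) - \Phi(L))\), the truncated normal tail probability.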

selectinf.distributions.pvalue.norm_interval(lower, upper)[source]

A multi-precision evaluation of

\[\Phi(U) - \Phi(L)\]

Parameters

lower : float

The lower limit \(L\)

upper : float

The upper limit \(U\)
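The reason extra precision matters here is cancellation: far in the tail, \(\Phi(U)\) and \(\Phi(L)\) both round to 1 in double precision and their difference is lost. The sketch below illustrates the problem with the standard library only. It is not multi-precision and not the library's code; it uses math.erfc as a double-precision stand-in that takes the difference on the tail side where it is small, and the function names are hypothetical.

```python
import math
from statistics import NormalDist

SQRT2 = math.sqrt(2.0)


def norm_interval_naive(lower, upper):
    """Phi(U) - Phi(L) computed directly; loses everything far in the upper tail."""
    std = NormalDist()
    return std.cdf(upper) - std.cdf(lower)


def norm_interval_tail(lower, upper):
    """Phi(U) - Phi(L) via erfc, subtracting tail probabilities instead of cdfs."""
    if lower > 0:
        # Upper tail: Phi(U) - Phi(L) = Q(L) - Q(U), with Q(x) = erfc(x / sqrt(2)) / 2.
        return 0.5 * (math.erfc(lower / SQRT2) - math.erfc(upper / SQRT2))
    if upper < 0:
        # Lower tail: Phi(x) = erfc(-x / sqrt(2)) / 2.
        return 0.5 * (math.erfc(-upper / SQRT2) - math.erfc(-lower / SQRT2))
    # Interval straddles 0: remove both tails from 1.
    return 1.0 - 0.5 * math.erfc(upper / SQRT2) - 0.5 * math.erfc(-lower / SQRT2)


if __name__ == "__main__":
    print(norm_interval_naive(10.0, 11.0))  # cancellation wipes out the result
    print(norm_interval_tail(10.0, 11.0))   # small but nonzero
```

A multi-precision library extends the same idea to intervals whose mass underflows even in this form.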

selectinf.distributions.pvalue.norm_pdf(observed)[source]

A multi-precision calculation of the standard normal density function:

\[\frac{e^{-T^2/2}}{\sqrt{2\pi}}\]

where \(T\) is observed.

Parameters

observed : float

Returns

density : float

selectinf.distributions.pvalue.norm_q(prob)[source]

A multi-precision calculation of the standard normal quantile function:

\[\int_{-\infty}^{q(p)} \frac{e^{-z^2/2}}{\sqrt{2\pi}} \; dz = p\]

where \(p\) is prob.

Parameters

prob : float

Returns

quantile : float
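The defining equation says that the quantile function inverts \(\Phi\). A double-precision sketch using only the standard library shows the round trip; it is not multi-precision, and norm_q_sketch is a hypothetical name rather than the library's function.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def norm_q_sketch(prob):
    """Standard normal quantile q(p), i.e. the value with Phi(q(p)) = p."""
    return _STD_NORMAL.inv_cdf(prob)


if __name__ == "__main__":
    for p in (0.025, 0.5, 0.975):
        q = norm_q_sketch(p)
        # Applying the cdf to the quantile should recover p.
        print(p, q, _STD_NORMAL.cdf(q))
```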

selectinf.distributions.pvalue.truncnorm_cdf(observed, lower, upper)[source]

Compute the truncated normal distribution function.

\[\frac{\Phi(U) - \Phi(T)}{\Phi(U) - \Phi(L)}\]

where \(T\) is observed, \(L\) is lower and \(U\) is upper.

Parameters

observed : float

lower : float

upper : float

Returns

P : float
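As a worked check of the displayed ratio: for a standard normal \(Z\) truncated to \([L, U]\), the ratio equals \(P(Z \geq T \mid L \leq Z \leq U)\). The sketch below evaluates it in double precision with math.erfc, which stays accurate when the interval sits in the upper tail. It implements the formula as displayed, not necessarily what the library function returns, and truncnorm_ratio is a hypothetical name.

```python
import math

SQRT2 = math.sqrt(2.0)


def upper_tail(x):
    """Q(x) = 1 - Phi(x), computed through erfc to keep small tails accurate."""
    return 0.5 * math.erfc(x / SQRT2)


def truncnorm_ratio(observed, lower, upper):
    """(Phi(U) - Phi(T)) / (Phi(U) - Phi(L)), rewritten with upper tails."""
    # Phi(U) - Phi(T) = Q(T) - Q(U), and likewise for the denominator.
    num = upper_tail(observed) - upper_tail(upper)
    den = upper_tail(lower) - upper_tail(upper)
    return num / den


if __name__ == "__main__":
    print(truncnorm_ratio(0.0, -1.0, 1.0))    # symmetric interval, midpoint
    print(truncnorm_ratio(10.5, 10.0, 11.0))  # far upper tail, still finite
```

At \(T = L\) the ratio is 1 and at \(T = U\) it is 0, so it decreases in \(T\) across the interval.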