algorithms.softmax

Module: algorithms.softmax

Inheritance diagram for selectinf.algorithms.softmax (rendered here as text):

problems.composite.composite → problems.composite.smooth → regreg.smooth.smooth_atom → algorithms.softmax.softmax_objective

This module implements the softmax approximation for a multivariate Gaussian truncated by affine constraints. It approximates the normalizing constant of the truncated likelihood.

softmax_objective

class selectinf.algorithms.softmax.softmax_objective(shape, precision, constraints, feasible_point, coef=1.0, offset=None, quadratic=None, initial=None)[source]

Bases: regreg.smooth.smooth_atom

The softmax objective

\[z \mapsto \frac{1}{2} z^TQz + \sum_{i=1}^{m} \log \left(1 + \frac{1}{(b_i-A_i^T z) / s_i} \right)\]

Notes

Recall Chernoff’s approximation for \(Z \sim N(\mu, I_{n \times n})\):

\[-\log P_{\mu}(AZ \leq b) \approx \inf_{z:Az \leq b} \frac{1}{2}\|z-\mu\|^2_2 = \inf_{z} I_K(z) + \frac{1}{2}\|z-\mu\|^2_2\]

where \(I_K\) is the convex indicator function of the set \(K=\left\{z:Az \leq b \right\}\), equal to zero on \(K\) and \(+\infty\) off it.
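As a one-dimensional sanity check of Chernoff’s approximation (a sketch using only the standard library; the half-space \(K=\{z: z \leq b\}\), the function names, and the values of \(\mu\) are illustrative choices, not from the source): for this \(K\), the infimum has the closed form \(\frac{1}{2}\max(\mu-b,0)^2\), and the ratio of the exact \(-\log P_\mu\) to this value shrinks toward 1 as \(\mu\) moves away from \(b\).

```python
import math

def neg_log_gauss_tail(b, mu):
    """-log P(Z <= b) for Z ~ N(mu, 1), computed from the normal CDF."""
    phi = 0.5 * (1.0 + math.erf((b - mu) / math.sqrt(2.0)))
    return -math.log(phi)

def chernoff_value(b, mu):
    """inf_{z <= b} 0.5*(z - mu)^2 = 0.5*max(mu - b, 0)^2 for K = {z <= b}."""
    return 0.5 * max(mu - b, 0.0) ** 2

# For mu increasingly far above b = 0, the exact quantity and the
# Chernoff value agree better and better (their ratio decreases to 1):
ratios = [neg_log_gauss_tail(0.0, mu) / chernoff_value(0.0, mu)
          for mu in (2.0, 4.0, 6.0)]
```

The leading \(\frac{1}{2}\mu^2\) term dominates the polynomial corrections for large \(\mu\), which is why the ratio approaches 1.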

The softmax approximation is similar to Chernoff’s approximation, but replaces the indicator \(I_K\) with the smooth barrier function

\[\sum_{i=1}^{m}\log\left(1+\frac{1}{b_i-A_i^T z}\right).\]

The softmax objective is

\[z \mapsto \frac{1}{2} z^TQz + \sum_{i=1}^{m}\log\left(1+\frac{1}{(b_i-A_i^T z) / s_i}\right)\]

where \(s_i\) are scalings and \(Q\) is a precision matrix (i.e. inverse covariance).
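The objective and its gradient can be written out directly from the display above. The following is a minimal NumPy sketch, not the library's implementation: the function name and argument layout are hypothetical, and the actual class also handles offsets, coefficients, and quadratic terms inherited from regreg.smooth.smooth_atom.

```python
import numpy as np

def softmax_objective_value(z, Q, A, b, scaling=None, mode='both'):
    """Sketch of f(z) = 0.5 z'Qz + sum_i log(1 + 1/((b_i - A_i'z)/s_i)).

    Returns the value, the gradient, or both, mirroring the
    'func'/'grad'/'both' convention of smooth_objective.
    """
    z = np.asarray(z, dtype=float)
    s = np.ones(len(b)) if scaling is None else np.asarray(scaling, dtype=float)
    slack = b - A.dot(z)                 # b_i - A_i^T z; must stay positive
    if np.any(slack <= 0):
        raise ValueError("infeasible point: some slack b_i - A_i^T z <= 0")
    value = 0.5 * z.dot(Q.dot(z)) + np.sum(np.log1p(s / slack))
    if mode == 'func':
        return value
    # d/dz log(1 + s_i/slack_i) = A_i * (1/slack_i - 1/(slack_i + s_i))
    grad = Q.dot(z) + A.T.dot(1.0 / slack - 1.0 / (slack + s))
    if mode == 'grad':
        return grad
    return value, grad
```

A finite-difference comparison of `grad` against `value` is a quick way to confirm the sign conventions used here.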

__init__(shape, precision, constraints, feasible_point, coef=1.0, offset=None, quadratic=None, initial=None)[source]

Initialize self. See help(type(self)) for accurate signature.

objective_template = '\\text{softmax}_K\\left(%(var)s\\right)'
smooth_objective(param, mode='both', check_feasibility=False)[source]

Evaluate the smooth objective, computing its value, gradient or both.

Parameters

param : ndarray

The current parameter values.

mode : str

One of [‘func’, ‘grad’, ‘both’].

check_feasibility : bool

If True, return np.inf when the point is not feasible, i.e. when param is not in the domain.

Returns

If mode is ‘func’, returns just the objective value at param; if mode is ‘grad’, returns the gradient; otherwise returns both.
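The mode contract above can be illustrated in one dimension, with a single constraint \(z \leq B\), unit scaling, and the derivative written out by hand (a self-contained sketch; the function name and the value of B are illustrative, not the library's API):

```python
import math

B = 2.0  # single affine constraint z <= B (illustrative)

def smooth_objective_1d(z, mode='both'):
    """f(z) = 0.5 z^2 + log(1 + 1/(B - z)), with Q = 1 and s = 1,
    following the 'func'/'grad'/'both' convention documented above."""
    slack = B - z
    if slack <= 0:
        raise ValueError("infeasible: z is outside K = {z : z <= B}")
    value = 0.5 * z * z + math.log1p(1.0 / slack)
    if mode == 'func':
        return value
    # f'(z) = z + (1/slack - 1/(slack + 1))
    grad = z + (1.0 / slack - 1.0 / (slack + 1.0))
    if mode == 'grad':
        return grad
    return value, grad
```

At z = 1 the slack is 1, so the value is 0.5 + log 2 and the derivative is 1 + (1 - 1/2) = 1.5, which a central difference of the ‘func’ output reproduces.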

classmethod affine(linear_operator, offset, coef=1, diag=False, quadratic=None, **kws)

Keyword arguments in kws are passed to the cls constructor along with the other arguments.

apply_offset(x)

If self.offset is not None, return x-self.offset, else return x.

property conjugate
get_conjugate()
get_lipschitz()
get_offset()
get_quadratic()

Get the quadratic part of the composite.

latexify(var=None, idx='')
classmethod linear(linear_operator, coef=1, diag=False, offset=None, quadratic=None, **kws)

Keyword arguments in kws are passed to the cls constructor along with the other arguments.

property lipschitz
nonsmooth_objective(x, check_feasibility=False)
objective(x, check_feasibility=False)
objective_vars = {'coef': 'C', 'offset': '\\alpha+', 'shape': 'p', 'var': '\\beta'}
property offset
proximal(quadratic)
proximal_optimum(quadratic)
proximal_step(quadratic, prox_control=None)

Compute the proximal optimization

Parameters

prox_control: [None, dict]

If not None, a dictionary of parameters for the prox procedure.

property quadratic

Quadratic part of the object, instance of regreg.identity_quadratic.identity_quadratic.

scale(obj, copy=False)
set_lipschitz(value)
set_offset(value)
set_quadratic(quadratic)

Set the quadratic part of the composite.

classmethod shift(offset, coef=1, quadratic=None, **kws)

Keyword arguments in kws are passed to the cls constructor along with the other arguments.

smoothed(smoothing_quadratic)

Add quadratic smoothing term

solve(quadratic=None, return_optimum=False, **fit_args)