MinVol#

class mcrnmf.models.MinVol(rank, constraint_kind=0, unimodal=None, iter_max=500, tol=0.0001, inner_iter_max=20, inner_iter_tol=0.1, delta=0.1, lambdaa=0.001)[source]#

Minimum-Volume Nonnegative Matrix Factorization (NMF) implementation using Fast Projected Gradient Method.

This class implements the FPGM to perform Minimum-Volume NMF, minimizing the following objective function subject to non-negativity and other optional constraints.

\[f_{\textrm{obj}} = ||X - WH||_F^2 + \lambda \times \log(\det(W^TW + \delta \times I))\]

where:

  • \(||\cdot||_{F}\) is the Frobenius norm.

  • \(X\) the nonnegative matrix to be factorized

  • \(W\) and \(H\) are the nonnegative factors

  • \(\lambda\) and \(\delta\) are model parameters. See definition in the Parameters section

  • \(I\) is an identity matrix

Parameters:
rankint

The number of components for the factorization.

constraint_kindinteger-like {0, 1, 2, 3, 4}, default=0

The following constraints are applied based on the integer value specified:

  • If 0: Only \(W \geq 0\), \(H \geq 0\).

  • If 1: Closure constraint \(H^T e ≤ e\).

  • If 2: Closure constraint \(H e = e\).

  • If 3: Constraint \(W^T e = e\).

  • If 4: Closure constraint \(H^T e = e\).

Note, for 1, 2, 3, and 4 values of constraint_kind nonnegativity constraints are also applied along with the additional constraint specified above.

unimodaldict or None, default=None

Specifies unimodality constraints for \(W\) and \(H\) matrices. If None, no unimodality constraints are applied.

If dict, Format: {‘W’: bool | list of bool, ‘H’: bool | list of bool}:

  • Must contain at least one key (‘W’ or ‘H’)

  • No other keys besides ‘W’ and ‘H’ are allowed

  • For ‘W’: Controls unimodality of columns in W

  • For ‘H’: Controls unimodality of rows in H

Each value can be:

  • A boolean: applies to all components

  • A list of booleans: selectively applies to specific components

Examples:

  • {'H': True}: All \(H\) components are unimodal

  • {'W': True, 'H': True}: All \(W\) and \(H\) components are unimodal

  • {'H': [True, False, True]}: Only components 0 and 2 of \(H\) have unimodal behavior

  • {'W': [False, True, False]}: Only component 1 of \(W\) has unimodal behavior

iter_maxint, default=500

Maximum number of iterations. It must be greater \(\geq 10\).

tolfloat, default=1e-4

Tolerance for convergence. Must be in the interval \((0, 1)\).

Convergence is reached when:

\[{|e[i] - e[i-10]| \over e[i]} \leq \textrm{tol}\]

where:

  • iteration \(i \geq 10\)

  • \(e[i]\) is the squared relative loss after iteration \(i\), which is defined as

    \[e[i] = {||X - W^{i}H^{i}||_{F}^2 \over ||X||_{F}^2}\]

    where \(W^{i}\) and \(H^{i}\) is the value of \(W\) and \(H\), respectively, after iteration \(i\).

inner_iter_maxint, default=20

Maximum number of inner iterations performed during each single update of either \(W\) or \(H\) while the other is held fixed.

inner_iter_tolfloat, default=0.1

Tolerance for the convergence of the inner loop while updating \(H\).

The inner loop convergence is reached when:

\[{||H^{k} - H^{k-1}||_{F} \over ||H^{1} - H^{0}||_{F}} \leq \textrm{inner_iter_tol}\]

where:

  • \(H^{k}\) is value of \(H\) after inner-iteration \(k\).

  • \(H^{0}\) is the value of \(H\) before the first inner-iteration.

deltafloat, default=0.1

Value of parameter \(\delta\). It ensures numerical stability when \(W\) is rank-deficient.

lambdaafloat, default=1e-3

The intial weight, \(\lambda\), of the volume-regularization term.

References

[1]

Gillis, N. (2020). Nonnegative matrix factorization. Society for Industrial and Applied Mathematics.

[2]

Leplat, Valentin, Andersen MS Ang, and Nicolas Gillis. “Minimum-volume rank- deficient nonnegative matrix factorizations.” ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2019.

Examples

>>> from mcrnmf.models import MinVol, SNPA
>>> from mcrnmf.datasets import load_rxn_spectra
>>>
>>> # load the example dataset from mcrnmf
>>> X, wv, time  = load_rxn_spectra()
>>>
>>> # generate initial guess using SNPA
>>> snpa = SNPA(rank=4)
>>> snpa.fit(X)
>>> Wi = snpa.W  # Initial estimate for W
>>> Hi = snpa.H  # Initial estimate for H
>>>
>>> # create an instance of FroALS and fit the model
>>> model = MinVol(rank=4, constraint_kind=1, iter_max=2000, tol=1e-4, lambdaa=1e-3)
>>> model.fit(X, Wi, Hi)
>>> # access decomposed factors
>>> W, H = model.W, model.H
>>> # check convergence status
>>> converged = model.is_converged
>>> # access rel. reconstruction error after each iterations
>>> rel_recon_err = model.rel_reconstruction_error_ls
fit(X, Wi, Hi, known_W=None, known_H=None, preprocess_scale_WH=True)[source]#

Fit the MinVol model to the provided data.

Parameters:
Xndarray of shape (n_features, n_samples)

Data array to be factorized.

Windarray of shape (n_features, rank)

Initial guess for the factor \(W\).

Hindarray of shape (rank, n_samples)

Initial guess for the factor \(H\).

known_Wndarray of shape (n_features, rank), default=None

Array containing known values of \(W\).

  • The np.nan elements of the array are treated as unknown.

  • Equality constraint is applied at those indices of \(W\) which do not correspond np.nan entries in known_W.

known_Hndarray of shape (rank, n_samples), default=None

Array containing known values of \(H\).

  • The np.nan elements of the array are treated as unknown.

  • Equality constraint is applied at those indices of \(H\) which do not correspond np.nan entries in known_H.

preprocess_scale_WHbool, default=True

If True, Wi and Hi are scaled before optimization.

property H#

The coefficient matrix \(H\) obtained after fitting the model.

Returns:
ndarray of shape (rank, n_samples)

The obtained \(H\) after fitting the model.

property W#

The basis matrix \(W\) obtained after fitting the model.

Returns:
ndarray of shape (n_features, rank)

The obtained \(W\) after fitting the model.

property is_converged#

The convergence status.

Returns:
bool

Whether the algorithm converged within iter_max iterations.

  • True if convergence was reached based on tol criterion

  • False if maximum iterations were reached without convergence

property rel_loss_ls#

List of relative loss values from each iteration during model fitting.

It is defined as:

\[\dfrac{\sqrt{f_{\textrm{obj}}^{i}}}{||X||_F}\ \textrm{,}\]

where:

  • \(||\cdot||_F\) denotes the Frobenius norm

  • \(X\) is the original data matrix

  • \(f_{\textrm{obj}}^{i}\) is the value of objective function after iteration \(i\)

Returns:
list of float

Relative loss value from each iteration.

property rel_reconstruction_error_ls#

List of relative reconstruction errors from each iteration during model fitting.

The relative reconstruction error measures how well the current factors approximate the original data. It is the ratio:

\[\dfrac{||X - W^{i}H^{i}||_F}{||X||_F}\ \textrm{,}\]

where:

  • \(||\cdot||_F\) denotes the Frobenius norm

  • \(X\) is the original data matrix

  • \(W^{i}\) and \(H^{i}\) are values of \(W\) and \(H\) after iteration \(i\)

Returns:
list of float

Relative reconstruction error from each iteration.