FroALS

class mcrnmf.models.FroALS(rank, constraint_kind=0, unimodal=None, iter_max=500, tol=0.0001)

Frobenius norm-based Nonnegative Matrix Factorization (NMF) using the Alternating Least Squares (ALS) method.

Factorizes a nonnegative matrix \(X\) into two nonnegative matrices \(W\) and \(H\) by minimizing the squared Frobenius norm of the difference between \(X\) and the product \(WH\) using the ALS algorithm. The following objective function is minimized subject to nonnegativity and other optional constraints on \(W\) and \(H\):

\[f_{\textrm{obj}} = ||X - WH||_F^2\]

where \(||\cdot||_{F}\) is the Frobenius norm.
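
As a minimal illustration of this objective (not the library's internal code), the squared Frobenius norm can be evaluated directly with NumPy. The matrices below are made-up toy data chosen only so that the arithmetic is easy to follow.

```python
import numpy as np

# Toy factors (illustrative only): W is 3 x 2, H is 2 x 4
W = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
H = np.array([[1.0, 2.0, 0.0, 1.0], [0.0, 1.0, 3.0, 1.0]])

# Build X from the factors, then perturb one entry so the fit is not exact
X = W @ H
X[0, 0] += 0.5

# f_obj = ||X - WH||_F^2
f_obj = np.linalg.norm(X - W @ H, ord="fro") ** 2
```

Only one entry of the residual is nonzero here (0.5), so the objective is the square of that single entry.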

Parameters:
rank : int

The number of components for the factorization.

constraint_kind : integer-like {0, 1, 2, 3, 4}, default=0

The following constraints are applied based on the integer value specified:

  • If 0: Only \(W \geq 0\), \(H \geq 0\).

  • If 1: Closure constraint \(H^T e \leq e\).

  • If 2: Closure constraint \(H e = e\).

  • If 3: Constraint \(W^T e = e\).

  • If 4: Closure constraint \(H^T e = e\).

Note that for constraint_kind values 1, 2, 3, and 4, the nonnegativity constraints are applied in addition to the constraint listed above.
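
The sketch below spells out what each constraint means in terms of row and column sums. It assumes that \(e\) denotes a vector of ones of the appropriate length, which this page does not state explicitly. The helper `satisfies_constraint` is hypothetical and is not part of mcrnmf.

```python
import numpy as np

def satisfies_constraint(W, H, constraint_kind, atol=1e-8):
    """Check nonnegativity plus the extra constraint for a given kind.

    Hypothetical helper; assumes e is a vector of ones.
    """
    if np.any(W < 0) or np.any(H < 0):
        return False
    if constraint_kind == 0:
        return True
    if constraint_kind == 1:  # H^T e <= e: each column of H sums to at most 1
        return bool(np.all(H.sum(axis=0) <= 1 + atol))
    if constraint_kind == 2:  # H e = e: each row of H sums to 1
        return bool(np.allclose(H.sum(axis=1), 1.0, atol=atol))
    if constraint_kind == 3:  # W^T e = e: each column of W sums to 1
        return bool(np.allclose(W.sum(axis=0), 1.0, atol=atol))
    if constraint_kind == 4:  # H^T e = e: each column of H sums to 1
        return bool(np.allclose(H.sum(axis=0), 1.0, atol=atol))
    raise ValueError("constraint_kind must be one of 0, 1, 2, 3, 4")

# Toy factors: columns of W sum to 1; columns of H sum to 1.0, 1.0 and 0.3
W = np.array([[0.2, 0.5], [0.8, 0.5]])
H = np.array([[0.3, 0.6, 0.1], [0.7, 0.4, 0.2]])

results = {kind: satisfies_constraint(W, H, kind) for kind in range(5)}
```

With these toy factors, kinds 0, 1, and 3 hold, while kinds 2 and 4 fail because the second row of `H` sums to 1.3 and the last column of `H` sums to 0.3.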

unimodal : dict or None, default=None

Specifies unimodality constraints for \(W\) and \(H\) matrices. If None, no unimodality constraints are applied.

If dict, the format is {'W': bool | list of bool, 'H': bool | list of bool}:

  • Must contain at least one key ('W' or 'H')

  • No other keys besides 'W' and 'H' are allowed

  • For 'W': Controls unimodality of columns in W

  • For 'H': Controls unimodality of rows in H

Each value can be:

  • A boolean: applies to all components

  • A list of booleans: selectively applies to specific components

Examples:

  • {'H': True}: All \(H\) components are unimodal

  • {'W': True, 'H': True}: All \(W\) and \(H\) components are unimodal

  • {'H': [True, False, True]}: Only components 0 and 2 of \(H\) have unimodal behavior

  • {'W': [False, True, False]}: Only component 1 of \(W\) has unimodal behavior
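
To make the per-component flags concrete, the following sketch checks rows of a toy \(H\) against a `unimodal` dict. The helper `is_unimodal` is hypothetical and only illustrates one common reading of unimodality (weakly rising to a single peak, then weakly falling, with monotone profiles counted as unimodal); the library's exact definition may differ.

```python
import numpy as np

def is_unimodal(v):
    """Return True if v rises (weakly) to its maximum and then falls (weakly)."""
    v = np.asarray(v, dtype=float)
    peak = int(np.argmax(v))
    rising = np.all(np.diff(v[: peak + 1]) >= 0)
    falling = np.all(np.diff(v[peak:]) <= 0)
    return bool(rising and falling)

# Only components 0 and 2 of H are requested to be unimodal (rank = 3)
unimodal = {"H": [True, False, True]}

H = np.array([
    [0.0, 0.4, 0.9, 0.3, 0.0],  # single interior peak
    [0.5, 0.1, 0.6, 0.2, 0.7],  # several peaks; its flag is False
    [1.0, 0.7, 0.4, 0.1, 0.0],  # monotone decreasing, peak at index 0
])

# Check only the rows whose flag is True
flagged_rows_ok = all(
    is_unimodal(row) for row, flag in zip(H, unimodal["H"]) if flag
)
```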

iter_max : int, default=500

Maximum number of iterations. Must be \(\geq 10\).

tol : float, default=1e-4

Tolerance for convergence. Must be in the interval \((0, 1)\).

Convergence is reached when:

\[{|e[i] - e[i-10]| \over e[i]} \leq \textrm{tol}\]

where:

  • iteration \(i \geq 10\)

  • \(e[i]\) is the squared relative loss after iteration \(i\), which is defined as

    \[e[i] = {||X - W^{i}H^{i}||_{F}^2 \over ||X||_{F}^2}\]

    where \(W^{i}\) and \(H^{i}\) are the values of \(W\) and \(H\), respectively, after iteration \(i\).
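
The criterion above can be sketched in plain Python as follows. The function `has_converged` and the synthetic error sequence are illustrative assumptions, not part of mcrnmf; the sketch only mirrors the formula, comparing the latest value with the one from 10 iterations earlier.

```python
def has_converged(e, tol=1e-4, lag=10):
    """Apply |e[i] - e[i - lag]| / e[i] <= tol to the latest entry of e."""
    i = len(e) - 1
    if i < lag:
        # fewer than `lag` earlier iterations are available for comparison
        return False
    return abs(e[i] - e[i - lag]) / e[i] <= tol

# Hypothetical squared relative losses that decay toward 0.01
errors = [0.01 + 0.5 ** k for k in range(40)]

# First iteration index at which the criterion is satisfied
first_i = next(
    i for i in range(len(errors)) if has_converged(errors[: i + 1])
)
```

Because the comparison spans 10 iterations, a slowly decreasing loss is not mistaken for convergence just because two consecutive values are close.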

References

[1]

Gillis, Nicolas. Nonnegative matrix factorization. Society for Industrial and Applied Mathematics, 2020.

[2]

Van Benthem, Mark H., and Michael R. Keenan. “Fast algorithm for the solution of large‐scale non‐negativity‐constrained least squares problems.” Journal of Chemometrics: A Journal of the Chemometrics Society 18.10 (2004): 441-450.

Examples

>>> from mcrnmf.models import FroALS, SNPA
>>> from mcrnmf.datasets import load_rxn_spectra
>>>
>>> # load the example dataset from mcrnmf
>>> X, wv, time = load_rxn_spectra()
>>>
>>> # generate initial guess using SNPA
>>> snpa = SNPA(rank=4)
>>> snpa.fit(X)
>>> Wi = snpa.W  # Initial estimate for W
>>> Hi = snpa.H  # Initial estimate for H
>>>
>>> # create an instance of FroALS and fit the model
>>> model = FroALS(rank=4, constraint_kind=1, iter_max=2000, tol=1e-4)
>>> model.fit(X, Wi, Hi)
>>> # access decomposed factors
>>> W, H = model.W, model.H
>>> # check convergence status
>>> converged = model.is_converged
>>> # access the relative reconstruction error after each iteration
>>> rel_recon_err = model.rel_reconstruction_error_ls

fit(X, Wi, Hi, known_W=None, known_H=None, preprocess_scale_WH=False)

Fit the FroALS model to the provided data.

Parameters:
X : ndarray of shape (n_features, n_samples)

Data array to be factorized.

Wi : ndarray of shape (n_features, rank)

Initial guess for the factor \(W\).

Hi : ndarray of shape (rank, n_samples)

Initial guess for the factor \(H\).

known_W : ndarray of shape (n_features, rank), default=None

Array containing known values of \(W\).

  • The np.nan elements of the array are treated as unknown.

  • An equality constraint is applied at those indices of \(W\) which do not correspond to np.nan entries in known_W.

known_H : ndarray of shape (rank, n_samples), default=None

Array containing known values of \(H\).

  • The np.nan elements of the array are treated as unknown.

  • An equality constraint is applied at those indices of \(H\) which do not correspond to np.nan entries in known_H.
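
A known-values array of this kind can be built by starting from an all-NaN array and filling in only the entries that are known. The sizes and the specific known entries below are hypothetical and serve only to show the pattern.

```python
import numpy as np

rank, n_samples = 3, 5  # hypothetical sizes

# Start with every entry marked as unknown
known_H = np.full((rank, n_samples), np.nan)

# Suppose component 0 is known to be absent in the first two samples
known_H[0, :2] = 0.0
# and component 2 is known to have the value 1 in the last sample
known_H[2, -1] = 1.0

# Entries where an equality constraint would apply (non-NaN positions)
fixed_mask = ~np.isnan(known_H)
n_fixed = int(fixed_mask.sum())
```

Following the `fit` signature above, such an array would be passed as `model.fit(X, Wi, Hi, known_H=known_H)`; the same pattern applies to `known_W` with shape (n_features, rank).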

preprocess_scale_WH : bool, default=False

If True, Wi and Hi are scaled before optimization.

property H

The coefficient matrix \(H\) obtained after fitting the model.

Returns:
ndarray of shape (rank, n_samples)

The obtained \(H\) after fitting the model.

property W

The basis matrix \(W\) obtained after fitting the model.

Returns:
ndarray of shape (n_features, rank)

The obtained \(W\) after fitting the model.

property is_converged

The convergence status.

Returns:
bool

Whether the algorithm converged within iter_max iterations.

  • True if convergence was reached based on tol criterion

  • False if maximum iterations were reached without convergence

property rel_loss_ls

List of relative loss values from each iteration during model fitting.

It is defined as:

\[\dfrac{\sqrt{f_{\textrm{obj}}^{i}}}{||X||_F}\ \textrm{,}\]

where:

  • \(||\cdot||_F\) denotes the Frobenius norm

  • \(X\) is the original data matrix

  • \(f_{\textrm{obj}}^{i}\) is the value of objective function after iteration \(i\)

Returns:
list of float

Relative loss value from each iteration.

property rel_reconstruction_error_ls

List of relative reconstruction errors from each iteration during model fitting.

The relative reconstruction error measures how well the current factors approximate the original data. It is the ratio:

\[\dfrac{||X - W^{i}H^{i}||_F}{||X||_F}\ \textrm{,}\]

where:

  • \(||\cdot||_F\) denotes the Frobenius norm

  • \(X\) is the original data matrix

  • \(W^{i}\) and \(H^{i}\) are values of \(W\) and \(H\) after iteration \(i\)

Returns:
list of float

Relative reconstruction error from each iteration.
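
The quantities defined on this page can be reproduced for a single pair of factors with NumPy. This is a sketch on random toy data, not the library's implementation. Going only by the formulas as written here, with \(f_{\textrm{obj}} = ||X - WH||_F^2\) the relative loss and the relative reconstruction error coincide, and the squared relative loss \(e[i]\) used in the tol criterion is the square of either one.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy factors and nearly rank-2 data (illustrative only)
W = rng.random((6, 2))
H = rng.random((2, 8))
X = W @ H + 0.01 * rng.random((6, 8))

norm_X = np.linalg.norm(X, ord="fro")

# Relative reconstruction error: ||X - WH||_F / ||X||_F
rel_recon_err = np.linalg.norm(X - W @ H, ord="fro") / norm_X

# Relative loss: sqrt(f_obj) / ||X||_F with f_obj = ||X - WH||_F^2
f_obj = np.linalg.norm(X - W @ H, ord="fro") ** 2
rel_loss = np.sqrt(f_obj) / norm_X

# Squared relative loss e[i] from the convergence criterion
e = f_obj / norm_X ** 2
```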