| Title: | Collapsed Variational Inference for Dirichlet Process (DP) Mixture Model |
|---|---|
| Description: | Collapsed Variational Inference for a Dirichlet Process (DP) mixture model with unknown covariance matrix structure and DP concentration parameter. It enables efficient clustering of high-dimensional data and is significantly faster than traditional MCMC methods. The package offers 8 parameterisations of the unknown covariance matrix, each with a corresponding prior, from which the user can choose. |
| Authors: | Annesh Pal [aut, cre] (ORCID: <https://orcid.org/0009-0003-2146-180X>), Boris Hejblum [aut] (ORCID: <https://orcid.org/0000-0003-0646-452X>) |
| Maintainer: | Annesh Pal <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.1.2 |
| Built: | 2026-05-25 22:16:04 UTC |
| Source: | https://github.com/annesh07/vimixr |
Calculate the columnwise sum of rowwise cumulative probability
cum_clustprop(P1)
P1 |
probability matrix |
rowwise cumulative probability
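As a rough illustration of the quantity described above, the base R sketch below takes one literal reading: accumulate each row of the probability matrix from left to right, then sum every column. Whether cum_clustprop() accumulates in this direction (rather than, say, using tail probabilities) is an assumption, and the helper name cum_colsum_sketch is hypothetical.

```r
# Hypothetical base R helper (not part of vimixr).
# Assumption: probabilities are accumulated left to right within each row.
cum_colsum_sketch <- function(P1) {
  K <- ncol(P1)
  # Upper-triangular matrix of ones: multiplying by it gives row-wise cumulative sums.
  U <- 1 * upper.tri(diag(K), diag = TRUE)
  cum_rows <- P1 %*% U
  # Column-wise sum of the accumulated probabilities.
  colSums(cum_rows)
}

P1 <- matrix(c(0.2, 0.3, 0.5,
               0.6, 0.3, 0.1), nrow = 2, byrow = TRUE)
res_cum <- cum_colsum_sketch(P1)
```

Because every row of a probability matrix sums to one, the last entry of the result equals the number of rows under this reading.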
Calculate the columnwise sum of rowwise cumulative probability for variance
cum_clustprop_var(P1)
P1 |
probability matrix |
rowwise cumulative probability for variance
Collapsed variational inference for non-parametric Bayesian mixture models
cvi_npmm(
  X,
  variational_params,
  prior_shape_alpha,
  prior_rate_alpha,
  post_shape_alpha,
  post_rate_alpha,
  prior_mean_eta,
  post_mean_eta,
  log_prob_matrix = NULL,
  maxit = 100,
  n_inits = 5,
  Seed = NULL,
  parallel = FALSE,
  pca_plot = FALSE,
  covariance_type = "full",
  fixed_variance = FALSE,
  cluster_specific_covariance = TRUE,
  variance_prior_type = c("IW", "decomposed", "sparse", "off-diagonal normal"),
  ...
)
X |
input data as a matrix |
variational_params |
number of clusters in the variational distribution |
prior_shape_alpha |
shape parameter of Gamma prior for the DP concentration parameter alpha. Default is 0.001 |
prior_rate_alpha |
rate parameter of Gamma prior for the DP concentration parameter alpha. Default is 0.001 |
post_shape_alpha |
initial value for posterior update of shape parameter for alpha. Default is 0.001 |
post_rate_alpha |
initial value for posterior update of rate parameter for alpha. Default is 0.001 |
prior_mean_eta |
mean vector of MVN prior for the DP mean parameters. Default is zero vector |
post_mean_eta |
initial value of posterior update for the DP mean parameter |
log_prob_matrix |
logarithm of cluster allocation probability matrix. Default is NULL |
maxit |
maximum number of iterations. Default is 100 |
n_inits |
Number of random initialisations if log_prob_matrix and other case-specific hyperparameters are NULL. Default is 5 |
Seed |
Seeds for random initialisation; either a vector of n_inits integers or NULL. Default is NULL. |
parallel |
Logical input for parallelisation. Default is FALSE |
pca_plot |
Logical input for pca plot. Default is FALSE |
covariance_type |
covariance matrix is considered diagonal or full. Default is 'full' |
fixed_variance |
covariance matrix of the data is considered known (fixed) or unknown. Default is FALSE |
cluster_specific_covariance |
covariance matrix is specific to each cluster or shared across all clusters. Default is TRUE |
variance_prior_type |
For unknown and full covariance matrix, choice of matrix prior is either Inverse-Wishart ('IW') or Cholesky-decomposed ('decomposed'). For unknown, full and cluster-specific covariance matrix, choice of matrix prior is either Inverse-Wishart ('IW'), element-wise Gamma and Laplace distributed ('sparse') or element-wise Gamma and Normal distributed ('off-diagonal normal') |
... |
additional parameters, further details given below |
The following models are supported in vimixr, listed with the additional arguments each requires in ... when calling cvi_npmm():
Known covariance
Diagonal covariance. We need the following additional arguments:
cov_data: a non-negative diagonal matrix, representing the covariance of the data
prior_precision_scalar_eta: a non-negative scalar, representing the precision prior for the DP mean parameters
post_precision_scalar_eta: initial value for the posterior update of precision for the DP mean parameters
Full covariance. We need the following additional arguments:
cov_data: a positive definite matrix, representing the covariance of the data
prior_cov_eta: a positive definite matrix, representing the covariance prior for the DP mean parameters
post_cov_eta: initial value for the posterior update of covariance for the DP mean parameters
Unknown covariance (global)
Diagonal covariance. We need the following additional arguments:
prior_shape_scalar_cov: a non-negative scalar, representing the shape parameter of Gamma prior for the precision
prior_rate_scalar_cov: a non-negative scalar, representing the rate parameter of Gamma prior for the precision
post_shape_scalar_cov: initial value for posterior update of precision shape parameter
post_rate_scalar_cov: initial value for posterior update of precision rate parameter
prior_precision_scalar_eta: a non-negative scalar, representing the precision prior for the DP mean parameters
post_precision_scalar_eta: initial value for the posterior update of precision for the DP mean parameters
Inverse-Wishart. We need the following additional arguments:
prior_df_cov: a scalar as the degrees of freedom parameter of the Inverse-Wishart prior. Default value D+2
prior_scale_cov: positive-definite matrix as the scale parameter of the Inverse-Wishart prior
post_df_cov: initial value for the posterior update of degrees of freedom
post_scale_cov: initial value for the posterior update of scale matrix
prior_cov_eta: a positive definite matrix, representing the covariance prior for the DP mean parameters
post_cov_eta: initial value for the posterior update of covariance for the DP mean parameters
Cholesky decomposition. We need the following additional arguments:
prior_shape_diag_decomp: a non-negative scalar as the shape parameter of Gamma prior for diagonal elements of the Cholesky-decomposed matrix
prior_rate_diag_decomp: a non-negative scalar as the rate parameter of Gamma prior for diagonal elements of the Cholesky-decomposed matrix
prior_mean_offdiag_decomp: a scalar as the mean parameter of Normal prior for off-diagonal elements of the Cholesky-decomposed matrix
prior_var_offdiag_decomp: a non-negative scalar as the variance parameter of Normal prior for off-diagonal elements of the Cholesky-decomposed matrix
post_shape_diag_decomp: initial value for posterior update of the shape parameter for diagonal elements
post_rate_diag_decomp: initial value for posterior update of the rate parameter for diagonal elements
post_mean_offdiag_decomp: initial value for posterior update of the mean parameter for off-diagonal elements
post_var_offdiag_decomp: initial value for posterior update of the variance parameter for off-diagonal elements
prior_cov_eta: a positive definite matrix, representing the covariance prior for the DP mean parameters
post_cov_eta: initial value for the posterior update of covariance for the DP mean parameters
Unknown covariance (cluster-specific)
Inverse-Wishart. We need the following additional arguments:
prior_df_cs_cov: a vector representing degrees of freedom parameters for each cluster-specific Inverse-Wishart prior
prior_scale_cs_cov: an array of positive-definite matrices representing scale matrix parameters for each cluster-specific Inverse-Wishart prior
post_df_cs_cov: initial value for posterior update of the degrees of freedom parameters
post_scale_cs_cov: initial value for posterior update of the scale matrix parameters
scaling_cov_eta: a non-negative scaling factor for covariance matrix of the DP mean parameters
Element-wise Gamma and Laplace prior. We need the following additional arguments:
prior_shape_d_cs_cov: a non-negative vector as shape parameters for cluster-specific Gamma priors of the diagonal elements
prior_rate_d_cs_cov: a non-negative matrix as rate parameter for cluster-specific Gamma prior of the diagonal elements
prior_var_offd_cs_cov: a non-negative vector as variance parameter for cluster-specific Laplace priors of the off-diagonal elements
post_shape_d_cs_cov: initial value for posterior update of the diagonal shape parameters
post_rate_d_cs_cov: initial value for posterior update of the diagonal rate parameters
post_var_offd_cs_cov: initial values for the sum, squared sum and log sum of the off-diagonal variance parameters, used for computation; strictly positive
scaling_cov_eta: a non-negative scaling factor for covariance matrix of the DP mean parameters
Element-wise Gamma and Normal prior. We need the following additional arguments:
prior_shape_d_cs_cov: a non-negative vector as shape parameters for cluster-specific Gamma priors of the diagonal elements
prior_rate_d_cs_cov: a non-negative matrix as rate parameter for cluster-specific Gamma prior of the diagonal elements
prior_var_offd_cs_cov: a non-negative scalar as variance parameter for cluster-specific Normal priors of the off-diagonal elements
post_shape_d_cs_cov: initial value for posterior update of the diagonal shape parameters
post_rate_d_cs_cov: initial value for posterior update of the diagonal rate parameters
post_mean_offd_cs_cov: initial value for posterior update of the off-diagonal mean parameters
scaling_cov_eta: a non-negative scaling factor for covariance matrix of the DP mean parameters
cvi_npmm() returns a list with the following elements:
alpha: posterior DP concentration parameter
Cluster number: number of clusters from posterior probability allocation matrix
Cluster Proportion: cluster proportions from posterior probability allocation matrix
log Probability matrix: log of posterior probability allocation matrix
ELBO: Optimisation of the ELBO function
Iterations: Number of iterations required for convergence
PCA_viz: A PCA [ggplot2] plot to visualize the clustering of data based on cluster labels
ELBO_viz: A line [ggplot2] plot to visualize the ELBO optimisation
X <- rbind(matrix(rnorm(100, mean = 0, sd = 0.5), ncol = 2),
           matrix(rnorm(100, mean = 3, sd = 0.5), ncol = 2))
# for fixed-diagonal
res <- cvi_npmm(X, variational_params = 20,
                prior_shape_alpha = 0.001, prior_rate_alpha = 0.001,
                post_shape_alpha = 0.001, post_rate_alpha = 0.001,
                prior_mean_eta = matrix(0, 1, ncol(X)),
                post_mean_eta = matrix(0.001, 20, ncol(X)),
                log_prob_matrix = t(apply(matrix(-3, nrow(X), 20), 1,
                                          function(x){x/sum(x)})),
                maxit = 100,
                fixed_variance = TRUE, covariance_type = "diagonal",
                prior_precision_scalar_eta = 0.001,
                post_precision_scalar_eta = matrix(0.001, 20, 1),
                cov_data = diag(ncol(X)))
summary(res)
plot(res)
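For models other than the fixed-diagonal one, the extra arguments documented for cvi_npmm() have to be assembled first. The sketch below builds them for the global Inverse-Wishart case using the argument names from the list of supported models. The numeric values are placeholders, and starting each post_* value at its prior counterpart is an assumption, since the expected shapes of the initial values are not spelled out in this manual.

```r
# Sketch only: assemble the extra arguments for the global Inverse-Wishart model.
# Values are illustrative placeholders, not recommended settings.
set.seed(1)
X <- rbind(matrix(rnorm(100, mean = 0, sd = 0.5), ncol = 2),
           matrix(rnorm(100, mean = 3, sd = 0.5), ncol = 2))
D <- ncol(X)

iw_args <- list(
  prior_df_cov    = D + 2,    # documented default for the degrees of freedom
  prior_scale_cov = diag(D),  # positive-definite scale matrix
  post_df_cov     = D + 2,    # initial value (assumption: start at the prior)
  post_scale_cov  = diag(D),  # initial value (assumption: start at the prior)
  prior_cov_eta   = diag(D),  # covariance prior for the DP mean parameters
  post_cov_eta    = diag(D)   # initial value (assumption: a single D x D matrix)
)

# A symmetric matrix is positive definite when all eigenvalues are strictly positive.
is_pd <- function(M) {
  all(eigen(M, symmetric = TRUE, only.values = TRUE)$values > 0)
}
stopifnot(is_pd(iw_args$prior_scale_cov), is_pd(iw_args$prior_cov_eta))
```

Such a list could then be supplied through ... together with fixed_variance = FALSE, covariance_type = "full", cluster_specific_covariance = FALSE and variance_prior_type = "IW", for example via do.call(); check the package source for the exact shapes before relying on it.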
Update of the variational parameters
CVI_update_function(
  fixed_variance = FALSE,
  covariance_type = "diagonal",
  cluster_specific_covariance = TRUE,
  variance_prior_type = c("IW", "decomposed", "sparse", "off-diagonal normal"),
  X,
  inverts,
  params
)
fixed_variance |
whether the covariance is fixed or estimated. Default is FALSE |
covariance_type |
The assumed type of the covariance matrix. Can be either "diagonal" or "full" |
cluster_specific_covariance |
whether the covariance is shared across estimated clusters or is cluster specific. Default is TRUE |
variance_prior_type |
character string specifying the type of prior distribution
for the covariance when cluster_specific_covariance is |
X |
the data matrix |
inverts |
a list of inverses |
params |
a list of required arguments |
Updated parameters
Root for a0 hyper-parameter for Sparse DPMM
eBa0(
  logP,
  X,
  a_min = min(1e-08, 1/ncol(X)),
  a_max = max(1e+06, ncol(X)),
  grid_points = min(ncol(X), 10000)
)
logP |
log of probability allocation matrix |
X |
observed data |
a_min |
minimum value of a0 for grid search |
a_max |
maximum value of a0 for grid search |
grid_points |
number of points for grid search |
No return value, called for side effects.
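The general pattern of bracketing a root on a grid between a_min and a_max and then refining it can be sketched in base R as follows. The objective below is a toy stand-in chosen so that the answer is known in advance; it is not the equation eBa0() solves, and the grid spacing is likewise an assumption.

```r
# Toy objective (an assumption, not vimixr's): solve log(a) - digamma(a) = s for a.
s <- log(2) - digamma(2)              # constructed so that the root is a = 2
f <- function(a) log(a) - digamma(a) - s

# Log-spaced grid between illustrative bounds, mirroring a_min, a_max and grid_points.
a_min <- 1e-08
a_max <- 1e+06
grid_points <- 50
grid <- exp(seq(log(a_min), log(a_max), length.out = grid_points))
vals <- f(grid)

# The first grid interval where the objective changes sign brackets the root.
k <- which(diff(sign(vals)) != 0)[1]
root <- uniroot(f, lower = grid[k], upper = grid[k + 1], tol = 1e-10)$root
```

A log-spaced grid is used here because the search range spans many orders of magnitude; a linear grid with the same number of points would place almost none of them near small values.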
ELBO calculating functions depending on type of model for covariance matrix
elbo_fixed_diagonal(X, inverts, params)
X |
the data matrix |
inverts |
a list of inverses |
params |
a list of required arguments |
No return value, called for side effects.
General ELBO function
ELBO_function(
  fixed_variance = FALSE,
  covariance_type = "diagonal",
  cluster_specific_covariance = TRUE,
  variance_prior_type = c("IW", "decomposed", "sparse", "off-diagonal normal"),
  X,
  inverts,
  params
)
fixed_variance |
whether the covariance is fixed or estimated. Default is FALSE |
covariance_type |
The assumed type of the covariance matrix. Can be either "diagonal" or "full" |
cluster_specific_covariance |
whether the covariance is shared across estimated clusters or is cluster specific. Default is TRUE |
variance_prior_type |
character string specifying the type of prior distribution
for the covariance when cluster_specific_covariance is |
X |
the data matrix |
inverts |
a list of inverses |
params |
a list of required arguments |
ELBO values
Generate random log Probability matrix if not provided
generate_log_prob(N, T0, seed0)
N |
rows of the data matrix |
T0 |
variational clusters |
seed0 |
seed for generating log Probability matrix |
No return value, called for side effects.
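One plausible way to draw a random log probability matrix with N rows and T0 columns from a seed is sketched below. The choice of Gamma-distributed weights and the helper name random_log_prob_sketch are assumptions made for illustration; the distribution used by generate_log_prob() is not documented here.

```r
# Hypothetical sketch (not vimixr's implementation).
random_log_prob_sketch <- function(N, T0, seed0) {
  set.seed(seed0)
  # Positive random weights: one row per observation, one column per variational cluster.
  W <- matrix(rgamma(N * T0, shape = 1), nrow = N, ncol = T0)
  # Normalise each row to sum to one, then move to the log scale.
  log(W / rowSums(W))
}

logP <- random_log_prob_sketch(N = 5, T0 = 3, seed0 = 42)
```

Fixing the seed inside the helper makes each random initialisation reproducible, which matches the role of the Seed argument of cvi_npmm().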
Log-sum-exponential computation on the log probability allocation matrix
log_sum_exp(Plog)
Plog |
log probability allocation matrix |
per sample ordered log probability allocation matrix
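The standard log-sum-exp trick, applied row by row to a log probability matrix, can be written in base R as below. This is a sketch of the general technique with hypothetical helper names; the exact form of the matrix returned by log_sum_exp() may differ.

```r
# Base R sketch of the log-sum-exp trick (helper names are hypothetical).
row_log_sum_exp <- function(Plog) {
  m <- apply(Plog, 1, max)             # row maxima, subtracted for numerical stability
  m + log(rowSums(exp(Plog - m)))      # log(sum(exp(row))) without underflow
}

# Normalise each row so that its exponentiated entries sum to one.
normalise_log_rows <- function(Plog) Plog - row_log_sum_exp(Plog)

Plog <- matrix(c(-1000, -1001, -1002,
                 log(0.2), log(0.3), log(0.5)), nrow = 2, byrow = TRUE)
Pnorm <- normalise_log_rows(Plog)
```

The first row would underflow to zero if exponentiated directly; subtracting the row maximum first keeps every intermediate value finite.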
Extract the lower triangular elements of a matrix and compute their sum, squared sum and log sum
lower_tri_stats(M)
M |
matrix |
a vector of sum, squared sum and log sum elements
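The three statistics can be reproduced in base R as sketched below. Two assumptions are made: "log sum" is read as the sum of the logarithms, and the strictly lower triangle (excluding the diagonal) is used with positive entries. The helper name is hypothetical.

```r
# Hypothetical base R equivalent (assumes positive entries below the diagonal).
lower_tri_stats_sketch <- function(M) {
  v <- M[lower.tri(M)]                 # entries strictly below the diagonal
  c(sum = sum(v), sq_sum = sum(v^2), log_sum = sum(log(v)))
}

M <- matrix(c(1, 2, 3,
              2, 1, 4,
              3, 4, 1), nrow = 3, byrow = TRUE)
stats <- lower_tri_stats_sketch(M)
```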
Calculate matrix multiplication with optional transposition.
mat_mult(A, B, transpose_A = FALSE, transpose_B = FALSE)
A |
matrix or vector |
B |
matrix or vector |
transpose_A |
transpose A before multiplying |
transpose_B |
transpose B before multiplying |
A %*% B (or variant), as vector if either input was a vector
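The documented behaviour can be mirrored in base R as in the sketch below, which is useful for checking results on small inputs. The function name mat_mult_sketch is hypothetical, and dropping the dimensions whenever either input is a vector is an interpretation of the return description.

```r
# Hypothetical base R reference for the behaviour described above.
mat_mult_sketch <- function(A, B, transpose_A = FALSE, transpose_B = FALSE) {
  # Remember whether either input arrived as a plain vector.
  was_vector <- is.null(dim(A)) || is.null(dim(B))
  if (transpose_A) A <- t(A)
  if (transpose_B) B <- t(B)
  out <- A %*% B
  # Return a plain vector if either input was one, otherwise a matrix.
  if (was_vector) drop(out) else out
}

A <- matrix(1:6, nrow = 2)
B <- matrix(6:1, nrow = 2)
AB_t <- mat_mult_sketch(A, B, transpose_B = TRUE)   # A %*% t(B)
Ax   <- mat_mult_sketch(A, c(1, 2, 3))              # matrix times vector
```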
Calculate a combination of matrix multiplications
mat_mult_t(A, B, C)
A |
matrix |
B |
matrix |
C |
matrix |
A %*% B %*% t(C)
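For small inputs, the same product can be checked in base R, where tcrossprod() avoids forming the transpose explicitly. The matrices below are arbitrary illustrative values.

```r
# Base R check of a product with a transposed last factor.
A <- matrix(1:4, nrow = 2)
B <- diag(2)
C <- matrix(c(1, 0, 1, 1), nrow = 2)

direct  <- A %*% B %*% t(C)
via_tcp <- tcrossprod(A %*% B, C)      # tcrossprod(x, y) computes x %*% t(y)
```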
Function to check the list of type-specific arguments
params_check(
  params,
  fixed_variance = FALSE,
  covariance_type = "diagonal",
  cluster_specific_covariance = TRUE,
  variance_prior_type = c("IW", "decomposed", "sparse", "off-diagonal normal")
)
params |
the list of required parameters |
fixed_variance |
whether covariance is assumed fixed or not; can be TRUE or FALSE |
covariance_type |
structure of covariance matrix; can be "diagonal" or "full" |
cluster_specific_covariance |
whether covariance matrix is cluster specific or not; can be TRUE or FALSE |
variance_prior_type |
prior distribution for the covariance matrix; can be "IW" or "decomposed" when cluster_specific_covariance = FALSE, or can be "IW", "sparse" or "off-diagonal normal" otherwise |
Stops execution if any required argument is missing from the list
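A check of this kind can be sketched in a few lines of base R. The helper check_required_sketch is hypothetical; the required names used in the example are the three additional arguments documented for the known, diagonal covariance model.

```r
# Hypothetical sketch of a required-argument check (not vimixr's implementation).
check_required_sketch <- function(params, required) {
  missing_args <- setdiff(required, names(params))
  if (length(missing_args) > 0) {
    stop("Missing required arguments: ",
         paste(missing_args, collapse = ", "), call. = FALSE)
  }
  invisible(TRUE)
}

# Arguments documented for the known, diagonal covariance model.
required_fixed_diag <- c("cov_data", "prior_precision_scalar_eta",
                         "post_precision_scalar_eta")
params <- list(cov_data = diag(2), prior_precision_scalar_eta = 0.001)

# Capture the error message instead of halting the session.
msg <- tryCatch(check_required_sketch(params, required_fixed_diag),
                error = function(e) conditionMessage(e))
```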
S3 plotting function for 'CVIoutput' objects
## S3 method for class 'CVIoutput'
plot(x, ...)
x |
a CVIoutput object |
... |
additional arguments |
A ggplot object representing visualisation
Calculate a combination of matrix multiplications
quadratic_form_diag(A, B)
A |
matrix |
B |
matrix |
diag(A %*% B %*% t(A))
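The diagonal of such a product holds one quadratic form per row of A, so it can be computed without building the full N x N matrix. The base R sketch below shows the identity on small illustrative inputs; it is a reference computation, not the package's C++ code.

```r
# Each diagonal entry is a_i' B a_i, where a_i is the i-th row of A.
A <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 3)   # 3 observations, 2 dimensions
B <- matrix(c(2, 1, 1, 3), nrow = 2)

q_ref  <- diag(A %*% B %*% t(A))             # forms a 3 x 3 matrix first
q_fast <- rowSums((A %*% B) * A)             # only row-wise products are formed
```

With N rows the first form needs O(N^2) memory for the intermediate matrix, while the second needs only O(N D).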
CVI implementation for one set of initial parameters
run_single(
  config,
  X,
  N,
  D,
  T0,
  prior_shape_alpha,
  prior_rate_alpha,
  post_shape_alpha,
  post_rate_alpha,
  prior_mean_eta,
  post_mean_eta,
  fixed_variance,
  covariance_type,
  cluster_specific_covariance,
  variance_prior_type,
  maxit,
  varargs
)
config |
List of inputs that are generated if not user-provided |
X |
the data matrix |
N |
samples of X |
D |
dimensions of X |
T0 |
variational clusters |
prior_shape_alpha |
shape parameter of Gamma prior for the DP concentration parameter alpha. Default is 0.001 |
prior_rate_alpha |
rate parameter of Gamma prior for the DP concentration parameter alpha. Default is 0.001 |
post_shape_alpha |
initial value for posterior update of shape parameter for alpha. Default is 0.001 |
post_rate_alpha |
initial value for posterior update of rate parameter for alpha. Default is 0.001 |
prior_mean_eta |
mean vector of MVN prior for the DP mean parameters. Default is zero vector |
post_mean_eta |
initial value of posterior update for the DP mean parameter |
fixed_variance |
covariance matrix of the data is considered known (fixed) or unknown. |
covariance_type |
covariance matrix is considered diagonal or full. |
cluster_specific_covariance |
covariance matrix is specific to each cluster or shared across all clusters. |
variance_prior_type |
For unknown and full covariance matrix, choice of matrix prior is either Inverse-Wishart ('IW') or Cholesky-decomposed ('decomposed'). For unknown, full and cluster-specific covariance matrix, choice of matrix prior is either Inverse-Wishart ('IW'), element-wise Gamma and Laplace distributed ('sparse') or element-wise Gamma and Normal distributed ('off-diagonal normal') |
maxit |
Maximum number of iterations for variational updates |
varargs |
List of case specific parameters |
a list with the following elements:
alpha: posterior DP concentration parameter
Cluster number: number of clusters from posterior probability allocation matrix
Cluster Proportion: cluster proportions from posterior probability allocation matrix
log Probability matrix: log of posterior probability allocation matrix
ELBO: Optimisation of the ELBO function
Iterations: Number of iterations required for convergence
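Cluster labels, the number of occupied clusters and their proportions can be derived from a log probability allocation matrix as sketched below. Using hard (maximum probability) assignments is an assumption about how these summaries are formed, and the matrix is made up for illustration.

```r
# Illustrative log allocation matrix: 4 observations, 3 variational clusters.
logP <- log(matrix(c(0.90, 0.05, 0.05,
                     0.80, 0.10, 0.10,
                     0.10, 0.85, 0.05,
                     0.20, 0.70, 0.10), nrow = 4, byrow = TRUE))

# Hard assignment: the most probable cluster for each observation.
labels <- max.col(logP, ties.method = "first")

# Number of occupied clusters and the share of observations in each cluster.
n_clusters <- length(unique(labels))
counts <- table(factor(labels, levels = seq_len(ncol(logP))))
proportions <- as.numeric(counts) / nrow(logP)
```

The third cluster receives no observations here, which is the usual way a truncated DP approximation ends up with fewer occupied clusters than variational components.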
Calculate the sum, squared sum and log sum of off-diagonal vector elements from the covariance array
sparse_cov_op(X, P, inv_C0, L1)
X |
data matrix |
P |
probability allocation matrix |
inv_C0 |
matrix corresponding diagonal elements of the cluster precision matrices |
L1 |
cluster mean matrix |
likelihood term calculation in elbo
A C++ alternative of sweep() function from base R
sweep_3D(A, R, dims, n_threads = 4L)
A |
a 3D array |
R |
a vector |
dims |
dimensions in 3D |
n_threads |
number of threads |
sweep(A, 3, R, "*")
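The base R operation that this function replaces scales each slice along the third dimension by the matching element of the vector. A small example with made-up values:

```r
# Scale slice k of a 3D array by R[k], using base R only.
A <- array(1:24, dim = c(2, 3, 4))
R <- c(1, 10, 100, 1000)

scaled <- sweep(A, 3, R, "*")

# Same result via recycling: repeat each weight once per element of a slice.
scaled2 <- A * rep(R, each = prod(dim(A)[1:2]))
```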
Calculate a combination of matrix multiplications
t_mat_mult(A, B, C)
A |
matrix |
B |
matrix |
C |
matrix |
t(A) %*% B %*% C