Apply ResNMTF
apply_resnmtf.Rd
Apply ResNMTF to data over a range of bicluster numbers and select the optimal number, with optional stability analysis.
Usage
apply_resnmtf(
  data,
  init_f = NULL,
  init_s = NULL,
  init_g = NULL,
  k_vec = NULL,
  phi = NULL,
  xi = NULL,
  psi = NULL,
  n_iters = NULL,
  k_min = 3,
  k_max = 8,
  distance = "euclidean",
  num_repeats = 5,
  no_clusts = FALSE,
  sample_rate = 0.9,
  n_stability = 5,
  stability = TRUE,
  stab_thres = 0.4,
  remove_unstable = TRUE,
  use_parallel = TRUE
)
Arguments
- data
list of n_v matrices, data to be factorised. If only one view is supplied, it can be given as a single matrix.
- init_f
list of matrices, initialisation for F matrices
- init_s
list of matrices, initialisation for S matrices
- init_g
list of matrices, initialisation for G matrices
- k_vec
vector of integers, number of clusters to consider in each view, default is NULL
- phi
n_v x n_v matrix, default is NULL, restriction matrices for F
- xi
n_v x n_v matrix, default is NULL, restriction matrices for S
- psi
n_v x n_v matrix, default is NULL, restriction matrices for G
- n_iters
integer, default is NULL, number of iterations to run for; if NULL, runs until convergence
- k_min
positive integer, default is 3, smallest value of k to be considered initially
- k_max
positive integer, default is 8, largest value of k to be considered initially
- distance
string, default is "euclidean", distance metric to use within the bisilhouette score
- num_repeats
integer, default is 5, number of repeats to use within stability analysis
- no_clusts
boolean, default is FALSE, whether to return only the factorisation
- sample_rate
numeric, default is 0.9, proportion of data to sample for stability analysis
- n_stability
integer, default is 5, number of times to repeat stability analysis
- stability
boolean, default is TRUE, whether to perform stability analysis
- stab_thres
numeric, default is 0.4, threshold for stability analysis
- remove_unstable
boolean, default is TRUE, whether to remove unstable clusters or not
- use_parallel
boolean, default is TRUE, whether to use parallelisation; not applicable on Windows or Linux machines
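As an illustration of the restriction arguments, the sketch below builds a `phi` matrix for two views. It rests on an assumption that this page does not state: that a non-zero entry in position (1, 2) links the F matrices of views 1 and 2, with its size setting the strength of the restriction. The value 200 is arbitrary, and the call to `apply_resnmtf()` is left commented out.

```r
# Two views, so each restriction matrix is 2 x 2 (n_v x n_v).
n_v <- 2
phi <- matrix(0, nrow = n_v, ncol = n_v)

# Assumption: entry (1, 2) restricts the F matrices of views 1 and 2,
# and its size sets the strength of that restriction (200 is arbitrary).
phi[1, 2] <- 200

# Hypothetical call, with `data` a list of two matrices:
# results <- apply_resnmtf(data, phi = phi, k_max = 4)
print(phi)
```

The same pattern would apply to `xi` and `psi` for the S and G matrices, under the same assumption.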
Value
list of results from ResNMTF, containing the following:
- output_f: list of matrices, F matrices
- output_s: list of matrices, S matrices
- output_g: list of matrices, G matrices
- Error: numeric, mean error
- All_Error: numeric, all errors
- bisil: numeric, bisilhouette score
- row_clusters: list of matrices, row clusters
- col_clusters: list of matrices, column clusters
- lambda: list of vectors, lambda vectors
- mu: list of vectors, mu vectors
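To show how the returned list might be inspected, the sketch below works on a mock result that reuses the element names listed above. The contents are made up, `summarise_biclusters()` is a hypothetical helper rather than part of the package, and it assumes each cluster matrix has one column per bicluster with non-zero entries marking membership.

```r
# Mock result using element names from the Value section.
# The contents are invented purely to show the access pattern.
mock_results <- list(
  row_clusters = list(cbind(c(1, 1, 0, 0), c(0, 0, 1, 1))),
  col_clusters = list(cbind(c(1, 0, 0), c(0, 1, 1))),
  bisil = 0.8,
  Error = 0.1
)

# Hypothetical helper: number of biclusters and their sizes in each view,
# assuming one column per bicluster and non-zero entries marking membership.
summarise_biclusters <- function(results) {
  lapply(seq_along(results$row_clusters), function(v) {
    rows <- results$row_clusters[[v]]
    cols <- results$col_clusters[[v]]
    list(
      n_biclusters = ncol(rows),
      rows_per_bicluster = colSums(rows != 0),
      cols_per_bicluster = colSums(cols != 0)
    )
  })
}

view_summary <- summarise_biclusters(mock_results)
str(view_summary)
```

With a real result, `results$bisil` and `results$Error` could be read off directly in the same way.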
Examples
# Binary membership matrices: 100 rows and 50 columns, 3 biclusters
row_clusters <- cbind(
  rbinom(100, 1, 0.5),
  rbinom(100, 1, 0.5),
  rbinom(100, 1, 0.5)
)
col_clusters <- cbind(
  rbinom(50, 1, 0.4),
  rbinom(50, 1, 0.4),
  rbinom(50, 1, 0.4)
)
n_col <- 50
# Two views with the same bicluster structure but different noise levels
data <- list(
  row_clusters %*% diag(c(5, 5, 5)) %*% t(col_clusters) +
    abs(matrix(rnorm(100 * n_col), 100, n_col)),
  row_clusters %*% diag(c(5, 5, 5)) %*% t(col_clusters) +
    abs(0.01 * matrix(rnorm(100 * n_col), 100, n_col))
)
# Consider up to 4 biclusters (k_min defaults to 3)
apply_resnmtf(data, k_max = 4)
#> Error in loadNamespace(x): there is no package called ‘bisilhouette’
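The error above indicates that the example was run without the bisilhouette package installed. A minimal sketch of guarding the call is given below; it uses only base R checks, passes a single view as a matrix, and sets `stability = FALSE` to skip the stability analysis. How the function behaves on this simulated data is not shown here, so no output is given.

```r
# Simulate one small non-negative view, mirroring the example above.
set.seed(1)
row_clusters <- cbind(rbinom(100, 1, 0.5), rbinom(100, 1, 0.5))
col_clusters <- cbind(rbinom(50, 1, 0.4), rbinom(50, 1, 0.4))
data <- row_clusters %*% diag(c(5, 5)) %*% t(col_clusters) +
  abs(matrix(rnorm(100 * 50), 100, 50))

# Only call apply_resnmtf() when it and the bisilhouette package are available.
can_run <- requireNamespace("bisilhouette", quietly = TRUE) &&
  exists("apply_resnmtf", mode = "function")

if (can_run) {
  # stability = FALSE skips the stability analysis
  results <- apply_resnmtf(data, k_max = 4, stability = FALSE)
} else {
  message("Skipping: apply_resnmtf() or the bisilhouette package is unavailable.")
}
```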