Type: | Package |
Title: | Simultaneous Clustering and Factorial Decomposition of Three-Way Datasets |
Version: | 0.0.3 |
Maintainer: | Prosper Ablordeppey <pablordeppey@ua.pt> |
Description: | Implements two iterative techniques, T3Clus and 3FKMeans, that simultaneously cluster objects and perform a factorial dimensionality reduction of variables and occasions on three-mode datasets, developed by Vichi et al. (2007) <doi:10.1007/s00357-007-0006-x>. Also provides CT3Clus, a convex combination of these two simultaneous procedures based on a hyperparameter alpha (alpha in [0,1], with 3FKMeans for alpha=0 and T3Clus for alpha=1), also developed by Vichi et al. (2007) <doi:10.1007/s00357-007-0006-x>. Furthermore, implements the traditional tandem procedures corresponding to T3Clus (TWCFTA) and 3FKMeans (TWFCTA): sequential clustering followed by factorial decomposition (TWCFTA), and vice versa (TWFCTA), proposed by P. Arabie and L. Hubert (1996) <doi:10.1007/978-3-642-79999-0_1>. |
License: | GPL-3 |
Encoding: | UTF-8 |
RoxygenNote: | 7.2.1 |
Depends: | R (≥ 2.10) |
Imports: | methods, stats, Rdpack |
RdMacros: | Rdpack |
Suggests: | testthat (≥ 3.0.0) |
Config/testthat/edition: | 3 |
NeedsCompilation: | no |
Packaged: | 2022-10-17 16:54:46 UTC; iCETEE |
Author: | Prosper Ablordeppey |
Repository: | CRAN |
Date/Publication: | 2022-10-18 06:40:05 UTC |
Simultaneous results attributes
Description
Simultaneous results attributes
Slots
U_i_g0
matrix. Initial object membership function matrix
B_j_q0
matrix. Initial factor/component matrix for the variables
C_k_r0
matrix. Initial factor/component matrix for the occasions
U_i_g
matrix. Final/updated object membership function matrix
B_j_q
matrix. Final/updated factor/component matrix for the variables
C_k_r
matrix. Final/updated factor/component matrix for the occasions
Y_g_qr
matrix. Derived centroids in the reduced space (data matrix)
X_i_jk_scaled
matrix. Standardized dataset matrix
BestTimeElapsed
numeric. Execution time for the best iterate
BestLoop
numeric. Loop that obtained the best iterate
BestIteration
numeric. Iteration yielding the best results
Converged
numeric. Flag to check if algorithm converged for the K-means
nConverges
numeric. Number of loops that converged for the K-means
TSS_full
numeric. Total deviance in the full-space
BSS_full
numeric. Between deviance in the full-space
RSS_full
numeric. Residual deviance in the full-space
PF_full
numeric. PseudoF in the full-space
TSS_reduced
numeric. Total deviance in the reduced-space
BSS_reduced
numeric. Between deviance in the reduced-space
RSS_reduced
numeric. Residual deviance in the reduced-space
PF_reduced
numeric. PseudoF in the reduced-space
PF
numeric. Weighted PseudoF score
Labels
integer. Object cluster assignments
Fs
numeric. Objective function values for the KM best iterate
Enorm
numeric. Average L2 norm of the residuals.
Tandem results attributes
Description
Tandem results attributes
Slots
U_i_g0
matrix. Initial object membership function matrix.
B_j_q0
matrix. Initial factor/component matrix for the variables.
C_k_r0
matrix. Initial factor/component matrix for the occasions.
U_i_g
matrix. Final/updated object membership function matrix.
B_j_q
matrix. Final/updated factor/component matrix for the variables.
C_k_r
matrix. Final/updated factor/component matrix for the occasions.
Y_g_qr
matrix. Derived centroids in the reduced space (data matrix).
X_i_jk_scaled
matrix. Standardized dataset matrix.
BestTimeElapsed
numeric. Execution time for the best iterate.
BestLoop
numeric. Loop that obtained the best iterate.
BestKmIteration
numeric. Number of iterations until the best iterate for the K-means.
BestFaIteration
numeric. Number of iterations until the best iterate for the FA.
FaConverged
numeric. Flag to check if algorithm converged for the Factor Decomposition.
KmConverged
numeric. Flag to check if algorithm converged for the K-means.
nKmConverges
numeric. Number of loops that converged for the K-means.
nFaConverges
numeric. Number of loops that converged for the Factor decomposition.
TSS_full
numeric. Total deviance in the full-space.
BSS_full
numeric. Between deviance in the full-space.
RSS_full
numeric. Residual deviance in the full-space.
PF_full
numeric. PseudoF in the full-space.
TSS_reduced
numeric. Total deviance in the reduced-space.
BSS_reduced
numeric. Between deviance in the reduced-space.
RSS_reduced
numeric. Residual deviance in the reduced-space.
PF_reduced
numeric. PseudoF in the reduced-space.
PF
numeric. Actual PseudoF value to obtain best loop.
Labels
integer. Object cluster assignments.
FsKM
numeric. Objective function values for the KM best iterate.
FsFA
numeric. Objective function values for the FA best iterate.
Enorm
numeric. Average L2 norm of the residuals.
3FKMeans Model
Description
Implements the simultaneous version of TWFCTA.
Usage
fit.3fkmeans(model, X_i_jk, full_tensor_shape, reduced_tensor_shape)
## S4 method for signature 'simultaneous'
fit.3fkmeans(model, X_i_jk, full_tensor_shape, reduced_tensor_shape)
Arguments
model |
Initialized simultaneous model. |
X_i_jk |
Matricized tensor along mode-1 (I objects). |
full_tensor_shape |
Dimensions of the tensor in full-space. |
reduced_tensor_shape |
Dimensions of tensor in the reduced-space. |
Details
The procedure is the simultaneous counterpart of the sequential TWFCTA model. The model finds B_j_q and C_k_r such that the within-clusters deviance of the component scores is minimized.
Value
Output attributes accessible via the '@' operator.
U_i_g0 - Initial object membership function matrix
B_j_q0 - Initial factor/component matrix for the variables
C_k_r0 - Initial factor/component matrix for the occasions
U_i_g - Final/updated object membership function matrix
B_j_q - Final/updated factor/component matrix for the variables
C_k_r - Final/updated factor/component matrix for the occasions
Y_g_qr - Derived centroids in the reduced space (data matrix)
X_i_jk_scaled - Standardized dataset matrix
BestTimeElapsed - Execution time for the best iterate
BestLoop - Loop that obtained the best iterate
BestIteration - Iteration yielding the best results
Converged - Flag to check if algorithm converged for the K-means
nConverges - Number of loops that converged for the K-means
TSS_full - Total deviance in the full-space
BSS_full - Between deviance in the full-space
RSS_full - Residual deviance in the full-space
PF_full - PseudoF in the full-space
TSS_reduced - Total deviance in the reduced-space
BSS_reduced - Between deviance in the reduced-space
RSS_reduced - Residual deviance in the reduced-space
PF_reduced - PseudoF in the reduced-space
PF - Weighted PseudoF score
Labels - Object cluster assignments
Fs - Objective function values for the KM best iterate
Enorm - Average L2 norm of the residuals.
References
Tucker L (1966). “Some mathematical notes on three-mode factor analysis.” Psychometrika, 31(3), 279-311. doi:10.1007/BF02289464, https://ideas.repec.org/a/spr/psycho/v31y1966i3p279-311.html. Vichi M, Kiers HAL (2001). “Factorial k-means analysis for two-way data.” Computational Statistics and Data Analysis, 37(1), 49-64. https://EconPapers.repec.org/RePEc:eee:csdana:v:37:y:2001:i:1:p:49-64. Vichi M, Rocci R, Kiers H (2007). “Simultaneous Component and Clustering Models for Three-way Data: Within and Between Approaches.” Journal of Classification, 24, 71-98. doi:10.1007/s00357-007-0006-x.
Examples
X_i_jk = generate_dataset()$X_i_jk
model = simultaneous()
tfkmeans = fit.3fkmeans(model, X_i_jk, c(8,5,4), c(3,3,2))
CT3Clus Model
Description
Implements simultaneous T3Clus and 3FKMeans integrating an alpha value between 0 and 1 inclusive for a weighted result.
Usage
fit.ct3clus(
model,
X_i_jk,
full_tensor_shape,
reduced_tensor_shape,
alpha = 0.5
)
## S4 method for signature 'simultaneous'
fit.ct3clus(
model,
X_i_jk,
full_tensor_shape,
reduced_tensor_shape,
alpha = 0.5
)
Arguments
model |
Initialized simultaneous model. |
X_i_jk |
Matricized tensor along mode-1 (I objects). |
full_tensor_shape |
Dimensions of the tensor in full space. |
reduced_tensor_shape |
Dimensions of tensor in the reduced space. |
alpha |
Hyperparameter with 0 <= alpha <= 1. Model is T3Clus when alpha=1 and 3FKMeans when alpha=0. |
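The role of alpha can be pictured as a convex weight on two criterion values. The sketch below is only an illustration in base R: `combine_criteria` and its two inputs are hypothetical names, and the package's actual objective may be parameterized differently.

```r
# Hypothetical helper (not part of the package): convex combination of a
# T3Clus-type criterion value and a 3FKMeans-type criterion value.
combine_criteria <- function(f_t3clus, f_3fkmeans, alpha = 0.5) {
  stopifnot(alpha >= 0, alpha <= 1)
  alpha * f_t3clus + (1 - alpha) * f_3fkmeans
}

combine_criteria(10, 4, alpha = 1)    # only the T3Clus term contributes
combine_criteria(10, 4, alpha = 0)    # only the 3FKMeans term contributes
combine_criteria(10, 4, alpha = 0.5)  # equal weighting of both terms
```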
Value
Output attributes accessible via the '@' operator.
U_i_g0 - Initial object membership function matrix
B_j_q0 - Initial factor/component matrix for the variables
C_k_r0 - Initial factor/component matrix for the occasions
U_i_g - Final/updated object membership function matrix
B_j_q - Final/updated factor/component matrix for the variables
C_k_r - Final/updated factor/component matrix for the occasions
Y_g_qr - Derived centroids in the reduced space (data matrix)
X_i_jk_scaled - Standardized dataset matrix
BestTimeElapsed - Execution time for the best iterate
BestLoop - Loop that obtained the best iterate
BestIteration - Iteration yielding the best results
Converged - Flag to check if algorithm converged for the K-means
nConverges - Number of loops that converged for the K-means
TSS_full - Total deviance in the full-space
BSS_full - Between deviance in the full-space
RSS_full - Residual deviance in the full-space
PF_full - PseudoF in the full-space
TSS_reduced - Total deviance in the reduced-space
BSS_reduced - Between deviance in the reduced-space
RSS_reduced - Residual deviance in the reduced-space
PF_reduced - PseudoF in the reduced-space
PF - Weighted PseudoF score
Labels - Object cluster assignments
Fs - Objective function values for the KM best iterate
Enorm - Average L2 norm of the residuals.
References
Tucker L (1966). “Some mathematical notes on three-mode factor analysis.” Psychometrika, 31(3), 279-311. doi:10.1007/BF02289464, https://ideas.repec.org/a/spr/psycho/v31y1966i3p279-311.html. Rocci R, Vichi M (2005). “Three-Mode Component Analysis with Crisp or Fuzzy Partition of Units.” Psychometrika, 70, 715-736. doi:10.1007/s11336-001-0926-z. Vichi M, Kiers HAL (2001). “Factorial k-means analysis for two-way data.” Computational Statistics and Data Analysis, 37(1), 49-64. https://EconPapers.repec.org/RePEc:eee:csdana:v:37:y:2001:i:1:p:49-64. Vichi M, Rocci R, Kiers H (2007). “Simultaneous Component and Clustering Models for Three-way Data: Within and Between Approaches.” Journal of Classification, 24, 71-98. doi:10.1007/s00357-007-0006-x.
See Also
fit.t3clus
fit.3fkmeans
simultaneous
Examples
X_i_jk = generate_dataset()$X_i_jk
model = simultaneous()
ct3clus = fit.ct3clus(model, X_i_jk, c(8,5,4), c(3,3,2), alpha=0.5)
T3Clus Model
Description
Implements the simultaneous version of TWCFTA.
Usage
fit.t3clus(model, X_i_jk, full_tensor_shape, reduced_tensor_shape)
## S4 method for signature 'simultaneous'
fit.t3clus(model, X_i_jk, full_tensor_shape, reduced_tensor_shape)
Arguments
model |
Initialized simultaneous model. |
X_i_jk |
Matricized tensor along mode-1 (I objects). |
full_tensor_shape |
Dimensions of the tensor in full-space. |
reduced_tensor_shape |
Dimensions of tensor in the reduced-space. |
Details
The procedure is the simultaneous counterpart of the sequential TWCFTA model. The model finds B_j_q and C_k_r such that the between-clusters deviance of the component scores is maximized.
Value
Output attributes accessible via the '@' operator.
U_i_g0 - Initial object membership function matrix
B_j_q0 - Initial factor/component matrix for the variables
C_k_r0 - Initial factor/component matrix for the occasions
U_i_g - Final/updated object membership function matrix
B_j_q - Final/updated factor/component matrix for the variables
C_k_r - Final/updated factor/component matrix for the occasions
Y_g_qr - Derived centroids in the reduced space (data matrix)
X_i_jk_scaled - Standardized dataset matrix
BestTimeElapsed - Execution time for the best iterate
BestLoop - Loop that obtained the best iterate
BestIteration - Iteration yielding the best results
Converged - Flag to check if algorithm converged for the K-means
nConverges - Number of loops that converged for the K-means
TSS_full - Total deviance in the full-space
BSS_full - Between deviance in the full-space
RSS_full - Residual deviance in the full-space
PF_full - PseudoF in the full-space
TSS_reduced - Total deviance in the reduced-space
BSS_reduced - Between deviance in the reduced-space
RSS_reduced - Residual deviance in the reduced-space
PF_reduced - PseudoF in the reduced-space
PF - Weighted PseudoF score
Labels - Object cluster assignments
Fs - Objective function values for the KM best iterate
Enorm - Average L2 norm of the residuals.
References
Tucker L (1966). “Some mathematical notes on three-mode factor analysis.” Psychometrika, 31(3), 279-311. doi:10.1007/BF02289464, https://ideas.repec.org/a/spr/psycho/v31y1966i3p279-311.html. Rocci R, Vichi M (2005). “Three-Mode Component Analysis with Crisp or Fuzzy Partition of Units.” Psychometrika, 70, 715-736. doi:10.1007/s11336-001-0926-z. Vichi M, Rocci R, Kiers H (2007). “Simultaneous Component and Clustering Models for Three-way Data: Within and Between Approaches.” Journal of Classification, 24, 71-98. doi:10.1007/s00357-007-0006-x.
Examples
X_i_jk = generate_dataset()$X_i_jk
model = simultaneous()
t3clus = fit.t3clus(model, X_i_jk, c(8,5,4), c(3,3,2))
TWCFTA model
Description
Implements K-means clustering and afterwards factorial reduction in a sequential fashion.
Usage
fit.twcfta(model, X_i_jk, full_tensor_shape, reduced_tensor_shape)
## S4 method for signature 'tandem'
fit.twcfta(model, X_i_jk, full_tensor_shape, reduced_tensor_shape)
Arguments
model |
Initialized tandem model. |
X_i_jk |
Matricized tensor along mode-1 (I objects). |
full_tensor_shape |
Dimensions of the tensor in full space. |
reduced_tensor_shape |
Dimensions of tensor in the reduced space. |
Details
The procedure requires sequential clustering and factorial decomposition.
The K-means clustering algorithm is initially applied to the matricized tensor X_i_jk to obtain the centroids matrix X_g_jk and the membership matrix U_i_g.
The Tucker2 decomposition technique is then implemented on the centroids matrix X_g_jk to yield the core centroids matrix Y_g_qr and the component weights matrices B_j_q and C_k_r.
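The two steps above can be sketched in base R. This is a simplified illustration, not the package's implementation: it uses `stats::kmeans` for the clustering and a one-pass, SVD-based approximation in place of an iterative Tucker2 fit. The random stand-in data and the assumption that the columns of X_i_jk are ordered with j varying fastest within k are ours.

```r
set.seed(1)
I <- 8; J <- 5; K <- 4; G <- 3; Q <- 3; R <- 2
X_i_jk <- matrix(rnorm(I * J * K), nrow = I)   # stand-in data (assumption)

# Step 1: K-means on the rows of the matricized tensor.
km <- kmeans(X_i_jk, centers = G, nstart = 5)
X_g_jk <- km$centers                            # G x (J*K) centroids
U_i_g <- diag(G)[km$cluster, , drop = FALSE]    # I x G binary membership

# Step 2: one-pass SVD approximation of Tucker2 on the centroids.
X_g_j_k <- array(X_g_jk, dim = c(G, J, K))
X_j_kg <- matrix(aperm(X_g_j_k, c(2, 3, 1)), nrow = J)  # mode-2 unfolding
X_k_gj <- matrix(aperm(X_g_j_k, c(3, 1, 2)), nrow = K)  # mode-3 unfolding
B_j_q <- svd(X_j_kg)$u[, 1:Q, drop = FALSE]     # variable components
C_k_r <- svd(X_k_gj)$u[, 1:R, drop = FALSE]     # occasion components

# Core centroids: project the centroids onto both component spaces.
Y_g_qr <- X_g_jk %*% kronecker(C_k_r, B_j_q)    # G x (Q*R)
```

Because the components are extracted only after the partition is fixed, the clustering step never sees the reduced space, which is the limitation the tandem approach is known for.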
Value
Output attributes accessible via the '@' operator.
U_i_g0 - Initial object membership function matrix.
B_j_q0 - Initial factor/component matrix for the variables.
C_k_r0 - Initial factor/component matrix for the occasions.
U_i_g - Final/updated object membership function matrix.
B_j_q - Final/updated factor/component matrix for the variables.
C_k_r - Final/updated factor/component matrix for the occasions.
Y_g_qr - Derived centroids in the reduced space (data matrix).
X_i_jk_scaled - Standardized dataset matrix.
BestTimeElapsed - Execution time for the best iterate.
BestLoop - Loop that obtained the best iterate.
BestKmIteration - Number of iterations until the best iterate for the K-means.
BestFaIteration - Number of iterations until the best iterate for the FA.
FaConverged - Flag to check if algorithm converged for the Factor Decomposition.
KmConverged - Flag to check if algorithm converged for the K-means.
nKmConverges - Number of loops that converged for the K-means.
nFaConverges - Number of loops that converged for the Factor decomposition.
TSS_full - Total deviance in the full-space.
BSS_full - Between deviance in the full-space.
RSS_full - Residual deviance in the full-space.
PF_full - PseudoF in the full-space.
TSS_reduced - Total deviance in the reduced-space.
BSS_reduced - Between deviance in the reduced-space.
RSS_reduced - Residual deviance in the reduced-space.
PF_reduced - PseudoF in the reduced-space.
PF - Actual PseudoF value to obtain best loop.
Labels - Object cluster assignments.
FsKM - Objective function values for the KM best iterate.
FsFA - Objective function values for the FA best iterate.
Enorm - Average L2 norm of the residuals.
Note
This procedure is useful to further interpret the between-clusters variability of the data and to understand which variables and/or occasions contribute most to discriminating the clusters. However, applying this technique could lead to the masking of variables that are not informative of the clustering structure. Since the Tucker2 model is applied after the clustering, it cannot help select the most relevant information for the clustering in the dataset.
References
Arabie P, Hubert L (1996). “Advances in Cluster Analysis Relevant to Marketing Research.” In Gaul W, Pfeifer D (eds.), From Data to Knowledge, 3–19. Tucker L (1966). “Some mathematical notes on three-mode factor analysis.” Psychometrika, 31(3), 279-311. doi:10.1007/BF02289464, https://ideas.repec.org/a/spr/psycho/v31y1966i3p279-311.html.
Examples
X_i_jk = generate_dataset()$X_i_jk
model = tandem()
twcfta = fit.twcfta(model, X_i_jk, c(8,5,4), c(3,3,2))
TWFCTA model
Description
Implements factorial reduction and then K-means clustering in a sequential fashion.
Usage
fit.twfcta(model, X_i_jk, full_tensor_shape, reduced_tensor_shape)
## S4 method for signature 'tandem'
fit.twfcta(model, X_i_jk, full_tensor_shape, reduced_tensor_shape)
Arguments
model |
Initialized tandem model. |
X_i_jk |
Matricized tensor along mode-1 (I objects). |
full_tensor_shape |
Dimensions of the tensor in full space. |
reduced_tensor_shape |
Dimensions of tensor in the reduced space. |
Details
The procedure implements sequential factorial decomposition and clustering.
The technique performs Tucker2 decomposition on the X_i_jk matrix to obtain the matrix of component scores Y_i_qr with component weights matrices B_j_q and C_k_r.
The K-means clustering algorithm is then applied to the component scores matrix Y_i_qr to obtain the desired core centroids matrix Y_g_qr and its associated stochastic membership function matrix U_i_g.
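The sequence described above can be sketched in base R. As a simplification, the component matrices come from a single SVD of each unfolding rather than an iterative Tucker2 fit, and the clustering uses `stats::kmeans`; the stand-in data and the column ordering of X_i_jk (j fastest within k) are assumptions, not the package's documented behaviour.

```r
set.seed(1)
I <- 8; J <- 5; K <- 4; G <- 3; Q <- 3; R <- 2
X_i_jk <- matrix(rnorm(I * J * K), nrow = I)   # stand-in data (assumption)

# Step 1: component weights from the mode-2 and mode-3 unfoldings.
X_i_j_k <- array(X_i_jk, dim = c(I, J, K))
X_j_ki <- matrix(aperm(X_i_j_k, c(2, 3, 1)), nrow = J)
X_k_ij <- matrix(aperm(X_i_j_k, c(3, 1, 2)), nrow = K)
B_j_q <- svd(X_j_ki)$u[, 1:Q, drop = FALSE]
C_k_r <- svd(X_k_ij)$u[, 1:R, drop = FALSE]

# Component scores of the objects in the reduced space.
Y_i_qr <- X_i_jk %*% kronecker(C_k_r, B_j_q)   # I x (Q*R)

# Step 2: K-means on the component scores.
km <- kmeans(Y_i_qr, centers = G, nstart = 5)
Y_g_qr <- km$centers                            # G x (Q*R) core centroids
U_i_g <- diag(G)[km$cluster, , drop = FALSE]    # I x G binary membership
```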
Value
Output attributes accessible via the '@' operator.
U_i_g0 - Initial object membership function matrix.
B_j_q0 - Initial factor/component matrix for the variables.
C_k_r0 - Initial factor/component matrix for the occasions.
U_i_g - Final/updated object membership function matrix.
B_j_q - Final/updated factor/component matrix for the variables.
C_k_r - Final/updated factor/component matrix for the occasions.
Y_g_qr - Derived centroids in the reduced space (data matrix).
X_i_jk_scaled - Standardized dataset matrix.
BestTimeElapsed - Execution time for the best iterate.
BestLoop - Loop that obtained the best iterate.
BestKmIteration - Number of iterations until the best iterate for the K-means.
BestFaIteration - Number of iterations until the best iterate for the FA.
FaConverged - Flag to check if algorithm converged for the Factor Decomposition.
KmConverged - Flag to check if algorithm converged for the K-means.
nKmConverges - Number of loops that converged for the K-means.
nFaConverges - Number of loops that converged for the Factor decomposition.
TSS_full - Total deviance in the full-space.
BSS_full - Between deviance in the full-space.
RSS_full - Residual deviance in the full-space.
PF_full - PseudoF in the full-space.
TSS_reduced - Total deviance in the reduced-space.
BSS_reduced - Between deviance in the reduced-space.
RSS_reduced - Residual deviance in the reduced-space.
PF_reduced - PseudoF in the reduced-space.
PF - Actual PseudoF value to obtain best loop.
Labels - Object cluster assignments.
FsKM - Objective function values for the KM best iterate.
FsFA - Objective function values for the FA best iterate.
Enorm - Average L2 norm of the residuals.
Note
The technique helps interpret the within-clusters variability of the data. The Tucker2 model tends to explain most of the total variation in the dataset. Hence, the variance of variables that do not contribute to the clustering structure in the dataset is also included.
The Tucker2 dimensions may still mask some essential clustering structures in the dataset.
References
Arabie P, Hubert L (1996). “Advances in Cluster Analysis Relevant to Marketing Research.” In Gaul W, Pfeifer D (eds.), From Data to Knowledge, 3–19. Tucker L (1966). “Some mathematical notes on three-mode factor analysis.” Psychometrika, 31(3), 279-311. doi:10.1007/BF02289464, https://ideas.repec.org/a/spr/psycho/v31y1966i3p279-311.html.
Examples
X_i_jk = generate_dataset()$X_i_jk
model = tandem()
twfCta = fit.twfcta(model, X_i_jk, c(8,5,4), c(3,3,2))
Folding Matrix to Tensor by Mode.
Description
X_i_jk => X_i_j_k, X_j_ki => X_i_j_k, X_k_ij => X_i_j_k
Usage
fold(X, mode, shape)
Arguments
X |
Data matrix to fold. |
mode |
Mode of operation. |
shape |
Dimension of original tensor. |
Value
X_i_j_k Three-mode tensor.
Examples
X_i_jk = generate_dataset()$X_i_jk
X_i_j_k = fold(X_i_jk, mode=1, shape=c(I=8,J=5,K=4)) # X_i_j_k
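For intuition, mode-1 folding can be sketched in base R with `array()`. The helper `fold_mode1` is hypothetical, and the sketch assumes the columns of X_i_jk are ordered with j varying fastest within k; the package's own ordering may differ.

```r
# Hypothetical stand-in for mode-1 folding (not the package's fold()).
# Assumes column (k - 1) * J + j of the matrix holds variable j at occasion k.
fold_mode1 <- function(X_i_jk, shape) {
  array(X_i_jk, dim = unname(shape))
}

X <- matrix(seq_len(8 * 5 * 4), nrow = 8)
X_i_j_k <- fold_mode1(X, c(I = 8, J = 5, K = 4))
dim(X_i_j_k)   # I, J, K
```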
Three-Mode Dataset Generator for Simulations
Description
Generates a synthetic dataset of I objects in G clusters, measured on J variables over K occasions, with additive noise.
Usage
generate_dataset(
I = 8,
J = 5,
K = 4,
G = 3,
Q = 3,
R = 2,
centroids_spread = c(0, 1),
noise_mean = 0,
noise_stdev = 0.5,
seed = NULL
)
Arguments
I |
Number of objects. |
J |
Number of variables per occasion. |
K |
Number of occasions. |
G |
Number of clusters. |
Q |
Number of factors for the variables. |
R |
Number of factors for the occasions. |
centroids_spread |
Interval from which to uniformly pick the centroids. |
noise_mean |
Mean of noise to generate. |
noise_stdev |
Noise effect level/spread/standard deviation. |
seed |
Seed for random sequence generation. |
Value
Z_i_jk: Component scores in the full space.
E_i_jk: Generated noise at the given noise level.
X_i_jk: Dataset with noise level set to noise_stdev specified.
Y_g_qr: Centroids matrix in the reduced space.
U_i_g: Stochastic membership function matrix.
B_j_q: Variables component scores matrix.
C_k_r: Occasions component scores matrix.
Examples
generate_dataset(seed=0)
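A base R sketch of how such a dataset could be assembled from the returned pieces is given below. It assumes the structure X = U Y (C ⊗ B)' + E with column-wise orthonormal B and C, which is our reading of the listed outputs rather than a description of the package internals.

```r
set.seed(0)
I <- 8; J <- 5; K <- 4; G <- 3; Q <- 3; R <- 2

# Binary membership matrix with every cluster non-empty.
labels <- sample(rep_len(seq_len(G), I))
U_i_g <- diag(G)[labels, , drop = FALSE]

# Centroids drawn uniformly from the interval c(0, 1).
Y_g_qr <- matrix(runif(G * Q * R, min = 0, max = 1), nrow = G)

# Orthonormal component matrices (an assumption of this sketch).
B_j_q <- qr.Q(qr(matrix(rnorm(J * Q), J, Q)))
C_k_r <- qr.Q(qr(matrix(rnorm(K * R), K, R)))

# Noise-free scores in the full space, noise, and the observed data.
Z_i_jk <- U_i_g %*% Y_g_qr %*% t(kronecker(C_k_r, B_j_q))
E_i_jk <- matrix(rnorm(I * J * K, mean = 0, sd = 0.5), nrow = I)
X_i_jk <- Z_i_jk + E_i_jk
```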
Random Membership Function Matrix Generator
Description
Generates random binary stochastic membership function matrix for the I objects.
Usage
generate_rmfm(I, G, seed = NULL)
Arguments
I |
Number of objects. |
G |
Number of groups/clusters. |
seed |
Seed for random number generation. |
Value
U_i_g, binary stochastic membership matrix.
Examples
generate_rmfm(I=8,G=3)
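A binary row-stochastic membership matrix has exactly one 1 per row. The base R sketch below shows one way to draw such a matrix while keeping every cluster non-empty; `random_membership` is a hypothetical name and the package may draw labels differently.

```r
# Hypothetical sketch (not the package's generate_rmfm()).
random_membership <- function(I, G, seed = NULL) {
  stopifnot(I >= G)
  if (!is.null(seed)) set.seed(seed)
  # One label per cluster first, then random labels for the remaining objects.
  labels <- c(seq_len(G), sample.int(G, I - G, replace = TRUE))
  labels <- labels[sample.int(I)]          # shuffle object order
  diag(G)[labels, , drop = FALSE]          # I x G indicator matrix
}

U_i_g <- random_membership(I = 8, G = 3, seed = 1)
rowSums(U_i_g)   # each object belongs to exactly one cluster
```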
One-run of the K-means clustering technique
Description
Initializes centroids based on a given membership function matrix or randomly. Iterates once over the input data to update the membership function matrix, assigning each object to its closest centroid.
Usage
onekmeans(Y_i_qr, G, U_i_g = NULL, seed = NULL)
Arguments
Y_i_qr |
Input data to group/cluster. |
G |
Number of clusters to find. |
U_i_g |
Initial membership matrix for the I objects. |
seed |
Seed for random values generation. |
Value
updated membership matrix U_i_g.
References
Oti EU, Olusola MO, Eze FC, Enogwe SU (2021). “Comprehensive Review of K-Means Clustering Algorithms.” International Journal of Advances in Scientific Research and Engineering (IJASRE), ISSN:2454-8006, DOI: 10.31695/IJASRE, 7(8), 64–69. doi:10.31695/IJASRE.2021.34050, https://ijasre.net/index.php/ijasre/article/view/1301.
Examples
X_i_jk = generate_dataset(seed=0)$X_i_jk
onekmeans(X_i_jk, G=5)
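A single assignment pass of this kind can be sketched in base R as follows. `one_kmeans_step` is a hypothetical helper; it assumes the supplied membership matrix has no empty cluster and uses squared Euclidean distances.

```r
# Hypothetical sketch of one K-means pass (not the package's onekmeans()).
one_kmeans_step <- function(Y, U) {
  G <- ncol(U)
  centroids <- crossprod(U, Y) / colSums(U)   # G x p cluster means
  # Squared Euclidean distances between every object and every centroid.
  d2 <- outer(rowSums(Y^2), rep(1, G)) -
    2 * Y %*% t(centroids) +
    outer(rep(1, nrow(Y)), rowSums(centroids^2))
  labels <- max.col(-d2, ties.method = "first")  # index of the closest centroid
  diag(G)[labels, , drop = FALSE]                # updated I x G membership
}

Y <- rbind(c(0, 0), c(0.1, 0), c(5, 5), c(5.1, 5))
U0 <- rbind(c(1, 0), c(0, 1), c(1, 0), c(0, 1))  # deliberately poor start
U1 <- one_kmeans_step(Y, U0)
```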
PseudoF Score in the Full-Space
Description
Computes the PseudoF score in the full space.
Usage
pseudof.full(bss, wss, full_tensor_shape, reduced_tensor_shape)
Arguments
bss |
Between-clusters sum of squared deviations. |
wss |
Within-clusters sum of squared deviations. |
full_tensor_shape |
Dimensions of the tensor in the original space. |
reduced_tensor_shape |
Dimension of the tensor in the reduced space. |
Value
PseudoF score
References
Caliński T, Harabasz J (1974). “A dendrite method for cluster analysis.” Communications in Statistics, 3(1), 1-27. doi:10.1080/03610927408827101, https://www.tandfonline.com/doi/pdf/10.1080/03610927408827101 , https://www.tandfonline.com/doi/abs/10.1080/03610927408827101. Rocci R, Vichi M (2005). “Three-Mode Component Analysis with Crisp or Fuzzy Partition of Units.” Psychometrika, 70, 715-736. doi:10.1007/s11336-001-0926-z.
Examples
pseudof.full(12,6,c(8,5,4),c(3,3,2))
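For reference, the classical pseudo-F index of Caliński and Harabasz for I objects in G clusters can be computed as below. `pseudo_f` is a hypothetical helper; the package function takes the tensor shapes instead and may use different degrees of freedom, so its value need not match this one.

```r
# Classical Calinski-Harabasz pseudo-F (hypothetical helper, for illustration).
pseudo_f <- function(bss, wss, n_objects, n_clusters) {
  (bss / (n_clusters - 1)) / (wss / (n_objects - n_clusters))
}

# With bss = 12, wss = 6, I = 8 and G = 3: (12 / 2) / (6 / 5)
pf <- pseudo_f(12, 6, n_objects = 8, n_clusters = 3)
```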
PseudoF Score in the Reduced-Space
Description
Computes the PseudoF score in the reduced space.
Usage
pseudof.reduced(bss, wss, full_tensor_shape, reduced_tensor_shape)
Arguments
bss |
Between-clusters sum of squared deviations. |
wss |
Within-clusters sum of squared deviations. |
full_tensor_shape |
Dimensions of the tensor in the original space. |
reduced_tensor_shape |
Dimension of the tensor in the reduced space. |
Value
PseudoF score
References
Caliński T, Harabasz J (1974). “A dendrite method for cluster analysis.” Communications in Statistics, 3(1), 1-27. doi:10.1080/03610927408827101, https://www.tandfonline.com/doi/pdf/10.1080/03610927408827101 , https://www.tandfonline.com/doi/abs/10.1080/03610927408827101.
Examples
pseudof.reduced(12,6,c(8,5,4),c(3,3,2))
Simultaneous Model Constructor
Description
Initialize model object required by the simultaneous methods.
Usage
simultaneous(
seed = NULL,
verbose = TRUE,
init = "svd",
n_max_iter = 10,
n_loops = 10,
tol = 1e-05,
U_i_g = NULL,
B_j_q = NULL,
C_k_r = NULL
)
Arguments
seed |
Seed for random sequence generation. |
verbose |
Flag to display output result for each loop. |
init |
The initialization method for the model parameters. Possible values are 'svd', 'random', 'twcfta' or 'twfcta'. Defaults to 'svd'. |
n_max_iter |
Maximum number of iterations to optimize objective function. |
n_loops |
Number of runs/loops in search of the global result. |
tol |
Acceptable tolerance level. |
U_i_g |
Membership function matrix for the objects. |
B_j_q |
Component matrix for the variables. |
C_k_r |
Component matrix for the occasions. |
Details
Two simultaneous models are implemented: T3Clus and 3FKMeans.
T3Clus finds B_j_q and C_k_r such that the between-clusters deviance of the component scores is maximized.
3FKMeans finds B_j_q and C_k_r such that the within-clusters deviance of the component scores is minimized.
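The between- and within-clusters deviances referred to above are the two parts of the usual decomposition of the total deviance, TSS = BSS + WSS. A base R sketch for a score matrix and a label vector follows; `deviance_split` is a hypothetical helper, not a package function.

```r
# Hypothetical helper: total, between and within deviance of the rows of Y.
deviance_split <- function(Y, labels) {
  n_g <- as.vector(table(labels))              # cluster sizes
  grand <- colMeans(Y)                         # overall mean
  centers <- rowsum(Y, labels) / n_g           # cluster means, one row each
  tss <- sum(sweep(Y, 2, grand)^2)
  bss <- sum(n_g * rowSums(sweep(centers, 2, grand)^2))
  wss <- sum((Y - centers[as.character(labels), , drop = FALSE])^2)
  c(TSS = tss, BSS = bss, WSS = wss)
}

Y <- rbind(c(0, 0), c(2, 0), c(10, 0), c(12, 0))
d <- deviance_split(Y, labels = c(1, 1, 2, 2))
```

Since TSS is fixed for given scores, maximizing BSS and minimizing WSS coincide for a fixed projection; the two models differ because the projection itself is being optimized.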
Value
An object of class "simultaneous".
Note
The model finds the best partition described by the best orthogonal linear combinations of the variables and orthogonal linear combinations of the occasions.
References
Tucker L (1966). “Some mathematical notes on three-mode factor analysis.” Psychometrika, 31(3), 279-311. doi:10.1007/BF02289464, https://ideas.repec.org/a/spr/psycho/v31y1966i3p279-311.html. Vichi M, Rocci R, Kiers H (2007). “Simultaneous Component and Clustering Models for Three-way Data: Within and Between Approaches.” Journal of Classification, 24, 71-98. doi:10.1007/s00357-007-0006-x.
See Also
fit.t3clus
fit.3fkmeans
fit.ct3clus
tandem
Examples
simultaneous()
Simultaneous Model
Description
Simultaneous Model
Slots
seed
numeric. Seed for random sequence generation. Defaults to NULL.
verbose
logical. Whether to display execution output or not. Defaults to TRUE.
init
character. The parameter initialization method. Defaults to 'svd'.
n_max_iter
numeric. Maximum number of iterations. Defaults to 10.
n_loops
numeric. Number of initializations (loops) run in search of the global result. Defaults to 10.
tol
numeric. Tolerance level/acceptable error. Defaults to 1e-5.
U_i_g
numeric. (I,G) initial stochastic membership function matrix.
B_j_q
numeric. (J,Q) initial component weight matrix for variables.
C_k_r
numeric. (K,R) initial component weight matrix for occasions.
Split Members of the Largest Cluster with an Empty Cluster
Description
If there is an empty cluster, shares members of the largest cluster with the empty cluster via the k-means clustering technique.
Usage
split_update(LC, LC_members, LC_scores, EC, U_i_g, C_g, seed)
Arguments
LC |
Largest cluster index. |
LC_members |
Members of largest cluster. |
LC_scores |
Scores of largest cluster. |
EC |
Empty cluster index to share members of LC with. |
U_i_g |
Current membership function matrix with empty cluster to update. |
C_g |
Number of members in each cluster. |
seed |
Seed for random number generation. |
Value
U_i_g, the updated membership matrix.
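The idea can be sketched in base R: run a 2-means on the scores of the largest cluster and hand one of the two halves to the empty cluster. `split_largest` and its signature are hypothetical and simpler than the package function; the sketch assumes the largest cluster has at least two distinct members.

```r
# Hypothetical sketch (not the package's split_update()).
split_largest <- function(U, scores, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  sizes <- colSums(U)
  empty <- which(sizes == 0)
  if (length(empty) == 0) return(U)        # nothing to repair
  ec <- empty[1]                           # empty cluster to fill
  lc <- which.max(sizes)                   # largest cluster
  members <- which(U[, lc] == 1)
  # Split the largest cluster in two with K-means on its scores.
  halves <- kmeans(scores[members, , drop = FALSE], centers = 2)$cluster
  moved <- members[halves == 2]
  U[moved, lc] <- 0
  U[moved, ec] <- 1
  U
}

scores <- rbind(c(0, 0), c(0.1, 0), c(5, 5), c(5.1, 5), c(20, 20))
U <- cbind(c(1, 1, 1, 1, 0), 0, c(0, 0, 0, 0, 1))   # cluster 2 is empty
U_new <- split_largest(U, scores, seed = 1)
```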
Initializes an instance of the tandem model required by the tandem methods.
Description
Initializes an instance of the tandem model required by the tandem methods.
Usage
tandem(
seed = NULL,
verbose = TRUE,
init = "svd",
n_max_iter = 10,
n_loops = 10,
tol = 1e-05,
U_i_g = NULL,
B_j_q = NULL,
C_k_r = NULL
)
Arguments
seed |
Seed for random sequence generation. |
verbose |
Flag to display iteration outputs for each loop. |
init |
Parameter initialization method, 'svd' or 'random'. |
n_max_iter |
Maximum number of iterations to optimize the objective function. |
n_loops |
Maximum number of loops/runs for global results. |
tol |
Allowable tolerance to check convergence. |
U_i_g |
Initial membership function matrix for the objects. |
B_j_q |
Initial component scores matrix for the variables. |
C_k_r |
Initial component scores matrix for the occasions. |
Value
An object of class "tandem".
References
Arabie P, Hubert L (1996). “Advances in Cluster Analysis Relevant to Marketing Research.” In Gaul W, Pfeifer D (eds.), From Data to Knowledge, 3–19. Tucker L (1966). “Some mathematical notes on three-mode factor analysis.” Psychometrika, 31(3), 279-311. doi:10.1007/BF02289464, https://ideas.repec.org/a/spr/psycho/v31y1966i3p279-311.html.
See Also
fit.twcfta
fit.twfcta
simultaneous
Tandem Class
Description
Tandem Class
Slots
seed
numeric. Seed for random sequence generation. Defaults to NULL.
verbose
logical. Whether to display execution output or not. Defaults to TRUE.
init
character. The parameter initialization method. Defaults to 'svd'.
n_max_iter
numeric. Maximum number of iterations. Defaults to 10.
n_loops
numeric. Number of initializations (loops) run in search of the global result. Defaults to 10.
tol
numeric. Tolerance level/acceptable error. Defaults to 1e-5.
U_i_g
matrix. (I,G) initial stochastic membership function matrix.
B_j_q
matrix. (J,Q) initial component weight matrix for variables.
C_k_r
matrix. (K,R) initial component weight matrix for occasions.
Tensor Matricization
Description
Unfolds (matricizes) a three-mode tensor, converting it to a matrix along the given mode.
Usage
unfold(tensor, mode)
Arguments
tensor |
Three-mode tensor array. |
mode |
Mode of operation. |
Value
Matrix
Examples
X_i_jk = generate_dataset()$X_i_jk
X_i_j_k = fold(X_i_jk, mode=1, shape=c(I=8,J=5,K=4))
unfold(X_i_j_k, mode=1) # X_i_jk
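Matricization along each mode can be sketched in base R with `aperm()`. `unfold_sketch` is a hypothetical helper, and the column orderings it produces (X_i_jk, X_j_ki, X_k_ij, with the first listed column index varying fastest) are an assumption that may not match the package.

```r
# Hypothetical sketch of mode-n unfolding (not the package's unfold()).
unfold_sketch <- function(tensor, mode) {
  # Bring the requested mode to the front, then flatten the other two.
  perm <- switch(mode, c(1, 2, 3), c(2, 3, 1), c(3, 1, 2))
  matrix(aperm(tensor, perm), nrow = dim(tensor)[mode])
}

X_i_j_k <- array(seq_len(2 * 3 * 4), dim = c(2, 3, 4))   # I = 2, J = 3, K = 4
X_i_jk <- unfold_sketch(X_i_j_k, mode = 1)   # 2 x 12
X_j_ki <- unfold_sketch(X_i_j_k, mode = 2)   # 3 x 8
X_k_ij <- unfold_sketch(X_i_j_k, mode = 3)   # 4 x 6
```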