| Title: | Factor Analytic Profile Analysis of Ipsatized Data |
|---|---|
| Description: | Implements Factor Analytic Profile Analysis of Ipsatized Data ('FAPA'), a metric inferential framework for pattern detection and person-level reconstruction in multivariate profile data. After row-centering (ipsatization) to remove profile elevation, 'FAPA' applies singular value decomposition ('SVD') to recover shared core profiles and individual pattern weights. Dimensionality is determined by a variance-matched Horn's parallel analysis. A three-stage bootstrap verification framework assesses (1) dimensionality via parallel analysis, (2) subspace stability via Procrustes principal angles, and (3) profile replicability via Tucker's congruence coefficients. BCa bootstrap confidence intervals for core-profile coordinates are computed via the canonical 'boot' package implementation of Davison and Hinkley (1997) <doi:10.1017/CBO9780511802843>. |
| Authors: | Se-Kang Kim [aut, cre] |
| Maintainer: | Se-Kang Kim |
| License: | MIT + file LICENSE |
| Version: | 0.1.1 |
| Built: | 2026-05-15 09:13:01 UTC |
| Source: | https://github.com/sekangakim/fapa |
Implements Factor Analytic Profile Analysis of Ipsatized Data (FAPA), a metric inferential framework for pattern detection and person-level reconstruction in multivariate profile data.
After row-centering (ipsatization) to remove profile elevation, FAPA applies singular value decomposition (SVD) to recover shared core profiles and individual pattern weights, supporting the following workflow:
1. Ipsatization: load_and_ipsatize removes person-level elevation, isolating within-person pattern structure.
2. Core estimation: fapa_core performs SVD and returns the core-profile matrix, person weights, and variance decomposition.
3. Stage 1, dimensionality: fapa_pa applies variance-matched Horn's parallel analysis to determine how many components to retain.
4. Stage 2, subspace stability: fapa_procrustes assesses dimensional stability via bootstrap principal angles.
5. Stage 3, profile replicability: fapa_tucker computes Tucker's congruence coefficients across bootstrap resamples.
6. Inference: fapa_bca provides BCa bootstrap confidence intervals for core-profile coordinates using the canonical boot implementation.
7. Reconstruction: fapa_person projects each person onto the retained core profiles and reports reconstruction R² and optional bootstrap CIs for selected participants.
Horn, J. L. (1965). A rationale and test for the number of factors in factor analysis. Psychometrika, 30(2), 179–185.
Davison, A. C., & Hinkley, D. V. (1997). Bootstrap Methods and Their Application. Cambridge University Press.
Lorenzo-Seva, U., & ten Berge, J. M. F. (2006). Tucker's congruence coefficient as a meaningful index of factor similarity. Methodology, 2(2), 57–64.
Kim, S.-K. (2024). Factorization of person response profiles to identify summative profiles carrying central response patterns. Psychological Methods, 29(4), 723–730. doi:10.1037/met0000568
Se-Kang Kim
Computes BCa (bias-corrected and accelerated) bootstrap confidence intervals
for every coordinate of every retained core profile, using the canonical
implementation in boot and boot.ci.
fapa_bca(Xtilde, K, B = 2000, alpha = 0.05, seed = NULL)

| Argument | Description |
|---|---|
| Xtilde | Numeric matrix (persons × variables), ipsatized. |
| K | Integer. Number of core profiles (must equal the retained dimensionality from fapa_pa). |
| B | Integer. Number of bootstrap replicates. Default 2000. |
| alpha | Numeric. Two-tailed significance level. Default 0.05. |
| seed | Integer or NULL. Random seed. Default NULL. |
Sign ambiguity across bootstrap resamples is handled inside the statistic
function via an inner-product alignment rule (see .align_signs),
ensuring that each bootstrap distribution is unimodal before BCa adjustment.
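The sign-alignment idea can be illustrated with a minimal base-R sketch built on boot and boot.ci. It is not the package's internal code: the simulated data, the helper name core_stat, and the use of a single component are assumptions made purely for illustration.

```r
library(boot)

set.seed(1)
N <- 80; J <- 6
pattern <- c(1, 1, 1, -1, -1, -1)               # hypothetical within-person pattern
X <- outer(rnorm(N, sd = 2), pattern) + matrix(rnorm(N * J), N, J)
Xtilde <- X - rowMeans(X)                        # ipsatize (row-centre)

K <- 1
s0 <- svd(Xtilde, nu = 0, nv = K)
ref_core <- s0$v %*% diag(s0$d[1:K], K)          # reference core profile (J x K)

# Statistic for boot(): core-profile coordinates of a resample, with each
# column flipped so that its inner product with the reference is non-negative.
core_stat <- function(data, idx) {
  s <- svd(data[idx, , drop = FALSE], nu = 0, nv = K)
  core <- s$v %*% diag(s$d[1:K], K)
  sgn <- sign(colSums(core * ref_core))
  sgn[sgn == 0] <- 1
  as.vector(sweep(core, 2, sgn, `*`))
}

bt <- boot(Xtilde, core_stat, R = 500)           # R kept small for speed
ci <- boot.ci(bt, conf = 0.95, type = "bca", index = 1)
ci$bca[4:5]                                      # BCa lower and upper bounds, coordinate 1
```

Without the flip inside core_stat, resamples whose singular vector comes back with the opposite sign would produce a two-humped bootstrap distribution, and the BCa adjustment would not be meaningful.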
A named list:
List of data frames (one per core profile), each with columns Ori, Mean, SE, Lower, Upper, BCaLower, BCaUpper.
Original core-profile matrix.
The full boot object for downstream diagnostics.
3-D array of bootstrap profiles.
Inputs echoed for plotting and output.
Davison, A. C., & Hinkley, D. V. (1997). Bootstrap Methods and Their Application. Cambridge University Press. doi:10.1017/CBO9780511802843
plot_fapa_core, write_fapa_results
Computes the thin singular value decomposition (SVD) of an ipsatized
person-by-variable matrix, returning the core profiles, person
weights, singular values, and variance-accounting summaries.
fapa_core(Xtilde, K, direction = NULL)

| Argument | Description |
|---|---|
| Xtilde | Numeric matrix (persons × variables), ipsatized. |
| K | Integer. Number of components to extract. |
| direction | Integer vector of length K, or NULL. Default NULL. |
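The core-profile (scale) matrix is defined as the right singular vectors scaled by their singular values, so that each individual's ipsatized profile is reproduced from that person's weights and the core profiles (rank-K reconstruction; exact when K equals the rank of the ipsatized matrix).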
Signs of the singular vectors are normalised so that the element with the largest absolute value in each core-profile column is positive.
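The decomposition and the sign rule can be sketched in a few lines of base R. This is one reading of the definitions above, not the package source: the random data and the variable names (U, V, core, sgn) are assumptions for illustration.

```r
set.seed(42)
X <- matrix(rnorm(12 * 5), nrow = 12)  # 12 hypothetical persons, 5 variables
Xtilde <- X - rowMeans(X)              # ipsatize: every row now sums to zero

s <- svd(Xtilde)
K <- ncol(Xtilde) - 1                  # row-centring leaves at most J - 1 dimensions

U <- s$u[, 1:K, drop = FALSE]          # person weights (persons x K)
V <- s$v[, 1:K, drop = FALSE]          # right singular vectors (variables x K)
core <- V %*% diag(s$d[1:K], K)        # singular vectors scaled by singular values

# Sign rule: make the largest-magnitude element of each core-profile column positive.
sgn <- sign(core[cbind(apply(abs(core), 2, which.max), seq_len(K))])
core <- sweep(core, 2, sgn, `*`)
U <- sweep(U, 2, sgn, `*`)             # flip the weights too, so the product is unchanged

max(abs(Xtilde - U %*% t(core)))       # numerically zero when K equals the rank
```

Choosing a smaller K in the same code gives the usual truncated (approximate) reconstruction.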
A named list:
Person-weight matrix.
Singular values.
Right singular vectors.
Core-profile matrix.
Total ipsatized variance (Frobenius norm).
Variance per component.
Proportion of variance per component.
Cumulative proportion of variance.
Normalised person-core correlations.
Sign vector applied (for reproducibility).
Number of components extracted.
Kim, S.-K. (2024). Factorization of person response profiles to identify summative profiles carrying central response patterns. Psychological Methods, 29(4), 723–730. doi:10.1037/met0000568
Determines the number of components to retain from the SVD of an ipsatized data matrix using a variance-matched bootstrap version of Horn's (1965) parallel analysis.
fapa_pa(Xtilde, B = 2000, alpha = 0.05, seed = NULL)

| Argument | Description |
|---|---|
| Xtilde | Numeric matrix (persons × variables), ipsatized. |
| B | Integer. Number of bootstrap replicates. Default 2000. |
| alpha | Numeric. Significance level. Default 0.05. |
| seed | Integer or NULL. Random seed. Default NULL. |
For each of B bootstrap replicates, a random matrix of identical dimensions is row-centred (ipsatized) and then rescaled to the same Frobenius norm as Xtilde. This variance-matching step is essential: without it, raw-score data trivially dominates N(0,1) random matrices and PA retains all components. Components whose observed squared singular value exceeds the (1 − alpha) quantile of the matched null distribution are retained.
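The variance-matching step can be sketched in base R as follows. This is an illustration under stated assumptions rather than the package's implementation: the simulated two-pattern data are hypothetical, and the sequential stopping rule (retain components until the first one that fails) is an assumption about how "retained" is counted.

```r
set.seed(123)
N <- 200; J <- 8
p1 <- c(1,  1, 1,  1, -1, -1, -1, -1)            # hypothetical patterns (each sums to zero)
p2 <- c(1, -1, 1, -1,  1, -1,  1, -1)
X <- outer(rnorm(N, sd = 3), p1) + outer(rnorm(N, sd = 2), p2) +
  matrix(rnorm(N * J), N, J)
Xtilde <- X - rowMeans(X)

B <- 200; alpha <- 0.05
obs_sq <- svd(Xtilde, nu = 0, nv = 0)$d^2        # observed squared singular values
target <- sqrt(sum(Xtilde^2))                    # Frobenius norm to match

null_sq <- t(replicate(B, {
  Z <- matrix(rnorm(N * J), N, J)                # N(0,1) random matrix
  Z <- Z - rowMeans(Z)                           # row-centre, like the data
  Z <- Z * (target / sqrt(sum(Z^2)))             # rescale to the observed norm
  svd(Z, nu = 0, nv = 0)$d^2
}))                                              # B x J matrix of null values

threshold <- apply(null_sq, 2, quantile, probs = 1 - alpha)
exceeds <- obs_sq > threshold
# Assumed sequential rule: keep components until the first one that fails.
K_retained <- if (all(exceeds)) length(exceeds) else which(!exceeds)[1] - 1
K_retained
```

Because every null matrix is rescaled to the observed norm, each row of null_sq sums to the same total as obs_sq, so the comparison is about how variance is distributed across components rather than how much there is.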
A named list:
Number of components retained.
Observed squared singular values.
Bootstrap quantile per component.
Proportion of variance per observed component.
Mean proportion of variance per random component.
Full matrix of random squared singular values.
Total ipsatized variance.
Inputs echoed for reporting.
Horn, J. L. (1965). A rationale and test for the number of factors in factor analysis. Psychometrika, 30(2), 179–185.
Projects each person onto the retained core profiles, returning reconstruction R², pattern weights (w1 ... wK), and normalised person-core correlations (rDim1 ... rDimK). Optionally computes percentile bootstrap confidence intervals for pattern weights of selected participants.
fapa_person(Xtilde, fit, participants = NULL, B_boot = 2000, alpha = 0.05, seed = NULL)

| Argument | Description |
|---|---|
| Xtilde | Numeric matrix (persons × variables), ipsatized. |
| fit | A list returned by fapa_core. |
| participants | Integer vector of row indices for which bootstrap CIs are desired, or NULL. Default NULL. |
| B_boot | Integer. Bootstrap replicates for participant CIs. Default 2000. |
| alpha | Numeric. Two-tailed significance level. Default 0.05. |
| seed | Integer or NULL. Random seed. Default NULL. |
A named list:
Data frame (one row per person) with columns Person, Level, R2, w1 ... wK, rDim1 ... rDimK.
Matrix of bootstrap summary statistics for participants (or NULL).
Partial R² per dimension.
Mean person reconstruction R² across all persons.
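A minimal base-R sketch of person-level reconstruction is given below. It does not call fapa_person; the random data, the variable names, and in particular the definition of the per-dimension ("partial") R² as each component's share of a person's ipsatized sum of squares are assumptions for illustration.

```r
set.seed(7)
N <- 50; J <- 6; K <- 2
X <- matrix(rnorm(N * J), N, J)                  # hypothetical raw scores
Xtilde <- X - rowMeans(X)

s <- svd(Xtilde)
U <- s$u[, 1:K, drop = FALSE]                    # pattern weights
core <- s$v[, 1:K, drop = FALSE] %*% diag(s$d[1:K], K)

level <- rowMeans(X)                             # elevation removed by ipsatization
Xhat  <- U %*% t(core)                           # rank-K reconstruction of each profile
R2    <- 1 - rowSums((Xtilde - Xhat)^2) / rowSums(Xtilde^2)

# Assumed definition: share of each person's sum of squares carried by dimension k.
partial_R2 <- sapply(seq_len(K), function(k) {
  rowSums((U[, k, drop = FALSE] %*% t(core[, k, drop = FALSE]))^2) / rowSums(Xtilde^2)
})

head(data.frame(Person = seq_len(N), Level = level, R2 = R2, w = U))
mean(R2)                                         # mean person-level R2
```

Because the singular vectors are orthogonal, the per-dimension shares add up to each person's overall R² under this definition.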
For each of B bootstrap resamples of the ipsatized data matrix, computes the right singular vectors and measures the principal angles (in degrees) between the bootstrap subspace and the original K-dimensional right singular vector subspace.
fapa_procrustes(Xtilde, K, B = 2000, angle_thresh = 30, seed = NULL)

| Argument | Description |
|---|---|
| Xtilde | Numeric matrix (persons × variables), ipsatized. |
| K | Integer. Number of dimensions to assess. |
| B | Integer. Number of bootstrap replicates. Default 2000. |
| angle_thresh | Numeric. Upper stability bound in degrees. Default 30. |
| seed | Integer or NULL. Random seed. Default NULL. |
A bootstrap replicate is declared stable when all
principal angles are strictly less than angle_thresh.
This criterion confirms that the bootstrap subspace is nearly parallel to
the original, providing geometric evidence of dimensional stability.
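The stability criterion can be sketched in base R using the standard result that the cosines of the principal angles are the singular values of the cross-product of two orthonormal bases (Bjorck & Golub, 1973). The helper principal_angles and the simulated data are hypothetical and are not taken from the package.

```r
# Principal angles (degrees) between the column spaces of two orthonormal bases
principal_angles <- function(V1, V2) {
  cosines <- svd(crossprod(V1, V2), nu = 0, nv = 0)$d
  acos(pmin(pmax(cosines, 0), 1)) * 180 / pi     # clamp against rounding error
}

set.seed(11)
N <- 100; J <- 6; K <- 2
scores <- cbind(rnorm(N, sd = 3), rnorm(N, sd = 2))
patterns <- cbind(c(1, 1, 1, -1, -1, -1), c(1, -1, 0, 0, 1, -1))
X <- scores %*% t(patterns) + matrix(rnorm(N * J, sd = 0.5), N, J)
Xtilde <- X - rowMeans(X)

V0 <- svd(Xtilde, nu = 0, nv = K)$v              # original K-dimensional subspace

B <- 200; angle_thresh <- 30
angles <- t(replicate(B, {
  idx <- sample.int(N, replace = TRUE)           # resample persons with replacement
  Vb <- svd(Xtilde[idx, , drop = FALSE], nu = 0, nv = K)$v
  principal_angles(V0, Vb)
}))                                              # B x K matrix of angles

stable <- apply(angles < angle_thresh, 1, all)   # all K angles strictly below the bound
mean(stable)                                     # proportion of stable replicates
```

Principal angles depend only on the subspaces, so no sign or ordering alignment of individual singular vectors is needed at this stage.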
A named list:
B × K matrix of principal angles (degrees).
Per-dimension mean and SD of angles.
Per-dimension 2.5th and 97.5th percentiles.
Number of replicates satisfying the stability criterion.
Proportion of stable replicates.
Inputs echoed for reporting.
Bjorck, A., & Golub, G. H. (1973). Numerical methods for computing angles between linear subspaces. Mathematics of Computation, 27(123), 579–594.
print_procrustes, plot_principal_angles
A simulated dataset approximating the structure of the calibration sample used in Kim (in preparation). It contains no real clinical records. The data comprise 500 synthetic cases on 22 variables: 11 pre-treatment and 11 post-treatment administrations of the Eating Disorder Inventory-2 (EDI-2) subscales.
fapa_simdata
A data frame with 500 rows and 22 columns. The first 11 columns
contain pre-treatment EDI-2 subscale scores (Drive for Thinness through
Social Insecurity) and the remaining 11 columns contain the corresponding
post-treatment scores. Column names follow the convention
Before_<n>_<abbr> and After_<n>_<abbr>, where n is
the subscale index and abbr is a two-letter abbreviation.
Scores are integers in the range 0–40.
The 11 EDI-2 subscales are:
Dt Drive for Thinness
Bu Bulimia
Bd Body Dissatisfaction
In Ineffectiveness
Pf Perfectionism
Id Interpersonal Distrust
Ia Interoceptive Awareness
Mf Maturity Fears
As Asceticism
Ir Impulse Regulation
Si Social Insecurity
The latent structure was constructed to approximate two components: a normative symptom gradient (CP1) and a pre-/post-treatment change contrast (CP2).
Simulated. See data-raw/simulate_fapa_data.R.
Garner, D. M. (1991). Eating Disorder Inventory-2 Manual. Psychological Assessment Resources.
data(fapa_simdata)
dim(fapa_simdata)
## Quick ipsatization check
Xtilde <- as.matrix(fapa_simdata) - rowMeans(as.matrix(fapa_simdata))
range(rowSums(Xtilde)) # should be ~0
For each of B bootstrap resamples, computes Tucker's congruence
coefficient (CC) between each original core profile and its bootstrap
counterpart. Sign ambiguity is resolved by choosing the sign that
maximises the absolute CC before storing.
fapa_tucker(Xtilde, K, B = 2000, cc_thresh = 0.85, seed = NULL)

| Argument | Description |
|---|---|
| Xtilde | Numeric matrix (persons × variables), ipsatized. |
| K | Integer. Number of core profiles to assess. |
| B | Integer. Number of bootstrap replicates. Default 2000. |
| cc_thresh | Numeric. Acceptability lower bound. Default 0.85. |
| seed | Integer or NULL. Random seed. Default NULL. |
Conventional thresholds (Lorenzo-Seva & ten Berge, 2006):
CC ≥ 0.95: high similarity ("factor replication").
0.85 ≤ CC < 0.95: fair similarity.
CC < 0.85: poor similarity.
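For reference, Tucker's congruence coefficient between two profiles x and y is sum(x * y) / sqrt(sum(x^2) * sum(y^2)). The base-R sketch below applies it across bootstrap resamples of a single simulated profile; the helper tucker_cc, the data, and the use of the absolute value to resolve sign are assumptions for illustration and not the package code.

```r
# Tucker's congruence coefficient between two profile vectors
tucker_cc <- function(x, y) sum(x * y) / sqrt(sum(x^2) * sum(y^2))

set.seed(3)
N <- 100; J <- 6
pattern <- c(2, 1, 0, 0, -1, -2)                 # hypothetical core pattern
X <- outer(rnorm(N, sd = 2), pattern) + matrix(rnorm(N * J), N, J)
Xtilde <- X - rowMeans(X)

# CC is scale-invariant, so the unscaled singular vector is enough here.
core0 <- svd(Xtilde, nu = 0, nv = 1)$v[, 1]

B <- 200
cc <- replicate(B, {
  idx <- sample.int(N, replace = TRUE)           # resample persons with replacement
  coreb <- svd(Xtilde[idx, , drop = FALSE], nu = 0, nv = 1)$v[, 1]
  abs(tucker_cc(core0, coreb))                   # singular-vector sign is arbitrary
})

c(mean = mean(cc), quantile(cc, c(0.025, 0.975)))
mean(cc >= 0.85)                                 # share of resamples at or above "fair"
```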
A named list:
B × K matrix of Tucker CCs.
Per-profile mean and SD of CCs.
Per-profile 2.5th and 97.5th percentiles.
Inputs echoed for reporting.
Lorenzo-Seva, U., & ten Berge, J. M. F. (2006). Tucker's congruence coefficient as a meaningful index of factor similarity. Methodology, 2(2), 57–64. doi:10.1027/1614-2241.2.2.57
Tucker, L. R. (1951). A method for synthesis of factor analysis studies (Personnel Research Section Report No. 984). Department of the Army.
Reads a person-by-variable CSV file, assigns column labels, and removes each person's mean across variables (ipsatization), isolating within-person pattern structure from overall profile elevation.
load_and_ipsatize(path, col_labels)

| Argument | Description |
|---|---|
| path | Character. Path to a CSV file with a header row. |
| col_labels | Character vector of length equal to the number of columns. Column names are replaced with these labels after loading. |
A named list with elements:
Original data as a data.frame.
Row-centred (ipsatized) matrix.
Numeric vector of per-person means (profile levels).
The supplied col_labels.
## Create a small temporary CSV and ipsatize it
tmp <- tempfile(fileext = ".csv")
write.csv(matrix(sample(1:5, 30, replace = TRUE), nrow = 6), tmp, row.names = FALSE)
dat <- load_and_ipsatize(tmp, col_labels = paste0("V", 1:5))
round(rowSums(dat$ipsatized), 10) # should all be 0
unlink(tmp)
Displays the i-th core profile from a fapa_bca result, split at a variable boundary (e.g., pre vs post) with BCa CI bands. Variables 1 through split_at are drawn in red (dashed), variables from split_at + 1 onward in blue (solid).
plot_fapa_core(bca, i = 1, split_at = 11, main = NULL, ylab = "Core-Profile Coordinate")

| Argument | Description |
|---|---|
| bca | A list returned by fapa_bca. |
| i | Integer. Which core profile to plot. Default 1. |
| split_at | Integer. Index at which to switch colour/line-type. Default 11. |
| main | Character. Plot title. Default NULL (auto-generated). |
| ylab | Character. Y-axis label. |
Invisibly returns NULL. Called for its side-effect.
Plots the observed squared singular values against the random 95th-percentile reference line, with a vertical cut at the retention boundary.
plot_pa_scree(pa, main = "Horn's Parallel Analysis — Scree")

| Argument | Description |
|---|---|
| pa | A list returned by fapa_pa. |
| main | Character. Plot title. |
Invisibly returns NULL.
Plots the standardized ipsatized profile of person p alongside each
of the first K core profiles (also standardized), one panel per
dimension.
plot_person_match(bca, Xtilde, p = 1, K = 2)

| Argument | Description |
|---|---|
| bca | A list returned by fapa_bca. |
| Xtilde | Numeric matrix (persons × variables), ipsatized. |
| p | Integer. Row index of the focal person. Default 1. |
| K | Integer. Number of core profiles to overlay. Default 2. |
Invisibly returns NULL.
Draws one histogram per dimension showing the bootstrap distribution of principal angles, with the stability threshold marked as a vertical line.
plot_principal_angles(pr)

| Argument | Description |
|---|---|
| pr | A list returned by fapa_procrustes. |
Invisibly returns NULL.
Draws one histogram per core profile showing the bootstrap distribution of Tucker CCs, with reference lines at the fair (default 0.85) and high (0.95) thresholds.
plot_tucker_cc(tc, cc_thresh = 0.85)

| Argument | Description |
|---|---|
| tc | A list returned by fapa_tucker. |
| cc_thresh | Numeric. Fair-similarity reference line. Default 0.85. |
Invisibly returns NULL.
Print a summary of Stage 1 parallel analysis results
print_pa(pa)

| Argument | Description |
|---|---|
| pa | A list returned by fapa_pa. |
Invisibly returns NULL. Called for its side-effect of
printing a formatted table to the console.
Print a summary of Stage 2 principal-angle results
print_procrustes(pr, K_pa = NULL)

| Argument | Description |
|---|---|
| pr | A list returned by fapa_procrustes. |
| K_pa | Integer or NULL. Default NULL. |
Invisibly returns NULL.
Print a summary of Stage 3 Tucker CC results
print_tucker(tc, cc_thresh, K_pa = NULL)

| Argument | Description |
|---|---|
| tc | A list returned by fapa_tucker. |
| cc_thresh | Numeric. Acceptability cutoff to display (should match the value used in fapa_tucker). |
| K_pa | Integer or NULL. Default NULL. |
Invisibly returns NULL.
Writes one CSV file per retained core profile, containing the original coordinates together with bootstrap mean, SE, percentile, and BCa confidence bounds.
write_fapa_results(bca, prefix)

| Argument | Description |
|---|---|
| bca | A list returned by fapa_bca. |
| prefix | Character. Base name for output files. |
Invisibly returns a character vector of file paths written.
Writes one CSV file for each of the three verification stages.
write_verification(pa, pr, tc, prefix, K_pa = NULL)

| Argument | Description |
|---|---|
| pa | A list returned by fapa_pa. |
| pr | A list returned by fapa_procrustes. |
| tc | A list returned by fapa_tucker. |
| prefix | Character. Base name for output files. |
| K_pa | Integer or NULL. Default NULL. |
Invisibly returns a named character vector of the three file paths.