| Title: | Bridging Log-Odds Ratios and Correspondence Analysis via Closeness-of-Concordance Measures |
|---|---|
| Description: | Provides a unified analytical workflow that bridges conventional binary and multinomial logistic regression with singly-ordered (SONSCA) and doubly-ordered (DONSCA) nonsymmetric correspondence analysis. Log-odds ratios (LORs) from logistic regression are re-expressed as cosine theta estimates and closeness-of-concordance measures (CCMs) -- including Yule's Q, Yule's Y, and r_meta -- on the familiar [-1, +1] scale introduced by Kim and Grochowalski (2019) <doi:10.3758/s13428-018-1161-1>. Bootstrap confidence intervals for cosine theta are provided throughout. The package is intended to help clinical and medical researchers interpret association strength from logistic regression in an intuitive, correlation-like metric, and to connect conventional regression results with geometric correspondence analysis visualisations. |
| Authors: | Se-Kang Kim [aut, cre] (ORCID: <https://orcid.org/0000-0003-0928-3396>) |
| Maintainer: | Se-Kang Kim <[email protected]> |
| License: | GPL (>= 3) |
| Version: | 0.1.0 |
| Built: | 2026-05-22 07:45:57 UTC |
| Source: | https://github.com/sekangakim/lorbridge |
Fits a binary logistic regression model with a categorical predictor (treated as an unordered factor with a user-specified reference level). For each non-reference category, reports the pairwise LOR, OR, Wald confidence interval, p-value, CCMs, and cosine theta from a 2 x J simple correspondence analysis (Kim & Grochowalski, 2019 bridge).
blr_categorical(outcome, predictor, ref_level = NULL, alpha = 0.05)
outcome |
Integer or numeric vector. Binary outcome (0/1). |
predictor |
Factor or character vector. Categorical predictor. Will be converted to a factor internally. |
ref_level |
Character. Reference category label (default: first level). |
alpha |
Numeric. Significance level for confidence intervals (default 0.05). |
Cosine theta is computed via a 1D singular value decomposition (SVD) of
the standardised residual matrix of the 2 x J correspondence table,
bypassing CAvariants (which requires at least 2 dimensions). In a
2-row table the 1D CA solution is exact, and cosine theta equals +1 or -1
for each non-reference column, with the sign indicating the direction of
association relative to the reference category and majority group.
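The claim that the 1D solution is exact for a 2-row table can be checked numerically. The following is a minimal base-R sketch, not the package's internal code: the counts are invented, and it assumes the standardised residuals take the usual symmetric-CA form (P - rc') / sqrt(rc'), which may differ from the scaling used inside `blr_categorical()`.

```r
# Hypothetical 2 x 4 table of counts (illustrative only, not package data)
tab <- matrix(c(40, 30, 20, 10,
                10, 20, 30, 40),
              nrow = 2, byrow = TRUE,
              dimnames = list(c("Majority", "Minority"),
                              c("B1", "B2", "B3", "B4")))

P  <- tab / sum(tab)       # correspondence matrix
r  <- rowSums(P)           # row masses
cm <- colSums(P)           # column masses
E  <- outer(r, cm)         # expected proportions under independence
S  <- (P - E) / sqrt(E)    # standardised residuals (assumed form)

sv <- svd(S)
sv$d                       # the second singular value is numerically zero
```

Because only one singular value is non-zero, every column sits on a single axis, so the cosine between any column direction and that axis can only be +1 or -1.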
A named list with elements:
The fitted glm object.
A data.frame with one row per non-reference category, containing LOR, OR, 95 percent CI, p-value, CCMs, and cosine theta.
The 2 x J contingency table used for cosine theta.
Kim, S.-K., & Grochowalski, J. H. (2019). Gaining from discretization of continuous data: The correspondence analysis biplot approach. Behavior Research Methods, 51(2), 589-601. doi:10.3758/s13428-018-1161-1
data(lorbridge_data)
res <- blr_categorical(outcome = lorbridge_data$minority, predictor = lorbridge_data$VMbin, ref_level = "VM4")
print(res$results)
Fits a binary logistic regression model with a single continuous predictor, standardised to unit standard deviation (per 1 SD). Reports the log-odds ratio (LOR), odds ratio (OR), Wald confidence interval, p-value, Nagelkerke R-squared, and the full set of closeness-of-concordance measures (CCMs) on the [-1, +1] scale.
blr_continuous(outcome, predictor, alpha = 0.05)
outcome |
Integer or numeric vector. Binary outcome (0/1). |
predictor |
Numeric vector. Continuous predictor variable. Will be standardised internally (mean = 0, SD = 1) before fitting. |
alpha |
Numeric. Significance level for confidence intervals (default 0.05). |
The predictor is standardised as (x - mean(x)) / sd(x), so the
reported OR and LOR correspond to a one-standard-deviation increase.
Nagelkerke R-squared is computed as:
(1 - exp((2/n)(LL_null - LL_fit))) / (1 - exp((2/n) * LL_null)).
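The standardisation and the Nagelkerke formula above can be reproduced with base R alone. This is a sketch on simulated data; the variable names (`x`, `y`, `z`, `r2_nagelkerke`) are illustrative and are not objects returned by `blr_continuous()`.

```r
set.seed(1)                                   # simulated data, illustrative only
n <- 200
x <- rnorm(n, mean = 100, sd = 15)
y <- rbinom(n, 1, plogis(-0.5 + 0.04 * (x - 100)))

z    <- (x - mean(x)) / sd(x)                 # per-1-SD standardisation
fit  <- glm(y ~ z, family = binomial)         # fitted model
null <- glm(y ~ 1, family = binomial)         # intercept-only model

LL_fit  <- as.numeric(logLik(fit))
LL_null <- as.numeric(logLik(null))

r2_cox_snell  <- 1 - exp((2 / n) * (LL_null - LL_fit))  # numerator
r2_max        <- 1 - exp((2 / n) * LL_null)             # denominator
r2_nagelkerke <- r2_cox_snell / r2_max

c(LOR = unname(coef(fit)["z"]),
  OR  = unname(exp(coef(fit)["z"])),
  R2  = r2_nagelkerke)
```

The numerator is the Cox-Snell R-squared; dividing by its maximum attainable value rescales it so that the upper limit is 1.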
A named list with elements:
The fitted glm object.
A data.frame with LOR, OR, 95% CI, p-value,
Nagelkerke R-squared, and CCMs (YuleQ, YuleY, r_meta) with CIs.
Mean of the predictor used for standardisation.
SD of the predictor used for standardisation.
Kim, S.-K., & Grochowalski, J. H. (2019). Gaining from discretization of continuous data: The correspondence analysis biplot approach. Behavior Research Methods, 51(2), 589-601. doi:10.3758/s13428-018-1161-1
data(lorbridge_data)
res <- blr_continuous(outcome = lorbridge_data$minority, predictor = lorbridge_data$VM)
print(res$summary_table)
Computes a full row of closeness-of-concordance measures (CCMs) from an odds ratio (OR), its confidence interval endpoints, and the corresponding log-odds ratio (LOR). CCMs include Yule's Q, Yule's Y, and the meta-analytic correlation r_meta (obtained from the LOR via Cohen's d), all on the [-1, +1] scale introduced by Kim and Grochowalski (2019).
ccm_row(OR, OR_lo, OR_hi, LOR, LOR_lo, LOR_hi)
OR |
Numeric. Odds ratio point estimate. |
OR_lo |
Numeric. Lower confidence limit of the odds ratio. |
OR_hi |
Numeric. Upper confidence limit of the odds ratio. |
LOR |
Numeric. Log-odds ratio point estimate. |
LOR_lo |
Numeric. Lower confidence limit of the log-odds ratio. |
LOR_hi |
Numeric. Upper confidence limit of the log-odds ratio. |
Yule's Q = (OR - 1) / (OR + 1). Ranges from -1 to +1; for a 2x2 table it equals Goodman and Kruskal's gamma.
Yule's Y = (sqrt(OR) - 1) / (sqrt(OR) + 1). A shrunken version of Q with better sampling properties for sparse tables.
r_meta converts LOR to Cohen's d via d = LOR * sqrt(3) / pi, then to a correlation-like metric via d / sqrt(d^2 + 4). This is the d-to-r conversion commonly used in meta-analysis; the constant 4 assumes equal group sizes.
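The three formulas can be evaluated directly from a log-odds ratio. The helper below, `ccm_from_lor()`, is a hypothetical name used only for this sketch and is not part of the package; it returns point values only.

```r
# Hypothetical helper (not a package function): CCMs from a log-odds ratio
ccm_from_lor <- function(lor) {
  or <- exp(lor)
  q  <- (or - 1) / (or + 1)               # Yule's Q
  y  <- (sqrt(or) - 1) / (sqrt(or) + 1)   # Yule's Y
  d  <- lor * sqrt(3) / pi                # LOR -> Cohen's d
  r  <- d / sqrt(d^2 + 4)                 # d -> correlation-like metric
  c(OR = or, YuleQ = q, YuleY = y, r_meta = r)
}

ccm_from_lor(log(3))   # YuleQ is 0.5 when OR = 3
ccm_from_lor(0)        # all three CCMs are 0 when OR = 1
```

All three measures are monotone increasing in the LOR, so applying the same function to the lower and upper LOR limits gives limits on the new scale. Whether `ccm_row()` derives its `_lo` and `_hi` columns in exactly this way is an assumption.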
A one-row data.frame with columns:
OR, OR_lo, OR_hi, LOR, LOR_lo, LOR_hi,
YuleQ, Q_lo, Q_hi, YuleY, Y_lo, Y_hi,
r_meta, r_lo, r_hi.
Kim, S.-K., & Grochowalski, J. H. (2019). Gaining from discretization of continuous data: The correspondence analysis biplot approach. Behavior Research Methods, 51(2), 589-601. doi:10.3758/s13428-018-1161-1
lc <- lor_ci_2x2(30, 25, 28, 24)
ccm_row(exp(lc$lor), exp(lc$lo), exp(lc$hi), lc$lor, lc$lo, lc$hi)
Computes anchored cosine theta values from a 2 x J contingency table via a
direct 1D SVD of the standardised residual matrix, bypassing the
CAvariants function (which requires at least 2 dimensions). This
implements the Kim and Grochowalski (2019) log-odds ratio to cosine theta
bridge for 2-row tables.
cosine_theta_2row(tab_2xJ, ref_col)
tab_2xJ |
A 2 x J matrix or table. Row 1 = majority/reference group; row 2 = minority/focal group. |
ref_col |
Character. Name of the reference (anchor) column. |
A named numeric vector of cosine theta values, one per non-reference column. Values are +1 or -1 in the 1D case (sign carries the direction).
Kim, S.-K., & Grochowalski, J. H. (2019). Gaining from discretization of continuous data: The correspondence analysis biplot approach. Behavior Research Methods, 51(2), 589-601. doi:10.3758/s13428-018-1161-1
data(lorbridge_data)
tab <- table(lorbridge_data$minority, lorbridge_data$VMbin)
rownames(tab) <- c("Majority", "Minority")
cosine_theta_2row(tab, ref_col = "VM4")
Computes doubly-anchored cosine theta values for all non-anchor row and column contrasts in a DONSCA solution. Each cosine theta quantifies the geometric alignment between the direction (row_i - row_anchor) and (col_j - col_anchor) in the full multivariate CA space.
donsca_cosines(fit, col_anchor_idx, row_anchor_idx, dims = "all")
fit |
A fitted DONSCA object from |
col_anchor_idx |
Integer. Column index of the anchor (reference) column. |
row_anchor_idx |
Integer. Row index of the anchor (reference) row. |
dims |
Integer or |
A data.frame with columns: Row, Col, cos_theta.
Fits a Doubly-Ordered Nonsymmetric Correspondence Analysis (DONSCA) model
via CAvariants, using all available dimensions.
donsca_fit(tab)
tab |
A numeric matrix or table. Both rows and columns must represent ordered categories. |
A CAvariants fit object containing, among others,
Rprinccoord (row principal coordinates) and Cstdcoord (column
standard coordinates).
Computes the percentage of total inertia explained by each dimension in a SONSCA solution, using a direct SVD of the standardised residual matrix.
inertia_pct(tab)
tab |
A numeric matrix or table. |
Numeric vector of percent inertia values (summing to 100).
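As an illustration of how percent inertia follows from an SVD, here is a base-R sketch on an invented 3 x 3 table. It uses the symmetric-CA standardised residuals as a stand-in; the residual matrix that `inertia_pct()` decomposes for SONSCA may be scaled differently, so treat this as an assumption rather than the package's computation.

```r
# Hypothetical 3 x 3 table of counts (illustrative only)
tab <- matrix(c(20, 15, 10,
                10, 15, 20,
                 5, 10, 30),
              nrow = 3, byrow = TRUE)

P <- tab / sum(tab)                       # correspondence matrix
E <- outer(rowSums(P), colSums(P))        # expected proportions
S <- (P - E) / sqrt(E)                    # standardised residuals (assumed form)

ev  <- svd(S)$d^2                         # principal inertias
ev  <- ev[ev > 1e-12]                     # drop the trivial zero dimension
pct <- 100 * ev / sum(ev)                 # percent inertia per dimension
pct
```

Under this scaling the total inertia `sum(ev)` equals the Pearson chi-squared statistic divided by the table total.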
Computes the log-odds ratio (LOR) and its Wald confidence interval for a 2x2 contingency table, applying the Haldane-Anscombe continuity correction (adding 0.5 to all cells) when any cell count is zero.
lor_ci_2x2(a, b, c, d, alpha = 0.05, cc = 0.5)
a |
Numeric. Cell count: row 1 (focal), column 1 (focal). |
b |
Numeric. Cell count: row 1 (focal), column 2 (reference/anchor). |
c |
Numeric. Cell count: row 2 (reference/anchor), column 1 (focal). |
d |
Numeric. Cell count: row 2 (reference/anchor), column 2 (reference/anchor). |
alpha |
Numeric. Significance level for the confidence interval (default 0.05). |
cc |
Numeric. Continuity correction added to all cells when any count is zero (default 0.5, Haldane-Anscombe). |
A named list with elements:
Point estimate of the log-odds ratio.
Standard error of the log-odds ratio.
Lower bound of the Wald confidence interval.
Upper bound of the Wald confidence interval.
Haldane, J. B. S. (1956). The estimation and significance of the logarithm of a ratio of frequencies. Annals of Human Genetics, 20(4), 309-311.
# Compare Race1 vs Race2 at IQ bin 1 vs IQ bin 4 (reference)
lor_ci_2x2(a = 30, b = 25, c = 28, d = 24)
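For reference, the textbook Wald interval for a 2x2 log-odds ratio can be written in a few lines of base R. The function `wald_lor()` below is a hypothetical stand-in written for this sketch, not the source of `lor_ci_2x2()`; it assumes the standard error is sqrt(1/a + 1/b + 1/c + 1/d) and that the correction is applied to all four cells only when a zero is present, as described above.

```r
# Hypothetical re-implementation for illustration (not the package source)
wald_lor <- function(n11, n12, n21, n22, alpha = 0.05, cc = 0.5) {
  cells <- c(n11, n12, n21, n22)
  if (any(cells == 0)) cells <- cells + cc        # Haldane-Anscombe correction
  lor <- log((cells[1] * cells[4]) / (cells[2] * cells[3]))
  se  <- sqrt(sum(1 / cells))                     # Wald standard error
  z   <- qnorm(1 - alpha / 2)
  c(lor = lor, se = se, lo = lor - z * se, hi = lor + z * se)
}

wald_lor(30, 25, 28, 24)   # same counts as the example above
wald_lor(0, 25, 28, 24)    # a zero cell triggers the 0.5 correction
```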
An individual-level dataset with N = 900 observations, reconstructed by
row expansion from the vm_raw grouped data (SONSCA Analysis C). Each row
represents one individual with their Vocabulary Meaning (VM) test score,
discretised VM bin, binary minority/majority group indicator, and race
group label.
lorbridge_data
A data.frame with 900 rows and 4 columns:
Numeric. Raw Vocabulary Meaning score (range 54-149).
Factor. Discretised VM bin (VM1-VM6), with VM4 as the reference level. Breakpoints: <=64.28, <=81, <=100, <=121, <=138.36, >138.36.
Integer. Binary outcome: 1 = minority (Race1 + Race2 + Race3), 0 = majority (Race4).
Character. Race group label (Race1, Race2, Race3, Race4).
Reconstructed from the vm_raw table in Kim, S.-K. (2026),
unified SONSCA/DONSCA analysis script.
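Row expansion of grouped counts into individual-level records can be sketched as follows. The data frame `grouped` and its column names are invented for illustration and do not reproduce the vm_raw table.

```r
# Hypothetical grouped data: one row per (group, bin) cell with a count
grouped <- data.frame(
  group = c("G1", "G1", "G2", "G2"),
  bin   = c("B1", "B2", "B1", "B2"),
  n     = c(3, 2, 1, 4)
)

# Repeat each row index as many times as its count, then drop the count
idx   <- rep(seq_len(nrow(grouped)), times = grouped$n)
indiv <- grouped[idx, c("group", "bin")]
rownames(indiv) <- NULL

nrow(indiv)                      # equals sum(grouped$n)
table(indiv$group, indiv$bin)    # recovers the original cell counts
```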
Fits a multinomial logistic regression model with a single standardised continuous predictor (per 1 SD), using a specified baseline outcome category. Returns log-odds ratios, odds ratios, Wald CIs, and the full set of CCMs for each non-baseline outcome level.
mlr_ccm(outcome, predictor, baseline = NULL, alpha = 0.05)
outcome |
Factor or character vector. Polytomous outcome variable. |
predictor |
Numeric vector. Continuous predictor. Standardised internally to unit SD before fitting. |
baseline |
Character. Baseline/reference level of the outcome
(default: first level of |
alpha |
Numeric. Significance level for CIs (default 0.05). |
A data.frame with one row per non-baseline outcome level,
containing LOR, OR, 95% CI, and CCMs (YuleQ, YuleY, r_meta).
Generates bias-corrected and accelerated (BCa-style percentile) bootstrap confidence intervals for doubly-anchored cosine theta estimates from SONSCA. Resamples the contingency table under a multinomial model.
sonsca_bootstrap(tab, row_anchor, col_anchor, row_groups, col_groups, B = 2000, alpha = 0.05)
tab |
A numeric matrix or table for SONSCA. |
row_anchor |
Character. Row anchor label. |
col_anchor |
Character. Column anchor label. |
row_groups |
Character vector. Non-anchor row labels to include. |
col_groups |
Character vector. Non-anchor column labels to include. |
B |
Integer. Number of bootstrap replications (default 2000). |
alpha |
Numeric. Significance level for CIs (default 0.05). |
A named list with elements:
Numeric vector of point estimates (flattened row x col).
Lower CI bounds.
Upper CI bounds.
Number of successful bootstrap replications.
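The resampling scheme, drawing whole tables from a multinomial distribution with the observed cell proportions, can be sketched in base R. To keep the example short it swaps in a simpler statistic (the log-odds ratio of a 2x2 table) for cosine theta and uses a plain percentile interval; both substitutions are assumptions made for illustration and this is not the code of `sonsca_bootstrap()`.

```r
set.seed(42)
tab <- matrix(c(30, 25,
                28, 24), nrow = 2, byrow = TRUE)   # illustrative counts
n <- sum(tab)
p <- as.vector(tab) / n                            # observed cell proportions

# Stand-in statistic: log-odds ratio of a 2 x 2 table
stat <- function(m) log((m[1, 1] * m[2, 2]) / (m[1, 2] * m[2, 1]))

B <- 2000
boot <- replicate(B, {
  m <- matrix(rmultinom(1, size = n, prob = p), nrow = nrow(tab))
  if (any(m == 0)) NA_real_ else stat(m)           # skip degenerate resamples
})
boot  <- boot[!is.na(boot)]
n_ok  <- length(boot)                              # successful replications
alpha <- 0.05
ci    <- quantile(boot, c(alpha / 2, 1 - alpha / 2))

c(estimate = stat(tab), lower = ci[[1]], upper = ci[[2]], n_ok = n_ok)
```

Resamples for which the statistic cannot be computed are dropped, which is why the count of successful replications can be smaller than B.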
Computes pairwise closeness-of-concordance measures (CCMs) for a single (row, column) contrast against the anchor (row_anchor, col_anchor) in a SONSCA contingency table.
sonsca_ccm(tab, row_k, bin_j, row_anchor, col_anchor, alpha = 0.05)
tab |
A numeric matrix or table. |
row_k |
Character. Focal row label. |
bin_j |
Character. Focal column label. |
row_anchor |
Character. Row anchor label. |
col_anchor |
Character. Column anchor label. |
alpha |
Numeric. Significance level (default 0.05). |
A one-row data.frame with columns: Race, Bin, and all CCM
columns from ccm_row().
Extracts row standard coordinates and column principal coordinates from a
Singly-Ordered Nonsymmetric Correspondence Analysis (SONSCA) fit via
CAvariants. This is the column-isometric scaling recommended for
cosine theta computation.
sonsca_coords(tab)
tab |
A numeric matrix or table. Rows are nominal categories (e.g., racial groups); columns are ordered categories (e.g., score bins). |
A named list with elements:
Row standard coordinate matrix (rows = row categories).
Column principal coordinate matrix (rows = column categories).
Computes the matrix of doubly-anchored cosine theta values between all non-anchor row and column pairs in SONSCA coordinate space.
sonsca_cosines(row_coords, col_coords, row_anchor, col_anchor)
row_coords |
Row coordinate matrix (from |
col_coords |
Column coordinate matrix (from |
row_anchor |
Character. Row label used as the anchor (reference). |
col_anchor |
Character. Column label used as the anchor (reference). |
A matrix of cosine theta values with rows = non-anchor row
categories and columns = non-anchor column categories. Anchor rows/
columns yield NA (zero displacement).
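A doubly-anchored cosine is the cosine of the angle between a row displacement (row minus row anchor) and a column displacement (column minus column anchor). The sketch below uses invented 2D coordinates and a hypothetical helper, `anchored_cosine()`; unlike the return value described above, it leaves the anchor row and column out of the result instead of reporting them as NA.

```r
# Hypothetical helper (not the package source): cosines between anchored
# row and column displacement vectors
anchored_cosine <- function(row_coords, col_coords, row_anchor, col_anchor) {
  rows <- setdiff(rownames(row_coords), row_anchor)
  cols <- setdiff(rownames(col_coords), col_anchor)
  out  <- matrix(NA_real_, length(rows), length(cols),
                 dimnames = list(rows, cols))
  for (i in rows) {
    for (j in cols) {
      u <- row_coords[i, ] - row_coords[row_anchor, ]   # row displacement
      v <- col_coords[j, ] - col_coords[col_anchor, ]   # column displacement
      out[i, j] <- sum(u * v) / sqrt(sum(u^2) * sum(v^2))
    }
  }
  out
}

# Made-up coordinates, chosen so the angles are easy to read off
rc <- rbind(R1 = c(0, 0), R2 = c(1, 0), R3 = c(0, 1))
cc <- rbind(C1 = c(0, 0), C2 = c(2, 0), C3 = c(-1, 0))
anchored_cosine(rc, cc, row_anchor = "R1", col_anchor = "C1")
```

In this toy layout R2 points the same way as C2 (cosine 1), the opposite way to C3 (cosine -1), and R3 is orthogonal to both (cosine 0).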
A 4 x 6 contingency table cross-classifying four racial groups by six discretised IQ score bins. Used in SONSCA Analysis A.
tab_IQ
A numeric matrix with 4 rows (Race1, Race2, Race3, Race4) and 6 columns (IQ1-IQ6).
Kim, S.-K. (2026), unified SONSCA/DONSCA analysis script, Analysis A.
A 6 x 6 contingency table cross-classifying six discretised IQ bins by six discretised Vocabulary Meaning (VM) bins. Both rows and columns are ordered, making this suitable for DONSCA.
tab_IQ_VM
A numeric matrix with 6 rows (IQ1-IQ6) and 6 columns (VM1-VM6).
Kim, S.-K. (2026), unified SONSCA/DONSCA analysis script, Analysis 3a.
A 4 x 6 contingency table cross-classifying four racial groups by six discretised Vocabulary Meaning (VM) score bins. Used in SONSCA Analysis B.
tab_VM
A numeric matrix with 4 rows (Race1, Race2, Race3, Race4) and 6 columns (VM1-VM6).
Kim, S.-K. (2026), unified SONSCA/DONSCA analysis script, Analysis B.