mtb: Crosstab Summary with crosstab_from_list()

library(mtb)
library(knitr)
Note
This vignette was originally generated by AI and then edited by the maintainer.

Background

[Link to Python version of the same function]

crosstab_from_list() produces a contingency table (crosstab) from a data frame, supporting multi-column row and column keys. When perct_within_index is supplied, within-group percentages are computed alongside the raw counts.

The function returns a named list:

  • count — raw counts with All margin row and column
  • percent — within-group percentages (or NULL if no perct_within_index)
  • report — formatted strings combining counts and percentages (or NULL)
  • total — group totals used for percentage calculation (or NULL)

Example data

df <- data.frame(
    A = c('foo','foo', 'foo', 'foo', 'foo', 'foo', 'bar', 'bar', 'baz', 'baz', 'baz'),
    B = c('one','one','one','one', 'one', 'two', 'two', 'one', 'one', 'two', 'two'),
    C = c('y','x','x','x', 'y', 'x', 'y', 'x', 'y', 'x', 'y'),
    D = c('apple','apple','apple','apple', 'banana', 'apple', 
          'banana', 'apple', 'banana', 'apple', 'banana'),
    E = c('red','blue','red','red', 'red', 'blue', 'blue', 'red', 'blue', 'red', 'blue'),
    value = c(0,1,1,1, 2, 3, 4, 5, 6, 7, 8)
)

The data has 11 rows and 6 columns. Columns A, B, and C will serve as row keys, and D and E as column keys. The value column is not used in the examples below.

Basic crosstab (counts only)

When perct_within_index is left unset (NULL), only count is populated; percent, report, and total are NULL.

result_count <- crosstab_from_list(
    df   = df,
    rows = c("A", "B", "C"),
    cols = c("D", "E")
)
kable(result_count$count)
A    B    C    apple | blue  banana | blue  apple | red  banana | red  All
bar  one  x    0             0              1            0             1
bar  two  y    0             1              0            0             1
baz  one  y    0             1              0            0             1
baz  two  x    0             0              1            0             1
baz  two  y    0             1              0            0             1
foo  one  x    1             0              2            0             3
foo  one  y    0             0              1            1             2
foo  two  x    1             0              0            0             1
All  All  All  2             3              5            1             11
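
As a cross-check on the counts above, a similar table can be built with base R alone. This is a minimal sketch, not a description of how crosstab_from_list() works internally; row_key and col_key are illustrative names, and base R labels the margins Sum rather than All.

```r
# Same example data as above (the unused value column is omitted).
df <- data.frame(
    A = c("foo", "foo", "foo", "foo", "foo", "foo", "bar", "bar", "baz", "baz", "baz"),
    B = c("one", "one", "one", "one", "one", "two", "two", "one", "one", "two", "two"),
    C = c("y", "x", "x", "x", "y", "x", "y", "x", "y", "x", "y"),
    D = c("apple", "apple", "apple", "apple", "banana", "apple",
          "banana", "apple", "banana", "apple", "banana"),
    E = c("red", "blue", "red", "red", "red", "blue", "blue", "red", "blue", "red", "blue")
)

# Collapse each multi-column key into a single factor; drop = TRUE keeps
# only the combinations that actually occur in the data.
row_key <- interaction(df$A, df$B, df$C, sep = " ", drop = TRUE, lex.order = TRUE)
col_key <- interaction(df$D, df$E, sep = " | ", drop = TRUE)

# Cross-tabulate, then append row and column totals (named "Sum").
counts <- addmargins(table(row_key, col_key))
print(counts)
```

Individual cells can then be looked up by label, for example counts["foo one x", "apple | red"].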

Crosstab with row-wise percentages

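Setting perct_within_index = "A" computes percentages within each level of A (a row-key variable): every cell is divided by the number of observations in its A group, so the percentages in the All column sum to 100% within each group. In the output below, the All row is expressed as a share of the grand total (11).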

result <- crosstab_from_list(
    df                 = df,
    rows               = c("A", "B", "C"),
    cols               = c("D", "E"),
    perct_within_index = "A",
    col_margin_perct   = TRUE,
    row_margin_perct   = TRUE,
    report_type        = 1
)
kable(result$report)
A    B    C    apple | blue  banana | blue  apple | red  banana | red  All
bar  one  x    0 (0%)        0 (0%)         1 (50%)      0 (0%)        1 (50%)
bar  two  y    0 (0%)        1 (50%)        0 (0%)       0 (0%)        1 (50%)
baz  one  y    0 (0%)        1 (33.3%)      0 (0%)       0 (0%)        1 (33.3%)
baz  two  x    0 (0%)        0 (0%)         1 (33.3%)    0 (0%)        1 (33.3%)
baz  two  y    0 (0%)        1 (33.3%)      0 (0%)       0 (0%)        1 (33.3%)
foo  one  x    1 (16.7%)     0 (0%)         2 (33.3%)    0 (0%)        3 (50%)
foo  one  y    0 (0%)        0 (0%)         1 (16.7%)    1 (16.7%)     2 (33.3%)
foo  two  x    1 (16.7%)     0 (0%)         0 (0%)       0 (0%)        1 (16.7%)
All  All  All  2 (18.2%)     3 (27.3%)      5 (45.5%)    1 (9.1%)      11 (100%)
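
The within-group arithmetic behind these percentages can be sketched in base R. The snippet below is an illustration under the assumption that each cell is simply divided by the size of its A group, which is what the output above suggests; cells and group_total are illustrative names, not part of mtb.

```r
# Same example data as above (the unused value column is omitted).
df <- data.frame(
    A = c("foo", "foo", "foo", "foo", "foo", "foo", "bar", "bar", "baz", "baz", "baz"),
    B = c("one", "one", "one", "one", "one", "two", "two", "one", "one", "two", "two"),
    C = c("y", "x", "x", "x", "y", "x", "y", "x", "y", "x", "y"),
    D = c("apple", "apple", "apple", "apple", "banana", "apple",
          "banana", "apple", "banana", "apple", "banana"),
    E = c("red", "blue", "red", "red", "red", "blue", "blue", "red", "blue", "red", "blue")
)

# One row per observed (A, B, C, D, E) combination, with its count in Freq.
cells <- as.data.frame(table(df[c("A", "B", "C", "D", "E")]),
                       stringsAsFactors = FALSE)
cells <- cells[cells$Freq > 0, ]

# Denominator: the number of observations in each level of A.
group_total <- table(df$A)
cells$total <- as.integer(group_total[cells$A])

# Within-group percentage, left unrounded here.
cells$percent <- 100 * cells$Freq / cells$total

# The percentages add up to 100 inside every A group.
print(tapply(cells$percent, cells$A, sum))
```

For instance, the foo / one / x / apple / red cell has a count of 2 in a group of 6, which gives the 33.3% shown in the report.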

report_type = 2: showing count/total

Setting report_type = 2 formats each cell as count/total (percent%), which makes the denominator explicit.

result2 <- crosstab_from_list(
    df                 = df,
    rows               = c("A", "B", "C"),
    cols               = c("D", "E"),
    perct_within_index = "A",
    col_margin_perct   = TRUE,
    row_margin_perct   = TRUE,
    report_type        = 2
)
kable(result2$report)
A    B    C    apple | blue  banana | blue  apple | red   banana | red  All
bar  one  x    0/2 (0%)      0/2 (0%)       1/2 (50%)     0/2 (0%)      1/2 (50%)
bar  two  y    0/2 (0%)      1/2 (50%)      0/2 (0%)      0/2 (0%)      1/2 (50%)
baz  one  y    0/3 (0%)      1/3 (33.3%)    0/3 (0%)      0/3 (0%)      1/3 (33.3%)
baz  two  x    0/3 (0%)      0/3 (0%)       1/3 (33.3%)   0/3 (0%)      1/3 (33.3%)
baz  two  y    0/3 (0%)      1/3 (33.3%)    0/3 (0%)      0/3 (0%)      1/3 (33.3%)
foo  one  x    1/6 (16.7%)   0/6 (0%)       2/6 (33.3%)   0/6 (0%)      3/6 (50%)
foo  one  y    0/6 (0%)      0/6 (0%)       1/6 (16.7%)   1/6 (16.7%)   2/6 (33.3%)
foo  two  x    1/6 (16.7%)   0/6 (0%)       0/6 (0%)      0/6 (0%)      1/6 (16.7%)
All  All  All  2/11 (18.2%)  3/11 (27.3%)   5/11 (45.5%)  1/11 (9.1%)   11/11 (100%)
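
The two cell layouts can be mimicked with a few lines of string formatting. format_cell() below is a hypothetical helper written for this illustration, not a function exported by mtb, and it only approximates the formatting seen in the tables above.

```r
# Hypothetical helper (not part of mtb): builds one report cell from a
# count and its group total.
format_cell <- function(count, total, report_type = 1) {
    pct <- round(100 * count / total, 1)
    if (report_type == 2) {
        # count/total (percent%)
        paste0(count, "/", total, " (", pct, "%)")
    } else {
        # count (percent%)
        paste0(count, " (", pct, "%)")
    }
}

format_cell(2, 6)                     # "2 (33.3%)"
format_cell(2, 6, report_type = 2)    # "2/6 (33.3%)"
format_cell(11, 11, report_type = 2)  # "11/11 (100%)"
```

Showing the denominator is most useful when group sizes differ, as they do here (2, 3, and 6): the same count of 1 is 50% in bar but only 16.7% in foo.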

Accessing individual components

The four list elements can be accessed independently:

# Raw counts
kable(result$count)
A    B    C    apple | blue  banana | blue  apple | red  banana | red  All
bar  one  x    0             0              1            0             1
bar  two  y    0             1              0            0             1
baz  one  y    0             1              0            0             1
baz  two  x    0             0              1            0             1
baz  two  y    0             1              0            0             1
foo  one  x    1             0              2            0             3
foo  one  y    0             0              1            1             2
foo  two  x    1             0              0            0             1
All  All  All  2             3              5            1             11

# Percentages
kable(result$percent)
A    B    C    apple | blue  banana | blue  apple | red  banana | red  All
bar  one  x    0.0           0.0            50.0         0.0           50.0
bar  two  y    0.0           50.0           0.0          0.0           50.0
baz  one  y    0.0           33.3           0.0          0.0           33.3
baz  two  x    0.0           0.0            33.3         0.0           33.3
baz  two  y    0.0           33.3           0.0          0.0           33.3
foo  one  x    16.7          0.0            33.3         0.0           50.0
foo  one  y    0.0           0.0            16.7         16.7          33.3
foo  two  x    16.7          0.0            0.0          0.0           16.7
All  All  All  18.2          27.3           45.5         9.1           100.0

# Group totals
kable(result$total)
A    B    C    apple | blue  banana | blue  apple | red  banana | red  All
bar  one  x    2             2              2            2             2
bar  two  y    2             2              2            2             2
baz  one  y    3             3              3            3             3
baz  two  x    3             3              3            3             3
baz  two  y    3             3              3            3             3
foo  one  x    6             6              6            6             6
foo  one  y    6             6              6            6             6
foo  two  x    6             6              6            6             6
All  All  All  11            11             11           11            11