This function takes input from two or more columns and allows the contents to
be merged into a single column by using a pattern that specifies the
arrangement. We can specify which columns to merge together in the columns
argument. The string-combining pattern is to be provided in the pattern
argument. The first column in the columns
series operates as the target
column (i.e., the column that will undergo mutation) whereas all following
columns
will be untouched. There is the option to hide the non-target
columns (i.e., second and subsequent columns given in columns
). The
formatting of values in different columns will be preserved upon merging.
Usage
cols_merge(
data,
columns,
hide_columns = columns[-1],
rows = everything(),
pattern = NULL
)
Arguments
- data
The gt table data object
obj:<gt_tbl>
// requiredThis is the gt table object that is commonly created through use of the
gt()
function.- columns
Columns to target
<column-targeting expression>
// requiredThe columns for which the merging operations should be applied. The first column resolved will be the target column (i.e., undergo mutation) and the other columns will serve to provide input. Can either be a series of column names provided in
c()
, a vector of column indices, or a select helper function (e.g.starts_with()
,ends_with()
,contains()
,matches()
,num_range()
, andeverything()
). A vector is recommended because in that case we are absolutely certain about the order of columns, and, that order information is needed for this and other arguments.- hide_columns
Subset of
columns
to hide<column-targeting expression>|FALSE
// default:columns[-1]
Any column names provided here will have their state changed to
hidden
(via internal use ofcols_hide()
) if they aren't already hidden. This is convenient if the shared purpose of these specified columns is only to provide string input to the target column. To suppress any hiding of columns,FALSE
can be used here.- rows
Rows to target
<row-targeting expression>
// default:everything()
In conjunction with
columns
, we can specify which of their rows should participate in the merging process. The defaulteverything()
results in all rows incolumns
being formatted. Alternatively, we can supply a vector of row IDs withinc()
, a vector of row indices, or a select helper function (e.g.starts_with()
,ends_with()
,contains()
,matches()
,num_range()
, andeverything()
). We can also use expressions to filter down to the rows we need (e.g.,[colname_1] > 100 & [colname_2] < 50
).- pattern
Formatting pattern
scalar<character>
// default:NULL
(optional
)A formatting pattern that specifies the arrangement of the
columns
values and any string literals. The pattern uses numbers (within{ }
) that correspond to the indices of columns provided incolumns
. If two columns are provided incolumns
and we would like to combine the cell data onto the first column,"{1} {2}"
could be used. If a pattern isn't provided then a space-separated pattern that includes allcolumns
will be generated automatically. Further details are provided in the How thepattern
works section.
How the pattern
works
There are two types of templating for the pattern
string:
{ }
for arranging single column values in a row-wise fashion<< >>
to surround spans of text that will be removed if any of the contained{ }
yields a missing value
Integer values are placed in { }
and those values correspond to the columns
involved in the merge, in the order they are provided in the columns
argument. So the pattern "{1} ({2}-{3})"
corresponds to the target column
value listed first in columns
and the second and third columns cited
(formatted as a range in parentheses). With hypothetical values, this might
result as the merged string "38.2 (3-8)"
.
Because some values involved in merging may be missing, it is likely that
something like "38.2 (3-NA)"
would be undesirable. For such cases, placing
sections of text in << >>
results in the entire span being eliminated if
there were to be an NA
value (arising from { }
values). We could instead
opt for a pattern like "{1}<< ({2}-{3})>>"
, which results in "38.2"
if
either columns {2}
or {3}
have an NA
value. We can even use a more
complex nesting pattern like "{1}<< ({2}-<<{3}>>)>>"
to retain a lower
limit in parentheses (where {3}
is NA
) but remove the range altogether
if {2}
is NA
.
One more thing to note here is that if sub_missing()
is used on values in
a column, those specific values affected won't be considered truly missing by
cols_merge()
(since it's been handled with substitute text). So, the
complex pattern "{1}<< ({2}-<<{3}>>)>>"
might result in something like
"38.2 (3-limit)"
if sub_missing(..., missing_text = "limit")
were used
on the third column supplied in columns
.
Comparison with other column-merging functions
There are three other column-merging functions that offer specialized
behavior that is optimized for common table tasks: cols_merge_range()
,
cols_merge_uncert()
, and cols_merge_n_pct()
. These functions operate
similarly, where the non-target columns can be optionally hidden from the
output table through the autohide
option.
Examples
Use a subset of the sp500
dataset to create a gt table. Use the
cols_merge()
function to merge the open
& close
columns together, and,
the low
& high
columns (putting an em dash between both). Relabel the
columns with cols_label()
.
sp500 |>
dplyr::slice(50:55) |>
dplyr::select(-volume, -adj_close) |>
gt() |>
cols_merge(
columns = c(open, close),
pattern = "{1}—{2}"
) |>
cols_merge(
columns = c(low, high),
pattern = "{1}—{2}"
) |>
cols_label(
open = "open/close",
low = "low/high"
)
Use a portion of gtcars
to create a gt table. Use the cols_merge()
function to merge the trq
& trq_rpm
columns together, and, the mpg_c
&
mpg_h
columns. Given the presence of NA
values, we can use patterns that
drop parts of the output text whenever missing values are encountered.
gtcars |>
dplyr::filter(year == 2017) |>
dplyr::select(mfr, model, starts_with(c("trq", "mpg"))) |>
gt() |>
fmt_integer(columns = trq_rpm) |>
cols_merge(
columns = starts_with("trq"),
pattern = "{1}<< ({2} rpm)>>"
) |>
cols_merge(
columns = starts_with("mpg"),
pattern = "<<{1} city<</{2} hwy>>>>"
) |>
cols_label(
mfr = "Manufacturer",
model = "Car Model",
trq = "Torque",
mpg_c = "MPG"
)
See also
Other column modification functions:
cols_add()
,
cols_align()
,
cols_align_decimal()
,
cols_hide()
,
cols_label()
,
cols_label_with()
,
cols_merge_n_pct()
,
cols_merge_range()
,
cols_merge_uncert()
,
cols_move()
,
cols_move_to_end()
,
cols_move_to_start()
,
cols_nanoplot()
,
cols_unhide()
,
cols_units()
,
cols_width()