All GEDCOM records are given unique identifiers known as xrefs (cross-references) to allow other records to link to them. These are alphanumeric strings surrounded by ‘@’ symbols. The tidyged package creates these xrefs automatically:
```r
library(tidyged)

simpsons <- gedcom(subm("Me")) |>
  add_indi(sex = "M") |>
  add_indi_names(name_pieces(given = "Homer", surname = "Simpson")) |>
  add_indi(sex = "F") |>
  add_indi_names(name_pieces(given = "Marge", surname = "Simpson")) |>
  add_indi(sex = "F") |>
  add_indi_names(name_pieces(given = "Lisa", surname = "Simpson")) |>
  add_indi(sex = "M") |>
  add_indi_names(name_pieces(given = "Bart", surname = "Simpson")) |>
  add_note("This is a note")
#> Added Male Individual: @I1@
#> Added Female Individual: @I2@
#> Added Female Individual: @I3@
#> Added Male Individual: @I4@
#> Added Note: @N1@
```
```r
dplyr::filter(simpsons, tag %in% c("INDI", "NOTE")) |>
  knitr::kable()
```
level | record | tag | value |
---|---|---|---|
0 | @I1@ | INDI | |
0 | @I2@ | INDI | |
0 | @I3@ | INDI | |
0 | @I4@ | INDI | |
0 | @N1@ | NOTE | This is a note |
Note the unique xrefs in the record column.
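As a quick illustration of the xref format described above, the following base R sketch checks whether a string looks like an xref. The helper name is_xref() is hypothetical and not part of tidyged, and the pattern assumes xrefs contain only ASCII letters and digits between the ‘@’ symbols, which may not match the GEDCOM specification exactly.

```r
# Hypothetical helper (not part of tidyged): TRUE where a string is
# alphanumeric text wrapped in '@' symbols, e.g. "@I1@".
is_xref <- function(x) {
  grepl("^@[A-Za-z0-9]+@$", x)
}

is_xref(c("@I1@", "@N1@", "I1", "@@"))
#> [1]  TRUE  TRUE FALSE FALSE
```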
In the above example, a series of records is created (these will be explained in more detail in the following articles). After each Individual record is created, the individual's name is defined without explicitly referencing the Individual record. This is because these functions act on the active record. A record becomes active when it is created or when it is explicitly activated.
We can query the active record using the active_record() function:
```r
active_record(simpsons)
#> [1] "@N1@"
```
Since the last record to be created was the Note record, it is the active record. The active record is stored as an attribute of the tibble.
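To make the idea of "stored as an attribute" concrete, here is a minimal base R sketch using an ordinary data frame. The attribute name "active_record" and the helpers set_active() and get_active() are assumptions for illustration only; they are not tidyged's API, and the attribute name tidyged uses internally may differ.

```r
# Toy stand-in for a tidyged object: a plain data frame of records.
records <- data.frame(
  level  = c(0, 0),
  record = c("@I1@", "@N1@"),
  tag    = c("INDI", "NOTE")
)

# Hypothetical helpers: store and read the active xref as an attribute.
set_active <- function(x, xref) {
  attr(x, "active_record") <- xref
  x
}
get_active <- function(x) attr(x, "active_record")

records <- set_active(records, "@N1@")
get_active(records)
#> [1] "@N1@"
```

The attribute travels with the object without adding a column, so the record data itself is left untouched.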
We can use activation to add to existing records. To activate another record, pass its xref to the relevant function in the activate_*() family:
```r
simpsons |>
  activate_indi("@I2@") |>
  active_record()
#> [1] "@I2@"
```
There are many other functions in the gedcompendium that take record xrefs as input parameters and it can be tedious to have to manually look these up. The tidyged package offers a number of helper functions to locate specific xrefs using pattern matching:
```r
find_indi_name(simpsons, "Bart")
#> [1] "@I4@"
find_indi_name_all(simpsons, "Simpson")
#> [1] "@I1@" "@I2@" "@I3@" "@I4@"
```
These helper functions begin with find_* and act as wrappers to the more general function find_xref(). It’s straightforward to write your own wrapper if you’re familiar with the tags used in the GEDCOM specification.
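As a rough sketch of such a wrapper, the function below searches on an individual's sex rather than name. It mirrors the find_xref() call that appears in the error messages later in this article (the search_patterns and multiple arguments). The function name find_indi_sex_all() and the tag path INDI.SEX are assumptions for illustration and should be checked against the find_xref() documentation.

```r
# Hypothetical wrapper (not part of tidyged): return the xrefs of all
# individuals whose SEX value matches `pattern`.
# Assumption: "INDI.SEX" is a valid tag path for find_xref().
find_indi_sex_all <- function(gedcom, pattern) {
  tidyged::find_xref(gedcom,
                     search_patterns = c(INDI.SEX = pattern),
                     multiple = TRUE)
}

# Possible usage with the simpsons object built earlier:
# find_indi_sex_all(simpsons, "F")
```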
In the activation example, we would activate Marge’s record with:
```r
simpsons |>
  activate_indi(find_indi_name(simpsons, "Marge")) |>
  active_record()
#> [1] "@I2@"
```
Note that the full name does not need to be given, since the term is partially matched. As long as it is detected in the name of the individual it will be found.
In this use case, if no match or more than one match is found, it will result in an error:
```r
simpsons |>
  activate_indi(find_indi_name(simpsons, "Simpon")) |>
  active_record()
#> Error in find_xref(gedcom, search_patterns = c(INDI.NAME = pattern), multiple = FALSE, : No records found that match all patterns.

simpsons |>
  activate_indi(find_indi_name(simpsons, "Simpson")) |>
  active_record()
#> Error in find_xref(gedcom, search_patterns = c(INDI.NAME = pattern), multiple = FALSE, : No unique records found that match all patterns. Try being more specific.
```
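The behaviour shown above (partial matching, with an error unless exactly one record matches) can be sketched in base R as follows. This is not tidyged's implementation: find_one() is a hypothetical helper, the named character vector stands in for the GEDCOM object, and the sketch assumes matching is a simple case-sensitive regular-expression search.

```r
# Toy lookup table: names are xrefs, values are individuals' full names.
people <- c("@I1@" = "Homer Simpson", "@I2@" = "Marge Simpson",
            "@I3@" = "Lisa Simpson",  "@I4@" = "Bart Simpson")

# Hypothetical helper: return the single xref whose name contains `pattern`,
# or stop with an error if there are zero or several matches.
find_one <- function(people, pattern) {
  hits <- names(people)[grepl(pattern, people)]
  if (length(hits) == 0) stop("No records found that match the pattern.")
  if (length(hits) > 1) stop("No unique record found. Try being more specific.")
  hits
}

find_one(people, "Marge")
#> [1] "@I2@"
```

Calling find_one(people, "Simpon") or find_one(people, "Simpson") stops with an error, mirroring the two failure cases above.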
When removing entire records, you don’t necessarily have to activate them first. The same referencing techniques can be used to remove records directly:
```r
simpsons |>
  remove_indi(find_indi_name(simpsons, "Homer")) |>
  df_indi() |>
  knitr::kable()
```
xref | name | sex | date_of_birth | place_of_birth | date_of_death | place_of_death | mother | father | num_siblings | num_children | last_modified |
---|---|---|---|---|---|---|---|---|---|---|---|
@I2@ | Marge Simpson | F | | | | | | | | 0 | 22 JUN 2022 |
@I3@ | Lisa Simpson | F | | | | | | | | 0 | 22 JUN 2022 |
@I4@ | Bart Simpson | M | | | | | | | | 0 | 22 JUN 2022 |
In all the examples so far, the approach has been to build up the tree one record at a time. A number of helper functions let you shortcut this laborious exercise by creating multiple records at once, including Family Group records, to which you can later add more detail. The functions are add_parents(), add_siblings(), add_spouse(), and add_children().
They all require the xref of an Individual record (or one to be activated), except for add_children(), which requires the xref of a Family Group record. These functions do not change the active record.
Because of this, you cannot use add_children() in a single pipeline with the other functions.
The feedback from these functions gives you the necessary xrefs to then add more detail.
To illustrate, we can build up two families starting with a spouse:
```r
from_spou <- gedcom(subm("Me")) |>
  add_indi(sex = "M") |>
  add_parents() |>
  add_siblings(sexes = "MMFF") |>
  add_spouse(sex = "F")
#> Added Male Individual: @I1@
#> Added Family Group: @F1@
#> Added Male Individual: @I2@
#> Added Female Individual: @I3@
#> Added Male Individual: @I4@
#> Added Male Individual: @I5@
#> Added Female Individual: @I6@
#> Added Female Individual: @I7@
#> Added Family Group: @F2@
#> Added Female Individual: @I8@
```
The initial individual (@I1@) gets added as a child to a family (@F1@) with two parents (@I2@ and @I3@) and four siblings (@I4@ to @I7@). Finally, he is given a spouse (@I8@) in his own family (@F2@).
Now that we have the xref of his family, we can add his two daughters:
```r
with_chil <- from_spou |>
  add_children(xref = "@F2@", sexes = "FF")
#> Added Female Individual: @I9@
#> Added Female Individual: @I10@
```
Now that we have the records, we can use these xrefs to add details such as names and facts.
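For instance, a name could be given to one of the daughters by activating her record and then calling add_indi_names(), in the same way as in the first example of this article. This sketch assumes the with_chil object created above; the given name and surname are placeholders.

```r
library(tidyged)

# Activate the first daughter's record (@I9@), then name her.
# The name pieces are illustrative placeholders.
with_names <- with_chil |>
  activate_indi("@I9@") |>
  add_indi_names(name_pieces(given = "Jane", surname = "Doe"))
```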
The tidyged.utils package contains the function add_ancestors() to create Individual and Family Group records for entire generations of ancestors.
Record identifiers have been a topic of much discussion in the GEDCOM user community. Although the tidyged package imports xref identifiers unchanged, some systems create their own xref identifiers on import, so you cannot assume they will survive a transfer between systems. However, they should always be internally consistent.
A couple of other mechanisms exist for providing unique identifiers to records:
- [...] The tidyged package does not use this, nor expose it to a user;
- [...] (find_*_refn()).

For these reasons, neither of these mechanisms is considered to be a better alternative way of selecting records.