3  Author’s info: lifespan

MARC: 100d

3.0.1 Author lifetime (MARC 100d)

MARC field [100$d] records the author’s lifetime (typically birth and death years) in free-text form. In the Fennica data, this field exhibits substantial heterogeneity in formatting, including approximate dates, alternative values, historical notations (e.g. BCE/CE), and non-standard textual descriptions.

To enable quantitative analysis, the field is processed through a multi-step cleaning and harmonization workflow. The procedure combines information from the original Fennica records and the Kanto authority data, prioritizing structured Kanto values where available. The harmonization includes:

  • normalization of textual expressions (e.g. “noin”, “tai”, “kuollut”)
  • resolution of alternative dates by selecting a consistent representative value
  • conversion of historical notation (e.g. BCE/CE) into numeric form
  • simplification of complex patterns (e.g. bracketed values, Extended Date/Time Format)
  • removal of non-lifespan information (e.g. “toiminta-aika”, centuries such as “1700-luku”)
  • standardization into numeric birth (from) and death (till) years
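
The steps listed above can be illustrated with a minimal Python sketch. It is not the actual Fennica pipeline: the function names `parse_year` and `clean_lifetime`, the regular expressions, the rule of keeping the first of two alternative years, and the representation of BCE years as negative integers are all assumptions made for illustration.

```python
import re
from typing import Optional, Tuple

def parse_year(token: str) -> Optional[int]:
    """Parse one side of a lifespan; BCE years become negative (assumption)."""
    m = re.fullmatch(r"(\d{1,4})\s*(bce|ce)?", token.strip())
    if m is None:
        return None
    year = int(m.group(1))
    return -year if m.group(2) == "bce" else year

def clean_lifetime(raw: str) -> Tuple[Optional[int], Optional[int]]:
    """Return (from, till); None marks a year that could not be extracted."""
    s = raw.strip().lower().rstrip(".,").replace("–", "-")
    # Non-lifespan information: activity periods and centuries.
    if "toiminta-aika" in s or re.search(r"\d+-luku", s):
        return (None, None)
    # Drop approximation markers, brackets and question marks.
    s = re.sub(r"\bnoin\b|[\[\]?]", "", s)
    # Alternatives such as "1850 tai 1851": keep the first value (assumed rule).
    s = re.sub(r"(\d{1,4})\s*tai\s*\d{1,4}", r"\1", s)
    # "kuollut 1701" carries a death year only.
    m = re.fullmatch(r"\s*kuollut\s+(.+)", s)
    if m:
        return (None, parse_year(m.group(1)))
    birth, _, death = s.partition("-")
    return (parse_year(birth), parse_year(death))
```

For example, `clean_lifetime("noin 1700-1765")` yields `(1700, 1765)` under these rules, while `clean_lifetime("1700-luku")` yields `(None, None)` because a century is not a lifespan.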

The outcome is divided into two categories:

  • Accepted entries: records where at least one valid year (birth or death) could be reliably extracted and represented numerically
  • Discarded entries: records where the information could not be interpreted as a valid lifespan, or where ambiguity or format complexity prevented reliable extraction
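
The split into accepted and discarded entries could be expressed roughly as follows. This is a sketch under stated assumptions: the function name `classify` is hypothetical, and the rule that a death year preceding the birth year leads to a discard is an illustrative guess at what "reliable" might mean, not a documented criterion.

```python
from typing import Optional

def classify(birth: Optional[int], death: Optional[int]) -> str:
    """Label a harmonized (from, till) pair as 'accepted' or 'discarded'."""
    # No usable year on either side: nothing to analyse.
    if birth is None and death is None:
        return "discarded"
    # Assumption: an inverted lifespan is treated as unreliable.
    if birth is not None and death is not None and death < birth:
        return "discarded"
    # At least one valid year is enough for acceptance.
    return "accepted"
```

A record with only a death year, such as `classify(None, 1701)`, is still accepted, which matches the "at least one valid year" criterion above.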

In addition, a separate set of source conflicts is identified, where cleaned values derived from Fennica and Kanto disagree. These cases are retained for further inspection and potential feedback to the data providers.
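
One possible way to combine the two sources and flag disagreements is sketched below. The function name `merge_sources` and the tuple layout `(from, till)` are assumptions. The sketch prefers the Kanto value whenever one exists and reports a conflict only when both sources provide a year and the years differ.

```python
from typing import Optional, Tuple

Lifespan = Tuple[Optional[int], Optional[int]]

def merge_sources(fennica: Lifespan, kanto: Lifespan) -> Tuple[Lifespan, bool]:
    """Merge cleaned (from, till) pairs, preferring Kanto; flag conflicts."""
    merged = []
    conflict = False
    for f_year, k_year in zip(fennica, kanto):
        # A conflict requires two present, differing values.
        if f_year is not None and k_year is not None and f_year != k_year:
            conflict = True
        # Kanto takes priority; Fennica fills the gaps.
        merged.append(k_year if k_year is not None else f_year)
    return (merged[0], merged[1]), conflict
```

With this rule, `merge_sources((1804, 1877), (1805, 1877))` returns `((1805, 1877), True)`, so the record would be listed among the source conflicts while still carrying the Kanto value.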

3.1 Complete Dataset Overview

The field has 467691 (35.5%) non-missing values; the remaining 850365 records (64.5%) have no lifetime information.
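
The reported shares follow directly from the two counts. A small check, assuming the percentages are taken over the sum of non-missing and missing records (the helper name `coverage` is hypothetical):

```python
def coverage(non_missing: int, missing: int) -> tuple:
    """Return the (non-missing, missing) shares in percent, one decimal."""
    total = non_missing + missing
    return (round(100 * non_missing / total, 1),
            round(100 * missing / total, 1))

complete = coverage(467691, 850365)
```

The same helper applied to the 1809-1917 subset counts (31686 and 41363) gives the shares reported for that subset.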

Author age is available for 191166 records (14.5%). The observed ages range from 15 to 108.

Author date accepted for the complete Fennica

Author date discarded for the complete Fennica

Author date source conflicts for the complete Fennica and Kanto

Harmonized dataset for author dates

3.2 Subset 1809-1917

The field has 31686 (43.4%) non-missing values; the remaining 41363 records (56.6%) have no lifetime information.

Author date accepted for the complete Fennica

Harmonized dataset for author dates 1809-1917