6  Genre/form 655

MARC: 655,a

6.1 Description

Field 655 (Genre/Form) contains genre and form descriptors recorded as free text in the bibliographic metadata. These entries are heterogeneous: multiple values may occur within a single record (typically separated by delimiters such as |), and labels vary in spelling, formatting, and level of standardization.

To enable consistent analysis, the field was harmonized using a controlled vocabulary:

  • Controlled vocabulary matching: extracted values were matched against a curated list derived from SLM – Suomalainen lajityyppi- ja muotosanasto, a Finnish genre and form thesaurus maintained within the Finto ontology service. This provides a standardized set of genre terms in Finnish.

  • Filtering of unmatched values: only values that could be directly matched to SLM labels were retained. All other entries were classified as unrecognized and excluded from the harmonized field. These discarded values are reported separately for transparency and quality assessment.

  • Reconstruction of harmonized field: validated labels were recombined into a semicolon-separated string (harmonized) per record. As SLM is a Finnish-language vocabulary, all retained genre labels are treated as Finnish, and no additional language assignment was required.

  • Missing data handling: records without valid matches were set to NA. The proportion of missing values reflects both absence of genre information and limitations in mapping to the controlled vocabulary.

This workflow produces a standardized and reproducible representation of genre/form information, enabling reliable aggregation and comparison across records while preserving information about discarded and ambiguous cases.
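The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline that produced the report: the function name `harmonize_655`, the tiny `SLM_LABELS` set, and the choice of `|` as the only delimiter are assumptions made for the example. Matching is exact after trimming whitespace, since the text says only directly matching values are retained.

```python
# Hypothetical stand-in for the curated SLM label list; the real list is much larger.
SLM_LABELS = {"väitöskirjat", "äänikirjat", "muistelmat", "arkkiveisut", "virret"}


def harmonize_655(raw, vocabulary=SLM_LABELS, delimiter="|"):
    """Return (harmonized, unrecognized) for one raw 655 value.

    harmonized is a semicolon-separated string of matched labels, or None
    (standing in for NA) when no label matched. unrecognized lists the
    discarded values so they can be reported separately.
    """
    if raw is None:
        return None, []

    accepted, unrecognized = [], []
    for part in raw.split(delimiter):
        value = part.strip()
        if not value:
            continue  # skip empty fragments such as those from a trailing delimiter
        if value in vocabulary:
            if value not in accepted:  # keep first occurrence, drop repeats
                accepted.append(value)
        else:
            unrecognized.append(value)

    harmonized = "; ".join(accepted) if accepted else None
    return harmonized, unrecognized


print(harmonize_655("virret | arkkiveisut"))
print(harmonize_655("väitöskirjat|Foo"))
```

Returning the discarded values alongside the harmonized string keeps the two outputs in step, so a report of unrecognized entries can be built from the same pass over the data.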

6.3 Complete Dataset Overview

Unique genre/form: 1213

There are 1070036 missing values in the dataset, accounting for 81.18% of the total.

Unrecognized genre_655s: lists the Genre/Form values that were discarded (0 in total).

Genre/Form            Entries (n)   Fraction (%)
väitöskirjat          75774         5.7
äänikirjat            32976         2.5
muistelmat            12565         1
tiedotuslehdet        9010          0.7
kartat                8323          0.6
elämäkerrat           4920          0.4
oppaat                4094          0.3
sukukirjat            3770          0.3
asiakaslehdet         3663          0.3
henkilöstölehdet      2751          0.2
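The summary figures in this section could be derived from the harmonized field roughly as follows. This is a sketch under stated assumptions: the function name `summarize` is hypothetical, `None` stands in for NA, "unique" is taken to mean distinct harmonized strings, and fractions are computed against all records including missing ones (which appears consistent with the reported percentages but is not stated in the report).

```python
from collections import Counter


def summarize(harmonized, top_n=10):
    """Summarize a list holding one harmonized value per record (None = NA)."""
    total = len(harmonized)
    if total == 0:
        return {"unique": 0, "missing": 0, "missing_pct": 0.0, "top": []}

    missing = sum(1 for value in harmonized if value is None)
    counts = Counter(value for value in harmonized if value is not None)

    # Each row: (label, entries, fraction of all records in percent).
    top = [
        (label, n, round(100 * n / total, 1))
        for label, n in counts.most_common(top_n)
    ]
    return {
        "unique": len(counts),
        "missing": missing,
        "missing_pct": round(100 * missing / total, 2),
        "top": top,
    }


# Toy input, not real data from the catalogue.
records = ["a", "a", None, "b; c", None, None, None, None]
print(summarize(records))
```

Note that a combined value such as "b; c" is counted as its own entry here rather than being split into individual terms.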

Download genre_655 harmonized dataset

6.4 Subset Analysis: 1809-1917

Unique genre_655s (1809-1917): 376

There are 56002 missing values in the subset, accounting for 76.66% of the total.

Download genre_655 harmonized dataset (1809-1917)

6.4.1 Top Genre/Form for 1809-1917

Accepted Genre/Form (1809-1917).

Genre/Form                        Entries (n)   Fraction (%)
väitöskirjat                      4853          6.6
luettelot                         1476          2
virret; arkkiveisut               794           1.1
esitteet                          743           1
sanomalehdet                      564           0.8
arkkiveisut                       492           0.7
hartauskirjat                     297           0.4
kokousjulkaisut                   295           0.4
huumorilehdet; kausijulkaisut     256           0.4
uskonnolliset lehdet              253           0.3