11 Publication time

MARC field: 008/07-14

This section summarizes the publication years in the dataset and shows how titles are distributed over time. Links to the tables of unique accepted and discarded values are included so that both the harmonized and the excluded data can be inspected in detail.

Data harmonization was performed using the polish_years_008 function. It processes and harmonizes temporal data, splitting it into columns such as publication_year, publication_from, publication_till (for serials), and publication_decade (for visualization). Links to the converted data are provided below.
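The slicing step behind this can be illustrated with a minimal Python sketch. It is not the polish_years_008 implementation; the helper names split_008_dates and to_decade are hypothetical, and the sample record is invented for illustration. The only facts assumed are the MARC positions: 008/06 holds the publication status and 008/07-14 holds two four-character dates.

```python
def split_008_dates(field_008):
    """Slice status, Date 1 and Date 2 out of a MARC 008 string (0-based positions)."""
    status = field_008[6]      # 008/06: type of date / publication status
    date1 = field_008[7:11]    # 008/07-10: Date 1
    date2 = field_008[11:15]   # 008/11-14: Date 2
    return status, date1, date2


def to_decade(year):
    """Map a year to the first year of its decade, e.g. 1885 -> 1880."""
    return year // 10 * 10


# Invented 40-character sample record; only positions 06-14 matter here.
sample = "850423" + "s" + "1885" + " " * 4 + "fi " + "|" * 17 + "fin" + "||"

status, date1, date2 = split_008_dates(sample)
print(status, date1, repr(date2))   # s 1885 '    '
print(to_decade(int(date1)))        # 1880
```

In this sketch the decade is derived only after the year has been parsed, so a value such as publication_decade would stay empty whenever the year itself is missing or discarded.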

Missing values are represented as NA in the refined data. Field 008 is empty in 51646 records. Discarded values are excluded: these are invalid entries coded with placeholder characters (e.g., “uuuu”, “||||”) and inconsistent data (e.g., years beyond the current year or mismatched date ranges). Exclusion does not imply that the discarded values are incorrect, only that they cannot be used for statistical analysis.
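The distinction between missing and discarded values can be sketched as follows. This is an illustrative Python approximation of the rules described above, not the report's actual code; classify_date, range_is_consistent and the current_year parameter are hypothetical names.

```python
import re
from datetime import date

# Placeholder codings named in the text above.
PLACEHOLDERS = {"uuuu", "||||"}


def classify_date(raw, current_year=None):
    """Return (label, year), where label is 'missing', 'discarded' or 'ok'."""
    if current_year is None:
        current_year = date.today().year
    if raw is None or raw.strip() == "":
        return ("missing", None)        # would appear as NA in the refined data
    if raw in PLACEHOLDERS or not re.fullmatch(r"[0-9]{4}", raw):
        return ("discarded", None)      # placeholder or non-numeric characters
    year = int(raw)
    if year > current_year:
        return ("discarded", None)      # year beyond the current year
    return ("ok", year)


def range_is_consistent(start, end):
    """A date range is mismatched when the start year is later than the end year."""
    return start is None or end is None or start <= end
```

For example, classify_date("18uu", 2026) is labelled as discarded because the value is partly non-numeric, while classify_date("    ", 2026) is labelled as missing.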

The summary also accounts for publication statuses (field 008/06). For an accurate temporal distribution, these statuses and their implications must be considered carefully, so that only dates that actually represent the publication year are selected rather than some other date (for example, a copyright date).
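One way to make the selection status-aware is sketched below in Python. The status codes are the standard MARC 008/06 values, but the grouping into "single year" and "range" statuses and the function name select_dates are assumptions made for illustration; the report does not document how polish_years_008 treats each status.

```python
# Assumption: for these statuses Date 1 is taken as the publication year
# (s single, e detailed, t publication+copyright, r reprint+original,
#  p distribution+production, q questionable).
YEAR_IN_DATE1 = {"s", "e", "t", "r", "p", "q"}

# Assumption: for these statuses Date 1 and Date 2 form a from/till range
# (c, d, u continuing resources; i, k collections; m multiple dates).
RANGE_STATUSES = {"c", "d", "u", "i", "k", "m"}


def select_dates(status, date1, date2):
    """Pick output columns from already-parsed integer years (or None)."""
    result = {
        "publication_year": None,
        "publication_from": None,
        "publication_till": None,
    }
    if status in YEAR_IN_DATE1:
        result["publication_year"] = date1
    elif status in RANGE_STATUSES:
        result["publication_from"] = date1
        # MARC codes an open-ended Date 2 as 9999; treat it as "no end year".
        result["publication_till"] = None if date2 == 9999 else date2
    # Statuses b, n and | carry no usable date, so every column stays None.
    return result
```

Under these assumptions, a reprint coded with status r, Date 1 = 1910 and Date 2 = 1850 would get publication_year 1910, and the original date would not be counted as a publication year.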

11.1 Complete Data Overview

Publication year conversions

Publication year discarded. 1105 records were discarded because the publication date is not coded, is unknown, or is ambiguous (for example, it contains non-numeric characters). The error list is intended for librarians’ use.

Download publication time harmonized dataset

Publication years are available for 1265305 documents (96%). The publication years span 0-2026.

11.1.1 Title count per decade (log values)

11.1.2 Publication status summaries

The visualization of the publication status field shows how publication years are recorded. The harmonization process depended on this field because the dates it qualifies do not always directly mark the start or end of publication.

Publication Status                         Entries (n)   Fraction (%)
Single known date/probable date            1143812       86.8
Continuing resource ceased publication     51230         3.9
Publication date and copyright date        45436         3.4
Reprint/reissue date and original date     35740         2.7
Continuing resource currently published    25455         1.9
Questionable date                          8432          0.6
Inclusive dates of collection              4760          0.4
Multiple dates                             2231          0.2
Continuing resource status unknown         489           0
No dates given; B.C. date involved         125           0
Date of distribution etc                   91            0
No attempt to code                         88            0
Detailed date                              76            0
Dates unknown                              38            0
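
The fraction column can be reproduced from the entry counts. The short Python check below uses only the counts listed in the table; the variable names are illustrative.

```python
# Entry counts copied from the publication status table.
counts = {
    "Single known date/probable date": 1143812,
    "Continuing resource ceased publication": 51230,
    "Publication date and copyright date": 45436,
    "Reprint/reissue date and original date": 35740,
    "Continuing resource currently published": 25455,
    "Questionable date": 8432,
    "Inclusive dates of collection": 4760,
    "Multiple dates": 2231,
    "Continuing resource status unknown": 489,
    "No dates given; B.C. date involved": 125,
    "Date of distribution etc": 91,
    "No attempt to code": 88,
    "Detailed date": 76,
    "Dates unknown": 38,
}

total = sum(counts.values())  # 1318003

# Percentage of all status entries, rounded to one decimal as in the table.
fractions = {label: round(100 * n / total, 1) for label, n in counts.items()}

for label, n in counts.items():
    print(f"{label:<42}{n:>9}{fractions[label]:>7}")
```

The counts sum to 1318003 entries, and each rounded percentage matches the table. Assuming the same total is the denominator for the coverage figure, 1265305 / 1318003 also rounds to the reported 96%.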

11.2 Subset Analysis: 1809-1917

This segment concentrates on the so-called “long 19th century”: literary production during the years 1809-1917, when the Grand Duchy of Finland was an autonomous part of the Russian Empire.

Publication year conversions (1809-1917)

Publication year discarded (1809-1917)

Download publication time harmonized dataset (1809-1917)

11.2.1 Title count per decade

A plot of title counts per decade from 1809 to 1917 shows the trends and fluctuations in literary output over this period.
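
The counts behind such a plot can be derived as in the following Python sketch. The function name titles_per_decade and the sample years are hypothetical; the report's own plotting code is not shown in this section.

```python
from collections import Counter


def titles_per_decade(years, start=1809, end=1917):
    """Count titles per decade, keeping only years inside [start, end]."""
    in_range = (y for y in years if y is not None and start <= y <= end)
    counts = Counter(y // 10 * 10 for y in in_range)
    return dict(sorted(counts.items()))


# Invented sample of harmonized publication years (None stands for NA).
sample_years = [1808, 1809, 1815, 1899, 1917, 1918, None]
print(titles_per_decade(sample_years))
# {1800: 1, 1810: 1, 1890: 1, 1910: 1}
```

With this binning the first and last bins are partial: the 1800s bin holds only 1809, and the 1910s bin holds only 1910-1917, which is worth keeping in mind when reading the ends of the plot.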