6.1 Complete Dataset Overview

Original documents with titles: 1229884 / 1229925 (100%)

Original documents with missing (NA) titles: 41 / 1229925 (0%)
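The two counts above are complementary: 1229884 titled documents plus 41 with missing titles give the total of 1229925. Both percentages are rounded to whole numbers, which is why 41 documents appear as 0%. A minimal Python sketch of that arithmetic follows; the function names `title_coverage` and `pct`, and the use of `None` for a missing title, are assumptions for illustration and not the code that produced this report.

```python
def title_coverage(titles):
    """Return (n_with_title, n_missing, total) for a list of titles.

    A missing (NA) title is represented by None here; that is an assumption.
    """
    total = len(titles)
    missing = sum(1 for t in titles if t is None)
    return total - missing, missing, total


def pct(part, whole):
    """Share of `part` in `whole`, as a percentage rounded to a whole number."""
    return round(100 * part / whole)


# Consistency check against the counts reported above.
with_title, missing, total = 1229884, 41, 1229925
assert with_title + missing == total
print(pct(with_title, total), pct(missing, total))
```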

Unique discarded entries in original data: 7

6.1.1 Top-20 titles

| Title | Documents | Share of documents (%) |
|---|---|---|
| Peruskartta 1:20000 | 7175 | 0.6 |
| Maastokartta | 3376 | 0.3 |
| Pitäjänkartta 1:20000 | 2163 | 0.2 |
| Topografičeskaâ karta 1:42000 | 1325 | 0.1 |
| Peruskartta 1:25000 | 1273 | 0.1 |
| Topografinen kartta 1:20000 | 1211 | 0.1 |
| Maastokartta 1:50000 | 788 | 0.1 |
| Tienumerokartta | 616 | 0.1 |
| Peruskartan pienennös 1:50000 | 544 | 0 |
| Topografinen kartta 1:50000 | 536 | 0 |
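Each row above gives a title, the number of documents carrying it, and what appears to be its share of all documents in percent, rounded to one decimal (for example 7175 / 1229925 ≈ 0.6%). A ranking of this kind could be sketched as follows. The function name `top_titles` is hypothetical, and the choice of denominator (all documents, including those without a title) is an assumption that the reported figures are consistent with but do not confirm.

```python
from collections import Counter


def top_titles(titles, n=20, total=None):
    """Return the n most frequent titles as (title, count, percent) tuples.

    `total` is the denominator for the percentage; by default it is the
    number of input records, including those with a missing (None) title.
    """
    if total is None:
        total = len(titles)
    counts = Counter(t for t in titles if t is not None)
    return [
        (title, count, round(100 * count / total, 1))
        for title, count in counts.most_common(n)
    ]


# Toy example, not real Fennica data.
sample = ["Maastokartta", "Maastokartta", "Kalevala", None]
for row in top_titles(sample, n=2):
    print(row)
```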

6.1.2 Top-20 titles for “Language material” content type

In the Fennica dataset, language material makes up 91.6% of all records. This content type covers written texts such as books, articles, and other documents composed primarily of language. The remaining 8.4% is non-language material, such as maps, music, and computer files. For a more detailed breakdown of content types, see [summaries of type of records in Fennica].

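| Title | Documents | Share of documents (%) |
|---|---|---|
| Kootut teokset | 475 | 0 |
| Jäsentiedote | 392 | 0 |
| Valitut teokset | 357 | 0 |
| Julkaisu | 349 | 0 |
| Kalevala | 338 | 0 |
| Kungörelse | 314 | 0 |
| Matematiikka | 300 | 0 |
| Opinto-opas | 293 | 0 |
| Vuosikirja | 280 | 0 |
| Tuhattaituri | 272 | 0 |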

6.1.3 Title Length Over Time (1488-2020)

This plot shows how title length, measured in characters, varies across publication decades from 1488 to 2020. Title lengths range from 1 to 1697 characters. N = 1229925.
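A per-decade summary of title length like the one plotted here could be computed along these lines. This is a sketch under stated assumptions: length is taken as the number of characters, a decade is the publication year rounded down to a multiple of ten, and the summary statistics (minimum, median, maximum) are illustrative, since the report does not say which statistic the plot shows. The name `title_length_by_decade` is hypothetical.

```python
from collections import defaultdict
from statistics import median


def title_length_by_decade(records):
    """Summarize title length per publication decade.

    `records` is an iterable of (year, title) pairs. Records with a
    missing year or title are skipped. Returns a dict mapping each
    decade to a (min, median, max) tuple of title lengths in characters.
    """
    by_decade = defaultdict(list)
    for year, title in records:
        if year is None or title is None:
            continue
        by_decade[(year // 10) * 10].append(len(title))
    return {
        decade: (min(lengths), median(lengths), max(lengths))
        for decade, lengths in sorted(by_decade.items())
    }


# Toy example, not real Fennica data.
sample = [(1488, "ab"), (1489, "abcd"), (1995, "Kalevala"), (2001, None)]
print(title_length_by_decade(sample))
```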

6.1.4 Title Word Count Over Time (1488-2020)

This plot shows how title word count varies across publication decades from 1488 to 2020. Word counts range from 1 to 220 words. N = 1229925.
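The word count of a title depends on how words are delimited. The sketch below assumes the simplest rule, splitting on whitespace, which may differ from the tokenization used for this report; the function name `title_word_count` is hypothetical.

```python
def title_word_count(title):
    """Count whitespace-separated tokens in a title.

    Punctuation stays attached to its token, so a scale such as
    "1:20000" counts as one word.
    """
    return len(title.split())


# Titles taken from the tables in this report.
for title in ["Kalevala", "Peruskartta 1:20000", "Fänrik Ståls sägner"]:
    print(title, title_word_count(title))
```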

6.2 Subset Analysis: 1809-1917

Unique accepted entries (1809-1917): 48986

Original documents with non-NA titles: 66890 / 66890 (100%)

Original documents with missing (NA) titles: 0 / 66890 (0%)
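Restricting the data to this period amounts to a filter on publication year. The sketch below assumes that both 1809 and 1917 are included and that records without a year are dropped; neither is stated in the report, and the name `subset_by_year` is hypothetical.

```python
def subset_by_year(records, start=1809, end=1917):
    """Keep (year, title) records whose year lies in [start, end].

    Both bounds are inclusive (an assumption). Records with a missing
    (None) year are excluded.
    """
    return [
        (year, title)
        for year, title in records
        if year is not None and start <= year <= end
    ]


# Toy example, not real Fennica data.
sample = [(1808, "A"), (1809, "Dikter"), (1917, "Theses"), (1918, "B"), (None, "C")]
subset = subset_by_year(sample)
missing_titles = sum(1 for _, title in subset if title is None)
print(len(subset), missing_titles)
```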

6.2.1 Top-20 titles for years 1809-1917

| Title | Documents | Share of documents (%) |
|---|---|---|
| Topografičeskaâ karta 1:42000 | 1323 | 2 |
| Theses | 147 | 0.2 |
| Homeri Odyssea svethice reddita | 113 | 0.2 |
| Kalmbergin kartasto | 87 | 0.1 |
| Dikter | 85 | 0.1 |
| Fänrik Ståls sägner | 77 | 0.1 |
| Handlingar rörande Finlands historia kring medlet af 17:de århundradet | 64 | 0.1 |
| Dissertatio entomologica insecta Fennica enumerans | 64 | 0.1 |
| Missionsberättelser lämpade för missionsbönstunder | 63 | 0.1 |
| Läsning för barn | 63 | 0.1 |

6.2.2 Top-20 titles for years 1809-1917 / “Language material” content type

| Title | Documents | Share of documents (%) |
|---|---|---|
| Theses | 147 | 0.2 |
| Homeri Odyssea svethice reddita | 113 | 0.2 |
| Dikter | 85 | 0.1 |
| Fänrik Ståls sägner | 77 | 0.1 |
| Handlingar rörande Finlands historia kring medlet af 17:de århundradet | 64 | 0.1 |
| Dissertatio entomologica insecta Fennica enumerans | 64 | 0.1 |
| Missionsberättelser lämpade för missionsbönstunder | 63 | 0.1 |
| Läsning för barn | 63 | 0.1 |
| Kertomuksia Suomen historiasta | 61 | 0.1 |
| Fältskärns berättelser | 59 | 0.1 |

6.2.3 Title Length Over Time (1809-1917)

This plot shows how title length, measured in characters, varies across publication years from 1809 to 1917. Title lengths range from 2 to 1018 characters. N = 66890.

6.2.4 Title Word Count (1809-1917)

This plot shows how title word count varies across publication years from 1809 to 1917. Word counts range from 1 to 151 words. N = 66890.