10  Publication place

MARC: 260a and 264a

Publication place information is primarily derived from MARC21 fields 260a and 264a, which record the location of publication. These fields exhibit substantial variation in spelling, language, and formatting, requiring systematic harmonization to enable consistent geographic analysis. The processing pipeline integrates normalization, synonym resolution, and external mappings to standardize place names and link them to countries and geocoordinates, while preserving traceability of ambiguous and discarded cases.

The publication place field is enriched by using 264a when 260a is missing, after which the raw place strings are harmonized with polish_place(). The harmonized place names are then linked to countries with get_country() and joined with existing geocoding data to add longitude, latitude, and mapping identifiers. The workflow saves the processed table, creates accepted and discarded summaries, records discarded non-empty original values with IDs, and produces separate accepted summaries for the 1809–1917 subset.
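The steps in this workflow can be sketched roughly as follows. This is a minimal Python illustration under stated assumptions, not the pipeline's own code: `harmonize_place` and `process_record` are hypothetical stand-ins (they do not reproduce polish_place() or get_country()), and the lookup tables are tiny invented samples with approximate coordinates.

```python
import re

# Hypothetical lookup tables for illustration only; a real workflow would
# load curated synonym, country, and geocoding tables from files.
SYNONYMS = {"helsingfors": "Helsinki", "helsinki": "Helsinki",
            "åbo": "Turku", "turku": "Turku"}
COUNTRIES = {"Helsinki": "Finland", "Turku": "Finland"}
GEOCODES = {"Helsinki": (60.17, 24.94), "Turku": (60.45, 22.27)}  # approx. (lat, lon)


def harmonize_place(raw):
    """Simplified stand-in for a harmonization step (not polish_place itself)."""
    if not raw:
        return None
    # Drop brackets and question marks, collapse whitespace,
    # trim surrounding punctuation, and lowercase before the lookup.
    cleaned = re.sub(r"[\[\]\(\)?]", "", raw)
    cleaned = re.sub(r"\s+", " ", cleaned).strip(" ,:;.").lower()
    return SYNONYMS.get(cleaned)  # None when no canonical name is known


def process_record(record):
    """Use 264a when 260a is missing, then harmonize and enrich."""
    raw = record.get("260a") or record.get("264a")
    place = harmonize_place(raw)
    lat, lon = GEOCODES.get(place, (None, None))
    return {
        "id": record["id"],
        "original": raw,  # kept so discarded values stay traceable
        "place": place,
        "country": COUNTRIES.get(place),
        "latitude": lat,
        "longitude": lon,
    }


records = [
    {"id": 1, "260a": "[Helsingfors] :", "264a": None},
    {"id": 2, "260a": None, "264a": "Åbo,"},
    {"id": 3, "260a": "S.l.", "264a": None},
]
processed = [process_record(r) for r in records]
```

Keeping the original string next to the harmonized value is what makes it possible to list discarded entries with their IDs afterwards.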

10.1 Complete Dataset Overview

Terms that are clearly not place names can be added to the stopword list.
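As an illustration of how a stopword list might be applied, the sketch below separates usable place strings from stopword hits while keeping record IDs. It is a hedged Python example: the `STOPWORDS` set and the function `split_accepted_discarded` are hypothetical and are not taken from the actual workflow.

```python
# Hypothetical stopword set. "s.l." (sine loco) and "s.n." (sine nomine)
# are common cataloguing abbreviations for an unknown place or publisher.
STOPWORDS = {"s.l.", "s.n.", "n.p.", "unknown"}


def split_accepted_discarded(rows, stopwords=STOPWORDS):
    """Separate usable place strings from stopword hits, keeping IDs."""
    accepted, discarded = [], []
    for row_id, raw in rows:
        value = (raw or "").strip()
        if not value:
            continue  # empty originals are not logged as discarded
        if value.lower() in stopwords:
            discarded.append((row_id, value))
        else:
            accepted.append((row_id, value))
    return accepted, discarded


rows = [(1, "Helsinki"), (2, "S.l."), (3, ""), (4, "Unknown")]
accepted, discarded = split_accepted_discarded(rows)
```

Only non-empty originals reach the discarded list, which mirrors the idea of recording discarded non-empty values together with their IDs.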

The synonym list contains candidate place names that were identified during processing but ultimately rejected, rather than valid variants of accepted names. These entries typically reflect ambiguous, malformed, or non-standardized strings that could not be reliably matched to a canonical location. Some may be recoverable with additional context or external authority matching, but they are conservatively excluded in the current workflow.

10.2 Publication countries

10.3 Geocoordinates

  • 86.2% of the documents were matched to geographic coordinates (based on COMHIS geomapping process).
  • 5095 unique places (88.3% of all unique places and 13.84% of all documents) are missing geocoordinates. See list of places missing geocoordinate information.
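Coverage figures of this kind could be computed along the following lines. This is a minimal Python sketch under assumptions: the function `geocode_coverage` and the field names `place`, `latitude`, and `longitude` are hypothetical, and the sample rows are invented rather than drawn from the dataset.

```python
def geocode_coverage(docs):
    """Summarize geocoding coverage over documents and unique places."""
    def has_coords(d):
        return d["latitude"] is not None and d["longitude"] is not None

    total_docs = len(docs)
    matched_docs = sum(1 for d in docs if has_coords(d))
    places = {d["place"] for d in docs if d["place"]}
    geocoded = {d["place"] for d in docs if d["place"] and has_coords(d)}
    missing = places - geocoded  # unique places without coordinates
    return {
        "docs_matched_pct": round(100 * matched_docs / total_docs, 1) if total_docs else 0.0,
        "places_missing": len(missing),
        "places_missing_pct": round(100 * len(missing) / len(places), 1) if places else 0.0,
    }


sample = [
    {"place": "Helsinki", "latitude": 60.17, "longitude": 24.94},
    {"place": "Helsinki", "latitude": 60.17, "longitude": 24.94},
    {"place": "Turku", "latitude": 60.45, "longitude": 22.27},
    {"place": "Kuopio", "latitude": None, "longitude": None},
]
summary = geocode_coverage(sample)
```

Document-level and place-level shares can diverge sharply, because a few frequent places may cover most documents while many rare places remain without coordinates.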

10.4 Subset Analysis: 1809–1917

  • 545 unique publication places; available for 70897 documents (97%).

  • 32 unique publication countries for the period 1809–1917; available for 69115 documents (95%).

Top-20 publication places are shown together with the number of documents.

Country    Documents (n)    Fraction (%)
Finland            63453            86.9
Sweden              1782             2.4
Russia              1222             1.7
USA                 1110             1.5
Germany              636             0.9
France               182             0.2