Series | Studi di archivistica, bibliografia, paleografia
Edited book | Models of Data Extraction and Architecture in Relational Databases of Early Modern Private Political Archives
Chapter | Cracking the Historical Code
Cracking the Historical Code
From Unstructured Correspondence Corpora to Computational Analysis
- Agata Bloch - Tadeusz Manteuffel Institute of History of Polish Academy of Sciences, Poland
- Michał Bojanowski - Kozminski University, Poland
- Clodomir Santana - Tadeusz Manteuffel Institute of History of Polish Academy of Sciences
- Demival Vasques Filho - Luxembourg Centre for Contemporary and Digital History (C2DH), University of Luxembourg
Abstract
The chapter addresses a methodological approach to unstructured data and discusses the potential that structured data offers in the field of historical research. The dataset, which initially consists of textual content sourced from digital collections at the Portuguese Overseas Archives in Lisbon, undergoes a preprocessing phase that forms the basis for the extraction of structured data. The authors combine history, social sciences, and computer science to convert the correspondence repository into a machine‑processable form. This transformation is supported by an interdisciplinary strategy in which they weave together elements of effective content management, topic modelling, and social network analysis.
Submitted: Oct. 3, 2023 | Accepted: Jan. 18, 2024 | Published: May 22, 2025 | Language: en
Keywords: Digital infrastructure • Colonial Portuguese Empire • Public correspondence • Structured data • Historical dataset
Copyright © 2025 Agata Bloch, Michał Bojanowski, Clodomir Santana, Demival Vasques Filho. This is an open-access work distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction is permitted, provided that the original author(s) and the copyright owner(s) are credited and that the original publication is cited, in accordance with accepted academic practice. The license allows for commercial use. No use, distribution or reproduction is permitted which does not comply with these terms.
Permalink: http://doi.org/10.30687/978-88-6969-919-1/006
- Introduction
- Dorit Raines
- May 22, 2025
Perspectives: Historical Archives and Digital Humanities
- The Digital Historiographic Turn and the Historian’s Changing Toolkit: From ‘Facts’ and ‘Events’ to ‘Datasets’
- Dorit Raines
- May 22, 2025
- Is There a Reception of Algorithm‑Based Research in Traditional Historical Scholarship? Three Case Studies from Academic “Trading Zones”
- Thomas Wallnig
- May 22, 2025
- The Representation of Historical Uncertainties as the Outcome of Competing and Incompatible Certainties
- Fabio Vitali, Valentina Pasqual
- May 22, 2025
- Metapolis: Spatializing Histories Through Archival Sources
- Lukas Klic
- May 22, 2025
Experiences: Historical Archives, Database and Online Publication
- Including the Archival Context in the Historian’s Materials: The Advantages of Archival Standard Databases in Historical Research. VINCULUM Project Database and Information System Guide
- Maria de Lurdes Rosa
- May 22, 2025
- Cracking the Historical Code. From Unstructured Correspondence Corpora to Computational Analysis
- Agata Bloch, Michał Bojanowski, Clodomir Santana, Demival Vasques Filho
- May 22, 2025
- Methods and Tools of Quantification in Historical Research. Napoleonic Employment Applications as a Case Study
- Valentina Dal Cin
- May 22, 2025
- Gendered Data in Medieval and Early Modern Sources. The Gendered Networks and Digital Edgeworth Network Projects
- Máirín MacCarron
- May 22, 2025
- Extraction, Architecture and Recovery of Family Correspondence Data. The Platform “EpiCAT. Family Letters from Catalonia (Sixteenth‑Nineteenth Centuries)”
- Javier Antón Pelayo
- May 22, 2025
Challenges: Graziani Archive and Omeka S
- Historical Research and Archival Sciences in a Digital Perspective. Relational Database, Data Architecture and Data Extraction in Graziani Archives Portal
- Dorit Raines
- May 22, 2025
- Reconciling Complex Historical Records with Omeka S Relational Database. The Case of the Graziani Archive
- Gabriella Desideri
- May 22, 2025
- A Puzzle with Missing Pieces. Extracting, Deciphering, and Digitally Rearranging Data in Antonio Maria Graziani Private Archives
- Carlo Baja Guarienti
- May 22, 2025
- How to Digitally Reconstruct the History of an Early Modern Private Library? Antonio Maria Graziani (1537‑1611) and the Vicissitudes of His Books
- Luca Iori
- May 22, 2025
| DC Field | Value |
|---|---|
| dc.identifier | ECF_chapter_18897 |
| dc.contributor.author | Bloch Agata |
| dc.contributor.author | Bojanowski Michał |
| dc.contributor.author | Santana Clodomir |
| dc.contributor.author | Vasques Filho Demival |
| dc.title | Cracking the Historical Code. From Unstructured Correspondence Corpora to Computational Analysis |
| dc.type | Chapter |
| dc.language.iso | en |
| dc.description.abstract | The chapter addresses a methodological approach to unstructured data and discusses the potential that structured data offers in the field of historical research. The dataset, which initially consists of textual content sourced from digital collections at the Portuguese Overseas Archives in Lisbon, undergoes a preprocessing phase that forms the basis for the extraction of structured data. The authors combine history, social sciences, and computer science to convert the correspondence repository into a machine‑processable form. This transformation is supported by an interdisciplinary strategy in which they weave together elements of effective content management, topic modelling, and social network analysis. |
| dc.relation.ispartof | Studi di archivistica, bibliografia, paleografia |
| dc.publisher | Edizioni Ca’ Foscari - Venice University Press, Fondazione Università Ca’ Foscari |
| dc.issued | 2025-05-22 |
| dc.dateAccepted | 2024-01-18 |
| dc.dateSubmitted | 2023-10-03 |
| dc.identifier.uri | http://edizionicafoscari.it/en/edizioni4/libri/978-88-6969-919-1/cracking-the-historical-code/ |
| dc.identifier.doi | 10.30687/978-88-6969-919-1/006 |
| dc.identifier.issn | 2610-9875 |
| dc.identifier.eissn | 2610-9093 |
| dc.identifier.isbn | 978-88-6969-920-7 |
| dc.identifier.eisbn | 978-88-6969-919-1 |
| dc.rights | Creative Commons Attribution 4.0 International Public License |
| dc.rights.uri | http://creativecommons.org/licenses/by/4.0/ |
| item.fulltext | with fulltext |
| item.grantfulltext | open |
| dc.peer-review | yes |
| dc.subject | Colonial Portuguese Empire |
| dc.subject | Digital infrastructure |
| dc.subject | Historical dataset |
| dc.subject | Public correspondence |
| dc.subject | Structured data |