Humboldt-Universität zu Berlin · Digital History

HistoRAG
Discipline-Oriented Retrieval-Augmented Generation

A framework for redesigning RAG systems around historical methodology — preserving source sovereignty, interpretive transparency, and temporal sensitivity where standard architectures undermine them.

Noah Kim-Baumann · Torsten Hiltmann · Professur Digital History

Standard RAG wasn't built for historical research

RAG systems are designed for factual question-answering — find the relevant passages, generate the answer. Historical scholarship demands something different: source criticism before interpretation, temporal sensitivity across decades of discourse, and transparent collaborative reasoning rather than seamless answers.

Standard RAG

Seamless pipeline

Query → retrieval → generation in one step. Source selection is a technical optimisation hidden from the researcher. Similarity-based ranking favours recent vocabulary. No built-in space for source criticism. Output presented as answers.

HistoRAG

Structured research process

Two separated phases restore the historian's workflow: a Heuristik phase for source discovery and evaluation, followed by an Analyse phase for interpretation. The researcher curates what enters computational reading. Outputs are Zwischentexte (interpretive proposals, not conclusions).

Three architectural commitments

Drawing on Agre's Critical Technical Practice, we embed disciplinary values into system architecture rather than accepting computational defaults as neutral.

① Intervention

Separated Retrieval & Generation

Heuristik → Analyse

Formally decouples corpus construction from interpretation. Researchers examine, critique, and curate retrieved sources before any computational "reading" begins, thus restoring the heuristic phase that standard RAG eliminates.
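As a rough illustration of this separation, the sketch below makes the Analyse phase accept only a curated corpus built from explicit researcher decisions. All names here (`Chunk`, `CuratedCorpus`, `curate`, `build_analysis_prompt`) are hypothetical and are not taken from the HistoRAG codebase.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Chunk:
    """A retrieved text chunk with minimal provenance (hypothetical schema)."""
    chunk_id: str
    text: str
    year: int


@dataclass(frozen=True)
class CuratedCorpus:
    """Chunks the researcher has explicitly accepted for the Analyse phase."""
    chunks: Tuple[Chunk, ...]


def curate(retrieved: List[Chunk], decisions: Dict[str, bool]) -> CuratedCorpus:
    """Keep only chunks the researcher accepted.

    Chunks without a recorded decision are excluded, so nothing reaches
    interpretation by default.
    """
    accepted = tuple(c for c in retrieved if decisions.get(c.chunk_id, False))
    return CuratedCorpus(chunks=accepted)


def build_analysis_prompt(corpus: CuratedCorpus, research_question: str) -> str:
    """Assemble the Analyse-phase input from curated chunks only."""
    if not isinstance(corpus, CuratedCorpus):
        raise TypeError("Analyse phase requires a CuratedCorpus")
    sources = "\n".join(
        f"[{c.chunk_id}, {c.year}] {c.text}" for c in corpus.chunks
    )
    return f"Research question: {research_question}\n\nSources:\n{sources}"
```

The design choice in this sketch is that raw retrieval results cannot be passed to the interpretation step directly: the only route into `build_analysis_prompt` runs through `curate`, which requires a researcher decision per chunk.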

② Intervention

Temporal Windowing

Kontinuitätsannahme

Enforces proportional retrieval across time periods. Left unchecked, similarity-based search embeds presentist bias, privileging sources whose vocabulary matches modern query terms while suppressing formative periods where concepts emerged.
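One possible reading of proportional retrieval is sketched below: candidates are grouped into fixed-width time windows and each window contributes its own top results. The equal per-window quota, the window width, and the function name `windowed_top_k` are assumptions made for illustration; the actual allocation rule may differ.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# (chunk_id, publication year, similarity to the query)
Candidate = Tuple[str, int, float]


def windowed_top_k(
    candidates: List[Candidate],
    start_year: int,
    window_size: int,
    per_window: int,
) -> Dict[int, List[Candidate]]:
    """Return the top `per_window` candidates for each time window.

    Windows are keyed by their first year. Windows with no candidates
    are simply absent from the result.
    """
    by_window: Dict[int, List[Candidate]] = defaultdict(list)
    for chunk_id, year, similarity in candidates:
        offset = (year - start_year) // window_size
        window_start = start_year + offset * window_size
        by_window[window_start].append((chunk_id, year, similarity))

    result: Dict[int, List[Candidate]] = {}
    for window_start in sorted(by_window):
        ranked = sorted(by_window[window_start], key=lambda c: c[2], reverse=True)
        result[window_start] = ranked[:per_window]
    return result


# Toy values, not corpus data: later chunks score higher on similarity.
candidates = [
    ("a", 1952, 0.40),
    ("b", 1953, 0.35),
    ("c", 1971, 0.90),
    ("d", 1972, 0.85),
    ("e", 1973, 0.80),
]
selected = windowed_top_k(candidates, start_year=1950, window_size=10, per_window=2)
```

With these toy values, a global top-2 by similarity would return only "c" and "d", both from the 1970s. The windowed version keeps "a" and "b" from the 1950s as well, despite their lower scores.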

③ Intervention

LLM-as-Judge

Quellenbeleg

Post-retrieval evaluation against researcher-defined criteria. Turns algorithmic selection from a black box into a transparent, argumentative process with scored justifications that can be reviewed and contested.
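A minimal sketch of such an evaluation step follows. It assumes the judge model is asked to reply in JSON with an integer score from 1 to 10 and a written justification. The scale, the prompt wording, and the names `Judgement`, `build_judge_prompt`, and `evaluate_chunk` are hypothetical. The model call itself is passed in as a plain function, so no particular LLM API is implied.

```python
import json
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Judgement:
    """A scored, justified evaluation that a researcher can review or contest."""
    chunk_id: str
    score: int
    justification: str


def build_judge_prompt(chunk_text: str, criteria: List[str]) -> str:
    """Combine researcher-defined criteria and the excerpt into one prompt."""
    criteria_lines = "\n".join(f"- {c}" for c in criteria)
    return (
        "Evaluate the source excerpt against these criteria:\n"
        f"{criteria_lines}\n\n"
        f"Excerpt:\n{chunk_text}\n\n"
        'Reply as JSON: {"score": <integer 1-10>, "justification": "<text>"}'
    )


def evaluate_chunk(
    chunk_id: str,
    chunk_text: str,
    criteria: List[str],
    ask_model: Callable[[str], str],
) -> Judgement:
    """Ask the judge model for a score and reject replies without a justification."""
    raw_reply = ask_model(build_judge_prompt(chunk_text, criteria))
    data = json.loads(raw_reply)
    score = int(data["score"])
    if not 1 <= score <= 10:
        raise ValueError(f"score out of range: {score}")
    justification = str(data["justification"]).strip()
    if not justification:
        raise ValueError("the judge must justify its score")
    return Judgement(chunk_id=chunk_id, score=score, justification=justification)
```

Storing the justification next to the score is what makes the selection arguable: a researcher can disagree with the stated reasoning and override the ranking instead of having to trust a bare number.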

Two-phase pipeline

HistoRAG separates the RAG pipeline into two distinct phases, each with explicit researcher control points. The architecture is transferable: each implementation configures chunking, embedding, and evaluation criteria, and can add further features for its own use case.

Phase 1: Heuristik (Source Discovery & Evaluation)
Corpus: source documents + metadata
Chunking & Embedding: configurable sizes → vector store
Semantic Retrieval: cosine similarity · HNSW index · FastText expansion
Temporal Windowing: proportional retrieval per time window
LLM-as-Judge Evaluation: scored justifications · researcher-defined criteria
Ranked & Evaluated Chunks: scores + justifications + full provenance

Phase 2: Analyse (Interpretation & Synthesis)
Curated Corpus: researcher-selected chunks + metadata + research questions (≠ retrieval queries)
LLM-Assisted Interpretation: thematic synthesis · pattern recognition · multi-model
Zwischentexte: interpretive judgments, not conclusions
Historian Validation & Development: verify citations · contest interpretations · supply context
Historiographic Analysis: source-grounded · transparent reasoning

Verstehen remains with the historian
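The Semantic Retrieval step in Phase 1 ranks chunks by cosine similarity between a query embedding and the stored chunk embeddings. The sketch below shows that ranking as an exact, brute-force search in plain Python. It is an illustration only: an HNSW index would approximate the same ranking far faster on a large store, and the FastText expansion is not shown. The function names and the dictionary-based store are assumptions.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(
    query_vector: Sequence[float],
    store: Dict[str, Sequence[float]],
    k: int,
) -> List[Tuple[str, float]]:
    """Return the k chunk ids most similar to the query, best first."""
    scored = [
        (chunk_id, cosine_similarity(query_vector, vector))
        for chunk_id, vector in store.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

Because the ranking depends only on vector proximity to the query, it carries no notion of period or provenance, which is why the later windowing and evaluation steps operate on its output.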

Zwischentexte

HistoRAG generates what we term Zwischentexte (intermediate texts). These are not answers but interpretive proposals: they sit between the retrieved sources and the historical argument, offering a first reading that the historian can verify, contest, and develop.

"The central question for LLMs in digital humanities is not whether machines can 'read' but how we design systems that make their interpretive interventions visible and contestable, thereby preserving the scholar's epistemic agency throughout."

SPIEGELragged

SPIEGELragged is our first implementation of HistoRAG, applied to computerisation discourse in Der Spiegel (1950–1979). It tracks how West German society's understanding of automation evolved: from "Elektronenhirn" to "Computer" to "EDV," and from euphoria to anxiety.

102,189 articles in corpus
30 years covered (1950–1979)
~200k embedded chunks
ρ = 0.275 similarity ↔ relevance correlation
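The page does not state which coefficient ρ denotes or how relevance was measured. Assuming it is Spearman's rank correlation between each chunk's similarity score and a relevance score (for example, one assigned in the LLM-as-Judge step), it could be computed roughly as follows. The function names are hypothetical.

```python
import math
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Assign 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the full run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    rank_x, rank_y = average_ranks(xs), average_ranks(ys)
    mean_x = sum(rank_x) / len(rank_x)
    mean_y = sum(rank_y) / len(rank_y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(rank_x, rank_y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in rank_x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in rank_y))
    if spread_x == 0.0 or spread_y == 0.0:
        raise ValueError("rho is undefined when one input is constant")
    return covariance / (spread_x * spread_y)
```

Under that assumption, a ρ well below 1 means that ordering chunks by similarity only loosely matches ordering them by judged relevance, which is the motivation for evaluating retrieved chunks separately.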

1964 as earlier rupture

Public anxiety about automation crystallised fourteen years before the canonical 1978 "Computer-Revolution" — surfaced through reader letters that keyword searches missed.

Rationalisierung as semantic battleground

The same term carried opposed meanings depending on speaker position — efficiency for management, existential threat for workers — a pattern visible only at corpus scale.

Class migration of anxiety

Technological anxiety migrated upward through the class structure over time, becoming socially explosive only when it reached the discourse-producing classes.

HistoRAG Instances

HistoRAG is a transferable framework. Each instance configures the architecture for a specific corpus and research context.

Access to live instances is currently restricted to the internal team for testing. If you are interested, please get in touch.

Developed at Humboldt-Universität zu Berlin

Noah Kim-Baumann

Research Associate
Professur Digital History
Institut für Geschichtswissenschaften
Humboldt-Universität zu Berlin


Torsten Hiltmann

Professor of Digital History
Professur Digital History
Institut für Geschichtswissenschaften
Humboldt-Universität zu Berlin


Read the Paper

HistoRAG: Designing a Methodologically Informed Retrieval-Augmented Generation System for Historical Research — Demonstrated through a Case Study of Der Spiegel (1950–1979) and the Computerisation of the Early Federal Republic.

Noah J. Kim-Baumann & Torsten Hiltmann · 2026

Currently under review at the Journal for Digital History. Published as an executable notebook article (Jupyter).

Kim-Baumann, N. J. & Hiltmann, T. (2026). HistoRAG: Designing
a Methodologically Informed Retrieval-Augmented Generation
System for Historical Research — Demonstrated through a Case
Study of Der Spiegel (1950–1979) and the Computerisation
of the Early Federal Republic. [Venue TBD].