Stemmatology, an interdisciplinary endeavour

Armin Hoenen



This paper takes stemmatology, applied to digitised ancient documents,
as an example of a digital humanities approach.
A stemma codicum is a visualisation of the genealogical relationships
between manuscripts containing a historical text, which was typically
transmitted chirographically. The visualisation arises from the copying
history of these texts and is usually a directed acyclic graph with
one root node. While the humanities have developed various theories
and practices closely connected to the goal of producing a textual
edition (see for instance West (1973)), for computer science the field
remains relatively recent and is dominated by the humanities. Computationally
produced stemmata are revised according to philological
expertise (see for instance Roelli and Bachmann (2010)), and few new
editions use computationally generated stemmata. Encoded
metadata (e.g. palaeographic) is still largely unavailable. The humanities
also have a vast head start in terms of research time:
as early as 1734, the scholar Bengel declared that “a perfect
edition of the New Testament would propose a classification of
the codices for their genealogical relations” (see Pasquali (1988)). Since
then, most editions have given a stemma in their preface or appendix.
Before this, according to Timpanaro (2005), tables had been in use.
There is no explicit standard yet for the visual representation of stemmata,
but some general practices are discernible, such as denoting lost manuscripts at
internal nodes (reconstructed hyparchetypes) by Greek letters and minor
influences between manuscripts by dotted lines.
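
The structural description above (a directed acyclic graph with exactly one root) can be checked mechanically. The following is a minimal sketch, not an established stemmatological tool: the function name `check_stemma`, the edge format (exemplar, copy) and the sigla in the usage example are assumptions made purely for illustration. It also reports manuscripts with more than one exemplar, which is where a drawn stemma would typically show a dotted line.

```python
from collections import defaultdict, deque


def check_stemma(edges):
    """Validate a stemma given as (exemplar, copy) pairs.

    Returns (root, contaminated), where `contaminated` lists the nodes
    that have more than one exemplar. Raises ValueError if the graph
    has a cycle or does not have exactly one root.
    """
    children = defaultdict(list)
    parents = defaultdict(list)
    nodes = set()
    for exemplar, copy in edges:
        children[exemplar].append(copy)
        parents[copy].append(exemplar)
        nodes.update((exemplar, copy))

    # A root is a node that was not copied from anything in the graph.
    roots = sorted(n for n in nodes if not parents[n])
    if len(roots) != 1:
        raise ValueError(f"expected exactly one root, found {roots}")

    # Kahn's algorithm: repeatedly remove nodes without remaining
    # incoming edges; if some nodes are never reached, there is a cycle.
    indegree = {n: len(parents[n]) for n in nodes}
    queue = deque(roots)
    visited = 0
    while queue:
        node = queue.popleft()
        visited += 1
        for child in children[node]:
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)
    if visited != len(nodes):
        raise ValueError("copying relations contain a cycle")

    contaminated = sorted(n for n in nodes if len(parents[n]) > 1)
    return roots[0], contaminated


# Hypothetical tradition: archetype "omega", two lost hyparchetypes
# "alpha" and "beta", and three surviving witnesses A, B and C.
# B draws on both hyparchetypes.
edges = [
    ("omega", "alpha"), ("omega", "beta"),
    ("alpha", "A"), ("alpha", "B"),
    ("beta", "C"), ("beta", "B"),
]
print(check_stemma(edges))  # ('omega', ['B'])
```

Storing the stemma as a plain edge list keeps the sketch independent of any graph library; a real edition would presumably also record which edges are hypothetical and which nodes are lost.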

While West (1973) remarks that “the time has not yet come when
manuscripts can be collated automatically” and “it seems, computers
can serve us best by making concordances and the more unsubtle kinds
of metrical analysis”, in the latter half of the 1980s automated
stemmatology borrowed methods, algorithms and software from
bioinformatic phylogenetics and began to be used in textual criticism
(see Howe et al. (2012)). To date, the methods are
not fully adapted to the needs of textual criticism. As an example, a
large number of representations are exclusively bifurcating structures,
because new species usually do not split off simultaneously (apart from
some cases of adaptive radiation), whereas manuscripts are often
copied more than once, so that an exemplar can have more than two
direct descendants. The adaptation of stemmatological methods to
philological needs is an ongoing digital humanities process.
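
The mismatch between bifurcating trees and manuscript copying can be made concrete with a small sketch. It assumes a tree stored as a dictionary from each node to its list of direct copies; the function names and the `_aux` naming scheme are hypothetical and not taken from any phylogenetic package. Forcing an exemplar with three copies into a strictly bifurcating shape requires an auxiliary internal node that corresponds to no manuscript at all.

```python
def is_strictly_bifurcating(children):
    """Return True if every internal node has exactly two children."""
    return all(len(kids) == 2 for kids in children.values() if kids)


def binarize(children):
    """Resolve multifurcations by inserting auxiliary internal nodes.

    The auxiliary nodes ("_aux1", "_aux2", ...) are artefacts of the
    representation; they do not stand for any real or lost manuscript.
    """
    result = {}
    counter = 0
    for node, kids in children.items():
        kids = list(kids)
        current = node
        while len(kids) > 2:
            counter += 1
            extra = f"_aux{counter}"
            # Keep the first copy and push the remaining ones one level down.
            result[current] = [kids.pop(0), extra]
            current = extra
        result[current] = kids
    return result


# Hypothetical hyparchetype "alpha" that was copied three times.
tradition = {"alpha": ["A", "B", "C"], "A": [], "B": [], "C": []}
forced = binarize(tradition)

print(is_strictly_bifurcating(tradition))  # False
print(is_strictly_bifurcating(forced))     # True
print(forced["alpha"], forced["_aux1"])    # ['A', '_aux1'] ['B', 'C']
```

In the forced tree, B and C appear to share an intermediate ancestor that A lacks, a grouping the copying history does not support. This is the kind of artefact a philologist would have to revise by hand.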

Mehler and Lücking (2014) discuss the gap between computation
and the humanities. In their paper, they mention the stemmatological
reconstruction of lost hyparchetypes as an exemplary manifestation
of a more general problem found, among others, in text classification,
topic detection and labelling, textual entailment, topic tracking and text
reuse. Accordingly, stemmatic problems are believed here to be generalisable
and a valid primer for similar problems in bioinformatics and
other fields. For instance, bacterial lateral gene transfer is potentially
similar to the contamination problem (one manuscript having
two Vorlagen, i.e. exemplars). In sum, a subfield which formerly had two
participating sciences, biology and computer science, now gains
a third partner.

Generally speaking, there is more than one answer to the question of who
transfers the models: there is no rule as to whether this must be a computer
scientist or an adherent of the humanities discipline, a “halfblood”
or a team of two or more scientists. Ideally, the digital humanities
approach, as an interdisciplinary approach, can integrate multiple
advantages. The computer-scientific advantages of efficient data processing
and of a statistical perspective on the data
can combine with the humanists' knowledge of detail and historical
context. As a broader example, Rayner et al. (2012) rejected
letter n-gram based models of reading as neurologically plausible simulations
on the basis of psycholinguistic research, even though the models
had some predictive power. This points to a possible digital humanities
compromise, which is to take the solution which gives the best
results (computation) but is at the same time plausible (humanities).

The full paper will contain a comprehensive history of the development
of stemmatology and, abstracting from that, will highlight the
digital humanities dimension involved.


  • Howe, C. J., Connolly, R., and Windram, H. F. (2012). Responding to
    criticism of phylogenetic methods in stemmatology. SEL Studies in
    English Literature, 52:51–67. DOI: 10.1353/sel.2012.0008.
  • Mehler, A. and Lücking, A. (2014). On Covering the Gap between
    Computation and Humanities. Dagstuhl Reports. forthcoming.
  • Pasquali, G. (1988). Storia della tradizione e critica del testo. Casa
    editrice Le lettere, Firenze.
  • Rayner, K., Pollatsek, A., Ashby, J., and Clifton, C., Jr. (2012). Psychology
    of Reading. Psychology Press, New York/Hove.
  • Roelli, P. and Bachmann, D. (2010). Towards generating a stemma of
    complicated manuscript traditions: Petrus Alfonsi’s Dialogus. Revue
    d’histoire des textes, 5(4):307–321.
  • Timpanaro, S. (2005). The Genesis of Lachmann’s Method. University
    of Chicago Press, Chicago.
  • West, M. L. (1973). Textual Criticism and Editorial Technique: Applicable
    to Greek and Latin texts. Teubner, Stuttgart.