A discussion of the aligned translation feature in Beyond Translation, illustrated by a walk-through of an alignment of a selected section of the Iliad.
Beyond Translation can publish word- and phrase-level alignments between source text and translation. The inaugural release of Beyond Translation (March 2023) contains born-digital aligned translations for book 1 of the Iliad and book 5 of the Odyssey, developed by Amelia Parrish and Gregory Crane. These translations have been created to explore the form and function of born-digital aligned translations that are designed from the start to illustrate the function of each word and expression in the text as fully as possible. Farnoosh Shamsian has developed not only the first born-digital translation, but the first translation directly from the Greek, of Iliad 1 into Persian.
Figure 1 illustrates the basic functionality of a born-digital aligned translation in the initial release. It starts with the default screen and selects the alignments reading mode. By default, the alignment mode presents the source text and translation aligned according to sentence breaks: “Iliad Sentence Alignment (Crane)”.
The default sentence alignments follow the breaks (as marked by periods and colons) in the Monro-Allen Greek edition (Monro 1920). In the “Iliad Word Alignment (Parrish)”, the born-digital translation of Iliad 1 by Amelia Parrish and Gregory Crane, the sentence breaks have been manually revised. Our goal was to break the text into the smallest possible sentence units so that readers with no knowledge of Greek, or who had only just begun to study it, could work with the shortest base sentences possible.
This section introduces the basic functionality of the aligned translation and thus touches only very briefly upon the decisions made in creating this translation. We plan a separate publication on the translation itself.
The Parrish/Crane translation begins with a slightly smaller chunking, breaking the first sentence after line 5, rather than line 7.
Selecting the “Show unaligned tokens” option reveals which English words do not correspond to anything in the Greek and gives a quick sense of how closely source text and translation can be aligned.
The words “that” and “wrath” are added to this line to capture the impact of enjambment: line 1 is grammatically complete and the audience needs to reopen their model of what they are hearing to attach oulomenên (which we render “sociopathic”) to mênis, “divine wrath.”
Otherwise, the additions reflect differences between Homeric Greek and English: (1) because Greek does not require pronouns to stand in for subjects (it is a “pro-drop” language), we add “It” in line 3 and “it” in line 4; (2) because in Homer the Greek word ho has not yet completed the transition from a pronoun to its primary Classical Greek usage as a definite article, we have had to add “the” to the English in lines 1, 2 and 5; (3) because Greek commonly lets the audience infer ownership, we have added “their” in line 4.
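The “Show unaligned tokens” behaviour can be pictured as a simple set difference. The following is a minimal sketch, not the Beyond Translation implementation: the `Alignment` class, the `unaligned_tokens` function, and the toy gloss are all hypothetical and were invented for illustration. Each link joins one or more source positions to one or more translation positions, and any translation token that no link covers counts as unaligned.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Alignment:
    """One link between source-token positions and translation-token positions.

    Hypothetical structure: either side may hold several positions, so a
    single link can join one or more words to one or more words.
    """
    source: Tuple[int, ...]
    target: Tuple[int, ...]


def unaligned_tokens(target_tokens: List[str],
                     alignments: List[Alignment]) -> List[Tuple[int, str]]:
    """Return (position, token) pairs of translation tokens that no link covers."""
    covered = {i for link in alignments for i in link.target}
    return [(i, tok) for i, tok in enumerate(target_tokens) if i not in covered]


# Toy data: an invented gloss and invented links, NOT the published alignment.
greek = ["mênin", "aeide", "thea"]
english = ["Sing", "of", "the", "wrath", "goddess"]
links = [
    Alignment(source=(0,), target=(1, 3)),  # one Greek word linked to two English words
    Alignment(source=(1,), target=(0,)),
    Alignment(source=(2,), target=(4,)),
]

print(unaligned_tokens(english, links))  # [(2, 'the')]
```

In this toy case only “the” is left over, which mirrors the kind of addition described above: English needs an article that the Greek does not supply.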
The alignments can link one or more words in the original with one or more words in the translation. This particular translation was designed for those who wished to see how the underlying Greek works and it thus seeks to illustrate as much of the grammar as possible. Consider for example the Greek imperfect form, eteuche, from the verb teuchô, “to fashion or make.”
Many translators treat eteuche as a simple statement. Richmond Lattimore (1951) translated it as “gave” (“gave their bodies to be the delicate feasting”). Stanley Lombardo (1997) simply translates the verb as “left” (“left their bodies to rot as feasts / For dogs and birds”). Caroline Alexander (2015) chooses “rendered” ([it] “rendered their bodies prey for dogs”).
These simple past tense verbs would, however, more accurately represent the Greek aorist tense (literally, “undefined”). The imperfect tense (to quote the recent Cambridge Grammar of Classical Greek, p. 406, section 33.6) “normally signifies that an action is ongoing or repeated.” And, indeed, repetition characterizes the events depicted by eteuche. The poem starts with a simple past/aorist verb (proiapsen, “it hurled”) but then shifts to the imperfect tense. The murderous wrath of Achilles extends over time, with warrior after warrior being converted into snacks for dogs and birds.
We, however, translate the Greek imperfect form, eteuche, as “it began to make,” emphasizing that this verb describes a process that takes place over a period of time. In fact, a more graphic and striking translation might be warranted, where the wrath fashions their bodies the way a cook might convert the bodies of animals into attractive meals for humans. The unusual word helôria, which we translate as “snacks,” is a diminutive noun that comes from the Greek verbal root hel-, “to take.” The wrath converts the warriors into small morsels that the dogs and birds can snap up, like finger food for human beings.
Most translation alignment compares a preexisting translation to its source text. The alignments can reveal immediately compelling patterns that would otherwise be invisible. Beyond Translation offers one such example. Maryam Foradi, as part of her PhD at Leipzig University, manually aligned the Persian poetry of the Divan by Hafez with an English translation. We can see how particular words in English correspond to the Persian source.
If, however, we highlight English words that do not appear in the Persian original, the text changes completely.
The translator has added references to God and to religion that have no place in the Persian. He has converted the poetry of love and wine into theological allegory, completely misrepresenting the base text and invisibly (to the normal reader) imposing his interpretation. The change is immediately clear and dramatic when the alignments are used.
Computational linguists have long used statistical methods to detect which words co-occur unusually often in source text and translation to generate alignments automatically. The results can be noisy but patterns emerge all the same from the noise.
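To make the idea of “co-occurring unusually often” concrete, here is a minimal sketch that uses the Dice coefficient, one simple association measure chosen for illustration. It is not the method of any system named in this article, and the function names and the three-sentence Latin/English corpus are invented. For each source/translation word pair, the score compares the number of sentence pairs in which both words appear with the number in which each appears at all.

```python
from collections import Counter
from itertools import product


def dice_scores(sentence_pairs):
    """Score every (source word, target word) pair with the Dice coefficient.

    Dice = 2 * (pairs containing both words)
           / (pairs containing the source word + pairs containing the target word)
    """
    src_count, tgt_count, pair_count = Counter(), Counter(), Counter()
    for src, tgt in sentence_pairs:
        src_set, tgt_set = set(src), set(tgt)  # count each word once per sentence
        src_count.update(src_set)
        tgt_count.update(tgt_set)
        pair_count.update(product(src_set, tgt_set))
    return {
        (s, t): 2 * n / (src_count[s] + tgt_count[t])
        for (s, t), n in pair_count.items()
    }


def best_match(word, scores):
    """Return the target word with the highest score for the given source word."""
    candidates = {t: v for (s, t), v in scores.items() if s == word}
    return max(candidates, key=candidates.get)


# Invented toy corpus of sentence-aligned pairs.
pairs = [
    (["canis", "currit"], ["the", "dog", "runs"]),
    (["canis", "dormit"], ["the", "dog", "sleeps"]),
    (["equus", "currit"], ["the", "horse", "runs"]),
]
scores = dice_scores(pairs)
print(best_match("canis", scores))   # dog
print(best_match("currit", scores))  # runs
```

Even here the frequent word “the” scores well against every Latin word (0.8 against canis), which hints at why results from real corpora are noisy and why patterns only emerge over many sentences.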
At Perseus, David Bamman was the first person to see the potential application of automated translation alignment to Greek and Latin. He used the bilingual Greek/English and Latin/English Perseus corpora (then comprising c. 7 million words of Greek and c. 5 million words of Latin).
Bamman (2011) created a Latin/English bilingual corpus that covered more than two thousand years. He used the relative frequency of different translation equivalents as proxies to track cultural changes within the preserved Latin corpus.
The Latin word oratio can, for example, designate a “speech” (e.g., in a court room) or a “prayer” (in a religious context). In the figure below (from Bamman’s article), we can see that “speech” dominates during the early period, but the frequency of “prayer” grows steadily, presumably along with the rise of Christianity in our texts. As the early modern age develops, “speech” becomes increasingly prominent until it surges in importance c. 1800 CE.
If we move in the other direction and see what Latin words correspond to the English term “knight,” we can see that the Latin word eques, which designates knights as a social class, is dominant in the classical period. In the Middle Ages, however, the English term “knight” more and more translates the Latin word miles, which designates a soldier such as the medieval knight of popular imagination.
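The use of relative frequencies of translation equivalents as a proxy can be sketched as follows. This is an illustration under stated assumptions, not Bamman's code or data: the record format (period, source word, aligned translation), the function name, and all counts are invented. For each period, the sketch computes what share of a word's aligned occurrences falls to each translation.

```python
from collections import Counter, defaultdict


def equivalent_shares(records, source_word):
    """For each period, return the share of each translation aligned to source_word."""
    counts = defaultdict(Counter)
    for period, src, tgt in records:
        if src == source_word:
            counts[period][tgt] += 1
    return {
        period: {tgt: n / sum(c.values()) for tgt, n in c.items()}
        for period, c in counts.items()
    }


# Invented counts for illustration only; they are not taken from any corpus.
records = (
    [("classical", "oratio", "speech")] * 9
    + [("classical", "oratio", "prayer")] * 1
    + [("medieval", "oratio", "speech")] * 3
    + [("medieval", "oratio", "prayer")] * 7
)

shares = equivalent_shares(records, "oratio")
for period, dist in shares.items():
    print(period, dist)
```

Swapping the roles of source and translation in the records gives the reverse view, in which one asks which Latin words an English term such as “knight” translates in each period.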
Alignment of translations to source texts also provides new ways to compare translations with each other and with the source.
We can also use accumulated manual alignments to detect patterns. In the figure below, for example, we see the results from a search for what words in Persian, Greek and English have been aligned to the English word “love.” The Greek examples capture a wide range of terms and allow users to explore a broad semantic field that includes sexual desire (erôs) and the idealized love to which Christians aspired (agapê).
Once we have a substantial number of manual alignments, we can use them to train a model for a next-generation automatic alignment system that also builds upon more general work with large language models. Yousef (2022) describes the system he built with data from Ugarit. When we used this automatic system (http://ugarit-aligner.com/) to align the born-digital aligned translation of the opening line of the Iliad, we found the results to be almost perfect. Our translation of mênis as “godlike wrath” represents our interpretation of the base meaning of that word and so the automated system simply aligns mênis with “wrath.” The one small error is that the automated alignment did not spot that Pêlêiadeô corresponds to “son of Peleus,” not just to “Peleus.”
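A search of this kind amounts to a lookup over stored alignment records. The sketch below assumes a hypothetical record format of (language code, source word, aligned translation words). The function name and the handful of records are invented for illustration and do not reflect how Ugarit or Beyond Translation store their data.

```python
from collections import defaultdict


def aligned_to(alignments, target_word):
    """Group the source words aligned to target_word by source language."""
    hits = defaultdict(set)
    wanted = target_word.lower()
    for lang, source_word, translation_words in alignments:
        if wanted in (w.lower() for w in translation_words):
            hits[lang].add(source_word)
    return {lang: sorted(words) for lang, words in hits.items()}


# Invented toy records: (language code, source word, aligned English words).
toy_alignments = [
    ("grc", "erôs", ["love"]),
    ("grc", "agapê", ["love"]),
    ("grc", "mênis", ["wrath"]),
    ("fas", "eshq", ["Love"]),
]

print(aligned_to(toy_alignments, "love"))
# {'grc': ['agapê', 'erôs'], 'fas': ['eshq']}
```

Because the lookup groups results by language, a reader can compare at a glance how different source languages divide up the semantic field that English covers with a single word.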
Aligning source texts with translations is hardly new: readers of Greek, for example, have for centuries been accustomed to seeing unfamiliar source texts translated into a language more familiar to them. For centuries, educated European readers used Latin translations from Greek.
Later, as vernaculars took over, bilingual editions emerged that offered translations in English (such as the Loeb Classical Library), French (such as the Budé series), and German (Tusculum editions).
When many students in Europe and the United States were required to study Latin and, in some cases, Greek, translations at the word and phrase level were published for some canonical texts. The figure below illustrates a more readable translation (on the left) juxtaposed with an interlinear translation, explaining the translation for each Latin word.
The Beyond Translation feature for showing unaligned tokens also implements in digital form a long-established practice from print culture. Literal translations into multiple European languages have long used italics to show which words in the translation have no equivalent in the original.
Alexander, Caroline, translator. Homer. The Iliad: A New Translation by Caroline Alexander, Ecco, 2015.
Bamman, David, and Gregory Crane. “Building a Dynamic Lexicon from a Digital Library.” Proceedings of the 8th ACM/IEEE-CS Joint Conference on Digital Libraries - JCDL ’08, ACM Press, 2008, p. 11, https://doi.org/10.1145/1378889.1378892.
Bamman, David, and Gregory Crane. “Computational Linguistics and Classical Lexicography.” Digital Humanities Quarterly, vol. 3, no. 1, 2009, http://www.digitalhumanities.org/dhq/vol/3/1/000033.html.
Bamman, David, and Gregory Crane. “Measuring Historical Word Sense Variation.” Proceedings of the 11th ACM/IEEE Joint Conference on Digital Libraries, 2011, pp. 1–10, https://people.ischool.berkeley.edu/~dbamman/pubs/pdf/jcdl2011.pdf.
Lattimore, Richmond, translator. Homer. The Iliad. University of Chicago Press, 1951.
Lombardo, Stanley, translator. Iliad: Homer. Hackett Publishing Company, 1997.
Maclardy, Archibald A. The Aeneid of Virgil, Book I.: Being the Latin Text in the Original Order, with the Scansion Indicated Graphically; with a Literal Interlinear Translation; and with an Elegant Translation in the Margin; and Foot-Notes in Which Every Word Is Completely Parsed ... Hinds, Noble & Eldredge, 1901, https://catalog.hathitrust.org/Record/102178583.
Monro, D. B., and T. W. Allen, editors. Homer Vol. I. Iliad (Books I-XII). Third Edition, Oxford University Press, 1920, https://archive.org/details/homeriopera01home.
Palladino, Chiara, et al. “Using Parallel Corpora to Evaluate Translations of Ancient Greek Literary Texts. An Application of Text Alignment for Digital Philology Research.” Journal of Computational Literary Studies, vol. 1, no. 1, Dec. 2022, https://doi.org/10.48694/jcls.100.
Pizzi, Italo. Manuale Della Lingua Persiana; Grammatica, Antologia, Vocabolario. W. Gerhard, 1883, https://catalog.hathitrust.org/Record/100547672.
Yousef, Tariq, et al. Automatic Translation Alignment for Ancient Greek and Latin. 2022, https://doi.org/10.31219/osf.io/8epsy.
Yousef, Tariq, et al. “An Automatic Model and Gold Standard for Translation Alignment of Ancient Greek.” Proceedings of the Thirteenth Language Resources and Evaluation Conference, European Language Resources Association, 2022, pp. 5894–905, https://aclanthology.org/2022.lrec-1.634.