- 1 Introduction
- 2 Editorial Policy for Translations on the Whitman Archive
- 3 Basic Steps to Preparation and Publication
- 3.1 Obtain archival quality digital images of the document
- 3.2 Record the document in the Whitman Archive tracking database
- 3.3 Process images
- 3.4 Transcribe and encode
- 3.5 Validate XML file
- 3.6 Perform initial checking of transcription and encoding
- 3.7 Perform second-level checking of the transcription and encoding
- 3.8 Blessing of the document
- 3.9 Final review
- 3.10 Published
- 4 Preparing Translations of "Poets to Come" for the Whitman Archive
- 4.1 Obtaining Digital Images
- 4.2 Transcription and Basic Encoding
- 4.3 Key Tags for Encoding "Poets to Come"
- 5 Contextual Essays about Translations of "Poets to Come"
- 6 Background and Contextual Reading
Introduction
This wiki is meant to provide information and resources to participants in the 2011 Obermann Center Seminar on Whitman Translation, held May 16–20, 2011, at the University of Iowa. All participants are invited to contribute to and edit this wiki.
In advance of the seminar, Ed Folsom asked participants to gather translations of a small group of shorter Whitman poems, including “A Noiseless Patient Spider,” “When I Heard the Learn’d Astronomer,” “Poets to Come,” “Trickle Drops,” “We Two Boys Together Clinging,” and “Reconciliation.” During the course of the seminar, participants decided to develop the translation section of the Whitman Archive first by focusing on all known translations of a single poem, "Poets to Come," in the languages represented by the participants—French, German, Italian, Polish, Portuguese, and Spanish. The Archive will feature encoded transcriptions of all translations of the poem in these languages along with contextual introductions prepared by the participants and a general introduction. This addition to the Whitman Archive will allow users to read the poem in these various languages, to study differences in translation within a single language, and to do comparative analysis across languages. The encoding eventually will allow collation of the poems as well as comparison of single lines from each version to other translations from the same language, to translations in other languages, and to the 1891–92 printing of the poem in Leaves of Grass.
Editorial Policy for Translations on the Whitman Archive
The current editorial policy statement for the translations section of the Archive indicates that we provide page images when possible along with transcriptions. In the transcriptions we do not attempt to capture the so-called bibliographic codes: the appearance of margins, fonts, and ornaments in the original printed documents. Most other features of the printed page are preserved: capitalization, hyphenation, punctuation, and page breaks. Our electronic transcriptions preserve typographical errors present in the original; to aid in searching and to allow for alternate forms of display, corrected forms are also included in the encoding.
Basic Steps to Preparation and Publication
The Whitman Archive has established a standard set of procedures and workflow for items that appear on the Archive. Although participants in the translation seminar are not responsible for performing all of these steps, understanding the complete workflow may be useful. The complete workflow, in standard order, follows:
Obtain archival quality digital images of the document
Whenever possible, the Archive obtains archival quality digital images of documents to be presented on the Archive. In general, scans of documents are preferable to photographs unless the documents are so fragile that they can only be photographed. Scans should be produced as 24-bit uncompressed color TIFFs with a minimum resolution of 600 dpi. If the document must be photographed, the camera should be set to save images in a RAW camera format, and items should be photographed at the highest possible megapixel setting. If you will need to photograph items yourself, please consult with Liz for additional photography tips.
Record the document in the Whitman Archive tracking database
Every item of interest to the Whitman Archive is recorded in the Whitman Archive Tracking Database. For each document, the database lets us record one or more editors and creators; the form of the original; one or more genres; whether the document is manuscript, print, or both; holding repository information; additional information about the writing surfaces of the document, including page images; and other, non-standardized information in the form of notes.
Process images
As part of the image processing step, we crop the images, rotate them if necessary, and color-correct them as needed. In addition, we produce various derivatives (lower-resolution versions with smaller file sizes) of the archival quality images for presentation on the Whitman Archive.
Transcribe and encode
The document is then transcribed by hand if no electronic text already exists. In rare cases, the Whitman Archive relies on OCR-capture for initial transcription. At the same time as the document is transcribed, it is also encoded. Encoding is the process of adding additional information to the file in a machine-readable format. Text encoding uses a markup language to tag the structure and other features of a text to facilitate processing by computers.
Validate XML file
Once the file has been fully transcribed and encoded, it is validated against the Whitman Archive's schema using a computer program, Oxygen. This step ensures that the markup conforms to project guidelines. The schema declares all of the tags (elements) that can be used in a Whitman Archive TEI file, the order and hierarchy in which they can appear, and the kinds of content they can contain. When you validate a document, Oxygen both makes sure that your file is well-formed (that elements are properly nested and that you haven't failed to close an element that you've opened) and checks your encoding against the Whitman Archive schema to make sure you haven't used any illegal tags or used specific elements in places where they are not allowed.
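To make the idea of well-formedness concrete, here is a minimal sketch. The line text is placeholder wording, not taken from any Archive file, and the fragment shows only nesting and closing; it says nothing about what the Whitman Archive schema itself allows.

```xml
<!-- Well-formed: every element that is opened is closed,
     and <l> closes before the <lg> that contains it. -->
<lg type="poem">
  <l>Placeholder text for a line of the poem</l>
</lg>

<!-- Not well-formed (shown inside a comment so this fragment stays parseable):
     the <l> element is still open when </lg> appears.

<lg type="poem">
  <l>Placeholder text for a line of the poem
</lg>
-->
```

A file can be well-formed and still fail validation, for example if it uses an element that the schema does not declare.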
Perform initial checking of transcription and encoding
The person responsible for the initial preparation of the TEI file checks his or her transcription and encoding and notes any trouble spots before passing the file on to a more senior editor.
Perform second-level checking of the transcription and encoding
A more senior Archive staff member checks both the transcription and encoding, paying particular attention to any places in the file that have been flagged by the original transcriber.
Blessing of the document
A senior editor reviews the transcription and encoding, adds additional metadata to the file if necessary, and "blesses" the document.
Final review
Ken Price or Ed Folsom reviews the document online and suggests additional changes if necessary.
Published
The document is published on the Whitman Archive.
Preparing Translations of "Poets to Come" for the Whitman Archive
Seminar participants will work with Whitman Archive staff to prepare translations of "Poets to Come" for publication on the Archive. Participants are responsible for two main steps: 1) obtaining archival quality digital images of the translations; and 2) completing transcription and basic encoding. Participants will also be asked to review their work before it is made publicly available on the Whitman Archive.
Obtaining Digital Images
As indicated above, whenever possible, the Archive obtains archival quality digital images of documents to be presented on the Archive. The key specifications for producing or requesting archival-quality images are: 24-bit uncompressed color TIFFs with a minimum resolution of 600 dpi. Please follow these specifications for items you scan from your own collections, and use these exact specifications in requests to repositories. If you would like some sample text for requesting images from a repository or you need other assistance in requesting scans, please contact Liz. If you own materials that should be photographed rather than scanned, or if you are working with a repository that prefers to photograph a document rather than scan it, please contact Liz for additional information.
Transcription and Basic Encoding
To complete the transcription and basic encoding of each translation, follow the instructions below. The instructions and template are also available for download as a .txt file [here].
1. Create a new plain-text file (using a program such as Notepad, or any plain-text editor, *not* word processing software such as MS-Word) and save it with a filename based on the translator's last name and the year of translation (for example, laforgue_1886.txt).
2. Complete the transcription of the poem and the encoding based on the template below. For some versions of the poem, you may need to use additional tags, such as for words that have end-of-line hyphenation; misspelled, idiosyncratically spelled, or archaically spelled words; lines that wrap to two or more lines; and other features in the translation that you want the computer to be able to process in some way.
3. Provide responses to the questions following the template, including details on who completed the transcription and encoding and publication information about the translation.
4. Submit the file to Liz as an email attachment.
Transcription and Encoding Template:
<head type="main-authorial"> </head>
<l n=""> </l>
<l n=""> </l>
<l n=""> </l>
<l n=""> </l>
<l n=""> </l>
<l n=""> </l>
<l n=""> </l>
<l n=""> </l>
<l n=""> </l>
- 1. What is the title of the poem as given by the translator?
- 2. Who is the translator of the poem?
- 3. Who completed the transcription and encoding of the poem? On what date(s)?
- 4. What is the complete bibliographic information for the publication? For books, provide the title of the book, the place of publication and publisher, year of publication, and page number(s) on which "Poets to Come" appears. For magazines, journals, and newspapers, provide the title of the publication, volume and issue numbers when applicable, date of publication, and page number(s) on which "Poets to Come" appears.
- 5. Are there details about the book or periodical publication you want to be included in a note? For example, if the translation of "Poets to Come" appears with several other of Whitman's poems in a magazine, you may want to record that information. Or, are there other issues or circumstances surrounding the publication of this translation that you want to describe?
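As a rough illustration of how a completed file might begin, the sketch below fills in the first part of the template with the opening of Whitman's English text rather than with an actual translation. It assumes that the n attribute holds the line number, which the template itself does not state; please confirm this with Archive staff before relying on it.

```xml
<head type="main-authorial">Poets to Come</head>
<l n="1">Poets to come! orators, singers, musicians to come!</l>
<l n="2">Not to-day is to justify me and answer what I am for,</l>
```

The remaining lines of the poem would follow the same pattern, one <l> element per poetic line.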
Key Tags for Encoding "Poets to Come"
<pb/> [page break]
We use the <pb/> tag to indicate page breaks. The tag is inserted at the beginning of each new page. Use <pb/> in every document, even if the document is only one page long. <pb/> is an empty tag, which means that it never needs a separate closing tag; just include a "/" at the end of the tag. The first <pb/> goes after the <body> tag and before the first <lg>. If there are multiple pages, insert a <pb/> at each place in the transcription that corresponds to the beginning of a new page. Often, these will fall after the close of one linegroup (</lg>) and before the opening of another (<lg>).
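A minimal sketch of this placement, for a hypothetical poem that runs across two pages. The line text is placeholder wording, and any attributes the Archive may add to <pb/> are omitted here.

```xml
<body>
  <!-- First page: <pb/> comes after <body> and before the first <lg>. -->
  <pb/>
  <lg type="poem">
    <lg type="linegroup">
      <l>Placeholder for the last line on the first page</l>
    </lg>
    <!-- Second page begins between two linegroups. -->
    <pb/>
    <lg type="linegroup">
      <l>Placeholder for the first line on the second page</l>
    </lg>
  </lg>
</body>
```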
<lg> [line group]
The line group tag (<lg>) is used exclusively to mark clusters of poems, single poems, and structural sub-units within them (i.e., groups of lines, called "sections" or "linegroups," that constitute distinct units within a poem). If the poem has no distinguishable sub-units within it, only a single <lg> tag is needed for the entire poem; if the work has one or more sub-units, you need to mark each of those units with the appropriate <lg>. For example, for a poem divided into three linegroups, the poem itself would be tagged <lg type="poem"> and each linegroup would be tagged <lg type="linegroup">. The type attribute is required; values include "cluster," "poem," "section," and "linegroup."
<head> [heading]
The <head> tag marks the title of the poem or the titles of sections, as Whitman has provided them. Each <lg> can have its own <head>.
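The three-linegroup case described above can be sketched as follows. The line text is placeholder wording, and each linegroup is cut down to a single line to keep the example short.

```xml
<lg type="poem">
  <lg type="linegroup">
    <l>Placeholder line in the first linegroup</l>
  </lg>
  <lg type="linegroup">
    <l>Placeholder line in the second linegroup</l>
  </lg>
  <lg type="linegroup">
    <l>Placeholder line in the third linegroup</l>
  </lg>
</lg>
```

A poem with no internal divisions would need only the outer <lg type="poem"> wrapped directly around its <l> elements.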
The "type" attribute on this element can take two values:
- main-authorial (the main title provided by Whitman)
- sub (subtitle provided by Whitman)
<l> [poetic line]
Used to mark a poetic line. Use <lb/> to mark points where a line wraps on the page.
A sample structure might look like this:
<l>I celebrate myself,</l>
<l>And what I assume you shall assume,</l>
<l>For every atom belonging to me, as good belongs<lb/>
to you.</l>
<hi> [highlighted text (italics, smallcaps, underlining, etc.)]
<hi> (highlighted) marks a word or phrase as graphically distinct from the surrounding text. Typically, we use it to indicate that individual words, phrases, or sentences within larger structures such as lines and paragraphs are highlighted in the original through italicization, the use of small caps, underlining, etc. The <hi> element uses the "rend" attribute to specify the nature of the highlighting:
|Value of 'rend' attribute||Function|
|underline||Indicates underscored text|
|italic||Used only in transcriptions of printed material or in project notes to mark titles of books|
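A minimal sketch of <hi> inside a poetic line, using placeholder text and the two rend values listed in the table:

```xml
<l>A placeholder line with one <hi rend="italic">italicized</hi> word</l>
<l>A placeholder line with one <hi rend="underline">underscored</hi> word</l>
```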
<orig>, <reg>, <sic>, and <corr> [regularized spelling and corrections]
<orig> (original form) contains a reading which is marked as following the original, rather than being normalized or corrected.
<reg> (regularization) contains a reading which has been normalized in some sense.
We often use these tags when encoding Whitman's poetry, for instance in cases where a word at the end of a poetic line is hyphenated. Because we wish both to record the lineation of the copy text and to enable searches for words that are broken by end-line hyphenation, we use the <orig> and <reg> tags to record the original and regularized readings. This will allow the original version to be displayed online while users can still search for the regularized form and be directed to the passage in question.
<sic> (Latin for "thus" or "so") contains text that is reproduced although it is apparently incorrect or inaccurate, and is used to represent a mistake by the author.
<corr> allows the encoder to provide a correction.
All of these elements are nested within the <choice> element, which groups a number of alternate readings for the same point in a text.
Example for <orig> and <reg>:
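A sketch for a word broken by end-of-line hyphenation, using placeholder text. Placing the <lb/> inside <orig> is an assumption based on the description of line wrapping above; check Archive practice before following it.

```xml
<l>A placeholder line whose last word is <choice><orig>hyphen-<lb/>ated</orig><reg>hyphenated</reg></choice></l>
```

Here <orig> records the word as it is broken on the page, while <reg> supplies the unbroken form that a search would match.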
Example for <sic> and <corr>:
<choice>
<sic>the incorrect way it's written</sic>
<corr>the correct way to write it</corr>
</choice>
Note: Sometimes what you might think of as a spelling error would more accurately be termed an alternate spelling. For words that are spelled in idiosyncratic—though not exactly incorrect—ways, use the <orig> and <reg> tags as described above. As an example, look at Whitman's spelling of Shakespeare as "Shakspere." Since this spelling of Shakespeare's name is one he himself used (and he never, as far as we know, used "Shakespeare"), it should be encoded as follows:
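A minimal sketch of that encoding, following the descriptions of <choice>, <orig>, and <reg> given above:

```xml
<choice><orig>Shakspere</orig><reg>Shakespeare</reg></choice>
```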
Contextual Essays about Translations of "Poets to Come"
Background and Contextual Reading
- Jerome McGann and Dino Buzzetti, "Electronic Textual Editing: Critical Editing in a Digital Horizon," in Electronic Textual Editing, ed. Lou Burnard, Katherine O'Brien O'Keeffe, and John Unsworth (New York: MLA, 2006).
- Allen H. Renear, "Text Encoding," in A Companion to Digital Humanities, ed. Susan Schreibman, Ray Siemens, and John Unsworth (Oxford: Blackwell, 2004).
- Amanda Gailey, "A Case for Heavy Editing," in The American Literary Scholar in the Digital Age, ed. Amy E. Earhart and Andrew Jewell (Ann Arbor: University of Michigan Press, 2010).
- James Cummings, "The Text Encoding Initiative and the Study of Literature," in A Companion to Digital Literary Studies, ed. Ray Siemens and Susan Schreibman (Oxford: Blackwell, 2008).
- Mary-Jo Kline and Susan Holbrook Perdue, eds., A Guide to Documentary Editing, Third Edition (Charlottesville: University of Virginia Press, 2008).
- Text Encoding Initiative P5 Guidelines.
- "A Gentle Introduction to XML."