Encoding Guidelines

1 Introduction
2 Global
- 2.1 Header
- 2.2 Unique Identifiers
3 Basic Document Structure by Genre
4 Basic Elements for Marking Structure
5 Other Common Elements
6 Page Breaks and Image Linking
7 Spacing
8 Unusual Characters & Marks
9 Special Cases
10 External Links
11 Resources

Introduction

Use your browser's "Find (on this page)" function (Ctrl + F) to locate the term or string of characters you're looking for.

Using The Walt Whitman Archive Encoding Guidelines

These guidelines define current transcription and encoding practices of the Whitman editors and staff as we make Whitman's writings available on our website, The Walt Whitman Archive. While for the most part our practices have stabilized, discussion is ongoing, and our practices continue to evolve. The guidelines were last updated November 2016.

Two major sections make up the core of the guidelines: "Global" and "Local." Global describes those aspects of encoding that are consistent from document to document. Local addresses aspects that vary from document to document. We have designed the guidelines to be read in order, so we recommend that you first read the section on global encoding before moving on to the local. Additionally, there is an Annotated Template with the basic tagging filled in and a Resources section, which lists several useful tools and references, such as library codes, preferred citations, and links to the TEI guidelines. If you have questions or comments, please email Kenneth Price or Brett Barney.

Some Basic Vocabulary

The primary audience for these guidelines is Walt Whitman Archive staff, and staff members come to the project with varying degrees of familiarity with humanities computing. The following explanations of selected basic terms are intended to help those with limited experience better understand the guidelines:

TEI (Text Encoding Initiative)

As described on the TEI Consortium homepage, "The Text Encoding Initiative (TEI) is a consortium which collectively develops and maintains a standard for the representation of texts in digital form. Its chief deliverable is a set of Guidelines which specify encoding methods for machine-readable texts, chiefly in the humanities, social sciences and linguistics. Since 1994, the TEI Guidelines have been widely used by libraries, museums, publishers, and individual scholars to present texts for online research, teaching, and preservation." In other words, the TEI is a standard for making transcriptions of complicated texts (including handwritten manuscripts) readable by computers.

Markup and Encoding

Generally, these terms refer to the "tags" that we include in our transcriptions to mark textual features in a way that allows them to be processed by a computer.

Tag

The string of characters surrounded by "<" and ">". For example: <add>. Tags typically come in pairs, an "opening" one to mark the beginning and a "closing" one to mark the end of a portion of the transcription. The example above is an opening tag. A closing tag includes a slash after the "<" to distinguish it: </add>. A pair of tags describes all of the transcription that they enclose, so if you wanted to indicate that the word "crusty" was added to a text, you would tag it like this: <add>crusty</add>.

Element

This is the core part of a tag: the first string of characters after the "<" in the opening tag. For example, in <add rend="insertion">, "add" is the element.

Attribute

This is a secondary part of a tag that creates a category for further describing the element. It appears after the element name in the opening tag. An attribute must be followed by a value (see next). An element may have more than one attribute, each separated from the element name and from other attribute/value combinations by a single space.

Value

This is a word or short phrase that classifies the element in terms of a particular attribute. It is contained within quotation marks and preceded by an equal sign. For example, in the following add tag, "rend" and "place" are attributes and "unmarked" and "supralinear" are values: <add rend="unmarked" place="supralinear">.
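
For staff who find it easier to see these terms in action, here is a minimal sketch (not part of the Archive's own tooling) that uses Python's standard xml.etree.ElementTree module to take apart the <add> tag described above. The word "crusty" is reused from the Tag entry; the variable names are purely illustrative.

```python
import xml.etree.ElementTree as ET

# A complete element: an opening tag carrying two attribute/value pairs,
# the enclosed transcription, and the matching closing tag.
sample = '<add rend="unmarked" place="supralinear">crusty</add>'

element = ET.fromstring(sample)

print(element.tag)     # the element name
print(element.attrib)  # each attribute mapped to its value
print(element.text)    # the transcription that the tag pair encloses
```

Run as written, the three print statements show the element name, the attribute/value pairs, and the tagged text as separate pieces, which is how a computer "sees" the markup.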

DTD (Document Type Definition)

This is a file that functions as our rulebook. Validating (or "parsing") our markup against the DTD is one tool we use to see whether our files are properly encoded. Though our DTD implements the TEI standard, the Whitman Archive DTD is unique, customized for our specific project needs.

Nesting

This refers to the practice of enclosing pairs of tags within other pairs of tags, a basic principle of properly structuring XML documents. For example, the tag pair <TEI> and </TEI> contains almost all of the other markup within a document (see the Annotated Template for a visualization of this). It is important to nest tags properly, so that a particular element does not overlap other elements. The following is an example of improperly nested markup: <l><add></l></add>. Compare this properly nested example: <l><add></add></l>. At any given point in a document several tags may be "open," but the tag most recently opened must always close before earlier tags close.
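
The nesting rule is something any XML parser enforces. As a rough illustration, the following sketch feeds the two examples above to Python's standard xml.etree.ElementTree parser. The helper name is_well_formed is our own invention, and this check covers nesting only; it is not a substitute for validating a file against the DTD.

```python
import xml.etree.ElementTree as ET

def is_well_formed(fragment):
    """Return True if the fragment parses as XML, False if the parser rejects it."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        # Raised, among other reasons, when a closing tag does not match
        # the most recently opened tag.
        return False

print(is_well_formed("<l><add></l></add>"))  # improperly nested example
print(is_well_formed("<l><add></add></l>"))  # properly nested example
```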

Stylesheet

An XSLT (Extensible Stylesheet Language Transformations) stylesheet is a file that transforms the encoded manuscript into HTML (HyperText Markup Language) for use on the web. In other words, stylesheets are what we use to make the various XML-encoded documents display in consistent and attractive ways on our site.

The Purpose of Encoding

The Walt Whitman Archive is developing in many areas simultaneously, involving work on a variety of materials, from handwritten manuscripts to printed books to drawings and photographs of Whitman. The XML encoding of manuscripts, in particular, has presented us with numerous challenges. XML encoding is not mechanical but interpretive. Sophisticated users of the Whitman Archive may wish to understand our tagging from the inside, as it were, thereby better grasping the query potential of our Archive. We wish to be overt about what the Whitman Archive has chosen, thus far, to encode for different Whitman materials. The "thus far" in that sentence points to a key aspect of encoding: it is a process that can go through multiple passes and layerings. Generally speaking, our approach in tagging is to encode at a structural level. Thus we tag paragraphs, lines, and the like. This tagging will enable searches on discrete parts of documents (e.g., paragraphs, lines, sections, titles) within individual documents, across printed documents, across all documents, and so forth. One could also do, say, thematic or subject-based tagging, but we have avoided that because we have felt that it would lead to endless internal debates on the project and might lead to a too-coercive editorial presence or at least to an end product that was too bound and too limited by the perspectives and the historical moment of the current creators of this site.

There are other features that it would not be controversial to tag, that would be useful information to have accessible, but that we have nonetheless chosen not to tag. For example, we have not recorded paper types, nor have we systematically noted ink and pencil colors. We can certainly imagine scholars who could make brilliant use of this information if it were recorded across all of Whitman's available documents. Still, a project such as the Whitman Archive constantly faces practical questions about what to prioritize. The magnitude of the entire undertaking is so vast that we know that we can at best hope to achieve a first pass through the material. Whitman himself sometimes thought that he left his writings for "poets to come" who would justify him and make clear his significance. Something analogous is at work in our hope that we can produce through the Whitman Archive not a monumental product but instead a monumental process that can be continued, corrected, and otherwise improved by future scholars. Other scholars with special interests in particular aspects of textuality could take our initial tagging and add additional layers that would enable various types of analysis.

Assuming continuing cooperation from libraries, we hope to make it possible for any interested person to look at images of all the known manuscripts; to search, in complicated ways, the text of those manuscripts and the rest of Whitman's work; to find out quickly where the physical documents are held; and to begin to make sense of a vast collection of important documents. The work of individual encoders is invaluable to the project, as it translates into practice the theoretical conclusions that we've reached since beginning work on the Whitman Archive in 1995. Encoding manuscripts and other documents is some of the most complex and valuable work we have underway.

Getting Started

Before you begin to encode your first manuscript, you'll need to get an assignment from one of the project editors. This process is managed by a series of spreadsheets accessible through Google Docs. See the editor of your project for a link to the relevant spreadsheet. Here, you will find a list of documents available for transcription and the unique id that you will use while encoding. After you locate an assigned document in the spreadsheet, you must mark it with your initials and "in progress" to indicate that you have begun working on it.

Once you have located a document, you will need to search for it in the Whitman Archive Tracking Database (password protected). Often, it will be useful for you to consult the notes accompanying the manuscript images, notes that were made by the institution or individual as they created the digital images. These notes describe the relationship of the individual images to other images (for example, recto/verso relationships) and, where applicable, provide the folder name and other important bibliographical information. To look at these notes (which, due to the complicated history of the project, vary in form and completeness), see the "Notes" field of the database entry for a given document.

Individual encoders use various software to create encoded documents, though oXygen XML Editor has been popular and is recommended. For basic transcription work, NoteTab may be useful. It is also important to work at a machine with imaging software that allows you to "zoom in" on complicated portions of the manuscript or do other manipulations that enable easier and more accurate transcription.

Once you have finished transcribing and encoding and have validated the file against the DTD, you must upload your file to the appropriate place on the development server, note that the document is finished by changing the status in the Google Docs spreadsheet, and return to the Whitman Archive Tracking Database to check the "status" boxes that indicate that your document has been transcribed and encoded. This procedure updates the file's status so that others may double-check it and publish it on the site.

Global

[Note: Above the Header of each document are the XML Declaration, Document Type Declaration, Schematron Declaration, and the open tag of the "root element," TEI, which contains all other elements. Please go to the annotated template to see how to insert these.]

Every XML document we create has a "header," a kind of electronic title page that carries essential information about who is responsible for creating and publishing the document and about the source of the text we are marking up. The header is analogous to a book's first few pages, which inform you of the author, publisher, copyright date, terms of publication, etc.

Since much of the information in the header is the same for all of the XML documents we create, we recommend that you use the annotated P5 template (to download the XML file, right-click on the link) to simplify your encoding of it.

Below, you will find descriptions of the main parts of the header.

The <teiHeader> has two principal components:

<fileDesc> contains a full bibliographic description of an electronic file
<revisionDesc> summarizes the revision history for a file

These elements are arranged within the <teiHeader> in this order, so the overall structure of <teiHeader> is this:

<teiHeader>
  <fileDesc></fileDesc>
  <revisionDesc></revisionDesc>
</teiHeader>

<fileDesc> File description

<fileDesc> should contain the following components:

<titleStmt>
<editionStmt>
<publicationStmt>
<notesStmt>
<sourceDesc>
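
As a rough aid, the list above can be turned into a quick completeness check. The sketch below is an assumption-laden illustration, not Archive tooling: the function names are hypothetical, and we treat <notesStmt> as optional because it is described below as something added only "in some cases." Tag names are compared without their namespace so that the check works whether or not a namespace is declared.

```python
import xml.etree.ElementTree as ET

# Components named in the guidelines, in the order they are listed.
EXPECTED = ["titleStmt", "editionStmt", "publicationStmt", "notesStmt", "sourceDesc"]
# Assumption: notesStmt is needed only when there is something to explain.
OPTIONAL = {"notesStmt"}

def local_name(tag):
    """Strip a leading '{namespace}' from an ElementTree tag, if present."""
    return tag.rsplit("}", 1)[-1]

def missing_components(file_desc):
    """List the expected, non-optional children that a <fileDesc> lacks."""
    present = {local_name(child.tag) for child in file_desc}
    return [name for name in EXPECTED
            if name not in present and name not in OPTIONAL]

# An illustrative, deliberately incomplete <fileDesc>.
sample = ET.fromstring(
    "<fileDesc><titleStmt/><publicationStmt/><sourceDesc/></fileDesc>"
)
print(missing_components(sample))
```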

<titleStmt> Title statement

The title statement includes:
1) the title given to the electronic work (which here always includes the subtitle provided by us: "a machine readable transcription")
2) the author (for this field, you should use the regularized version of the author name, for instance "Walt Whitman")
3) the editors (NOTE: if a contributing editor needs to be listed, you can add @role="contributing" to the <editor> tag)
4) information about others responsible for aspects of the electronic text
5) the names of the sponsors and funders.

Here is an example of a full title statement in which the original document bears a title given by Whitman:

<titleStmt>
  <title level="m" type="main">Song of Myself</title>
  <title level="m" type="sub">a machine readable transcription</title>
  <author>Walt Whitman</author>
  <editor>Kenneth M. Price</editor>
  <editor>Ed Folsom</editor>
  <respStmt>
    <resp>Transcription and encoding</resp>
    <persName xml:id="ss">Stefan Schoeberlein</persName>
    <persName xml:id="kc">Kirsten Clawson</persName>
    <persName xml:id="nnk">Nima Najafi Kianfar</persName>
    <persName xml:id="nhg">Nicole Gray</persName>
  </respStmt>
  <sponsor>Center for Digital Research in the Humanities, University of Nebraska-Lincoln</sponsor>
  <sponsor>University of Iowa</sponsor>
  <funder>The National Endowment for the Humanities</funder>
</titleStmt>

An example for a manuscript that lacks an authorial title (to read the guidelines for assigning titles, click here):

<titleStmt> <title level="m" type="main" rend="bracketed">I see who you are</title> <title level="m" type="sub">a machine readable transcription</title> . . . </titleStmt> etc.

<editionStmt> Edition statement

The edition statement gives the current date. Example:

<editionStmt>
  <edition>
    <date>2016</date>
  </edition>
</editionStmt>

<publicationStmt> Publication statement

The publication statement includes the unique id number <idno>, distributor <distributor>, address <address>, and a statement of rights and availability <availability>. For guidelines on how to generate a statement of rights and availability specific to the item you are working on, click here.
Example:

<publicationStmt>
  <idno>anc.02076</idno>
  <distributor>The Walt Whitman Archive</distributor>
  <address>
    <addrLine>Center for Digital Research in the Humanities</addrLine>
    <addrLine>319 Love Library</addrLine>
    <addrLine>University of Nebraska-Lincoln</addrLine>
    <addrLine>P.O. Box 884100</addrLine>
    <addrLine>Lincoln, NE 68588-4100</addrLine>
  </address>
  <availability>
    <p>The text of the original item is in the public domain.</p>
    <p>The text encoding and annotations were created and/or prepared by the <title level="m">Walt Whitman Archive</title> and are licensed under a <ref target="http://creativecommons.org/licenses/by/4.0/">Creative Commons Attribution 4.0 International License</ref> (CC BY 4.0). Any reuse of the material should credit the <title level="m">Walt Whitman Archive</title>.</p>
  </availability>
</publicationStmt>

<notesStmt> Notes statement

In some cases you may wish to provide additional explanation, for instance of the date you have assigned to a manuscript. That information should be entered into a <notesStmt> with a <note type="project" target="#dat1"> (the @target value points to the xml:id of the <date> in the <sourceDesc>), as follows:

<notesStmt>
  <note type="project" target="#dat1">This manuscript was probably written in <date when="1871">1871</date> after Whitman accepted the invitation from the American Institute to compose and recite a poem at the opening of its fortieth Annual Exhibition in New York City. Whitman read the poem on <date when="1871-09-07">September 7, 1871</date>, and it was published on that date in the <hi rend="italic">New York Evening Post</hi> and on subsequent days in at least eight other newspapers.</note>
</notesStmt>

<sourceDesc> Source description

The source description provides a bibliographic description of the copy text(s) used in the creation of the present electronic text. The source description will always include a <bibl> element, where we cite known bibliographic information about the original manuscript source. In cases where the original manuscript is lost or we do not have access to it and we are forced to derive our transcription from a different source, we will also include a <biblStruct> element, where we include citation information for the printed source (see example below). In cases when a <biblStruct> is needed, it will appear before <bibl>.

[Note on <bibl> versus <biblStruct>: According to the TEI Guidelines, <bibl> and <biblStruct> are both elements that contain a bibliographic citation; the latter is simply reserved for a more structured citation, in which sub-elements appear in a specific order. Our decision to encode manuscript citation information in a <bibl> (as opposed to a <biblStruct>) stems from the variability of bibliographic information for manuscript material (not all repositories, for example, assign ID numbers or titles to individual manuscripts).]

Example of a citation for a transcription derived only from a manuscript:

<sourceDesc>
  <bibl>
    <author>Walt Whitman</author>
    <title>Notebook LC #86</title>
    <date cert="medium" notBefore="1845" notAfter="1855" xml:id="dat1">Around 1850</date>
    <idno type="callno">MSS45443</idno>
    <orgName xml:id="loc">The Thomas Biggs Harned Collection of the Papers of Walt Whitman, 1842–1937, Library of Congress, Washington, D.C.</orgName>
    <note type="project">Transcribed from digital images of the original.</note>
  </bibl>
</sourceDesc>

Example of a citation for a transcription derived from a print source, where the manuscript's location is unknown:

[Note: Additional information about the print source can be supplied in a <note type="project">, included after <monogr>]

<sourceDesc>
  <biblStruct>
    <monogr>
      <author>Walt Whitman</author>
      <editor>Edward F. Grier</editor>
      <title xml:id="nupm">Notebooks and Unpublished Prose Manuscripts</title>
      <imprint>
        <pubPlace>New York</pubPlace>
        <publisher>New York University Press</publisher>
        <date when="1984">1984</date>
        <biblScope type="page">202</biblScope>
        <biblScope type="volume">1</biblScope>
      </imprint>
    </monogr>
    <note type="project">Grier derives his text from <hi rend="italic">Notes &amp; Fragments</hi>, ed. Richard Maurice Bucke (1899), 71.</note>
  </biblStruct>
  <bibl>
    <author>Walt Whitman</author>
    <title rend="bracketed">Not to dazzle</title>
    <date xml:id="dat1" notBefore="1845" notAfter="1855" cert="medium">Before or early in 1855</date>
    <orgName xml:id="med">The location of this manuscript is unknown.</orgName>
  </bibl>
</sourceDesc>

Notes on <author> within <bibl>: For printed texts, the author name will sometimes be unlisted, or as listed will vary from the regularized version. You might, for instance, run across something signed "Walter Whitman" rather than "Walt Whitman," or you might be dealing with a nom de plume. In such cases, within the <author> tag, you can include a <choice>, where <orig> is the name as listed (or unsigned), and <reg> is the regularized or known author name. You can also add an @key to designate how you would want the author name to sort in an index. For example:

<author key="Whitman, Walt">
  <choice>
    <orig>Mose Velsor</orig>
    <reg>Walt Whitman</reg>
  </choice>
</author>

Notes on <title>: This information is about the copy text, and the <title> here (as opposed to the one in titleStmt) should be given exactly as it appears in the records of the institutional repository, no matter how imprecise or wrong-headed their conventions may seem. Many times, the most specific title for the material will be that given to the folder used to store it, since few archives assign a title to each individual item; often, therefore, the <title> given in the <sourceDesc> will be a folder label.

Note on <idno>: This provides the call number by which the object (or the larger collection) is identified at the repository. You should be able to find it in the individual catalog or finding aid for the repository in question. Click here for a list of available online finding aids.

Note on <orgName>: The institution that holds the manuscript should be cited as listed in #Preferred Citations, below.

At present, we almost always work from digital images, but we have also worked from Joel Myerson's facsimile reproductions of Whitman manuscripts (published in Joel Myerson, The Walt Whitman Archive: A Facsimile of the Poet's Manuscripts, New York: Garland, 1993); from the Primary Source Media Whitman CD (Major Authors on CD-ROM: Walt Whitman, Eds. Ed Folsom and Kenneth M. Price, Woodbridge, CT: Primary Source Media, 1997); or from the original manuscripts themselves. Whatever the case, specific information about the image(s) and/or text(s) you rely on should be given in a <note type="project">. If you consult more than one source, list each, separated by semicolons. (Please note that when citing Myerson, the volume #, part #, and page # change from manuscript to manuscript.)

If a document has material on the verso that constitutes another text, with another Whitman Archive ID, add a <relatedItem> within the <bibl>. <relatedItem> can be @type="text", with @xml:id, and @target corresponding to the Whitman Archive ID of the text on the verso, or @type="document", with @target corresponding to the document ID assigned to both Whitman Archive objects. Since multi-text objects are assigned a document ID, you should have at least two relatedItems. You should also add a project note explaining the relation, with @target pointing to the xml:id assigned to the relatedItem with @type="text". For example:

<relatedItem type="document" target="wwa.00115"/>
<relatedItem type="text" xml:id="rel01" target="loc.00507"/>
<note type="project" target="#rel01">On the back of this note is a manuscript fragment with several lines of prose that were included, with slightly-revised wording, as lines of poetry in the initial poem of the 1855 Leaves of Grass, ultimately titled "Song of Myself."</note>
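
Because project notes point to other elements through @target values such as "#rel01" or "#dat1", it can be helpful to confirm that every such pointer has a matching xml:id somewhere in the file. The following is a minimal sketch of that idea using Python's standard library; the function name and the abbreviated sample are hypothetical, and the second note is deliberately left dangling to show what a problem looks like. Targets that do not begin with "#" (such as Whitman Archive IDs) are ignored.

```python
import xml.etree.ElementTree as ET

# ElementTree reports the xml:id attribute under the reserved XML namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

def unresolved_targets(root):
    """Return every '#...' target that matches no xml:id in the tree."""
    ids = {el.get(XML_ID) for el in root.iter() if el.get(XML_ID)}
    missing = []
    for el in root.iter():
        target = el.get("target", "")
        if target.startswith("#") and target[1:] not in ids:
            missing.append(target)
    return missing

# Illustrative fragment: the first note resolves, the second does not.
sample = ET.fromstring(
    '<bibl>'
    '<relatedItem type="document" target="wwa.00115"/>'
    '<relatedItem type="text" xml:id="rel01" target="loc.00507"/>'
    '<note type="project" target="#rel01">Note about the verso.</note>'
    '<note type="project" target="#dat1">Note about the date.</note>'
    '</bibl>'
)
print(unresolved_targets(sample))
```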

On rare occasions, a <relatedItem> may refer to another leaf that is materially related to the document being edited. An example of this would be another leaf that has similar characteristics of paper and ink and seemingly some continuity in thought, but not enough to treat it as part of the same text. For a specific example, see duk.00296.

<profileDesc>

In the <profileDesc> is a list of all hands other than Whitman's that the markup declares as being in any way responsible, typically as the value of a "resp" (or "responsibility") attribute in a note. For example, if Horace Traubel wrote a note at the top of a letter, you would include the following in the <teiHeader>:

<profileDesc>
  <handNotes>
    <handNote xml:id="h1" scribeRef="#ht"><persName xml:id="ht">Horace Traubel</persName></handNote>
  </handNotes>
</profileDesc>

(For more on this topic and how to encode non-Whitman writing on manuscripts, see below, #Writing in Hands Other than Whitman's.)

<revisionDesc>

The <revisionDesc> element is used to summarize the changes that have been made to the file. If multiple changes are performed at different times, add another <change> at the top, so that changes are listed in reverse chronological order (most recent change first). To describe the tasks in our routine workflow, choose from the following terms for the content of <change>:

   * transcribed, encoded
   * checked, revised
   * edited
   * blessed

If the task is something other than these, any descriptive phrase can be used. Example:

<revisionDesc>
  <change when="2015-06-12" who="#nhg">converted to P5; updated header</change>
  <change when="2010-08-12" who="#bb">transcribed; encoded</change>
</revisionDesc>
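
The "most recent change first" rule lends itself to a simple automated check. The sketch below is one possible approach, not an existing Archive script; the function names are our own. It assumes every @when value is a full YYYY-MM-DD date, so that comparing the strings alphabetically gives the same result as comparing the dates chronologically.

```python
import xml.etree.ElementTree as ET

def change_dates(revision_desc):
    """Collect the @when value of each <change>, in document order."""
    return [
        el.get("when", "")
        for el in revision_desc.iter()
        # Compare the tag name without any '{namespace}' prefix.
        if el.tag.rsplit("}", 1)[-1] == "change"
    ]

def newest_first(revision_desc):
    """True if the <change> elements run from most recent to oldest."""
    dates = change_dates(revision_desc)
    return dates == sorted(dates, reverse=True)

sample = ET.fromstring(
    '<revisionDesc>'
    '<change when="2015-06-12" who="#nhg">converted to P5; updated header</change>'
    '<change when="2010-08-12" who="#bb">transcribed; encoded</change>'
    '</revisionDesc>'
)
print(newest_first(sample))
```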

Unique Identifiers

Unique identifiers are one-of-a-kind names assigned to each electronic text we create. IDs are assigned at three levels. One ID is assigned at the level of individual texts, or single semantic units. This ID forms the basis of the file and the tracking database entry. We also assign a document ID. For further discussion of the Whitman Archive object, the unique identifiers assigned to both texts and documents (and the distinction between the two for the purposes of the Whitman Archive), see the "Organizing Principles" section of the Editorial Policy Statement and Procedures. Finally, we assign a work ID. For a discussion of the distinction between work and document, click here.

Creating and Assigning IDs

Text-level IDs are made up of a 3-character repository code plus a 5-digit number (assigned in ascending order), with the two fields separated by a dot.

Examples:

loc.00158 (a manuscript at the Library of Congress)
uva.00001 (a manuscript at University of Virginia)

Document-level IDs are made up of a wwa prefix plus a 5-digit number (assigned in ascending order), with the two fields separated by a dot.

Examples:

wwa.00001
wwa.00050

For works, IDs are made up of a xxx prefix plus a 5-digit number (assigned in ascending order), with the two fields separated by a dot.

Example:

xxx.00030 (the work ID assigned to Whitman's poem "Eidolons")
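
The three ID patterns described above share one shape: a three-character prefix, a dot, and five digits. A minimal sketch of a format check follows. It assumes that repository codes consist of three lowercase letters (as in the examples "loc" and "uva"); the guidelines say only "3-character," so that restriction, like the function name, is our assumption.

```python
import re

# Assumed shape: three lowercase letters, a dot, exactly five digits.
ID_PATTERN = re.compile(r"([a-z]{3})\.([0-9]{5})")

def classify_id(identifier):
    """Return 'document', 'work', or 'text' for a well-formed ID, else None."""
    match = ID_PATTERN.fullmatch(identifier)
    if match is None:
        return None
    prefix = match.group(1)
    if prefix == "wwa":
        return "document"
    if prefix == "xxx":
        return "work"
    # Any other prefix is taken to be a repository code.
    return "text"

for candidate in ["loc.00158", "wwa.00050", "xxx.00030", "loc.158"]:
    print(candidate, classify_id(candidate))
```

A check like this catches only malformed IDs; whether an ID has actually been assigned is recorded in the tracking database.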

ID Databases on the Web

We use a database to track the unique identifiers and our workflow as we transcribe, encode, and upload manuscripts. This database can be accessed here.

Placement of IDs

At present, we name files according to the text-level ID, which also forms the basis for the tracking database entry for a given object. The text-level ID appears in two places in the TEI header:

As an attribute value in the TEI root element (the very first tag):

<TEI xmlns="http://www.whitmanarchive.org/namespace" xml:id="loc.00158">

As content in the <publicationStmt>:

<publicationStmt> <idno>uva.00001</idno> </publicationStmt>

For work IDs, see the section on #Work Relationships below.

Transcription File Names

To name the file when you save it, simply add the file extension ".xml" to the ID. Example: "uva.00001.xml"
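
Since the same text-level ID is expected in the file name, in the xml:id of the <TEI> root element, and in the <idno> of the <publicationStmt>, a quick consistency check can catch copy-and-paste slips. The following is a sketch under stated assumptions rather than project tooling: the function names are hypothetical, and the sample file is reduced to the bare minimum needed to show the idea.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

# ElementTree reports the xml:id attribute under the reserved XML namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

def local_name(tag):
    """Strip a leading '{namespace}' from an ElementTree tag, if present."""
    return tag.rsplit("}", 1)[-1]

def publication_idno(root):
    """Return the text of the <idno> directly inside <publicationStmt>."""
    for stmt in root.iter():
        if local_name(stmt.tag) == "publicationStmt":
            for child in stmt:
                if local_name(child.tag) == "idno":
                    return (child.text or "").strip()
    return None

def ids_agree(file_name, xml_text):
    """True if file name, root xml:id, and publication idno all match."""
    root = ET.fromstring(xml_text)
    found = {Path(file_name).stem, root.get(XML_ID), publication_idno(root)}
    return len(found) == 1

sample = (
    '<TEI xmlns="http://www.whitmanarchive.org/namespace" xml:id="uva.00001">'
    '<teiHeader><fileDesc><publicationStmt><idno>uva.00001</idno>'
    '</publicationStmt></fileDesc></teiHeader></TEI>'
)
print(ids_agree("uva.00001.xml", sample))
print(ids_agree("uva.00002.xml", sample))
```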

Work Relationships

Note: The creation of Work IDs and the encoding that associates documents with those IDs are typically done by upper-level staff people and editors.

We encode the relationship of an individual manuscript to a work (or works). The constituent parts of published volumes are also associated with works. As opposed to a "document," which is a particular instantiation or instance of a poem or essay or book, etc., a "work" is the abstract idea of a poem or essay or book, etc. In general, works are titled according to the last instance published in Whitman's lifetime. For example, the work "Song of Myself" refers not to any particular manuscript or printed version of that poem, but to all of the versions collectively, whether titled differently, untitled, written in prose, etc. Individual instances associated with that work include the poem printed in the "deathbed edition," titled "Song of Myself"; the initial poem in the 1855 edition of Leaves of Grass; manuscript drafts of lines included in the poem; and sections of notebooks with ideas and trial phrases that contributed to the composition of the poem.

Work Relationship Guiding Principles

Developed in meetings 2016-17 by BB, KMP, KM, and NHG.

1. A work is conceived as the abstract idea of a textual unit (a poem, for instance). For practical purposes, we have assigned Work IDs based on the titles and content associated with the final printed version of a given work that was published in Whitman's lifetime.

2. Work IDs should be assigned to manuscript instances at the smallest possible unit. This is to say, a manuscript of "Song of Myself" would be assigned the Work ID for "Song of Myself," NOT Leaves of Grass.

3. A manuscript should be assigned its own Work ID if and only if it does not connect to a published piece. If a manuscript is a draft of another manuscript that includes lines that led to a published poem, the draft manuscript would be assigned the Work ID associated with the published poem.

4. If two poems become one poem in a later edition of Leaves of Grass, the Work ID should be the one associated with the final version. Conversely, if one poem becomes two poems in a later edition of Leaves, you should include Work IDs for both of the poems it became.

5. [As decided 2/1/17 in meeting with KP, BB, KM, NHG, SF] We DO currently specify as separate works larger structures of organization or things that do not have content separate from the sub-units that comprise them (this refers to arrangements, like clusters).

6. Work IDs are assigned to each edition of Leaves of Grass. These are useful for associating such things as Whitman's title page mock-ups or notes about the ornamentation to be used in a particular edition.

Creating Work IDs

New work IDs are created only when it is determined that a particular document cannot be associated with one of the works for which we have already created an ID. The creation of new work IDs is a sacred responsibility and should only be undertaken by Brett, Kevin, or one of their assigns. To determine whether a work has already been assigned an ID, we conduct a careful search of the tracking database, which contains a record for every work that is currently represented. All work IDs begin with the prefix "xxx." When the database contains no record for the work, a new record is created. The new work is titled according to the protocols outlined below and assigned the next available "xxx.#####" ID. All works are also represented by basic TEI files, which are kept in the following directory on the server: whitmanarchive/work_files/. When a new work is created and assigned an ID in the database, a TEI file representing that work is also created and added to the work_files directory.

Assigning Titles to Works

To assign a title to a work, we consult the last Whitman-authorized instance of that Work. For published works, we consider the following (and in the following order) when determining the exact form of work titles: title as printed on the title page; title as it appears in the table of contents; title shown in the heading of the individual item in the body of the text. For most works, the title will have no title-page representation. If the title is represented in both a table of contents and in a heading, we use the capitalization, punctuation, and spacing of the table of contents, except when it represents an abbreviated form of the longer main title that appears in the heading. In those cases, we add the extra title words from the heading in the body, capitalize them according to current-day titling conventions, and include any punctuation, except terminal periods. Titles written in a mixture of full and small capitals are interpreted and rendered as though the small capitals were lower-case letters. On the rare occasion when a title appears only in all-caps (as, for example, here) we apply current-day titling conventions in creating the work title. NB: Unless they appear in the table of contents, subtitles or other words of ambiguous status that appear below the main heading in the body are not included. For example, the poem shown in this table of contents entry in the "deathbed edition" of Leaves of Grass is headed thus in the body of the text. Therefore, the work has been given the following title: To the States, To Identify the 16th, 17th or 18th Presidentiad. And the poem titles shown here and here yield this work title: A March in the Ranks Hard-Prest, and the Road Unknown.

Occasionally, the title in a table of contents differs from the title in a heading in a more substantive way, so that the two constitute different versions rather than the first being an abbreviation of the other. In those cases, we follow the title as it appears in the table of contents. For example, the title given in the table of contents of Complete Prose Works as this but in the heading of the body as this yields the following work title: Elias Hicks, Notes (such as they are). Similarly, the essay titled in the table of contents in this way but in the body like this produces this work title: Preface to English Edition "Democratic Vistas."

In some cases, the title that results from following the above protocols will be insufficient to distinguish one work from another, as Whitman sometimes gave two or more things identical or nearly identical titles. In these cases, we add disambiguating words or phrases after the title. The disambiguating word or phrase is always enclosed in square brackets. If the ambiguity can be resolved by reference to the form of the work, we append the name of the form in square brackets. Examples: Whispers of Heavenly Death [poem]; Now Finalè to the Shore [cluster]. When two works of the same type have the same title, we append the first words from the body of the work in square brackets. The number of words appended is determined according to the same rules for assigning derived titles for manuscripts; viz., we use the first words up to the end of the line or line segment, up to the first punctuation mark, or up to the end of the fifth word, whichever comes first. If this protocol produces a title that is still ambiguous, further words are added until the ambiguity is resolved. NB: While punctuation is taken into consideration as a determining factor when creating the bracketed title extensions, it is never actually included in the brackets. Examples: Good-Bye my Fancy [Good-bye my fancy I]; Good-Bye my Fancy [Good-bye my Fancy Farewell]

Many works exist only in forms that were never given a title by Whitman. The majority of such cases are manuscripts with no known connection to anything Whitman published. A few, however, are works published without titles. For such works, we derive the titles according to the same protocols we use for deriving titles when transcribing individual manuscript instances (see "Title in the Title Statement").

Associating Instances with Works

This work relationship will be encoded within the intellectual unit of which the related text is a part. If the document contains <div>s or <lg>s, the work relationship information will be encoded immediately within the <div> or <lg> that contains the related text. If the document is not divided into smaller intellectual units (such as <div>s or <lg>s) then the work relationship information will be encoded at the top of the TEI file, immediately within the <TEI> tag. The following example, in which the work relationship information appears at the top of the file, is taken from the transcription of this manuscript:

<TEI xmlns="http://www.whitmanarchive.org/namespace" xml:id="rut.00025"> <relations> <work ref="xxx.00048" cert="high"> <p>This manuscript fragment features <ref target="#q01" type="text" xml:id="r01">several lines of prose</ref> that were included, with slightly-revised wording, as lines of poetry in the initial poem of the 1855 <hi rend="italic">Leaves of Grass</hi> (ultimately titled "Song of Myself"): "Sit awhile wayfarer, / Here are biscuits to eat and here is milk to drink, / But as soon as you sleep and renew yourself in sweet clothes I will certainly kiss you with my goodbye kiss and open the gate for your egress hence" (1855, p. 52). These lines would remain, with minor revisions, through all the various versions of "Song of Myself."</p> </work> </relations> . . .

To encode the work relationships, we must first look up the work ID. The best reference to the work ID is the Whitman Archive tracking database. The ID, which is a string of characters beginning "xxx" and ending with a five-digit number, corresponds to a work file which will contain prose descriptions of the compositional history of the work as well as connect the transcription files with other elements of the Whitman Archive. The work ID is inserted in the relations encoding as the value of the "ref" attribute in the "work" element.

In addition to this ID, project editors also assign either "high" or "low" as the value of "cert" to describe their confidence in connecting the individual manuscript to the work. For a manuscript with lines that are identical or very close to a published poem, the certainty will be "high"; notes that describe an idea in a way that bears a general resemblance to a published poem will get a "low" certainty.

The final part of the <relations> section is a brief prose description of the relationship of the instance to the work. This editorial note should be as specific as possible about the relationship. If the instance consists of two lines, for example, the note should quote those two lines as they appeared in the published text most appropriate to cite (such as the 1891 Leaves of Grass). If there is something distinctive and noteworthy about the manuscript, the editor may also insert a project note within the <noteStmt>. The work relationship will be described in more general terms, along with other editorial information, within the "date note" in the header (<note type="project" xml:id="dat1">).

A new <work> element, with a prose description, is used for every work related to the document. When the document is longer and includes more than one segment that could be considered an instance or partial instance of a work, the work relations can be added in sequence within the parent structural element. For instance:

<div2> <relations> <work ref="xxx.00048" cert="high"> <p>In relation to <ref type="text" xml:id="r02" target="#q02">this line</ref> see the following line from what would become section 33 of "Song of Myself," as it appeared in the 1855 <hi rend="italic">Leaves of Grass</hi>: "what the savage at the stump, his eye-sockets empty, his mouth spirting whoops and defiance" (p. 42). This line was dropped in the 1867 edition.</p> </work> <work ref="xxx.00264" cert="low"> <p>This section, including the deleted segment about "What Lucifer felt," is relevant to the origins of "The Sleepers," though it does not appear to have clearly and directly contributed to it. In the untitled poem in the 1855 <hi rend="italic">Leaves of Grass</hi> that would later become "The Sleepers," Lucifer appears in the following passage: "Now Lucifer was not dead....or if he was I am his sorrowful terrible heir; / I have been wronged....I am oppressed....I hate him that oppresses me, / I will either destroy him, or he shall release me" (p. 74). For more about the revisions of this passage, see Ed Folsom, "Walt Whitman's 'The Sleepers,'" part of <hi rend="italic">The Classroom Electric</hi>.</p> </work> <work ref="xxx.00614" cert="low"> <p><ref type="text" xml:id="r16" target="#q16">This line</ref> features Lucifer, as does a line that appeared in "Pictures": "And this black portrait—this head, huge, frowning, sorrowful,—I think it is Lucifer's portrait—the denied God's portrait."</p> </work> </relations>

The corresponding text should be marked with <seg> (as many as necessary to avoid nesting issues). For instance:

<div2> <relations> <work ref="xxx.00120" cert="high"> <p><ref type="text" xml:id="r04" target="#q04">This segment</ref> is similar to the following, from "Poem of The Sayers of The Words of The Earth" in the 1856 edition of <hi rend="italic">Leaves of Grass</hi> (the poem was later retitled "A Song of the Rolling Earth"): "Amelioration is one of the earth's words, / The earth neither lags nor hastens" (323–324).</p> ... <ab><seg xml:id="q04">Amelioration is the blood that runs through the body of the universe.—<del rend="overstrike">I grow</del> I do not lag—I do not hasten—</seg><del rend="overstrike"><add rend="insertion" place="supralinear">—it appears to say—</add>I bide my <subst><del rend="overstrike" seq="1">time</del> <del rend="overstrike" seq="3"><add rend="unmarked" place="supralinear" seq="2">day</add></del> <add rend="unmarked" place="supralinear" seq="4">hours</add></subst> over billions <add rend="insertion" place="supralinear">of billions</add>

If the corresponding text is already in a structural element that can be assigned an xml:id, use that element rather than the seg:

<l xml:id="q02">What the <subst><del rend="overstrike" seq="1">red</del> <del rend="overstrike" seq="3"><add rend="insertion" place="supralinear" seq="2">brown</add></del></subst> savage, lashed to<lb/> the stump, <del rend="overstrike">but</del> <add rend="insertion" place="supralinear"><unclear reason="illegible" resp="#no" cert="low">spirting</unclear> <del rend="overstrike">launching</del></add> <del rend="overstrike">yelling still</del><lb/> <del rend="overstrike">his</del> <add rend="unmarked" place="supralinear">yells and</add> laughter <del rend="overstrike">to</del> at every foe</l>

Basic Document Structure by Genre

Poetry

For a document featuring one poem composed of a single group of lines, do not use a <div>. Instead, use a structure like the following:

<text type="manuscript"> <body> <lg type="poem"> <head type="main"></head> <l>There is no word . . .</l> </lg> </body> </text>

For a single poem clearly divided into smaller chunks:

<text type="manuscript"> <body> <lg type="poem"> <head type="main">One's-Self I Sing.</head> <lg type="linegroup"> <l>ONE'S-SELF I sing, a simple separate person,</l> <l>Yet utter the word Democratic, the word En-Masse.</l> </lg> <lg type="linegroup"> <l>Of physiology from top to toe I sing,</l> <l>Not physiognomy alone nor brain alone is worthy for the Muse, I<lb/> say the Form complete is worthier far,</l> <l>The Female equally with the Male I sing.</l> </lg> <lg type="linegroup"> <l>Of Life immense in passion, pulse, and power,</l> <l>Cheerful, for freest action form'd under the laws divine,</l> <l>The Modern Man I sing.</l> </lg> </lg> </body> </text>

Note: Use type="linegroup" in <lg> tags to note multiple lines clearly grouped together (e.g. a stanza) and followed by space left intentionally blank.

For a manuscript containing two or more poems:

<text type="manuscript"> <body> <div1 type="multiple poems"> <lg type="poem">[poem here; follow structure outlined above]</lg> <lg type="poem">[poem here]</lg> </div1> </body> </text>

Prose

Prose should be divided into paragraphs using the <p> or <ab> tag. No division tag is required in a prose-only document unless the prose is divided into separate intellectual units. For example, a manuscript requires <div1 type="section"> if it begins with one or more texts constituting an intellectual unit (e.g. an essay or a group of letters), then has a clear break (e.g., a sub-heading, a horizontal line, or white space), and is then followed by another group of texts that is distinct in form or content (e.g. one essay following another). In such a case, the discrete groups of paragraphs should be marked with <div1>s, or, if they are already nested within a larger <div1> structure, with <div2>s and so on. Except on title pages or in places where the line break affects the intellectual content of the text, line breaks <lb/> are not encoded. Also note that <lg>s are only used to mark up poetry, never prose. Example:

<text type="manuscript"> <body> <div1 type="section"> <ab>for lect. on Literature: or (Democracy)</ab> <ab>What are these, called our literary men poets &#38; scintillation at best the literary men &#38; literary needs of other lands—exiles here &#38;</ab> </div1> <div1 type="section"> . . . </div1> </body> </text>
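The nesting case mentioned above, where discrete groups of paragraphs already sit inside a larger <div1>, might look something like the following sketch. The bracketed text is placeholder content, and reusing "section" as the type value on <div2> is an assumption made for illustration rather than a documented rule:

```xml
<!-- illustrative sketch only; bracketed text is placeholder content -->
<div1 type="section">
  <div2 type="section">
    <p>[paragraph of the first essay]</p>
    <p>[paragraph of the first essay]</p>
  </div2>
  <div2 type="section">
    <p>[paragraph of the second essay]</p>
  </div2>
</div1>
```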

Mixed Genre

Many manuscripts contain single intellectual units which are a mixture of poetry and prose. (For an example, see the manuscript "Ashes of Roses," here.) "Mixed genre," for our purposes, does NOT just mean a manuscript leaf with poetry and prose on it (for example, a poetic draft on the recto and prose on the verso). Rather, "mixed genre" signifies writing that is thematically unified, apparently part of a single draft, but made up of a mix of prose and verse, as when Whitman composes an early draft that combines trial poetic lines with prose notes or lists. For a mixed-genre manuscript, use a <div1> with "poem notes" as the value of the "type" attribute, like this:

<text type="manuscript"> <body> <div1 type="poem_notes"> . . . </div1> <div1 type="poetry"> . . . </div1> </body> </text>

Basic Elements for Marking Structure

The following elements are used to describe the structure of Whitman's poetic works:

<div1> Division

Used, with the type attribute, to mark structural units larger than the cluster or poem. Values for the type attribute include "book," "section," "contents," "poem notes," "title notes," and "multiple poems." The largest unit is marked as <div1>, and descending levels of <div> can be nested inside. Click here to read an explanation of the different type attributes that are used with <div>s when marking up Whitman documents.

<lg> Line Group

Functions in the same way as <div>, but is used exclusively to mark clusters, poems, and structural sub-units within them (i.e., groups of lines, whether "sections" or "linegroups," that constitute distinct units within a poem). If the poem has no distinguishable sub-units within it, no further <lg>s are needed; if the poem has one or more sub-units, you need to mark each of those units with the appropriate <lg>. As with <div>s above, descending levels of <lg> are nested inside <lg>. For example, for a manuscript of a poem broken into three linegroups, the poem itself would be tagged <lg type="poem"> and each linegroup would be tagged <lg type="linegroup">. The type attribute is required; values include "cluster," "poem," "section," and "linegroup." For an example of how you would encode a poem divided into three stanzas or linegroups, click here.
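As a minimal sketch of the nesting described above, a cluster containing one poem that is itself divided into two linegroups could be structured as follows. All bracketed text is placeholder content, not a transcription of any particular document:

```xml
<!-- illustrative sketch only; bracketed text is placeholder content -->
<lg type="cluster">
  <head type="main">[cluster title]</head>
  <lg type="poem">
    <head type="main">[poem title]</head>
    <lg type="linegroup">
      <l>[first line of the first linegroup]</l>
      <l>[second line of the first linegroup]</l>
    </lg>
    <lg type="linegroup">
      <l>[first line of the second linegroup]</l>
    </lg>
  </lg>
</lg>
```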

<head> Head

Marks the title. Each <lg> and division tag can have its own <head> (and thus its own title).

The "type" attribute on this element is required to differentiate main titles from subtitles. Use one of these two values:

main (written by Whitman on the page)
sub (subtitle written by Whitman on the page; "sub" is only used for secondary authorial titles and must be preceded by <head type="main">)
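A poem with both an authorial title and an authorial subtitle might be tagged along these lines. This is a sketch with placeholder text; note that the "sub" head follows the "main" head, as required above:

```xml
<!-- illustrative sketch only; bracketed text is placeholder content -->
<lg type="poem">
  <head type="main">[main title written by Whitman]</head>
  <head type="sub">[subtitle written by Whitman]</head>
  <l>[first line of the poem]</l>
</lg>
```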

<l> Line

Used to mark a poetic line. Use <lb/> to mark a line break.

A sample structure might look like this:

<lg type="poem"> <l>I celebrate myself,</l> <l>And what I assume you shall assume,</l> <l>For every atom belonging to me, as good belongs<lb/> to you.</l> </lg>

Indented lines

Use the "rend" attribute on the <l> tag to indicate the degree of indentation.

Possible values correspond to width of indentation, with "indented1" being the smallest indentation and "indented4" being the largest: "indented1," "indented2," "indented3," "indented4"

Example:

<l rend="indented2">Still may I hear his word.</l>

<p> Paragraph

Used to mark a paragraph within a prose text.

A sample structure might look like this:

<div1 type="section"> <p> . . .</p> <p> . . .</p> </div1>

<ab> Anonymous Block

Used to mark a prose chunk that is not identifiable as a paragraph.

Other Common Elements

<q> Quotation

For poetry quotations:

<q who="Walt Whitman" type="extract"> <floatingText> <body> <lg type="poem"> <l> . . .</l> </lg> </body> </floatingText> </q>

For prose quotations:

<q who="Walt Whitman" type="extract"> <floatingText> <body> <p> . . .</p> </body> </floatingText> </q>

<hi> Highlighted (italics, smallcaps, underlining, etc.)

<hi> (highlighted) marks a word or phrase as graphically distinct from the surrounding text. Typically, we use it to indicate that individual words, phrases, or sentences within larger structures such as lines and paragraphs are highlighted in the original through italicization, the use of small caps, underlining, etc. The <hi> element uses the "rend" attribute to specify the nature of the highlighting:

Value of 'rend' attribute	Function
underline	Used to indicate underscored text
italic	Used only in transcriptions of printed material or in project notes to mark titles of books
smallcaps	Used to indicate smaller capital letters
circled	Used to mark a segment that is circled
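For illustration, a paragraph containing an underscored phrase and a circled phrase might be encoded roughly as follows (the bracketed text is placeholder content):

```xml
<!-- illustrative sketch only; bracketed text is placeholder content -->
<p>[words] <hi rend="underline">[underscored words]</hi> [words] <hi rend="circled">[circled words]</hi> [words]</p>
```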

<orig>, <reg>, <sic>, and <corr> Regularized Spelling and Corrections

<orig> (original form) contains a reading which is marked as following the original, rather than being normalized or corrected.

<reg> (regularization) contains a reading which has been normalized in some sense.

We often use these tags when encoding Whitman's poetry, for instance in cases where a word at the end of a poetic line is hyphenated. Because we wish both to record the lineation of the copy text and to enable searches for words that are broken by end-line hyphenation, we use the <orig> and <reg> tags to record the original and regularized readings. This will allow the original version to be displayed online while users can still search for the regularized form and be directed to the passage in question.

<sic> (Latin for "thus" or "so") contains text reproduced although apparently incorrect or inaccurate and is used to represent a mistake by the author.

<corr> allows the encoder to provide a correction.

These corrections will enable searches to use standardized spelling and not require the searcher to know, for example, that Whitman misspelled "Buildings" as "Buldings" in this manuscript.

All of these elements are nested within the <choice> element, which groups a number of alternative encodings for the same point in a text.

Example for <orig> and <reg>:

<choice> <orig>indis‑<lb/>pensable</orig> <reg>indispensable</reg> </choice>

Example for <sic> and <corr>:

<choice> <sic>the incorrect way it's written</sic> <corr>the correct way to write it</corr> </choice>
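Applied to the misspelling of "Buildings" mentioned above, the markup would presumably look like this sketch (the surrounding text of the manuscript is omitted):

```xml
<!-- sketch using the misspelling cited in the text above -->
<choice>
  <sic>Buldings</sic>
  <corr>Buildings</corr>
</choice>
```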

Note: Sometimes what you might think of as a spelling error would more accurately be termed an alternate spelling. For words that are spelled in idiosyncratic—though not exactly incorrect—ways, use the <orig> and <reg> tags as described above. As an example, look at Whitman's spelling of "Shakespeare" in this image. Since this spelling of Shakespeare's name is one he himself used (and he never, as far as we know, used "Shakespeare"), it should be encoded as follows:

<choice> <orig>Shakspere</orig> <reg>Shakespeare</reg> </choice>

<table> Table

This element is generally, though not exclusively, used to encode a table of contents. Here is an example of how to encode a table of contents:

<div1 type="contents"> <head type="main">Table of Contents</head> <ab> <table> <row> <cell>Introduction</cell> <cell>3</cell> </row> <row> <cell>Poem 1</cell> <cell>22</cell> </row> <row> <cell>Poem 2</cell> <cell>34</cell> </row> </table> </ab> </div1>

Note: If a table spans multiple pages, you will need to close the table before inserting the <pb/> element and then open a new table for formatting purposes. Example:

... </row> </table> </ab> <pb facs="med.00400.577.jpg" xml:id="leaf289r" n="575" type="recto"/> <ab> <table> <row> ...

<list> List

<list> <item>[text goes here]</item> </list>

Note: If a list spans multiple pages, you will need to close the list before inserting the <pb/> element and then open a new list for formatting purposes. Example:

... <item>[text goes here]</item> </list> </ab> <pb facs="med.00400.577.jpg" xml:id="leaf289r" n="575" type="recto"/> <ab> <list> <item> ...

<gap> and <unclear> Text that is illegible, missing, or difficult to read

<gap>: This element is used when text is absolutely unreadable, as when it has been torn or cut away, is obscured by deletion, or is simply illegibly written. Each <gap> needs a reason attribute, and you have the choice of two values: "cut away" or "illegible." Note: <gap> is an empty element (i.e., it does not require a close tag).

<gap reason="cut away"/>: When a page has been torn or cut, leaving only stubs of the letters you want to transcribe, use this tag at the point in the transcription where the words would appear.
<gap reason="illegible"/>: Use this markup whenever the letters or words are present but unreadable.
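In context, the two values might appear in a poetic line as in the following sketch. The bracketed text is placeholder content; the second <gap> stands where words were lost to a cut at the end of the line:

```xml
<!-- illustrative sketch only; bracketed text is placeholder content -->
<l>[legible words] <gap reason="illegible"/> [legible words] <gap reason="cut away"/></l>
```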

<unclear>: When you believe you have an accurate reading of a difficult-to-read passage, but you are not completely confident, mark the questionable reading with the <unclear> element. Use the reason attribute to state the cause of the uncertainty in transcription, selecting from the values described above under <gap>. Use the cert (certainty) attribute to indicate the degree of confidence in the transcription. Its value will be one of the following:

low
medium
high
absolute

Also include a resp (responsibility) attribute to indicate your responsibility for the postulated reading, and as its value use your initials.

For example, if Andy Jewell is encoding a manuscript with an unclear deleted word that he thinks might be "herbage," he inserts this markup:

<unclear reason="illegible" cert="high" resp="#awj">herbage</unclear>

<supplied>

<supplied>: This element is used when printed text is unreadable, but can be obtained from another copy. In this case, you will need to point from the tag to a source in the header, as follows:

<supplied reason="illegible" source="#unl_copy">happened</supplied>

In the TEI header, you would include the bibliographical information about the specific copy as a <biblStruct>:

<biblStruct type="supplied" xml:id="unl_copy"> <monogr> <author>Walt Whitman</author> <editor>William White</editor> <title>The People and John Quincy Adams</title> <imprint> <publisher>Oriole Press</publisher> <pubPlace>Berkeley Heights, NJ</pubPlace> <date>1962</date> </imprint> </monogr> </biblStruct>

Page Breaks and Image Linking

We use the <pb> tag to indicate page breaks. This tag is inserted at the beginning of a new page, and, if available, a link to an image of the page is provided. You use <pb> tags in every document, even if it is only one page long. <pb> is an empty tag, which means that you never need to "close" <pb>, but just insert a "/" at the end of the tag. The first <pb> tag goes after the <body> tag and before the first <div> or <lg>. If there are multiple pages, i.e., more than one corresponding image, simply insert a <pb> at each place in the encoding that corresponds to the beginning of a new page. Often, these will occur at the close of one linegroup (</lg>) and before the opening of another (<lg>). Or, commonly, you will need to include a <pb> to indicate untranscribed verso material; this should be done after the <lg> or <div> closes but before the <body> tag closes.

If a page is blank it still needs to be encoded using the <pb> tag.
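The placement rules above can be sketched for a single leaf with a poem on the recto and a blank or untranscribed verso. The image file names here are hypothetical placeholders, and the bracketed text stands in for the transcription:

```xml
<!-- illustrative sketch only; the facs file names are hypothetical -->
<text type="manuscript">
  <body>
    <pb facs="xxx.00000.001.jpg" xml:id="leaf001r" type="recto"/>
    <lg type="poem">
      <l>[poem here]</l>
    </lg>
    <pb facs="xxx.00000.002.jpg" xml:id="leaf001v" type="verso"/>
  </body>
</text>
```

The first <pb> sits after <body> and before the <lg>; the second follows the closing </lg> but precedes </body>. Both omit "n" because the pages in this sketch are unnumbered.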

Each <pb> tag has three required attributes, "facs," "xml:id," and "type". Numbered pages also have the attribute "n."

Example:

<pb facs="med.00400.096.jpg" xml:id="leaf048r" n="94" type="recto"/>

The attribute "n" provides the page number as displayed on the individual page. If the page that you are encoding is unnumbered, omit the attribute (e.g. <pb facs="med.00400.096.jpg" xml:id="leaf048r" type="recto"/>).

The value of the "facs" attribute consists of the relevant image file associated with the page. In the example, the file named "med.00400.096.jpg" provides an image of page 94.

Note on "xml:id" and "type": These two attributes record the leaf on which a given page appears and whether it is a recto or a verso. The front side (or recto) of the first leaf of a document will always be "leaf001r" and the back (or verso) will be "leaf001v". In almost all cases with documents consisting of several leaves, "recto" and "verso" alternate, and each leaf has one of each. For example, the page breaks following the one in the example would feature the following values for "xml:id" and "type":

<pb facs="med.00400.097.jpg" xml:id="leaf048v" n="95" type="verso"/> <pb facs="med.00400.098.jpg" xml:id="leaf049r" n="96" type="recto"/> <pb facs="med.00400.099.jpg" xml:id="leaf049v" n="97" type="verso"/> <pb facs="med.00400.100.jpg" xml:id="leaf050r" n="98" type="recto"/> . . .

The "xml:id" value must always end in either "r" or "v"—even if there is only one image. When there is only one image, the "xml:id" value will almost always be "leaf001r."

Spacing

Recent work with stylesheets has taught us that paying attention to and regularizing the encoding of white space is important as we prepare documents for display on the site. The most important guideline is simply to be conscious of spacing as you transcribe and encode, but here are a few more specific rules to follow:

Be sure to put a space between words. Remember that, after processing, the markup will be invisible, so your transcription needs to include the spaces that separate words even when the words are separated in the XML document by one or more tags.
Avoid spaces before closing add or delete (del) tags. Since Whitman's revisions typically did not involve the addition or deletion of white space after the last word of a phrase, make sure you insert the space outside the closing add or delete tags. A properly spaced transcription looks like this:

<add rend="unmarked" place="supralinear">Song</add> of Myself.

Within <subst> structures, insert spaces between tags, unless the <subst> is an overwrite. All characters must be contained within the <subst>, so spaces outside of <subst> will be ignored. A properly spaced <subst> structure should look like this:

Song of <subst> <del rend="overstrike" seq="1">You</del> <add rend="unmarked" place="supralinear" seq="2">Myself</add> </subst>

If the <subst> is an overwrite, do not include spaces between tags. A properly spaced overwrite should look like this:

<subst><del rend="overwrite" seq="1">m</del><add rend="overwrite" place="over" seq="2">M</add></subst>yself

Spaces before closing <l> or <p> tags are unnecessary and should be eliminated.

Spaces before and after the em dash (—) should be eliminated.

Do not insert unnecessary spaces. Often, encoders have inserted spaces that are not part of the transcription (for example, to make the tagging more human-readable). You can use as many returns as you wish to make the markup easier to read, but please do not use the space bar.
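Several of these rules can be seen together in the following sketch. The line is invented for illustration and is not taken from a Whitman manuscript:

```xml
<!-- invented line for illustration; not from a Whitman manuscript -->
<l>The <del rend="overstrike">old</del> <add rend="unmarked" place="supralinear">new</add> day&#8212;and the night</l>
```

Here the spaces that separate words fall outside the closing </del> and </add> tags, no spaces surround the em dash, and no space precedes the closing </l>.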

(To learn how to encode Whitman's use of intentional space in manuscripts, go here)

Unusual Characters & Marks

XML can represent any Unicode character, but only the ASCII character set, which roughly corresponds with the set of characters on a standard keyboard, can be keyed in reliably. Not all of the characters you might encounter in a Whitman manuscript are part of the ASCII character set, so to represent one of these characters you will need to use the appropriate Unicode number: a string of numerals that begins with an ampersand and pound sign (&#) and ends with a semicolon (;).

The table below lists the Unicode numbers we are using on the project. It is important to use the numbers for the listed characters, even when it might be possible to key them in (as with the ampersand, for example) or to use a close approximation (e.g., two hyphens to represent an em-dash). For characters not listed, Unicode numbers are NOT necessary.

For the characters in the left-hand column to display correctly, you must have a Unicode font installed on your computer.

**Common Characters**
Character	Function in Whitman	Unicode Number
=	Proofreader's mark for hyphen. WW sometimes uses "=" for compound words ("down=balls") and words split between two lines ("some=thing").	&#8209;
—	Em (long) dash, e.g., "Not these—O none of these more"	&#8212;
–	En (short) dash, e.g., "The Charles E. Feinberg Collection of the Papers of Walt Whitman, 1839–1919"	&#8211;
*	An asterisk	&#42;
&	Indicates "and"	&#38;
©	Copyright symbol	&#169;
✓	Checkmark	&#10003;
œ	Latin small ligature oe	&#339;
½	Used often in Bowers's system of page numbering	&#189;
¾	Used to indicate the fraction, occasionally on manuscripts	&#190;
¶	Indicates a new paragraph or a new line of poetry	&#182;
☞	A right-pointing finger	&#9758;
☜	A left-pointing finger	&#9756;
☝	An up-pointing finger	&#9757;
☟	A down-pointing finger	&#9759;
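In a transcription, these numbers are typed in place of the characters themselves. The following sketch reuses phrases quoted in the table; 8209, 8212, and 38 are the decimal Unicode code points for the non-breaking hyphen, the em dash, and the ampersand. Grouping the phrases as lines of one linegroup is done only to keep the example compact:

```xml
<!-- illustrative sketch; the three lines are unrelated sample phrases -->
<lg type="linegroup">
  <l>down&#8209;balls</l>
  <l>Not these&#8212;O none of these more</l>
  <l>[word] &#38; [word]</l>
</lg>
```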

Special Cases

Printed Texts

Texts in Periodicals

Contemporary Reviews

Current Criticism

Encoding Corrected Proofs

Manuscript Texts

Titles and Naming

Each manuscript transcription will include up to three different kinds of titles. These titles may be identical or they may be different. These titles are:

1. Title in the <titleStmt>
2. Title in the <sourceDesc>
3. Head in the <body>

Title in the Title Statement

This title, which occurs inside the TEI header, names the electronic file you are creating and should therefore be distinguished from the title of the source material. Do this by adding the phrase "a machine readable transcription" as a subtitle, as in the following example.

<titleStmt> <title level="m" type="main">Death dogs my steps</title> <title level="m" type="sub">a machine readable transcription</title> </titleStmt>

We have developed a simple set of rules for giving titles to Whitman's manuscripts. Note that this naming is IN ADDITION TO the assignment of a unique identifier. If Whitman has titled the manuscript, that title should be used in preference to anything else, including other words written above the title on the manuscript. Do not include final periods. If any ambiguity exists as to whether Whitman intended a word or words to be regarded as a title, we always proceed under the assumption that the word(s) do not constitute an authorial title. In such cases, and in all others in which no title has been written by Whitman, use the procedures outlined below to derive a main title from the first words of the manuscript and include the attribute rend with the value "bracketed," to signify that the title is one we have assigned based on the first line. An example:

<titleStmt> <title level="m" type="main" rend="bracketed">And to me each minute</title> <title level="m" type="sub">a machine readable transcription</title> </titleStmt>

To derive the title from the first words, use the first words of the manuscript that are not struck through, and go up to (but do not include) the sixth word OR the first punctuation mark OR the end of the line OR the first line break, whichever comes first. (For this purpose, hyphens, apostrophes, and quotation marks are not considered punctuation marks; parentheses, brackets, etc. are.) Exception: If following the above procedure would result in a one-word or ambiguous title, include additional words, up to but not including the sixth word OR next punctuation mark OR the end of the line OR line break, whichever comes first.

Whether the title is authorial or derived, include added words or characters, but disregard numbers (roman or arabic) AND punctuation that precede the first word (e.g., "?"). Also disregard deleted, illegible, or unclear characters, no matter where they occur. As an exception, however, include, in square brackets, characters that can be supplied with great certainty although they are no longer present because of damage to the manuscript.

Examples:

The title statement for this manuscript poem should read

<titleStmt> <title level="m" type="main">Ah, not this granite dead and cold</title> <title level="m" type="sub">a machine readable transcription</title> </titleStmt>

The main title in the title statement for this untitled poem reads

<title level="m" type="main" rend="bracketed">Behavior—fresh</title>

For this manuscript it reads

<title level="m" type="main" rend="bracketed">wainscot, g</title>

For this manuscript

<title level="m" type="main" rend="bracketed">Bloom.—Broad-shouldered</title>

For this manuscript

<title level="m" type="main" rend="bracketed">Outdoors is the best antise[ptic]</title>

For this manuscript

<title level="m" type="main" rend="bracketed">Silence.—Years ago</title>

Don't worry if two poems have the same title. Our unique identifier for the document will enable us to locate the correct document for processing through stylesheets.

Title in the <sourceDesc>:

This title is the one given to the artifact by the holding institution. The <sourceDesc> is essentially a bibliography of information that should be sufficient for a user to locate the item that is the source of the transcription. If the title is bracketed in the online repository guide, you should bracket it in the <sourceDesc>. In some cases, the "title" in the <sourceDesc> may bear little relation to the poem—for example, it might be the title of the folder which holds the item rather than the title of the item itself (this is typically the case only for the Feinberg collection at the Library of Congress).

Head in the <body>:

If Whitman has titled the manuscript, use a <head type="main"> to tag the title. In the <head>, within the <body> of the manuscript file, encode all the additions, deletions, and substitutions as such. For the same example, the <head> would be encoded like this:

<head type="main"><hi rend="underline"> <subst> <del rend="overstrike" seq="1">Beyond this</del> <add rend="unmarked" place="supralinear" seq="2">Ah, not this</add> </subst> granite dead and cold.</hi> </head>

Deletions and Additions

<del> Deletions

Use <del> to mark a letter, word or passage that has been deleted by any method. Example:

<del rend="overstrike">editor</del>

The only required attribute for the <del> element is "rend." Possible values are:

overstrike: A line or lines are drawn through rejected letters, words, or passages. This is by far the most common method of deletion in Whitman manuscripts.
erasure: Whitman has erased part of the text.
hashmark: A vertical or diagonal line or lines marks through a large chunk of text (often the whole manuscript page).
pasteover: Whitman has deleted text by pasting another piece of paper on top of it.
overwrite: Letters or words are marked for deletion by being written over with other letters or words.

Note: Each of these types of deletion can occur in combination with additions; "overwrite" does by definition, and "pasteover" almost always does. For information about marking combinations of additions and deletions, see #Additions and Deletions in Combination.

Use common sense when marking deletions; if an entire line has been crossed out, for example, but the horizontal line does not physically intersect with a comma that follows the passage, you should still assume that the comma is intended to be included in the deletion. In cases of doubt, please consult Ed Folsom or Kenneth Price for their reading of the passage.

Whitman's overstrikes are usually emphatic and easily recognizable, but occasionally one sees a mark which may be either an overstrike or a stray pen mark. In these cases, first check with Kenneth Price or Ed Folsom, and then use the optional cert attribute to indicate your degree of certainty that the passage has been deleted. For example, on the Duke manuscript "I see who you are," a few lines from the bottom, the word "editor" appears to be struck through. This might be tagged as follows:

<del rend="overstrike" cert="high">editor</del>

Occasionally you will encounter a long passage that has been deleted. For these passages, use <del> unless doing so would create a nesting problem. For example, consider again the manuscript in the example above. Here, the long vertical strike is marked simply by enclosing the entire poem with a del element thus:

. . . <body> <pb/> <del rend="hashmark"> <lg type="poem"> . . . </lg> </del> </body> . . .

<add> Additions

Use <add> to mark any part of the text whose placement, ink, etc. clearly indicate that it was added to the manuscript after the surrounding text was written.

Two attributes are required on the <add> element: "rend" and "place."

For "rend," the possible values are:

insertion: marked by a caret (which often looks like an "x").
unmarked: added without caret or other mark.
overwrite: written over earlier text.
pasteon: written on a piece of paper that is glued to paper with earlier text.

For "place," the possible values are:

supralinear: above the line.
inline: in space available on the same line as earlier text.
infralinear: below the line.
over: over the earlier letter, word, or phrase.
top: in top margin.
bottom: in bottom margin.
left: in left margin.
right: in right margin.
interlinear: between lines.
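
As a minimal sketch of how the two required attributes combine, the lines below show one addition marked with a caret above the line and one unmarked addition in the left margin. The bracketed words are placeholders, not readings from a particular manuscript.

```xml
<!-- Placeholders in square brackets stand in for manuscript readings -->
<l>[earlier text] <add rend="insertion" place="supralinear">[word added with a caret]</add> [earlier text]</l>
<l><add rend="unmarked" place="left">[word added in the margin]</add> [earlier text]</l>
```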

<addSpan> and <delSpan>

The rules for marking long additions are similar to those for marking long deletions. Use <add> or <del> unless doing so would create a nesting problem; in that case, use the empty element <addSpan> or <delSpan> to mark the beginning of an added or deleted passage that doesn't nest within other elements, and an empty <anchor> element to mark its end.

<addSpan> has three required attributes: spanTo, rend, and place. Available values for rend and place are the same as those listed above for <add>. The value of the spanTo attribute must be the same as the value of the required xml:id attribute on the corresponding <anchor> element, preceded by a hash mark (#). To create these values, use "a" (for "addition") plus the next available number ("a1" for the first <addSpan>, "a2" for the second, and so on).

<delSpan> has two required attributes: spanTo and rend. Available values for rend are the same as those listed above for <del>. The value of the spanTo attribute must be the same as the value of the required xml:id attribute on the corresponding <anchor> element, preceded by a hash mark (#). To create these values, use "d" (for "deletion") plus the next available number ("d1" for the first <delSpan>, "d2" for the second, and so on).

For example:

<delSpan rend="hashmark" spanTo="#d01"/> <ab><subst><del seq="1" rend="overstrike">and</del><add seq="2" rend="unmarked" place="supralinear">I</add></subst> said to my soul When we become the <del rend="overstrike">god</del> enfold<subst><del seq="1" rend="overwrite">ing</del><add seq="2" rend="overwrite" place="over">ers</add></subst><add rend="unmarked" place="supralinear">of</add> all these <add rend="unmarked" place="supralinear">orbs,</add> and open to the life and delight and knowledge of every thing in them, or of them, shall we be filled and satisfied? and the answer was</ab> <ab>No, when we fetch that height, we shall not be filled and satisfied but shall look as high beyond.</ab> <anchor xml:id="d01"/>
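
The example above shows only <delSpan>. A parallel sketch for <addSpan> might look like the following, assuming an addition that begins partway through one line and ends partway through the next, so that a single <add> could not nest properly. The id "a1" and the bracketed text are placeholders.

```xml
<!-- Illustrative only: the addition straddles a line boundary -->
<l>[earlier text] <addSpan rend="insertion" place="supralinear" spanTo="#a1"/>[added words that end this line]</l>
<l>[added words that begin the next line]<anchor xml:id="a1"/> [earlier text]</l>
```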

Alternation

When Whitman writes multiple words to choose between without deleting either, use an <alt/> element. In order to do so, you will need to assign structural elements to both of the word options, and then point to them from the <alt/> element. In the <alt/> element, you will also designate a weights attribute, in which you will include information about the probability that an alternative would occur. For example, in a case where it is equally likely that either word would be used:

<seg xml:id="w1">arterializes?</seg>
    <add place="supralinear" rend="unmarked" hand="#h1" xml:id="w2">vitalizes</add>
    <alt target="#w1 #w2" weights="0.5 0.5"/>

Transpositions Noted by Arrows or Asterisks

Some manuscripts have brackets, arrows, and/or a series of asterisks to indicate Whitman's desire to move a line or lines to a different place in the poem. To encode this phenomenon, we use the <transpose> element. In this example, Whitman has bracketed one line and indicated with an asterisk in the margin that the line should be moved down. The encoding for this section follows; the tagging most pertinent to the transposition is the <metamark> and <transpose> elements.

<metamark function="transpose" place="left" target="#tr01">* down</metamark> <transpose rend="bracketed" anchored="1" xml:id="tr01" target="#tr02"> <l>Of the native scorn of grossness<lb/> and gain there, (O it lurks<lb/> in me night and day—What<lb/> is gain, after all, to savage-<lb/> ness and freedom?)</l> </transpose> <l>Of immense spiritual <subst><del rend="overstrike" seq="1">things</del> <add rend="unmarked" place="supralinear" seq="2"> results</add></subst>, future years,<lb/> inland, spread there each side of<lb/> the Anahuacs,</l> <l>Of these Leaves established there, and<lb/> well understood there.—</l> <milestone unit="undeclared" rend="horbar-full"/> <metamark xml:id="tr02" function="transpose" place="left" target="#tr01">take down *</metamark>

Explanation:

We use <metamark> to transcribe any marginal characters that indicate the transposition.
The metamark target points to the segment to be moved, which is given an xml:id.
The anchored attribute is used to note whether or not the manuscript clearly indicates where the line(s) are to be moved. (We know of at least one example where this is not known.) This attribute has one of two possible values: 1 (meaning "yes") or 0 (meaning "no").
If the value of the anchored attribute is 1, a target attribute is required; it points to an <anchor> that is placed at the target, that is, the point in the manuscript to which the part is to be moved. If the value of anchored is 0, the target attribute is not required. If there is a metamark at the point to which the transposed text is to be moved, the metamark can be assigned an xml:id, as it has been in this example, and the target attribute on <transpose> can point to the metamark.
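
For the less common case in which the manuscript does not show where the text is to be moved, a sketch based on the rules above might look like this. The id "tr03" and the bracketed text are placeholders, and the markup assumes that the marginal mark is still transcribed with <metamark>.

```xml
<!-- Illustrative only: anchored="0", so <transpose> carries no target attribute -->
<metamark function="transpose" place="left" target="#tr03">*</metamark>
<transpose rend="bracketed" anchored="0" xml:id="tr03">
  <l>[line marked to be moved, destination not indicated]</l>
</transpose>
```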

Note: When the type of metamark is not defined, <metamark> is left as a self-closing tag, and the stylesheet inserts the text "[transposition mark]" (see loc.00082 for an example).

Additions and Deletions in Combination

<subst> Substitutions

Very frequently, Whitman's additions are not merely appended to earlier text but are substituted for earlier text. It is our policy to link the deleted and added portions by marking each as a reading sequence within a <subst> element. Each <subst> will contain at least two sequences and may contain up to five. The <subst> element requires no attributes; the alternative reading sequences are marked by the "seq" attribute within either the <del> or <add> elements, the value of which is a single-digit number that indicates the relative order in which the present reading is presumed to have been written. Example:

<l>Old Asia's <subst> <del rend="overstrike" seq="1">self</del> <add place="supralinear" rend="unmarked" seq="2">there</add> </subst> with venerable. . .</l>

For obvious reasons, usually the first reading will contain a deletion and the second reading will contain an addition. This is not always true, however. You may come across instances where multiple variants are left undeleted. In this case, the first reading will contain just the transcribed word(s).

Overwriting

You will sometimes encounter a substitution in which a letter or word has been overwritten with another letter or word. In such cases, the value of the place attribute on the <add> element is "over"; the value of the "rend" attribute for both the <add> and <del> elements is "overwrite." For example, consider this manuscript excerpt, in which Whitman has changed the "e" from lower- to upper-case. The markup for this word is:

<subst> <del rend="overwrite" seq="1">e</del><add rend="overwrite" place="over" seq="2">E</add></subst>ach

Pasting

Fairly frequently, Whitman made substitutions by pasting one page or scrap over another. Treat such cases as you would other substitutions, by using the <subst>, <del>, and <add> elements. For the "rend" attribute on <del>, use the value "pasteover." The "rend" attribute of the <add> should be given the value "pasteon," and the "place" attribute should be given the value "over." Example:

. . . <subst> <del rend="pasteover" seq="1"><l>And that night O you happy . . .</l></del> <add rend="pasteon" place="over" seq="2"><l>And that night, while all . . .</l></add> </subst> . . .

NOTE: in cases where pieces of paper have simply been glued together, with no material pasted over, use <milestone unit="glued"/>

Nesting <add> and <del> Tags

Of course, not all combinations of <add> and <del> tags are substitutions. For instance, you may come across an example where an addition was deleted. In such a case, the <add> element should be nested within the <del> element:

from <del rend="overstrike"><add rend="unmarked" place="supralinear">this</add></del> base

There are a number of other ways in which Whitman combined additions and deletions — probably too many to cover each one separately here. You should be able to handle almost all situations you encounter by applying these principles and rules:

Nesting operates on a radial principle, working from the center out. For additions and deletions, this means that when the boundaries of a deletion and an addition are the same, the <add> should be nested within the <del>. This indicates 1) that the material was added; and 2) that the addition itself was deleted. If, however, only part of an addition has been deleted, the <del> will, of course, be nested inside the <add>.
<add>s and <del>s, in various combinations, can be nested within one another, with no theoretical limit to the "depth" of that nesting. So it's entirely possible to have, for example, a <del> within an <add> within an <add> within a <del>.
<subst>s, however, should never be nested inside other <subst>s, even though you will occasionally encounter situations which seem to call for such markup. Because <subst>s within <subst>s would create difficulties for computer processing, our project policy is to mark only the "highest level" substitution as such and to mark interior substitutions only with <add> and <del>, as appropriate.
All <add>s and <del>s must nest properly within any other elements that are present. In particular, you should be careful not to straddle line boundaries.
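
Two of these principles can be sketched as follows. The bracketed text is placeholder content rather than a transcription, and the second fragment is only one plausible reading of the "highest level" rule.

```xml
<!-- Only part of an addition was later deleted: <del> nests inside <add> -->
<add rend="unmarked" place="supralinear">[kept words] <del rend="overstrike">[rejected word]</del></add>

<!-- A substitution within a substitution: only the outer one is marked with <subst>;
     the interior change is marked with plain <del> and <add> -->
<subst>
  <del rend="overstrike" seq="1">[first reading]</del>
  <add rend="unmarked" place="supralinear" seq="2">[second reading begins] <del rend="overstrike">[rejected word]</del> <add rend="unmarked" place="supralinear">[replacement word]</add> [second reading ends]</add>
</subst>
```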

Intentional Inline Spaces

Whitman will occasionally leave a blank space within a line of poetry or a paragraph of prose, apparently making room for the perfect word that he has yet to discover. To encode these spaces, insert a <space> element with two attributes, dim (dimension), the value of which will almost always be "horizontal"; and extent, the value of which is expressed as a number of letters, determined by the size of the letters surrounding the space on the manuscript. The encoding for this manuscript would look like this:

action, <space dim="horizontal" extent="9 letters"/> in husky

In other cases, Whitman will leave a blank line that indicates the intentional blank space (apparently in addition to his major poetic innovations, Whitman also developed the Mad Lib). To encode this phenomenon, use the same strategy as above, but add a "rend" attribute with the value "underline." Therefore, this example would be tagged like this:

. . . . As in Visions of <space dim="horizontal" extent="7 letters" rend="underline"/> at<lb/> night—

Notes and Other Writings in Whitman's Hand

Notes

You will sometimes come across writing on the manuscript page that is not part of the text of the poetry manuscript proper, but instead a note of some sort about it. For example, this note follows a poem, at the bottom of the page. Whitman also often uses question marks to indicate uncertainty about word choice. This sort of material is encoded using the <note> element, which takes two required attributes: type and place. To distinguish the writing as Whitman's, use the value "authorial" for the type attribute. For the place attribute choose from the following values: "bottom," "infralinear," "inline," "interlinear," "left," "opposite," "over," "right," "supralinear," or "top."

The example should be marked up as follows:

<note type="authorial" place="bottom"> <subst> <del rend="overstrike" seq="1">sent to</del> <add rend="unmarked" place="supralinear" seq="2">pub in</add> </subst> Herald early in Feb. '88 </note>

Sometimes, Whitman will visually separate his notes from the rest of the text by drawing a boundary line, as in this example. When this happens, you need to add the rend="circled" attribute and value to <note>. (The value circled is used even though Whitman's boundary line often does not make a proper geometric "circle"). The encoding for the example would read:

<note type="authorial" place="top" rend="circled">follow copy strictly</note>

Note that <p> is only used within <note> when there are multiple paragraphs within the note.

Reverse-Side Notes

Occasionally, Whitman will write a prose note about the poem on the reverse-side of the manuscript leaf, such as a note to the printer or a comment on the poem's placement in a larger work. These notes, though they are on the reverse side, are encoded basically the same way as the notes described above are encoded, with a few minor adjustments:

The <note> tag is inserted after the <pb> tag that identifies the verso (i.e., the one with "xml:id='leaf001v'")
The value of the place attribute is "inline"
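
A sketch of these two adjustments, assuming the <pb> attributes used elsewhere in these guidelines, might read as follows. The facs value and the note text are placeholders.

```xml
<!-- Illustrative only: the note follows the <pb> that identifies the verso -->
<pb xml:id="leaf001v" facs="[image file name]" type="verso"/>
<note type="authorial" place="inline">[text of Whitman's note on the reverse side]</note>
```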

Miscellaneous Writing

The note element is also used to mark other writing on the page that, while not strictly a note, is not part of the text. Examples include: page numbers, addition or subtraction problems, and question marks.

Whitman's use of proofreading/typesetting marks is a special case. To encode a manuscript that has such marks, you should first decide whether the marks are being used to indicate a) changes to the base text, or b) emphasis of a typographic feature.

It is our current policy to encode only marks of the first type. Examples include the caret ( ^ ) to indicate an addition, the paragraph mark (¶) to indicate a new paragraph (in prose) or line (in verse), and the number sign ( # ) to indicate extra space. For instructions on encoding the caret, see #<add> Additions. For instructions on encoding the paragraph mark, see #Metamarks. The paragraph mark should be encoded as a named character entity (see chart 3.7): &para;. For instructions on encoding space markers, see #Notes.

At present we have chosen not to encode marks of the second type, though we may in a future stage of the project return and add representations of them to our markup. Examples of marks that you should not encode include triple-underlining to emphasize capitalization and horizontal curved brackets used to indicate lack of spacing between parts of hyphenated words.

Metamarks

The metamark element contains or describes any kind of graphic or written signal within a document whose function is to determine how the document should be read rather than to form part of its actual content. We use <metamark> when Whitman uses the paragraph mark (¶) to indicate a new paragraph (in prose) or line (in verse); when Whitman uses the word "over" to denote continuous text on another surface; or to denote any deletion mark that is not accounted for by <del> or <delSpan>. <metamark> can also be used to describe marks with unknown function. <metamark> requires a function attribute and can be an empty (self-closing) element if the mark cannot be represented typographically or with a character entity. For example:

<metamark function="paragraph">¶</metamark>

<metamark function="unknown"/>

The metamark element is also used to encode transposition marks that indicate relocation of blocks of text. See #Transpositions Noted by Arrows or Asterisks.

Horizontal Lines

Whitman fairly often draws a line to signal the beginning or the end of a unit of text. These lines range from full page width to small bars at the left, right, or in the center. You can see an example of a small center line at the bottom of this manuscript and an example of a small line at the left at the top of this one. We take these lines to indicate some kind of division, though we make no claims about the sort of unit(s) they define. They are encoded using the empty milestone element, with "undeclared" as the value of the unit attribute and the following as possible values of the rend attribute:

horbar-full
horbar-short-right
horbar-short-left
horbar-short-center

The first example above should be encoded as follows:

<lb/>We never separate again.—</l> <milestone unit="undeclared" rend="horbar-short-center"/> </lg1>

Brackets

Whitman sometimes used brackets to group lines or other bits of text, as in this example. These should be indicated by using the <metamark> element. In this case we use <metamark> as an empty element, and it requires the attributes "function," "rend", and "spanTo". The value of @function will be "group". The value of @rend will vary depending on the type of bracket. The possible values of rend are listed below. And, as with <addSpan> and <delSpan>, you must always mark the end of the bracketed section with an empty <anchor> element bearing an xml:id attribute whose value corresponds with the value of the "spanTo" attribute.

If the bracket is written in a hand different from that of the text it is bracketing, include a "resp" attribute on <metamark>, with a value corresponding to the xml:id declared within <handNote>.

This is the markup for the example above:

<metamark function="group" rend="bracket-left" spanTo="#s2"/> <l>I rate myself high—I receive no small sums;</l> <l>I must have my full price—whoever enjoys me.</l> <anchor xml:id="s2"/>

Possible values of "rend" for brackets

bracket-left
bracket-right
curlyBrace-right
curlyBrace-left
singleLine-left
singleLine-right
doubleLine-left
doubleLine-right

Page Numbers

Whitman's use of page numbers—combined with the history of manuscript dispersal—means we are left with both ambiguous and reliable page numbering on Whitman manuscripts. By "ambiguous," we mean manuscripts with a number, like "43," written in the top corner but no corresponding "42" or "44"; by "reliable," we mean multiple-leaf manuscripts with an ordered numbering of each leaf (ordered numbering does not mean an uninterrupted sequence that begins with "1"; instead, it means any discernable numbering system that reliably determines the leaf order).

We handle these two types of page numbers in different ways. For ambiguous numbers, we use <note>. For reliable page numbering, we add an attribute to the Page Break, or <pb> element. Specifically, we add an "n" attribute with a value that corresponds to the number written on the page. So, if a three-leaf manuscript is numbered "2," "3," "4," then the <pb>s would have n="2", n="3", and n="4".
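
The two treatments might be sketched as follows. The xml:id values follow the "leaf001r" pattern seen elsewhere in these guidelines and are otherwise assumptions; the type and place values on <note> suppose that the number is in Whitman's hand and written at the top of the page.

```xml
<!-- Reliable numbering: a three-leaf manuscript numbered "2," "3," "4" -->
<pb xml:id="leaf001r" n="2"/> . . .
<pb xml:id="leaf002r" n="3"/> . . .
<pb xml:id="leaf003r" n="4"/> . . .

<!-- Ambiguous numbering (a separate file): a lone "43" in the top corner -->
<pb xml:id="leaf001r"/>
<note type="authorial" place="top">43</note>
```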

Section Numbers

Sometimes you will encounter a manuscript with distinctly numbered sections, as in this example. These sections are different from linegroups, as they are typically numbered or otherwise clearly marked, and they often contain multiple linegroups. To handle section numbers, add a <head> tag immediately after the <lg> tag to note the "head" of that section. Here's how you would encode the example manuscript:

... <lg1 type="poem"> <head ... > <lg2 type="section"> <head type="main">1</head> <lg3 type="linegroup"> <l>Come, said the Muse,</l> .... </lg3> <lg3 type="linegroup"> ... </lg3> <lg3 type="linegroup"> </lg3> </lg2> <lg2 type="section"> <head type="main">2</head> ... </lg2> ... </lg1>

Writing in Hands Other than Whitman's

Poetry Manuscripts

Title Page Manuscripts (Titles Only)

Some manuscripts have only titles, with no content to follow those titles, or are pages with several trial titles that Whitman never used (for an example, click here). For these unusual manuscripts, we have a different <div1> type, "title notes."

<text type="manuscript"> <body> <div1 type="title notes"> . . . </div1>

To read about the unique markup used in Title Page manuscripts, go here.

Prose Manuscripts

Glued paper

For manuscripts that consist of more than one piece of paper glued together, use a <milestone> element, as follows:

<milestone unit="glued"/>

Correspondence

All transcription and encoding should be done in a text or XML editor such as NoteTab or Oxygen XML Editor. If possible, you should use Oxygen, since the template has been designed for use with Oxygen, and we have not set up updated, P5 clips or validation scenarios for NoteTab. All computers in the Whitman Archive offices at UNL have Oxygen installed, as do the computers in the Whitman office at Iowa. The following instructions assume you are using Oxygen. Do not use word processing software such as Microsoft Word; doing so will cause a number of problems.

Template and Sample Files

1. The Whitman Archive TEI P5 template for correspondence and five sample files are available for download as a compressed zip file here. Download the zip file and save to a location on your computer that you'll remember. Unzip the directory.
2. Copy the file "correspondence_template" to the following folder: C:\Program Files\Oxygen\frameworks\tei\templates\P5\
3. Put the five sample files, bos.0006, duk.00321, duk.00355, duk.00370, and loc.00560, in a location on the computer that is convenient for you (for example, I've put the files here for my reference: C:\liz\correspondence\tei\samples)

Before beginning any transcription or encoding work, review the template and sample files, paying particular attention to text within XML comments (<!-- -->). This text will appear in green if you are using Oxygen. The template includes content that will be the same in all files as well as instructions on providing information specific to individual letters. The sample files are fully transcribed and annotated letters that have been completed based on the template.

<opener> and <dateline>

<closer> and <signed> Signatures

Annotating

Before they are published on the Whitman Archive, all of the letters, including those derived from the Miller or other volumes as well as the letters we are editing for the first time, will be annotated. Initially, the letters derived from Miller will include annotations only for those items Miller annotated. We will revise to correct and update Miller's existing annotations as necessary during this first stage. At a later stage, we will add annotations for people, places, events, or other features of the letter that Miller did not annotate as part of his editorial treatment of the letters. For letters that have received no prior editorial treatment, we will be able to draw on Miller's annotations in many cases (we have the support of Miller's estate). In other cases, we will need to draft annotations from scratch. Generally, we are focusing on personal and place names, events, and references to Whitman's works for this round of annotation. Additional notes may gloss confusing or absent dates on a letter or clarify details about the sender or recipient of a letter. Our goal is to have at least a basic level of annotation for every letter, rather than an exhaustive level of annotation for a subset of the correspondence.

Once you know a point in the text that needs an annotation and you have the annotation drafted, you will need to add some additional mark-up to the file, along with the text of the annotation.

The basic markup is:

 <note type="editorial" resp="wwa" xml:id="n1">[text of the note]</note>

Each note will be encoded within a <note> tag, and the tag will take the attributes type, resp, and xml:id.

Within the body of a letter, the note type will always be "editorial."

The value for @resp will depend on the source of the note:

If it is one that we are using unchanged from Miller, the value of @resp will be "miller."
If it is an annotation based on Miller, but which we have updated or corrected, the value of @resp is "wwa."
Similarly, if the note is one that we've written here at the wwa, the value of @resp is "wwa."

@xml:id provides a unique identifier for the note and will take the format "n[#]." Give the first note in the document the xml:id of "n1," and number subsequent notes in the order in which you add them. Note that the order of the notes in the document does not need to match the numerical order of their xml:ids. For example, the third note in the document does not necessarily need to have an xml:id of "n3." In many cases, these numbers will correspond, but they do not have to, and you should not spend time reordering them if they get out of sync. The crucial thing is that the values of xml:id are unique. The document will not validate if they are not.

For example, if we want to add a note about Whitman's poem "A Carol of Harvest" to the existing encoded transcription of loc.01278, Francis P. Church to Walt Whitman, 8 August 1867:

          <body>
                <pb xml:id="leaf001r" facs="loc.01278.003.jpg" type="recto"/>
                <opener>
                    <dateline rend="right">
                        <date when="1867-08-08">Aug 8 [186]7</date>
                    </dateline>
                    <salute>My dear Sir:</salute>
                </opener>
    
                <p>I was very much satisfied to receive your fine Harvest Carol this morning. It seems
                    to me to rank with the very best of your poems, and that it
    
                    <pb facs="loc.00167.004"
                        xml:id="leaf02r" type="recto"/> is sure to last.</p>
    
                <p>I am sorry to say that it comes too late to be put as the second article &amp; will
                    need to be put in one of the later signatures. But where ever it is, it cannot be
                    hidden.</p>
    
                <p>You shall have the proof promptly.</p>
    
                <pb xml:id="leaf001v" facs="loc.01278.005.jpg" type="verso"/>
    
                <closer>
                    <salute>I am very truly yours</salute>
                    <signed>F. P. Church</signed>
                    <signed>Walt Whitman</signed>
                </closer>
    
                <pb facs="loc.00167.006" xml:id="leaf02v" type="verso"/>
            </body>

we would add the encoding and text of the note to the point it refers to in the letter. In this case, adding the annotation after "I was very much satisfied to receive your fine Harvest Carol this morning" seems reasonable. So, we would have:

    <body>
                <pb xml:id="leaf001r" facs="loc.01278.003.jpg" type="recto"/>
                <opener>
                    <dateline rend="right">
                        <date when="1867-08-08">Aug 8 [186]7</date>
                    </dateline>
                    <salute>My dear Sir:</salute>
                </opener>
    
                <p>I was very much satisfied to receive your fine Harvest Carol this morning.
    <note type="editorial" resp="wwa" xml:id="n1">
                  Whitman was first recommended to the <hi rend="italic">Galaxy</hi>'s publishers,
    William Conant Church and Francis Pharcellus
                  Church, by his friend, William Douglas O'Connor who suggested John Burroughs' article titled,
    "Notes on Walt Whitman as Poet
                  and Person." A portion of that article appeared in the <hi rend="italic">Galaxy</hi>
    during its first year of publication, in
                  December 1866. Eventually, prompted by Whitman's growing reputation, the Churches proposed
    that Whitman himself write a poem
                  concerning the theme of "harvest" for their publication. In August 1867 Whitman submitted
    "A Carol of Harvest for 1867" and the
                  poem was printed in the September edition. "A Carol of Harvest" was reprinted in
    the October 1867 issue of
                  <hi rend="italic">Tinsley's Magazine</hi>. Whitman revised the poem for
    <hi rend="italic">Passage to India</hi> (1871). After some
                  further revision, the poem appeared as "The Return of the Heroes" in
    <hi rend="italic">Leaves of Grass</hi> (1881–82).</note>
                  It seems to me to rank with the very best of your poems, and that it
    
                    <pb facs="loc.00167.004"
                        xml:id="leaf02r" type="recto"/> is sure to last.</p>
    
                <p>I am sorry to say that it comes too late to be put as the second article &amp; will
                    need to be put in one of the later signatures. But where ever it is, it cannot be
                    hidden.</p>
    
                <p>You shall have the proof promptly.</p>
    
                <pb xml:id="leaf001v" facs="loc.01278.005.jpg" type="verso"/>
    
                <closer>
                    <salute>I am very truly yours</salute>
                    <signed>F. P. Church</signed>
                    <signed>Walt Whitman</signed>
                </closer>
    
                <pb facs="loc.00167.006" xml:id="leaf02v" type="verso"/>
            </body>

Translations

Special characters

Many languages feature special characters that cannot simply be typed using a keyboard configured for American English. These characters need to be encoded using Unicode. For instance, to encode the character "ñ" (n with tilde), which is frequently used in Spanish, you would insert the numeric character reference for its Unicode code point, "&#xF1;" (or the decimal equivalent, "&#241;"). You can look up the relevant code in lists such as this one.
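
As a small illustration, a Spanish word containing "ñ" could be entered with a numeric character reference. U+00F1 is the Unicode code point for "ñ"; the word and the enclosing <l> element are only placeholders.

```xml
<!-- "&#xF1;" is the numeric character reference for U+00F1 (n with tilde) -->
<l>Espa&#xF1;a</l>
```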

Non-Latin alphabets

If you are encoding a translation in a language that does not use the Latin alphabet, for instance Russian or Hebrew, you will need to use a special character set that has to be declared at the top of the XML file.

Examples:

For texts using the Latin alphabet, the first two lines at the top of an XML file would look like this:

<?xml version="1.0" encoding="ASCII"?>
<?oxygen RNGSchema="http://www.whitmanarchive.org/downloads/wwa-1.2.rng" type="xml"?>

If you want to encode a text written in the Cyrillic alphabet, you would have to replace the word "ASCII" in the first line with the code for the Windows character set for Cyrillic, "Windows-1251":

<?xml version="1.0" encoding="Windows-1251"?>
<?oxygen RNGSchema="http://www.whitmanarchive.org/downloads/wwa.rng" type="xml"?>

For texts written in the Hebrew alphabet, the character set would be "Windows-1255."

Other alphabets use the following character sets:

Arabic: Windows-1256
Greek: Windows-1253
Turkish: Windows-1254
Baltic languages: Windows-1257
Vietnamese: Windows-1258
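
Following the same pattern, the first line of a file containing Greek text would name the Greek character set from the list above; the schema line that follows it is unaffected. This is a sketch of the declaration only.

```xml
<?xml version="1.0" encoding="Windows-1253"?>
```

The declaration only describes the file: the file itself also needs to be saved in the declared character set, or the text will not display correctly.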

Marginalia

Header

The header is done in conformity with the standard WWA TEI header, keeping in mind that the following elements may need special attention:

Identification Numbers

In <publicationStmt>, each document transcribed must have an identification number. In the header template, the identification number <idno> is indicated as duk.####. To find the id number of your file, refer to the Whitman Archive Manuscript Marginalia spreadsheet and use the number listed in the "WWA_ID" column. Hence, if the document you are encoding is "Reading Shakespeare," you would enter the id number as <idno>duk.00378</idno>. Transcriptions of documents held at other repositories will have other codes, beginning with the first free ID number after those in the poetry manuscript database.

Location Finding

In <sourceDesc>, you will need to find the document's bibliographic finding number. This is the code used by the repository holding the document. Refer to the WWA tracking list spreadsheet to locate the "Finding #" column. Thus, the document "Reading Shakespeare" would be identified as <idno type="callno">II-5 102</idno>. If no number is available in the tracking spreadsheet, leave this space empty.
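
Taken together, the two identifiers for "Reading Shakespeare" might sit in the header roughly as sketched below. The <idno> values come from the examples above; all other header content is omitted, and placing the call number directly inside a <bibl> is an assumption that should be checked against the header template.

```xml
<!-- Sketch only: other header content omitted -->
<publicationStmt>
    <!-- Whitman Archive ID, taken from the "WWA_ID" column -->
    <idno>duk.00378</idno>
</publicationStmt>
<sourceDesc>
    <bibl>
        <!-- repository finding number, taken from the "Finding #" column;
             left empty if the spreadsheet lists none -->
        <idno type="callno">II-5 102</idno>
    </bibl>
</sourceDesc>
```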

Source Information

In <sourceDesc>, you may need to include additional <bibl>s for base text or pasteons that are clippings. Add as much information as you can find, assigning the additional <bibl>s an xml:id and type to indicate either “pasteon,” for clippings that have been pasted on, or “base,” for base documents. A listing for a pasteon might read:

              <bibl type="pasteon" xml:id="r2">
                  <author></author>
                  <title level="a"></title>
                  <title level="j"></title>
                  <date></date>
                  <biblScope type="vol"></biblScope>
                  <biblScope type="pp"></biblScope>
                  <note type="project"></note>
              </bibl>

HandNotes

<profileDesc> is the place where we can indicate the main authorial hands responsible for the document. Under <handNotes>, indicate a <handNote scribeRef="#ww">, with scribeRef pointing to the author xml:id (assigned either in the titleStmt, in the case of Whitman, or in an accompanying <persName>) because it identifies the person responsible for the hand. The medium can be described by noting whether Whitman or another contributor wrote in pencil or in ink, and the color of ink. Hence, <handNote scribeRef="#ww" medium="black-ink">. It is necessary to provide an xml:id for each hand so that the hands can be referenced as needed throughout the document. If additional hands are to be indicated (to include unknown hands that have written on the document as well as hands more or less contemporary with Whitman's such as Bucke's or Traubel's), they will be added to the list of handNote scribeRefs. Different media in Whitman’s hand and typescript documents should also be added to the list as separate hands. A common <handNote> tag group might read:

 <handNotes>
      <handNote xml:id="h1" scribeRef="#ww" medium="black-ink"/>
      <handNote xml:id="h2" scribeRef="#ww" medium="pencil"/>
      <handNote xml:id="h3" scribeRef="#unk" medium="type"><persName xml:id="unk">unknown</persName></handNote>
      <handNote xml:id="h4" scribeRef="#ht" medium="black-ink"><persName xml:id="ht">Horace Traubel</persName></handNote>
    </handNotes>

Describing the Physical Features of the Document

Following the header is the description of the document proper. We follow the example of the extant Walt Whitman Archive Encoding Guidelines in establishing the structural elements of the XML document, with variations as follows:

The <text> type of each document we encode should be listed as "marginalia."

Nested inside the <body> of the <text> element is <pb/>, which assigns each page an xml:id, notes the image file, listed as facs, and indicates whether the page is a recto or verso.

For example:

       <text type="marginalia">
              <body>
                  <pb xml:id="leaf001r" facs="duk.xxxxx.001" type="recto"/>
                        <div1 type="section">
                            <head type="main"><handShift new="#h1"/>DOCUMENT
                            TITLE</head>
                        </div1>

Title Page

For longer works with notations on the title page, the title page material should be included in a segment that appears before <body>, titled <front>. This section should not include any divs. The following is an example of a title page encoding structure:

       <text type="marginalia">
              <front>
                  <pb xml:id="leaf001r" facs="loc.03449.001.jpg" type="recto"/>
                  <titlePage>
                  <note type="authorial" place="top right" resp="#h1">Walt Whitman</note>
                  <docTitle><titlePart type="main"><handShift new="#h2"/>OCCASIONAL
                  PIECES OF POETRY.</titlePart></docTitle>
                  <byline>BY <docAuthor>JOHN G. C. BRAINARD</docAuthor>.</byline>
                  <milestone unit="undeclared" rend="horbar-short-center"/>
                  <epigraph><cit><quote who="Bunyan"><l>Some said, "John, print it;" others said, "Not
                  so;"—</l>
                  <l>Some said, "It might do good;" others said, "No."</l></quote>
                  <bibl><hi rend="italic">Bunyan's Apology.</hi></bibl></cit></epigraph>
                  <milestone unit="undeclared" rend="horbar-short-center"/>
                  <docImprint><pubPlace>NEW-YORK:</pubPlace>
                  PRINTED FOR <name>E. BLISS AND E. WHITE.</name>
                  <hi rend="italic">Clayton &amp; Van Norden, Printers.</hi>
                  <milestone unit="undeclared" rend="horbar-short-center"/>
                  <docDate when="1825">1825</docDate>.</docImprint></titlePage></front>

Divs

See the Walt Whitman Archive Encoding Guidelines for an explanation of when to use <div>, which requires a type attribute. <div> marks intellectual units of a text. If the document you are encoding has multiple discernible intellectual units, separate them with <div>s; otherwise (if the document seems to be a single intellectual unit), omit <div> completely. Each <div> needs an @type: this can be "pasteon," "notes," or "section," for miscellaneous structural divisions; "verso," for unrelated verso material; or "letter," for distinct intellectual content in the form of a letter.
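
For instance, a leaf carrying reading notes followed by an unrelated draft letter could be divided as in the sketch below. The numbered <div1> follows the other examples in this section; the page identifier, hand, and placeholder text are illustrative assumptions.

```xml
<body>
    <pb xml:id="leaf001r" facs="duk.xxxxx.001" type="recto"/>
    <!-- first intellectual unit: notes -->
    <div1 type="notes">
        <p><handShift new="#h1"/>TEXT</p>
    </div1>
    <!-- second, distinct intellectual unit: a letter -->
    <div1 type="letter">
        <p>TEXT</p>
    </div1>
</body>
```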

Structure of Printed Text

Printed text with Whitman's annotations covers a wide array of generic conventions. Some of these are consistent with modules used by the Whitman Archive; some are not. Should you run across a structure that does not seem to fit the available elements, ask an editor whether there are additional TEI tags available that would be appropriate to that document.

HandShifts

In a document with multiple hands, note changes from one hand to the next with the <handShift> element. Each handShift should specify the new hand with the “new” attribute. So, for instance, you would encode a shift from hand #h1 to hand #h2 as <handShift new="#h2"/>.

1. To designate a starting hand, begin with a <handShift> inside the first structural element (see example above).

2. <handShift> will only apply when the parent element has substance, or text. So for deletions, if the deletion is an overstrike and the overstrike is the only thing in a different hand, you would use the hand attribute within the deletion, rather than a separate handShift. For example, if the word “contract,” written in hand #h1, was deleted by hand #h2, you would encode it as:

        <del rend="overstrike" hand="#h2">contract</del>

whereas if "contract" was an insertion in hand #h2, while the rest of the text was in #h1, the encoding would look like:

         "The social <add rend="insertion" place="supralinear"><handShift new="#h2"/>contract</add><handShift new="#h1"/>
         is his natural liberty..."

3. Notes include a resp attribute that specifies the hand they are written in (<note type="authorial" resp="#h2" place="top">), so you do not need to include handShifts for notes.

Adding Pasteons

Pasteons should be included in the encoding approximately in the sequence that they are encountered in the document, when possible. However, if the pasteon is flippy, or if it covers existing text on the base leaf, encode the entire leaf (noting the segments missing or covered because of the paste-on as <gap reason="covered">), and add the pasteon(s) with the structure described below after the entire verso of the base layer has been encoded.

1. The <add> element takes the "rend" attribute instead of the "type" attribute. So, for example, where you once encoded a paste-on as <add type="pasteon">, you would now encode it as <add rend="pasteon">.

2. Within the <add> element for clippings, note the source of the clipping, which should reference the appropriate <bibl> in the header.

Thus, an added clip should start off looking something like this:

  <add rend="pasteon" source="#r2">

3. If the pasted-on object is a newspaper clipping or is taken from any source other than Whitman's own writing, it will require a <q> that includes two attributes: @type, with a value of either "written" or "spoken" (or, if you're not sure, "ambiguous"), and @who, with a value stating who is speaking or writing the quote (in the case of a newspaper clipping, this will be the piece's author, if known; if the speaker/writer is unknown, the value of @who will be "unknown"). The <q> will contain a <floatingText> element. @rend on floatingText should designate whether the pasteon is “flat” or “flippy.” Then, within <floatingText> will be a <body> element, and then any textual elements that follow. You will also need to include a new <pb>, within the <body>, with a cropped image of only that paste-on. For example:

  <add rend="pasteon" source="#r1">
      <q who="unknown" type="written">
      <floatingText rend="flat">
      <body>
      <pb xml:id="leaf002r" facs="duk.xxxxx.003" type="recto"/>
      <p>TEXT</p>

4. You can indicate material on the front and back sides of flippy documents with <pb/>s, as you would for a standard leaf.

Thus, an added clip that IS a flippy should start off looking something like this:

  <add rend="pasteon" source="#r1">
      <q who="unknown" type="written">
      <floatingText rend="flippy">
      <body>
      <pb xml:id="leaf002r" facs="duk.xxxxx.003" type="recto"/>
      <p>TEXT</p>
      <pb xml:id="leaf002v" facs="duk.xxxxx.004" type="verso"/>
      <p>TEXT</p>

5. If the material that is on the back side of a flippy document is unrelated, you may omit the content (consult editors with questions about this). Maintain the “flippy” designation, and the <pb>, but for the verso simply include an empty <div> tag of type="verso", as follows:

  <add rend="pasteon" source="#r1">
      <q who="unknown" type="written">
      <floatingText rend="flippy">
      <body>
      <pb xml:id="leaf002r" facs="duk.xxxxx.003" type="recto"/>
      <p>TEXT</p>
      <pb xml:id="leaf002v" facs="duk.xxxxx.004" type="verso"/>
      <div2 type="verso"></div2>
      <p>TEXT</p>

6. If a paragraph or other structural unit is split across two paste-ons, note the continuation using the @next and @prev attributes:

   <p xml:id="p1a" next="#p1b">CONTENT OF PARAGRAPH HERE</p>
      </body>
</floatingText>
</q>
</add>

      <add rend="pasteon" source="#r1">
      <q who="unknown" type="written">
      <floatingText rend="flippy">
      <body>
      <pb xml:id="leaf002r" facs="duk.xxxxx.003" type="recto"/>
      <p xml:id="p1b" prev="#p1a">TEXT</p>

Page Numbers

In general, if they are consistent over a series of pages, page numbers can be included in the <pb> tag as follows:

   <pb xml:id="leaf002r" facs="duk.00182.002" type="recto" n="2"/>

If there is only a single number on a page, however, or if you are not sure whether it is a page number, or if the number appears in an unusual format (e.g. 2nd; 2d), you can put it in a note instead:

   <note type="authorial" resp="#h1" place="top right">2d</note>

Notes

Notes require the “type” attribute (either authorial or editorial), an indication of the hand or the author in “resp,” and an indication of “place” if possible (left, right, top, bottom, inline, supralinear, &c.). Only notes added by the Whitman Archive should be listed as editorial. Notes should be encoded differently depending on whether or not they have a clear target. If the note does have a clear target in the text (if the text is bracketed, for instance), you can point to the bracketed or underlined text in the note. An example of a note that does have a clear target, where the targeted text has been identified as a <span> with an xml:id, is:

   <note type="authorial" resp="#h2" place="right" target="#s1">Egotism</note>
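
Putting the two pieces together, a bracketed passage and the note that points to it might look like the sketch below. The <span> and <anchor> follow the pattern described under "Brackets and Underlines"; the passage text is a placeholder, and the bracket style, anchor id, and hand are assumptions chosen to match the other examples.

```xml
<p>TEXT BEFORE THE BRACKET
    <!-- the bracket: its xml:id is what the note points to; the passage runs to the anchor -->
    <span rend="bracket-right" hand="#h2" xml:id="s1" from="#s1" to="#s2"/>BRACKETED TEXT<anchor xml:id="s2"/>
    <!-- the marginal note, targeting the bracketed passage -->
    <note type="authorial" resp="#h2" place="right" target="#s1">Egotism</note>
</p>
```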

If the note has a numbered inline reference, you can assign the note an xml:id that you point to from the reference with an indication that the note is anchored. If the note is a footnote, there is no need to indicate resp. You might encode the two parts as follows:

  <ref target="#n1" xml:id="ref1">(1)</ref>
      …
      <note xml:id="n1" target="#ref1" type="authorial" place="bottom" anchored="true">
      —(1.) The ablest researches into public rights often simply
      consist of the history of past abuses, and we bewilder ourselves
      to no purpose when we take the pains to study
      them deeply.—(<hi rend="underline">Traite des Interests, France</hi>; Marquis Argenson.)—This
      is precisely what Grotius has done.—</note>

Include notes as closely as possible to where they appear in the text. You can nest notes in structural elements, like <p> or <ab>, as necessary. You should specify a “place” if possible. For example:

   <note type="authorial" resp="#h1" place="top">Assyria &amp; Egypt</note>

If the note has been circled, underlined, or otherwise marked in the same hand that the note itself is in, you can note that within <note> using rend:

    <note type="authorial" resp="#h1" place="right" rend="circled">Is this so?</note>

If the circle or underline is in a different hand, you can note that by nesting a <hi> within the <note>:

    <note type="authorial" resp="#h1" place="top"><hi rend="circled" hand="#h2">Assyria &amp; Egypt</hi></note>

Brackets and Underlines

Brackets around text (print or manuscript) should be encoded as <span>. For the <span> element, you will need to indicate what type of bracket it is (“bracket-left” or “curlyBrace-right”), designate “hand” as necessary, assign an xml:id, and provide an indication of the material that is bracketed, as follows:

   <span rend="bracket-right" hand="#h1" xml:id="s1" from="#s1" to="#s2"/>
      Once having become separate
      from hieroglyphics, alphabetic writing
      itself underwent numerous differentiations
      —multiplied alphabets were produced;<anchor xml:id="s2"/>

For a segment that is itself a structural unit with an xml:id, you can omit the xml:id in the <span> and point to that unit instead. The attribute “from” specifies the beginning of the passage being annotated; if not accompanied by a “to” attribute, it specifies the entire passage, as follows:

   <span rend="bracket-left" hand="#h1" from="#lg1"/>
                                  <lg xml:id="lg1"><l>Envy wears the mask of love, and, laughing
                                  sober fact to scorn,</l>
                                  <l>Cries to weakest as to strongest, "Ye are equals,
                                  equal-born."</l></lg>

If text has brackets on both sides, you can nest the <span>s to reflect that, as follows:

    <span rend="bracket-left" hand="#h1" from="#lg1"/><span rend="bracket-right" resp="#h2" from="#lg1"/>
                                  <lg xml:id="lg1"><l>Envy wears the mask of love, and, laughing
                                  sober fact to scorn,</l>
                                  <l>Cries to weakest as to strongest, "Ye are equals,
                                  equal-born."</l>
                                  <anchor xml:id="s2"/><anchor xml:id="s4"/>

Underlines should be marked using the <hi> element. Text marked as underlined would look like <hi rend="underline" hand="#h1">. If you need to point to underlined text, in the case of a corresponding note, you should add an xml:id within the <hi> (<hi rend="underline" hand="#h1" xml:id="u1">).
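
An underlined phrase with a corresponding marginal note could therefore be sketched as follows. The text, the hand, and the note's placement are placeholders for illustration.

```xml
<p>TEXT <hi rend="underline" hand="#h1" xml:id="u1">UNDERLINED TEXT</hi> TEXT
    <!-- the note points back to the underlined phrase by its xml:id -->
    <note type="authorial" resp="#h1" place="right" target="#u1">NOTE TEXT</note>
</p>
```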

Formeworks and Running Heads

For printed texts, you can indicate the running heads or formework material immediately after the <pb> as follows:

   <pb xml:id="leaf001r" facs="duk.00400.001" type="recto"/>
      <fw type="date" place="top left">1847.]</fw>
      <fw type="pageNum" place="top right">533</fw>
      <fw type="header" place="top"><hi rend="italic">Washington and his Generals.</hi></fw>
      <fw type="sig" place="bottom">3*</fw>

[Especially relevant for OWU Scrapbook]: Regularizing Spelling of Countries and Cities

Spellings of cities and countries may need to be normalized (for example, "Guatimala" should be regularized as "Guatemala"). Where a name has changed because of a political transformation (for example, "Persia"), regularization is not necessary. We regularize according to American standards for geographical names; if in doubt, especially for foreign cities, one good resource for checking standard names is the Getty Thesaurus: http://www.getty.edu/research/tools/vocabularies/tgn/index.html.

Attribute Designations (for marginalia)

Add @rend

insertion
unmarked
overwrite
pasteon
lasso (if a line is drawn from the caret to the insertion)

Bibl @type

pasteon
base

FloatingText @rend

flat
flippy

Gap @reason

[NOTE: <gap> includes @quantity or @atLeast, and @unit, as appropriate]

if something is unavailable rather than illegible, use @atLeast="1" and @unit="chars" (instead of @quantity and various other values)

cut away
illegible
covered
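
As a sketch of how these values combine, text hidden under a paste-on, where the amount of missing text cannot be determined, might be recorded as below. The surrounding paragraph is a placeholder.

```xml
<!-- the text under the paste-on is unavailable rather than illegible,
     so @atLeast and @unit are used instead of @quantity -->
<p>VISIBLE TEXT <gap reason="covered" atLeast="1" unit="chars"/> VISIBLE TEXT</p>
```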

Hi @rend

underline
doubleUnderline
italic
smallcaps
circled

Milestone @rend

horbar-short-left
horbar-short-right
horbar-short-center
horbar-full
vertbar-short-top
vertbar-short-bottom
vertbar-full

Note @rend

vertical (for notes written sideways)

Span @rend

bracket-left
bracket-right
curlyBrace-right
curlyBrace-left
singleLine-left
singleLine-right
doubleLine-left
doubleLine-right
wavyLine-right
wavyLine-left

Metamark @function

transpose
uncertainty (for the "?" that Whitman sometimes uses to indicate uncertainty about a particular word choice)
deletion
group (for marks such as parenthetical brackets used to group words or lines. See, for example, the parentheses here and here, which indicate the boundaries of textual units to be transposed.)
paragraph
unknown
location (in cases like "over")
highlight
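
For example, a question mark that Whitman wrote beside a word to signal doubt about it might be encoded roughly as follows. The enclosing <l>, the placeholder text, and the position of the mark within the line are assumptions made for this sketch; consult an editor about placement in an actual document.

```xml
<!-- sketch: the "?" is recorded as a metamark rather than as part of the running text -->
<l>TEXT <metamark function="uncertainty">?</metamark> TEXT</l>
```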

External Links

Because links to websites outside the Whitman Archive often fail when content is removed or URLs are altered, and we do not have the staffing resources to conduct regular checks and updates for a wide range of such links, in general we have avoided linking out to external websites. We do have a links page where we have collected and offer a series of external links that are checked regularly and updated as necessary. We have also included links to specific Walt Whitman Quarterly Review materials elsewhere on the Whitman Archive in the past. If you are creating a link to materials on the WWQR website, be sure to use the unique, stable URL specific to the item. In most cases this will involve a "doi" in the URL. For instance: https://doi.org/10.13008/0737-0679.2275.

We currently include an external link to the Brooklyn Daily Eagle issues offered by the Brooklyn Public Library on both the external links page and the page devoted to Whitman's newspaper editing. Eventually we hope to obtain and provide users access to our own images of those issues.

Consult a senior editor or a director before adding any other links to external websites.

Resources

Repository Codes

These codes are used to create unique identifiers for documents. Email Brett or Nikki if the repository you're looking for isn't listed.

**Library Codes**
Library/Collection	Location	Code
American Antiquarian Society	Worcester, MA	aas
Amherst College	Amherst, MA	amh
Anderson Galleries		and
Boston College Library	Chestnut Hill, MA	bcl
Boston Public Library	Boston, MA	bpl
Boston University	Boston, MA	bos
Bowdoin College	Brunswick, ME	bow
Brigham Young University	Provo, UT	byu
British Library	London, UK	brl
British Museum	London, UK	bms
Brooklyn Historical Society	Brooklyn, NY	bhs
Brown University	Providence, RI	brn
Bryn Mawr College	Bryn Mawr, PA	bmr
Buffalo & Erie County Public Library	Buffalo, NY	bec
Burlington Public Library (Iowa)	Burlington, IA	bur
Camden County Historical Society	Camden, NJ	chs
Clark University	Worcester, MA	clu
Clifton College	Bristol, UK	clc
Columbia University	New York, NY	col
Connecticut State Library	Hartford, CT	csl
Cornell University	Ithaca, NY	cor
D.B. Weldon Library	Ontario, CA	dbw
Dartmouth College	Hanover, NH	dar
Duke University	Durham, NC	duk
East Carolina University	Greenville, NC	ecu
Edward Carpenter Collection		ecc
Emory University	Atlanta, GA	emu
Folger Shakespeare Library	Washington, DC	fol
Harvard University	Cambridge, MA	har
Haverford College	Haverford, PA	hav
Hillwood Museum	Washington, DC	hlm
Hirshhorn Museum and Sculpture Garden, Smithsonian Institution	Washington, DC	hmu
Historical Society of Pennsylvania	Philadelphia, PA	hsp
Huntington Public Library	Huntington, NY	hpl
The Huntington Library, Art Collections, and Botanical Gardens	San Marino, CA	hun
Indiana University	Bloomington, IN	inu
Iowa Historical Museum	Des Moines, IA	ihm
J. Paul Getty Museum	Malibu, CA	pgm
Johns Hopkins University	Baltimore, MD	jhu
Kansas State Historical Society	Topeka, KS	khs
Knox College	Galesburg, IL	knx
Library of Congress	Washington, DC	loc
Lincoln Memorial Library		lml
Liverpool Central Library	UK	lcl
Maine Historical Society	Portland, ME	meh
Manchester University	Manchester, UK	man
Marietta College	Marietta, OH	mar
Mason City Public Library	Mason City, IA	mcl
Massachusetts Historical Society	Boston, MA	mas
Middlebury College	Middlebury, VT	mid
Mills College	Oakland, CA	mil
Minnesota Historical Society	St. Paul, MN	mnh
Missouri Historical Society	St. Louis, MO	mhs
Musee de la Cooperation Franco-Americaine	FR	mcf
National Archives	Washington, DC; College Park, MD	nar
New Hampshire Historical Society	Concord, NH	nhh
New Jersey Historical Society	Newark, NJ	njh
New York Historical Society	New York, NY	nyh
New York Public Library	New York, NY	nyp
New York University	New York, NY	nyu
Newberry Library	Chicago, IL	nby
Nineteenth-Century Shop		ncs
Northwestern University	Evanston, IL	nwu
Oberlin College	Oberlin, OH	obl
Ohio Wesleyan University	Delaware, OH	owu
Pierpont Morgan Library	New York, NY	pml
Princeton University	Princeton, NJ	pru
Private Collection		prc
Reference Library Bolton		bol
Rhode Island Historical Society	Providence, RI	rdi
Rosenbach Museum and Library	Philadelphia, PA	rml
Rowan University	Glassboro, NJ	row
Royal Library of Copenhagen		rlc
Rutgers University	New Brunswick, NJ	rut
Sagamore Hill		sgh
Salisbury House	Des Moines, IA	sal
Smith College	Northampton, MA	smi
Southern Illinois University	Carbondale, IL	siu
St. John's Seminary	Brighton, MA	sjs
St. Lawrence University	Canton, NY	stl
Stanford University	Palo Alto, CA	sta
Syracuse University	Syracuse, NY	syr
Temple University	Philadelphia, PA	tem
Union College	Schenectady, NY	uco
University at Buffalo	Buffalo, NY	buf
University of California, Berkeley	Berkeley, CA	ucb
University of California, Los Angeles	Los Angeles, CA	ucl
University of Chicago	Chicago, IL	uch
University of Illinois-Springfield	Springfield, IL	uis
University of Iowa	Iowa City, IA	iow
University of Kansas	Lawrence, KS	uka
University of Kentucky	Lexington, KY	uky
University of Nebraska-Lincoln	Lincoln, NE	unl
University of North Carolina	Chapel Hill, NC	unc
University of Pennsylvania	Philadelphia, PA	upa
University of Rhode Island	Kingston, RI	uri
University of Richmond	Richmond, VA	rid
University of Rochester	Rochester, NY	uro
University of South Carolina	Columbia, SC	usc
University of Texas, Austin	Austin, TX	tex
University of Tulsa	Tulsa, OK	tul
University of Virginia	Charlottesville, VA	uva
Washington University	St. Louis, MO	wau
Walt Whitman Birthplace	Huntington Sta, NY	wwb
Walt Whitman House	Camden, NJ	wwh
The Walter Hampden Memorial Library	NY	whm
Wellesley College	Wellesley, MA	wel
Wesleyan University	Middletown, CT	wes
Western Carolina University	Cullowhee, NC	wcu
Williams College	Williamstown, MA	wil
Wisconsin Historical Society	Madison, WI	whs
Yale University	New Haven, CT	yal

Preferred Citations

These citations need to be used when creating EAD documents and when completing the bibliographic information in <sourceDesc>. NOTE: if you do not see a collection here, the citation should be the same as the name of the repository, listed next to the abbreviation above. If you do not see the repository listed here or in the above table, contact Brett or Nikki for the correct citation information.

**Preferred Citations for Institutions Holding Whitman Materials**
Library/Collection	Preferred Citation
American Antiquarian Society	Bolton-Stanwood Family Papers, American Antiquarian Society
Amherst College	Amherst College Archives and Special Collections
Boston Public Library	The Walt Whitman Collection, Boston Public Library
Boston University	The Alice and Rollo G. Silver Collection in the Special Collections at Boston University
Brigham Young University	L. Tom Perry Special Collections, Brigham Young University
British Library	By permission of The British Library
Brown University	John Hay Library, Brown University
Columbia University (Moncure Conway Papers)	Moncure Daniel Conway Papers 1847-1907, Rare Book and Manuscript Library, Columbia University in the City of New York
Columbia University (Engel Collection)	Solton and Julia Engel Collection of Literary Letters, Manuscripts and Drawings 1832-1935, Rare Book and Manuscript Library, Columbia University in the City of New York
Columbia University (Kempner Collection)	Alan H. Kempner Collection of Literary Letters and Manuscripts 1809-1981, Rare Book and Manuscript Library, Columbia University in the City of New York
Columbia University (Walt Whitman Collection)	Walt Whitman Papers ca. 1842-1969, Rare Book and Manuscript Library, Columbia University in the City of New York
Columbia University (Talcott Williams Collection)	Walt Whitman Documents 1884-1890, Rare Book and Manuscript Library, Columbia University in the City of New York
D.B. Weldon Library	Richard Maurice Bucke Collection, Department of Rare Books and Special Collections, The D.B. Weldon Library, University of Western Ontario
Dartmouth College	The Rauner Special Collections Library, Dartmouth College
Duke University, Trent Whitman Collection	Trent Collection of Whitmaniana, David M. Rubenstein Rare Book & Manuscript Library, Duke University
Emory University	Robert W. Woodruff Library, Special Collections Department, Emory University
Harvard University, Houghton Library	Manuscripts Department, Houghton Library, Harvard University
Haverford College	Haverford College Quaker and Special Collections
The Huntington	The Huntington Library, Art Collections, and Botanical Gardens
The Huntington Public Library	Treasures From Walt Whitman, Huntington Public Library, Huntington, New York
Indiana University	Courtesy, The Lilly Library, Indiana University, Bloomington, Indiana
Johns Hopkins University	Special Collections, The Milton S. Eisenhower Library, The Sheridan Libraries, The Johns Hopkins University
Kendall Reed Collection	Private collection of Dr. Kendall Reed
Knox College	Special Collections and Archives, Knox College Library, Galesburg, Illinois
Library of Congress, Feinberg Collection	The Charles E. Feinberg Collection of the Papers of Walt Whitman, 1839-1919, Library of Congress, Washington, D.C.
Library of Congress, Harned Collection	The Thomas Biggs Harned Collection of the Papers of Walt Whitman, 1842-1937, Library of Congress, Washington, D.C.
Library of Congress, Hannah Whitman Heyde Papers	Hannah Louisa Whitman Heyde Papers, 1853-1892, Library of Congress, Washington, D.C.
Library of Congress, Spofford Papers	Ainsworth R. Spofford Papers, Library of Congress, Washington, D.C.
Library of Congress, Papers of Robert G. Ingersoll	Papers of Robert G. Ingersoll, Library of Congress, Washington, D.C.
Library of Congress, Papers of Benjamin Holt Ticknor	Papers of Benjamin Holt Ticknor, Library of Congress, Washington, D.C.
Library of Congress, Papers of Louise Chandler Moulton	Papers of Louise Chandler Moulton, Library of Congress, Washington, D.C.
Library of Congress, Institute of Aerospace Sciences archives, 1783-1962	Institute of Aerospace Sciences Archives, 1783-1962, Library of Congress, Washington, D.C.
Library of Congress, Burton Norvell Harrison family papers	Burton Norvell Harrison Family Papers, Library of Congress, Washington, D.C.
Library of Congress, Newspaper & Current Periodical Reading Room	Newspaper & Current Periodical Reading Room, Library of Congress, Washington, D.C.
Mills College	Albert M. Bender Collection, Special Collections Department, F. W. Olin Library, Mills College
The Morgan Library	The Pierpont Morgan Library, New York
National Archives and Records Administration	Records of the Adjutant General's Office, National Archives and Records Administration
New York Public Library, Berg Collection	The Henry W. and Albert A. Berg Collection of English and American Literature, New York Public Library
New York Public Library, Oscar Lion Papers	The Oscar Lion Papers, 1914–1955, New York Public Library, New York, N.Y.
New York Public Library, Poetry Society of America Records	Poetry Society of America Records, Manuscripts and Archives Division, The New York Public Library
New York Public Library, Alfred Williams Anthony Collection	Alfred Williams Anthony Collection, Manuscripts and Archives Division, The New York Public Library
Northwestern University	Charles Deering McCormick Library of Special Collections, Northwestern University
Ohio Wesleyan University	The Bayley-Whitman Collection, Ohio Wesleyan University, Delaware, OH
Princeton University	Manuscripts Division, Department of Rare Books and Special Collections, Princeton University Library
Princeton University (Robert H. Taylor Collection)	Robert H. Taylor Collection, Princeton University
Rutgers University	Special Collections and University Archives, Rutgers University Libraries, Rutgers, The State University of New Jersey
Southern Illinois University	Special Collections Research Center, Southern Illinois University Carbondale
St. Lawrence University	Ruth Atwood Black Collection of Alexander Black and Edith O'Dell Black, St. Lawrence University Library, Canton, N.Y.
Syracuse University Library	Walt Whitman Collection, Special Collections Research Center, Syracuse University Library, Syracuse, N.Y.
Temple University	Rare Books and Manuscripts, Special Collections, Temple University Libraries, Temple University
Union College (Stillman Letters Collection)	Schaffer Library at Union College in Schenectady, NY
University of California, Berkeley	Walt Whitman Collection, The Bancroft Library, University of California, Berkeley
University of California, Los Angeles	William Andrews Clark Memorial Library, University of California, Los Angeles
University of Chicago	Special Collections, University of Chicago
University of Iowa	University of Iowa Special Collections and University Archives
University of Kansas	Walt Whitman Collection, University of Kansas, Kenneth Spencer Research Library
University of Manchester	Reproduced by courtesy of the University Librarian and Director, The John Rylands Library, The University of Manchester
University of Nebraska-Lincoln	University of Nebraska-Lincoln Archives and Special Collections
University of Pennsylvania	Walt Whitman Collection, 1842–1957, Rare Book & Manuscript Library, University of Pennsylvania
University of Rhode Island	University of Rhode Island Special Collections and Archives
University of Texas, Humanities Research Center	The Walt Whitman Collection, Harry Ransom Humanities Research Center, The University of Texas at Austin
University of Tulsa	Walt Whitman Ephemera, University of Tulsa
University of Virginia	Papers of Walt Whitman (MSS 3829), Clifton Waller Barrett Library of American Literature, Albert H. Small Special Collections Library, University of Virginia
Walt Whitman House	Walt Whitman House, Camden, N.J.
Walter Hampden Memorial Library	The Walter Hampden Memorial Library, The Players, New York
Washington University	George N. Meissner Collection, Department of Special Collections, Washington University Libraries, Washington University
Yale	Yale Collection of American Literature, Beinecke Rare Book and Manuscript Library

Link to TEI Guidelines

TEI P5 Guidelines

Link to unScripting Whitman, a Whitman Handwriting Tool

Whitman's Handwriting Examples