3. Local Encoding that Varies from Document to Document

3.1 Spacing
3.2 Deletions
3.3 Additions
3.4 Additions and Deletions in Combination
3.5 Illegible or Missing Text
3.6 Hyphenation and Non-Standard Spelling
3.7 Unusual Characters and Marks
3.8 Graphically Distinctive Text
3.9 Signatures and Dates
3.10 Other Writing in Whitman's Hand
3.11 Writing in Others' Hands
3.12 Cutting and Pasting
3.13 Page Breaks
3.14 Encoding Corrected Proofs
3.15 Encoding Prose
3.16 Encoding Lists
3.17 Manuscripts That Are Neither Poetry Nor Prose
3.18 Enigmas
3.19 Work Relationships and Date Information

3.1 Spacing

Recent work with stylesheets has taught us that paying attention to and regularizing the encoding of white space is important as we prepare manuscripts for display on the site. The most important guideline is simply to be conscious of spacing as you transcribe and encode, but here are a few more specific rules to follow:

Be sure to put a space between words. Remember that, after processing, the markup will be invisible, so your transcription needs to include the spaces that separate words even when the words are separated in the XML document by one or more tags.

Avoid spaces before closing <add> or <del> tags. Since Whitman's revisions typically did not involve the addition or deletion of white space after the last word of a phrase, make sure you insert the space outside the closing <add> or <del>. A properly spaced transcription looks like this:
<add type="unmarked" place="supralinear">Song</add> of Myself.

Within <app> structures, insert the spaces between the closing <add> or <del> tag and the closing <rdg> tag. All characters must be contained within the "reading" or <rdg>, so spaces outside of <rdg> will be ignored. A properly spaced <app> structure should look like this:
Song of <app>
<rdg varSeq="1">
<del type="overstrike">You</del> </rdg>
<rdg varSeq="2">
<add type="unmarked" place="supralinear">Myself</add> </rdg>
</app>

Spaces before closing <l> or <seg> tags are unnecessary and should be eliminated.

Spaces before and after the em dash (—) should be eliminated.

Do not insert unnecessary spaces. Often, encoders have inserted spaces that are not part of the transcription (for example, to make the tagging more human-readable). You can use as many returns as you wish to make the markup easier to read, but please do not use the space bar.
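The spacing rules above lend themselves to a mechanical check. The following Python sketch is not part of the project's tooling: the function name find_spacing_problems and the rule labels are hypothetical, and the regular expressions only approximate the rules as stated here (they do not, for example, know about the special placement of spaces inside <app>).

```python
import re

# Each pattern approximates one spacing rule from this section.
# The labels and this function are illustrative assumptions, not project tools.
SPACING_RULES = [
    ("space before closing <add> or <del>", re.compile(r"[ \t]+</(?:add|del)>")),
    ("space before closing <l> or <seg>", re.compile(r"[ \t]+</(?:l|seg)>")),
    ("space next to em dash", re.compile(r"[ \t]+\u2014|\u2014[ \t]+")),
]


def find_spacing_problems(markup):
    """Return (line_number, rule_label) pairs for each rule a line breaks."""
    problems = []
    for number, line in enumerate(markup.splitlines(), start=1):
        for label, pattern in SPACING_RULES:
            if pattern.search(line):
                problems.append((number, label))
    return problems


sample = (
    '<add type="unmarked" place="supralinear">Song </add>of Myself.\n'
    '<add type="unmarked" place="supralinear">Song</add> of Myself.'
)
print(find_spacing_problems(sample))
# prints [(1, 'space before closing <add> or <del>')]
```

Only the first sample line is reported, because its space sits inside the closing </add> tag; the second line follows the rule. A closing tag indented with the space bar on its own line would also be reported, which matches the advice to use returns rather than spaces for readability.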

[To learn how to encode Whitman's use of intentional space within lines, see "Intentional Inline Spaces" in section 3.10.]

3.2 Deletions

What to Mark

Attributes

Deletion of Longer Passages

What to Mark

Use <del> to mark a letter, word, or passage that has been deleted by any method. Use common sense when marking deletions; if an entire line has been crossed out, for example, but the horizontal line does not physically intersect with a comma that follows the passage, you should still assume that the comma is intended to be included in the deletion. In cases of doubt, please consult Ed Folsom or Kenneth Price for their reading of the passage.

Attributes

type is the only required attribute for the del element. Possible values are:

overstrike: A line or lines are drawn through rejected letters, words, or passages. This is by far the most common method of deletion in Whitman manuscripts.

erasure: Whitman has erased part of the text.

hashmark: A vertical or diagonal line or lines marks through a large chunk of text (often the whole manuscript page).

pasteover: Whitman has deleted text by pasting another piece of paper on top of it.

overwrite: Letters or words are marked for deletion by being written over with other letters or words.

NOTE: Each of these types of deletion can occur in combination with additions; "overwrite" does by definition, and "pasteover" almost always does. For information about marking combinations of additions and deletions, see section 3.4, "Additions and Deletions in Combination."

Whitman's overstrikes are usually emphatic and easily recognizable, but occasionally one sees a mark which may be either an overstrike or a stray pen mark. In these cases, first check with Kenneth Price or Ed Folsom, and then use the optional cert attribute to indicate your degree of certainty that the passage has been deleted. For example, on the Duke manuscript "I see who you are," a few lines from the bottom, the word "editor" appears to be struck through. This might be tagged as follows:
<del type="overstrike" cert="80%">editor</del>

Deletion of Longer Passages

Occasionally you will encounter a long passage that has been deleted. For these passages, use <del> unless doing so would create a nesting problem. For example, consider again the manuscript in the example above. Here, the long vertical strike is marked simply by enclosing the entire poem with a del element thus:
. . .
<body>
<pb/>
<del type="hashmark">
<lg1 type="poem">
. . .
</lg1>
</del>
</body>
. . .

Imagine, however, a different scenario, one in which a deleted passage consists of the last two words of one line segment <seg> and all of the next line segment. Because using <seg><del></seg><seg></seg></del> would violate the nesting rule, using paired <del> </del> tags to mark this kind of deletion is unacceptable. Instead, you must use the elements delSpan and anchor, the first to mark the beginning of the deleted passage and the second to mark the end. As with <del>, the type attribute is required for <delSpan>. The to attribute is also required, since it provides a "pointer" to the anchor. The value of the to attribute must be the same as the value of the required id attribute on the corresponding anchor element. To create these, use "d" (for a deletion) plus the next available number ("d1" for the first <delSpan>, "d2" for the second, and so on). Please note: if the manuscript calls for multiple <delSpan> elements, you will need to use distinct identifiers for each anchor. In other words, you may only use "d1" or "d2" once in a document; subsequent identifiers will have to be "d3," "d4," etc.

Consider as an example the Duke manuscript "To be at all". If we ignore other complexities for the moment, the multi-segment deletion in the line that begins "One no more than" should be marked as follows:

. . .
<seg>and out of me <delSpan type="overstrike" to="d1"/>of me more bliss</seg>
<seg>than I thought the spheres</seg>
<seg>could carry.</seg>
</l>
<anchor id="d1"/>
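The pointer rules for <delSpan> (and, in the next section, <addSpan>) can be checked mechanically. The sketch below is only an illustration under stated assumptions: the function name check_span_pointers is hypothetical, the fragment is wrapped in an <lg1> element so that it parses as a complete document, and the check covers only the two rules described above (every to value must match an anchor id, and each anchor id may be used once).

```python
import xml.etree.ElementTree as ET


def check_span_pointers(xml_text):
    """Report span pointers without an anchor, and duplicate anchor ids."""
    root = ET.fromstring(xml_text)
    anchor_ids = [
        anchor.get("id")
        for anchor in root.iter("anchor")
        if anchor.get("id") is not None
    ]
    problems = []
    # Each identifier ("d1", "d2", ...) may appear on only one anchor.
    for anchor_id in sorted(set(anchor_ids)):
        if anchor_ids.count(anchor_id) > 1:
            problems.append(f"anchor id {anchor_id!r} is used more than once")
    # The "to" attribute must point at an existing anchor.
    for tag in ("delSpan", "addSpan"):
        for span in root.iter(tag):
            target = span.get("to")
            if target not in anchor_ids:
                problems.append(f"<{tag}> points to missing anchor {target!r}")
    return problems


# The <lg1> wrapper is an assumption added so the fragment is well-formed.
sample = """<lg1 type="poem"><l>
<seg>and out of me <delSpan type="overstrike" to="d1"/>of me more bliss</seg>
<seg>than I thought the spheres</seg>
<seg>could carry.</seg>
</l>
<anchor id="d1"/></lg1>"""
print(check_span_pointers(sample))
# prints []
```

An empty list means that every pointer found its anchor and no identifier was reused.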

3.3 Additions

What to Mark

Attributes

Addition of Longer Passages

Transpositions Noted by Arrows or Asterisks

What to Mark

Use <add> to mark any part of the text whose placement, ink, etc. clearly indicate that it was added to the manuscript after the surrounding text was written.

Attributes

Two attributes are required on the <add> element: type and place.

For type, the possible values are:

insertion: marked by a caret (which often looks like an "x").

unmarked: added without caret or other mark.

overwrite: written over earlier text.

pasteon: written on a piece of paper that is glued to paper with earlier text.

For place, the possible values are:

supralinear: above the line.

inline: in space available on the same line as earlier text.

infralinear: below the line.

over: over the earlier letter, word, or phrase.

margintop: in top margin.

marginbot: in bottom margin.

marginleft: in left margin.

marginright: in right margin.

interlinear: between lines.

Addition of Longer Passages

The rules for marking long additions are similar to those for marking long deletions. Use <add> unless doing so would create a nesting problem, and use the elements addSpan and anchor to mark the beginning and end of an added passage that doesn't nest within other elements. <addSpan> has three required attributes: to, type, and place. Available values for type and place are the same as those listed above for <add>. The value of the to attribute must be the same as the value of the required id attribute on the corresponding anchor element. To create these, use "a" (for "addition") plus the next available number ("a1" for the first <addSpan>, "a2" for the second, and so on).

Transpositions Noted by Arrows or Asterisks

Some manuscripts have brackets, arrows, and/or a series of asterisks to indicate Whitman's desire to move a line or lines to a different place in the poem. To encode this phenomenon, we use the <transpose> element. In this example, Whitman has bracketed one line and indicated with an asterisk in the margin that the line should be moved down. The encoding for this section follows; the tagging most pertinent to the transposition is the <note>, <transpose>, and <anchor> markup.

<note type="authorial" place="marginleft">-down</note>
<transpose rend="bracketed" anchored="yes" target="t1">
<l><seg>Of the native scorn of grossness</seg>
<seg>and gain there, (O it lurks</seg>
<seg>in me night and day—What</seg>
<seg>is gain, after all, to savage-</seg>
<seg>ness and freedom?)</seg></l>
</transpose>
<l><seg>Of immense spiritual <app><rdg varSeq="1"><del type="overstrike">things</del></rdg>
<rdg varSeq="2"><add type="unmarked" place="supralinear"> results</add></rdg></app>, future years,</seg>
<seg>inland, spread there each side of</seg>
<seg>the Anahuacs,</seg></l>
<l><seg>Of these Leaves established there, and</seg>
<seg>well understood there.—</seg></l>
<anchor id="t1"/>
<milestone unit="undeclared" rend="horbar"/>
<note type="authorial" place="marginleft">take down-</note>

Explanation:

We use <note> to transcribe any marginal characters that indicate the transposition.

When Whitman brackets the part to be moved, we add rend="bracketed" to the <transpose> tag.

The anchored attribute is used to note whether or not the manuscript clearly indicates where the line(s) are to be moved. (We know of at least one example where this is not known.) This attribute has one of two possible values: yes or no.

If the value of the anchored attribute is yes, a target attribute is required; it points to an <anchor> that is placed at the target—the point in the manuscript to which the part is to be moved. If the value of anchored is no, the target attribute is not required.

3.4 Additions and Deletions in Combination

Substitutions

Overwriting

Pasting

Nesting <add> and <del>

Substitutions

Very frequently, Whitman's additions are not merely appended to earlier text but are substituted for earlier text. It is our policy to link the deleted and added portions by marking each as a reading <rdg> within an <app> element. ("App" is short for "apparatus entry." For information about the use of this element in other contexts, see chapter 19 of the TEI Guidelines.) Each <app> will contain at least two <rdg>s, and may contain up to five. The <app> element requires no attributes; the rdg element requires the varSeq ("variant sequence") attribute, the value of which is a single-digit number that indicates the relative order in which the present reading is presumed to have been written. For this example line segment, the markup would look like so:
<l><seg>Old Asia's
<app>
<rdg varSeq="1"><del type="overstrike">self</del></rdg>
<rdg varSeq="2"><add place="supralinear" type="unmarked"> there</add> </rdg>
</app>
with venerable</seg>
. . .

For obvious reasons, usually the first reading will contain a deletion and the second reading will contain an addition. This is not always true, however. You may come across instances where multiple variants are left undeleted. In this case, the first <rdg> will contain no other elements, just the transcribed word(s).

Or you may find that a second reading was added but subsequently rejected in favor of the first. In this case, the second <rdg> will contain both an <add> and a <del>. (See "Nesting <add> and <del>" below for an explanation of how to arrange these.)

If the state of the manuscript makes it difficult to determine with certainty the order of the readings, a resp attribute is available for the <rdg> element; the value of this attribute can be used to identify (by initials) the person responsible for asserting an order of readings.

Overwriting

You will sometimes encounter a substitution in which a letter or word has been overwritten with another letter or word. In such cases, the value of the place attribute on the add element is "over"; the value of the type attribute for both the <add> and <del> elements is "overwrite." For example, consider this manuscript excerpt, in which Whitman has changed the "e" from lower- to upper-case. The markup for this word is:

<app>
 <rdg varSeq="1">
 <del type="overwrite">e</del>
 </rdg>
 <rdg varSeq="2">
 <add type="overwrite" place="over">E</add>
 </rdg>
</app>ach

Pasting

Fairly frequently, Whitman made substitutions by pasting one page or scrap over another. Treat such cases as you would other substitutions, by using the app, rdg, del, and add elements. For the type attribute on <del>, use the value "pasteover." The type attribute of the <add> should be given the value "pasteon," and the place attribute should be given the value "over." For an example, look at this manuscript leaf.
. . .
<app><rdg varSeq="1"><del type="pasteover">
<l><seg>And that night O you happy</seg>
<seg>waters, I heard you beating</seg>
<seg>the shores – But my heart</seg>
<seg>beat happier than you – for</seg>
<seg>he I love is returned and</seg>
<seg>sleeping by my side,</seg></l>
<l><seg>And that night in the stillness</seg>
<seg>his face was inclined toward</seg>
<seg> me while the moon's clear</seg>
<seg>beams shone,</seg></l>
<l><seg>And his arm lay lightly over my</seg>
<seg>breast – And that night I</seg>
<seg> was happy.</seg></l>
</del></rdg>
<rdg varSeq="2"><add type="pasteon" place="over">
<l><seg>And that night, while all</seg>
<seg>was still, I heard the</seg>
<seg>waters roll slowly continually</seg>
<seg>up the shores</seg></l>
<l><seg>I heard the hissing rustle of</seg>
<seg>the liquid and sands, as directed</seg>
<seg>to me, whispering, to congratulate</seg>
<seg>me, – For the friend I love lay</seg>
<seg>sleeping by my side,</seg></l>
<l><seg>In the stillness his face was in-</seg>
<seg>clined towards me, while the</seg>
<seg>moon's clear beams shone,</seg></l>
<l><seg>And his arm lay lightly over my</seg>
<seg>breast – And that night I was happy.</seg></l>
</add></rdg></app>
. . .

Nesting <add> and <del>

Of course, not all combinations of <add> and <del> are substitutions. Consider this manuscript excerpt. To indicate that the addition was deleted, the add element should be nested within the del element:
from
<del type="overstrike">
 <add type="unmarked" place="supralinear">this</add>
</del>
base

There are a number of other ways in which Whitman combined additions and deletions—probably too many to cover each one separately here. You should be able to handle almost all situations you encounter by applying these principles and rules:

Nesting operates on a radial principle, working from the center out. For additions and deletions, this means that when the boundaries of a deletion and an addition are the same, the <add> should be nested within the <del>. This indicates 1) that the material was added; and 2) that the addition itself was deleted. If, however, only part of an addition has been deleted, the <del> will, of course, be nested inside the <add>.

<add>s and <del>s, in various combinations, can be nested within one another, with no theoretical limit to the "depth" of that nesting. So it's entirely possible to have, for example, a <del> within an <add> within an <add> within a <del>.

<app>s, however, should never be nested inside other <app>s, even though you will occasionally encounter situations which seem to call for such markup. Because <app>s within <app>s would create difficulties for computer processing, our project policy is to mark only the "highest level" substitution as such and to mark interior substitutions only with <add> and <del>, as appropriate.

All <add>s and <del>s must nest properly within any other elements that are present. In particular, you should be careful not to straddle line or line segment boundaries.

To understand how to approach a complicated series of additions and deletions, take a look at this manuscript line segment. Note that it shows two substitutions: the word "for" replaces "to," and the phrase "time's hourly ceaseless" replaces "the varied." The fact that "hourly" and "ceaseless" have then been deleted adds a further complication. This is how the segment should be encoded:

<seg>No more
 <app>
 <rdg varSeq="1">
 <del type="overstrike">to</del>
 </rdg>
 <rdg varSeq="2">
 <add type="unmarked" place="supralinear"> for</add>
 </rdg>
 </app>
him
 <app>
 <rdg varSeq="1">
 <del type="overstrike">the varied,</del>
 </rdg>
 <rdg varSeq="2">
 <add type="insertion" place="supralinear">
time's
 <del type="overstrike">hourly ceaseless</del>
 </add>
 </rdg>
 </app>
mightiest,</seg>
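The <app> rules in this section (two to five <rdg>s per <app>, a single-digit varSeq on every <rdg>, and no <app> inside another <app>) can also be verified by script. What follows is a minimal sketch rather than a project tool: the function name check_apps is hypothetical, and it assumes the markup is passed in as one well-formed string.

```python
import xml.etree.ElementTree as ET


def check_apps(xml_text):
    """Check <app> elements against the rules stated in this section."""
    root = ET.fromstring(xml_text)
    problems = []
    for index, app in enumerate(root.iter("app"), start=1):
        readings = app.findall("rdg")  # direct <rdg> children only
        if not 2 <= len(readings) <= 5:
            problems.append(
                f"app {index}: expected 2 to 5 <rdg>, found {len(readings)}"
            )
        for rdg in readings:
            var_seq = rdg.get("varSeq", "")
            if not (len(var_seq) == 1 and var_seq in "0123456789"):
                problems.append(
                    f"app {index}: varSeq {var_seq!r} is not a single digit"
                )
        # iter() yields the element itself first, so any other hit is nested.
        if any(inner is not app for inner in app.iter("app")):
            problems.append(f"app {index}: contains a nested <app>")
    return problems


sample = (
    '<seg>No more <app>'
    '<rdg varSeq="1"><del type="overstrike">to</del> </rdg>'
    '<rdg varSeq="2"><add type="unmarked" place="supralinear">for</add> </rdg>'
    '</app>him</seg>'
)
print(check_apps(sample))
# prints []
```

Because attribute names are case-sensitive, a <rdg> tagged with varseq instead of varSeq would be reported as lacking a single-digit value.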

3.5 Illegible or Missing Text

Often while encoding, we find words or marks that we cannot decipher, or we postulate readings that we do not feel completely confident about. Since we have decided to encode all of Whitman's text, no matter how indecipherable, we have tags to help us record illegible or missing text.

<gap>: for completely unreadable text

<supplied>: for text that is currently unreadable, but that has been supplied by another source

<unclear>: for text illegible enough to render your transcription questionable

<gap>: This element is used when text is absolutely unreadable, when, for example, it has been torn or cut away, obscured by deletion, or is simply illegible. Each <gap> needs a reason attribute, and you have the choice of three values: "cut away," "deletion, illegible," or "illegible." Note: gap is an empty element (i.e., does not require a close tag).

<gap reason="cut away"/>: When a page has been torn or cut, leaving only tantalizing stubs of the letters you want to transcribe, as in this example, use this tag at the point in the transcription where the words would appear.

<gap reason="deletion, illegible"/>: When deleted words are illegible (typically because of Whitman's overstrike), insert a <gap> tag in place of the unreadable words.

<gap reason="illegible"/>: When characters on the page are not deleted, but are simply impossible to make sense of, as in this example, where the characters preceding the question marks cannot be resolved, use the "illegible" value for the reason attribute.

<supplied>: Sometimes a secondary source can supply a reliable transcription of text that is at present illegible, as for example when a transcription was done by an institution or editor before damage occurred. The supplied element is used in such situations. Enclose that part of the text that has been supplied in the supplied element, and use the reason attribute values listed for <gap> above to state the cause of the loss of text. Also insert a source attribute with a value that notes your source for the supplied text. For example, this excerpt is from our transcription of "Ashes of Roses":


Are we to have a National Hy
<supplied reason="cut away" source="Library of Congress transcription">
mn by
 <orig reg="Centennial">Cen-</orig>
</supplied>
tennial time?

<unclear>: When you believe you have an accurate reading of a difficult-to-read passage, but you are not completely confident, mark the questionable reading with the unclear element. Use the reason attribute to state the cause of the uncertainty in transcription, selecting from the values described above under <gap>. Use the cert (certainty) attribute to indicate the degree of confidence in the transcription. Its value will be a numeric percentage (e.g., "95%"). Also include a resp (responsibility) attribute to indicate your responsibility for the postulated reading, and as its value use your initials.

For example, if Andy Jewell is encoding a manuscript with an unclear deleted word that he thinks might be "herbage," he inserts this markup:
<unclear reason="deletion, illegible" cert="70%" resp="awj">herbage</unclear>

NOTE: When the value of a "resp" attribute indicates a hand other than Whitman's, a note must be included in the <profileDesc> within the Header.

3.6 Hyphenation and Non-Standard Spelling

Segments with end-hyphenation

Because we wish both to record the lineation of the copy text and to enable searches for words that are broken by end-line hyphenation, we use the <orig> tag with the reg attribute to record the original and regularized readings. Tag such instances in the following way:
<l>
<seg> . . . and the dying <orig reg="emerging">emerg-</orig></seg>
<seg>ing from gates,</seg>
</l>

Non-Standard Spelling

The sic element is used to represent a mistake by the author. The required attribute corr provides a correction. These corrections will enable searches to use standardized spelling and not require the searcher to know, for example, that Whitman misspelled "Buildings" as "Buldings" in this manuscript. This word should be marked up in this way:
<sic corr="Buildings">Buldings</sic>

Sometimes what you might think of as a spelling error would more accurately be termed an alternate spelling. For words that are spelled in idiosyncratic—though not exactly incorrect—ways, use the orig tag and its required reg attribute. This element and attribute pair works in the same way as sic and corr; that is, Whitman's spelling is transcribed and the standardized spelling is recorded as the value of the reg attribute. As an example, look at Whitman's spelling of "Shakespeare" in this image. Since this spelling of Shakespeare's name is one he himself used (and he never, as far as we know, used "Shakespeare"), it should be encoded as follows:
<orig reg="Shakespeare">Shakspere</orig>
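To see why recording corr and reg values helps searching, consider how a regularized reading could be derived from the markup. The sketch below is an assumption-laden illustration, not the site's actual search code: the function name regularized_text is hypothetical, the sample segment combines the two words discussed above into an invented phrase, and end-line hyphenation (where the rest of the word sits in the next <seg>) is not handled.

```python
import xml.etree.ElementTree as ET


def regularized_text(xml_text):
    """Return the text with <sic> and <orig> content replaced by corr/reg."""
    root = ET.fromstring(xml_text)

    def render(element):
        if element.tag == "sic" and element.get("corr") is not None:
            body = element.get("corr")
        elif element.tag == "orig" and element.get("reg") is not None:
            body = element.get("reg")
        else:
            # Keep the element's own text and recurse into its children.
            body = (element.text or "") + "".join(
                render(child) for child in element
            )
        # The tail is the text that follows the element's closing tag.
        return body + (element.tail or "")

    return render(root)


# Hypothetical segment built from the two examples in this section.
sample = (
    '<seg>the <sic corr="Buildings">Buldings</sic> of '
    '<orig reg="Shakespeare">Shakspere</orig></seg>'
)
print(regularized_text(sample))
# prints the Buildings of Shakespeare
```

The transcription itself still records Whitman's spellings; only the derived search text uses the standardized forms.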

3.7 Unusual Characters and Marks

XML supports only the ASCII character set, which roughly corresponds with the set of characters on a standard keyboard. Not all of the characters you might encounter in a Whitman manuscript are part of the ASCII character set, so to represent one of these unsupported characters you will need to use the appropriate Unicode number—a string of numerals that begins with an ampersand and pound sign (&#) and ends with a semicolon (;).

The table below lists the Unicode numbers we are using on the project. It is important to use the numbers for the listed characters, even when it might be possible to key them in (as with the ampersand, for example) or to use a close approximation (e.g., two hyphens to represent an em-dash). For characters not listed, Unicode numbers are NOT necessary.

For the characters in the left-hand column to display correctly, you must have a Unicode font installed on your computer. If you see boxes for some or all of the entries there, you can try downloading and using Bitstream Cyberbase.

Character | Function in Whitman | Unicode Number

= | Proofreader's mark for hyphen. WW sometimes uses "=" for compound words ("down=balls") and words split between two lines ("some=thing"). PLEASE NOTE that &#8209; is used only when Whitman uses "="; if he uses the standard hyphen ("-"), just key it in. | &#8209;

— | Longer dash, e.g., "Not these—O none of these more". PLEASE NOTE that there should be no spaces before or after the dash, regardless of how the spacing appears on the page. | &#8212;

& | Indicates "and" | &#38;

* | An asterisk | &#42;

© | Copyright symbol | &#169;

✓ | Checkmark | &#10003;

½ | Used often in Bowers's system of page numbering | &#189;

¾ | Used to indicate the fraction, occasionally on manuscripts | &#190;

¶ | Indicates beginning of new paragraph or a new line of poetry | &#182;

ñ | Spanish-language character, n with tilde | &#241;

ó | An "o" with an acute accent mark (to capitalize, change to &#211;) | &#243;

é | An "e" with an acute accent mark (to capitalize, change to &#201;) | &#233;

è | An "e" with a grave accent mark (to capitalize, change to &#200;) | &#232;

☞ | A right-pointing finger | &#9758;

☜ | A left-pointing finger | &#9756;

☝ | An up-pointing finger | &#9757;

☟ | A down-pointing finger | &#9759;
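If you are unsure of the number for one of the listed characters, it can be computed from the character itself. The short Python sketch below assumes that the decimal form of the reference is wanted; the function name to_numeric_reference is hypothetical.

```python
def to_numeric_reference(character):
    """Return the decimal numeric character reference for one character."""
    if len(character) != 1:
        raise ValueError("expected exactly one character")
    # ord() gives the Unicode code point as a decimal integer.
    return f"&#{ord(character)};"


# Em dash, ampersand, paragraph mark, and n with tilde.
for character in ["\u2014", "&", "\u00b6", "\u00f1"]:
    print(f"U+{ord(character):04X}", to_numeric_reference(character))
```

The loop prints each code point in hexadecimal alongside the reference to key in, so the output stays readable even on a system without a Unicode font.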

3.8 Graphically Distinctive Text

Underlined words: Underlined words require the "rend" attribute with the "underline" value. The "rend" attribute is global (can be used on any element), but typically you will use it with a <head> element, the <signed> element, or, if the underlined words are in the middle of a line, the <hi> element. However, if an entire line or linegroup is underlined, the attribute can be used on <l> or <lg>. For example, in this manuscript the underlined words in the first line would be encoded like this:
<hi rend="underline">the necessity of</hi>

Dotted-underlined words: Occasionally, you may encounter a manuscript with a word that has been deleted and underlined with a series of dots. This is a printer's mark for "I don't want to delete this word after all; please leave it in," or, "stet." To handle these instances, which are rare, we surround the dotted-underlined word and the <del> with the <restore> element and use the "rend" attribute. For example, in this manuscript the dotted-underlined words are encoded like this:
<app>
<rdg varSeq="1">
<del type="overstrike">baleful</del>
</rdg>
<rdg varSeq="2">
<add place="supralinear" type="unmarked">
<restore rend="dotted"><del type="overstrike">mortal</del></restore>
</add>
</rdg>
</app> coals,

Line indentation: Though in most cases Whitman begins lines at the left edge of the writing space, he sometimes uses line indentation in distinctive ways, as in this copy of "O Captain! My Captain!" To encode this indentation, we add the "rend" attribute to the <l> element. The value of "rend" is "indented" plus a number that indicates the relative length of the indentation. For the shortest indentation, we use "indented1"; for the longest, "indented4."

Reference Chart for Use of the "Rend" Attribute

Value of 'rend' attribute | Function in Whitman

underline | indicates underscored text

circled | used when text, typically within a note in the margin, is surrounded by a circular line in order to separate it from other text

bracketed | used in <title> within the <titleStmt> to distinguish derived titles

italic | used only in transcriptions of printed material or in project notes to mark titles of books

dotted | used with <restore>

indented1, indented2, indented3, indented4 | added to <l> when Whitman uses staggered indentation at the beginnings of lines; the numbers indicate relative amount of indentation (1=shortest, 4=longest)

horbar-full, horbar-short-right, horbar-short-left, horbar-short-center | used in <milestone> to indicate various positions and lengths of horizontal separators

3.9 Signatures and Dates

Bylines
Bylines that immediately follow the title should be encoded using the byline element. Insert it after the head element and before the first <l>. The byline shown here is encoded as follows:

. . .
<head type="main-authorial" rend="underline">Up, lurid stars!</head>
<byline><hi rend="underline">By Walt Whitman</hi></byline>
<l>Up, lurid stars! martial constellation!</l>
. . .

Signatures at the Bottom
A signature that comes after the last line should be marked with the signed element. In order to properly unite the signature with the poem being signed, <signed> is included within <closer>. The <closer> ought to close before <lg1> closes.

The following example is based on this manuscript.
. . .
<seg><add type="insertion" place="supralinear">scented</add> roses blooming.</seg>
</l>
<closer>
<signed rend="underline">Walt Whitman</signed>
</closer>
</lg1>
</body>
</text>
</TEI.2>

Dated Manuscripts
A date which Whitman has written on a manuscript in order to note the composition date or occasion date (as when he writes on Washington's birthday or on the death of General Sheridan) is encoded within the <dateline> tag. <dateline> can occur wherever Whitman has written the date, typically either after the <head> or within the <closer> after the signature.

Each time you use a <dateline> tag, you also need to use the <date> tag with the value attribute. The <date> element should contain only the date. The value of value is the normalized date, put into this form: YYYY-MM-DD. If, as in the example below, you need to encode a date range, the value attribute will have both dates, separated by a slash.

The following example is based on this manuscript.
. . .
<lg1 type="poem">
<head type="main-authorial" rend="underline">The Sobbing of the Bells—</head>
<dateline>(Midnight <date value="1881-09-19/1881-09-20">Sept: 19-20 1881</date>)—</dateline>
<milestone unit="undeclared" rend="horbar"/>
<l><seg>The sobbing of the bells, the sudden death‑news</seg>
<seg>everywhere</seg></l>
. . .
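The normalized form required for the value attribute is easy to get wrong by hand, so a quick check can help. This is a minimal sketch under stated assumptions: the function names are hypothetical, and the requirement that a range run from the earlier date to the later one is our assumption rather than a rule stated above.

```python
import re
from datetime import datetime

# Exactly four digits, two digits, two digits, joined by hyphens.
DATE_SHAPE = re.compile(r"\d{4}-\d{2}-\d{2}")


def parse_day(text):
    """Parse one YYYY-MM-DD string, raising ValueError if it is malformed."""
    if not DATE_SHAPE.fullmatch(text):
        raise ValueError(f"not in YYYY-MM-DD form: {text!r}")
    return datetime.strptime(text, "%Y-%m-%d").date()


def is_valid_date_value(value):
    """Return True for one date, or two dates separated by a slash."""
    parts = value.split("/")
    if len(parts) not in (1, 2):
        return False
    try:
        days = [parse_day(part) for part in parts]
    except ValueError:
        return False
    # Assumption: a range should not end before it begins.
    return len(days) == 1 or days[0] <= days[1]


print(is_valid_date_value("1881-09-19/1881-09-20"))
# prints True
print(is_valid_date_value("Sept: 19-20 1881"))
# prints False
```

The second call fails because it passes Whitman's own wording, which belongs inside the <date> element, instead of the normalized value.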

When is a date a <dateline> and when is it a <note>?: We use the <dateline> element to note a date that is to be published with the poem and is part of the poem's meaning. As mentioned above, Whitman most often inserted these sorts of dates under the head or under his signature. On other manuscripts, Whitman has written a date that is not a <dateline> but instead is to be treated as a <note> on the page. Most of the time, these <note>s will be distinguished by their placement on the manuscript page (in a margin, a corner, or otherwise beyond the layout of the poem proper) and by their ambiguous relationship to the lines of poetry.

3.10 Other Writing in Whitman's Hand

Notes

Reverse-Side Notes

Unrelated Reverse-Side Writing

Miscellaneous Writing

Horizontal Lines

Brackets

Page Numbers

Section Numbers

Intentional Inline Spaces

Notes

You will sometimes come across writing on the manuscript page that is not part of the text of the poetry manuscript proper, but instead a note of some sort about it. For example, this note follows a poem, at the bottom of the page. This sort of material is encoded using the <note> element, which takes two required attributes: type and place. To distinguish the writing as Whitman's, use the value "authorial" for the type attribute. For the place attribute choose from the following values: "margintop," "marginbot," "marginleft," "marginright," "inline," "supralinear," or "interlinear."

The example should be marked up as follows:

<note type="authorial" place="marginbot">
 <app>
 <rdg varSeq="1">
 <del type="overstrike">sent to</del>
 </rdg>
 <rdg varSeq="2">
 <add type="unmarked" place="supralinear">pub in</add>
 </rdg>
 </app>
Herald early in Feb. '88
</note>

Sometimes, Whitman will visually separate his notes from the rest of the text by drawing a boundary line, as in this example. When this happens, you need to add the rend="circled" attribute and value to <note>. (The value circled is used even though Whitman's boundary line often does not make a proper geometric "circle"). The encoding for the example would read:
<note type="authorial" place="margintop" rend="circled">follow copy strictly</note>

Note that <p> is only used within <note> when there are multiple paragraphs within the note.

Reverse-Side Notes

Occasionally, Whitman will write a prose note about the poem on the reverse-side of the manuscript leaf, such as a note to the printer or a comment on the poem's placement in a larger work. These notes, though they are on the reverse side, are encoded basically the same way as the notes described above are encoded, with a few minor adjustments:

The <note> tag is inserted after the <pb> tag that identifies the verso (i.e., the one with "id='leaf01v'")

The value of the place attribute is "inline"

Unrelated Reverse-Side Writing

If reverse-side writing is in Whitman's hand, we encode it, regardless of its content. You will need to create a new <head> and, therefore, a second <title> in the <titleStmt> to deal with the separate intellectual unit on the reverse-side. It should be encoded to the same level as any other Whitman manuscript.

Miscellaneous Writing

The note element is also used to mark other writing on the page that, while not strictly a note, is not part of the text. Examples include: page numbers, addition or subtraction problems, and question marks.

Whitman's use of proofreading/typesetting marks is a special case. To encode a manuscript that has such marks, you should first decide whether the marks are being used to indicate a) changes to the base text, or b) emphasis of a typographic feature.

It is our current policy to encode only marks of the first type. Examples include the caret ( _^ ) to indicate an addition, and the paragraph mark (¶) to indicate a new paragraph (in prose) or line (in verse). For instructions on encoding the caret, see 3.3 Additions. The paragraph mark should be encoded as a named character entity—¶.

At present we have chosen not to encode marks of the second type, though we may in a future stage of the project return and add representations of them to our markup. Examples of marks that you should not encode include triple-underlining to emphasize capitalization and horizontal curved brackets used to indicate lack of spacing between parts of hyphenated words.

Horizontal Lines

Whitman fairly often draws a line to signal the beginning or the end of a unit of text. These lines range from full page width to small bars at the left, right, or in the center. You can see an example of a small center line at the bottom of this manuscript and an example of a small line at the left at the top of this one. We take these lines to indicate some kind of division, though we make no claims about the sort of unit(s) they define. They are encoded using the empty milestone element, with "undeclared" as the value of the unit attribute and the following as possible values of the rend attribute:

horbar-full

horbar-short-right

horbar-short-left

horbar-short-center

The first example above should be encoded as follows:

<seg>We never separate again.—</seg></l>
<milestone unit="undeclared" rend="horbar-short-center"/>
</lg1>

Brackets

Whitman sometimes used brackets to group lines or other bits of text, as in this example. These should be indicated by using the span element. Like <addSpan> and <delSpan>, <span> is an empty element (i.e., consists of a single open tag), but it works in a slightly different way. In addition to the to attribute, from and value are also required. Assign the span an id (use the formula s+number) and give the from attribute the same value. (This, in essence, means "start from here.") The value of the value attribute will always be "bracketed." And, as with addSpan and delSpan, you must always mark the end of the bracketed section with an empty anchor element bearing an id attribute whose value corresponds with the value of the <span>'s to attribute. This is the markup for the example above:

<span id="s1" from="s1" to="s2" value="bracketed"/>
<l>I rate myself high—I receive no small sums;</l>
<l>I must have my full price—whoever enjoys me.</l>
<anchor id="s2"/>

Page Numbers

Whitman's use of page numbers—combined with the history of manuscript dispersal—means we are left with both ambiguous and reliable page numbering on Whitman manuscripts. By "ambiguous," we mean manuscripts with a number, like "43," written in the top corner but no corresponding "42" or "44"; by "reliable," we mean multiple-leaf manuscripts with an ordered numbering of each leaf (ordered numbering does not mean an uninterrupted sequence that begins with "1"; instead, it means any discernable numbering system that reliably determines the leaf order).

We handle these two types of page numbers in different ways. For ambiguous numbers, we use <note>. For reliable page numbering, we add an attribute to the Page Break, or <pb> element. Specifically, we add an "n" attribute with a value that corresponds to the number written on the page. So, if a three-leaf manuscript is numbered "2," "3," "4," then the <pb>s would have n="2", n="3", and n="4".
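
As a sketch of the reliable-numbering case just described, the three <pb> tags might look like the following. The corresp values are hypothetical placeholders, and the sketch assumes that only the rectos of the three leaves carry text:

```xml
<!-- corresp values are hypothetical placeholders -->
<pb corresp="loc.00000.001" id="leaf01r" type="recto" n="2"/>
<!-- transcription of the leaf numbered "2" -->
<pb corresp="loc.00000.002" id="leaf02r" type="recto" n="3"/>
<!-- transcription of the leaf numbered "3" -->
<pb corresp="loc.00000.003" id="leaf03r" type="recto" n="4"/>
<!-- transcription of the leaf numbered "4" -->
```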

Section Numbers

Sometimes you will encounter a manuscript with distinctly numbered sections, as in this example. These sections are different from linegroups, as they are typically numbered or otherwise clearly marked, and they often contain multiple linegroups. To handle section numbers, add a <head> tag immediately after the <lg> tag to note the "head" of that section. Here's how you would encode the example manuscript:

...
<lg1 type="poem">
<head ... >
<lg2 type="section">
<head type="main-authorial">1</head>
<lg3 type="linegroup">
<l>Come, said the Muse,</l>
....
</lg3>
<lg3 type="linegroup">
...
</lg3>
<lg3 type="linegroup">
</lg3>
</lg2>
<lg2 type="section">
<head type="main-authorial">2</head>
...
</lg2>
...
</lg1>

Intentional Inline Spaces

Whitman will occasionally leave a blank space within a line of poetry, apparently making room for the perfect word that he has yet to discover. To encode these spaces, insert a <space> element with two attributes, dim (dimension), the value of which will almost always be "horizontal"; and extent, the value of which is expressed as a number of letters, determined by the size of the letters surrounding the space on the manuscript. The encoding for this manuscript would look like this:

action, <space dim="horizontal" extent="9 letters"/>in husky

In other cases, Whitman will leave a blank line that indicates the intentional blank space (apparently in addition to his major poetic innovations, Whitman also developed the Mad Lib). To encode this phenomenon, use the same strategy as above, but add a "rend" attribute with the value "underline." Therefore, this example would be tagged like this:
. . . .
<seg>As in Visions of <space dim="horizontal" extent="7 letters" rend="underline"/> at</seg>
<seg>night—</seg>

3.11 Writing in Others' Hands

Writing on the Front of the Leaf

Writing on the Back of the Leaf

Other Cases

Writing on the Front of the Leaf

Use the note element to mark letters, words, etc. that someone other than Whitman has written on the manuscript page you are encoding. As explained in section 3.9 above, this element requires both type and place attributes. For the value of type use "editorial." The possible values of the place attribute are the same as for other notes: "margintop," "marginbot," "marginleft," "marginright," or "interlinear." In addition, non-authorial notes should be given a resp attribute, with a two- or three-letter code to identify the responsible hand. Hands to which codes have been assigned are listed in the following table. Should you encounter a hand not accounted for, please email Brett Barney and Andy Jewell.

Name	Primary location of mss	Code
Bowers, Fredson	UVa	fb
Traubel, Horace	Library of Congress, Feinberg Collection	ht
unknown	n/a	unk

This note, written by Traubel in the bottom right corner of a poem manuscript, is encoded as follows:

<note type="editorial" resp="ht" place="marginright">
For Francis Howard Williams May 1896 Traubel
</note>

Resp Values

Any time the resp attribute is used (whether on <note> or on <unclear>, etc.) in the body of a document, you must also include in the header some information that explains each of the values you've used. To do this, add a profileDesc (profile description) element directly before the <revisionDesc> (revision description). Inside the <profileDesc>, create a pair of handList tags and one hand element for each different value given to a resp attribute in the file. hand has two required attributes: scribe and id. The value of scribe is the full name of the person being identified as responsible for the written note. The value of id corresponds with the value assigned to the resp attribute in the body. The following excerpt is taken from the file that contains the note discussed above:

<profileDesc>
<handList>
<hand scribe="Horace Traubel" id="ht"/>
</handList>
</profileDesc>
<revisionDesc>

Writing on the Back of the Leaf

Most cases of reverse-side writing in others' hands arise from Whitman's re-use of paper. Common examples include envelopes, fan letters, and government stationery. For such situations, we use the note element with the value "project" on the type attribute to describe (not transcribe) the non-authorial writing. Unlike "authorial" and "editorial" <note>s, which should be transcribed as closely as possible to their position in the manuscript, these notes must be placed within the <teiHeader>. To do this,

insert a <notesStmt> immediately before the opening source description tag (<sourceDesc>);

inside the <notesStmt> create a <note> with no place attribute, the value "project" on the type attribute, and the additional attribute target;

find the page break that indicates the beginning of the reverse side;

copy the value of the page break's id attribute as the target attribute on the project note;

write a short description of the reverse-side writing

Note: You need not use <p>s to enclose these project notes, unless you want to write more than one paragraph of description. Here's how the envelope example above should be encoded:

<teiHeader>
<fileDesc>
. . .
<notesStmt>
<note type="project" target="leaf01v">Verso of manuscript leaf is addressed to Walt Whitman, Camden, New Jersey, postmarked September 25, 1890.</note>
</notesStmt>
<sourceDesc>
<bibl>
<author>Walt Whitman</author>
. . .
</teiHeader>
. . .
<pb corresp="loc.00046.001" id="leaf01v" type="verso"/>
. . .

Other Cases

Materials which accompany manuscripts (notes, transcriptions)

For now, we have decided not to transcribe or encode any of the items not written by Whitman—for example, notes about the text or transcriptions—that are sometimes stored with manuscripts. Eventually, these may be encoded as separate documents and linked to the relevant manuscript, but this work will happen at a later time.

Pasted clippings

See section 3.12 below for a discussion of how to encode clippings that Whitman has incorporated into his own manuscripts.

3.12 Cutting and Pasting
We distinguish among three sorts of pasting in Whitman's manuscripts. Please look over the description of each and decide which describes the instance you're encoding.

One page pastes over another, deleting old material and adding new. (See 3.4, "Additions and Deletions in Combination.")

Paper has been pasted together to provide more writing space. (See below.)

Whitman pastes a clipping onto his manuscript. (See below.)

Pasting that Extends the Writing Area

To represent the seam of the two pages that have been joined, use the (empty) milestone element. Include the unit attribute with the value "glued." This example shows a manuscript that calls for such markup.
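
A minimal sketch of the seam markup, assuming the join falls between two lines of verse. The line contents are placeholders, and no attribute beyond unit is shown because none is specified above:

```xml
<l>[last line written on the first piece of paper]</l>
<milestone unit="glued"/>
<l>[first line written on the attached piece of paper]</l>
```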

Clippings Pasted to the Manuscript

Sometimes Whitman pastes others' material to his manuscripts. For example, in "Ashes of Roses," Whitman has pasted a newspaper clipping in the lower left-hand corner of the first leaf. In these cases, use the <add> element and its available "type" attribute (if a manuscript's peculiarity requires it, as when pasted-on material begins in the middle of an <l> and crosses <l> boundaries, you may also use <addSpan>). In this particular case, the author of the newspaper clipping is "unknown."

<add place="marginbot" hand="unk" type="pasteon">
Are we to have a National Hy<supplied source="Library of Congress transcription">
mn by <orig reg="Centennial">Cen-</orig></supplied> tennial time?
</add>

Please remember that hands other than Whitman's must be declared in the <profileDesc>.

3.13 Page Breaks

Page Breaks (<pb>) are inserted in the encoding whenever you begin the transcription of a new page (including the first one). You use <pb> tags in every document, even if it is only one page long. <pb> is an empty tag, which means that you never need to "close" <pb>, but just insert a "/" at the end of the tag.

Each <pb> tag has three required attributes, "corresp," "id," and "type". The "corresp" attribute indicates the file that contains the page image, so you'll need to assign, as its value, the unique id with a three-digit suffix that indicates the page number (.001, .002, etc)—the image files will be given this name before they are mounted on our site. The "id" attribute identifies the page by "leaf" (or piece of paper) number and side—"r" for "recto" (front) or "v" for "verso" (back). The "type" attribute classifies the page as either "recto" or "verso." The "id" value must always end in either "r" or "v"—even if there is only one image. When there is only one image, the "id" value will almost always be "leaf01r."

Here is an example of what the <pb> tag looks like:
<pb corresp="loc.00008.001" id="leaf01r" type="recto"/>

The first <pb> tag goes after the <body> tag and before the first <div> or <lg>. If there are multiple pages, i.e., more than one corresponding image, simply insert a <pb> at each place in the encoding that corresponds to the beginning of a new page. Often, these will occur at the close of one linegroup (</lg1>) and before the opening of another (<lg1>). Or, commonly, you will need to include a <pb> to indicate untranscribed verso material; this should be done after the <lg> or <div> closes but before the <body> tag closes.

How to Handle Unusual Document Order: In some instances, Whitman has written a single poem on the rectos of several leaves that also have poetic lines on the versos that are not part of the same poem. In this case, you must encode in a way that preserves the intellectual unity of the poem on the rectos. To do that, you will have to break the typical order of <pb> "id" values. That is, instead of "leaf01r," then "leaf01v," "leaf02r," "leaf02v," etc., encode the pages in an order that preserves the integrity of each poem. For example, if you have a manuscript with a poem written across the rectos of three leaves and other poetic lines written on the versos of leaves 1 and 3, the <pb> will have id attributes ordered like this: leaf01r, leaf02r, leaf03r, leaf01v, leaf03v. It is done this way to ensure that the material on the rectos of leaves 1-3 is all contained within the same <lg1 type="poem">.
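
The three-leaf case described above can be sketched roughly as follows. The corresp values are hypothetical placeholders (the sketch assumes the images were numbered in physical order), and how the verso lines are wrapped is left open:

```xml
<body>
<pb corresp="loc.00000.001" id="leaf01r" type="recto"/>
<lg1 type="poem">
<!-- lines from the recto of leaf 1 -->
<pb corresp="loc.00000.003" id="leaf02r" type="recto"/>
<!-- lines from the recto of leaf 2 -->
<pb corresp="loc.00000.004" id="leaf03r" type="recto"/>
<!-- lines from the recto of leaf 3 -->
</lg1>
<pb corresp="loc.00000.002" id="leaf01v" type="verso"/>
<!-- separately encoded lines from the verso of leaf 1 -->
<pb corresp="loc.00000.005" id="leaf03v" type="verso"/>
<!-- separately encoded lines from the verso of leaf 3 -->
</body>
```

Keeping all three rectos inside one <lg1> is the point of the reordering; the verso material follows only after that linegroup has closed.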

Remember: For every <pb> you insert, you need to insert an Entity Declaration in the header.
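
For instance, the <pb> shown earlier in this section (corresp="loc.00008.001") would presumably be paired with a declaration like the one below. This sketch follows the pattern of the declaration in the sample file in section 3.19 and assumes the image is a JPEG named after the corresp value:

```xml
<!ENTITY loc.00008.001 SYSTEM "loc.00008.001.jpg" NDATA jpeg>
```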

3.14 Encoding Corrected Proofs

Many manuscripts in various collections combine printed text and handwritten correction, as in this example. We have developed an encoding procedure for these manuscripts that makes distinctions between the two types of texts (printed and handwritten).

All "mixed media" mss of the kind described are assigned "prepub-proof" or "postpub-proof" as the value of the type attribute in the <text> element. Pre-publication proofs are the most common: typically, they are detached sheets of paper with a typescript rendering of a poem. Post-publication proofs are what we are calling Whitman's hand-revised copies of published works. So, for example, if a copy of the 1876 edition of Leaves served, after being annotated, as the printer's copy for the 1881-1882 edition, we would call that a post-pub proof. Note that we are using the word "proof," in a way that is broader than is usual in the publishing world, to describe the proof-like functioning of a document.

The base text for these manuscripts is assumed to be printed, so we explicitly declare the medium of only the handwritten <add>s, <del>s, and <note>s.

We use the hand attribute on <add>, <del>, and <note> to record handwritten bits.

The handwritten "hand" is declared in the <teiHeader>'s <profileDesc> in the same way that we're already doing it for non-Whitman writing on mss. (E.g., when we have <note type="editorial" resp="ht" place="marginright"> for one of Traubel's ms. annotations, the <profileDesc> has <hand scribe="Horace Traubel" id="ht">). But instead of putting the elaborated description in the scribe attribute, we put it in an "ink" attribute.

Consonant with our practice of letting pass distinctions between colors of ink or between ink and pencil, all handwritten bits will share the same value for the hand attribute.

Whitman often changes the proof inline and also adds a marginal note, as when he adds a comma inline and puts "<," in the margin. In these cases, do not double-encode his corrections. Marking the addition inline is sufficient.

When Whitman notes the insertion of space with a "#", use <add> with the <space> element, noting whether or not it is vertical or horizontal space (in proofs it will most often be vertical), and noting the approximate size of the space, using "lines" as the measuring unit (relative to surrounding line-heights). An example would be:
<space dim="vertical" extent="1 line"/>

When Whitman uses a curled line to correct inverted letters or words, use the <transpose> element to note his re-ordering. For example, if you see the printed word "baerd" marked for correction to "beard" in a corrected proof, you would encode it this way:
b<anchor id="t1"/>a<transpose hand="h1" anchored="yes" target="t1">e</transpose>rd

Whitman will often use specialized proofreading marks to note common changes, as when he uses a triple-underline to note his desire to capitalize an uncapitalized word. In these cases, you encode using an <app> structure and "edit" as the value of the type attribute on <add> and <del>. For example, if you saw a lowercase "l" marked with a triple-underline, you would encode it like this:

<app><rdg varSeq="1"><del type="edit">l</del></rdg><rdg varSeq="2"><add type="edit">L</add></rdg></app>

We use a controlled vocabulary for both the value of the hand attribute on <add>, <del>, and <note>, and for the ink attribute on <hand>: ink="handwritten"; hand="h1". An example follows.

<profileDesc>
<handList>
<hand id="h1" ink="handwritten"/>
</handList>
</profileDesc>
. . .
<text type="prepub-proof">
<body>
. . .
<add type="insertion" place="supralinear" hand="h1"> . . .</add>
. . .
</body>
. . .

3.15 Encoding Prose

Even though we are focusing on Whitman's poetry, the manuscripts will sometimes contain prose that you will need to represent. You might encounter prose in Whitman's poetry manuscripts in a few different ways:

Prose notes about the verse on the same leaf:
Sometimes Whitman will have lines of prose on the same leaf that he has used for poetic composition. This is usually deemed a mixed genre manuscript and requires the "poem notes" value for the <div1> type attribute. Occasionally, though, the prose will be a note about the poem. In that instance, consult the section in the guidelines about authorial notes.

Prose with imagery or language that was later incorporated into a poem, or prose notes about an idea for a poem:
For prose writings that relate to poetic work, you will simply encode the manuscript as a prose only document.

Prose unrelated to the poetic lines on the leaf:
Prose unrelated to the verse on the leaf, which almost always is on the verso of the manuscript, is discussed in the Unrelated Reverse-Side Writing section of the guidelines.

Transcribed prose is always encoded in the same way: the text is surrounded by <p> tags and all line breaks and line segments are ignored.
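
A minimal sketch of this rule, assuming the enclosing tags are paragraph (<p>) tags. The sentence inside is a placeholder rather than a transcription:

```xml
<!-- placeholder text; manuscript line endings are not recorded -->
<p>[A prose passage keyed as continuous running text, however many lines it occupies on the manuscript page.]</p>
```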

3.16 Encoding Lists

Whitman sometimes made lists of words or phrases that led, in one way or another, to his poems; for example, this list has been traced to Whitman's poem "When Lilacs Last in the Dooryard Bloom'd."

To encode these lists, we use the <list> element, which contains the <item> element. The structure is pretty straightforward: at the beginning of the list, open the <list> element; then, if appropriate, insert a <head> tag; finally, encode each item on the list with an <item> tag, which is nested within <list>. The following sample encoding is based upon the manuscript example above:
. . .
<list>
<item>sorrow (saxon)</item>
<item>grieve</item>
<item>sad</item>
<item>mourn (sax)</item>
. . .
</list>

3.17 Manuscripts That Are Neither Poetry Nor Prose

Encoding Lines that are Neither Verse nor Prose

Encoding Title Pages

Encoding Lines that are Neither Verse nor Prose

Occasionally, you will run across a manuscript with lines that appear to have all the layout qualities of verse (hanging indentation, initial capitalization, etc.) but otherwise seem like prose notes about an idea for a poem (for an example, click here). Rather than imprecisely calling such manuscripts either "prose" or "verse," we have developed a method that acknowledges the indeterminacy of the genre. Specifically, we use the "anonymous block," or <ab> element in place of either <l> or <p> and mark line segments. Also, the <div1> type is the same we use for "mixed genre" manuscripts, "poem notes." Here's an example of how the encoding would look (click here to look at the manuscript):

<!-- markup is simplified -->
<text type="manuscript">
<body>
<div1 type="poem notes">
<head type="main-authorial">incidents for (Soldier in the Ranks)</head>
<milestone unit="undeclared" rend="horbar"/>
<ab>
<seg>describe a group of men coming off the</seg>
<seg>field after a heavy battle, the grime,</seg>
[. . .]
</ab>

**IMPORTANT: We don't want to use this tagging indiscriminately; it should only be used when the editors cannot decide if material is prose or poetry. Therefore, if you come across a manuscript that you think fits the description above, please consult with either Ken or Ed before you use the <ab> markup.

Encoding Title Pages

If you encounter a manuscript that has only titles on it, like this one, please use this variation on the <ab> tagging described above (note the different <div1> type and attributes for the <ab> elements):
<!-- markup is simplified -->
<text type="manuscript">
<body>
<div1 type="title notes">
<ab type="title" rend="underline">American to Old-World Bards</ab>
<ab type="subtitle" rend="underline">A reminiscence from reading Walter Scott</ab>
</div1>

**NOTE: Only use "rend='underline'" if the title is indeed underlined as it is in this example. Otherwise, omit that attribute.

3.18 Enigmas

What not to Encode

Things not Covered by the Guidelines

Dealing with Uncertainty

What not to Encode

Although we are attempting to accurately encode all of the important aspects of Whitman's texts, absolute comprehensiveness is not our goal. Specifically, we have decided, for now at least, to ignore

ink blots, smudges, and stray pen marks;

pin holes;

embossing;

variations in ink, pencil, or paper;

distinctions between single and multiple overstrikes

Things not Covered by the Guidelines

The encoding practices articulated in these guidelines have evolved over the last several years, and that evolution has mainly been driven by the needs of encoders, editors, programmers, and consultants as we have worked to create and deliver electronic transcriptions of individual manuscripts. When we first began, every manuscript presented a host of new challenges that had to be addressed before the encoding could be completed. Naturally, the pace of change has slowed as the number of encoded manuscripts has grown. Even so, new puzzles continue to present themselves occasionally, and you should not be unduly disturbed if you find yourself in an encoding dilemma for which the guidelines seem to give no guidance. In such cases, you should write or call Brett Barney or Andy Jewell as soon as possible and explain as clearly as you can the nature of the difficulty. If the problem is indeed new, you may be asked to draft a summary of the issues to be sent to the members of the listserv.

Dealing with Uncertainty

Encoding manuscript materials is difficult for a number of reasons, and you will no doubt sometimes feel confused or indecisive. If the difficulty is one of reading or interpreting, consult with Ed Folsom or Kenneth Price. If, instead, the problem has to do with markup, consult with Brett Barney or Andy Jewell.

If the problem is relatively minor, in that it doesn't prevent your continuing to work on other parts or aspects of the manuscript, you might decide to handle the problem in the best way you know how and leave yourself and the editors a detailed comment, in the file itself, about the problem and how you've handled it. Then you and they can return to it later. Do this by writing the comment and then surrounding it with these characters:

<!-- your comment here -->

This sequence of characters signals the computer's SGML processor to ignore everything that comes between the first and last marks.

Alternatively, you can write your comment and wrap it in a <what> element. This is an element we have borrowed from the Brown Women Writers Project, and it can be used in essentially the same way as the "commenting out" convention explained above.
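
A minimal sketch of the <what> alternative. Both the line of verse and the wording of the comment are hypothetical placeholders, and placing the element directly after the line it concerns is an assumption rather than a documented rule:

```xml
<l>[line containing the doubtful reading]</l>
<what>[Encoder's comment: describe the problem and how it was handled.]</what>
```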

3.19 Work Relationships and Date Information
*this encoding is typically inserted by upper-level staff people and editors*
As part of the header, we encode the relationship of the individual manuscript to a "work" (or "works"). As opposed to a document, which is a particular instantiation of a poem or book, etc., a "work" is the abstract idea of a poem or book, etc. We name the work according to the last instance published in Whitman's lifetime. For example, the work "Song of Myself" refers not to any particular manuscript or printed version of that poem, but to all of the versions collectively. Individual documents of that work include: the poem printed in the "deathbed edition," titled "Song of Myself"; the first, untitled version of the poem in the 1855 edition of Leaves of Grass; manuscript drafts of lines included in the poem; and notebooks that contain ideas and trial phrases that contributed to the composition of the poem.

We encode this work relationship at the beginning of the transcription file, immediately before the <teiHeader> element. The following example is taken from the transcription of this manuscript:

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE TEI.2 PUBLIC "-//UVA::IATH//DTD whitman.dtd (Whitman Archive)//EN" "whitman.dtd" [
<!ENTITY loc.00213.001 SYSTEM "loc.00213.001.jpg" NDATA jpeg>
<!ENTITY xxx.00358 SYSTEM "xxx.00358.xml" NDATA xml>]>
<TEI.2 id="loc.00213" type="doc">

<relations>
 <work entity="xxx.00358" cert="high">
 This manuscript is a draft of "Life and Death," which was published first in the New York
 <hi rend="italic">Herald</hi>, <date value="1888-05-23">May 23, 1888</date>.
 
 </work>
</relations>

<teiHeader>
. . .

To encode the work relationships, we must first look up the work ID. A table of work IDs can be found here in the reference section of the Encoding Guidelines. The ID, which is a string of characters beginning "xxx" and ending with a five-digit number, corresponds to a work file which will contain prose descriptions of the compositional history of the work as well as connect the transcription files with other elements of the Whitman Archive. The ID is inserted in two places: in the entity declaration and as the value of the "entity" attribute in the "work" element.

In addition to this ID, project editors also assign either "high" or "low" as the value of "cert" to describe their confidence in connecting the individual manuscript to the work. For a manuscript with lines that are identical or very close to a published poem, the certainty will be "high"; notes that describe an idea in a way that bears a general resemblance to a published poem will get a "low" certainty.

The final part of the <relations> section is a brief prose description of the publication history of the work. This editorial note will be displayed along with the transcription on the site. If there is something distinctive and noteworthy about the manuscript, the editor may also insert a project note within the <notesStmt>.

A new <work> element, with a prose description, is used for every work related to the document. All dates within the prose description are tagged with a <date> element with a "value" attribute that records the date in the form YYYY-MM-DD (or, if appropriate, just YYYY). Any titles that are normally italicized need to be marked with <hi rend="italic">.
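
A rough sketch of a <relations> section for a document related to two works. The work IDs and the bracketed descriptions are hypothetical placeholders; each ID used this way would also need its own entity declaration, as in the sample file above:

```xml
<!-- work IDs and descriptions below are hypothetical placeholders -->
<relations>
 <work entity="xxx.00001" cert="high">
 [Prose description of the first related work, with titles marked like
 <hi rend="italic">Leaves of Grass</hi> and dates tagged with date elements.]
 </work>
 <work entity="xxx.00002" cert="low">
 [Prose description of the second related work.]
 </work>
</relations>
```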

Dating the Manuscript

We have recently begun inserting information in the transcription files that will allow us both to sort manuscripts by composition date and to provide users with a note about that date. Here is an example, taken from the same manuscript transcription as the sample above:
. . .
<notesStmt>
 <note type="project" target="dat1">This manuscript was probably composed in the spring of
 <date value="1888">1888</date> shortly before it was published.
 </note>
</notesStmt>
<sourceDesc>
 <bibl>
 <author>Walt Whitman</author>
 <title>Life and Death</title>
 <date certainty="high" value="1888" id="dat1">1888.</date>
 <orgName>The Charles E. Feinberg Collection of the Papers of Walt Whitman, 1839-1919,
 Library of Congress, Washington, D.C.
 </orgName>
 <note type="project">Transcribed from our own digital image of original manuscript.</note>
 </bibl>
. . .

There are two major steps to inserting the dating information. The first is the insertion of the <date> element within <bibl>. Within this element we insert either the year of composition ("1888"), the year plus qualifying language ("About 1888"), a date range ("1867-1888"), or whatever brief description of the date is appropriate. (We have decided to only list the year in this space, even when month and day information is available.) The <date> element takes three required attributes: "certainty," "value," and "id." The value of "certainty" can be low (when we only have a conjectural date to offer), high (when we are fairly certain about the composition date), or absolute (when Whitman dated the manuscript or there is some other very conclusive evidence of its composition date). The value of "value" is a regularized date, written as "YYYY" or, in the case of a range, "YYYY/YYYY" (the first year represents the earliest year in the range; the second year represents the latest year). The value of "id" is always "dat1."
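
As a sketch of the range case, a <date> for a manuscript dated only to the span 1867-1888 might read as follows. The certainty value here is an assumption chosen for illustration:

```xml
<!-- certainty="low" is an illustrative assumption -->
<date certainty="low" value="1867/1888" id="dat1">1867-1888</date>
```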

The second step is to insert a prose note within the <notesStmt> that describes the reasoning behind our dating of the manuscript. This <note> takes two attributes: "type='project'" and "target='dat1'" (required). The "target" attribute allows us to associate the dating information with the <date> in the <sourceDesc>. The prose note will display below the transcription, giving users fuller information about when Whitman wrote the manuscript.

Character	Function in Whitman	Unicode Number
=	Proofreader's mark for hyphen. WW sometimes uses "=" for compound words ("down=balls") and words split between two lines ("some=thing"). PLEASE NOTE that ‑ is used only when Whitman uses "="; if he uses the standard hyphen ("-"), just key it in.	‑
—	Longer dash e.g., "Not these—O none of these more" PLEASE NOTE that there should be no spaces before or after the dash, regardless of how the spacing appears on the page.	—
&	Indicates "and"	&
*	An asterisk	*
©	Copyright symbol	©
✓	Checkmark	✓
½	Used often in Bowers's system of page numbering	½
¾	Used to indicate the fraction, occasionally on manuscripts	¾
¶	Indicates beginning of new paragraph or a new line of poetry	¶
ñ	Spanish-language character, n with tilde	ñ
ó	An "o" with an acute accent mark (to capitalize, change to Ó)	ó
é	An "e" with an acute accent mark (to capitalize, change to É)	é
è	An "e" with a grave accent mark (to capitalize, change to È)	è
☞	A right-pointing finger	☞
☜	A left-pointing finger	☜
☝	An up-pointing finger	☝
☟	A down-pointing finger	☟

Value of 'rend' attribute	Function in Whitman
underline	indicates underscored text
circled	used when text, typically within a note in the margin, is surrounded by a circular line in order to separate it from other text
bracketed	used in <title> within the <titleStmt> to distinguish derived titles
italic	used only in transcriptions of printed material or in project notes to mark titles of books.
dotted	used with <restore>
indented1 indented2 indented3 indented4	added to <l> when Whitman uses staggered indentation at the beginnings of lines; the numbers indicate relative amount of indentation (1=shortest, 4=longest)
horbar-full horbar-short-right horbar-short-left horbar-short-center	used in <milestone> to indicate various positions and lengths of horizontal separators