The Walt Whitman Archive at Ten: Some Backward Glances and Vistas Ahead

Whitman famously said, "Missing me one place search another, / I stop somewhere waiting for you." These days he seems to be waiting everywhere: a recent Google search yielded 805,000 hits for the phrase "Walt Whitman"; Yahoo! claimed over a million. The Walt Whitman Archive was the first hit in both searches; also highly rated were the Library of Congress's site devoted to the recovered notebooks and the homepages of the Walt Whitman Arts Center and Walt Whitman High School. I wonder what sort of content is featured on some of the lowest-rated sites, too. Unfortunately, both search engines link only to the first 1000 sites. Of course, there's plenty of variety in the one-tenth of one percent of the sites that are linked. We can view photos from a "walking tour of" the poet's "old haunts in Manhattan," read an essay on "Walt Whitman, Prophet of Gay Liberation," find out the latest activities of the Walt Whitman Sailing Society, or download seemingly countless term papers (including a "6 page analysis of 'Leaves of Grass'"). On the web Whitman is interpreted, summarized, marketed, defended, and vilified. Some sites provide miniature lessons in collecting Whitman or in History of the Book scholarship; some use his image to sell clocks, lapel pins, coffee mugs, and mouse pads; others invoke Whitman in the creation of original art. On the web we find poems, paintings, and engravings inspired by Whitman. One of the more audacious artistic uses of Whitman is the Flash animation "Walt Whitman" by performance artist My Robot Friend, in which a techno-pop song with a raucous beat is joined with flying images of Whitman, excerpts from "Salut au Monde!" and photographs of nude teenage boys. And now we have new possibilities for Whitman studies with the Mickle Street Review. That is, with an online journal we have opportunities for born-digital critical and creative responses to Whitman, responses that can be expressly designed for web presentation and that take full advantage of the medium.

The web serves many functions. It creates fresh opportunities for expressiveness; makes locating an elusive quotation relatively easy via simple string searching; and allows ordinary people to participate in critical debates. This is all valuable. But there is also an extraordinary amount of junk on the web. Typically what appears on the web is idiosyncratically built and irregularly maintained. Stuff moves around, stuff vanishes, stuff can't be trusted: it's a questionable environment, at best, for serious scholarship. Little of the academic material on the web has undergone peer review. Yet online publication has advantages, too, and I hope to clarify some of them in these reflections on the Whitman Archive as a scholarly research tool.

The Whitman Archive ranks highly on Google because we were an early web presence—early adopters seem to have a huge advantage in Google rankings. Ed Folsom and I have described the Archive as a "large electronic research and teaching tool that sets out to make Whitman's vast work, for the first time, easily and conveniently accessible to scholars, students and general readers." The Whitman Archive was begun in 1995 and has had a long-standing affiliation with the Institute for Advanced Technology in the Humanities at the University of Virginia (IATH). A large team (located primarily at the University of Iowa, University of Nebraska-Lincoln, University of Virginia, Duke University, and the University of North Carolina-Chapel Hill) is attempting to edit the poet's voluminous printed materials and his chaotic and radically dispersed manuscripts. We're attempting to edit Whitman with the same rigor that characterizes the best examples of print editions while grappling with the demands of complex and rapidly changing new technology. This year of anniversaries—Leaves of Grass is 150 and the Archive is 10—provides an occasion to summarize our progress thus far, describe some of our plans and hopes, and offer ways to contextualize Whitman on the web. After a decade of work, we have completed only approximately one-fourth of the work we have outlined.

The project began fairly early in the development of the web. Tim Berners-Lee had been creating HTML, HTTP, and the first web pages for a couple of years at CERN, a particle physics laboratory, when, in August 1991, he first publicized his new World Wide Web project. In 1993 the Mosaic web browser 1.0 was released, and soon public interest was sparked. That same year, 1993, I had a brief conversation with a colleague at Texas A&M, Jimmie Killingsworth, about the possibilities of electronic media for presenting the multiplicity of Whitman's texts. Ed Folsom was also attentive to the implications of new technical developments. When he drafted his 1994 essay "Prospects for the Study of Walt Whitman" he included a discussion of the possibilities of electronic editing. None of our musings took concrete form, however, until after I had moved to the College of William & Mary. One day in 1995 Charles Green and another graduate student, David Donlon, strolled into my office and talked about dreams of editing Shakespeare or Whitman or something big and vast on the web. As students in a class with Terry Meyers, they had just visited IATH at Virginia, and they were converts. I tried to dampen their enthusiasm by telling them about the multiple levels of difficulty we would confront (only some of which I recognized at the time), and I tried to convey a sense of the magnitude and complexity of the Whitman corpus. We decided nonetheless to go forward, and if we hit a wall, we'd stop. David Donlon soon lost interest in the project, but Charles Green grew ever more excited, and his dedication was invaluable in those early days. We began to build what we were then calling the Walt Whitman Hypertext Archive (we later dropped the word Hypertext when, by about 2000, the word itself had come to seem both overly "hyped" and dated). 
Charlie and I were working with little technical expertise, little guidance, and very little access to the network—the English department at William & Mary had only one room, a kind of glorified study carrel that had network access. Thirty-some faculty members and a dozen or so graduate students shared a single network connection. Before long, however, we had the skeletal structure of the Whitman Archive developed. This was phase one of the Archive, appearing with a burlap background (what we've come to refer to simply as the brown site). This was our look from 1995-2000.

One key early decision, made just a couple of months after we began work, was to contact Ed Folsom and suggest that we get together at a conference to talk. I knew the project would be stronger if it could be collaborative across institutions, and I couldn't think of anyone better than Ed to help envision its possibilities and help direct its development. Having two established scholars involved would help our professional credibility—not a small matter when working in a medium that even now struggles for acceptance in the academy (witness, for example, the resistance of tenure and promotion committees to online work and the MLA's refusal to include online archives in its bibliography of scholarship).

The project began with no funding whatsoever. In those stone soup days we worked opportunistically with files we happened to have in electronic form: Ed had placed online all of the photos of Whitman for storage, so we built a framework for displaying them. I had recently published a book, Walt Whitman: The Contemporary Reviews, and, since this material was all out of copyright and Cambridge University Press was willing to provide me with the electronic files, we were able to put the reviews online quickly too. Meanwhile we began to publish electronic versions of Leaves of Grass, providing both page images and text.

We developed an HTML site, what I'd now call a prototype for a serious site, though at the time we didn't think of it as a practice run. As is now widely recognized, HTML is not well-suited to scholarly purposes. There are many problems with HTML: one key problem is that it is a "display-descriptive" markup language that tells a web browser whether to make something italic or 14 point type or blue but does not declare what the structure of a text is. To render something in italic is not to say that it is a foreign word, or a word of emphasis, or a title. If you don't declare what a thing is (rather than how it looks), you can't retrieve it in searching, you can't easily compare it to other things of the same kind, and you can't redisplay it in a different way for a different purpose. Our first undertaking did not rigorously adhere to best practices in humanities computing, nor were we much attuned to international standards. There were also problems with the navigation of the site. For example, once you entered deeply into, say, the Works section of the Archive, there was no quick way to move to another section. You had to back out via several clicks of the back button because there was no navigation bar present on all the pages. On the positive side, we made a fair amount of content available, and despite flaws behind the curtains, we gained positive publicity in the Chronicle of Higher Education and Washington Post.
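The distinction between saying how a thing looks and saying what it is can be made concrete with a minimal sketch in Python. The sample sentence is invented for illustration, and the element names (title, foreign, emph) are assumptions loosely modeled on descriptive markup of the TEI kind, not the Archive's actual encoding.

```python
import xml.etree.ElementTree as ET

# Display-oriented markup: all three phrases are merely "italic".
# (Invented sample sentence, for illustration only.)
display_markup = (
    "<p>In <i>Leaves of Grass</i> the cry <i>Allons</i> "
    "is <i>not</i> ornamental.</p>"
)

# Descriptive markup: each phrase declares what it is.
# Element names are illustrative assumptions, not the Archive's schema.
descriptive_markup = (
    "<p>In <title>Leaves of Grass</title> the cry <foreign>Allons</foreign> "
    "is <emph>not</emph> ornamental.</p>"
)


def texts(markup, tag):
    """Return the text of every element with the given tag name."""
    root = ET.fromstring(markup)
    return [element.text for element in root.iter(tag)]


# With display markup we can only ask for "everything in italics".
italics = texts(display_markup, "i")

# With descriptive markup we can ask for titles, or foreign words, separately.
titles = texts(descriptive_markup, "title")
foreign_words = texts(descriptive_markup, "foreign")

print(italics)
print(titles)
print(foreign_words)
```

A search for titles in the first version can do no better than return every italicized phrase, while the second version lets the same request return only the title. The descriptive version could also be redisplayed in italics, or in some other way, without losing that information.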

In 1996 Primary Source Media, a commercial publisher, approached us with plans to accomplish in roughly a year what we had planned to do much more gradually. They could invest in the project resources that we didn't have. They were interested in producing a marketable product quickly rather than a painstakingly accurate archive—full of annotations, introductions, and other scholarly features—more slowly, as has always been our inclination. We decided to work with Primary Source Media, and the result was a useful product, Major Authors on CD-Rom: Walt Whitman. The CD was marketed primarily to libraries and is now no longer being sold. There were some important consequences from this undertaking. We persuaded Primary Source Media to donate the out of copyright texts of Leaves of Grass to the Whitman Archive by arguing that their sales would be based on making available, in searchable form, the New York University Press Collected Writings of Walt Whitman. The Primary Source Media alliance has been invaluable in our editing of Whitman's manuscripts.

Another key step in our development occurred in 2000, just before I moved to the University of Nebraska-Lincoln, when we received our first NEH grant. We made a conscious decision at that time to add only structured data to the site and to gradually redo older HTML parts of the site, as time allowed. With the assistance of Rob Nelson and later Brett Barney, the second phase of the Archive was initiated with the development of the blue site. The new interface significantly improved navigation within the site.

As of April 2005 we have all six American Editions of Leaves of Grass completely transcribed and posted as XML files. (XML—or more specifically the Text Encoding Initiative implementation of XML—is the de facto international standard for serious humanities computing projects.) Users can access the entire file or particular chunks—individual titles or clusters. Each page of transcription is accompanied by a reproduction of the text in facsimile, allowing users to check our transcriptions and to study what Jerome McGann calls bibliographic codes, the way a text makes meaning through non-linguistic textual features such as margins, typeface, ornamentation, and so on. When time allows, we expect to add introductions to the various editions that will provide information on the composition and reception history, variations among different issues of the same edition, and explanations of key features that our work has uncovered. For example, it was the need to categorize the material on the page in XML encoding that led us to the realization that what had long been regarded as twelve untitled poems in 1855 were not in fact untitled.

We also intend to make the British editions of Whitman's poetry readily available. In fact, they have been available for quite some time, though they've been relegated to the back scenes. Some of you may know that the brown site featured William Michael Rossetti's Poems by Walt Whitman and Ernest Rhys's edition of Leaves of Grass. These volumes were originally contributed to the Archive by Ed Whitley. We are in the process of adjusting Whitley's valuable work so that it conforms to the current encoding practices of the Archive. The Rossetti edition has been converted to XML and will be mounted in the coming months, after being thoroughly vetted. The Rhys edition is also in our future plans, though that conversion work is not yet scheduled.

In addition to these American and English editions of Leaves of Grass, we have gathered and processed approximately 4000 high-resolution, archival quality image files of Whitman's poetry manuscripts. These working drafts are documents of rare importance. By the end of the summer we expect to have a complete archive of images. One of the surprising facts is that after a half century of work on the NYUP Collected Writings and after a great mass of peripheral material had been meticulously edited and annotated, Whitman's poetry manuscripts remain unaccounted for in that print edition—left uncollected, neither transcribed nor annotated, not even listed anywhere. We feel fortunate to be the editors who are able to give these documents sustained attention for the first time.

Of course having digital images is one thing and transcribing, encoding and annotating them is another. We have a manuscript tracking database that helps us keep track of the flow of work through various stages: transcribed and encoded; checked; edited; and "blessed"—the last term meaning that a text has gone through every stage of checking, conforms to our project's highly articulated encoding practices and displays properly on the website. This process is slow and painstaking because careful transcription of messy documents is by its nature time-consuming. Moreover, we are dealing with unusually complex material for web presentation. We have a growing list of publicly available transcriptions of poetry manuscripts, currently numbering over 80. These are fully transcribed, encoded, and proofread multiple times. An additional twenty-three manuscripts have been completed; we are withholding them until we can fix bugs that keep them from displaying properly. Approximately 250 additional manuscripts have been completely transcribed and encoded and are now in various positions in the pipeline as we do the methodical checking before public presentation.
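The staged workflow just described can be sketched in a few lines of Python. This is an illustration only, not the design of our tracking database: the manuscript identifiers are hypothetical, and the strict one-step-at-a-time ordering is an assumption.

```python
from enum import IntEnum


class Stage(IntEnum):
    """Stages named in the text, in assumed order of completion."""
    TRANSCRIBED_AND_ENCODED = 1
    CHECKED = 2
    EDITED = 3
    BLESSED = 4


def advance(stage):
    """Move a manuscript to the next stage; a blessed text stays blessed."""
    return Stage(min(stage + 1, Stage.BLESSED))


def ready_for_public(records):
    """Return the identifiers of manuscripts that have reached the final stage."""
    return sorted(ms_id for ms_id, stage in records.items() if stage == Stage.BLESSED)


# Hypothetical identifiers, for illustration only.
records = {
    "ms-001": Stage.CHECKED,
    "ms-002": Stage.BLESSED,
    "ms-003": Stage.EDITED,
}

# One more round of checking moves ms-003 from edited to blessed.
records["ms-003"] = advance(records["ms-003"])

print(ready_for_public(records))
```

Only manuscripts at the last stage are offered for publication here; everything else remains in the pipeline.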

Most of the manuscripts we've edited thus far are brief, rarely more than one or two leaves in extent. But there are important exceptions. Whitman's notebooks constitute a special category of manuscripts. A team at Iowa headed by Ed Folsom is transcribing and encoding all of Whitman's notebooks. The incomparable "earliest" notebook has been completed and is now being vetted. Meanwhile Andy Jewell of the University of Nebraska Libraries is working on what may be the single longest Whitman manuscript, the so-called "Blue Book," Whitman's annotated copy of the 1860 Leaves of Grass that he used while preparing the 1867 edition. This item famously cost Whitman his job in Washington, and it is an artifact of fascinating complexity, with its many annotations, deletions, multiple changes of mind, and tipped in passages. This document raises a host of editing and technical questions. Even at this early stage of the work, we have uncovered dozens of ways in which Arthur Golden's "facsimile" published by New York Public Library differs from the original document.

In order to gain bibliographic control of our manuscript editing project we developed, thanks to funding from the Institute of Museum and Library Services, an "Integrated Finding Guide to Walt Whitman's Poetry Manuscripts." We are creating, in partnership with many institutions, a comprehensive guide to all Whitman manuscript materials, one place where a scholar, or any user, can go to search through all of the documents describing the manuscript holdings of dozens of repositories and find exactly what is needed. In doing so, we are demonstrating how Encoded Archival Description (EAD) can pull together dispersed collections to create a single, scholarly-oriented view or collocation of the materials. We are also addressing an unresolved issue in digital scholarship, namely how to integrate all of the digital representations of intellectually related materials (Text Encoding Initiative XML transcriptions, JPEG images, and EAD item descriptions, for example). It may be worth mentioning that our project has a history of re-tooling standards to do what we think makes most sense for Whitman scholarship. Our creation of an integrated guide, rather than one focused on a particular actual collection or collections, is one example of how our work stretches the capabilities of international standards.

Significant progress has been made on additional areas of the Archive as well. We have added a section with texts by Whitman's disciples, with plans to present many of the key contemporary accounts written by those closest to the poet (sometimes with the poet's own active involvement). This effort is being headed by Matt Cohen of Duke University, in consultation with the team at Nebraska. The Disciples area currently offers biographical sketches of three of the most important figures—Horace Traubel, John Burroughs, and William Douglas O'Connor—as well as some of their Whitman-related works. Development of the Traubel section of this part of the Archive is proceeding quickly; the transcription and encoding of volumes 1 and 4 of Horace Traubel's 9-volume With Walt Whitman in Camden has been completed, and a preliminary interface has been designed. Volume 1 is now live on the site, and volume 4 will be posted soon. If Matt Cohen's schedule holds—and so far he has moved along at an impressive clip—all nine volumes will be done within the next two years. This will be a great advantage for scholars who have long wished for a way to locate the nuggets hidden in these poorly indexed volumes.

Work also continues on other fronts: periodicals, interviews, reviews, and images of Whitman. Susan Belasco, my colleague at the University of Nebraska, has made significant strides in presenting the periodical printings of Whitman's poetry. Over the course of his career, Whitman published approximately 150 poems in over 40 different periodicals. Studying these poems as they first appeared in print has not been a practical possibility for scholars. Soon it will be. In addition, Brett Barney of the University of Nebraska is working to develop a new section of the Archive devoted to interviews done with Whitman, gathering for the first time these important documents published mostly in newspapers. Transcription and encoding of three dozen of the approximately fifty known interviews has been finished and will be added to the site later this summer as the first installment in this section of the Archive. Here as elsewhere the effort to divide Whitman's work into categories leads to difficult questions: should the Traubel material all be regarded as one big glob, a four-year, 5,000 page interview? A further puzzle: does an interview require the involvement of at least two people? That is, do we have an interview when we encounter a manuscript in Whitman's hand offering up both the questions and the answers? Charles Green at the University of North Carolina at Chapel Hill is working with me to update the Contemporary Reviews section of the Archive. This involves both transforming the encoding from HTML to XML and adding new content: we are transcribing and encoding more than a dozen newly discovered reviews for inclusion. Ed Folsom's team is gradually updating the section on images of Whitman, replacing lower quality reproductions of photographs with higher quality images and adding new information to the annotations. We are also in the process of making this part of the site searchable.

Folsom and his team at Iowa continue to regularly update the bibliographic database, an extraordinary resource which now holds approximately 5,000 records. The bibliography could well be expanded not just forward as new material is published but backwards as well. All of the core bibliographic data in Scott Giantvalley's bibliography from 1838 to 1939 and Donald Kummings's bibliography from 1940-1975 could be added to the site. The entries themselves are facts and are not protected by copyright; the annotations, however, are under copyright. One task for the future is to approach Kummings and Gale to seek permission to publish the annotations along with the entries. Giantvalley has passed away, so what we can do about his annotations is an open question.

The Whitman Archive presents a modest amount of recent criticism: currently we display 80 entries from the Walt Whitman Encyclopedia, two essays by Martin Murray, and an account of the controversy over the sequence of love poems known as "Live Oak, with Moss." Most recent criticism is entangled with copyright issues, so rapid development of this part of the site is unlikely. There are some opportunities, however. In the future we'd like to make available all back issues of Walt Whitman Quarterly Review, for example. We also plan to offer online some full-length critical books for which we have secured copyright. We'll start with books written or edited by the Archive staff. A book Ed Folsom and I have forthcoming, Rescripting Walt Whitman: An Introduction to his Life and Work, will appear, as will my own To Walt Whitman, America. We'll also add Rethinking the 1855 Leaves of Grass, a volume emerging out of the Nebraska sesquicentennial conference held March 31-April 2, 2005. I expect we will want to present additional books as time, money, and copyright allow. My advice to Whitman scholars would be to hang on to your electronic rights. We rarely need to give up our right to publish our own intellectual labor. Scholars need to get in the habit of not giving away copyright on their work so that we can realize together what Whitman can be on the web: if he is going to have the expansive web presence many of us would like to see it will be because our community of Whitman scholars makes a concerted effort to keep the texts and criticism open, accessible, and free or, failing that, as affordable as possible.

What will be some of the key challenges for the Whitman Archive in the future? Sustaining ourselves financially is crucial. This work won't progress adequately unless we are able to keep our talented staff. The challenge is to pay people when we offer a "free" site that brings in no revenue. It's good to remember that the site is free to the end user, but it certainly isn't free to produce. Hardware, software, digital scans, travel, phone, consultation with technical experts—not to mention salaries—all add up. We have thus far managed to develop the Whitman Archive via the strong support of several universities and the generosity of federal funding agencies and one private foundation. Whether we will be able to sustain this economic model into the future is an open question. One promising development is that the University of Nebraska is mounting an effort to build a permanent endowment for the Archive. We hope that this will be successful and that an endowment will free us from subsisting year-to-year on a patchwork of temporary funds.

We've now been five years in phase two of the Archive, 2000-2005, and again arguments for a redesign are becoming compelling. The reason for even considering a redesign boils down to the difference between frames and tables. In the current view, one of our typical pages is actually made up of four distinct frames. The frameset is stable, allowing the navigation bar to remain in one constant place on the screen as the text scrolls down. But there are disadvantages to this design. Printing is a problem since a printer ordinarily will print a sheet for each frame. Searching is an even bigger problem: a person who uses an internet search engine to find, say, Pfaff's at the Whitman Archive will get the page shorn of the navigation bar. Another problem is that all the poetry manuscripts, or all the reviews, have the same URL, thus making it difficult for people to cite a particular spot in the Archive. (There is a way to get a more specific URL by right clicking, but most users are probably unaware of this possibility.)

Most of these problems could be resolved through the use of tables, which are becoming a more common way for large web sites to deal with similar display issues. Each page would print more easily, have a unique and visible URL, and be more easily navigable overall. Older browsers do not support all of the newly developed features of tables, such as having an unmoving header that stays at the top of the page as the user scrolls through the material. We are currently trying to craft a layout that will improve upon our current one for users of newer browsers without sacrificing navigability for those with older browsers. This difficulty of design is one that we consistently face: what sorts of technological capabilities can we reasonably expect from our users? It is important to keep our design current with changing technology and tastes. But at what point can we fairly put the onus on the user to keep up with developments in browsers and monitors without alienating the very people we hope to serve?

What are the areas of greatest potential for the Archive? A good goal for us, I think, is to make the Whitman Archive not only a rich resource but an enabling interpretive tool that advances how analysis itself is done. We need to find a way to trace poetic change as it ripples through versions, to make this visually clear. Exactly how this will be accomplished remains unknown. I can foresee how some kinds of visualizations could work. For example, I think we could take clues from an internet site called "Name Voyager" to trace changes in Whitman's diction from the 1840s to 1890s. Which editions make the greatest proportional use of key words, say comrade, black, lover, slave, bowels, soul, onanist, Walt, tabouschnik, mossbonkers?
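The question of proportional use is simple arithmetic: count a word's occurrences in an edition, then divide by that edition's total number of words. Here is a minimal sketch in Python. The two "editions" are tiny placeholder strings, not real texts, and the function names are hypothetical.

```python
import re
from collections import Counter


def relative_frequency(text, keywords):
    """Return occurrences of each keyword per 1,000 words of text.

    Matching is on exact lowercase tokens, so "comrade" does not
    count "comrades"; a fuller treatment would normalize word forms.
    """
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    total = len(words)
    return {k: (1000 * counts[k] / total if total else 0.0) for k in keywords}


def top_edition(editions, keyword):
    """Return the name of the edition with the highest rate for one keyword."""
    return max(
        editions,
        key=lambda name: relative_frequency(editions[name], [keyword])[keyword],
    )


# Placeholder strings standing in for full edition texts (NOT real data).
editions = {
    "edition_a": "soul soul comrade lover",
    "edition_b": "soul comrade comrade comrade lover lover slave soul",
}

for name, text in editions.items():
    print(name, relative_frequency(text, ["comrade", "soul"]))

print(top_edition(editions, "comrade"))
```

Dividing by the total matters because the editions differ greatly in length; raw counts alone would favor the longest one. Rates of this kind, computed for each edition, are what a display in the manner of "Name Voyager" would plot over time.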

We'd like for the site to enable and to promote interpretations that hadn't been possible before. We've wondered about the possible use of Geographic Information Systems or GIS. We'd start with a base map with detail down to the block level. Period maps no doubt exist for DC, New York, Brooklyn, Philadelphia, and New Orleans. This idea also appeals to me because of my academic place, the University of Nebraska. William Thomas, formerly the director of the Center for Digital History at the University of Virginia, will be joining our faculty next fall and bringing with him a desire to develop GIS in ambitious new ways. We also have on the faculty Kenneth Winkle, a distinguished Abraham Lincoln scholar who does community-based analyses of Lincoln. There may be an opportunity for a collaborative development of a GIS-based resource on "Whitman, Lincoln, and Civil War Washington." Certainly new discoveries will emerge once we can ask different questions because of having a great deal more information from census records, maps, health records, police reports, possibly even information on sexual subcultures, and so on. For example, we could produce a visualization of where Whitman's soldiers were from or a visual dramatization of the idea that Whitman saw America by walking the wards of the hospitals.

What are the greatest weaknesses of the Archive? When we ask where and how the NYUP Collected Writings will live, we have a predictable and more or less reassuring answer: on library shelves. All we know for certain about such projects as The William Blake Archive and The Complete Writings and Pictures of Dante Gabriel Rossetti: A Hypermedia Research Archive is that they must live and thrive simply because they are too valuable to let die. None of us, though, can yet say with any certainty under what conditions some of the early breakthrough projects in humanities computing will survive. Fortunately, the library community is deeply committed to overcoming the challenges that are presenting themselves. I believe and hope that the loss to the humanities would be just too great to let major projects die. Still, in all soberness, we have to realize that this is an experimental time, and what may happen in the future is anyone's guess. It has seemed to us nonetheless that these experiments had to be undertaken, because Whitman, like Rossetti and Blake, presents intractable problems for conventional editorial approaches. His work has characteristics (for example, an obsessive, seemingly ceaseless revisionary effort that has been called a "fluid text") that despite gargantuan editorial efforts have resisted adequate representation in conventional print formats.

The Whitman Archive is providing a complete record of the "American bard," thus giving the general public and scholars at all levels the opportunity to read and study the work of this central spokesman for America. The Archive devotes to Whitman the type of care that has been bestowed on writers who are cultural icons elsewhere—Cervantes in Spain, Shakespeare in England, and Goethe in Germany, for example. Like Whitman, these writers are fundamental to their cultures. Whitman's challenging and provocative work is a major cultural resource, deserving to be widely and freely available in reliable scholarly texts. The Archive has begun the unprecedented process of providing free access, via the world-wide web, to the entire corpus of a writer who deepens and enriches our sense of who we are and of what we can become.

Whitman used pens and pencils, paper and magazines, type and books to create Leaves of Grass. In the last 140 years we've used the same implements to study him. But in the last ten years we've brought other tools into play, and we should reflect on the consequences and implications of those tools. We are only beginning to grasp how electronic technology will impact our study of Whitman, how elements which are now invisible might become visible. At the Whitman Archive, we hope, paradoxically, that the place for rich exploration of Whitman in future years will be no place at all, but the fluid, expansive, exploratory realm of the web.


Notes:

1. Ed Folsom and Kenneth M. Price, "Introduction," http://www.whitmanarchive.org/introduction/.

2. This will be part of The Aurora Project: A Dynamic Atlas of American History. This project is a multi-institution partnership, currently planned between UVA and UNL and led by Ayers and Thomas, to develop both new model scholarship in digital form and whatever tools and techniques are needed to create dynamic visual models of large-scale social processes in history. Aurora, Latin for "dawn," takes its name from two sources of inspiration: first, the Revolutionary era Philadelphia newspaper edited by William Duane that represents the social consequences of dynamic information, and second, the "northern lights" and their effect of dynamic illumination.
