Chapter 5: Digital editing as autopoietic process

Abstract

One of the most potent criticisms of the TEI as a means for describing documents is that it is deterministic: it forces upon a text a set of a priori categories and thus largely determines how the text is understood. My contention is that critiques offered by scholars such as Jerome McGann can be answered in part by modifying one's conception of the text's poietic frame. If the entire process of editing is conceived of as an autopoietic system, the act of marking the text according to the TEI standards becomes but one stage of a process that is repeatedly questioned and open to revision. By conceiving of the system as editorial interaction with successive iterations of the text, the editor is made aware of the inherent textual and linguistic ambiguities of the source document and is forced to read actively and engage these ambiguities in pursuit of an adequate electronic text. These ideas will be demonstrated with examples from my work in digitizing a manuscript of records belonging to William Courten, a 17th-century naturalist and collector.
  

Keywords

Digital, textual, editing, McGann, Courten, collection

How to Cite

deTombe, J. (2016). Chapter 5: Digital editing as autopoietic process. Digital Studies/le Champ Numérique, 6(6). DOI: http://doi.org/10.16995/dscn.4


The guidelines of the Text Encoding Initiative are the 800-pound gorilla of digital textual editing. While the guidelines expand and adapt to meet the needs of editors, the hierarchical nature of the TEI, wherein the text is demarcated into an ordered hierarchy of content objects (OHCO), has long been critiqued. The editor of any digital project faces a choice between the openness and accessibility that the TEI provides as the standard method of textual markup, and the precision offered by the development of a proprietary system. Unsurprisingly, the vast majority of textual projects in the digital humanities deploy TEI-informed XML for editing and describing manuscripts and printed objects. One of the strengths of the TEI as a system of textual description is that it enables one to theorize many potential forms of a document in various states at once. Further, the transformability of which XML is capable allows for multiple instantiations and representations of the electronic text to function concurrently and/or consecutively.

One of the most potent criticisms of the TEI as a means for describing documents is that it is deterministic: it forces upon a text a set of a priori categories and thus largely determines how the text is understood. For Jerome McGann (2014), the poietic functioning of the original document (i.e. the potential for meaning-making implicit in the textual condition) is therefore suppressed in the resulting electronic text. My contention is that critiques offered by scholars such as Jerome McGann can be answered in part by modifying one's conception of the text's poietic frame. If the entire process of editing, from the initial transcription of the inscribed print or manuscript object to the end-reader's engagement with the electronic text, is conceived of as an autopoietic system (that is, a self-producing system), the act of marking the text according to the TEI standards becomes but one stage of a process that is repeatedly questioned and open to revision. By conceiving of the system as editorial interaction with successive iterations of the text, rather than with a static set of textual elements and bibliographic codes, the editor is made aware of the inherent textual and linguistic ambiguities of the source document and is forced to read actively and engage these ambiguities in pursuit of an adequate electronic text. The autopoietic functionality of the text is retained in the iterative nature of the process, in that iterations of the text are read and understood in the light of previous and subsequent iterations of the text. The rendered text becomes the cumulative product of all the decisions that led to the particular iteration and is representative of the numerous iterations that preceded it and surround it. These ideas will be demonstrated with examples from my work in digitizing a manuscript of records belonging to William Courten, a 17th-century naturalist and collector. The challenges encountered in the transcription and marking-up of the text and its idiosyncratic cipher illustrate the critiques offered by others, but also show how a processual model of editing can answer these critiques.

Processing critique

Disagreement about the processes at work in the act of digitization—the transmitting of textual content from a material document into a digital medium—has engendered numerous debates about the nature of the work being performed upon "the text" and the meaning of that work (see Robinson (2009) and Hayles (2003)). Though such discussions deal with the subject in abstract terms, the practical implications of the resulting theories directly influence the constitution and possible functionality of the digital text. The manner in which one prepares the text, then, reflects the theoretical assumptions under which one operates. These assumptions in turn define and limit the manner in which the reader will be able to interact with the resulting text. Critiques of the TEI Guidelines frequently come in two forms: i) the theoretical objection, based upon abstract notions of bibliographical ontology and how these notions are or are not served by the TEI and ii) the practical response to challenges encountered in deploying the TEI in transcribing a particular text when the Guidelines prove inadequate to the task at hand (for critiques see Buzzetti (2002) or Buzzetti and McGann (2006); Robinson (2009) and Hayles (2003) also discuss the history of critique of the TEI Guidelines). Reports of the second type of critique are helpful for seeing the practical work-arounds and improvised solutions to which other editors have resorted. These, in turn, can be productively read in correlation with the theoretical critique.

The idea of processual editing as a means of resolving the challenges faced when deploying a TEI schema has been suggested by Domenico Fiormonte, Valentina Martiradonna and Desmond Schmidt (2010). Their work in digitizing the manuscripts of Italian poet Valerio Magrelli (b. 1957) has revealed the "complex multidimensional and interactive reality of the process of composition" and, by "extension, editing" (Fiormonte, Martiradonna and Schmidt 2010, par. 1). Accordingly, they suggest, "a literary work can be regarded not merely as a product but as the result of a dynamic process of interaction between several factors" (Fiormonte, Martiradonna and Schmidt 2010, par. 1). This is an insightful comment, locating the complexity of the textual process (writing in this case, though it also applies to reading) in the interaction between identifiable elements of the text. The completed text, presumably, includes the reader as an interacting factor, thus increasing the complexity inherent in the interactions that must occur in the poietic functioning of the text.

Fiormonte, Martiradonna and Schmidt (2010) recognize that the complexities of the make-up of a text-bearing document are scarcely realized until one begins the process of digitizing and editing it. With the recognition of the difficulties of digital representation comes the realization of the limitations of the system being used. They ask, for example, "Can the multidimensional and pragmatic nature of different writing stages/sketches be represented with the help of digital elements?" (Fiormonte, Martiradonna and Schmidt 2010, par. 2). For their project, it was significant that in the TEI Guidelines, the system of markup "cannot represent the chronological sequence of corrections" to the text (Fiormonte, Martiradonna and Schmidt 2010, par. 20). Representing, as they must, the chronological strata evident in the Magrelli drafts, they were forced, due to the limitations of the TEI, to represent the differences in time "only with great effort or with an unacceptable level of imprecision" (Fiormonte, Martiradonna and Schmidt 2010, par. 35). The TEI, in their experience, with its necessary reductiveness, is inadequate for marking differences in time. Faced with the difficulty of representing draft additions and alterations temporally, they were unable to reconstruct in their electronic edition the manuscript's chronological development. They conclude that the sense of variance and instability present in the interpretation of the act of composition cannot be accommodated by the TEI or, indeed, any other system that makes similar textual assumptions.

Patricia Bart writes about similar challenges in her work encoding a transcription of the Huntington Library Hm114 manuscript for the Piers Plowman Electronic Archive. The particular needs of this "highly eccentric" manuscript, she says, are not well-served by the existing models, which presuppose "that manuscript features will fall into categories that are already well known and agreed upon in advance" (Bart 2006, par. 2 and 11). Markup, in this sense, "is seen more as a means of display and delivery of information of already known significance than as a tool for gradually discovering significant patterns as they emerge" (Bart 2006, par. 2 and 11). TEI-based markup, she suggests, could and should be used as an interpretive tool. Faced with an otherwise unresolvable difficulty—a text that cannot be adequately marked using the then-current TEI Guidelines—Bart's solution was to innovate, acknowledging the provisional nature of her TEI deployment and instinctively enacting a processual editing practice. To fulfill this model of TEI use, she calls for a new attribute that tags a speculative marking. This attribute would, she suggests, "foster rigorous experimentation within a controlled context" (Bart 2006, par. 38). Such experimentation, she adds, makes the process of editing into an education in reading (Bart 2006).

I have touched on these two examples because, when faced with very different issues, the editors of both projects reached in a similar direction—toward the provisional, the processual, and the iterative. Further, these two examples, drawn from the practical challenges faced by editors in practice, point to the difficulty that the TEI Guidelines will perpetually face in making generalized provisions that are to be applied to particular situations.

Markup as autopoiesis

McGann's (2014) critique of the TEI Guidelines in his essay "Marking texts of many dimensions" is based on theoretical objections, but correlates with the practical critiques outlined above. In the essay, McGann attempts to theorize a ludic method (one that makes possible spontaneous or playful interaction) of digital textual markup that more closely emulates the process of reading than the TEI currently allows. Using the language of systems theory, he suggests that a text operates as a poietic system that is always in the process of producing meaning, where the nature and content of this meaning depend on the constitution of the textual system and the input the reader brings to the whole. A system that functions by producing itself, whether as a derivative iteration or a functional duplicate, is called autopoietic. McGann also invokes this metaphor in other books and essays. For example, in The textual condition, "books" are described as "autopoietic mechanisms operating as self-generating feedback systems that cannot be separated from those who manipulate and use them" (McGann 1991, 15). Thus the interactions of the reader (who functions as an essential component of the "system") with the book produce an output that, to McGann's thinking, is one possible realization of the book as a system. In contrast, an allopoietic system is one that functions to produce something other than itself. In McGann's analysis, the reductive effect of an a priori system such as the TEI functions allopoietically. Without any possibility of an unpremeditated encounter with the text, poiesis must be imposed on the system from without.

McGann (2014) emphasizes the importance of the concept of text-as-system in his discussions about markup and digital-bibliographic ontology (for an excellent overview of the issues related to markup, ontology, and digital textual encoding see Renear (2004)). Textual models such as OHCO are inadequate, he argues, because "traditional texts are riven with overlapping and recursive structures of various kinds, just as they always engage, simultaneously, hierarchical and nonhierarchical formations" (Buzzetti and McGann 2006, 62). These overlapping structures are excised in models such as OHCO, limiting the ability of the electronic text to function as a printed text would. McGann does not suggest that the OHCO thesis is in itself incorrect, but that it is adequate only for some purposes: "Hierarchical ordering is simply one type of formal arrangement that a text may be asked to operate with, and often it is not by any means the chief formal operative. Poetical texts in particular regularly deploy various complex kinds of nonlinear and recursive formalities" (Buzzetti and McGann 2006, 62; see also Dean Irvine's treatment of tensions between TEI markup and representation of the modularity of modernist magazines in chapter 3 of this volume). OHCO is too limiting a model, he suggests, because the reader's interaction with a printed text is not confined to any hierarchical arrangement of content objects. This being so, the intentional marking-up of a digital text—the defining of its constitution—should better reflect the natural interaction that one has when reading its analogue exemplar. This interaction is, according to McGann, non-hierarchical, non-linear, and marked throughout with the preservation of ambiguities inherent in the use of language.
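
The overlap problem is easy to exhibit in miniature. The sketch below is not McGann's own example but a standard one in markup studies: a sentence that crosses a verse-line boundary cannot be nested within both the line hierarchy and the sentence hierarchy at once, and the usual TEI workaround demotes one of the hierarchies to empty milestone elements, at the cost of making it second-class:

<!-- Ill-formed XML: the sentence overlaps the line boundary -->
<l>The sea is calm tonight. <s>The tide is full,</l>
<l>the moon lies fair.</s></l>

<!-- The common workaround: lines become empty <lb/> milestones -->
<s>The sea is calm tonight.</s>
<s>The tide is full,<lb/>the moon lies fair.</s>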

The argument that McGann offers in these essays represents a strain of legitimate critique against hierarchical markup systems such as the TEI. In the years since "Marking texts" was first published, however, TEI-informed XML has become entrenched as the de facto standard method of markup in digital textual studies. While the TEI Guidelines continue to develop in answering the difficulties and challenges encountered by editors, McGann's essay retains, to my reading, value as a critique. Acknowledging, then, the limitations of the TEI and its OHCO underpinnings, I propose that the multi-dimensional engagement in editing that McGann envisions can be achieved through other means. By conceiving of the entire process of digitization, from transcription to representation, as an act of specialized reading and thus an autopoietic system, the reductive tendency of TEI-informed XML can be mitigated, destabilizing the hierarchical rigidity of the OHCO thesis. The topological perspective that McGann proposes can thus be realized as a process in time rather than just space. As such, the marking of the digital text, which is the encoding of a particular theory of that text, becomes but one potential outcome of the process of digitization. The rigid structure imposed a priori on the text by the TEI is placed in the context of the workflow and decisions that prompted its use. With such a view, editing becomes a heuristic and nonlinear task, one that allows for the preservation and accommodation of ambiguities and complex linguistic structures.

Theorizing process as poiesis

In elaborating his definition of autopoietic systems, McGann (2014) turns to the originators of the theory, Humberto Maturana and Francisco Varela: "If one says that there is a machine M in which there is a feedback loop through the environment so that the effects of its output affect its input, one is in fact talking about a larger machine M1 which includes the environment and the feedback loop in its defining organization" (Maturana and Varela 1980, 78). While McGann's definition of textual autopoiesis focuses on the output of the system that occurs in the act of reading and the relation of that output to the system itself, Maturana and Varela's definition suggests that the nature of the system is defined by its input and the relation of input to output. The autopoietic output is fed back into the system as input, making the system cyclical. Its organization is closed and, as a system, it is stable in its operation.

Building on Maturana and Varela's (1980) model, the entire process of digitization and editing can be understood as an autopoietic system of textual production wherein each iteration of the text functions by producing a subsequent iteration that in turn reinforms the system, forming a new component in the ongoing cycling of the system. The system creates itself as it operates and thus meets McGann's description of an autopoietic system, one that performs "self-maintenance through self-transformation" (McGann 2014, 94). A web of co-dependencies is established between the various instantiations of a text and the editor, who is also incorporated into the system. The process begins like this. The editor has manuscript A, which is electronically transcribed (or OCR'd and proofed) into B. The content of B is encoded in XML, creating document C, which, when rendered according to the accompanying XSLT and/or css file(s), generates and outputs on-screen document D (See Figure 1). This model posits two renderings (D1 and D2) of the same text, derived from options made available in the encoding, as for example, one that retains abbreviations and one that expands them, or in the case I describe below, one that renders cipher in a text, and another which deciphers it.

Figure 1: The process of digitization from manuscript A to rendered text on-screen D, including two transforms.

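To make the two-renderings model concrete: in TEI a single encoding can carry both the abbreviated and expanded forms of a word, deferring the choice of display to the styling stage. A minimal sketch using the standard <choice> mechanism, with illustrative CSS rules in the manner of those shown for the cipher below (one rule per rendering):

<choice>
  <abbr>Sr</abbr>
  <expan>Sir</expan>
</choice>

/* D1: retain abbreviations */
expan { display: none }

/* D2: expand abbreviations */
abbr { display: none }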

The entire process is a system that inputs A and produces D with the editor's engagement in each iteration. The autopoietic nature of the whole is suggested by the heuristic and cyclical nature of the iterative process of interpretation that is essential to scholarly editing. For example, I am reading D1 and notice a potential error that necessitates a correction in C. I return to the transcription B to see whether the error occurred in the encoding in C and to A to see whether the error was in my reading and transcription of the original in B, or whether it was an error in the original document. A determination and correction are then made with reference to these iterations. It may be, for example, that there is an error in the original transcription owing to ambiguity in the author's inscription in A. Should a correct reading remain elusive, it can be appropriately marked and revisited later (perhaps further reading of D1 or D2 will shed new light on the ambiguous text). It may also be that the correction suggests a pattern of error in the way the editor has been reading the source document A, perhaps misinterpretation of a siglum or an abbreviation. With this realization, further corrections can be made, producing subsequent iterations resulting in an improved reading of A. This process of (proof)reading is augmented by the possibility of rendering the encoded text in various ways (such as adding formatting to the encoded features of the text), and thus enabling the reader/editor to see with different eyes.
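
Where a reading remains elusive, the TEI provides ready means of marking that uncertainty in place so the passage can be revisited in a later iteration. A minimal sketch, with invented content, using the standard <unclear> and <sic>/<corr> elements:

<!-- a reading that remains doubtful, flagged for revisiting -->
<unclear reason="ambiguous" cert="low">racks</unclear>

<!-- an apparent error in the source, with a tentative correction -->
<choice>
  <sic>rasks</sic>
  <corr cert="medium">racks</corr>
</choice>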

Figure 2: A model of iterative editing.


The process of revision and correction expands the web of interactions (Figure 2). The revision and proofreading occur with reference to A, B, and C. The correction in B (which follows a rereading of A) results in a new transcription (B1) and XML document (C1). The original transforms are applied to C1, creating new rendered texts (D1 and D2). These new renderings might reveal new possibilities for reading the text, or expose further errors. Additional corrections or revisions will expand the relational web further. One's ability to read and work with the manuscript improves as an increasingly functional electronic text is created. In this way, editing (as Bart (2006) suggests) teaches the editor how to read the document.

This process can be considered an autopoietic system both in the cyclical manner by which it operates and by virtue of the web of co-dependent relations that develops between iterations. Since each iteration becomes a new component of the system that informs subsequent iterations while re-interpreting prior ones, one can say that the process is creating itself through its operation, performing the aforementioned "self-maintenance through self-transformation." Such an editing process is non-linear and proceeds by identifying and utilizing the web of marked relations between iterations of the text. Being non-linear, the process is also one without a fixed point of termination. Each new reader/editor can, on identifying an error in the process or a new relation between iterations, modify the transcription or code, thus reorienting the web of relationships and potentially revealing new patterns of meaning that are read in the context of the entire process. The digital medium supports this process in a way that print does not. Since the act of marking the text in XML—defining its structure according to the TEI Guidelines—is but one stage of the process, the acts of reading and editing are blurred into each other, so that the reader of the rendered iteration can become a new editor, contributing to the functioning of the text.

Collecting Courten

The application of these ideas will be better understood with an example. My work on British Library MS Sloane 3961 is one aspect of a larger project researching the Culture of Curiosity in 16th- and 17th-century England and Scotland (see the project blog at http://digitalarkproject.blogspot.ca/). My role in the project has been transcribing documents, editing and encoding the resulting texts, tagging the contents appropriately, and helping to populate a reference database of names, places, and bibliographic entities mentioned in the documents. Sloane 3961 contains one of several documents included in the project, a record by William Courten (1642-1702) of acquisitions he made to his substantial collection of exotic artefacts, natural curiosities, numismatic fascinations, and art objects (for discussions about Courten, his collection, and the place that this MS has within his extant writings, see Gibson-Wood (1997) and Griffiths (1996)). Courten was a naturalist and collector, who associated with other notable collectors of his day, including Sir Hans Sloane and Elias Ashmole. He was the grandson of the merchant Sir William Courten, who financed the colonizing of Barbados and held the deed to the island for a time. The grandson's collection was built on the items inherited from his father and grandfather, although he increased its size significantly through his own travels and acquisitions made from other merchants, travellers, and collectors. Upon his death, Courten bequeathed his collection to Sloane who, combining Courten's with his own collection (which was already vast), bequeathed it all to the English nation, forming the foundation of the British Museum.

MS Sloane 3961 is a codex of 186 folio leaves containing ledgers, lists, letters, and personal memoranda, most of them related to Courten's collection. Most significant for Courten as a collector are the ledgers, which document the acquisition and sale of his collected objects. These records also bear witness to the social nature of Courten's collecting practices. Names of donors, sources, and recipients of objects punctuate the entries, recording a social collection as much as a natural or antiquarian one. But these ledgers pose significant difficulties for the editor: Courten's use of space and abbreviation is inconsistent; within the tabular form of the ledger, records sometimes curve above or below the line; braces loosely join records; some characters are ambiguously inscribed and easily mistaken for others; cancellations are messily applied; and insertions are forced into too-little space. The records are further and significantly complicated by Courten's frequent (and inconsistent) use of an idiosyncratic cipher, for which an incomplete key is extant (Sloane MS 4019 f.79). The challenge of working with Courten's cipher was mitigated by a processual approach facilitated by the transformability of TEI-XML.

Courten's cipher distinguishes this document from the other catalogues and inventories included in the Culture of Curiosity project. It also presents several unique challenges to the editor seeking to digitize it effectively. How can one best represent the enciphered data, and how will the enciphered text be distinguished from other use of similar symbols? For example, while much of the cipher uses non-alphabetic characters, it also (confusingly) uses Latin letters (for example, "p" to represent "r") and (even more confusingly) sometimes uses a Latin letter as a cipher for itself ("y" represents "y") or a slightly altered form (a longer form of "f" represents "f"). It is important, therefore, to be able to determine what is cipher and what is regular usage of a Latin letter. In order to identify the unique and idiosyncratic cipher characters, a process involving several steps was devised. In the header of the XML document a glyph declaration was made for each cipher character. Each declaration supplies the glyph name, a description, a Unicode character that is graphically similar to the cipher character, and the corresponding Latin character. For example, the cipher character that resembles a forward slash and represents the letter "a" was defined thus:

<glyph xml:id="cslash">
  <glyphName>cipher /</glyphName>
  <mapping type="standard">/</mapping>
  <mapping type="decode">a</mapping>
</glyph>

Each cipher character receives a similar declaration. Each occurrence of a cipher character within the document is tagged with reference to this declaration to distinguish cipher from other uses of glyphs:

<seg type="cipher"><g ref="#cslash"/></seg>

The XML-encoded text is transformed by the accompanying XSLT style-sheet into a form the CSS can address, and the CSS in turn instructs the browser how to render the document. The XSLT style-sheet, which is called in the header of the XML document, is supplied with the contents of the glyph declarations, allowing for multiple transformations in the rendering of the text. In the case of the above cipher character, I used the following CSS definitions:

g[ref="#cslash"]:before { content: "/"}

or

g[ref="#cslash"]:before { content: "a"}

Defining the content of the cipher in this way allows the editor/reader to display either the representative characters (the cipher "/") or their Latin-alphabet equivalents ("a"), or both. The encoded document thus makes the manuscript readable. For example, this

Figure 3: A sample of enciphered text.


Becomes

Figure 4: The same text, deciphered.


Styling can also be used to represent for the human eye the difference between an "f" that is used as a cipher and an "f" that is used as an abbreviation for "fecit," for example. The former is wrapped in <seg type="cipher"> and the latter in <abbr>. In this way the encoded document enables proofreading (one can be rendered red, and the other blue, for example); moreover, as we will see, the encoded document in fact teaches the editor how to read it.
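
Following the colour example just given, a pair of illustrative rules in the manner of the content definitions above is enough to make the distinction visible while proofreading:

seg[type="cipher"] g { color: red }  /* cipher characters */
abbr { color: blue }                 /* abbreviations such as "f" for "fecit" */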

Encoding to decipher

The work of properly deciphering and encoding the cipher was enabled by the autopoietic process outlined above. My ability to read and understand the cipher, and thus encode it accurately, improved because the process of encoding was not a mere defining of textual elements, but a process of tentative reading and experimentation with reference to previous iterations of the text. After tagging each cipher character transcribed from the manuscript according to the XML glyph declarations, I rendered the text using the deciphered Latin-alphabet representations and, beginning with the assumption that the enciphered content was mostly correctly-spelled English words, I was able to check the accuracy of my transcription against one potential rendering of Courten's cipher. When I encountered a misspelled English word among the deciphered text, my course of action was this: I transformed the rendered document back into the Unicode cipher equivalents and checked the encoded transcription against the original manuscript. By this comparison I could judge whether my transcription was incorrect or whether it accurately recorded a misspelling by Courten. If the error was my own, I made the correction in the XML document with reference to the manuscript and rendered the document again. Perhaps more importantly, when the translated version of the text revealed a string of text that did not appear to be a word, I went back to the original document to determine whether I had misinterpreted and misattributed a glyph. In this case, it was the glyph declaration that needed correction. This process had to be repeated several times, but through it I was able to eventually arrive at an accurate interpretation and rendering of the manuscript.
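
The switching between cipher and deciphered renderings described here can also be scripted rather than re-edited by hand. A minimal XSLT sketch, assuming the glyph declarations sit in the document header as shown above; the $mode parameter ("standard" or "decode") is an illustrative name, not part of the project's actual stylesheet:

<xsl:param name="mode" select="'standard'"/>
<xsl:template match="g">
  <!-- emit whichever mapping of the declared glyph $mode selects -->
  <xsl:variable name="id" select="substring-after(@ref, '#')"/>
  <xsl:value-of select="//glyph[@xml:id = $id]/mapping[@type = $mode]"/>
</xsl:template>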

This process also enabled me to read other obscure features of the text. Some instances of apparent spelling errors drew my attention to the difference between the regular "f" glyph, the longer form of "f" used as a cipher, and an "f" used as an abbreviation for "fecit." Unsatisfactory translations forced a reconsideration of some instances that turned out not to be glyphs but abbreviations: these determinations were made possible by the context of their usage. These realizations led to corrections and altered encoding that recognized these distinctions, resulting in a more legible text, and an increased reading capacity in the editor. By preserving the presence of the cipher while providing the ability to decipher the text, TEI-based XML allows this edition to perform the autopoietic nature of the text; it preserves the interaction between creator and text, reader and text. It acknowledges that meaning-making in the text is dependent on an understanding of the system that feeds into it.

On a deeper level of reading, XML and its ability to transform the contents of the rendered document allow for the presentation of multiple voices marked by the cipher. Reading the document is not simply a matter of making substitutions for the cipher so the document can be literally read; reading also requires discerning why cipher was deployed in certain situations and not others, and therefore the meaning of the act of enciphering in each instance. For example, Courten often used the cipher to indicate differences in time, identity, or intent. Thus it is used to distinguish personal memoranda from business transactions, mark entries made at different dates, or differentiate an amount paid from a valuation. For example, folio 56r contains this entry immediately prior to the final tally:

Figure 5: Courten's household accounts. British Library Sloane 3961, f. 56r.


Deciphered, it reads: "for 2 small racks for, ye. roasts, meat, in, my, chamber." In this example, Courten's use of the cipher is consistent and straightforward, in that he does not mix cipher and Latin characters together (which he sometimes does). A simple substitution makes grammatical sense of the encoded string. But the reason for the enciphering is less clear: this string does not appear to contain sensitive information that would require obscuring or encryption, as one might expect. In this case, the racks are meant for use in Courten's chamber and, as such, may represent a private purchase, or a purchase made for domestic purposes recorded in the midst of the other transactions pertaining to collectible objects. The presence of the cipher, then, becomes a signal Courten left for himself of the different character of this particular transaction.

Folio 43r contains an example of differences of time being marked. In the page header the following is written (Figure 6):

Figure 6: British Library Sloane 3961, f. 43r.


Deciphered, the second line reads: "April ye: 15th: 1691 not catalogued." This example marks two different actions occurring at different times. The shells listed on this page were purchased of one Mr. Jackson and, presumably, not properly catalogued or entered into the collection. Knowing that the cataloguing would have to be done at some point, Courten marked this in cipher with the date. The presence of the cipher, then, serves to signify the incompleteness of the act while it anticipates the future completion. Accordingly, the third line indicates that the shells were indeed catalogued on September 12th of that year (Courten abbreviated September, October, November, and December according to their respective Latin prefixes: 7ber, 8ber, 9ber, Xber).

The cipher characters were also frequently used to indicate differences in value for any given artefact. The following example from folio 9r ("Sold out of ye. Painters statuarys, grauers, &c.") demonstrates the interplay of difference that the cipher characters can signify, even when they are not used to encipher words:

Figure 7: British Library Sloane 3961, f. 9r.


The first entry "Titian by Aug. Caraccio" was apparently sold for 5 shillings, but is marked with the enciphered note "losse s1s." Each of the subsequent artefacts is marked with a value in shillings or pence, indicated with "s" and "ↄ" (being the cipher equivalent for "d" and thus an abbreviation for pence). These entries are bracketed together indicating the items were collectively sold for 3 shillings. This whole entry is also marked with "σ ↄ2ↄ." I read this as follows: "ↄ2ↄ" indicates 2 pence, a formula that can be seen throughout the list of artefacts, where "s#s" indicates a value in shillings. The "σ" is the cipher equivalent of "p." In the context of the "losses" noted above, I would suggest that "σ" is an abbreviation for "paid," thus the cipher is also serving as an abbreviation. So while the "Titian" painting was sold at a loss for 5s, the other 5 artefacts, which (it seems) were purchased for a mere 2d, were sold for 3s. The presence of the cipher characters, used to encipher and to abbreviate, can thus suggest the dynamics of economy and fluctuating value.

Stand-off markup

External, or stand-off, markup presents an alternative method of engaging the textual system, enabling a different interaction with the text. I would like to briefly discuss this idea in relation to the method of autopoietic editing I have discussed above, showing how, in the particular case of the Courten manuscript, processual and iterative editing can better serve the particular challenges of this document than could an external markup system. Stand-off markup, which has been promoted by scholars such as Peter Shillingsburg (2006), Paul Eggert (2009), and, in conversation, Peter Robinson, is a method of marking and rendering a digital text in which the editor does not modify the actual transcription file (as with inline markup). Rather, externally defined tagsets contained within discrete documents are applied to the read-only document containing the verified transcription of the text. An XML-compliant composite file is created by the process (for a helpful overview and explanation of stand-off markup and the issues surrounding its use see Eggert (2009)). The advantages of such a system are numerous. The transcription file contains only the text being edited, without any addition of interpretive marking. XML files marked inline, as Eggert writes, "mix data with data referring to the data and with data referring to itself, all in the same file" (Eggert 2010, par. 41). The more complicated the coding becomes, the greater the concern that the accuracy of text marked inline will be affected, and the more difficult the text becomes to check for error. As Eggert points out, "in the standard paradigm—i.e. using in-line markup—every addition of markup to a text-file necessarily creates a new state of the text. Text-files can quickly become so heavily encoded as to be beyond human capacity to proof-read them" (Eggert 2010, par. 41). Further, inline markup also presents a challenge to editors coming late to a project. The system of tags and particularities of the text must be learned before significant contribution can be made. Any subsequent editors of the Courten manuscript, for example, will face a steep learning curve as they absorb the methods with which it has been marked.
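
To make the contrast concrete, a schematic sketch of the stand-off arrangement; the element and attribute names here are illustrative, not those of any particular project's schema. The verified transcription is left untouched, and a discrete file points into it:

<!-- transcription.xml: the read-only, verified text -->
<seg xml:id="s042">for 2 small racks</seg>

<!-- markup.xml: an externally defined tagset, applied at rendering time -->
<annotation target="#s042" type="cipher"/>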

Stand-off markup is able to answer the critique of hierarchical markup discussed above, as it is able to accommodate overlap of textual elements. According to Shillingsburg, stand-off markup "provides a preliminary way to deal with conflicting overlapping structural systems, a currently fatal weakness in the SGML-XML implementations" (Shillingsburg 2006, 120). To render the electronic text, the user selects the external tagsets (or markup definitions) that will be applied to the transcription. Each tagset specifies a uniquely-marked form of the document. Shillingsburg suggests that "in cases where the selection of multiple markup files results in a perspective in which there would be conflicting overlapping structures, the user will be informed of this fact and given a choice of viewing first one and then another perspective because XML is unable to use both at once" (Shillingsburg 2006, 120). Shillingsburg's suggestion here—that different instantiations of the text can be generated concurrently and presented successively—is not unlike the discussion above about the transformability enabled by XML, in that the text is considered as an iterative process. In both cases the limitations of a hierarchical system of markup can be mitigated, though as presented here they are solutions to different problems.

Though stand-off markup presents many opportunities and seems able to effectively resolve some difficulties that inline XML markup entails, its effectiveness does depend on the particular text and the purpose for which the text is being prepared. In the case of the Courten manuscript—and other documents that may contain similar idiosyncrasies—stand-off markup would not have allowed for the work that I have done with it. External markup requires from the outset a trustworthy transcription to which the markup can be applied. Such a setup, as Eggert points out, is ideal for the purposes of authentication, copy protection, and enabling worksites with multiple editors providing markup for the text (Eggert 2010). For my work on the Courten manuscript, it was known that the transcription was incomplete and contained errors from its first draft. It was also unclear how the transcription could or should be read in its enciphered form. The transcription only became trustworthy as I read, edited, marked and revised it, as it was the process of marking-up the transcription and tagging and encoding its contents that improved both its accuracy and its readability.

The comparison of iterative XML markup with stand-off markup reveals different assumptions about the constitution and functionality of text, and I think these are instructive to highlight. Stand-off markup emphasizes the integrity of a text that exists and functions discretely from the acts of editing and reading. The processual and iterative methods of editing that I have described here assume that reading and editing, acts which are not exclusive of each other, are intimately involved in the functioning and meaning-making of the text. The reader/editor, being a component of the textual system, is an essential part of its functioning. As such, the acts of reading and editing leave their trace on the textual system. Editing that enables reading means that the act of reading alters the text, which in turn results in further acts of editing as the system functions autopoietically. This model of XML markup makes explicit the activity of the reader/editor, marking the ambiguity of language across iterations of text while remaining compliant with the TEI Guidelines.


Works Cited / Liste de références

Bart, Patricia. 2006. "Experimental markup in a TEI-conformant setting." Digital Medievalist 2.1. http://www.digitalmedievalist.org/journal/2.1/bart/.

Buzzetti, Dino. 2002. "Digital representation and the text model." New Literary History 33: 61-88.

Buzzetti, Dino, and Jerome McGann. 2006. "Critical editing in a digital horizon." In Electronic textual editing, edited by John Unsworth, Katherine O'Brien O'Keeffe, and Lou Burnard, 53-73. New York: Modern Language Association of America.

Eggert, Paul. 2009. "The book, the E-text and the 'work-site'." In Text editing, print and the digital world, edited by Marilyn Deegan and Kathryn Sutherland, 63-82. Farnham: Ashgate.

───. 2010. "Text as algorithm and as process." In Text and genre in reconstruction: Effects of digitalization on ideas, behaviors, products, and institutions, edited by Willard McCarty, 183-202. Open Book Publishers. http://books.openedition.org/obp/660.

Fiormonte, Domenico, Valentina Martiradonna, and Desmond Schmidt. 2010. "Digital encoding as a hermeneutic and semiotic act: The case of Valerio Magrelli." Digital Humanities Quarterly 4.1. http://www.digitalhumanities.org/dhq/vol/4/1/000082/000082.html.

Gibson-Wood, Carol. 1997. "Classification and value in a seventeenth-century museum: William Courten's collection." Journal of the History of Collections 9.1: 61-77.

Griffiths, Anthony. 1996. Landmarks in print collecting: Connoisseurs and donors at the British Museum since 1753. London: British Museum.

Hayles, Katherine. 2003. "Translating media: Why we should rethink textuality." Yale Journal of Criticism 16.2: 263-290.

Maturana, Humberto and Francisco Varela. 1980. Autopoiesis and cognition: The realization of living. Boston: D. Reidel.

McGann, Jerome. 1991. The textual condition. Princeton, NJ: Princeton University Press.

───. 2014. "Marking texts of many dimensions." In A new republic of letters: Memory and scholarship in the age of digital reproduction, 90-112. Cambridge, MA: Harvard UP.

Renear, Allen. 2004. "Text encoding." In A companion to digital humanities, edited by Susan Schreibman, Ray Siemens, and John Unsworth, 218-239. Malden, MA: Blackwell Publishing.

Robinson, Peter. 2009. "What text really is not, and why editors have to learn to swim." Literary and Linguistic Computing 24.1: 41-52.

Shillingsburg, Peter L. 2006. From Gutenberg to Google: Electronic representations of literary texts. New York: Cambridge University Press.
