Architectures for Volatile Hypertext

Mark Bernstein
Eastgate Systems, Inc.
134 Main Street
Watertown, MA 02172 USA
+1 (617) 924-9044
Jay David Bolter
Department of Literature, Communication and Culture
Georgia Institute of Technology
Atlanta, GA 30332-1065 USA
Michael Joyce
Jackson Community College
2111 Emmons Road
Jackson, MI 49201 USA
Elli Mylonas
Project Perseus
Harvard University
319 Boylston Hall
Cambridge, MA 02138 USA

1. INTRODUCTION

We wish to explore architectures for creating and using volatile hypertexts— dynamic documents whose content and structure are subject to rapid change. While static hypertext system architectures [Oren 87] emphasize effective presentation and exploration of stable documents, volatile hypertext systems emphasize a continual process of construction, deconstruction, and reconstruction [Joyce 88].
Volatile hypertexts raise a fundamental theoretical issue: what is the value and proper role of the link? Astonishingly, no consensus has emerged on this central hypertext question, even within the hypertext research community. Where Bolter [Bolter 91a] views rich webs of links as a liberating force that reshapes the constraints of artificial, linear-hierarchical authority, Glushko sees fruitful sources of confusion, writing that “limiting the links in the first place seems a more practical solution” [Glushko 89]. Indeed, DeYoung considers linking to be harmful [DeYoung 90]. In working with volatile hypertext, we deliberately choose an extreme case in which dedicated readers and writers are necessarily faced with rich, complex, and irregular hypertext webs. Can this task be rendered manageable?
The complex roles which volatile hypertexts serve defy facile attempts at measurement. Static hypertext can be designed for use by casual readers, from whom little intellectual energy is expected and whose interest in the document may be transient or perfunctory. Museum kiosks, software help systems, reference handbooks and advertising brochures all demand an essentially static design in which magisterial authority is vested in the author. The success of such hypertext is, in principle, comparatively simple to measure (although actual measurement can be quite difficult [Shneiderman 89] [Nielsen 90]):
Does the hypertext attract readers?
Do the readers say they like or dislike the system?
Do readers use the hypertext for the desired length of time, neither too long nor too short?
Do readers find the isolated fact they require?
Does the hypertext close the sale?
These questions are not without importance, yet their scope contrasts starkly with earlier visions of non-linear reading, visions of personal liberation [Nelson 76] and intellectual augmentation [Engelbart 63]. The complex aims of even such simple hypertext as 1912 [Bernstein 88], Inigo Gets Out [Goodenough 87] or A Sucker in Spades [DiChiara 88] clearly require a more subtle and flexible critical approach [Andersen 90].
Volatile hypertexts— on paper or on computer— are created and used by thoughtful, deeply involved readers who need to wrestle with complex and incompletely-understood ideas. For example, as scholars explore and describe the internal structure and inter-relationship of texts— historical ephemera, inscriptions, manuscripts, novels, poems, and plays— new connections constantly appear; the shape of the discourse changes as new ideas, interpretations, and structures augment earlier understanding. Journalists, intelligence analysts, planners and legislators constantly filter and organize texts to convey new understanding, formulating coherent stories from incoherent information or reformulating existing structures to convey new perspective. Early drafts of well-crafted documents are volatile, because thoughtful authors revise and experiment.
We ask not whether links can appeal to students, children, or tourists (in short, to novices) but whether they can help us address our most difficult intellectual challenges.

1.1 Implementation

Throughout this discussion, we use the term writing space to denote a hypertextual unit, whether a card, article, or chunk. A hypertext is a unified work— a body of writing spaces offering the surface unity traditionally designated in print culture as a book, article, or novel. A link connects writing spaces; links with a single beginning and a single end are univalent while those with multiple sources or destinations are multivalent.
In this paper we describe two interlinked environments, Storyspace and Lynx, for writing and exploring volatile hypertexts. Storyspace [Bolter 87] [Storyspace 91] was designed as a tool for the process of writing and is thus related to WE [Smith 87] and to the Andrew hypertext suite [Neuwirth 89]. Like Intermedia [Yankelovich 88], Storyspace anchors its univalent links and multivalent paths at text spans. Storyspace is most at home as a tool for writing and revising, and has been widely used for writing instruction and for composing conventional books (e.g. [Bolter 91]) as well as for writing hypertext ([Joyce 90] [Bolter 91a]).
Lynx, an experimental prototype, explores the role hypertext might play in the earliest phases of writing— the stage in which issues are formulated and revised, ideas are gathered and discarded, and concepts arrange themselves fluidly within an ill-defined conceptual space. Lynx favors small pages which typically contain jottings, passages copied from other documents, or chunked extracts from relevant books and databases. Univalent and bivalent links connect entire writing spaces; since these spaces are small, more specificity is unnecessary. Lynx, like NoteCards [Halasz 87], is most at home as a tool for extracting, digesting, and exploring textual information, for finding new insights and new ideas which we might later express with a more conventional tool. In contrast to almost all previous hypertext systems, Lynx provides only feeble presentation facilities; it is a private note pad, not a publishing medium.

2. LARGE-SCALE STRUCTURE

While early hypertext systems attempted to display global hypertext structures, comprehensive maps were soon found to be difficult to implement and problematic to use [Fairchild 88] [Utting 90]. Local navigation aids were discovered to be more tractable [Walker 88] and demonstrated to be more helpful [Walker 90], leading to a general loss of interest in large-scale structure. The dynamic context of volatile hypertext makes construction and discovery of large-scale order a pressing concern once more; when everything is changing all the time, we need instruments to visualize large structures and computational tools to bend them to our will.
Figure 1. A Storyspace writing space that contains six writing spaces.

2.1 Constructing and Manipulating Structure

Storyspace uses a simple spatial metaphor to disclose the changing, dynamic structure of hypertext of moderate scale— that is, hypertext the size of books. Verbal/graphic elements or writing spaces are represented by rectangular icons arranged in a display window on the screen. Each writing space has two roles: it contains text (including pictures) and it occupies a place in the evolving structure of elements. The visual representation of the writing space echoes this duality; every space contains a textual component (the title bar) and a structural component— a place that can hold additional writing spaces. The title bar opens on command to reveal a window in which a writer may view and edit the text inside the writing space; the structure area opens on command to reveal a window in which the writer may view and edit other writing spaces contained within the first writing space. As seen in Figure 1, the horizontal and vertical placement of the boxes maps the logical structure of the hypertext. Dragging a space to a new position changes the reading order.
The duality of the writing space is fundamental to Storyspace. In a hypertext, structure is content: the relationships among the spaces are as much a part of the content as the words or graphics themselves. Users manipulate structure directly, by arranging, opening and closing spaces, leading writers to see their work as a set of meaningful components or topics, each represented as a concrete object. Writing spaces are easier to manipulate than chunks of continuous writing precisely because the writer identifies each space as a discrete symbolic entity; the natural seams of a Storyspace hypertext, the places most amenable to rupture, revision and refinement, are identified for future collaborators even as their original author creates them. In watching writers— especially developmental writers— use Storyspace, we notice a remarkable inversion of the familiar tension between literalism and magic in user interface design [R. Smith 89]; here, the familiar representation (printed text) is invested with a mystical aura of tradition and authority while the non-literal (and hence magical) interface seems tame.
Storyspace shows hierarchical structure by placing one space inside another. When the user drags space A into space B, A shrinks to fit inside B. This action indicates that A is now subordinate to, or part of, B. The user can locate as many spaces as he or she likes inside one superordinate space. Thus, the area under the title bar of a space is itself a writing space of indefinite size. (Figure 2)
Figure 2. Two views of the same document, describing a cluster of archæological sites.
Because large tree diagrams are awkward, the Storyspace map systematically hides information. In any one view, Storyspace shows two levels of the structure and only a selected portion of the (possibly-large) plane. To examine hidden information, the user changes perspective by scrolling, by zooming, by diving deeper into the structure or by climbing to a higher plane. Readers may shift their current point of view and may open new windows to view the same material from a different vantage point; for example, one window might present a broad vista of a tree diagram at tiny magnification (compare [Halasz 87] Figure 3) while others provide close-up maps of interesting landmarks. Each view is fully alive: readers can open, move, or link a writing space in any view.
A Storyspace link may interconnect spans of text, regions of graphics, or entire writing spaces. A link appears in the map as an arrow from the source space to the destination (Figure 3). If both the source and destination are visible on the current level, then the full link (shaft and arrowhead) is shown, running between adjacent sides of the two spaces. If source and destination are not both on the current level, only an abbreviated incoming or outgoing arrow appears.
Figure 3: Storyspace links. By disclosing some links and partially hiding others, Storyspace preserves the expression of both hierarchical and cross-hierarchical structure.
This link-drawing convention represents a sensitive balance. If every link were shown in complete detail, links would quickly cover the entire map. Judicious use of the hierarchy helps Storyspace manage the links; by revealing the structure of the document to the system, the author suggests which links need to be drawn in greatest detail and which may be abbreviated.
Other hypertext systems have adopted similar strategies for handling complexity and rapid change. In NoteCards, for example, filebox and browser cards allow the user to represent and manipulate the structure of the document [Halasz 87] although the presentation and manipulation of information is quite different. More fundamentally, the NoteCards notion of static type assignment segregates information units and structural views, while in Storyspace each unit is intrinsically both a text and structure. Like the Storyspace map, Intermedia’s elegant Web View [Utting 89] tracks the user’s path through the hypertext and also presents a diagram of links and representation of local structure. The designers of Intermedia, however, made a conscious decision that the Intermedia map should show only local context, relegating the map to the role of an auxiliary navigation aid while Storyspace insists on the explicit unity of content element and the structure.

2.2 Discovering Large-scale Structure

While [Utting 90] finds that “the link structure of the web has no inherent correlation with the user’s concept of how documents are related”, we nevertheless retain the hope that careful study of thoughtful hypertext will reveal intriguing and meaningful structure. Critical examination of these structures may, in turn, lead to a new vocabulary for discussing both static and volatile webs, and perhaps to new insights into the creation and depiction of webs.
Figure 4 shows a Lynx Linkplot view of Afternoon, a story [Joyce 90], a hypertext novel which was written with Storyspace. In Afternoon, Joyce avoids even the suggestion of hierarchical structure, embedding narrative elements in a complex web of connections. Few readers can offer a concise view of Afternoon’s structure, and even the author compares the effect of multiple readings to a bramble:
These are not versions, but the story itself in long lines. Otherwise, however, the center is all— Thoreau or Brer Rabbit, each preferred the bramble. I’ve discovered more there too, and the real interaction, if that is possible, is in the pursuit of texture. There we match minds. [Joyce 90]
Yet even a casual glance at Figure 4 reveals a wealth of structure: complex, incomplete, elusive, but unmistakable.
Figure 4. Linkplot of Afternoon [Joyce 91]; each dot represents a single link.
The linkplot represents each hypertext link as a dot. Every writing space is assigned a number in sequence (in this case, the number is based on the position of the writing space in Joyce’s final working draft), and a link between, say, writing space #237 and #463 is represented as a dot at {237,463}. This diagram is simply the graph-theoretic adjacency matrix; raising this matrix to, say, the seventh power yields a plot that reveals places in Afternoon which are separated by no more than seven links (Figure 5).
Figure 5. Augmented link plot of Afternoon; each dot represents places separated by no more than seven links.
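The linkplot construction can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Lynx implementation: the function names and the four-space toy hypertext are hypothetical. One caveat is worth making explicit: the k-th power of the bare adjacency matrix marks pairs joined by a walk of exactly k links, so to mark pairs separated by no more than k links the sketch adds the identity matrix before multiplying, which is one common way to obtain that effect.

```python
def linkplot(n, links):
    """Return an n x n boolean matrix with True at [source][destination]."""
    plot = [[False] * n for _ in range(n)]
    for src, dst in links:
        plot[src][dst] = True
    return plot


def bool_matmul(a, b):
    """Boolean matrix product: entry (i, j) is True if some k joins i to j."""
    n = len(a)
    return [[any(a[i][k] and b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def within(plot, k):
    """Mark pairs (i, j) where j is reachable from i in at most k links."""
    n = len(plot)
    # Adding the identity lets a walk "stand still", so shorter paths survive.
    step = [[plot[i][j] or i == j for j in range(n)] for i in range(n)]
    result = [[i == j for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = bool_matmul(result, step)
    return result


def render(plot):
    """Draw the plot as text: one row per source, '#' for each dot."""
    return "\n".join("".join("#" if cell else "." for cell in row)
                     for row in plot)


# Hypothetical toy hypertext: a chain 0 -> 1 -> 2 -> 3 plus a link 3 -> 0.
toy_links = [(0, 1), (1, 2), (2, 3), (3, 0)]
toy = linkplot(4, toy_links)
print(render(toy))
print()
print(render(within(toy, 2)))
```

With sources as rows, a writing space that links to many destinations produces a horizontal run of dots, and a space that many others link to produces a vertical run.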
Dark areas in the linkplot identify clusters of writing spaces which are closely associated, while light areas highlight spaces which are sparsely interlinked. Figure 5 clearly suggests a division of Afternoon into three parts; two large clusters densely linked to themselves and to each other appear at the corners, separated by broad white bands in which few links appear. This central band represents an extended central word poem, a word maze whose rhythm readers immediately recognize as different from other parts of the work. The broken central diagonal (Figure 4) indicates that Joyce’s working notes were arranged in a sequence resembling one possible reading sequence; gaps in the diagonal reveal disparities between the sequence of working notes and the link structure. Isolated horizontal lines signal points of departure [Landow 87]— places that allow or require readers to change their current center of attention, while vertical lines herald points of arrival— places at which readers may arrive from widely disparate sites. Extremely dark areas near the diagonal often reflect episodic units; the dark square immediately following the poem (Figure 5), for example, includes scenes from the narrator’s lunch with his mentor.
Unless all links are bidirectional, the linkplot will not be symmetric; points above the diagonal represent links that jump “forward” in the text, while points below the diagonal represent links that jump “backward” with respect to the (perhaps arbitrary) ordering that underlies the diagram. It is interesting to compare the approximate symmetry of Afternoon with the asymmetry of Bolter’s Writing Space (Figure 6), despite the systematic avoidance of bidirectional links in both works. The contrasting styles of Joyce and Bolter are reflected in the symmetries; Afternoon is quasi-symmetric because in it we follow broad, elliptical trajectories with complex patterns of repetition and renewal, while many sections of Writing Space contain prominent, quasi-hierarchical navigational cues, cues reflected by prominent horizontal “lines of departure” (see [Landow 87]). Interestingly, in the section of Writing Space which Bolter calls the “shadow text” (the hypertextual commentary absent from the linear version [Bolter 91]), the link plot is more nearly symmetric.
Figure 6. Linkplot of Bolter’s Writing Space [Bolter 91a].
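The notion of approximate symmetry without bidirectional links can be illustrated numerically. The sketch below is only one possible reading: the block-based symmetry score is an assumed measure introduced here for illustration, not a statistic the paper defines, and the sample links are hypothetical. It counts forward and backward links, then compares link density in coarse blocks mirrored across the diagonal.

```python
from collections import Counter


def direction_counts(links):
    """Count links that jump forward (dst > src) and backward (dst < src)."""
    forward = sum(1 for src, dst in links if dst > src)
    backward = sum(1 for src, dst in links if dst < src)
    return forward, backward


def block_symmetry(links, block):
    """Score in [0, 1]: how well coarse blocks mirror across the diagonal.

    Writing spaces are grouped into runs of `block` consecutive numbers.
    A score of 1.0 means every block holds as many links as its mirror image.
    """
    cells = Counter((src // block, dst // block)
                    for src, dst in links if src != dst)
    total = sum(cells.values())
    if total == 0:
        return 1.0
    matched = sum(min(count, cells[(j, i)])
                  for (i, j), count in cells.items())
    return matched / total


# Hypothetical links: none is bidirectional, yet the coarse pattern mirrors.
sample_links = [(0, 7), (6, 1), (2, 3)]
print(direction_counts(sample_links))
print(block_symmetry(sample_links, block=1))
print(block_symmetry(sample_links, block=4))
```

At a block size of one the score only rewards exact reverse links; larger blocks capture the looser sense in which a plot can look symmetric even when no individual link is reversed.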

3. MEDIUM-SCALE STRUCTURE

Presentation of smaller, more focussed structure is the best understood hypertext domain, and a wide array of presentation tools are discussed in [Conklin 87] [Bernstein 88] [Nielsen 90]. We therefore confine our remarks to the discovery and elucidation of new structure.
The quest for a deeper understanding of complex texts is the definitive mark of scholarship, although journalism, market analysis, intelligence assessment, and the law also depend upon this demanding skill. Lynx seeks to bridge the gap between traditional and computational methods by providing fluid, impressionistic tools to supplement quantitative stylometric approaches. These tools are not intended to generate a batch of measurements for the scholar to analyze, but rather to offer a stream of suggestions and associations for continual review and refinement.

3.1 Search and Query

Search and query tools help readers locate information that they know exists and that they can describe. In its simplest form, a search command collects writing spaces which share a specified word, phrase, or lexical pattern. In some implementations, the Search command creates an ephemeral, multivalent link from which the reader selects one or more destinations; in others, the reader moves to the “next” occurrence of the pattern according to some sequence imposed on the hypertext. More elaborate queries allow readers to search for patterns of connection as well as lexical patterns [Croft 89], or permit incremental refinement of queries [Frisse 89].
Search is most effective where the reader wishes to revisit a half-remembered place in the hypertext, or where a reader can predict some properties of a writing space which she hopes to find. Skill and expertise are needed to formulate an effective query, and much research has been devoted to designing query languages, man-machine interfaces, and expert systems to aid this process [Salton 83][Salton 86].
Query-based strategies require readers to pose an explicit question and assume that the query actually reflects the reader’s needs. While query-driven reading is not unknown, we believe it to be less common than is sometimes assumed. Readers do consult books to find the answer to a specific question, but their true motives are not always explicit. Finding the right question to ask is an elusive goal; while teachers allow students to ask questions, question-and-answer sessions are rarely the dominant form of instruction. Insightful questions are the mark of a master, not of an apprentice.
Query-based tools play two key roles in highly dynamic hypertext. First, queries help readers relocate material which has been misplaced; since the hypertext structure changes frequently, it is often easier to relocate a writing space by describing part of its contents than by describing its remembered location. Second, queries are the basic tools for information mining— for extracting useful and pertinent data from a larger collection. Search facilities in Storyspace and Lynx emphasize these dual roles. To help information miners, Lynx queries create ephemeral multivalent links, collecting sets of related writing spaces. To help writers and analysts, the Storyspace path builder translates queries into new paths through writing spaces.
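A search that yields an ephemeral multivalent link might be modeled as follows. This is a hedged sketch: the `WritingSpace` and `MultivalentLink` types, the `search` function, and the sample notes are all hypothetical names chosen for illustration and do not describe the actual Lynx or Storyspace data structures.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class WritingSpace:
    title: str
    text: str


@dataclass(frozen=True)
class MultivalentLink:
    """A link with one source (the query) and many destinations."""
    label: str
    destinations: tuple


def search(spaces, pattern):
    """Collect every writing space whose text matches a lexical pattern."""
    regex = re.compile(pattern, re.IGNORECASE)
    hits = tuple(space.title for space in spaces if regex.search(space.text))
    # The link is ephemeral: it is returned to the reader, never stored.
    return MultivalentLink(label=f"search: {pattern}", destinations=hits)


# Hypothetical notes for illustration only.
notes = [
    WritingSpace("note-1", "The brick pyramid of Asukhis"),
    WritingSpace("note-2", "Phoenician expertise in canal building"),
    WritingSpace("note-3", "Kheops builds a Pyramid"),
]
link = search(notes, r"pyramid")
print(link.label, link.destinations)
```

Because the result is rebuilt on every query, it stays valid however often the surrounding structure changes, which is the property that makes search useful for relocating misplaced material.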

3.2 A Link Apprentice

Link apprentices locate sections of the hypertext which resemble a specific text of interest by using broad lexical measurements [Bernstein 90]. Conventional queries are focussed and therefore brittle, relying on the presence (and the reader’s recognition) of unusual key words or phrases. Much can be gained by taking into account all the words (and word fragments) that occur in a writing space, rather than focusing exclusively on isolated exceptional cases. For example, the words “set”, “union”, “prove”, and “intersection” might appear in any piece of English text, but their joint appearance suggests a mathematical treatise.
The Lynx link apprentice follows [Bernstein 90] in measuring the similarity between two writing spaces. Whenever a writing space is created or modified, each word, and each left-substring of a word, is repeatedly hashed into a small hash table or Bloom filter [Knuth 73]. The uncompressed table is moderately sparse— that is, there are many more bits in the table than words in a typical writing space; thus if the hash tables for two writing spaces share an unusually large number of bits, it is likely that they share many words or word roots. We have experimented with a variety of similarity measures including the normalized dot product, binomial probability, and coefficient of correlation between the two hash tables. Only minor differences in behavior were noted, although metrics that lend weight to the absence of common words (e.g. Hamming distance) performed poorly. The apprentice deliberately ignores both the number of times a word occurs and the frequency with which the word is found in the text— while this information can in principle be useful [Salton 90], we deliberately wish to place no special weight on repetitive or exotic terms.
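The measurement just described can be approximated in a short sketch. Everything specific here is an assumption made for illustration: the table size, the number of hash functions, the use of SHA-256, and the minimum prefix length of three letters are not taken from the paper, and the function names are hypothetical. The sketch hashes each distinct word and its left-substrings into a bit table and compares two tables with the normalized dot product.

```python
import hashlib
import math
import re

TABLE_BITS = 2048  # assumed table size; the paper says only "small"
HASHES = 2         # assumed number of hash functions per fragment
MIN_PREFIX = 3     # assumed shortest prefix kept, to limit noise


def fragments(text):
    """Yield each distinct word and its left-substrings (prefixes)."""
    for word in set(re.findall(r"[a-z]+", text.lower())):
        start = min(len(word), MIN_PREFIX)
        for end in range(start, len(word) + 1):
            yield word[:end]


def signature(text):
    """Return the set of table bits switched on by a writing space.

    Word counts are ignored on purpose: a bit is either set or not.
    """
    bits = set()
    for frag in fragments(text):
        for seed in range(HASHES):
            digest = hashlib.sha256(f"{seed}:{frag}".encode()).digest()
            bits.add(int.from_bytes(digest[:4], "big") % TABLE_BITS)
    return frozenset(bits)


def similarity(a, b):
    """Normalized dot product of two bit tables (shared bits only)."""
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))


matt = ("Neither do men light a candle, and put it under a bushel, "
        "but on a candlestick")
luke = ("No man, when he hath lighted a candle, putteth it in a secret "
        "place, neither under a bushel, but on a candlestick")
other = "The Phoenicians showed exemplary expertise in canal building"

print(similarity(signature(matt), signature(luke)))
print(similarity(signature(matt), signature(other)))
```

Because only shared bits contribute to the score, the absence of a word from both passages counts for nothing, in keeping with the observation that measures such as Hamming distance performed poorly.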

3.2.1 Finding the Question

While insightful questions are hard to frame, we all know that children and students do ask delightful and intriguing questions— questions that raise issues deeper and more complex than the questioner ever anticipated. Like naive but willing pupils, link apprentices can lead us to explore deep questions by seizing upon regularities and anomalies of which they know nothing.
While reading a translation of Herodotus’ Persian Wars, for example, we might find a passage that seems odd or remarkable. We might choose the following:
There are not many marvelous things in Lydia for me to tell of, in comparison with other countries, except the gold dust that comes down from Tmolus. But there is one building to be seen there which is much the greatest of all, except those of Egypt and Babylon. In Lydia is the tomb of Alyattes, the father of Croesus, the base of which is made of great stones and the rest of it of mounded earth. It was built by the men of the market and the craftsmen and the prostitutes. There survived until my time five corner-stones set on the top of the tomb, and in these was cut the record of the work done by each group: and measurement showed that the prostitutes’ share of the work was the greatest. [Herodotus 1.93 ff]
Reading this, we might ask, “What else in Herodotus resembles this?” We allow the link apprentice to scan the 2038 spaces into which we have arbitrarily divided Herodotus books 1-7: in roughly a minute the link apprentice selects passages describing:
[2.136] The construction of the brick pyramid of Asukhis
[7.23] The exemplary expertise of Phoenicians in canal building
[4.43] A Phoenician journey to Africa discovers towns of exotically-clad men
[1.187] Construction of a monumental gate in Babylon
[3.8] Picturesque oath-taking customs of the Arabians
[2.124] Pharaoh Kheops sends his daughter to a brothel and builds a pyramid
[1.194] Canoes- a remarkable tourist sight in Babylon
[4.82] A footprint of Heracles: another tourist sight
[1.94] The Lydian custom of making prostitutes of their female children
Notice how these passages reflect various facets of the first passage: some describe other scenes of interest to ancient tourists, others are anecdotes about the construction of famous structures, still others describe unusual sexual or social practices.
The apprentice’s sensitivity to quotation and paraphrase makes it a powerful tool for finding thematic and formulaic passages. For example, to find scenes of death and lamentation in Homer, we might begin (in translation) with Iliad 22.405ff, the classic scene in which Priam and Hecuba learn that their son Hector has been slain. The link apprentice quickly finds a host of passages with intriguing parallels; the top ten include:
[17.410] Warriors fighting over the corpse of Patroclus
[24.705] Priam bringing home the corpse of Hector
[18.55] Thetis foresees the death of Achilles
[23.1] Official mourning for Patroclus begins
[18.325] Achilles, after Patroclus’ death, vows to kill Hector
[24.500] Priam negotiates with Achilles for his son’s body
[21.540] Priam views the battle in which Hector will soon die
[24.300] Priam prepares to seek Hector’s corpse
[23.150] Achilles asks the Achaeans to cease their mourning for Patroclus
[16.200] Achilles decides to return to the fight
All these passages concern the death of a relative or dear friend, lamentation, and fighting. Two foretell deaths still to come; eight lament deaths that have already taken place. The apprentice thus shows surprising flexibility in finding thematic parallels.
VISAR [Clitherow 89] might also be capable of discovering connections like these by analyzing the semantics of each text. Moreover, VISAR should also be able to explain its reasons for believing two texts to be related, while Lynx merely asserts that some relationship may exist. On the other hand, the link apprentice requires only a modest investment of space and time (approximately 20 msec/writing space on a 68000 microcomputer), while VISAR requires a detailed representation of consensual reality.

3.2.2 Phylogenies

Monumental scholarly efforts have been expended to discover relationships among important historical and literary documents. Gross measures of similarity— letter-count statistics or usage patterns— are sometimes interesting, and statistical stylometric measures have provided important evidence on the authorship of such works as the anonymous Federalist Papers [Merriam 89] or Shakespeare’s disputed plays [M. Smith 89]. Yet, when we inquire into the relationship between different documents, we normally wish to ask subtler questions than these— in particular, we are usually interested in tracing the origin and transmission of individual images and ideas. Preliminary experiments suggest that a link apprentice may assist these efforts.
Consider, for example, the first century documents now known to us as the gospels of Matthew, Mark, and Luke. The similarities, disparities, and lines of influence among these books have been a focus of scholarship for centuries, resulting in a copious literature including careful compilations of parallel passages [Throckmorton 79]. The link apprentice readily locates most of the parallels Throckmorton identifies; for example, starting from
Matt 5:15 Neither do men light a candle, and put it under a bushel, but on a candlestick; and it giveth light unto all that are in the house.
the link apprentice finds
Luke 11:33 No man, when he hath lighted a candle, putteth it in a secret place, neither under a bushel, but on a candlestick, that they which come in may see the light.
Mark 4:21 And he said unto them, Is a candle brought to be put under a bushel, or under a bed? and not to be set on a candlestick?
Luke 8:16 No man, when he hath lighted a candle, covereth it with a vessel, or putteth it under a bed; but setteth it on a candlestick, that they which enter in may see the light.
This approach may also be applied to tracing detailed relationships among large documents. Examining in turn each of the 2900 verses of these three books, the apprentice chose the most similar verse from one of the other books. The results (Table 1) indicate, for example, that a given (translated) passage in Matthew is equally likely to find a counterpart in Mark or in Luke, but that Mark is more closely tied to Matthew than to Luke [Kermode 87]. This is consistent with (although we do not claim it is evidence for) the current consensus that either Mark or Matthew was the earliest of these books, and that the author of Luke referred to one or both of the earlier works when composing his narrative.
Table 1. A link apprentice search through each verse of the three synoptic gospels to find the closest association between verse pairs from different books.
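The verse-by-verse procedure can be outlined as follows. This is a sketch under stated assumptions rather than the experiment itself: a plain word-overlap (Jaccard) score stands in for the link apprentice's hash-table similarity, the function names are hypothetical, and the four sample verses are simply the ones quoted above, so the tally it prints says nothing about the gospels as a whole.

```python
import re
from collections import Counter


def words(text):
    """Return the set of distinct lowercase words in a passage."""
    return frozenset(re.findall(r"[a-z]+", text.lower()))


def overlap(a, b):
    """Stand-in similarity: Jaccard overlap of two word sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def closest_book_tally(verses):
    """For each verse, find the most similar verse in a different book.

    `verses` is a list of (book, reference, text) triples. The result
    counts (book, book_of_best_match) pairs; verses with no overlap at
    all are left out of the tally.
    """
    indexed = [(book, ref, words(text)) for book, ref, text in verses]
    tally = Counter()
    for book, _, sig in indexed:
        rivals = [(overlap(sig, other_sig), other_book)
                  for other_book, _, other_sig in indexed
                  if other_book != book]
        if rivals:
            best_score, best_book = max(rivals)
            if best_score > 0:
                tally[(book, best_book)] += 1
    return tally


sample = [
    ("Matthew", "5:15", "Neither do men light a candle, and put it under "
     "a bushel, but on a candlestick; and it giveth light unto all that "
     "are in the house."),
    ("Luke", "11:33", "No man, when he hath lighted a candle, putteth it "
     "in a secret place, neither under a bushel, but on a candlestick, "
     "that they which come in may see the light."),
    ("Mark", "4:21", "And he said unto them, Is a candle brought to be "
     "put under a bushel, or under a bed? and not to be set on a "
     "candlestick?"),
    ("Luke", "8:16", "No man, when he hath lighted a candle, covereth it "
     "with a vessel, or putteth it under a bed; but setteth it on a "
     "candlestick, that they which enter in may see the light."),
]
for pair, count in sorted(closest_book_tally(sample).items()):
    print(pair, count)
```

Run over every verse of the three books, a tally of this shape yields one row per source book and one column per book of the closest match.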
The interest in these results lies not in their application to biblical scholarship (although see [DeRose 91]), but rather in the promise they hold to make these methods available to analysis of other ancient and modern texts, texts which merit study but which have not received the dedicated attention of generations of students.

3.2.3 Limits to Link Discovery

These successes should not lead us to place excessive confidence in lexical approaches; because the link apprentice understands neither the user nor the text, the links it proposes may be puzzling. A seemingly far-fetched link may reveal connections upon close reading; for example, the seemingly dissimilar couplets
Tempora cum causis Latium digesta per annum
Lapsaque sub terras ortaque signa canam. [Ovid 1.1-2]
Idem sacra cano signataque tempora fastis
Ecquis ad haec illinc crederet esse viam? [Ovid 2.7-8]
are, upon inspection, found to be linked by the repetition of “tempora” and parallels cano/canam and signa/signata. In fact, the association of these lines and another couplet ([1.7], ranked even more highly by the link apprentice) had been recently argued from external evidence [Mylonas 91].
Because the link apprentice seizes on any similarity it notes, the links it discovers may not be the links we hope to find. For example, [Hesiod 405ff] casts an interesting light on the role of women in ancient farming:
[405] First of all, get a house, and a woman and an ox for the plough — a slave woman and not a wife, to follow the oxen as well — and make everything ready at home, so that you may not have to ask of another, and he refuse you, and so, because you are in lack, the season pass by and your work come to nothing. Do not put your work off till to-morrow and the day after; for a sluggish worker does not fill his barn, nor one who puts off his work: industry makes work go well, but a man who puts off work is always at hand-grips with ruin.
We hoped to use the link apprentice to find other precepts on women, but instead it delivered a bundle of tips on barns, oxen, and ploughing— all prominent in the source paragraph but none relevant to our interests.
The link apprentice benefits from linguistic peculiarities of English [McCrum 86], and while it operates successfully in some other languages, performance of the current implementation is inconsistent. No obvious degradation is observed in Elizabethan or medieval English, but performance in Latin seems clearly inferior, and Greek is worse still. Highly-inflected languages like ancient Greek lead the apprentice to believe words in different grammatical roles have different meanings. Moreover, the current implementation considers prefixes to be much more significant than suffixes, an assumption that is defensible in English (where “evolve” and “evolves” are more similar than “evolve” and “revolve”) but is clearly wrong in Greek. Similarly, languages like German and Navajo which exploit compound words more heavily and literally than does English are likely to cause problems. Matters will be still worse in languages with large, ideographic, or mutable character sets such as Chinese, ancient Egyptian, or Arabic.

3.3 Lexical Agents

Agents or dæmons recognize patterns that identify texts of special interest. Unlike queries, which are ad hoc and ephemeral, an agent persistently examines the changing hypertext to identify the appearance (or disappearance) of specific lexical patterns in which the author is particularly interested.
Lynx follows ObjectLens [Lai 88] and Agenda [Kapor 90] in representing agents as a hierarchy of classes in which each agent implicitly subsumes its children. ObjectLens agents are extensible and are capable of sophisticated computation, while Lynx and Agenda agents merely search for lexical patterns that occur in the text. For example, the Lynx agent
MorningReminder
Urgent&^Done
[Today]|[Tomorrow]
will identify text chunks that represent urgent uncompleted tasks. Agents may call other agents by enclosing their name in square brackets. Lynx agents may also use the link apprentice [Bernstein 90] to categorize items that resemble one or more exemplary items, much as Intermedia’s AutoLink service [Coombs 90] depends on its own full-text retrieval engine.
Incremental refinement of agent hierarchies can yield useful and complex agents, helping to focus attention on interesting or critical information. For example, [Buckmaster 90] collects news stories, federal disclosure filings, press releases, sports scores, and other journalistic ephemera in one large mass; a simple collection of agents can help the scholar or analyst find information that she might otherwise overlook. Careful refinement of sufficiently powerful agents [Hayes 88] can approach human competence in classifying news stories. Similarity-based agents, too, can be surprisingly powerful [Andersen 89].
The sensitivity of the link apprentice to usage and vocabulary allows agents that use it to distinguish passages from the King James version of Chronicles from Song of Solomon or Daniel; it is far from clear how one would specify a predicate of this nature in a procedural language.
To explore how lexical agents might complement link apprentices, we revisit the themes of lamentation in the Iliad. The simplest approach, of course, is a simple search for the word “lament”. This finds 24 items; 5 were also found by the apprentice, 4 others were closely related in theme, and 15 were distinctly different in tone and content, mentioning “lamentation” in the context of revenge or recounting a part of the mourning activities that did not include wailing or weeping, but rather the washing of the body, or the preparation of the pyre. A more complex agent
Lament
Grief
weep|wail
[Death]&[Family]
Death
death|die|dying|dead
Family
mother|father|son
yields 159 of the 579 notes. These notes include 16 of 20 notes found by the apprentice, and additional interesting scenes about mourning a relative or quasi-relative, at the cost of retrieving a significant fraction of the entire work.
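The way such agents combine lexical patterns can be sketched in a few lines. This is an illustration under stated assumptions, not the Lynx implementation. It reads `|` as OR, `&` as AND, `^` as NOT, and `[Name]` as a call to another agent, as the examples in the text suggest. Everything else is assumed: the flat `AGENTS` table (the class hierarchy is not modeled), the rule that an agent fires when any one of its pattern lines matches, the precedence of `^` over `&` over `|`, and the matching of bare words as case-insensitive word prefixes.

```python
import re

# Hypothetical flat agent table: agent name -> list of pattern lines.
AGENTS = {
    "Grief": ["weep|wail", "[Death]&[Family]"],
    "Death": ["death|die|dying|dead"],
    "Family": ["mother|father|son"],
}


def matches_term(term, text, agents):
    """Evaluate one term: ^negation, an [Agent] call, or a bare word."""
    term = term.strip()
    if term.startswith("^"):
        return not matches_term(term[1:], text, agents)
    if term.startswith("[") and term.endswith("]"):
        return agent_matches(term[1:-1], text, agents)
    # Assumption: a bare word matches at the start of any word in the text.
    return re.search(r"\b" + re.escape(term), text, re.IGNORECASE) is not None


def matches_line(line, text, agents):
    """A pattern line is an OR of AND-groups (no parentheses supported)."""
    return any(
        all(matches_term(t, text, agents) for t in group.split("&"))
        for group in line.split("|")
    )


def agent_matches(name, text, agents):
    """Assumption: an agent fires if any of its pattern lines matches.

    Agents that call each other in a cycle are not guarded against here.
    """
    return any(matches_line(line, text, agents) for line in agents[name])


hit = agent_matches("Grief", "The father mourned his dead son", AGENTS)
miss = agent_matches("Grief", "The ships sailed at dawn", AGENTS)
```

Here `hit` is true because the passage satisfies both `[Death]` and `[Family]`, while `miss` is false. The breadth of patterns such as `mother|father|son` also suggests why the agent in the text retrieves so large a share of the notes.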

4. SMALL-SCALE STRUCTURE

Documents often exhibit a fascinating structure at fine levels of detail, at scales smaller than hypertext usually considers. Poetry and finely-honed prose often exhibit intricate internal structure, patterns of alliteration, assonance and allusion which interlink lines and even syllables. The dot plot, a fine-grained analog to the linkplot described above, is equally valuable for rapidly calling attention to detailed internal structure.
Figure 7. Dot plots for an alphabet, a palindrome, and a line of Latin poetry.
Dot plots, widely used for locating structural homology in gene sequences [Kruskal 83] [Bernstein 87], compare two pieces of text or compare a text to itself. To draw a dot plot, we take texts A and B of length n and m respectively, and construct an n × m array P such that P[i,j] = 1 iff A[i] matches B[j], and P[i,j] = 0 otherwise. Spaces and punctuation marks never match, and upper-case letters match lower-case letters. We visualize the matrix by drawing non-zero entries as black. In Figure 7a, we see a dot plot for the text “abcdefg....xyz” compared to itself, while Figure 7b is based on the palindrome, “Able was I ere I saw Elba”. Notice how the palindromic sequence and the vowel repetitions are immediately evident. Figure 7c examines a more realistic case, a line of Latin poetry:
constituit menses quinque bis esse suo. [Ovid 128]
The curious loops call our attention to the inverted repetition of “menses” and “esse”. Once a reader learns to interpret dot plots, they afford, at a glance, information that would otherwise be gleaned only by most patient and careful reading. The cadences of King Arthur’s promised boons to faithful followers are evident in a dot plot of [Layamon 7] (Figure 8). Note how the lower left-hand corner seems to “poke out” beyond the mass of the plot; indeed, the terminal consonants “n” and “d” have no precedent in this line, emphasizing that lond (land) was an exceptional gift indeed.
Figure 8. A dot plot showing repetition and alliteration.
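The dot plot construction defined in this section is simple enough to sketch directly. The following is a minimal illustration, assuming a text-mode rendering in which `#` stands for a black cell; the function names `dot_plot` and `render` are ours and do not come from Lynx or Storyspace.

```python
def dot_plot(a, b):
    """Return the n x m matrix P with P[i][j] = 1 iff a[i] matches b[j].

    Spaces and punctuation never match; upper case matches lower case.
    """
    def match(x, y):
        return x.isalnum() and x.lower() == y.lower()

    return [[1 if match(x, y) else 0 for y in b] for x in a]


def render(p):
    """Draw non-zero entries as '#' and zero entries as '.'."""
    return "\n".join("".join("#" if cell else "." for cell in row) for row in p)


palindrome = "Able was I ere I saw Elba"
plot = dot_plot(palindrome, palindrome)
picture = render(plot)  # print(picture) to view the plot
```

In the rendered palindrome, the main diagonal records each letter matching itself, and the anti-diagonal records the palindromic symmetry; both are broken only where spaces occur, since spaces never match.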

5. COMPUTABILITY

The utility of these tools in finding traces of structure is pleasing, but computational arguments indicate that there exist clear limits to the results we may expect from discovery-based approaches.

5.1 Global Structure

While writers do not deliberately create chaotic, incomprehensible structures, the structural complexity of large documents can become formidable. Much work has been devoted to the creation and presentation of tools to generate large-scale hypertext maps. The Storyspace map and adjacency matrix views described before are modest additions to a large and important line of development [Conklin 87].
It seems not to have been noted, however, that the inductive description of hypertext structure— that is, the identification of structure in the underlying graph of a hypertext— is computationally intractable. Even such simple questions as
Do hypertext A and hypertext B have identical structures?
Can a reader see each page of A without seeing any page twice?
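are computationally hard: the second is the Hamiltonian path problem, which is NP-complete, and the first is graph isomorphism, for which no polynomial-time algorithm is known. The discovery of hypertext structure is inherently more difficult than the systematic construction of structure; meaningful maps require either deep understanding or extravagant effort.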
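To make the second question concrete, the sketch below answers it by exhaustive search. It is an illustration only: the name `has_single_pass_reading` and the representation of a hypertext as a dictionary from each page to the set of pages it links to are assumptions, not part of any system discussed here.

```python
from itertools import permutations


def has_single_pass_reading(links):
    """True if some reading order visits every page exactly once.

    `links` maps each page to the set of pages reachable by one link.
    The search tries every ordering of the pages, so its cost grows
    factorially with the number of pages.
    """
    pages = list(links)
    for order in permutations(pages):
        if all(b in links[a] for a, b in zip(order, order[1:])):
            return True
    return False


chain = {"A": {"B"}, "B": {"C"}, "C": set()}
star = {"hub": {"x", "y"}, "x": set(), "y": set()}
chain_ok = has_single_pass_reading(chain)  # True: A, B, C
star_ok = has_single_pass_reading(star)    # False: no path joins x and y
```

A hypertext of twenty pages already admits more than 10^18 orderings, so this approach is hopeless for documents of realistic size; cleverer exact algorithms exist, but none is known that avoids exponential growth in the worst case.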

5.2 Computed Links

It is thus likely that we will never possess reliable tools which fully reveal meaningful structure by examining only the underlying hypertext graph. Matters are worse if we consider complex or intensional [DeRose 89] links, that is, links whose existence or destination is not static. Many hypertext systems permit some degree of computation to be associated with links; in some, all links are programs which compute their destination.
If links are described in a sufficiently expressive language, then clearly the extraction of the underlying hypertext graph is undecidable. Observing this, we might decide to restrict link computation— for example, by restricting computation to logical predicates [Storyspace 91] or Petri nets [Furuta 89][Furuta 90]. Brown’s concern [Brown 90] that computational links could harbor computer viruses also argues against arbitrary computation on links, since it seems unlikely that the question, “Is this program a virus?”, is decidable. Finally, the implementation of distributed hypertext systems is simplified if it is possible to assert that all link-based computations are finite; again, a restricted computational language appears desirable. If we cannot “limit the (number of) links” [Glushko 89], perhaps we can limit their computational complexity.
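One way to picture a restricted link language is the sketch below. It is a hypothetical model, not the Storyspace or Trellis mechanism: each link has a fixed destination and a guard that is merely a set of pages the reader must already have visited. The names `Link`, `available`, and `static_graph` are ours.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    source: str
    target: str        # fixed destination: never computed at run time
    guard: tuple = ()  # pages that must already have been visited


def available(links, here, visited):
    """Targets the reader may follow from `here`, given the pages visited."""
    return [l.target for l in links if l.source == here and set(l.guard) <= visited]


def static_graph(links):
    """The underlying graph can be read off without running any guard."""
    return {(l.source, l.target) for l in links}


links = [
    Link("start", "garden"),
    Link("start", "cellar", guard=("garden",)),
]
first_visit = available(links, "start", set())         # ["garden"]
later_visit = available(links, "start", {"garden"})    # ["garden", "cellar"]
graph = static_graph(links)
```

Because guards in this model only test set membership, every evaluation terminates and the full graph is recoverable by inspection. The price is that a link cannot run a simulation or compute its destination.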
However, if we limit the computation associated with links (or, equivalently, the computation associated with entering or leaving a writing space), we necessarily abandon one of hypertext’s most attractive properties: its facility for combining simulation and explication in a single, seamless document [Bernstein 89]. While structural analysis, distributed hypertext, and freedom from viruses might ultimately prove attractive, simulation is attractive today; current authors use and depend on these facilities, and will not abandon them without a clear, incremental advantage. Also sacrificed would be the intensional links that make systems like ObjectLens [Lai 88] so attractive. Computational cleanliness may require sterility; if so, writers may opt for a messy, but lively, environment.

6. THE CHALLENGE OF SCHOLARSHIP

Tools like Lynx and Storyspace link the literary and textual worlds of scholarship to the visual and symbolic worlds of engineering. While early computational approaches like concordance generation and stylometric analyses sometimes sought to override the scholar’s subjective interpretation, hypertextual tools seek to enhance her subjective investigations and critical reading of a text. The ability to visualize structure, to see spatial, multi-sequential relationships, provides perspectives that do not easily emerge from traditional notetaking. Furthermore, the quest for objectivity is chimerical; the subjective intrusions of the scholar are always present in the study of a text. These tools abandon the illusory hope of automated objectivity; instead, they provide an amplifier for the scholar’s intended subjectivity.
Success or failure in this endeavour cannot be measured by time-and-motion usability studies, nor by commercial success, nor yet by popular acclaim. We meet the challenge of scholarship only if hypertextual tools help scholars find new insights they might otherwise have missed.

7. THE DEATH OF TEXT

Few hypertexts are widely read, and little hypertext criticism has yet appeared (but see [Delany 91]). Tools for writing hypertext have been widely available for years. Where are the hypertexts? Where are the critics?
It is possible that the dearth of hypertext is a symptom of the impending death of text; that few hypertexts merit critical examination because talented people prefer to express themselves in other media. The lack of good hypertext writing and the greater lack of serious hypertext criticism are often blamed on the novelty of the technology, but might also reflect the absence of a thoughtful audience. The emphasis hypertext research has given to casual, occasional readers may indicate that careful, dedicated readers no longer matter.
We do not believe this to be true.
Acknowledgements: Storyspace was developed by two of us (JDB and MJ) and by Professor John B. Smith of the University of North Carolina. Professor Smith made important contributions to the principle of information hiding discussed in this section. We thank Meryl Cohen, David Levine and Richard Ristow for valuable discussions.
Methodological Note: The critical analyses of [Joyce 90] and [Bolter 91a] contained in this paper were not performed by the authors of these hypertexts, nor do the conclusions drawn on the structures of their works necessarily reflect the authors’ intentions or views of their own work.

REFERENCES

Note: references to hypertexts are preceded by a bullet ●.
[Akscyn 87] R. Akscyn, D. McCracken, and E. Yoder, “KMS: A Distributed Hypermedia System for Managing Knowledge in Organizations”, Proc. Hypertext ’87, ACM, Baltimore, 1987, pp. 1-20; reprinted in Communications of the ACM 31 (1988) 820-35. https://dl.acm.org/doi/10.1145/317426.317428
[Andersen 89] Michael H. Andersen, Jakob Nielsen, and Henrik Rasmussen, “A Similarity-Based Hypertext Browser for Reading the Unix Network News”, Hypermedia 1 (1989) 255-265.
[Andersen 90] Peter B. Andersen, “Toward an aesthetics of hypertext systems: a semiotic approach”, Hypertext: Concepts, systems and applications, A. Rizk et al., eds., Cambridge University Press, Cambridge. 1990. pp. 224-237.
[Bernstein 87] Mark Bernstein, “Using spreadsheet languages to understand sequence analysis algorithms”, Computer Applications in the Biosciences 3 (1987) 217-221.
[Bernstein 88] Mark Bernstein, “The Bookmark and the Compass: Orientation Tools for Hypertext Users”, SIGOIS Bulletin 9 (1988) pp. 34-45. https://dl.acm.org/doi/10.1145/51640.51645
•[Bernstein 89] Mark Bernstein and Erin Sweeney, The Election of 1912, hypertext for Macintosh computers, Eastgate Systems Inc, Watertown MA, 1989.
[Bernstein 90] Mark Bernstein, “An apprentice that discovers hypertext links”, Hypertext: Concepts, systems and applications, A. Rizk et al., eds., Cambridge University Press, Cambridge. 1990. pp. 212-223.
[Bolter 87] Jay David Bolter and Michael Joyce, “Hypertext and Creative Writing”, Proceedings of Hypertext ’87, ACM, Baltimore, 1987. pp. 41-50. https://dl.acm.org/doi/10.1145/317426.317431
[Bolter 91] Jay David Bolter, Writing Space: The Computer, Hypertext, and the History of Writing, Lawrence Erlbaum and Associates, Hillsdale NJ. 1991.
•[Bolter 91a] Jay David Bolter, Writing Space: The Computer, Hypertext, and the History of Writing, hypertext edition, hypertext for Macintosh computers, Lawrence Erlbaum and Associates, San Mateo. 1991.
[Brown 90] Peter J. Brown, “Assessing the Quality of Hypertext Documents”, Hypertext: Concepts, systems and applications, A. Rizk et al., eds., Cambridge University Press, Cambridge. 1990. pp. 1-12.
•[Buckmaster 90] Buckmaster Publishing, Front Page News, CD-ROM for Macintosh computers, Wayzata Technology, Prior Lake MN, 1990.
[Clitherow 89] Peter Clitherow, Doug Riecken, and Michael Muller, “VISAR: A system for inference and navigation of hypertext”, Proceedings of Hypertext ’89, ACM, Baltimore, 1989. pp. 293-305. https://dl.acm.org/doi/10.1145/74224.74248
[Coombs 90] James H. Coombs, “Hypertext, Full Text, and Automatic Linking”, Proceedings of SIGIR 90, (Brussels, September 5-7) ACM, New York, 1990. https://dl.acm.org/doi/10.1145/96749.98010
[Croft 89] W. Bruce Croft and Howard Turtle, “A Retrieval Model Incorporating Hypertext Links”, Proc Hypertext 89, ACM, Baltimore, 213-224. https://dl.acm.org/doi/10.1145/74224.74242
[Delany 91] Hypermedia and Literary Studies, George P. Landow and Paul Delany, eds., MIT Press, Cambridge, 1991.
[DeRose 89] Steven J. DeRose, “Expanding the Notion of Links”, Proceedings Hypertext 89, ACM, Baltimore, 1989. pp. 249-258. https://dl.acm.org/doi/10.1145/74224.74245
[DeRose 91] Steven J. DeRose, “Biblical Studies and Hypertext”, Hypermedia and Literary Studies, George P. Landow and Paul Delany, eds., MIT Press, Cambridge, 1991.
[DeYoung 90] Laura DeYoung, “Linking Considered Harmful”, Hypertext: Concepts, systems and applications, A. Rizk et al., eds., Cambridge University Press, Cambridge. 1990. pp. 238-249.
•[DiChiara 88] Robert DiChiara, A Sucker In Spades, hypertext for Macintosh computers, Eastgate Systems, Cambridge MA, 1988.
[Engelbart 63] Engelbart, D.; “A Conceptual Framework for the Augmentation of Man’s Intellect”, Computer-Supported Cooperative Work: A Book of Readings, Irene Greif, ed., Morgan Kaufmann Publishers Inc., San Mateo. 1988. pp. 35-66.
[Fairchild 88] K. M. Fairchild, S. E. Poltrock, and G. W. Furnas, “SemNet: Three-dimensional graphic representations of large knowledge bases”, Cognitive Science and its Applications for Human-Computer Interaction, R. Guindon, ed., Lawrence Erlbaum, 1988.
[Frisse 89] Mark E. Frisse and Steve B. Cousins, “Information Retrieval from Hypertext: Update on the Dynamic Medical Handbook Project”, Proceedings Hypertext 89, ACM, Baltimore, 1989. https://dl.acm.org/doi/10.1145/74224.74241
[Furuta 89] Richard Furuta and P. David Stotts, “Programmable Browsing Semantics in Trellis”, Proceedings Hypertext 89, ACM, Baltimore, 1989 pp. 27-42. https://dl.acm.org/doi/10.1145/74224.74227
[Furuta 90] Furuta R., Stotts D., “The Trellis Reference Model”, Proc. 1st Hypertext Standardization NIST Workshop, Gaithersburg, MD, Jan. 1990.
•[Goodenough 88] Amanda Goodenough, Inigo Gets Out, hypertext for Macintosh computers, Amanda Stories, 1988.
[Glushko 89] Robert Glushko, “Design issues for multi-document hypertexts”, Proceedings of Hypertext ’89, ACM, Baltimore, 1989. pp. 51-60. https://dl.acm.org/doi/10.1145/74224.74229
[Halasz 87a] Halasz, F.; Moran, T.P.; Trigg, R.H.; “NoteCards in a Nutshell”, Proceedings ACM CHI+GI 87 (Toronto, 5-9 April 1987) 45-52. https://dl.acm.org/doi/10.1145/29933.30859
[Hayes 88] P. J. Hayes, L. E. Knecht, and M. J. Cellio, “A News Story Categorization System”, Proc. 2nd Conf. on Applied Natural Language Processing, Austin, Texas, February 1988. pp. 9-17. https://dl.acm.org/doi/10.3115/974235.974238
[Herodotus] Herodotus, The Persian Wars, Steve Ott, Lynn Sawlivich, and Annette Giesecke trans., in Perseus Project, New Haven: Yale University Press, in press. Adapted from Herodotus,V. 1-3, A. D. Godley, trans., Cambridge, MA: Harvard University Press, 1920.
[Hesiod] Hesiod, Works and Days, Hugh G. Evelyn-White, trans., in Perseus Project, New Haven: Yale University Press, 1991, in press. Originally in Hesiod, the Homeric Hymns and Homerica, Cambridge, MA: Harvard University Press, 1914. pp. 2-65.
[Homer] Homer, Iliad, A. T. Murray, trans., in Perseus Project, Yale University Press, 1991, in press. Originally in Iliad, Cambridge, MA: Harvard University Press, 1914. vol. 2.
[Joyce 88] Michael Joyce, “Siren Shapes: Exploratory and Constructive Hypertext”, Academic Computing 3 pp. 10-14.
●[Joyce 90] Michael Joyce, Afternoon, a story, hypertext document for Macintosh computers, Eastgate Systems, Cambridge MA, 1990.
[Kapor 90] Kapor, M. et al; “Agenda”, Communications of the ACM 33 (1990) pp. 105-116. https://dl.acm.org/doi/10.1145/79204.79212
[Kermode 87] Frank Kermode, “The New Testament”, The Literary Guide to the Bible, Robert Alter and Frank Kermode, eds., Harvard University Press, Cambridge (1987) p. 377.
[Knuth 73] Donald Knuth, Sorting and Searching, Addison-Wesley, Reading MA, 1973.
[Kruskal 83] J. P. Kruskal, “An overview of sequence comparison”, Time Warps, String Edits, and Macromolecules: The Theory and Practice of Sequence Comparison, Addison-Wesley, Reading, MA. 1-44.
[Lai 88] Kum-Yew Lai, Thomas W. Malone, and Keh-chiang Yu, “Object Lens: A ‘Spreadsheet’ for Cooperative Work”, ACM Transactions on Office Information Systems 6 (1988) 332-353. https://dl.acm.org/doi/10.1145/58566.59298
[Landow 87] George P. Landow, “Relationally Encoded Links and the Rhetoric of Hypertext”, Proceedings of Hypertext ’87, ACM, Baltimore, 1987. pp. 331-44. https://dl.acm.org/doi/10.1145/317426.317450
[Layamon] Layamon, Brut, in The Oxford Book of Medieval English Verse, Celia and Kenneth Sisam, eds., Clarendon Press, Oxford, 1970.
[Lynx 91] Lynx, prototype hypertext environment for Macintosh computers, Eastgate Systems, Inc, in press.
[Merriam 89] Thomas Merriam, “An Experiment With The Federalist Papers”, Computers and the Humanities 23 (1989) 251-254.
[McCrum 86] Robert McCrum, William Cran, and Robert MacNeil, The Story of English, Viking, New York, 1986. pp. 47ff.
[Mylonas 91] Elli Mylonas, doctoral dissertation, Brown University, in preparation.
[Nelson 76] Nelson, T.H.; Computer Lib/Dream Machines, reprinted by Microsoft Press, Redmond WA 1988.
[Nielsen 90] Jakob Nielsen, Hypertext and Hypermedia, Academic Press, 1990.
[Neuwirth 89] Christine M. Neuwirth and David S. Kaufer, “The Role of External Representations in the Writing Process: Implications for the Design of Hypertext-based Writing Tools”, Proceedings Hypertext 89, ACM, Baltimore, 1989. https://dl.acm.org/doi/10.1145/74224.74250
[Oren 87] Tim Oren, “The Architecture of Static Hypertext”, Proceedings of Hypertext 87, ACM, Baltimore, 1987. https://dl.acm.org/doi/10.1145/317426.317447
[Ovid] Ovid, Fasti, Sir James G. Frazer, ed. Cambridge, MA: Harvard University Press, 1976
[Salton 83] G. Salton and M. McGill, An Introduction to Modern Information Retrieval, McGraw-Hill, New York, 1983.
[Salton 86] Gerard Salton, “Another Look at Automatic Text Retrieval Systems”, Communications of the ACM 29 (1986) 648-656. https://dl.acm.org/doi/10.1145/6138.6149
[Salton 90] Gerard Salton and Chris Buckley, Flexible Text Matching for Information Retrieval, Technical Report TR90-1158 Cornell University, Ithaca, 1990.
[Shneiderman 89] Ben Shneiderman et al., “The Hyperties electronic encyclopedia: An evaluation based on the museum installations”, J. American Society for Information Science 40 (May 1989) 172-82.
[Smith 87] John B. Smith, Stephen F. Weiss and Gordon J. Ferguson, “A Hypertext Writing Environment and its Cognitive Basis”, Proceedings of Hypertext ’87, ACM, Baltimore, 1987. pp. 41-50. https://dl.acm.org/doi/10.1145/317426.317442
[M. Smith 89] M. W. A. Smith, “A Procedure to Determine Authorship Using Pairs of Consecutive Words: More Evidence for Wilkins’s Participation In Pericles”, Computers and the Humanities 23 (1989) 113-129.
[R. Smith 89] Randall B. Smith, “Experiences with the Alternate Reality Kit: An Example of the Tension Between Literalism and Magic”, Proceedings CHI+GI 87, (Toronto, April 5-9) ACM, New York, 1987. pp. 61-8.
[Storyspace 91] Storyspace, hypertext writing environment for Macintosh computers, Eastgate Systems, Inc. 1991.
[Throckmorton 79] Burton H. Throckmorton Jr., ed., Gospel Parallels: A Synopsis of the First Three Gospels, Thomas Nelson Inc., Nashville, 1979.
[Utting 90] K. Utting and N. Yankelovich, “Context and Orientation in Hypermedia Networks”, ACM Trans. on Information Systems 7 (1990) pp. 58-84. https://dl.acm.org/doi/10.1145/64789.64992
[Walker 88] Walker, J. H.; “Supporting document development with Concordia”, IEEE Computer 21 (1988) pp. 48-59.
[Walker 90] Janet H. Walker, Emilie Young, and Suzanne Mannes, “A Case Study of Using a Manual Online”, Machine-Mediated Learning (1990).
[Yankelovich 88] Nicole Yankelovich, B. J. Haan, Norman K. Meyrowitz, and Steven M. Drucker, “Intermedia: The concept and construction of a seamless information environment”, IEEE Computer 21 (1988) pp. 81-96.
Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission.
© 1991 ACM 0-89791-461-9/91/0012/0260 ...$1.50
[Source: Mark Bernstein, Jay David Bolter, Michael Joyce, and Elli Mylonas. 1991. Architectures for volatile hypertext. In Proceedings of the third annual ACM conference on Hypertext (HYPERTEXT '91). Association for Computing Machinery, New York, NY, USA, 243–260. https://doi.org/10.1145/122974.122999]