What are the problems with traditional tools like TACT?
TACT [1] is a second generation tool. First generation tools were batch-oriented and designed originally for mainframes. They were not interactive; the user had to run a job and then examine the results. This meant that exploring a text was a long iterative process, but the batch files did offer a record of what was done to get a particular result. TACT, on the other hand, is one of the best second generation tools. It was designed from the beginning for interactive use on a microcomputer. It even employs primitive windows in which the user can view different data displays. Because TACT is interactive, the process of trying queries, checking the results, and refining your queries is much faster. One of the things that was lost, however, was a way to track the project so that you could reconstruct how you arrived at a result after the interaction.
While there are many problems specific to TACT, we are going to focus here on those problems which TACT shares with similar interactive tools. We take TACT as an example of a text-analysis tool, not because we wish to criticise TACT, but because we have used it for many years and have an intimate knowledge of its design -- one of us was involved in its creation and development. The major limitations we encountered when we used TACT to study Hume's Dialogues Concerning Natural Religion were:
One of the first problems we encountered with TACT was that it is a closed program; it cannot be modified or extended except by those who own the code. Not only is the program closed to extension, but the output it generates cannot be passed to other programs dynamically. The result files created by TACT must be manually exported and then opened in another program for further processing.
One solution that occurred to us was to repackage TACT so that it could be accessed by other programs. TACTweb [2], from one perspective, is such a repackaging for a WWW server to access. In our experience, however, the extension problem is not simply a matter of providing hooks so that a program can be called or call others. In original research one moves quickly beyond what has been done before in ways not foreseen. The unforeseeable nature of original research means that a research tool cannot be extended in a predictable way to meet the demands of research. What is needed are research tools whose every aspect can be modified and extended through easily added modules. In other words, to be truly extensible, a tool should be capable of having any component replaced. Eye-ConTact is one model for how such extensibility can be achieved based on similar tools in the sciences. [3]
According to one model of what happens in a computer-assisted research project, the researcher comes to a text with questions formulated in terms of queries the computer can answer. If the answers are interesting, the researcher records the answers and how they were derived and then publishes a description of the methods used and the results. For the results to be convincing, others have to be able to repeat the research and arrive at similar results by following the described methods. As readers of such publications we expect the research to be described in sufficient detail to allow us to test the results ourselves. Programs like TACT are unfortunately missing mechanisms to record or log one's work in order to accurately describe the methods used for oneself and others. (How often have you saved results and found that a few months later you cannot remember how they were arrived at!) We need tools that ensure that our computing work is clearly and completely logged while it is developing, so that it can be accurately described and later recreated by others. Specifically, we need tools that:
This leads to the next problem.
Existing text-analysis tools have a fundamental problem: they do not assist the researcher to share his or her research in a fashion that makes the text analysis accessible. TACT is a private tool; you study a text and then you publish your results as a separate act, preferably hiding the fact that a mere computer assisted you in any way. TACT does not help you keep track of how you reached your conclusions, nor does it help you show others how you arrived at those conclusions. It fosters private exploration, not open reproducible interactive research.
Research is by nature something others can recover (or re-search) if they are sceptical. To be convincing, research results have to be open to examination, so that others can traverse the logic of the research themselves. Current text-analysis tools do not allow one to share results in an interactive form for verification; they force us to either give colleagues the complete environment or nothing at all. Many papers based on text analysis simply tell us that the computer assisted a given insight, without showing us how the insight was arrived at. The methods used are described all too often in a truncated fashion, because there is really no graceful way to share them.
In short, we need tools that make the research accessible, not the technology. Such tools should show how the results were arrived at and what decisions were made along the way, instead of showing the toys used. Paradoxically, what we need are research tools aimed, not at the researcher for his or her private exploration, but at the researcher's audience who wants to test the insight. We need a tool that allows people to easily package their research for distribution in an interactive form and which highlights the research, not the technology.
The design philosophy of Eye-ConTact that grew out of TACT's limitations can be summarized as follows:
Eye-ConTact is first and foremost a prototype of a visual programming environment in which one creates text analysis applications. It is, in a sense, a thought piece meant to illustrate some of the ideas it implements, and to provoke further thought about these issues.
Eye-ConTact is a domain-specific visual programming environment in which one creates text analysis applications. Visual programming environments need to be distinguished from program visualizations and data visualizations. Brad Myers in "Taxonomies of Visual Programming and Program Visualization" nicely distinguishes the two, defining visual programming as "any system that allows the user to specify a program in a two (or more) dimensional fashion" (Myers 1990: 98). By contrast program visualization is a graphical representation of aspects of a program that may have been written in a conventional text programming language. Data visualization has nothing to do with the program, but rather has to do with the information managed by a program. A graph produced by a spreadsheet would be an example of a data visualization - it offers a view of the data manipulated by the program, rather than illustrating some aspect of the program. We will not discuss here the variety of such visual programming and visualization tools used in the sciences (see Price 1993 for a taxonomy of software visualization).
It is easy to confuse visual programming environments with data visualization because several examples of scientific visualization programs that produce data visualizations are at the same time themselves visual programming environments. For example, IRIS Explorer, which runs on Silicon Graphics workstations, is an environment in which the user creates maps that show how data should be piped, transformed and eventually graphed. (The reader familiar with this and similar programs will appreciate how Eye-ConTact is an attempt to explore the application of this model to textual visualization.) These maps are visual programs - they describe a complex process that the user wants performed on his or her data. What is confusing is that the final results of typical Explorer programs are data visualizations that graphically show some aspect of the data. Thus a map might show how data from a flight simulation would be processed in order to produce a graphic representing the air-flow over the wing of the simulation. The map is a visual program and the air-flow image is a data visualization. Eye-ConTact likewise has maps that are visual programs that can be run and could produce data visualizations.
Unlike general purpose visual programming environments like Prograph, Eye-ConTact is a domain-specific environment; it has been designed for creating programs in a specific domain, i.e. text analysis. There have been questions raised about the application of visual programming. It was assumed that visual programming would make programming accessible to everyone, but M. Petre and colleagues (Petre & Green 1993 and Petre 1995), among others, have suggested that visual programming benefits the domain expert more than the novice. Visual programming, like textual programming, is more effective when the users have acquired visual "reading" skills for the graphical conventions of the domain. A picture might be worth a thousand words, but without conventions and experienced viewers the same picture isn't worth the same thousand words. While general purpose visual programming may not be terribly efficient, domain-specific visual programming does have promise, precisely because of the restricted domain (see Raymond 1991). In a specific domain there is likely to be an informal consensus about the operations needed and the domain expertise that can be enhanced in a visual programming environment. Graphical elements, rather than being so abstract that they cannot be understood, can represent the common, conceptually simple operations of the domain. The expertise of those working in the domain can be harnessed by providing them with a programming tool aimed directly at their work. Ideally a domain-specific programming environment should hide the mysteries of programming and present the expert in the domain with operations that correspond to their research. This is the promise of a domain-specific environment such as Eye-ConTact - it should allow those interested in text-analysis to build applications that correspond to their research questions without having to master a new discipline.
What follows is a narrative of how you might use Eye-ConTact:
The underlying architecture is two-fold:
The Eye-ConTact environment is made up of modules and files. Specifically:
Some of the advantages of this design are:
Some of the disadvantages are:
How does Eye-ConTact deal with the limitations identified in the first part of this paper?
Eye-ConTact deals with the issue of extension by encouraging modularity. Not only does Eye-ConTact consist of a collection of modules, including the framework module, but it also presents the user with a modular interface paradigm. The user is encouraged to think of a text analysis project as a flow of data from one module to another. Eye-ConTact is, in effect, a tool for managing smaller modules and passing data from one to the other.
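The flow of data from one module to another can be illustrated with a minimal sketch in Python. This is not the Eye-ConTact implementation (the Framework prototype is written in Visual Basic); the function names `tokenize`, `word_frequencies`, `top_words` and `run_pipeline` are hypothetical and chosen only for illustration. Each module is assumed to be a plain function that accepts the output of the previous one.

```python
import re
from collections import Counter
from typing import Any, Callable, List

# Assumption for this sketch: a "module" is any function from data to data.
Module = Callable[[Any], Any]


def tokenize(text: str) -> List[str]:
    """Split a text into lower-cased word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def word_frequencies(tokens: List[str]) -> Counter:
    """Count how often each token occurs."""
    return Counter(tokens)


def top_words(n: int) -> Module:
    """Build a module that keeps only the n most frequent words."""
    def module(freqs: Counter) -> list:
        return freqs.most_common(n)
    return module


def run_pipeline(modules: List[Module], data: Any) -> Any:
    """Pass the data through each module in order, as along the arrows of a map."""
    for module in modules:
        data = module(data)
    return data


# Any module in this list can be swapped for another with the same input and output.
pipeline = [tokenize, word_frequencies, top_words(2)]
result = run_pipeline(pipeline, "The cause of the order of the world.")
print(result)  # [('the', 3), ('of', 2)]
```

Because the pipeline is only a list of functions, replacing the tokenizer or adding a new display does not require changes to the other modules, which is the property the modular design is meant to encourage.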
Eye-ConTact goes further in the support of modularity. It has built-in tools for adding modules or repurposing existing code to act as modules. The extension tools include:
Eye-ConTact deals with the problem of recording the logic of an exploration by encouraging the user to lay out the fundamental steps in a visual environment. The user creates the logic by dragging out icons and "connecting the dots". This has the advantage of acting as both a record of the flow of choices made and a synoptic description of that flow, which should make the research easier to grasp. John Bradley experimented in TACT with a macro language that could record activities and replay them. This had the disadvantage that it was hard to "read" the macro file, let alone change the process. Graphical representation shows the logic in a more intuitive way. With the help of these records, users can build new projects, show and exchange them, and, finally, create custom applications that hide the logic.
Eye-ConTact also has an annotation tool that allows one to insert comments on the Map and attach them to particular operations. This is to encourage verbose and contextualized discussion of the project. Such annotations are particularly useful when the Map is to be shared.
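To make the idea of a shareable, annotated record concrete, here is a small sketch of how a Map and its annotations might be stored and exchanged. The classes `Step` and `Map`, the JSON layout, and the operation names and file name in the example are all assumptions made for illustration; they do not describe the file format the prototype actually uses.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List


@dataclass
class Step:
    """One operation on the map, with its settings and any attached comments."""
    operation: str
    settings: Dict[str, str] = field(default_factory=dict)
    annotations: List[str] = field(default_factory=list)


@dataclass
class Map:
    """An ordered record of the steps that make up a project."""
    title: str
    steps: List[Step] = field(default_factory=list)

    def add(self, operation: str, **settings: str) -> Step:
        """Append a step and return it so that comments can be attached."""
        step = Step(operation, dict(settings))
        self.steps.append(step)
        return step

    def to_json(self) -> str:
        """Serialize the whole map, annotations included, for sharing."""
        return json.dumps(asdict(self), indent=2)

    @staticmethod
    def from_json(text: str) -> "Map":
        """Rebuild a map from a shared record."""
        raw = json.loads(text)
        return Map(raw["title"], [Step(**s) for s in raw["steps"]])


# Hypothetical project: the operation names and file name are illustrative only.
project = Map("Example project")
project.add("load_text", source="dialogues.txt")
search = project.add("find_word", pattern="design")
search.annotations.append("Note to readers: explain here why this word was chosen.")

saved = project.to_json()
restored = Map.from_json(saved)
print(restored == project)  # True
```

A record of this kind is readable by a person and by a program, so a colleague could both see what was done at each step and load the same steps to rerun them.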
One outcome of the Eye-ConTact approach is that it forces the user to map out the logic of his or her exploration before generating any results. From one perspective this is a disadvantage. The novice who does not have a research agenda, but is simply testing the tool, will have to make a map before he or she can see anything. By contrast, interactive tools can present the user with a default collection of displays (the word list, the KWIC list, and a full text display) all ready to be clicked on. The user can learn about text analysis by clicking on any of the displays and watching what happens in the others. While Eye-ConTact does not offer this sort of immediate feedback, it can be used to create such interactive tools. If we think of Eye-ConTact as a visual programming tool, it can be used by experts to create applications that are interactive and immediate. In fact, in Eye-ConTact one can, in theory, create any other type of text analysis tool.
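For readers unfamiliar with the KWIC (keyword in context) display mentioned above, the following sketch shows the kind of small operation an expert might wrap as a module and place behind an interactive display. The function `kwic` and its `width` parameter are hypothetical; this is a generic illustration of the technique, not code from TACT or Eye-ConTact.

```python
import re
from typing import List


def kwic(text: str, keyword: str, width: int = 3) -> List[str]:
    """Return one line per occurrence of keyword, with up to `width` words of
    context on each side and the keyword itself marked by square brackets."""
    tokens = re.findall(r"\w+", text)
    lines = []
    for i, token in enumerate(tokens):
        if token.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append(f"{left} [{token}] {right}".strip())
    return lines


sample = "The order of the world suggests design, and design suggests a designer."
for line in kwic(sample, "design", width=2):
    print(line)
```

An interactive tool of the kind described above would call such a function each time the user clicks a word in the word list, and refresh the KWIC display with the result.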
One design issue that the Eye-ConTact prototype has raised is the degree of detail to be shown on the Map display. If the Map is to show the logic of a project it needs to show more than generic icons for operations whose details are hidden in forms. At the same time, there are operations (especially displays) whose details would be too verbose to be shown on the Map. What we need is to find a way to let the user spill out the details and some of the resulting displays onto the Map so that it can serve as a reasonably complete representation of the whole.
With its visual programming paradigm, the Eye-ConTact Framework is only one possible user interface. If the framework module which manages the modules and the user interface is treated as one more replaceable module then one can share applications with alternative frameworks that present alternative user interfaces. The researcher should be able to create an interactive package for the audience, once interesting results have been mapped out. The audience could be students or scholars. Such interactive publications would have the following features:
We envisage two types of publications that Eye-ConTact is being designed to support:
Tact is after all a kind of mind reading. (Sarah Orne Jewett)
Eye-ConTact is both a design philosophy and a crude environment cobbled together out of modified existing tools like TACT and some new tools like the Eye-ConTact Framework program. This last program is a prototype designed to test the interface paradigm proposed here where the user maps out experiments with texts. A modular design encourages one to cobble together such prototypes in the hopes of being able to create more robust modules later. We hope the reader will not condemn the design because the prototype is flawed. In our defense, we believe that the best way to test a design philosophy is to build prototypes that can be tried by scholars on real projects. The project has already revealed some interesting issues regarding the design, which we will note here by way of conclusion:
We would like to acknowledge the following people and organizations:
The BTACT module reuses TACT code originally written by John Bradley and Lidio Presutti.
The McMaster Arts Research Board provided financial and programming support for this project.
The Eye-ConTact Framework program is being created in Visual Basic by Patricia Monger, Visualization Specialist, McMaster University.
The TG graphics module was programmed by Mark Janoska.
[1] TACT was developed originally by John Bradley and Lidio Presutti at the University of Toronto starting in 1984. In 1995 TACT was adapted so that it could be a CGI (Common Gateway Interface) program. The result was TACTweb, which can be tried or downloaded through the World Wide Web. The manual for TACT, a CD-ROM with TACT, and an extensive collection of texts are now published by the MLA under the title Using TACT with Electronic Texts (Lancashire et al. 1996).
In this paper we will use TACT as an example of a traditional text analysis tool, partly because one of us was involved in its design and development, and partly because it is still one of the best tools of its kind.
[2] TACTweb connects TACT to the World Wide Web -- making a TACT TDB database accessible to the entire WWW community. By using WWW forms, users have access to some of the interactive services that TACT provides them, but without requiring them to use TACT itself, or have a copy of the TACT database on their own machine. TACTweb can also be thought of as a text engine module called by other programs like a WWW server or another text-analysis program. In fact Eye-ConTact uses TACTweb in just this fashion: as a module that is called when needed.
[3] One might ask whether such extensibility is a reasonable goal. Perhaps we should not set our sights so high given the modest programming resources in the humanities. We believe such extensibility is not only feasible; it is the best way to ensure the long term survival of research tools. Like Jason's boat, Eye-ConTact is designed to be a collection of tools which can be slowly replaced, component by component, over time, as research in humanities computing evolves. Any research tool project that is closed may not survive the natural curiosity of researchers.