Internetworking 1.1 Header

ARTICLE

The Log Annotation Device (LAD): A World Wide Web Storage and Retrieval Tool
Robert C. Allen, Bob_Allen@stricom.army.mil
Hector Morelos-Borja, morelos@cs.ucf.edu
Kay Stanney, stanney@iems.engr.ucf.edu
University of Central Florida

Abstract
Users have two options for storing or retrieving information from the World Wide Web (WWW). First, they can click on the History (or GO) function, which shows a list of all Web pages recently visited. Second, they can use the Bookmark function, which allows them to mark any Web page with a bookmark that can be readily accessed at a later date. One problem associated with both of these functions is that the title that the computer assigns to a Web page, when it is saved in the History list or as a Bookmark, may not be related to the information that the user wishes to retrieve from that page. The Log Annotation Device (LAD) is a voice recording system that allows users to make vocal annotations (notes) about the Web page at which they are currently located. The device places three pieces of information into user-defined categorization pages: (1) a short sentence describing the page contents (user defined), (2) a voice annotation (user recorded), and (3) an icon from the Web page (user selected). An experiment was conducted to compare the current use of the Bookmark and History functions with the LAD. It was hypothesized that use of the LAD would produce shorter data retrieval times than those achieved through use of the current functions. Although the means were in the predicted direction, the results showed no significant difference in data retrieval time. Possible reasons for these results are discussed.

Introduction
This project focused on the development of an annotation device that could be applied to any World Wide Web (WWW) page. The experiment compared the marking systems currently available on the Web, i.e., History (or GO) and Bookmarks, with the Log Annotation Device (LAD). The objective was to determine how the current bookmarking systems of the WWW could be enhanced so that the cognitive load placed on users is reduced.

Current Web Practice: History and Bookmarks
Users have two options for storing or retrieving information from a Web page. First, they can click on the History (GO) function, which shows a list of all recently visited Web pages. If users wish to go back to a certain page, they simply click on the appropriate page in the History list. Thus, History is a device that automatically creates a list of all the pages that users have visited. One problem associated with the History function is that users can lose some or all of the pages stored, depending on the "direction" in which they choose to move through the Web. Another problem is that the information in this list is not organized by content: its order depends only on the sequence in which users happened to visit each Web page, so with respect to topic it is effectively random. Bower, Clark, Lesgold, and Winzenz (1969) conducted a study comparing the recall of words that were either organized in a hierarchical fashion (e.g., placed in categories like animals, plants, colors, etc.) or randomly organized. They found that recall of information in a list is superior when it is organized hierarchically rather than randomly. The implication is that users may find it difficult to search through and access desired information from the History list.

The other option users have is to use the Bookmark function to mark information for future reference. This function allows users to mark any Web page with a bookmark. By using this function, users can create a permanent record of the Web pages they have visited and may wish to visit again. The Bookmark system also allows users to create a hierarchy of named folders that can help in page storage and retrieval. For example, users can create a folder named "coins" and place all Web pages associated with this topic in that folder. However, there are several problems associated with this system. First, users can only use a limited number of characters to name a folder before the folder name is truncated in the Bookmark list. Second, search time could be quite lengthy if users have many pages stored in a folder. Third, users may be unaware that they can create category folders. In such a case, users may become overwhelmed in their efforts to identify a target Web page within a lengthy list of bookmarks. Fourth, the title that the computer assigns to a Web page Bookmark may not be related to the information that the user wishes to retrieve from that page (a problem shared by the History function). Finally, there is a problem associated with recall. Unless the Web page title is clearly associated with the information that users wish to retrieve, recall of the specific information stored on that page will be difficult at best. With the passage of time and subsequent visits to different Web pages, accurate recall of information contained in bookmarked pages will be very difficult, due to the effects of interference and memory decay (Solso, 1988).

Log Annotation Device (LAD)

Description
The Log Annotation Device (LAD) is a voice recording system that allows users to make vocal annotations (notes) about the Web page at which they are currently located. The current device places three pieces of information into user-defined categorization pages:

  1. a short textual sentence describing the page contents (user defined),
  2. a voice annotation (user inputted), and
  3. an icon from the Web page (user selected).
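
One way to picture the three items above is as a small record stored under a user-chosen category. The following is only a minimal sketch: the class names, field names, file paths, and example URL are hypothetical and are not taken from the LAD implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class LadEntry:
    """One annotated Web page (all field names are illustrative)."""
    url: str          # address of the annotated page
    description: str  # (1) short user-written sentence
    audio_file: str   # (2) path to the recorded voice annotation
    icon_file: str    # (3) path to the image captured from the page


@dataclass
class LadLog:
    """User-defined categorization pages, keyed by category name."""
    pages: Dict[str, List[LadEntry]] = field(default_factory=dict)

    def annotate(self, category: str, entry: LadEntry) -> None:
        # Create the category page on first use, then append the entry.
        self.pages.setdefault(category, []).append(entry)

    def entries(self, category: str) -> List[LadEntry]:
        # Return a copy so callers cannot change the log by accident.
        return list(self.pages.get(category, []))


log = LadLog()
log.annotate(
    "Ancient Greece",
    LadEntry(
        url="http://example.org/parthenon.html",  # hypothetical page
        description="Construction dates and architects of the Parthenon",
        audio_file="notes/parthenon.wav",
        icon_file="icons/parthenon.gif",
    ),
)
print(len(log.entries("Ancient Greece")))  # prints 1
```

Keeping the description, recording, and icon together in one record means a single lookup by category returns all three retrieval cues for a page.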

The choice of a vocal, or voice, recording method over a textual method was based on the fact that the Web is currently textually and visually intensive. We feel that the best way to reduce the cognitive load of any visually intensive application like the WWW is to offload some of the visual information to the auditory channel. This idea is in line with Wickens' (1992) multiple-resource theory of attention, which states that tasks that draw upon different modalities of attention, i.e., visual vs. auditory, will be less likely to interfere with each other. It also follows usability studies conducted by Nielsen (1996). Nielsen found that users of the WWW do not want to read superfluous text or text-intensive displays but would rather go straight to text that is highlighted. In addition, Nielsen points out that users do not like to scroll through information. As a corollary to scrolling, we propose that users would rather not have to open a series of folders, or click on each Bookmark, to find the Web page of interest. We feel users would prefer a more direct method of locating the appropriate page. Finally, no one to our knowledge has tried to apply the concept of auditory retrieval of data from WWW pages.

Verbal Annotation
The use of a voice annotation device as a data storage/retrieval method comes from an idea developed by Laurel, Strickland, and Tow (1994) called Placeholder. Laurel et al. (1994) used icons in a Virtual Environment (VE) through which users could leave verbal messages describing their experiences within the VE. These markers were designed to record a user's experience of the VE, within the context of the marker's location. Subsequent users could replay these recordings and 'experience' previous users' thoughts and feelings about the current VE location. Related to the idea of a marker is the notion of a logbook. Archeologists, and researchers in general, usually keep some type of notebook or log about the research they are conducting. They use such a log to record important pieces of information, note the location of that information, etc. when initially researching a topic (personal communication, Drs. Arlen and Diane Chase, 1996). We have combined the two concepts, i.e., Placeholder and a log metaphor, into the LAD architecture.

Verbal annotation has several distinct advantages over the current History/Bookmark system found in the Web, some of which were mentioned above. The main advantage of the LAD system is that it allows users to personalize their notes about the Web pages that they have visited. As Laurel et al. note, "...voice permits greater expressiveness and personalization than writing; it is also more immediate" (1994, p. 122).

In addition to Wickens' (1992) multiple-resource theory of attention, the design of the LAD is also in line with Paivio's dual-coding information-processing theory, as cited by Nugent (1987). Dual-coding theory suggests that information represented by both verbal and image codes is more powerful than a single representation of the same information. Nugent tested this theory by presenting an instructional set for an oscilloscope to Navy personnel in one of six formats: text, audio, text-audio, text-graphics, audio-graphics, or audio-text-graphics. Performance was measured by the number of repeated steps in setting up the oscilloscope, setting errors, and sequence errors. Regardless of a subject's prior training in oscilloscope use, Nugent found that the most efficient and effective performance was obtained under the audio-graphics format. Nugent stated that his results are consistent with dual-coding theory, i.e., a person could alternate between the audio and visual codes to obtain needed information more effectively. Subjects who used a format that included audio instruction generally performed better than subjects who did not use this channel of information. This work further suggested that information presented aurally lasts longer in Short Term Memory (STM) and is less vulnerable to interference than visually presented information. Card, Moran, and Newell (1983) demonstrated this difference in their information-processing model, in which the working memory decay constant for auditory information is on average 7.5 times longer than it is for visual information.

Based on these studies, the LAD should allow for better recall of the information that is contained in a particular Web page as compared to the current approaches. That is, unless the information on the History list or Bookmark page is directly linked to the Web page title, the knowledge of what was on the page is likely to be lost using these current approaches. Information annotated with the LAD, however, will have visual (icon), textual, and auditory cues to aid the retrieval process.

Iconic Representation and Textual Labels
As mentioned above, the LAD allows users to choose a section of the Web page and capture it as an icon. This provides users with a redundant visual cue to the information stored on that page. This feature is similar to a system developed by Lucarella and Zanzi (1996) called MORE, which allows users to choose which nodes of information they want displayed for complex information systems, e.g., hypermedia systems. The authors suggest that such a feature provides users with a visual interface that has high semantic power, or expressibility, of the data structure. They also suggest that this type of interface helps reduce the cognitive load that is placed on users during data retrieval (i.e., a picture is worth a thousand words).

The current History/Bookmarks system creates a list of textual descriptors of Web pages visited by users. This is not an efficient system because, as Wickens (1992) notes, items that are randomly listed in a menu increase search time compared to items that are structured in some manner. In addition, search time increases as the number of items to be searched increases, unless the target is defined by one level along a particular dimension (e.g., color). The use of user-defined icons can decrease processing time, compared to some words, but the icons must be distinguishable from each other and meaningful in context. For example, an arrow could be pointing at something or could indicate movement. Therefore, an icon must have an unambiguous meaning so that it will not be misinterpreted. In addition, all codes, including icons, should be meaningfully related to their referents (Wickens, 1992). Pictorial icons that users choose from a Web page therefore give them a redundant visual cue to the context from which the icon was taken.

In addition, Booher (1975) conducted a study that examined the relative comprehensibility of pictorial vs. textual representations of procedural instructions. He found that the lowest time and error rates were obtained under the pictorial format. However, the most effective performance was obtained under a format that used pictures as the primary source of information with text used as a secondary source of information, i.e., when text was used to clarify the information displayed in the pictures. Lastly, Schmidt and Kysor (1987), in a study that examined the preferred format for airline safety cards, found that passengers preferred cards that were less wordy, more colorful, and more graphical than their counterparts.

Studies have shown that subjects perform best on either textual or pictorial/graphical representations of information, depending on their verbal/spatial abilities. For this reason, Wickens (1992) recommended that information be presented redundantly so that users can capitalize on whichever ability they excel at. Wickens (1992) also states that the best way to reinforce pictorial material is with redundant textual labels. This idea is supported by the findings of Booher (1975) and Schmidt and Kysor (1987).

Taken together, these studies indicate that the use of verbal, textual, and pictorial (i.e., iconic) representations of a Web page should maximize the context of the page and allow for better recall and, therefore, more efficient data retrieval than the current approach of using text alone. This hypothesis was tested in the current study.

Method

Objectives
The objective of this study was to determine if the use of the LAD would improve both user search time and user satisfaction as compared to the current text-based Bookmark approach to annotating and retrieving information on the Web.

Participants
Eight participants took part in this study. Two were undergraduate students who received class credit for their participation. Six were graduate students who were enrolled in the same course as the experimenters. The group consisted of six males and two females. Four participants were randomly assigned to each condition.

Apparatus
The apparatus used in the study consisted of PC-compatible Dell computers running Microsoft Windows 3.11. Each computer was equipped with a 133 MHz Pentium processor, 16 MB of main memory, and speakers and a microphone connected to a Sound Blaster compatible audio system.

The Web browser used was Netscape Navigator 3.0. The software used to develop the LAD Web pages was Netscape Navigator Gold 3.0. The recording software system for the LAD was the Microsoft Windows' Audio Recorder utility.

On average, each file recorded in the final phase of the experiment consumed about 500 Kbytes of disk space. A storage problem could therefore occur, depending on the number and length of the annotations recorded. However, the current trend toward equipping computers with large hard drives and/or Zip drives may easily overcome this potential problem. A Zip drive-equipped computer could store about 200 voice annotations on just one disk.
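
The figure of about 200 annotations per disk follows from simple division. The sketch below assumes a 100 MB Zip disk, a capacity the text does not state, and treats 1 MB as 1000 Kbytes; the function and constant names are illustrative.

```python
ANNOTATION_KB = 500  # average annotation file size reported above
ZIP_DISK_MB = 100    # assumption: capacity of one Zip disk (not stated in the text)


def annotations_per_disk(disk_mb: int, annotation_kb: int) -> int:
    """Whole annotations that fit on one disk (1 MB taken as 1000 KB)."""
    return (disk_mb * 1000) // annotation_kb


def disk_space_mb(n_annotations: int, annotation_kb: int) -> float:
    """Approximate storage, in MB, needed for a given number of annotations."""
    return n_annotations * annotation_kb / 1000


print(annotations_per_disk(ZIP_DISK_MB, ANNOTATION_KB))  # prints 200
print(disk_space_mb(50, ANNOTATION_KB))                  # prints 25.0
```

Under the same assumptions, a user who annotates 50 pages would need roughly 25 MB.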

Procedure
Upon entering the lab, participants were given an instructional set explaining the experimental procedures. Included in this set were instructions that explained how to create a bookmark or voice annotation, depending on which group the participant was randomly assigned to.

Once the participants indicated that they understood the procedures, the experimental session began.

Training Session: Task 1. Each Training Session was composed of two tasks. Task 1 required the participants to search for four pieces of information from two Web sites, related to Biology and Fossils, and either bookmark the appropriate page or create a voice annotation of the information contained on the page, depending on the group to which they were assigned. The LAD Group was instructed to write a seven- to ten-word descriptor of the Web page and to limit their annotation to 30 seconds or less. The Bookmark Group was instructed to place each bookmark into bookmark category folders that were predefined by the experimenters (note that this may be a "best-case" scenario, as many users may not achieve the same efficiency with folders they define themselves). Both groups were instructed to write down the start and finish times of each information search. Time was displayed on the computer's VDT through the use of the computer's clock function. When participants started an information search, they wrote down the start time; when they found the correct Web page, they noted the finish time. Participants were then instructed to read some information found on the page. Finally, depending on the group to which they were assigned, they either bookmarked the page and placed it in the appropriate folder, or created a voice annotation and wrote down a seven- to ten-word descriptor of the page.

Training Session: Task 2. After the information search was finished, the second task of the Training Session was administered. In this task, participants were given a set of four questions that related to the topics that they had read about on the Web pages during Task 1. The participants were required to use the bookmarks (Bookmark Group) or voice annotations and icons (LAD Group) to locate the Web page that contained the information needed to answer each question. The LAD Group had a GUI button that allowed them to jump directly to the Web page. They could also play their voice annotation first, or play it and jump to the Web page at the same time. For this version of the LAD, the experimenter had to select the icon that was displayed to the LAD users. The icon was obtained from a picture found on each Web page that the user had visited. In later versions this function would be automated and handed over to the user. Again, users were required to write down the start and finish times. The finish time was written down once the participants had found, and written down, the answer to each question. The experimenter visually verified the times.

Testing Session: Tasks 1 and 2. After the Training Session, the participants were given a new information search set. The procedures for Tasks 1 and 2 of the Testing Session were exactly the same as they were for the Training Session, except that the participants had to search for ten pieces of information related to ancient Greek culture. After the information search, but before the question task began, participants filled out three questionnaires: a Biographical Data questionnaire (see Appendix A), a Verbal Ability questionnaire, and a Spatial Ability questionnaire (Ekstrom, French, & Harman, 1976). The administration of the questionnaires took 30 minutes. The participants then finished the experimental session by answering 10 questions related to their ancient Greek culture Web search. Finally, the participants completed a Questionnaire for User Interaction Satisfaction (Shneiderman, 1998) and a Mental Demands scale (a modified Cooper & Harper, 1969).

Experimental Design
A between-subjects one-way analysis of variance was used for the experimental design. The independent variable, Interface Group, had two levels (Bookmark, LAD). The dependent variables were Total Time on Task, Total Time Spent on Pages or Questions, and Subjective Ratings of each interface obtained through the Overall User Reactions and Workload scales. Total Time Spent on Pages or Questions was the sum of the individual times needed to find each page or answer each question. Total Time on Task, in contrast, was measured from the start of a particular phase of the experiment to its finish, so it included time in which users may have been doing something unrelated to the experiment, e.g., resting their eyes. Total Number of Correct Responses was also used as a dependent measure in Task 2 of the Training and Testing phases, although we had no reason to suspect that either group would be more accurate than the other.
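
The distinction between the two time measures can be made concrete with a short sketch. The clock readings below are hypothetical, not data from the study, and the function names are illustrative; the sketch assumes all readings fall on the same day.

```python
from datetime import datetime
from typing import List, Tuple

CLOCK_FORMAT = "%H:%M:%S"  # assumed format of the written-down clock readings


def seconds_between(start: str, finish: str) -> int:
    """Elapsed seconds between two same-day clock readings."""
    delta = datetime.strptime(finish, CLOCK_FORMAT) - datetime.strptime(start, CLOCK_FORMAT)
    return int(delta.total_seconds())


def total_time_on_items(item_log: List[Tuple[str, str]]) -> int:
    """Sum of the individual page-search (or question-answer) durations."""
    return sum(seconds_between(start, finish) for start, finish in item_log)


def total_time_on_task(phase_start: str, phase_finish: str) -> int:
    """Duration of the whole phase, including any pauses between items."""
    return seconds_between(phase_start, phase_finish)


# Hypothetical participant: three searches within a six-minute phase.
item_log = [("14:00:10", "14:01:40"), ("14:02:30", "14:03:15"), ("14:04:00", "14:06:00")]
on_items = total_time_on_items(item_log)              # 90 + 45 + 120 seconds
on_task = total_time_on_task("14:00:00", "14:06:00")  # 360 seconds
print(on_items, on_task, on_task - on_items)          # prints 255 360 105
```

The difference between the two totals (105 seconds here) is the time spent between items, which only Total Time on Task captures.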

Results

Biographical Data and Verbal and Spatial Abilities Tests
Results of the Biographical Data Questionnaire are shown in Figures 1 and 2. No significant differences were found between the Interface Groups on any of the biographical data found in Figure 1.

Figure 1. Biographical Data Means

There was also no significant difference between the groups (F (1,7) = 4.91, p = 0.069) on users' self-ratings of expertise on the WWW. The means for this variable are shown in Figure 2.

Figure 2. Mean Computer and WWW Use and WWW Self-Rating

An analysis of variance was conducted on the Spatial and Verbal Abilities questionnaires. No significant difference was found between the two Interface Groups on either of these scales. The means are shown in Figure 3.

Figure 3. Mean Spatial and Verbal Scale Scores

Training Session: Task 1
An analysis of variance was conducted to test for differences between the Interface Groups. With Total Time on Task as the dependent variable, there was a significant main effect for Interface Group, F (1,7) = 6.95, p = 0.038: the Bookmark Group completed the Web page search task significantly faster during training (see Figure 4). There was no significant difference between Groups when Total Time on Pages was the dependent measure.
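
For readers unfamiliar with the statistic, the following is a minimal sketch of how an F ratio is computed in a between-subjects one-way analysis of variance. The completion times are hypothetical values chosen for easy arithmetic; they are not the study's data, and the function name is illustrative.

```python
from typing import Sequence, Tuple


def one_way_anova(groups: Sequence[Sequence[float]]) -> Tuple[float, int, int]:
    """Return (F, df_between, df_within) for a between-subjects design."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of each score around its own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical Total Time on Task values (minutes), four participants per group.
bookmark_times = [10.0, 12.0, 14.0, 16.0]
lad_times = [14.0, 16.0, 18.0, 20.0]
f_ratio, df_between, df_within = one_way_anova([bookmark_times, lad_times])
print(round(f_ratio, 2))  # prints 4.8
```

A larger F indicates that the gap between the group means is large relative to the spread of scores within each group.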

Figure 4. Mean Search Time (Min./Sec.) for Web Page(s): Training Task 1

No significant difference was found between the Interface Groups for any of the dependent variables during Task 2 of the Training phase. The means of each dependent variable are shown in Figure 5.

Figure 5. Mean Number Correct and Search Time(s) (Min./Sec.): Training Task 2

Testing Session: Task 1

No significant differences were found between the Interface Groups for any of the dependent variables during Task 1 of the Test phase. The means of each dependent variable are shown in Figure 6.

Figure 6. Mean Search Times to find Web Page: Test Task 1

Testing Session: Task 2
No significant differences were found between the Interface Groups for any of the dependent variables during Task 2 of the Test phase. The means of each dependent variable are shown in Figure 7.

Figure 7. Mean Correct and Answer Time(s): Test Task 2

Mental Demand and User Interaction Scales
Both the Actual and Ideal ranks of the Mental Demand Scale approached significance, F (1,7) = 5.54, p = 0.057 and F (1,7) = 4.52, p = 0.078, respectively. No significant difference was found between the Groups when the User Interaction Satisfaction score was the dependent variable. The means for all three scales are shown in Figure 8.

Figure 8. Mean Results of User Interaction and Mental Demand Scales

Discussion
The hypothesis that user performance would be superior, in terms of Time on Task, for the LAD Group was not confirmed. No significant differences were found between the Interface Groups on Task 2 in either session of the experiment. Task 2 is the more relevant test of the hypothesis because Task 1 times measured only how long it took a Group to find the correct Web page using the Web browser, not how easy (or difficult) each interface was to use.

However, two of the obtained results might be revealing. First, the Bookmark Group's performance was superior, in terms of Total Time on Task, for Task 1 of the Training Session. Compared to the LAD Group, the Bookmark Group took significantly less Total Time to complete Task 1 (F (1,7) = 6.95, p = 0.038; Bookmark mean = 18.53, LAD mean = 30.36). Second, the self-rating of the users on WWW expertise approached significance, with the Bookmark Group rating themselves as more expert than the LAD Group (F (1,7) = 4.91, p = 0.069; Bookmark mean = 5.75, LAD mean = 4.25).

These results might indicate that the Bookmark Group was more familiar with the Web than the LAD Group. If so, the fact that no significant difference in performance was subsequently found between the two Groups might indicate that the LAD is relatively easy to learn and use. That is, even though the LAD Group might not have been as familiar with the Web as the Bookmark Group was, their performance didn't suffer on any of the tasks except in Task 1 of the Training phase.

Even though the differences were not significant, the direction of the Total Time on Question and Total Time on Task means reversed after the Training Session. That is, unlike the Training Session Task 2 means, the LAD Group means were in the expected direction in Task 2 of the Test Session, when more questions had to be answered (see Table 1). This suggests that the LAD may provide an advantage once users are as experienced with the LAD as they are with Bookmarks. More study is required to address this suggestion.


Table 1
Mean Total Times on Question and Task, in minutes and seconds, obtained for each Interface Group during the Training and Testing Phases of the experiment.

                  Total Time on Question    Total Time on Task
Phase             Bookmark    LAD           Bookmark    LAD
Training Phase    4.16        4.24          4.33        5.20
Test Phase        14.17       10.21         15.56       12.24
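
The reversal in direction can be checked by converting the Table 1 entries to seconds and subtracting. The sketch below reads each entry as minutes.seconds, as the table caption indicates; the function and variable names are illustrative. Entries are kept as strings so that a value such as 5.20 is not shortened to 5.2.

```python
from typing import Dict, Tuple


def to_seconds(min_sec: str) -> int:
    """Convert a 'minutes.seconds' entry such as '14.17' to seconds."""
    minutes, seconds = min_sec.split(".")
    return int(minutes) * 60 + int(seconds)


# (Bookmark mean, LAD mean) copied from Table 1.
TABLE_1: Dict[Tuple[str, str], Tuple[str, str]] = {
    ("Training", "Question"): ("4.16", "4.24"),
    ("Training", "Task"): ("4.33", "5.20"),
    ("Test", "Question"): ("14.17", "10.21"),
    ("Test", "Task"): ("15.56", "12.24"),
}


def lad_minus_bookmark(cell: Tuple[str, str]) -> int:
    """Positive: LAD Group slower; negative: LAD Group faster (seconds)."""
    bookmark, lad = cell
    return to_seconds(lad) - to_seconds(bookmark)


differences = {key: lad_minus_bookmark(cell) for key, cell in TABLE_1.items()}
for (phase, measure), diff in differences.items():
    print(phase, measure, diff)
# prints:
# Training Question 8
# Training Task 47
# Test Question -236
# Test Task -212
```

On this reading, the LAD Group's means were 8 and 47 seconds slower in training but roughly three and a half to four minutes faster in the test phase.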

The results obtained with the Mental Demand scale are somewhat confusing. Although the differences were not significant, the Bookmark Group rated the Actual Demand of their interface as less demanding than the LAD Group rated theirs. The Bookmark Group's Ideal Mental Demand mean was lower than their Actual Mental Demand mean, whereas the LAD Group's Actual and Ideal Mental Demand means were equal. Thus, it seems that the LAD Group felt that the Mental Demand of their interface was about right, i.e., somewhat demanding (and remember that this group was somewhat less experienced than the Bookmark Group). The Bookmark Group apparently felt that their interface was largely undemanding but that it should be even less demanding (see Table 2). This may be an indication of their level of Web experience and frustration with the Bookmark system. It would be interesting to test a more experienced user group on the LAD and determine if this approach reduces the mental demands to those desired by experienced users.


Table 2
Mean Actual and Ideal Mental Demand per Interface Group.

Interface Group    Actual Mental Demand    Ideal Mental Demand
Bookmark           3.62                    2.25
LAD                5.12                    5.12

Another possible explanation for the higher Mental Demands score of the LAD Group may be related to the fact that the LAD group had to use the recorder of the Dell computer. All Bookmark participants were familiar with the bookmark system, but we don't know if any of the LAD participants were familiar with the recording system. If not, then this fact would help explain why the LAD Group had a higher Actual Mental Demand rating than the Bookmark Group.

Conclusions and Future Research
One issue with the current study was the small sample size. Although there is no hard and fast rule on the number of participants that should be run (without conducting a power analysis), it is obvious that more than eight total participants would be desirable for a between-subjects design. One encouraging result was that even with the small sample size, the mean times that were achieved in Task 2 of the Test Session were in the expected direction. This suggests that further study should be conducted, with a larger sample, to determine if the LAD approach does indeed provide advantages for Web-based storage and retrieval tasks.

References

  • Booher, H. R. (1975). Relative comprehensibility of pictorial information and printed words in proceduralized instructions. Human Factors, 17 (3), 266-277.
  • Bower, G. H., Clark, M. C., Lesgold, A. M., & Winzenz, D. (1969). Hierarchical retrieval schemes in recall of categorized word lists. Journal of Verbal Learning and Verbal Behavior, 8, 323-343.
  • Card, S. K., Moran, T. P., & Newell, A. (1983). The psychology of human-computer interaction. Hillsdale, NJ: Lawrence Erlbaum Associates.
  • Cooper, G. E. & Harper, R. P. (1969). The use of pilot ratings in the evaluation of aircraft handling characteristics. Washington, DC: NASA, Report No. TN-D-5153.
  • Ekstrom, R. B., French, J. W., & Harman, H. H. (1976). Manual for kit of factor-referenced cognitive tests. Princeton: Educational Testing Service.
  • Laurel, B., Strickland, R., & Tow, R. (1994). Placeholder: Landscape and narrative in virtual environments. Computer Graphics, 28 (2), 118-126.
  • Lucarella, D. & Zanzi, A. (1996). A visual retrieval environment for hypermedia information systems. ACM Transactions on Information Systems, 14 (1), 3-29.
  • Nielsen, J. (1996). Interface designs for Sun's WWW site. [On-Line]. Available: http://www.sun.com/sun-on-net/uidesign/
  • Nugent, W. A. (1987). A comparative assessment of computer-based media for presenting job task instructions. Proceedings of the Human Factors Society - 31st Annual Meeting, 696-700.
  • Schmidt, J. K. & Kysor, K. P. (1987). Designing airline passenger safety cards. Proceedings of the Human Factors Society - 31st Annual Meeting, 51-55.
  • Shneiderman, B. (1998). Designing the user interface: Strategies for effective human-computer interaction (3rd ed.). Reading, MA: Addison-Wesley.
  • Solso, R. L. (1988). In R. L. Solso (Ed.) Cognitive Psychology. Newton, Mass.: Allyn and Bacon, Inc.
  • Wickens, C. D. (1992). In C. D. Wickens (Ed.) Engineering Psychology and Human Performance. New York: HarperCollins Publishers, Inc.


© Internet Technical Group
Last update: June 1, 1998
URL: http://www.sandia.gov/itg/newsletter/log_annotation.html
hosted by Sandia National Labs

Disclaimer: Neither Sandia Corporation, the United States Government, nor any agency thereof, nor any of their employees makes any warranty, express or implied, or assumes any legal liability or responsibility for the accuracy, completeness, or usefulness of any information, apparatus, product, or process disclosed, or represents that its use would not infringe privately-owned rights. Reference herein to any specific commercial product, process, or service by trade name, trademark, manufacturer, or otherwise does not necessarily constitute or imply its endorsement, recommendation, or favoring by Sandia Corporation, the United States Government, or any agency thereof. The views and opinions expressed herein do not necessarily state or reflect those of Sandia Corporation, the United States Government or any agency thereof.