The Log Annotation Device (LAD): A World Wide Web Storage and Retrieval Tool
Abstract
Introduction
Current Web Practice: History and Bookmarks
The other option users have is the Bookmark function, which lets them mark any Web page for future reference. By using this function, users can create a permanent record of all the Web pages they have visited (and may wish to visit again). The Bookmark system also allows users to create a hierarchy of named folders that can help in page storage and retrieval. For example, users can create a folder named "coins" and place all Web pages associated with this topic in that folder. However, there are several problems associated with this system. First, users can use only a limited number of characters to name a folder before the folder name is truncated in the Bookmark list. Second, search time can be quite lengthy if users have many pages stored in a folder. Third, users may be unaware that they can create category folders. In that case, they may become overwhelmed in their efforts to identify a target Web page within a lengthy list of bookmarks. Fourth, the title that the computer assigns to a Web page Bookmark may not be related to the information that the user wishes to retrieve from that page (a problem shared by the History function). Finally, there is a problem associated with recall. Unless the Web page title is clearly associated with the information that users wish to retrieve, recall of the specific information stored on that page will be difficult at best. With the passage of time and subsequent visits to different Web pages, accurate recall of information contained in bookmarked pages becomes very difficult because of interference and memory decay (Solso, 1988).
Log Annotation Device (LAD)
Description
The choice of a voice recording method over a textual method was based on the fact that the Web is currently textually and visually intensive. We feel that the best way to reduce the cognitive load of any visually intensive application like the WWW is to offload some of the visual information to the auditory channel. This idea is in line with Wickens' (1992) multiple-resource theory of attention, which states that tasks that draw upon different modalities of attention (i.e., visual vs. auditory) are less likely to interfere with each other. It also follows usability studies conducted by Nielsen (1996). Nielsen found that users of the WWW do not want to read superfluous text or text-intensive displays but would rather go straight to text that is highlighted. In addition, Nielsen points out that users do not like to scroll through information. By extension, we propose that users would rather not have to open a series of folders, or click on each Bookmark, to find the Web page of interest. We feel users would prefer a more direct method of locating the appropriate page. Finally, to our knowledge no one has tried to apply the concept of auditory retrieval of data from WWW pages.
Verbal Annotation
Verbal annotation has several distinct advantages over the current History/Bookmark system found in the Web, some of which were mentioned above. The main advantage of the LAD system is that it allows users to personalize their notes about the Web pages that they have visited. As Laurel et al. note, "...voice permits greater expressiveness and personalization than writing; it is also more immediate" (1994, p. 122). In addition to Wickens' (1992) multiple-resource theory of attention, the design of the LAD is also in line with Paivio's dual-coding information-processing theory, as cited by Nugent (1987). Paivio's dual-coding theory suggests that information represented by both verbal and image codes is more powerful than a single representation of the same information. Nugent tested this theory by presenting an instructional set for an oscilloscope to Navy personnel in a text, audio, text-audio, text-graphics, audio-graphics, or audio-text-graphics format. Performance was measured by the number of repeated steps in setting up the oscilloscope, setting errors, and sequence errors. Regardless of a subject's prior training in oscilloscope use, Nugent found that the most efficient and effective performance was obtained under the audio-graphics format. Nugent stated that his results are consistent with Paivio's dual-coding theory, i.e., a person could alternate between the audio and visual codes to obtain needed information more effectively. Nugent also found that subjects who used a format that included the audio instructional method generally performed better than subjects who did not use this channel of information. This work further suggested that information presented aurally lasts longer in short-term memory (STM) and is less vulnerable to interference than visually presented information. Card, Moran & Newell (1983) demonstrated this difference in their information-processing model, in which the working memory decay constant for auditory information is on average 7.5 times longer than it is for visual information.
Based on these studies, the LAD should allow for better recall of the information that is contained in a particular Web page than the current approaches do. That is, unless the information on the History list or Bookmark page is directly linked to the Web page title, the knowledge of what was on the page is likely to be lost using these current approaches. Information annotated with the LAD, however, will have visual (icon), textual, and auditory cues to aid the retrieval process.
Iconic Representation and Textual Labels
The current History/Bookmarks system creates a list of textual descriptors of Web pages visited by users. This is not an efficient system because, as Wickens (1992) notes, items that are randomly listed in a menu increase search time compared to items that are structured in some manner. In addition, search time increases as the number of items to be searched increases, unless the target is defined by one level along a particular dimension (e.g., color). The use of user-defined icons can decrease processing time, compared to some words, but the icons must be distinguishable from each other and meaningful in context. For example, an arrow could be pointing at something or could indicate movement. An icon must therefore have an unambiguous meaning so that it will not be misinterpreted. In addition, all codes, including icons, should be meaningfully related to their referents (Wickens, 1992). Providing users with pictorial icons that they choose from a Web page therefore gives them a redundant visual cue to the context from which the icon was taken. In addition, Booher (1975) conducted a study that examined the relative comprehensibility of pictorial vs. textual representations of procedural instructions. He found that the lowest time and error rates were obtained under the pictorial format. However, the most effective performance was obtained under a format that used pictures as the primary source of information with text used as a secondary source of information, i.e., when text was used to clarify the information displayed in the pictures. Lastly, Schmidt and Kysor (1987), in a study that examined the preferred format for airline safety cards, found that passengers preferred cards that were less wordy, more colorful, and more graphical than their counterparts.
Studies have shown that subjects perform best on either textual or pictorial/graphical representations of information, depending on their verbal/spatial abilities. Because of this, Wickens (1992) recommended that information be presented redundantly so that users can capitalize on whichever ability they excel at. Wickens (1992) also states that the best way to reinforce pictorial material is with redundant textual labels. This idea is reinforced by the findings of Booher (1975) and Schmidt and Kysor (1987). Taken together, these studies indicate that the use of verbal, textual, and pictorial (i.e., iconic) representations of a Web page should maximize the context of the page and allow for better recall and, therefore, greater data retrieval efficiency than the current approach of using text alone. This hypothesis was tested in the current study.
Method
Objectives
Participants
Apparatus
The Web browser used was Netscape Navigator 3.0. The software used to develop the LAD Web pages was Netscape Gold 3.0. The recording software for the LAD was the Microsoft Windows Audio Recorder utility. On average, each file recorded in the final phase of the experiment consumed about 500 Kbytes of storage. Obviously, a space problem could occur, depending on the number and length of the annotations recorded. However, the current trend toward equipping computers with very large hard drives and/or Zip drives may easily overcome this potential problem. A Zip drive-equipped computer could store about 200 voice annotations on just one disk.
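The storage estimate above can be checked with a little arithmetic. The sketch below assumes a 100 MB Zip disk and decimal units (1 MB = 1000 KB); neither figure is stated in the article, and the function name is made up for illustration.

```python
# Rough capacity check for voice annotations.
# ASSUMPTIONS (not from the article): a 100 MB Zip disk and 1 MB = 1000 KB.
ANNOTATION_KB = 500   # average annotation size reported in the article
ZIP_DISK_MB = 100     # assumed disk capacity


def annotations_per_disk(disk_mb: int, annotation_kb: int, kb_per_mb: int = 1000) -> int:
    """Return how many whole annotations of the given size fit on one disk."""
    if annotation_kb <= 0:
        raise ValueError("annotation_kb must be positive")
    return (disk_mb * kb_per_mb) // annotation_kb


print(annotations_per_disk(ZIP_DISK_MB, ANNOTATION_KB))
```

Under these assumptions the result is 200, which agrees with the estimate of about 200 annotations per disk. Longer recordings or a smaller disk would lower that figure proportionally.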
Procedure
Once the participants indicated that they understood the procedures, the experimental session began.
Training Session: Task 1. Each Training Session was composed of two tasks. Task 1 required the participants to search for four pieces of information from two Web sites, related to Biology and Fossils, and either bookmark the appropriate page or create a voice annotation of the information contained on the page, depending on the group to which they were assigned. The LAD Group was instructed to write a seven to ten word descriptor of the Web page and to limit their annotation to 30 seconds or less. The Bookmark Group was instructed to place each bookmark into bookmark category folders that were predefined by the experimenters (note that this may be a "best-case" scenario, as many users may not obtain the efficiency of self-defined folders). Both groups were instructed to write down the start and finish times of each information search. Time was displayed on the computer's VDT through the use of the computer's clock function. Once participants started their information search, they would write down the start time. When they found the correct Web page, they would note the finish time. Participants were then instructed to read some information found on the page. They were then required either to bookmark the page and place it in the appropriate folder, or to create a voice annotation and write down a seven to ten word descriptor of the page, depending on the group to which they were assigned.
Training Session: Task 2. After the information search was finished, the second task of the Training Session was administered. In this task, participants were given a set of four questions that related to the topics that they had read about on the Web pages during Task 1. The participants were required to use the bookmarks (Bookmark Group) or voice annotations and icons (LAD Group) to locate the Web page that contained the information needed to answer each question. The LAD Group had a GUI button that allowed them to jump directly to the Web page. They could also play their voice annotation first and jump to the Web page at the same time. For this version of the LAD, the experimenter had to select the icon that was displayed to the LAD users. The icon was obtained from a picture found on each Web page that the user had visited. In later versions this function would be automated and handed over to the user. Again, users were required to write down the start and finish times. The finish time was written down once the participants had found, and written down, the answer to each question. The experimenter visually verified the times.
Testing Session: Tasks 1 and 2. After the Training Session, the participants were given a new information search set. The procedures for Tasks 1 and 2 of the Testing Session were exactly the same as they were for the Training Session. The only difference was that the participants had to search for ten pieces of information related to ancient Greek culture. After the information search, but before the question task began, participants had to fill out three questionnaires: a Biographical Data questionnaire (see Appendix A), a Verbal Ability questionnaire, and a Spatial Ability questionnaire (Ekstrom, French, & Harmon, 1976). The administration of the questionnaires took 30 minutes. The participants then finished the experimental session by answering 10 questions related to their ancient Greek culture Web search. Finally, the participants completed a Questionnaire for User Interaction Satisfaction (Shneiderman, 1998) and a Mental Demands scale (modified from Cooper & Harper, 1969).
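As an illustration only, a single LAD annotation as described in this procedure could be modeled as a small record holding the page address, the written descriptor, the icon, and the recording. The class name, field names, and example values below are hypothetical; the article does not describe how the LAD stored its data. The limits (seven to ten words, 30 seconds) come from the instructions given to the LAD Group.

```python
from dataclasses import dataclass
from typing import List

# Limits taken from the instructions given to the LAD Group.
MIN_DESCRIPTOR_WORDS = 7
MAX_DESCRIPTOR_WORDS = 10
MAX_ANNOTATION_SECONDS = 30.0


@dataclass
class LadEntry:
    """One annotated Web page (hypothetical structure, not the authors' design)."""
    url: str              # page the GUI button jumps to
    descriptor: str       # written seven to ten word label
    icon_path: str        # picture taken from the page
    audio_path: str       # recorded voice annotation
    audio_seconds: float  # length of the recording

    def problems(self) -> List[str]:
        """List the ways this entry departs from the study's instructions."""
        found = []
        word_count = len(self.descriptor.split())
        if not MIN_DESCRIPTOR_WORDS <= word_count <= MAX_DESCRIPTOR_WORDS:
            found.append(f"descriptor has {word_count} words, expected 7 to 10")
        if self.audio_seconds > MAX_ANNOTATION_SECONDS:
            found.append(f"annotation runs {self.audio_seconds:g} s, limit is 30 s")
        return found


# Placeholder values, invented for the example.
entry = LadEntry(
    url="http://example.org/greek-theatre.html",
    descriptor="Origins of tragedy in ancient Greek theatre festivals",
    icon_path="icons/theatre.gif",
    audio_path="audio/theatre.wav",
    audio_seconds=25.0,
)
print(entry.problems())
```

Keeping the three retrieval cues (icon, text, audio) in one record mirrors the redundancy argument made earlier in the article: any one cue can lead the user back to the same page.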
Experimental Design
Results
Biographical Data and Verbal and Spatial Abilities Tests
Figure 1. Biographical Data Means
There was also no significant difference between the groups (F (1,7) = 4.91, p = 0.069) on users' self-rating of expertise on the WWW. The means for this variable are shown in Figure 2.
Figure 2. Mean Computer and WWW Use and WWW Self-Rating
An analysis of variance was conducted on the Spatial and Verbal Abilities questionnaires. No significant difference was found between the two Interface Groups on either of these scales. The means are shown in Figure 3.
Figure 3. Mean Spatial and Verbal Scale Scores
Training Session: Task 1
An analysis of variance was conducted to test for differences between the Interface Groups, with Total Time on Task as the dependent variable. The analysis revealed a significant main effect for Interface Group, F (1,7) = 6.95, p = 0.038: the Bookmark Group completed the Web page search task significantly faster during training (see Figure 4). There was no significant difference between Groups when Total Time on Pages was the dependent measure.
Figure 4. Mean Search Time (Min./Sec.) for Web Page(s): Training Task 1
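For readers unfamiliar with how an F ratio such as the one reported above is obtained, the following is a minimal sketch of a one-way analysis of variance. The function name and the sample numbers are placeholders chosen for easy arithmetic; they are not the study's data, and the article does not say which software produced its statistics.

```python
from statistics import mean


def one_way_anova_f(*groups):
    """Return (F, df_between, df_within) for a one-way ANOVA over the groups."""
    if len(groups) < 2:
        raise ValueError("need at least two groups")
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    group_means = [mean(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    if df_within <= 0 or ss_within == 0:
        raise ValueError("not enough within-group variation to compute F")

    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Placeholder scores for two groups (NOT the study's data).
group_a = [1, 2, 3]
group_b = [4, 5, 6]
print(one_way_anova_f(group_a, group_b))
```

With these placeholder scores the between-groups sum of squares is 13.5 and the within-groups sum of squares is 4, giving F = 13.5 on 1 and 4 degrees of freedom. A larger F means the group means differ by more than the spread within each group would suggest.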
No significant difference was found between the Interface Groups for any of the dependent variables during Task 2 of the Training phase. The means of each dependent variable are shown in Figure 5.
Figure 5. Mean Number Correct and Search Time(s) (Min./Sec.): Training Task 2
Testing Session: Task 1
No significant differences were found between the Interface Groups for any of the dependent variables during Task 1 of the Test phase. The means of each dependent variable are shown in Figure 6.
Figure 6. Mean Search Times to find Web Page: Test Task 1
Testing Session: Task 2
No significant differences were found between the Interface Groups for any of the dependent variables during Task 2 of the Test phase. The means of each dependent variable are shown in Figure 7.
Figure 7. Mean Correct and Answer Time(s): Test Task 2
Mental Demand and User Interaction Scales
Both the Actual and Ideal ranks of the Mental Demand Scale approached significance, F (1,7) = 5.54, p = 0.057 and F (1,7) = 4.52, p = 0.078, respectively. No significant difference was found between the Groups when the User Interaction Satisfaction score was the dependent variable. The means for all three scales are shown in Figure 8.
Figure 8. Mean Results of User Interaction and Mental Demand Scales
Discussion
The hypothesis that user performance would be superior, in terms of Time on Task, for the LAD Group was not confirmed. No significant differences were found between the Interface Groups on the retrieval tasks (Task 2) in either session of the experiment. Task 2 is the more relevant phase for the hypothesis because Task 1 times measured only how long it took a Group to find the correct Web page using the Web browser, not how easy (or difficult) each interface was to use. However, two of the obtained results might be revealing. First, the Bookmark Group's performance was superior, in terms of Total Time on Task, for Task 1 of the Training Session. Compared to the LAD Group, the Bookmark Group took significantly less Total Time to complete Task 1 (F (1,7) = 6.95, p = 0.038; Bookmark mean = 18.53, LAD mean = 30.36). Second, the users' self-rating of WWW expertise approached significance, with the Bookmark Group rating themselves as more expert than the LAD Group (F (1,7) = 4.91, p = 0.069; Bookmark mean = 5.75, LAD mean = 4.25). These results might indicate that the Bookmark Group was more familiar with the Web than the LAD Group. If so, the fact that no significant difference in performance was subsequently found between the two Groups might indicate that the LAD is relatively easy to learn and use. That is, even though the LAD Group might not have been as familiar with the Web as the Bookmark Group was, their performance did not suffer on any of the tasks except Task 1 of the Training Session. Although not significant, the direction of the Total Time on Question and Total Time on Task means reversed after the Training Session. That is, compared to the Training Session Task 2 means, the LAD Group means were in the expected direction in Task 2 of the Testing Session, when more questions had to be answered (see Table 1). This suggests that the LAD may provide an advantage once users are as experienced with the LAD as they are with Bookmarks. More study is required to address this suggestion.
Table 1. Mean Total Times on Question and Task, in minutes and seconds, obtained for each Interface Group during the Training and Testing Phases of the experiment.
The results obtained with the Mental Demand scale are somewhat confusing. The Bookmark Group's Ideal Mental Demand mean was lower than their Actual Mental Demand mean, whereas the LAD Group's Actual and Ideal Mental Demand means were equal. Although the differences were not significant, the Bookmark Group rated the Actual Demand of their interface as less demanding than the LAD Group rated theirs. Thus, it seems that the LAD Group felt that the Mental Demand of their interface was about right, i.e., somewhat demanding (and remember that this group was somewhat less experienced than the Bookmark Group). The Bookmark Group apparently felt that their interface was largely undemanding but that it should be even less demanding (see Table 2). This may be an indication of their level of Web experience and frustration with the Bookmark system. It would be interesting to test a more experienced user group on the LAD and determine whether this approach reduces the mental demands to those desired by experienced users.
Table 2. Mean Actual and Ideal Mental Demand per Interface Group.
Another possible explanation for the higher Mental Demands score of the LAD Group may be related to the fact that the LAD group had to use the recorder of the Dell computer. All Bookmark participants were familiar with the bookmark system, but we don't know if any of the LAD participants were familiar with the recording system. If not, then this fact would help explain why the LAD Group had a higher Actual Mental Demand rating than the Bookmark Group.
Conclusions and Future Research
References
© Internet Technical Group. Last update: June 1, 1998. Hosted by Sandia National Labs. URL: http://www.sandia.gov/itg/newsletter/log_annotation.html