Evaluation of the Existing UConn Interface
Our evaluation of the existing interface took two forms: an informal heuristic review—such as we had applied to other libraries’ sites—and an analysis of query logs for keyword subject and title searches. We again consulted the well-regarded Web usability principles laid out in Jakob Nielsen’s lists of top ten ways to improve (or diminish) usability when we examined the effectiveness of the research database locator in its current form.12 In addition to Nielsen’s general guidelines and standards, we made use of evaluation principles unique to library websites such as those considered in John Kupersmith’s encompassing analysis of user comprehension of common librarian terms such as database, e-journals, and index.13
It was immediately apparent that the existing RDL interface (see figure 1) failed on several fronts with regard to usability. The interface offered users three means of accessing databases: a keyword search, a “database by title” option (letters of the alphabet linking to all titles beginning with a particular letter), and browse-by-subject pull-down menus broken up into five umbrella subjects: Arts and Humanities, Business, General and Multidisciplinary, Sciences, and Social Sciences. As our usability testing would soon confirm, only one of these means of access—the keyword search—was inviting to most users. The existing design rested on several assumptions: that users would know what they were looking for and be able to correctly type in a title keyword or navigate by first letter, that users who didn’t know titles could accurately choose a correct umbrella subject (and then choose the correct subject from the drop-down menu), and that users who chose the keyword search option would use effective search terms (such as title or broad subject keywords).
The page was also very jargon-heavy, relying in particular on the word “database,” which was repeated several times on the page without explanation or direction. Only the keyword search option offered any instruction, and that only in the form of advice to use exact phrases or “and/or” to narrow searches. Further into the site, things only got worse. Rather than connecting directly to databases, users were led to text-heavy descriptions that varied widely in form from one to the next, with links into the databases that did not stand out visually.
These early visual assessments of the interface helped frame our understanding of the site’s weaknesses. At this stage, we spent some time creating working prototypes for a revised interface. Although the final redesign came out of our assimilation of query log data and several rounds of usability testing, these early conceptual redesigns helped bring into focus our understanding of problem areas and helped us formulate the content of the usability testing that followed.
Our informal heuristic observations were borne out dramatically by an analysis of usage logs from three months: February, March, and May 2006. We gathered data on usage of the three main areas of access: keyword searching, letter (title) browsing, and subject menu/category browsing. Of 41,433 total user actions, there were 18,522 category searches (through pull-down menus), 6,650 title-letter searches, and 15,836 total keyword searches, of which 5,986 were unique.14
Some of this data was encouraging because it suggested user proficiency in the database locator, particularly in the use of title keywords. Users browsed by title about 16 percent of the time, and, of the keyword searches, they searched for database names approximately 5,031 times (32 percent of the total keyword searches). The use of title keyword searching as a means of access seemed to suggest that a statistically relevant number of users did approach the database locator with a clear destination in mind.15
What the usage data indicated about the subject (pull-down) menus was less conclusive as a measure of how successfully users found appropriate databases. The high use of category browsing (users selected subjects from the pull-down menus more than eighteen thousand times, more than 40 percent of total user actions) suggested that users wanted to search by subject, but whether that led to successful discovery of appropriate databases could only be assessed during our subsequent usability testing of the site.
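The percentages quoted above can be rechecked from the reported counts. The following is a minimal sketch, not the authors' method: the constant names and the `share` helper are ours, and the figure of 5,031 database-name searches is the approximation given in the text. Note that the three access-mode counts add up to 41,008 rather than the reported 41,433; this section does not say where the remaining actions fall, so the sketch simply uses the reported total as the denominator, as the text does.

```python
# Counts reported in the text for February, March, and May 2006.
TOTAL_ACTIONS = 41_433          # reported total user actions
CATEGORY_BROWSES = 18_522       # pull-down (subject) menu selections
TITLE_LETTER_BROWSES = 6_650    # A-Z title-letter links
KEYWORD_SEARCHES = 15_836       # all keyword searches
DATABASE_NAME_SEARCHES = 5_031  # approximate, per the text


def share(part: int, whole: int) -> float:
    """Return part as a percentage of whole, rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Labels are ours; each entry pairs a count with the denominator the text uses.
shares = {
    "category browsing, of all actions": share(CATEGORY_BROWSES, TOTAL_ACTIONS),
    "title-letter browsing, of all actions": share(TITLE_LETTER_BROWSES, TOTAL_ACTIONS),
    "database-name searches, of keyword searches": share(
        DATABASE_NAME_SEARCHES, KEYWORD_SEARCHES
    ),
}

for label, pct in shares.items():
    print(f"{label}: {pct}%")
```

The results agree with the rounded figures in the text: more than 40 percent for category browsing, about 16 percent for title browsing, and about 32 percent for database-name searches.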
The most discouraging results came from patrons’ use of the keyword search option when not searching for a database name. Of the more than fifteen thousand keyword searches, 48 percent were unsuccessful because the user attempted what we designated “topic” searches; that is, the user input a narrow research topic rather than a broad subject term. For example, there were multiple searches for topical phrases and words such as “alice walker,” “global access,” “schizophrenia,” “the glass menagerie,” “the orange revolution,” and “marriage ritual.” There were also searches for long book titles, for example, “sex and death in the rational world of defense intellectuals” or “vincent van gogh and a new approach to traditional art practice.” There were one or two searches each on a wide range of miscellaneous topics, from “systemic lupus” to “1970s crime.” Approximately one-third of all keyword searches yielded no results at all.
A significant number of keyword queries (almost one thousand queries, or 6 percent of all keyword searches) made use of words matching academic disciplines: for example “history,” “anthropology,” “business.” The success of these searches was unpredictable and depended on the language in the database descriptions, language we had already discovered in our heuristic analysis to be very inconsistent.
In addition to the above, we were able to make some further observations related to the query logs for keyword searches: Most search phrases were short (three words or fewer); few searches showed knowledge of Boolean connectors; a number of searches would have been successful in other search mechanisms such as the UConn Libraries’ catalog, the eJournal locator, or the SFX Citation Linker; and typos were a significant problem and affected both subject and title searching. Typos in particular were a vexing issue, causing users to retrieve zero results. The logs indicated that users made a wide range of both expected and unexpected typographical errors. It was predictable that PsycINFO would become “PsychINFO” or “Psych Info” and Infotrac would be transformed to “Infotrack,” but it was less obvious that Factiva would become “factivia” or “factive” and that JSTOR would become “jistor.”
The conclusions we drew from our analysis of the usage logs had three principal strands: (1) users were primarily looking for subject and discipline-based browsing tools; (2) users either needed more direction on keyword searching, or they should be steered away from keyword searching altogether; and (3) users who wanted to navigate directly to a title (by keyword or title-letter link) should be able to do so simply and without error based on typographic mistakes. Of these, the need for simple subject-based browsing and the importance of preventing unsuccessful topical searches were paramount, and it would be these modes of searching in particular that we would scrutinize in the usability testing phase of our process.
Usability Testing
After examining the design of other libraries’ database locators and reviews of current key literature, the team turned to the most important element of the assessment process: usability testing. Usability testing is different from other methods of sociological and ethnographic research; its goal is not to provide a thorough evaluation of user behavior, it is to highlight the majority of difficulties that most users would encounter when interacting with a website. Jakob Nielsen, in his article “Quantitative Studies: How Many Users to Test?” suggests that a qualitative test with a handful of users (i.e., no more than five) will uncover a majority of the problems: “When you see several people being stumped by the same design element, you don’t really need to know how much the users are being delayed. If it’s hurting users, change it or get rid of it.”16 We wanted to observe users performing real tasks to uncover which parts of the interface worked well for them and which were showstoppers, that is, which features caused frustration and which facilitated their search.
Developing Task-Based Questions
Based on analysis of query logs, we were certain many users did not understand the purpose of the keyword search box component of the database locator. Preliminary heuristic evaluation also led us to expect usability problems with the other two components available in the existing research database locator: the A–Z list for databases by title and the pull-down menus for subject browsing.
To gather more evidence on our users’ experience with the site, we set about the task of conducting iterative usability testing. The purpose of testing would be to assess the database locator and each of its three distinct components for:
- effectiveness—were users able to complete tasks successfully and how much effort was required to do so?
- efficiency—how much time was needed to complete tasks? and
- satisfaction—what were the users’ perceptions of the site?
To this end, we developed a slate of questions to be used in testing, which included eight preliminary questions, nine session questions, and ten post-evaluation questions (see appendix). The preliminary questions gathered demographic information regarding university status, home campus, length of time at UConn, age range, library employment, declared major or department, experience level with the UConn Libraries’ website, and participation in any library orientation class. The first session question was intended to capture participants’ definition of the word “databases” and understanding of the purpose of the database locator. The other session questions were intended to represent typical research questions that might lead undergraduates to a database locator. Small variations to the questions were created to extend their applicability to graduate students and faculty. These session questions constituted the heart of the testing, being devised as simple, nonleading questions based on realistic tasks a user would perform; each question required participants to experiment with the site and use it to perform tasks. For instance, users were asked to find articles about diabetes, which required that they translate from the topic diabetes to the subject medicine, find a database in medicine, and connect to that database. The ten post-evaluation questions were designed to gather information about the participants’ experience and satisfaction with the site as well as their posttest understanding of the purpose of the site. When each test was completed, the staff administering the test rated each task according to their perception of the participant’s success: 1, “did not complete task”; 2, “completed task, but with difficulty”; and 3, “completed task easily.” In tandem with the questions, an administrator script was developed that included introductory comments and procedural instructions.
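As an illustration of how ratings on this three-point scale might be tabulated per task, here is a minimal sketch. It is not the procedure the team used: the `summarize_task` function and its output layout are hypothetical, and the example scores are invented placeholders, not study results. Only the three rating labels come from the text.

```python
from collections import Counter
from statistics import mean

# The three-point scale described in the text.
SCORE_LABELS = {
    1: "did not complete task",
    2: "completed task, but with difficulty",
    3: "completed task easily",
}


def summarize_task(scores: list[int]) -> dict:
    """Summarize administrator ratings for one task (hypothetical helper)."""
    if not scores:
        raise ValueError("at least one score is required")
    invalid = [s for s in scores if s not in SCORE_LABELS]
    if invalid:
        raise ValueError(f"scores must be 1, 2, or 3; got {invalid}")
    counts = Counter(scores)
    return {
        "mean": round(mean(scores), 2),
        # Report every label, including those no participant received.
        "counts": {label: counts.get(score, 0) for score, label in SCORE_LABELS.items()},
    }


# Placeholder ratings for five participants on one task; NOT data from the study.
example_scores = [1, 2, 2, 3, 1]
summary = summarize_task(example_scores)
print(summary["mean"])
for label, n in summary["counts"].items():
    print(f"{label}: {n}")
```

Reporting the count behind each rating alongside the mean keeps a small sample readable, since with a handful of participants a single failed task shifts the average noticeably.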