To evaluate the service, AskColorado’s Quality Assurance and Evaluation subcommittee (QA&E) was convened. This subcommittee reviews AskColorado chat transcripts monthly and recommends best practices to improve the quality of the service. While evaluating the chat transcripts, QA&E focuses on two major components: quality of response and quality of interaction.16 The authors of this article were members of QA&E and were involved in evaluating chat transcripts for several years.
At the request of AskColorado’s coordinator, QA&E undertook a study in 2006 to identify the prevalence of inappropriate use of the service. The study identified eighty-nine transcripts from 2003 and 2004 that contained offensive, rude, or irrational patron behavior. These transcripts were 8.7 percent and 5.3 percent of the samplings from each year, respectively, leading the committee to conclude that inappropriate use was minimal and perhaps decreasing.17
An unpublished follow-up study of 2005 transcripts identified another seventy-five inappropriate transcripts, 10.2 percent of the sampling. This possible increase in the prevalence of inappropriate behavior prompted the committee to pursue further study, specifically an analysis of librarian behavior in these transactions. The purpose was to identify ways in which librarians’ behavior might prevent or mitigate inappropriate behavior by patrons.
The RUSA Guidelines
The RUSA Guidelines were chosen as the instrument by which librarians’ performance could be measured in this study. They comprise five broad dimensions divided into subordinate measures. Each dimension includes three subcategories specific to librarian–patron interaction settings: general, in-person, and remote. The remote subcategory focuses on reference encounters by chat, e-mail, or telephone. A brief summary of the RUSA Guidelines and how they were applied to this study follows. Appendix A provides our adaptation of the RUSA Guidelines to create an instrument with which to evaluate transcripts.
- Approachability: “In order to have a successful reference transaction, patrons must be able to identify that a reference librarian is available to provide assistance and also must feel comfortable in going to that person for help.” Approachability in this study was determined by the time elapsed between a patron’s log-in to AskColorado and a librarian’s response, and by the tone of the librarian’s greeting, a function of RUSA Guidelines 1.2 and 1.5.
- Interest: “A successful librarian must demonstrate a high degree of interest in the reference transaction.” Interest in this study was determined both quantitatively, by measures of “word contact” (how frequently librarians sent messages), and qualitatively (how explicitly librarians indicated interest in working with the patron). RUSA Guideline 2.6 was evaluated with these two approaches, and the results were aggregated to determine a score for interest.
- Listening/Inquiring: “Strong listening and questioning skills are necessary for a positive interaction.” This area was one of the largest included in this study, incorporating primarily ordinal scales for 3.1 and 3.3–10.
- Searching: “The search process is the portion of the transaction in which behavior and accuracy intersect.” Searching was another significant area applied to this study, using a combination of two-point and ordinal scales for most of the 4.0 subordinate areas.
- Follow-up: “The librarian is responsible for determining if the patrons are satisfied with the results of the search.” Follow-up was determined in this study as an aggregate score of two-point scales for 5.1, 5.2, 5.4, 5.5, 5.7, 5.8, and 5.9 (remote).
Though not all RUSA Guidelines could be applied to this study, the authors felt a majority of them were applied in a sufficiently complex way to analyze librarians’ performance in each of the five broad areas.
Method
Because no standard instrument exists by which behavior can be evaluated against the RUSA Guidelines, the authors developed one (see appendix A). Only RUSA Guidelines that were reasonably observable in chat transcripts were used, and each of the five major categories functioned as an aggregate score of all its subordinate measures. This method was used so that a macro-level analysis would be possible. Models for using the RUSA Guidelines to evaluate transcripts have since been designed, but when this study began there was only one.18 Most of the rubrics developed for this purpose employ chiefly two-point scales, where the coder simply assesses whether or not a guideline was observed, and the analysis centers on the prevalence of behaviors observed in the transcripts. The instrument in this study employed both ordinal and two-point scales: coders rated the extent to which a behavior was observed on a 0–5 point scale for every measure that lent itself to that method, and used yes-or-no scales for those that did not. The authors believed this would result in a finer instrument, perhaps measuring the librarians’ performance more thoroughly.
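The aggregation of mixed scales into one category score can be illustrated with a minimal sketch. The article does not state how the subordinate measures were combined, so the rescaling of each measure to a 0–1 range and the use of a simple mean are assumptions made here for illustration only. The function names and the example codes are hypothetical and are not data from the study.

```python
from statistics import mean


def normalize(score, scale_max):
    """Rescale a raw code to the 0-1 range so mixed scales can be combined.

    Assumption: ordinal measures run 0-5 and two-point measures 0-1.
    """
    if not 0 <= score <= scale_max:
        raise ValueError(f"score {score} is outside the range 0-{scale_max}")
    return score / scale_max


def category_score(measures):
    """Aggregate one category from its subordinate measures.

    `measures` maps a guideline number to a (raw code, scale maximum) pair.
    Assumption: the aggregate is the unweighted mean of the rescaled codes.
    """
    return mean(normalize(score, scale_max) for score, scale_max in measures.values())


# Invented example codes for a single transcript (not study data).
follow_up = {"5.1": (1, 1), "5.2": (0, 1), "5.4": (1, 1), "5.5": (1, 1)}
listening = {"3.1": (4, 5), "3.3": (2, 5), "3.4": (5, 5)}

print(category_score(follow_up))
print(category_score(listening))
```

Rescaling before averaging keeps a single 0–5 ordinal measure from outweighing several yes-or-no measures in the same category. A raw sum would be a simpler alternative if that weighting were intended.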
But the finer instrument was also more complicated. The scale underwent three major revisions before the three coders tested it using three randomly selected transcripts. The test showed that the three coders disagreed on sixteen of the thirty-two measures, and on six of them disagreed quite starkly. The authors felt the instrument needed to be refined and that inter-rater reliability statistics should be used to test it. Two additional revisions to the instrument were made, focusing on the six measures on which there was the most disagreement. In addition to many changes in language and definition, measure 3.2 was changed from a two-point scale to a nominal scale. After these changes were made, the original three transcripts and an additional three were used to test the instrument again, so that more than 5 percent of the sampling would undergo inter-rater reliability testing.
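The kind of comparison described in this pilot test can be sketched in a few lines. This section does not name the reliability statistic the authors used, so the simple percent agreement shown here is an assumption chosen for illustration, and it does not correct for chance agreement. The function names and the codes assigned to each coder are hypothetical.

```python
def disagreements(codes_by_coder):
    """Return the measures on which the coders did not all assign the same code."""
    measures = codes_by_coder[0].keys()
    return [m for m in measures if len({codes[m] for codes in codes_by_coder}) > 1]


def pairwise_agreement(codes_a, codes_b):
    """Share of jointly coded measures on which two coders gave identical codes."""
    shared = codes_a.keys() & codes_b.keys()
    if not shared:
        raise ValueError("the two coders have no measures in common")
    return sum(codes_a[m] == codes_b[m] for m in shared) / len(shared)


# Invented codes from three coders for one transcript (not study data).
coder_a = {"1.2": 4, "2.6": 3, "3.1": 5, "5.1": 1}
coder_b = {"1.2": 4, "2.6": 2, "3.1": 5, "5.1": 1}
coder_c = {"1.2": 4, "2.6": 3, "3.1": 1, "5.1": 1}

print(disagreements([coder_a, coder_b, coder_c]))
print(pairwise_agreement(coder_a, coder_b))
print(pairwise_agreement(coder_b, coder_c))
```

Listing the disputed measures by guideline number shows where a revision of language or definition is most needed, which is how the six most contested measures were singled out for the later revisions.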