In my work circle there has been a lot of talk about growing our institutional repository. There is a big push to add meaningful content. The thing that I always get hung up on, though, is usage. I’m very interested in what people find useful, and my feeling is that if I’m going to pitch this service to my faculty, then I need to prove to them that the stuff is actually being seen, rather than simply offering them a theoretical argument about why open access is good and big publishers are evil.
So I decided to do a mini study. I wanted to see what the most viewed items were across several universities. I used ROAR to identify DSpace collections in the US, and then sent emails to the libraries with the 10 largest collections. One library never responded, and another (MIT) shot me down with “I'm sorry to report that our staff is unable to provide that data at this time,” but all the others provided me with a list of their top 20 most viewed items. (Thanks!)
I should note that Georgia Tech and U of Oregon were the only organizations in this sample that allowed open access to their statistics.
The results were very eclectic, as expected, but definite themes emerged. For example, the U of Rochester included many musical scores, U of Michigan was heavy with engineering technical reports, Ohio State had numerous articles from The Ohio Journal of Science, U Oregon featured NewBreed Librarian articles as well as classic texts from Shakespeare, Milton and others, while Oregon State included several environmental topics. U Maryland had the most diverse materials and is unquestionably the most heavily used collection within this sample.
Someone should publish a scholarly article about this and perform a detailed synthesis on these collections, but in the meantime, here are the top viewed items from each of the collections:
- Delivery of DNA and Recombinant Infectious Bursal Disease Virus Vaccines in Ovo (dissertation), 34,768 hits, U of Maryland
- How Do I Do This in ArcGIS/Manifold?: Illustrating Classic GIS Tasks, 18,636 hits, Cornell
- Relaxation studies in the muscular discriminations required for touch, agility and expression in pianoforte playing, 8,764 hits, U of Rochester
- A study of the role of carbon in temper-embrittlement and the effect of temper-embrittlement on the fatigue properties of a 3140 steel, 7,155 hits, U of Michigan
- Dragonflies Taken in a Week, 6,650 hits, Ohio State
- Measurement of delignification diversity within kraft pulping (dissertation), 5,517 hits, Georgia Tech (current year only)
- NewBreed Librarian ; Vol. 2, No. 4, 2,093 hits, U of Oregon
- Estimating the weight of plywood, 500 hits, Oregon State
There is definitely a lot of long-tail action going on too. Most of the repositories featured one or two heavily used items, but then dropped off drastically. All of this leaves me with a pile of questions:
- Why is the U of Maryland IR used so heavily? Their top 3 items blow away everyone else's (34,768 hits; 32,916 hits; and 32,214 hits respectively).
- How are people finding this stuff? Google? Native Searches? Catalog Searches? Direct Links? We need to run an analytics program.
- How many of these hits are from web crawlers or related software?
- Why the long tail? What makes those top few items so popular? And just how long is the tail? Could you say something like 90% of everything in our IR was viewed at least once over the past two years?
- If you place your IR within your metasearch tool, will it pad your results?
- Is there a big difference between views and downloads?
- Why does the DSpace interface still look so mid-1990s?
- How are items obtained? Is it piecemeal or more systematic? Are we building collections or is it random take-what-we-can-get?
- What is the percentage of dissertations? (or, take away dissertations and what have you got left?)
- What non-text items are collected (mp3, videos, jpg, etc.)?
- Leaving the big vision rhetoric aside, what is the goal of each IR?
- How do you measure the success of an IR? Is it volume or downloads or something else?
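A few of the questions above (crawler traffic, how long the tail is, what share of items get viewed at all) could be roughed out from raw view logs. Here is a minimal sketch in Python, assuming you can export each view as an item ID plus a user-agent string. The record format, the `BOT_MARKERS` list, the function names and the sample data are all my own assumptions for illustration, not anything DSpace provides or any of these libraries sent me.

```python
from collections import Counter

# Assumed substrings for spotting crawler user agents; a real list would be longer.
BOT_MARKERS = ("bot", "crawler", "spider", "slurp")


def is_crawler(user_agent):
    """Guess whether a user-agent string belongs to a crawler."""
    ua = user_agent.lower()
    return any(marker in ua for marker in BOT_MARKERS)


def count_views(records, skip_crawlers=True):
    """Count views per item from (item_id, user_agent) pairs."""
    views = Counter()
    for item_id, user_agent in records:
        if skip_crawlers and is_crawler(user_agent):
            continue
        views[item_id] += 1
    return views


def long_tail_summary(views, all_item_ids, top_n=3):
    """Summarize how concentrated the views are across the whole repository."""
    # Items that never show up in the logs count as zero views.
    counts = sorted((views.get(item_id, 0) for item_id in all_item_ids), reverse=True)
    total_views = sum(counts)
    viewed_once = sum(1 for c in counts if c > 0)
    return {
        "share_viewed_at_least_once": viewed_once / len(counts) if counts else 0.0,
        "share_of_views_in_top_n": sum(counts[:top_n]) / total_views if total_views else 0.0,
    }


# Made-up log records, purely for illustration.
records = [
    ("thesis-1", "Mozilla/5.0"),
    ("thesis-1", "Mozilla/5.0"),
    ("thesis-1", "Googlebot/2.1"),
    ("report-2", "Mozilla/5.0"),
    ("score-3", "Yahoo! Slurp"),
]
all_item_ids = ["thesis-1", "report-2", "score-3", "map-4"]

views = count_views(records)
summary = long_tail_summary(views, all_item_ids, top_n=1)
print(views)
print(summary)
```

Note that the share of items viewed at least once needs the full list of item IDs, not just the logs, since an item nobody looked at never appears in a log. Matching on user-agent substrings is also a crude filter; it will miss crawlers that don't identify themselves.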
(If this is your area and you want to work on something together, let me know. I'm devoted to ALA Editions right now, but I'd like to continue this project into 2008.)