As publishers, libraries, and technology providers grapple with customer and end-user demands for discovery as well as delivery of scholarly content, we have seen a surge of scientifically minded search engines and browser extensions released in the last year or two. Have these modern discovery tools finally solved our age-old limitations, from metadata to access controls? Can cutting-edge solutions in data science and authentication unite search and retrieval of scholarly literature (outside the library)?
With my doctoral student cap on, I recently test-drove the latest scholarly search tools to see how well they are responding to reader expectations for synchronized discovery and authorized access of full-text publications. When I last explored search options for scholarly readers, it was obvious that innovations were on the rise, but most academic reader needs were yet to be fully satisfied. At the time of this writing, the latest in search software still has a ways to go.
The newest entrants in the scientific search market demonstrate that the expected focus on the research workflow has indeed arrived. The focus on user experience is clear and commendable. In particular, we’re seeing a strong drive toward solving access control issues, with several competing options for browser plug-ins to sync with institutional credentials and follow a user across various databases and platforms. In my experience, however, no one has yet cracked the code for connecting discovery to full-text delivery. In fact, after tinkering with various browsers and settings and search criteria, I must confess that the newest entrants in the discovery marketplace have not inspired me to change my workflow.
Before I proceed, dear reader, let me clarify that I am not the average scholarly user. Given my professional background in academic and professional publishing, I take up my own doctoral work with a measure of insight into the sausage-making realities of scholarly communications. However, in my Ph.D. journeys, I have experienced the same tangled-web realities of digital and remote academic pursuits as typical researchers. (Also note: This post does not pretend to emulate the type of rigorous bibliometric or systems testing conducted by other information researchers.)
While working on a doctoral publishing project the other day, polishing the reference list prior to submitting an original article, I needed just a quick citation check — and was quickly reminded of the chaotic information experiences and obstacles in digital research. While fuming about a simple metadata mismatch that cost me 20 minutes of biblio-sleuthing, I began to question my own research practice and citation-search workflow. Usually, I start with my library’s discovery layer or Google Scholar (where I have set up library links). But, maybe I’m stuck in a rut? Perhaps there are newer tools I should be learning and adopting? Might the latest plug-ins be the answer to some of these stumbling blocks? Should I rethink my off-campus approach to content discovery and access?
I decided it was worth a few hours one Sunday to run some discoverability tests, much like those I employ in my professional life. Using one of the known items from my search project a few days prior, I used a heuristic (exploratory) method to compare the results and choose-your-own-adventure pathways offered by several new free / mainstream academic search products. As you’ll see in my results, my conclusion that none of these tools gave me reason to change my workflow would change dramatically if (when?) my library were to license one or more of the leading solutions for off-campus authentication.
My baseline for this comparative exercise was my successful (albeit sometimes frustrating) library or library-enabled Google Scholar experiences. Although I can expect the occasional embargo period, broken link or indexing error, I have developed a somewhat efficient formula for using the two together as my go-to search starting points. And, as both information professional and information scholar, I have a degree of faith in both my university library’s systems and the Scholar database to surface the most relevant, authoritative, accessible content for my initial search purposes. In this test, a search of my library’s installation of Primo Central located the article in question and offered a handful of access options via a link resolver page. Similarly, the same search on Google Scholar pointed me to the target article and the library link sent me to the same link resolver page of institutionally enabled access options.
With that benchmark in mind, I began my tests with the newest open-web search tool that week, 1Findr, from the Canadian-based 1Science. Entering the paper’s title, author surname, and publication year into the search bar produced relevant results, including my target article. While the DOI link pointed to the primary publisher site, the bold full-text button took me straight to a PDF of the version of record on the author’s institutional repository. For my citation-checking purposes that day, I trusted the former over the latter, but on the whole I’d say that 1Findr is one to watch.
Using the same query format, I next explored the free version of Dimensions, part of the suite of workflow products from Digital Science. Initially, I expected to find a new STM search engine, but was surprised to see numerous multidisciplinary subject filters across the social sciences. Despite Dimensions’ efforts to be multidisciplinary, however, my target paper was not to be found there. Although there were a number of headings within the information sciences, they favored the hard-science-leaning fields of systems design and bibliometrics. From what I’ve seen so far, Dimensions is a promising platform with some pretty sharp tools, in particular to filter and compare various data points, covering dozens of information needs across the industry. I can imagine I’ll return to Dimensions for other needs, both scholarly and professional. However, for my initial search-and-retrieve trial run, it did not convince me to change my scholarly discovery process.
As an extended point of comparison, I also tried out my citation search with a few other established scholarly info-seeking options. Asking the same query of Microsoft Academic led me to an endless stream of results, none of which included my target article. The same search in the public version of OCLC’s WorldCat also surfaced thousands of results, but the use of quotation marks did the trick and directed me to the article on the publisher’s site. For good measure, I also tried a few new services that I know to be specially designed for biomedicine or other STM fields. As expected, I was unable to locate my target article through MyScienceWork, much as I have found in the past with ScienceOpen and Semantic Scholar. Frankly, I did not bother testing my citation with Meta, given the hard-science focus (and the required registration before any searching can be done). So, I nixed these as out of scope for the purposes of this test.
I then enlisted help from Kopernio, which was recently acquired by Clarivate and is one of several new browser extensions that aim to facilitate cross-platform content retrieval by leveraging a user’s institutional credentials. After tripping over some bugs and finding the app worked best in Chrome, a new search in Google Scholar led to Kopernio retrieving a version hosted with Academia.edu. I went back to Dimensions and tested five other known items in addition to my original target article, but could not successfully link from Dimensions’ free search results to library-enabled full text via Kopernio. Although my library account appeared to be correctly plugged into the app, I saw no indication of successful proxy access, and all subsequent test searches led either to pages on ResearchGate or Academia.edu, or directly to PDFs with a kopernio.com address (without indication as to the hosting source). While Kopernio was successful in retrieving the necessary item, thus receiving a check-mark in the table below, the results left me with some doubt regarding the business logic and why the article’s source was being obscured. I wonder if other students would question the authority of the sites often produced by Kopernio searches. I’m guessing this linking behavior will raise flags for publishers and other industry stakeholders as well.
Notably, selecting the Unpaywall app from the preview page of my target article produced a message stating that it “couldn’t find any legal open-access version of this article.” I found this very interesting, especially given the PDFs provided by Kopernio and 1Findr. If my library subscribed to the LeanLibrary or Anywhere Access apps, I’d be curious to see how closely they mirror or surpass the experience with Google Scholar library links — and I’d be hopeful they would generate more check-marks in the access column of the table below. Despite this recent spike in browser extensions that aim to bridge the gaps between mainstream search and institutional access, Google Scholar’s library links have proven the most reliable in getting the job done to date.
On reflection, these latest offerings in scholarly search prove that discovery is no longer enough and we’re seeing full-text access more regularly paired with search indexes. All major players seem equally focused on syncing up with authorized institutional access as the ideal scenario, though some are engineering more stringent checks for legitimacy than others. Any scholarly reader searching on the mainstream web is at risk of retrieving outdated versions or questionable resources. However, those apps that default to unverified PDFs on sites like Academia.edu or ResearchGate may increase pressure on libraries and content providers to participate in indexing and data-sharing, to increase the likelihood of capturing usage on primary publisher sites. Or perhaps this practice will do just the opposite and drive away cross-sector cooperation.
In the latest search technology, discovery workflows are increasingly mobilized, with browser plug-ins catching on as an extended service for platforms to deliver both personalization and access to content across the sectors (including non-academic providers, like DeepDyve). To make this magic happen, dozens of successful linkages must be maintained across all necessary library systems, publisher sites, and intermediary data sources. That is a challenge for all service providers, so I acknowledge that performance inconsistencies and bugs are unavoidable. However, I doubt users will see value in a service that does no better than a standard Google search and cannot consistently integrate with off-campus credentials.
Most of these new discovery and access services promote “fast” retrieval in “one click,” which speaks to the motivation to simplify the scholarly user experience. However, the more noble goal may be to strike a balance between quality and convenience, rather than to favor one over the other. Some new providers are prioritizing both authorization and validated versions of record over speed of delivery or convenience — which may cause some uneven user experiences, but are values which are usually well received by publishers and libraries alike. And, ultimately, I believe scholarly users will accept minor delays if they know they are receiving legitimate, citable resources in a way that honors the hard work of the authors behind the research they are reading.
In addition to reliability of content, breadth and coverage of publications may be an even sharper point of the competitive edge in this domain, and some of these newcomers may give current giants, like Scopus and Google Scholar, a run for their money. The newest entrants follow suit in our industry’s dominant focus on journals and the discoverability of articles. While I predict ongoing metadata challenges and resulting usage limitations for reference works, book chapters, and videos, some of these new services are extending their reach to all types of content that libraries and their users care about. This includes patents, grants, preprints, conference proceedings, OA articles, and institutional repository materials, which hints at new opportunities and/or disruptions for traditional library discovery services.
One takeaway from this exercise is that user priorities for discovery tools are largely subjective and situational. In my test case, I had a specific information need to validate a citation, but these search tools are being designed to attend to many different user practices and purposes. Generally, though, I’m finding that the top priorities for most scientifically oriented searchers are accuracy, coverage (both topical and in publication breadth), access, and reliability / authority. The ease of use and usability of search-result page designs sit alongside these priorities, where services with simple, clear, and familiar information architecture win out every time.
In the end, I decided my search process was just fine. My allegiance remains with my library because it is the one player in this landscape that can reliably get me to a citable version of a given publication. I (heart) my library, along with any service, like Scholar, that works with my library to make my workflow as smooth and successful as possible. I admit this means that I accept a lack of personalization and the occasional metadata, proxy access, or other technical issues. And, as my library has invested in Google Scholar’s library links, I trust they will perform due diligence on the newest plug-in options and continue to further support us remote, off-campus students. Together, my library and Google Scholar come closest to Roger Schonfeld’s vision for a scholarly discovery supercontinent.
Acknowledgements: The author wishes to thank Roger Schonfeld, Sara Rouhi, Jason Chabak, Jan Reichelt, and Johan Tilstra for their insights and contributions to this post.