Notes from CNI Fall 2019

Clifford Anderson
Jul 30, 2021

In December 2019, I attended the Coalition for Networked Information (CNI) Fall 2019 membership meeting in Washington, D.C. CNI describes its raison d’être as advancing “the transformative promise of digital information technology for the advancement of scholarly communication and the enrichment of intellectual productivity.” Nancy Godleski and I represented the Vanderbilt University Library at the meeting. What follows are my notes, gently edited and written from a personal perspective, on the plenary presentations and breakout sessions I attended. The full schedule of the meeting is available online. If you want to watch videos of these presentations, I believe some are available on the CNI YouTube channel.

In the opening plenary of the fall meeting, Clifford Lynch, executive director of CNI, described a few issues that he had been “puzzling over” recently. What follows are highlights of Lynch’s environmental scan of emerging developments within librarianship.

  • What is the relation between record-making and record-keeping? How should archivists document events like movements, protests, and natural disasters that take place in real time? Lynch asked about the roles of memory institutions in preserving these collections.
  • How to interpret the past in the face of changing social norms and technologies? How might archives serve the purpose of reconciling hostile communities? What happens when aspects of the past that were hidden away become readily visible? Following the digitization of archives, Lynch pointed to a second round of this dynamic with the rise of computer vision, making people’s faces and handwriting discoverable within our special collections.
  • What about the destruction of places due to climate change and extreme weather conditions? To document these disappearing places we will need to place greater emphasis on 3D imaging and modeling.
  • Research on quantum computing is moving forward, but it is not yet practically deployed apart from a few experimental offerings like IBM Q and Amazon Braket. We should be worried about the security of our cryptographic systems. We may need another decade to develop post-quantum cryptographic systems. Should archivists preserve cryptographically protected flows of data with the hope that quantum computing will make their contents decodable?
  • How do we bridge the gap between research data management (RDM) and so-called “structured knowledge,” i.e., machine-readable data in linked open data (LOD) formats like RDF? In the past, the communities working on RDM and LOD have largely operated independently, but they are increasingly collaborating. How might such collaborations bear fruit for research data management in the coming years?
  • How do we heal the split between the services marketed to individual consumers and the collections sold to libraries? In the face of the balkanization of streaming music and video services, how can libraries collect and provide access to “exclusive” content from these various service providers? In the past, libraries have invested considerable sums to connect balkanized services with tools like link resolvers. But scholars are rebelling against the friction introduced by these tools, effective as they are, by turning to sites like Sci-Hub, not because they lack access to licensed materials but because of Sci-Hub’s ease of use.
  • What are the unintended consequences of protecting consumers’ data privacy? The cost of compliance with GDPR has proved too high for smaller U.S. publishers, which are now blocking access to their sites from Europe in order to avoid implementing GDPR protections. In the short term, this phenomenon has led to a loss of information about the U.S. available to Europeans. How might the emergence of state laws related to patron privacy balkanize the information landscape in the United States?
  • Lynch observed the rise of “adversarial interoperability” as a way to overcome the monopolization of big tech. The phrase points to technologies and tools that make it possible to aggregate data from services that service providers prefer to keep within so-called “walled gardens.” See Cory Doctorow’s Mint: Late-Stage Adversarial Interoperability Demonstrates What We Had (And What We Lost) for background information and thoughts about the underlying dynamic.

Lynch noted that while these matters all have technological components, the primary issues are about “society, practices, economics, organizations, geopolitics.” This provides librarians with a crucial vantage point as we operate across this spectrum in a way that other campus units typically do not.

The first session I attended was Artificial Intelligence: Impacts and Roles for Libraries and was led by Jason Griffey, Director of Strategic Initiatives at NISO, and Keith Webster, Dean of Libraries at Carnegie Mellon.

Griffey distinguished between weak AI, that is, machine learning algorithms trained for specific narrow purposes, and strong AI, which seeks to emulate general human intelligence. He reviewed the problem of biased AI, when algorithms are trained on biased data and wind up magnifying those biases in their outcomes. He also discussed the problem of deepfakes, the creation of which has become democratized to such an extent that you can now visit websites to create ‘fake people’ on demand; see, for instance, ThisPersonDoesNotExist.com. Griffey asked what these developments mean for librarians and scholarly publishing. He pointed to Andromeda Yelton’s HAMLET project, which uses machine learning to identify semantically similar electronic theses and dissertations, as an exemplar project. He also noted the advent of AI-powered search engines, which will have implications for metadata production. Things have advanced to the point that you could outsource your research to an AI tool. Griffey conceded that the results would be “garbage” at present, but they will likely be credible enough to fool experts in ten years. He concluded by sharing his worries about the rise of these AI systems. Are librarians ready to take on the challenges that such systems pose to traditional research processes?

In his talk, Keith Webster discussed the potential of having an artificial intelligence working alongside human intelligence to help overcome ingrained biases and to generate creative insights. Webster noted the symbiosis between open access and artificial intelligence: machine learning algorithms need lots of data and tend to draw on open data and open access articles as sources. Webster asserted that we need to rethink what a library is, pursuing a strategy of dissolving traditional functions in favor of supporting emerging services that researchers are already using in their labs. At Carnegie Mellon, the university library is contributing to the curriculum around AI, teaching humanities students about using data with credit-bearing courses like “Discovering the Data Universe” as well as offering regular noncredit Data and Software Carpentries workshops. The special collections of the university library also provide access to historical objects like the Enigma Machine for those researching the history of computing. Webster emphasized that librarians need to stay in regular dialogue with researchers, building connections between the liberal arts and engineering, for instance. Crucially, the library is not only helping humanists to learn the technologies of AI but also assisting students and faculty across the university in grappling with the ethics of AI. Librarians have a “professional responsibility to be the voice of conscience” about artificial intelligence at their institutions. After his talk, an audience member asked Webster about the background of the librarians who are teaching these skills; Webster noted that he is recruiting new staff members from the data- and computationally intensive disciplines to work alongside trained librarians.

During the second panel session, I attended “Ready or Not: Here Comes Voice Search.” Carl Grant, interim dean of the University of Oklahoma Libraries, started the presentation by asking why librarians should support voice search platforms like Alexa, Siri, etc. He pointed out that children and teens already rely on voice search for their information-seeking needs. Importantly, kids can start seeking answers to their questions before they learn to read or write by using voice search devices. As these children come into higher education, they will also expect libraries to support voice search. The trend will intersect with the business services that Amazon, Google, Microsoft, and others are promoting to the enterprise.

In “Can We Talk? Adding a Smart Assistant Interface to Library Services,” Lisa Smith and Greg Davis of Iowa State University Libraries discussed their efforts to support voice search in the library. Smith and Davis noted that Primo plans to add voice search in its upcoming interface. They have partnered with ConverSight.ai to adapt Libro, a product from public libraries, to an academic library context. The question of handling personal information properly is crucial. If you limit all access to personal information, you restrict what the app can do. In the current version of the software, patrons must sign up for an Alexa account and agree to the sharing of personal information with Amazon. The next phase of their project will move them away from their dependence on the Alexa voice engine, allowing users a choice of back ends. The next version will also integrate the voice agent with a chat agent, which will make the tool accessible to patrons with hearing disabilities. While the technology is ready to be deployed, the university library has not rolled it out to patrons because of policy issues, which are difficult to navigate.

The last session I attended on Monday was titled “Into the Dataspace: Data Science Services on the Ground.” Mike Nutt, associate head of data and visualization services at North Carolina State University (NCSU), described the movement from designing spaces and services for data visualization to designing them for data science. Nutt led the planning and implementation of The Dataspace in the James B. Hunt Jr. Library at NCSU. He and his team of fellow librarians (7 FTE) offer a range of data science workshops, focused primarily on MATLAB, Python, and R. After the presentation, audience members asked questions about collaborating with research computing/high performance computing and also about how to encourage undergraduates to attend workshops. Nutt noted that the visibility of the physical space served as its best advertising since undergraduates do not typically understand the significance of terms like “digital scholarship,” but they do drop by when they see an inviting library space with lots of cool technology.

The first session I attended on Tuesday morning was about setting a research agenda for librarians exploring the domains of data science, machine learning, and artificial intelligence. Thomas Padilla authored a report titled Responsible Operations: Data Science, Machine Learning, and AI in Libraries after a yearlong residency as practitioner researcher in residence at OCLC Research. He developed the report in conversation with librarians and allied cultural heritage professionals across the United States. The report addresses seven different and, as Padilla stressed, interdependent areas. For example, how do we “manage bias” in the library? Might we extrapolate from longstanding practices in librarianship to address distortions in descriptive metadata in order to take on the problem of prejudiced data in machine learning? Would it be possible to share data and models with the library community in order to foster transparency? Padilla emphasized that monocultures cannot mitigate bias; building diverse library staffs is imperative if libraries wish to produce fair data and models.

After Padilla’s presentation, Sarah Shreeves, vice dean at the University of Arizona Libraries and a member of the advisory board for the project, shared her reflections. She emphasized the need to develop a library workforce capable of understanding data science without hiring a single person to tackle the whole problem on behalf of everyone else. If we do not train our existing staff, she noted, we “risk being priced out” of the market for individuals with data science and machine learning expertise. The question of training for new librarians as well as professional development for existing staff resonated among the audience, which is natural as most attendees at CNI come from the ranks of library administrators.

The next session, “Refreshing the Agenda for Collaboration: Library, IT, and New Partners,” was similarly focused on setting an agenda for collaboration, this time among educational technologists, librarians, and information technologists. Clifford Lynch and Joan K. Lippincott, associate executive director of the Coalition for Networked Information, reflected on how much the technology landscape has changed since the early 1990s. IT expertise is no longer centralized in IT organizations, but shared across schools, faculties, libraries, research computing, and specialized centers. Lynch noted that patterns differ “tremendously from campus to campus.” In light of the “massive generational shift,” Lynch and Lippincott convened senior campus leaders from these diverse areas to contemplate and to formulate a collaborative agenda going forward.

Lynch’s “biggest takeaway” around the topic of collaboration was that, while in the past conversations about infrastructure took place bilaterally between campus IT and library leaders, negotiations about information technology these days are fundamentally multilateral, drawing together everyone from digital humanists to data scientists. The value of developing personal relationships with specific leaders in these areas has diminished because the number of conversation partners has increased significantly and participants in these conversations are regularly departing or being on-boarded. This dynamic has made consensus about information technology projects in higher education much more difficult to attain, Lynch emphasized. Another emerging trend that Lynch noted is that support for research technology (think: the people who develop software and support scholarly communication and research-sharing platforms in faculty labs) is currently being provided largely by graduate students or postdoctoral fellows who hold temporary appointments. The nascent field of research technology support should be professionalized, and people already working in this area should be provided with viable career paths within the academy.

For her part, Lippincott spoke about CNI’s conversations around teaching and learning, highlighting, for instance, the development and promotion of Open Educational Resources (OER). The availability of OER textbooks is closely connected with student success, she indicated, as students require access to textbooks to succeed in their courses. She hopes that CNI will promote “an institutional approach” to tackling the major changes taking place in OER as well as commercial electronic textbooks, including faculty and campus bookstores along with librarians and IT professionals. Among the other topics she covered during her wide-ranging talk were support for makerspaces, research data management, and professional development programming (via initiatives like the Software Carpentries).

Lynch wrapped up the session by discussing policy and governance issues, especially ethical issues connected with collecting student success data, protecting the privacy of readers in libraries, and setting up institutional data governance models. Finally, he poignantly asked how we can make activities in these areas sustainable. With research infrastructure dependent on grants and one-time funding of special provostial initiatives, how can universities maintain research infrastructure over the long term? For a full report about these conversations, see IT & Library Leaders, Refreshing the Agenda & Priorities for Collaboration.

The final session I attended was about preserving local television news programming, a subject dear to our hearts at Vanderbilt. Morgan Gieringer, head of special collections at the University of North Texas, gave a talk titled “Pay to play: Licensing Local Television News Content.” She described UNT’s ongoing project to preserve and digitize the evening news collections of NBC5/KXAS, a local television station. She estimated that the digitization effort will cost $3,000,000; UNT has raised $750,000 from a variety of sources toward that effort. Gieringer admitted that she has no idea when the project will be completed; it could take as long as twenty years unless new technologies emerge to speed things up. Crucially, while the station maintained its copyright, it designated the University of North Texas as its sole licensing agent, agreeing to devote revenues from licensing fees to digitization until the collection has been made completely available online, at which point the station and the university will split the revenues. A key takeaway is that the success of any effort to preserve local media depends on fostering strong business relations with local television stations.

The fall meeting concluded with a philosophical reflection on memory and forgetting in the 21st century. Kate Eichhorn, associate professor of culture and media studies at the New School, described changes between the media culture of the past century and the present in her plenary talk, “Forgetting and Being Forgotten: Growing Up in a Digital Era.” On the one hand, children’s ability to document their lives is novel and should be celebrated. But what are the social and political consequences, she asked, of this newfound capacity? She remarked that children do not control the context of their images, which circulate far beyond the narrow settings in which they were created and intended to be seen. Drawing on Erik Erikson’s notion of a “moratorium” between childhood and adulthood, she argued that this moratorium is “eroding” with the “decline of forgetting” powered by the internet. The digital lives of children and teenagers, she contended, “will follow them into the future.” Eichhorn suggested that kids have become acutely aware of their digital reputations and, for that reason, have also become increasingly averse to sharing personal information online. If we do not allow children privacy as they grow into adulthood, she asked, do we not risk calcifying their adolescent commitments to the detriment of personal growth and social development? From a libraries and archives perspective, the effect that the digitization of documents like yearbooks and the preservation of data from school days will have on students’ lives and careers as they transition into adulthood needs careful attention and discussion.

Attending the fall membership meeting of CNI was both exciting and informative. Throughout the conference, I wished that I could have attended more than one of the concurrent sessions. Several attendees described CNI as akin to “drinking from a fire hose.” Given the prominence of data science and artificial intelligence in contemporary higher education, much of what I learned at the meeting related in one way or another to supporting such emerging activities within libraries.

On Monday evening after dinner, I walked over to Kramerbooks, a wonderful bookstore and cafe near Dupont Circle. As I wandered among tables stacked with new volumes, I noticed a line snaking from one end of the store to the other. Inquiring, I learned that Dave Eggers, author of A Heartbreaking Work of Staggering Genius and The Circle, among many others, was holding a signing for his recent book, The Captain and the Glory. I bought a copy and joined others in line, listening to the casual conversations about novels, music, and movies taking place around me. Eggers amiably inscribed the copy for my son, chatting briefly about introducing adult literature to kids. The experience underscored what I already knew, namely, that people cherish encounters with books and authors.

As we teach machines to read, librarians must also help students join communities of literary culture. How to achieve both ends simultaneously under circumstances of fiscal scarcity remains an unsolved challenge for contemporary librarianship.

Originally written in December 2019
