Newspapers and their respective forms help build and sustain communities. They create dense overlapping networks of relationships that have a particularly strong importance to civic culture. If we want to nurture or even maintain civic culture, then a change to the forms of such an important medium demands critical attention. This paper seeks to better understand what happens to the newspaper when its print and microfilm forms are digitized and databased. This transfigurative process creates a complex weave of materials, practices and protocols circulating from public institutions to a series of private corporations, and, as the digitized papers flicker in and out of visibility, raises important questions about access. The result is the epitome of what Roy Rosenzweig has described as “the fragility of evidence in the digital era,” and a natural object for investigation through network archaeology.1
My particular focus here is on historical newspapers, since they represent an increasingly attractive target for corporate information firms like Ebsco, Readex, and ProQuest, which, as a group, offer some of the most sophisticated, wide-ranging and expensive database products to educational institutions and libraries. However, not all historical newspaper databases are corporate-owned. There are also a large number of public, freely available databases on the Internet, ranging from sophisticated digitization initiatives like Chronicling America2 (Library of Congress), to provincial and state initiatives (e.g. Manitobia3), and small-scale local initiatives.4 While these latter projects are typically predicated on servicing public needs by making a community’s documentary heritage widely accessible,5 commercial databases seek to re-commodify an otherwise dead form, generating “new revenue from old news.”6
This paper presents a case study of Canada’s first commercially operated historical newspaper database, PaperOfRecord.com, formerly owned by Cold North Wind (CNW), whose Canadian holdings were procured through an agreement to license microfilms owned by the Canadian Library Association. In 2006, Google purchased the database for its historical news archive (though the sale was not public or even finalized for several years), effectively shutting down PaperOfRecord. Under Google, much of the digitized content was unavailable to the public. Moreover, many researchers suddenly found they could no longer access their primary source materials.7
Changing material forms not only imply a specific politics of information; they also implicate public policy decisions that determine the possibilities for our documentary heritage. This case is interesting because it points to the political economy of information around these databases, an economy that is the consequence of public policy decisions made by Library and Archives Canada (LAC) years before the launch of PaperOfRecord.8 This complex political economy renders highly problematic any simple claims about the relationship between digitizing and networking historical records and the democratization of information precisely because the residual layers of policy, practices and politics are utterly invisible in the digital record. The invisible residue of the past then becomes what Raymond Williams refers to as “the selective tradition”:
Thus certain experiences, meanings, and values which cannot be expressed or substantially verified in terms of the dominant culture, are nonetheless lived and practiced on the basis of the residue – cultural as well as social – of some previous social and cultural institution or formation … It is in the incorporation of the actively residual – by reinterpretation, dilution, projection, discriminating inclusion and exclusion – that the work of the selective tradition is especially evident.9
Understanding how the residual continues to exert its influence entails considering the transfigurations that historical newspaper collections undergo as they move from paper to microfilm to digital scan to database.
My approach attempts to “think media archaeologically,”10 to treat “media cultures as sedimented and layered, a fold of time and materiality where the past might be suddenly discovered anew.”11 I will follow this consideration with a brief history of PaperOfRecord, focusing on its relationship to Canada’s various public library institutions in order to demonstrate how an archaeological approach to media history allows us to map out the hidden networks of relations that govern how a newspaper page appears as a digital object. Network archaeology calls attention not just to the history of a given media object, but also the history of the networks that constitute the object. I suggest that the patient tracing outward, not only “downward,” offers a fuller account of the present-day object.
Tracing the Material Form: Surrogacy, Remediation, Transfiguration
The shift from one medium or form to another is never a self-evident or seamless transfer of meaning and representation. Form, after all, is never innocent.12 Form sets up the conditions that constitute the relationship of the newspaper with its readers; it materializes and symbolizes the way the newspaper “imagines itself to be and to act. In its physical arrangement, structure and format, a newspaper reiterates an ideal for itself.”13 In other words, form is ideological. This observation, of course, has been made by other media theorists, from Benedict Anderson,14 whose interests in the periodicity of newspapers leads him to consider how communities are formed, to James Carey’s now-famous argument about how newspapers perform and instill rituals of communication,15 to McLuhan’s aphorism “the medium is the message.”16
Suggesting that content worked like a “juicy piece of meat carried by the burglar to distract the watch-dog of the mind,” McLuhan argued that the real content of any medium was other media.17 Though he frequently resorted to simple examples to explain this thought (the content of the movies was plays or operas; the content of writing or print was speech), McLuhan opened the door for thinking about media as sets of relations, especially with other media, rather than discrete entities. In short, McLuhan advocated for thinking about media as an interconnected field of relations, which he termed an “environment.”18 In his essay, “Print and Social Change,” McLuhan suggests treating print not as a singular invention, but an assemblage of technologies and practices accumulated over time, including both the material (advancements in paper) and the cultural (habits of reading, for example).19 The work of print, however, was to fix and thus obscure these relations.
Yet new media routinely perform the same sorts of obfuscation, complicating the relationship between the transitory and the permanent, producing an “enduring ephemeral.”20 No better description could be offered for historical newspaper databases that produce new value, not in stabilizing the otherwise ephemeral newspaper (after all, this act was performed long before in libraries), but in constituting new relationships across a multiplicity of texts. What adds value to the digitized and databased newspaper is its relationality to other texts – that is, its potential circulation.
An examination of forms, then, necessarily confronts how they interact with each other as texts circulate through new contexts, technologies and readerships. Roger Chartier writes in Forms and Meanings that “Each of the forms obeys specific conventions that mold and shape the work according to the laws of that form and connect it, in differing ways, with other arts, other genres, and other texts. If we want to understand the appropriations and interpretations of a text in their full historicity we need to identify the effect, in terms of meaning, that its material forms produced.”21 Here Chartier calls attention to how the circulation of texts transfigures them as they encounter other texts, particularly as material paths create networks between readers, texts, producers, distributors, and so on. Of course, all of this is obvious to the book historian; book history has been invested in tracing these material networks for decades. But what happens in the context of transition from print to a digital network, with various analog mediations along the way? And what happens when the audience is no longer invested in the continuous story of a community, but in a piecemeal investigation of a word, a topic, an event, or other singular asset?
What, exactly, are we looking at when we view a digital scan of an historical newspaper page? How do we consider the materiality of the digital historical newspaper page in relation to the objects we know more familiarly, the archived printed page, or the microfilm? Some theorists, historians and librarians have suggested treating these objects as instances of remediation.22 But a remediation of what? Since the majority of digital newspaper databases are scans of microfilm, what exactly is “rival[ed] or refashion[ed] … into the real,” as Bolter and Grusin assert?23 Are digital scans best understood as “surrogates,” as librarians often like to call microfilm? If so, what are they substituting for? Lilly Koltun warns that “An irony being played out in the archival profession, as elsewhere, is the belief, particularly prevalent among the techno-rhetoricians, that data is ‘salvaged’ or rescued by transference, for example, from tape media to digital formats; present readability is bought at the cost of even greater ephemerality and more rapid intervals of future reformatting.”24 I suggest that neither the concept of remediation nor that of surrogacy provides an adequate conceptual framework for thinking about the traces that remain of the power, ideologies, discourses and institutional policies that mark objects as having “intrinsic” or “permanent” value in the language of archives.25
To investigate historical newspaper databases is to confront the material traces of the previous encounters of these texts as they are transfigured from one medium to another. Dilip Gaonkar and Elizabeth Povinelli’s concept of transfiguration concerns the way that the demands of different contexts change the function of cultural objects as they circulate. As such, it orients analysis toward questions of power – chiefly, those that concern the institutions that determine the “intelligibility, livability and viability” of objects in circulation.26 In an era where data is described as the “new oil,” cultural institutions like libraries and archives are struggling to find their place among corporate giants like ProQuest, Ebsco and Google (who, until May 2011, offered free access to digital historical newspapers through its Google News Archive program). What’s at stake in the transfiguration of the newspaper page into data is the embedded politics of preservation. In order to understand access in a digital context, it’s necessary to consider what was saved, how it was saved, and who is offering the content.
Material Transfigurations: Paper, Microfilm, Database
As a cultural form, newspapers – at least since the 1880s – have imposed an order and logic that readers have come to understand and adopt. As newspapers became increasingly commercialized, moving away from partisan sources of funding to advertising revenue, new features emerged to replace the hodge-podge of news, witticisms, advertisements and opinions that characterized the pages of the mid-nineteenth century newspaper.27 News became organized by bold headlines and sub-headings summarizing key elements. Especially on the front pages, headlines began to span multiple columns. Discrete sections emerged, each replete with their own banners. Recurring features like women’s and children’s pages and sporting news of all kinds also began to appear. These sections did not simply imply an order, a logic that enacted its imagined community; they also created predictable rhythms of consumption. Readers came to know that “Kit’s Kingdom” would be found week after week on page 6 of the Saturday Toronto Mail. At the same time, Sunday newspapers across the United States created vast quantities of coupons and contests that disrupted regular periodicity by encouraging readers to collect, re-organize and save their newspapers.28 As Mark Turner suggests, modern media helped establish new rhythms for habits of leisure that themselves responded to the change from an agrarian economy to an industrial one.29
Rotogravure sections, magazine inserts and supplements of all sorts (from toy theatres to art prints) changed the materiality of the newspaper by inserting differently textured and sized paper amid the more familiar newsprint. These material, generic and format changes meant nineteenth and twentieth century readers were subjected to an increasingly rationalized order. This had much to do with the professionalization of journalism and the shifting labour structure of newspaper publishing from a publisher-editor to an increasingly diversified workforce that separated editorship and publishing while transforming the newspaper into a commodified corporate product. This rationalized order was also connected to the commodification of news, to the cultural need for the newspaper to guide and orient its readers in an increasingly complex cityscape.30
Preserved and stored as microfilm, the newspaper remains a modern cultural object. It is linear, following the chronological flow of its paper original. Photographically fixed as a singular image, its images offer a record of the paper: a sedimented cultural object tied to a specific geographic location and specific community of readers. The interface, at least with the old electro-mechanical readers, is determined entirely by the variances of particular machines, since “access” to the original object is not possible. Even digital readers, which also offer distinctive software interfaces that add new layers to the experience of the form, ultimately do not change the way in which a user-reader accesses the data itself.
Regardless of the specific method for accessing this analog form, logics similar to those of newspaper reading govern the use of microfilm. Scrolling mimics the turning of pages. The dominant logic mirrors the temporal flows of publication – or at least the major flows of publication. In most instances, multiple editions were not filmed, nor were most supplements and inserts. For instance, despite the uniqueness of a Sunday edition for the Toronto World, precious few of its Sunday editions were actually microfilmed.
On microfilm, searches occur title by title, reel by reel, image by image. While the newspaper is still in its newsprint form, readers have the option to open it wherever they like, skipping the sports or passing over the funnies. In microfilm, the tyranny of the binding is doubled: there is no way into the paper but to scroll past all the previously filmed pages. Since microfilmed copies of newspapers were made almost exclusively from bound volumes of newspapers, whatever order and inclusions (or, in some always exciting cases, disorder) the bindery imposed are the ones that microfilm readers must follow decades later.
In other words, microfilm is ordered by residual traces of earlier preservational decisions, as Nicholson Baker’s work argues so forcefully.31 Microfilm is not only organized by the logic of periodicity; layered on top of that logic are the vagaries of binding, often not performed by libraries themselves, as well as a library classification order. Further, these decisions, when performed by the publishers and printers themselves, served specific production plans for serials. Laurel Brake has demonstrated that newspaper publishers frequently transformed successive series into volumes, adding new title pages, indices or tables of content, or bound them in boards, sometimes even covered in leather.32 The process of binding re-commodified formerly ephemeral newsprint publications into a new “information logic” that asserted its cultural value through its reference to the book.33 Layered in the object viewed on the screen, then, are many residual traces of ordering schemes and types of cultural valuation that aren’t always self-evident. Without a concerted network-archaeological investigation, these residual traces might remain totally invisible, yet they shape our reading experience of microfilmed newspapers in subtle and important ways.
Another example of the subtle variety: microfilming newspapers flattens geography and compresses temporality. During microfilming, the exclusion of whole issues, or important pieces of them, such as their covers, adverts, or supplements for breaking news, occurs frequently. The result is that the commercial nature of journalism, a publication’s niche markets, and questions of geographical distribution are eschewed in favour of a centralized, orderly, logical volume that tells a chronological story.34 The geographic scope of a microfilm search not only confronts these decisions; it depends entirely on the holdings of a particular institution. Because libraries are organized to serve particular constituencies, they prioritize newspaper holdings for geographically relevant papers (again, reifying the logic of the newspaper’s circulation itself) and rely on the archival code of “intrinsic value” to determine what other titles they might carry.35
Newspaper databases operate by a different logic altogether, where algorithms define the dynamic arrangement of individual elements on a virtual page. In “Database as Symbolic Form,” Lev Manovich argues that the database constitutes a fundamental shift in the organizing principles of culture from linear narratives based on the cultural form of novels and film, to programmable, nonlinear arrays of data that do not privilege any one item over another.36 This organizing principle, which Alan Liu calls “Discourse Network 2000,” bears no resemblance to the ordering schemes imposed by publishers before or after binding, librarians or microfilm companies.37 Within the range of possibilities specified by the programmers of the interface software, the user determines the array of data presented with each search. As Christiane Paul notes, what makes digital databases distinct from their analog predecessors is the ability to retrieve and filter data in multiple ways.38 Further, Manovich writes, “as a cultural form, database represents the world as a list of items, and it refuses to order this list.”39 Networked databases disrupt the temporal and spatial arrangements that once dominated how one read a newspaper (on paper or microfilm), making local small-town papers as available – and potentially interesting as research objects – as major metropolitan papers.40 It is, in short, the end of the tyranny of front pages and big-city dailies, at least on a number of levels. Adverts, personal ads, and birth and death notices exist on the same plane of retrievability, and relevance, as the headline story. As Mussell puts it, “there is no such thing as ephemera, just data.”41
Though Manovich rightly points out that the database destroys the classic modernist order of narrative, a different set of possibilities around narrative emerges anew. Recent focus on the visualization of data drawn from databases, for instance, helps us to tell different narratives through the rearrangement of data by foregrounding relationships that analog ordering schemas had obscured. Networked databases facilitate the creation of large data sets, and comparative research across larger geographic regions. As many university users know, the databases that are increasingly accessed through university libraries are, in fact, multiple databases connected under universal searches, and sold in increasingly complex, and more expensive, packages by a relatively small number of corporations.42
In the world of historical newspaper databases, ProQuest dominates, with Ebsco offering an ever-growing range of thematically packaged databases that pull from a range of historical documents, archives and serials. With databases containing historical documents pertaining to early Americana, New York, Latin studies, African American history and Revolutionary War orderly books, these packages bring together a range of resources that are often only available through visits to particular archival institutions, or to several scattered throughout the United States. These databases, all packaged separately to maximize profit, harness the power of metadata and fully searchable text in order to provide access to “millions of pages, eclipsing all other resources.”43
Further, by providing revenues from licensing, Ebsco also offers archival institutions an important source of income, one that often makes exclusive agreements that lock up their content in these databases too tempting to ignore. The possibilities of the kinds of searches that databases facilitate dispel the myth “of an eternal and finished past as the essence of the meaning of our records” that has governed traditional approaches to archival practice.44 The current cultural desire to digitize – whether to make records more available, or “moderniz[e] for the digital age”45 – not only favours certain power arrangements (technology, capital, etc), but is constituted itself by residual practices of preservation, access and valuation.
It is equally important to recognize the very real changes and limitations that occur as this “data” moves across forms, formats and media, creating other issues as what was once ephemeral endures.46 Jerome McGann argues that Manovich relies too heavily on binaries between databases and archives, or modern, “privileged” narrative and the post-narrative world of digital texts.47 McGann suggests that such binaries fail to account sufficiently for our documentary heritage, and thus fall short in explaining what databases actually do or how they work. The “progressivist story”48 about digital tools suggests that databases provide “liberated knowledge” while archives represent “reified knowledge.”49 This narrative fails to understand how constrained databases are because of the ways their hierarchical organizational structures are embedded in their functioning, especially in their interfaces. The power of databases lies precisely in their “ability to draw sharp, disambiguated distinctions” in abstracting data.50
While this discussion of database theory and its materiality is certainly essential to any analysis of the transition to digitized archived records, a fuller understanding of how the archival record was established in the first place is just as crucial to the discussion. In what follows, I turn to a case study of Canada’s first corporate-owned historical newspaper database, tracing the complex and overlapping roles of public institutions, public policy and private corporations. There are no simple exhortations to make about the value-added nature of databases, nor of the purity of their paper predecessors. Indeed, as we trace through the various layers of policy that affected the record which eventually ended up on PaperofRecord.com’s users’ screens, we begin to see a variety of forces which continue to exert their influence on the contemporary record.
The Public Record: Traces of Policy
Public initiatives to put historical newspapers online range from nationalistic projects, like those launched by the Library of Congress, the British Library, or Bibliothèque nationale de France,51 to the smaller, provincial initiatives like those for the Bibliothèque et Archives nationale du Québec, Our Ontario (now known as Our Digital World), or Manitobia, hosted by the Manitoba Library Consortium.52 These projects are important for their commitment to public access, yet they vary greatly in their stability of funding and allocation of resources and staff over the long term. Art Rhyno, a librarian at the University of Windsor, former board member of the Ontario Library Association and former owner of a family-owned newspaper enterprise, became involved in digitization for both the Our Ontario initiative and as a way of digitizing his family’s vast collection of microfilmed issues of the Essex Free Press.53 The project to digitize as many community newspapers as possible began through collaboration with the Internet Archive on several pilot projects and ended with Rhyno using idle computers at night in the University of Windsor’s library to process the digital scans and run the OCR. Rhyno, interested in keeping costs to an absolute minimum, used an American company to digitize microfilm records for as little as 2.5 cents per page (roughly about $20 a reel) and was able to add only the most basic metadata to the records. The Our Ontario project became an amalgam of some half-completed projects from community libraries that had run out of funding, along with clippings, and some complete runs of papers. 
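The quoted rates above imply a reel length that can be checked with simple arithmetic. The sketch below makes that calculation explicit; the resulting figure of roughly 800 pages per reel is an inference from the two quoted rates, not a number given in the sources.

```python
# Digitization cost arithmetic from the quoted rates in the Essex Free Press
# example: 2.5 cents per page, and roughly $20 per reel of microfilm.
# The implied reel length (~800 pages) is an inference from these two rates.

cost_per_page = 0.025  # dollars per page, quoted rate
cost_per_reel = 20.00  # dollars per reel, approximate quoted rate

pages_per_reel = cost_per_reel / cost_per_page
print(pages_per_reel)  # 800.0
```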
In sum, the holdings of the eventual database were not as complete as those of other similar projects.54 Our Ontario provides a telling example of why a network archaeology approach is ideal for analyzing the current online product: without a careful tracing of each library’s policy and funding constraints, a robust understanding of the digital object that appears today is simply impossible.
The Decentralization Program
While a casual observer might expect to encounter incomplete projects and lack of funding at the local level rather than the federal level, the national library in Canada has virtually no presence at all online. The reason for this dates back to a series of policy decisions in the late 1970s and 1980s. Though there are, no doubt, many policies one could trace, I will trace but one: the Decentralization Program for Canadian Newspapers, which began in 1979. Using the National Library as a coordinating force rather than a centralized depository, it sought to consolidate the preservation of newspapers across the country, with the voluntary participation of the provinces, especially for papers that would not be otherwise microfilmed because they were unlikely to be of interest to commercial enterprises.55 The key to the program was that the responsibility to create and nurture the collection and preservation of newspapers rested with local actors, not the National Library. At the time of implementation, there was no union list of newspapers across the country (not completed until 1987).56 Many collections were in poor condition, unsorted and uncatalogued.57 Three years after the announcement of the new plan, after close collaboration with the provinces, the National Library of Canada had a clear mandate.
Its job was to coordinate the plan across local districts in order to avoid duplication, to create and maintain a union list and register of microform masters, and acquire all microfilms made at the provincial level, as funds permitted.58 Aside from the commitment to collect and preserve all native and ethnic newspapers, along with the first, last and historic supplements of Canadian newspapers, the National Library did not have its own complete collection or even a preservation mandate.59 The commitment to buy copies of provincial microfilms became particularly important to the development of the microfilm project established by the Canadian Library Association.
The long-term legacy of this policy for today’s digitization projects, however, is that Library and Archives Canada does not own its own copies of microfilms from which to digitize. This situation was further compounded by a policy of the 1985 Conservative government requiring all government institutions to avoid operations that could be better conducted in the private sector. The policy effectively ceased the limited amount of microfilming the National Library was doing.60 Consider how Sandra Burrows, a former newspaper specialist for Library and Archives Canada, put it: “With nothing more to bring to the table than its good name and with the added requirement that access to scanned materials must be made available free of charge to anyone who accesses the LAC website, LAC is often left out of the equation of the scanner and the scanned.”61
The Microfilm Committee
The history of the earliest institutionalized microfilming in Canada provides another telling example of how cultural policy, which is never evident on the page before a reader, impacts decisions made decades later. Launched in 1948 with a $15,000 grant from the Rockefeller Foundation, the Microfilm Committee of the Canadian Library Association began its filming efforts with Confederation-era papers. Once these were complete, the committee worked backwards in an effort to film papers from the “earliest dates” in Canada’s colonial history.62 This decision was purely ideological since, even by the committee’s own admission, the papers were printed on good rag-paper stock. Preservation was not driven by deteriorating paper, but by a teleological view of history that focused on firsts.63
The sale price for any of these early CLA-produced microfilms was determined by adding 20% to the cost of producing the negative, which included the cost of collating, labour, insurance and film. In other words, since each sale recovered only that 20% margin, the cost of filming each newspaper was covered only once five copies were sold. By 1949, the Committee raised the price so that it would take only three sales to pay back the cost of producing the film.64 Though the Microfilm Committee of the CLA struggled to keep costs of film purchases reasonable, it could only do so if it continued to sell filmed papers. In time, this pressure to break even began to affect the selection of titles to be filmed. Provinces that ordered the most copies of the final product had their newspapers filmed first.65
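The Committee's break-even arithmetic can be sketched as follows. The dollar figure is hypothetical, used only to illustrate how a 20% markup implies five sales to recover costs; the roughly one-third markup attributed to the 1949 change is an inference from the three-sale figure, not a rate stated in the records.

```python
# Break-even sketch of the CLA Microfilm Committee's pricing model as
# described above: the sale price added 20% to the cost of producing the
# negative, so each sale recovered 20% of that cost. The negative_cost
# value is arbitrary, chosen purely for illustration.

negative_cost = 100.0  # hypothetical cost of producing the negative, dollars

margin_1948 = 0.20  # 20% added to cost per copy sold
sales_to_break_even_1948 = negative_cost / (negative_cost * margin_1948)
print(sales_to_break_even_1948)  # 5.0 sales needed to cover the negative

# Inferred 1949 margin: three sales covered the cost, implying ~one third
margin_1949 = 1 / 3
sales_to_break_even_1949 = negative_cost / (negative_cost * margin_1949)
print(round(sales_to_break_even_1949))  # 3 sales
```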
By 1953, the desire for the sound fiscal management of the Committee had compromised the selection process. Short, relatively rare titles had been favoured over longer runs of what were considered to be average daily papers and periodicals. By this time, some newspaper publishers were going directly to filming and skipping the binding of volumes altogether, but there was little to no incentive for them to take on the task of filming their already-bound volumes. Weekly local papers remained virtually untouched, unless commissioned specifically on a contract basis, despite the likelihood of these papers being of the greatest interest to historians and genealogists.66 This very brief history of microfilming in Canada demonstrates that the uneven collecting and preservation activities of Library and Archives Canada have meant that other organizations, both public and private, have had to fill in the role of preserving Canada’s newspapers.67
These early, curiously disconnected preservation policies have considerable implications for the presence of Canadian content online today. Connecting databases to the Internet, via other networks or databases, creates a global community of readers. Where many other countries have an online presence produced by their own national archives, there is no Canadian cognate. Instead, there are provincial and local initiatives, leaving the field wide open for private corporations, or publishers themselves, to fill in the gaps at the national level – much like the history of microfilming. But many of these newspaper databases are created and managed by corporations, making access to them contingent on a range of factors that are distinct from the question of access faced by public institutions such as libraries and archives. While access to these privately-owned and operated databases remains an issue, in the context of increasingly jeopardized library budgets and public support for such projects, private corporations are meeting essential needs in our information economy. Making historical records of all kinds more widely available to a larger range of people results not only in greater interest in our shared collective past; it also tends to make people more invested in the processes of archiving our history for the future.
Yet the residual material traces of microfilm and paper remain. Almost all databases of historical newspapers are composed of digital scans of microfilmed copies of bound newspaper volumes.68 This double iteration of paper and microfilm means that, although the database handles the data differently and organizes it in ways that are potentially infinite, access remains limited to choices that were made decades ago. For newspaper historians interested specifically in the ephemeral, in the materiality of newspapers, and in the newspaper form as an object of study in its own right (not as a means to learning about something else), databases offer access to only some kinds of data. Colour – a key factor that transformed newspapers in the 1890s – is not restored. Illustrations remain black smears that fail to convey the artistry and technical virtuosity in this form of printing and engraving. Format is no more discernible than it had been previously. Missing sections and inserts of all kinds are irretrievable. That is to say, many of the same limitations that existed with microfilm have persisted – though, of course, many others – important others – have disappeared.
Once filming is done, microfilms can be copied easily and sent far and wide. In the case of papers printed on cheap wood pulp newsprint, microfilm offers what crumbling paper cannot. Volumes of newspapers that are so fragile that they are removed from circulation altogether certainly fail to offer better access. Searchable, networked databases extend access to the newspaper in extraordinary ways, but not necessarily in all of the ways that we need them to.
Tracing the transfiguration of the newspaper page as it has circulated through institutions, ideologies and policies demonstrates that claims about increasing access through the digital are, in other words, highly circumscribed from the outset. It further demonstrates that a network-archaeological approach to the history of contemporary objects must also attend to the political economy of the residual traces of previous forms. But these issues become further complicated by new values that come to dominate how old newspapers are re-commodified into information products. Databases persist in using microfilm because digitizing from paper copies – assuming there are adequate copies around (and, because of microfilming, in many cases there are not) – is prohibitively expensive.
Case Study: PaperOfRecord.com
Cold North Wind (CNW), the company that launched PaperOfRecord, Canada’s first website offering digital scans of historical newspapers from Canada and around the world, provides an excellent example. As Bob Huggins, co-founder of CNW, has suggested, digitizing from paper copies is too costly for a for-profit company to bother with.69 Scanning microfilms has the advantage of employing an automated system, with relatively little human intervention in the process, allowing CNW to scan millions of records in a very short period of time.70 The logic of production easily trumps the logic of utility. Some of CNW’s first large-scale scanning was for The Toronto Star (the largest investor in CNW) and The Globe and Mail, for which it was paid by the page. In order for the process to be cost-effective, automation was essential.
CNW developed their digitization process by working with abandoned microfilms found in a storage room in Calgary owned by the Canadian Library Association. Huggins describes these microfilms as “widowed” titles: they tended to be single-reel publications that a library would purchase one copy of, without any hope of an ongoing subscription, as with an actively publishing paper. In fact, these were often titles that the CLA deemed to be historically important enough to microfilm but, depending on when they were filmed, may have cost them more money to film than they recouped in sales. They were of little value to the CLA, and thus 252 titles, including Le Journal de Québec, the Montreal Herald, and the Toronto Telegram, were licensed to CNW, with the option to purchase in future. This arrangement characterized most of the agreements that CNW made with the publishers of the papers it digitized, with the exception of the Toronto Star, Globe and Mail and the National Newspaper Association, who retained exclusive rights to the data and launched their own websites.71
The necessity to exercise the option to purchase the CLA’s holdings came in 2005, when Google began negotiating with CNW for the inclusion of their data in its expanding Google News Archive. According to Huggins, CNW approached LAC to seek out a partnership, and was at the point of signing a Letter of Intent for the purchase of the database for $1.2 million before, as he put it, “everything went to hell in a handbasket,” largely because of the problem of guaranteeing public access.72 With competition intensifying from ProQuest, which had far more capital to secure licensing agreements with large, important newspapers, CNW sold all its Canadian digitized material to Google. At the time of sale, CNW had digitized 20 million pages of newspaper from around the world. Despite the huge cache of data that changed hands, Google reportedly paid “much less than $3 million” for it.73
When it launched in 2001, PaperOfRecord.com was subscription-based; several years later, its Google-supported version offered free access. After the content of CNW’s database was sold to Google in 2006, the site largely disappeared. Currently, the website is running once again on a subscription basis, though the costs are quite modest. In May 2011, Google announced it would end its Google News Archive partners program, continuing to offer what had previously been put online but ceasing the active accumulation of more records. For reasons that remain unclear to most users – even to Huggins, who was involved in negotiating the deal with Google – a large amount of the data never made it online. Users of PaperOfRecord who were in the throes of research suddenly found their source materials gone, with absolutely no explanation from Google about why the material had disappeared, or when it might reappear. The promise of integration through networking failed. To Huggins, the entire situation was puzzling, though a bonanza for CNW’s investors. Despite having purchased all of PaperOfRecord’s content, Google allowed PaperOfRecord to re-launch, though its content is once again behind a paywall. As Huggins described it, “it’s like selling your house, but continuing to live in it for free.”74
Unlike the books project, in which Google itself scanned the books it put online (even building some of the hardware necessary to do so), the historical newspaper collection was largely facilitated through partnerships with aggregators, including ProQuest, Heritage Microfilm, and CNW.75 Though there was considerable controversy around the launch of Google’s news aggregation site, Google situated its historical content within its larger motto of “organizing the world’s information and making it universally accessible” by allowing users to “explore history as it unfolded.” Yet, as the criticism mounted, Google also demonstrated that the archive’s real value lay in offering information about its users to news institutions. For instance, offering access to historical content would show newspapers the true value of their content, and could permit news organizations to adjust subscription rates according to traffic. Access had its limits.
Driven by a huge market of genealogists, sports enthusiasts, and academic and corporate libraries, the opportunity to receive licensing fees has enticed many organizations, libraries and newspapers to sell off their holdings to entities that will scan, aggregate and re-sell them at a profit. Efficiency is key on every level, from licensing to scanning and processing the pages and adding metadata. Source materials are almost always microfilms precisely because digitizing them is the cheapest way to automate the scanning process. Further, the metadata for these files is not usually written by librarians and archivists, but outsourced to IT workers. The ways that these bits of data behave, including their functions, uses, and relationships to other objects, are determined not by recognized indexing and classification standards, but by the cheapest means possible.76
Much like the hierarchical classification systems that dominate traditional archives and libraries, digital systems similarly mobilize hierarchical systems that are equally ideologically loaded.77 Following Siva Vaidhyanathan, I am suggesting that what is desperately needed is an intervention that places the public good at the heart of its venture.78 What governs these private databases is not the logic of the public good, or the logic of scholarship. It is the logic of the marketplace – more specifically, the logic of the junk bond.79 Working with historical materials under the logic of the database requires all of us to consider the epistemological and democratic potential lost and gained by this changing form.
As I’ve argued, the study of cultural policy should be key to the practice of network archaeology for several reasons. It helps to provide explanations for otherwise puzzling decisions about the selection of sources. It also helps to identify the residual material histories of the formal transfigurations that print can undergo before digitization occurs. Historical accounts cannot only dig into a medium; they must look across networks of media as well. This is important because these transfigurations had their own logic – logic sometimes determined decades before digitization occurred by the policies of geographically and culturally specific institutions. Without doing the patient research necessary to make sense of such transfigurations, it can be difficult, if not impossible, to understand that the forms we see onscreen are not so much faithfully captured images of a mass-produced print artifact as they are dense laminations of bureaucratic and commercial decisions.
Roy Rosenzweig, “Scarcity or Abundance? Preserving the Past in a Digital Era,” The American Historical Review 108, no. 3 (2003): 736. ↩
See, for instance, www.fultonhistory.com ↩
Projects like the Library of Congress’ Chronicling America project and the Gallica database by the Bibliothèque Nationale de France have explicitly nationalist agendas attached to them as well. ↩
This became the motto for Cold North Wind when it discovered that interest in their services went beyond strictly archival sales. Jeff Nelson, a co-founder of CNW, suggested at the time of launch for PaperofRecord.com that their services would provide “unlimited shelf life” to the “limited shelf life” of microfilm (Jeff Buckstein, “Blowing new life into long-forgotten history: Cold North Wind is digitizing crumbling microfilm records of newspapers so that stories hundreds of years old are permanently preserved on the Web.” Ottawa Citizen. 3 Aug 2001). ↩
The observation was originally made by Robert Townsend on the blog for the American Historical Association: http://blog.historians.org/news/771/paper-of-record-disappears-leaving-historians-in-the-lurch. A search of the Google news forums offers a record of the range of complaints and conversations that took place between users after the data disappeared while under Google’s control. ↩
Library and Archives Canada (LAC) was formed after the merger of the National Archives of Canada and the National Library of Canada (NLC), announced in 2002. The National Library was founded in 1953. ↩
Raymond Williams, Marxism and Literature (Oxford, UK: Oxford University Press, 1977), 122–123. ↩
Jussi Parikka, What Is Media Archaeology? (Cambridge, UK: Polity Press, 2012), 2. ↩
Parikka, What Is Media Archaeology?, 3. ↩
Kevin G. Barnhurst and John C. Nerone, The Form of News: A History (New York: Guilford Press, 2001), 7. ↩
Barnhurst and Nerone, The Form of News, 3. ↩
Benedict Anderson, Imagined Communities: Reflections on the Origin and Spread of Nationalism (London: Verso, 1983). ↩
James W. Carey, Communication as Culture: Essays on Media and Society (Rev. ed. New York: Routledge, 2009), 13–36. ↩
Marshall McLuhan, Understanding Media: The Extensions of Man, 1st ed. (New York ; Toronto: McGraw-Hill, 1964). ↩
McLuhan, Understanding Media, 18. ↩
Marshall McLuhan, “The Relation of Environment to Anti-Environment,” in Marshall McLuhan – Unbound, ed. Eric McLuhan and W. Terrence Gordon, vol. 4, 20 vols., 2005th ed. (Corte Madera, CA: Ginko Press, 1966). ↩
Marshall McLuhan, “Printing and Social Change,” in Marshall McLuhan – Unbound, ed. Eric McLuhan and W. Terrence Gordon, vol. 1, 2005th ed. (Corte Madera, CA: Ginko Press, 1959), 13. ↩
Wendy Hui Kyong Chun, “The Enduring Ephemeral, or the Future Is a Memory,” Critical Inquiry 35, no. 1 (2008): 148–171. ↩
Roger Chartier, Forms and Meanings: Texts, Performances, and Audiences. From Codex to Computer (Philadelphia, PA: University of Pennsylvania Press, 1995), 2. ↩
Laurel Brake, “The Longevity of ‘ephemera’: Library Editions of Nineteenth-century Periodicals and Newspapers,” Media History 18, no. 1 (2012): 7–20; Katrine Mallan and Eun Park, “Is Digitization Sufficient for Collective Remembering? Access to and Use of Cultural Heritage Collections,” The Canadian Journal of Information and Library Science 30, no. 3–4 (2006): 201–220. ↩
Jay D. Bolter and Richard Grusin, Remediation: Understanding New Media (Cambridge, MA: MIT Press, 2000), 65. ↩
Lilly Koltun, “The Promise and Threat of Digital Options in an Archival Age,” Archivaria 47 (1999): 125. ↩
Richard Cox, “The Great Newspaper Caper: Backlash in the Digital Age,” First Monday 5, no. 12 (2000), http://frodo.lib.uic.edu/ojsjournals/index.php/fm/article/view/822/731; Mallan and Park, “Is Digitization Sufficient for Collective Remembering? Access to and Use of Cultural Heritage Collections.” ↩
Dilip Parameshwar Gaonkar and Elizabeth A. Povinelli, “Technologies of Public Forms: Circulation, Transfiguration, Recognition,” Public Culture 15 (October 1, 2003): 396. ↩
In Canada, this process occurred in the 1880s, but in the United States, it began considerably earlier. Barnhurst & Nerone mark the start of the period from the 1780s (chapter 3). Important to note too: commercialization always accompanied the work of even the most partisan newspaper – print jobbing is just one example, see Lisa Gitelman, Paper Knowledge: Toward a Media History of Documents (Durham, NC: Duke University Press, 2014) – as are the ways in which commerce influenced the inclusion of commercial news in early papers (see Barnhurst & Nerone, 68). ↩
For instance, the Detroit Free Press had a scheme that saw the “Free Press Man” knocking on random doors, looking for the previous weeks’ Sunday paper in exchange for a dollar. These types of schemes encouraged readers to treat their papers as something other than an ephemeral object whose utility was used up when the sections were read, or the next day’s edition was released (see Sandra Gabriele, “Cross-border Transgressions: The American Sunday Newspaper, the Lord’s Day Alliance and the Reading Public, 1890-1916,” Topia: A Journal of Cultural Studies 25, no. 2 (2011): 115–132.). Art supplements and posters similarly treated the newspaper as a keepsake, disrupting the usual ordering achieved by the periodicity of a commodified object. For more on art supplements see Sandra Gabriele and Paul S. Moore, The Sunday Paper: The Circulation of Magazine, Cinema and Radio Features in North America, 1888-1922 (University of Illinois Press, Forthcoming). ↩
Mark Turner, “Periodical Time in the 19th Century,” Media History 8, no. 2 (2002): 183–196. ↩
Gunther Barth, City People: The Rise of Modern City Culture in Nineteenth-century America (New York: Oxford University Press, 1980); David M. Henkin, City Reading: Written Words and Public Spaces in Antebellum New York (New York: Columbia University Press, 1998). ↩
Nicholson Baker, Double Fold: Libraries and the Assault on Paper (New York, NY: Vintage Books, 2001). ↩
Brake, “The Longevity of ‘ephemera,’” 3. ↩
James Mussell, “The Passing of Print: Digitising Ephemera and the Ephemerality of Print,” Media History 18, no. 1 (2012): 77–92. ↩
Brake, “The Longevity of ‘ephemera,’” 7. ↩
Cox, “The Great Newspaper Caper: Backlash in the Digital Age.” ↩
Lev Manovich, “Database as Symbolic Form,” in Database Aesthetics: Art in the Age of Information Overflow, ed. Victoria Vesna (Minneapolis, MN: University of Minnesota Press, 2007), 39–60. ↩
Alan Liu, “Transcendental Data: Toward a Cultural History and Aesthetics of the New Encoded Discourse” Critical Inquiry (2004): 49–82. ↩
Christiane Paul, “The Database as System and Cultural Form: Anatomies of Cultural Narratives,” in Database Aesthetics: Art in the Age of Information Overflow, ed. Victoria Vesna (Minneapolis, MN: University of Minnesota Press, 2007), 96. ↩
Manovich, “Database as Symbolic Form,” 44. ↩
This observation is borrowed wholesale from Paul S. Moore. ↩
James Mussell, “The Passing of Print: Digitising Ephemera and the Ephemerality of Print,” Media History 18, no. 1 (2012): 11. ↩
Price quotes provided by ProQuest to institutions for subscriptions to any of its historical databases are considered confidential and cannot be published here. ↩
Ebsco, 5 New Digital Historical Digital Archive Databases on EBSCOhost, 2013, http://support.epnet.com/promotion/promo.php#product. ↩
Koltun, “The Promise and Threat of Digital Options in an Archival Age,” 119. ↩
Library and Archives Canada, “Report on Plans and Priorities, 2011-2012” (Ottawa, 2011). Available at http://www.tbs-sct.gc.ca/rpp/2011-2012/inst/bal/bal-eng.pdf ↩
Chun, “The Enduring Ephemeral, or the Future Is a Memory.” ↩
Jerome McGann, “Database, Interface and Archive Fever,” PMLA 122, no. 5 (2007): 1588–1592. ↩
McGann, “Database, Interface and Archive Fever,” 1589. ↩
McGann problematically includes libraries, museums and archives under the moniker of archives. All these institutions have preservation mandates of one nature or another, but given that McGann’s key argument about databases rests on a fundamental misapprehension of “our paper-based inheritance” (1590), the collapse of institutions, practices, mandates and handling of paper-based artifacts is curious and misguided. ↩
McGann, “Database, Interface and Archive Fever,” 1591. ↩
In an important and historic development, la Bibliothèque nationale de France recently signed agreements with private companies to digitize over 70,000 books, 200,000 sound recordings and other documents belonging (either partially or wholly) to the public domain. The agreements guarantee 10-year exclusive rights to the companies digitizing the materials, with only a limited number of them being offered online by the library. As Communia, a European research consortium, points out: “instead of taking advantage of the opportunities offered by digitization, the exclusivity of these agreements will force public bodies, such as research institutions or university libraries, to purchase digital content that belongs to the common cultural heritage. As such, these partnerships constitute a commodification of the public domain by contractual means.” (“French national library privatizes public domain materials,” http://www.techdirt.com/articles/20130130/07141521824/french-national-library-privatizes-public-domain-materials.shtml. 30 Jan 2013). In a more recent development, Library and Archives Canada announced a strategic partnership with canadiana.org, a not-for-profit membership alliance that digitizes Canadian print history. In an unusual twist, the funding for the digitization of some 50,000 microfilm reels is coming from the Canadian Research Knowledge Network, a consortium of research libraries, and the Canadian Association of Research Libraries, largely made up of university libraries. Through this 10-year deal the general public will access the digitized records through the canadiana.org site, while partner libraries will have access to “enhanced features,” presumably including any specialized search functions (for canadiana.org’s take on the deal see: http://www.canadiana.ca/en/lac-project-faq; for a librarian’s perspective see Mita Williams’ blog, New Jack Almanac, http://librarian.newjackalmanac.ca/2013/06/the-heritage-heritage-minute-and.html). ↩
A not-so-complete list can be found at http://www.collectionscanada.gc.ca/newspapers-at-lac/035005-8000-e.html. Also see the list provided by the International Coalition on Newspapers: http://icon.crl.edu/digitization.htm. ↩
Art Rhyno, Personal Interview, Windsor, ON, July 23, 2012. ↩
Rhyno added that since they were dealing with community publishers, licensing was never really an issue for their work. With an implicit agreement that Our Ontario would remove the digital content if a more lucrative option appeared, most publishers thought the utility of having access to digitized records themselves was sufficient. Community libraries were similarly inclined to participate in the project, since they would often have digitized content but lacked the means to put it into a database where the data could be better utilized. ↩
Sandra Burrows, “Online Access to Newspaper Content in Canada,” The Serials Librarian 53, no. 1–2 (2007): 152. ↩
Hana Komorous, “Introduction,” in Canadian Newspapers: The Record of Our Past, the Mirror of Our Time, Proceedings of the Second National Newspapers Colloquium, Vancouver, BC, June 11, 1987 (Ottawa, ON: National Library of Canada; Canadian Library Association, 1989), v. ↩
Sandra Burrows, “The Decentralized Plan for Canadian Newspapers: 1983 to 1994 and Beyond,” in Serials Canada: Aspects of Serials Work in Canadian Libraries, ed. W Jones (Haworth Press, 1995), 55–83. ↩
Burrows, “The Decentralized Plan for Canadian Newspapers,” 58. The mandate of the library also included the responsibility to produce bibliographic and preservation standards, provide limited funding to develop master plans for newspaper collections, and facilitate interlibrary loans across the country and internationally. Regrettably, in the latest round of funding cuts to the Library and Archives Canada, the interlibrary loan program was suspended in 2012. ↩
There was no legal requirement to deposit paper or microforms with the National Library until 1988 when a requirement for legal deposit on microfilms was enacted. ↩
Burrows, “Online Access to Newspaper Content in Canada,” 152. ↩
Burrows, “Online Access to Newspaper Content in Canada,” 153. ↩
James J. Talman, “Twenty-two Years of the Microfilm Newspaper Project,” Canadian Library 25, no. 2 (October 1968): 141. ↩
When the committee was forced to decide between two titles considered to be of equal importance, the paper whose issues were scattered across institutions was chosen so that as complete a run as possible would be provided. The work of the Committee, then, was not simply to preserve papers; it also strove to achieve what couldn’t be achieved with original paper copies. In this way, libraries could purchase as complete a run as possible (Talman 141). ↩
Interestingly, in 1965-6, the CLA lowered the price, so that it was distributed over four copies (Talman 142). ↩
Talman, “Twenty-two Years of the Microfilm Newspaper Project,” 143. ↩
Talman, “Twenty-two Years of the Microfilm Newspaper Project,” 143. Of course, these papers, especially the dailies, were filmed. It took a long time though, with The Weekly Globe, 1844-1849, being filmed in 1953 – some five years after microfilming began. By 1968, the years 1858-1869 were completed and sold for $718. And interestingly, the smaller, local papers that were not microfilmed now make up some of the more interesting digital scans that are available precisely because they are scanned from paper originals – not microreproductions. Viewing these images reminds us of how much is stripped away in the process of shrinking down and stripping out detail. Of course, other microfilming was taking place by a handful of private corporations as well (Starr names the most active as Commonwealth Microfilm Products, Microfilm Recording, Prestong, McLaren, INTECH and Société canadienne du microfilm. Mary Jane Starr, “The Preservation of Canadian Newspapers,” Microfilm Review 15, no. 3 (Summer 1986): 162–164). In these cases, the micropublishers, as they are called, film the newspapers, sell copies to libraries, then pay publishers a royalty. In the case of smaller dailies and weeklies, there is often no royalty to pay, as only the costs are typically recouped. Commercial publishers will typically not film limited-circulation and small-revenue papers at all. In this instance, the National Library’s decentralization plan and the work of the Library Association’s microfilming committee became critical in preserving these newspapers. ↩
For a more exhaustive account of microfilming in the United States, see Baker’s Double Fold (2001). ↩
An exception to this is offered by the Chicago Public Library, which has digitized some Chicago papers, including the Tribune and Examiner, in colour. http://www.chipublib.org/images/examiner/index.php ↩
Bob Huggins, Personal Interview, Montreal, Quebec. March 20, 2012. ↩
In fact, in 2001 when the site launched, it was converting microfilm into searchable data at a rate of 45,000 pages a day, with plans to increase output to 150,000 pages a day (Jack Kapica, “Ottawa’s Cold North Wind stirs international attention.” Globe and Mail, 28 Feb 2001). ↩
Interestingly, the Globe and Mail and Toronto Star databases are now owned and operated through ProQuest. ↩
Bob Huggins, Personal Interview, Montreal, QC. March 20, 2012. Roch Carrier, former National Librarian, has publicly commented that LAC could not partner with CNW because it charges fees to access content: “we are a national institution and our first principle is free access to information.” (Vito Pilieci, “America’s Chronicles, Only in Canada: Ottawa company digitizes 300 years of U.S. newspapers.” Ottawa Citizen. 12 Jun 2001). According to an Ottawa Citizen article in 2008, the deal, which fell apart in 2003, forced the company to lay off more of its staff. After great enthusiasm, the company was already in financial straits, with internet users expecting more free content than was financially feasible to offer. The deal with Google permitted the company to run the website for free for two years before the deal was finalized in 2006. The same article also states that the exclusive digital license rights agreement that CNW had made with the CLA for 500 of its titles was set to expire in 2010 (Bert Hill, “Google expected to take over Ottawa data firm.” Ottawa Citizen. 14 Nov 2008). CNW did eventually partner with Library and Archives Canada to digitize The Dictionary of Canadian Biography in 2003, which was made available on LAC’s website (“Canadian biographies go on-line.” Globe and Mail. 13 Nov 2003). ↩
Hill, “Google expected to take over Ottawa data firm.” ↩
Bob Huggins, Personal Interview, Montreal, QC. March 20, 2012. ↩
By 2006, LexisNexis, Factiva, and Highbeam were also added, especially to offer current content. The agreements meant that these companies often retained their rights to the materials, making only previews available to users for free, with links back to the companies’ databases for subscription access to the remaining content (Hill, “Google expected to take over Ottawa data firm”). ↩
Burrows, “Online Access to Newspaper Content in Canada”; Mallan and Park, “Is Digitization Sufficient for Collective Remembering? Access to and Use of Cultural Heritage Collections.” ↩
Koltun, “The Promise and Threat of Digital Options in an Archival Age”; Mallan and Park, “Is Digitization Sufficient for Collective Remembering? Access to and Use of Cultural Heritage Collections”; McGann, “Database, Interface and Archive Fever.” ↩
Siva Vaidhyanathan, The Googlization of Everything (and Why We Should Worry) (Berkeley, CA: University of California Press, 2011). ↩
Individually the titles bought by CNW were worth little, but bundled, digitized and made searchable, they become valuable and can command a higher subscription price. Junk bonds, as we saw in the recession of 2008-9, are frequently packaged with higher rated bonds in order to increase their overall credit rating, making them seem like highly profitable, yet relatively safe investments. ↩
The author thanks Ashley McAskill and Leslie Corbay for their invaluable assistance on this project.
Article: Creative Commons Attribution-Non-Commercial-NoDerivs 3.0 Unported License
Image: "Chapter 1: Paragraph 16"
From: "Drawings from A Thousand Plateaus"
Original Artist: Marc Ngui
Copyright: Marc Ngui