Amodern 2: Network Archaeology


From Paper to Microfilm to Database

Sandra Gabriele

Changing material forms not only imply a specific politics of information; they also implicate public policy decisions that determine the possibilities for our documentary heritage. This case is interesting because it points to the political economy of information around these databases, an economy that is the consequence of public policy decisions made by Library and Archives Canada (LAC) years before the launch of PaperOfRecord.8 This complex political economy renders highly problematic any simple claims about the relationship between digitizing and networking historical records and the democratization of information precisely because the residual layers of policy, practices and politics are utterly invisible in the digital record. The invisible residue of the past then becomes what Raymond Williams refers to as “the selective tradition”:

Thus certain experiences, meanings, and values which cannot be expressed or substantially verified in terms of the dominant culture, are nonetheless lived and practiced on the basis of the residue – cultural as well as social – of some previous social and cultural institution or formation … It is in the incorporation of the actively residual – by reinterpretation, dilution, projection, discriminating inclusion and exclusion – that the work of the selective tradition is especially evident.9

Understanding how the residual continues to exert its influence entails considering the transfigurations that historical newspaper collections undergo as they move from paper to microfilm to digital scan to database.

My approach attempts to “think media archaeologically,”10 to treat “media cultures as sedimented and layered, a fold of time and materiality where the past might be suddenly discovered anew.”11 I will follow this consideration with a brief history of PaperOfRecord, focusing on its relationship to Canada’s various public library institutions in order to demonstrate how an archaeological approach to media history allows us to map out the hidden networks of relations that govern how a newspaper page appears as a digital object. Network archaeology calls attention not just to the history of a given media object, but also the history of the networks that constitute the object. I suggest that the patient tracing outward, not only “downward,” offers a fuller account of the present-day object.

Tracing the Material Form: Surrogacy, Remediation, Transfiguration

The shift from one medium or form to another is never a self-evident or seamless transfer of meaning and representation. Form, after all, is never innocent.12 Form sets up the conditions that constitute the relationship of the newspaper with its readers; it materializes and symbolizes the way the newspaper “imagines itself to be and to act. In its physical arrangement, structure and format, a newspaper reiterates an ideal for itself.”13 In other words, form is ideological. This observation, of course, has been made by other media theorists, from Benedict Anderson,14 whose interest in the periodicity of newspapers leads him to consider how communities are formed, to James Carey’s now-famous argument about how newspapers perform and instill rituals of communication,15 to McLuhan’s aphorism “the medium is the message.”16

Suggesting that content worked like a “juicy piece of meat carried by the burglar to distract the watch-dog of the mind,” McLuhan argued that the real content of any medium was other media.17 Though he frequently resorted to simple examples to explain this thought (the content of the movies was plays or operas; the content of writing or print was speech), McLuhan opened the door for thinking about media as sets of relations, especially with other media, rather than discrete entities. In short, McLuhan advocated for thinking about media as an interconnected field of relations, which he termed an “environment.”18 In his essay, “Print and Social Change,” McLuhan suggests treating print not as a singular invention, but an assemblage of technologies and practices accumulated over time, including both the material (advancements in paper) and the cultural (habits of reading, for example).19 The work of print, however, was to fix and thus obscure these relations.

Yet new media routinely perform the same sorts of obfuscation, complicating the relationship between the transitory and the permanent, producing an “enduring ephemeral.”20 No better description could be offered for historical newspaper databases that produce new value, not in stabilizing the otherwise ephemeral newspaper (after all, this act was performed long before in libraries), but in constituting new relationships across a multiplicity of texts. What adds value to the digitized and databased newspaper is its relationality to other texts – that is, its potential circulation.

An examination of forms, then, necessarily confronts how they interact with each other as texts circulate through new contexts, technologies and readerships. Roger Chartier writes in Forms and Meanings that “Each of the forms obeys specific conventions that mold and shape the work according to the laws of that form and connect it, in differing ways, with other arts, other genres, and other texts. If we want to understand the appropriations and interpretations of a text in their full historicity we need to identify the effect, in terms of meaning, that its material forms produced.”21 Here Chartier calls attention to how the circulation of texts transfigures them as they encounter other texts, particularly as material paths create networks between readers, texts, producers, distributors, and so on. Of course, all of this is obvious to the book historian; book history has been invested in tracing these material networks for decades. But what happens in the context of transition from print to a digital network, with various analog mediations along the way? And what happens when the audience is no longer invested in the continuous story of a community, but in a piecemeal investigation of a word, a topic, an event, or other singular asset?

What, exactly, are we looking at when we view a digital scan of an historical newspaper page? How do we consider the materiality of the digital historical newspaper page in relation to the objects we know more familiarly, the archived printed page, or the microfilm? Some theorists, historians and librarians have suggested treating these objects as instances of remediation.22 But a remediation of what? Since the majority of digital newspaper databases are scans of microfilm, what exactly is “rival[ed] or refashion[ed] … into the real,” as Bolter and Grusin assert?23 Are digital scans best understood as “surrogates,” as librarians often like to call microfilm? If so, what are they substituting for? Lilly Koltun warns that “An irony being played out in the archival profession, as elsewhere, is the belief, particularly prevalent among the techno-rhetoricians, that data is ‘salvaged’ or rescued by transference, for example, from tape media to digital formats; present readability is bought at the cost of even greater ephemerality and more rapid intervals of future reformatting.”24 I suggest that neither the concept of remediation nor that of surrogacy provides an adequate conceptual framework for thinking about the traces that remain of the power, ideologies, discourses and institutional policies that mark objects as having “intrinsic” or “permanent” value in the language of archives.25

To investigate historical newspaper databases is to confront the material traces of the previous encounters of these texts as they are transfigured from one medium to another. Dilip Gaonkar and Elizabeth Povinelli’s concept of transfiguration concerns the way that the demands of different contexts change the function of cultural objects as they circulate. As such, it orients analysis toward questions of power – chiefly, those that concern the institutions that determine the “intelligibility, livability and viability” of objects in circulation.26 In an era where data is described as the “new oil,” cultural institutions like libraries and archives are struggling to find their place among corporate giants like ProQuest, Ebsco and Google (which, until May 2011, offered free access to digital historical newspapers through its Google News Archive program). What’s at stake in the transfiguration of the newspaper page into data is the embedded politics of preservation. In order to understand access in a digital context, it’s necessary to consider what was saved, how it was saved, and who is offering the content.

Material Transfigurations: Paper, Microfilm, Database

As a cultural form, newspapers – at least since the 1880s – have imposed an order and logic that readers have come to understand and adopt. As newspapers became increasingly commercialized, moving away from partisan sources of funding to advertising revenue, new features emerged to replace the hodge-podge of news, witticisms, advertisements and opinions that characterized the pages of the mid-nineteenth century newspaper.27 News became organized by bold headlines and sub-headings summarizing key elements. Especially on the front pages, headlines began to span multiple columns. Discrete sections emerged, each replete with their own banners. Recurring features like women’s and children’s pages and sporting news of all kinds also began to appear. These sections did not simply imply an order, a logic that enacted its imagined community; they also created predictable rhythms of consumption. Readers came to know that “Kit’s Kingdom” would be found week after week on page 6 of the Saturday Toronto Mail. At the same time, Sunday newspapers across the United States created vast quantities of coupons and contests that disrupted regular periodicity by encouraging readers to collect, re-organize and save their newspapers.28 As Mark Turner suggests, modern media helped establish new rhythms for habits of leisure that themselves responded to the change from an agrarian economy to an industrial one.29

Rotogravure sections, magazine inserts and supplements of all sorts (from toy theatres to art prints) changed the materiality of the newspaper by inserting differently textured and sized paper amid the more familiar newsprint. These material, generic and format changes meant nineteenth and twentieth century readers were subjected to an increasingly rationalized order. This had much to do with the professionalization of journalism and the shifting labour structure of newspaper publishing from a publisher-editor to an increasingly diversified workforce that separated editorship and publishing while transforming the newspaper into a commodified corporate product. This rationalized order was also connected to the commodification of news, to the cultural need for the newspaper to guide and orient its readers in an increasingly complex cityscape.30

Preserved and stored as microfilm, the newspaper remains a modern cultural object. It is linear, following the chronological flow of its paper original. Photographically fixed as a singular image, its images offer a record of the paper: a sedimented cultural object tied to a specific geographic location and specific community of readers. Interface, at least in the old electro-mechanical microfilm readers, is determined entirely by the variances of particular microfilm readers, since “access” to the original object is not possible. Even digital readers, which also offer distinctive software interfaces that add new layers to the experience of the form, ultimately do not change the way in which a user-reader accesses the data itself.

Regardless of the specific method for accessing this analog form, similar logics to those for reading the newspaper rule the use of microfilm. Scrolling mimics the turning of pages. The dominant logic mirrors the temporal flows of publication – or at least the major flows of publication. In most instances, multiple editions were not filmed, nor were most supplements and inserts. For instance, despite the uniqueness of a Sunday edition for the Toronto World, precious few of its Sunday editions were actually microfilmed.

On microfilm, searches occur title by title, reel by reel, image by image. While it is still in its newsprint form, readers have the option to open the newspaper wherever they like, skipping the sports or passing off the funnies. In microfilm, the tyranny of the binding is doubled: there is no way into the paper but to scroll past all the previously filmed pages. Since microfilmed copies of newspapers were made almost exclusively from bound volumes of newspapers, whatever order and inclusions (or, in some always exciting cases, disorder) the bindery imposed are the ones that microfilm readers must follow decades later.

In other words, microfilm is ordered by residual traces of earlier preservational decisions, as Nicholson Baker’s work argues so forcefully.31 Microfilm is not only organized by the logic of periodicity; layered on top of that logic are the vagaries of binding, often not performed by libraries themselves, as well as a library classification order. Further, these decisions, when performed by the publishers and printers themselves, served specific production plans for serials. Laurel Brake has demonstrated that newspaper publishers frequently transformed successive series into volumes, adding new title pages, indices or tables of contents, or bound them in boards, sometimes even covered in leather.32 The process of binding re-commodified formerly ephemeral newsprint publications into a new “information logic” that asserted its cultural value through its reference to the book.33 Layered in the object viewed on the screen, then, are many residual traces of ordering schemes and types of cultural valuation that aren’t always self-evident. Without a concerted network-archaeological investigation, these residual traces might remain totally invisible, yet they shape our reading experience of microfilmed newspapers in subtle and important ways.

Another example of the subtle variety: microfilming newspapers flattens geography and compresses temporality. During microfilming, the exclusion of whole issues, or important pieces of them, such as their covers, adverts, or supplements for breaking news, occurs frequently. The result is that the commercial nature of journalism, a publication’s niche markets, and questions of geographical distribution are eschewed in favour of a centralized, orderly, logical volume that tells a chronological story.34 The geographic scope of a microfilm search not only confronts these decisions; it depends entirely on the holdings of a particular institution. Because libraries are organized to serve particular constituencies, they prioritize newspaper holdings for geographically relevant papers (again, reifying the logic of the newspaper’s circulation itself) and rely on the archival code of “intrinsic value” to determine what other titles they might carry.35

Newspaper databases operate by a different logic altogether, where algorithms define the dynamic arrangement of individual elements on a virtual page. In “Database as Symbolic Form,” Lev Manovich argues that the database constitutes a fundamental shift in the organizing principles of culture from linear narratives based on the cultural form of novels and film, to programmable, nonlinear arrays of data that do not privilege any one item over another.36 This organizing principle, which Alan Liu calls “Discourse Network 2000,” bears no resemblance to the ordering schemes imposed by publishers before or after binding, librarians or microfilm companies.37 Within the range of possibilities specified by the programmers of the interface software, the user determines the array of data presented with each search. As Christiane Paul notes, what makes digital databases distinct from their analog predecessors is the ability to retrieve and filter data in multiple ways.38 Further, Manovich writes, “as a cultural form, database represents the world as a list of items, and it refuses to order this list.”39 Networked databases disrupt the temporal and spatial arrangements that once dominated how one read a newspaper (on paper or microfilm), making local small-town papers as available – and potentially interesting as research objects – as major metropolitan papers.40 It is, in short, the end of the tyranny of front pages and big-city dailies, at least on a number of levels. Adverts, personal ads, and birth and death notices exist on the same plane of retrievability, and relevance, as the headline story. As Mussell puts it, “there is no such thing as ephemera, just data.”41

Though Manovich rightly points out that the database destroys the classic modernist order of narrative, a different set of possibilities around narrative emerges. Recent focus on the visualization of data drawn from databases, for instance, helps us to tell different narratives through the rearrangement of data by foregrounding relationships that analog ordering schemas had obscured. Networked databases facilitate the creation of large data sets, and comparative research across larger geographic regions. As many university users know, the databases that are increasingly accessed through university libraries are, in fact, multiple databases connected under universal searches, and sold in increasingly complex and expensive packages by a relatively small number of corporations.42

In the world of historical newspaper databases, ProQuest dominates, with Ebsco offering an ever-growing range of thematically packaged databases that pull from a range of historical documents, archives and serials. With databases containing historical documents pertaining to early Americana, New York, Latin studies, African American history and Revolutionary War orderly books, these packages bring together a range of resources that are often only available through visits to particular archival institutions, or several scattered throughout the United States. These databases, all packaged separately to maximize profit, harness the power of metadata and fully searchable text in order to provide access to “millions of pages, eclipsing all other resources.”43

Further, through licensing revenues, Ebsco offers archival institutions an important source of income, one that often makes exclusive agreements that lock up their content in these databases too tempting to ignore. The possibilities of the kinds of searches that databases facilitate dispel the myth “of an eternal and finished past as the essence of the meaning of our records” that has governed traditional approaches to archival practice.44 The current cultural desire to digitize – whether to make records more available, or “moderniz[e] for the digital age”45 – not only favours certain power arrangements (technology, capital, etc), but is constituted itself by residual practices of preservation, access and valuation.

It is equally important to recognize the very real changes and limitations that occur as this “data” moves across forms, formats and media, creating other issues as what was once ephemeral endures.46 Jerome McGann argues that Manovich relies too heavily on binaries between databases and archives, or modern, “privileged” narrative and the post-narrative world of digital texts.47 McGann suggests such binaries fail to account for our documentary heritage sufficiently in understanding what databases actually do or how they work. The “progressivist story”48 about digital tools suggests that databases provide “liberated knowledge” while archives represent “reified knowledge.”49 This narrative fails to understand how constrained databases are because of the ways their hierarchical organizational structures are embedded in their functioning, especially in their interfaces. The power of databases lies precisely in their “ability to draw sharp, disambiguated distinctions” in abstracting data.50

While this discussion of database theory and its materiality is certainly essential to any analysis of the transition to digitized archived records, a fuller understanding of how the archival record was established in the first place is just as crucial to the discussion. In what follows, I turn to a case study of Canada’s first corporate-owned historical newspaper database, tracing the complex and overlapping roles of public institutions, public policy and private corporations. There are no simple exhortations to make about the value-added nature of databases, nor of the purity of their paper predecessors. Indeed, as we trace through the various layers of policy that affected the record which eventually ended up on users’ screens, we begin to see a variety of forces which continue to exert their influence on the contemporary record.

The Public Record: Traces of Policy

Public initiatives to put historical newspapers online range from nationalistic projects, like those launched by the Library of Congress, the British Library, or Bibliothèque nationale de France,51 to the smaller, provincial initiatives like those for the Bibliothèque et Archives nationales du Québec, Our Ontario (now known as Our Digital World), or Manitobia, hosted by the Manitoba Library Consortium.52 These projects are important for their commitment to public access, yet they vary greatly in their stability of funding and allocation of resources and staff over the long term. Art Rhyno, a librarian at the University of Windsor, former board member of the Ontario Library Association and former owner of a family-owned newspaper enterprise, became involved in digitization both for the Our Ontario initiative and as a way of digitizing his family’s vast collection of microfilmed issues of the Essex Free Press.53 The project to digitize as many community newspapers as possible began through collaboration with the Internet Archive on several pilot projects and ended with Rhyno using idle computers at night in the University of Windsor’s library to process the digital scans and run the OCR. Rhyno, interested in keeping costs to an absolute minimum, used an American company to digitize microfilm records for as little as 2.5 cents per page (roughly $20 a reel) and was able to add only the most basic metadata to the records. The Our Ontario project became an amalgam of some half-completed projects from community libraries that had run out of funding, along with clippings, and some complete runs of papers.
In sum, the holdings of the eventual database were not as complete as other similar projects.54 Our Ontario provides an ideal example of how well suited a network archaeology approach is to analyzing the current online product: without a careful tracing of each library’s policy and funding constraints, a robust understanding of the digital object that appears today is simply impossible.

The Decentralization Program

While a casual observer might expect to encounter incomplete projects and lack of funding at the local level rather than the federal level, the national library in Canada has virtually no presence at all online. The reason for this dates back to a series of policy decisions in the late 1970s and 1980s. Though there are, no doubt, many policies that one could trace, I will trace but one, the Decentralization Program for Canadian Newspapers, which began in 1979. Using the National Library as a coordinating force rather than a centralized depository, it sought to coordinate and consolidate the preservation of newspapers across the country, with the voluntary participation of the provinces, especially for papers that would not be otherwise microfilmed because they were unlikely to be of interest to commercial enterprises.55 The key to the program was that the responsibility to create and nurture the collection and preservation of newspapers rested with local actors, not the National Library. At the time of implementation, there was no union list of newspapers across the country (one was not completed until 1987).56 Many collections were in poor condition, unsorted and uncatalogued.57 Three years after the announcement of the new plan, after close collaboration with the provinces, the National Library of Canada had a clear mandate.
Its job was to coordinate the plan across local districts in order to avoid duplication, to create and maintain a union list and register of microform masters, and acquire all microfilms made at the provincial level, as funds permitted.58 Aside from the commitment to collect and preserve all native and ethnic newspapers, along with the first, last and historic supplements of Canadian newspapers, the National Library did not have its own complete collection or even a preservation mandate.59 The commitment to buy copies of provincial microfilms became particularly important to the development of the microfilm project established by the Canadian Library Association.

The long-term legacy of this policy for today’s digitization projects, however, is that Library and Archives Canada does not own its own copies of microfilms from which to digitize. This situation was further compounded by a policy of the 1985 Conservative government requiring all government institutions to avoid operations that could be better conducted in the private sector. The policy effectively ceased the limited amount of microfilming the National Library was doing.60 Consider how Sandra Burrows, a former newspaper specialist for Library and Archives Canada, put it: “With nothing more to bring to the table than its good name and with the added requirement that access to scanned materials must be made available free of charge to anyone who accesses the LAC website, LAC is often left out of the equation of the scanner and the scanned.”61

The Microfilm Committee

The history of the earliest institutionalized microfilming in Canada provides another telling example of how cultural policy, which is never evident on the page before a reader, impacts decisions made decades later. Launched in 1948 with a $15,000 grant from the Rockefeller Foundation, the Microfilm Committee of the Canadian Library Association began its filming efforts with Confederation-era papers. Once these were complete, the committee worked backwards in an effort to film papers from the “earliest dates” in Canada’s colonial history.62 This decision was purely ideological since, even by the committee’s own admission, the papers were printed on good rag-paper stock. Preservation was not driven by deteriorating paper, but by a teleological view of history that focused on firsts.63

The sale price for any of these early CLA-produced microfilms was determined by adding 20% to the cost of producing the negative, which included the cost of collating, labour, insurance and film. In other words, the cost of filming each newspaper was only covered once five copies were sold. By 1949, the Committee raised the price so that it would take only three sales to pay back the cost of producing the film.64 Though the Microfilm Committee of the CLA struggled to keep costs of film purchases reasonable, it could only do so if it continued to sell filmed papers. In time, this pressure to break even began to affect the selection of titles to be filmed. Provinces that ordered the most copies of the final product had their newspapers filmed first.65

By 1953, the desire for the sound fiscal management of the Committee had compromised the selection process. Short, relatively rare titles had been favoured over longer runs of what were considered to be average daily papers and periodicals. By this time, some newspaper publishers were going directly to filming and skipping the binding of volumes altogether, but there was little to no incentive for them to take on the task of filming their already-bound volumes. Weekly local papers remained virtually untouched, unless commissioned specifically on a contract basis, despite the likelihood of these papers being of the greatest interest to historians and genealogists.66 This very brief history of microfilming in Canada demonstrates that the uneven collecting and preservation activities of Library and Archives Canada have meant that other organizations, both public and private, have had to fill the role of preserving Canada’s newspapers.67

These early, curiously disconnected preservation policies have considerable implications for the presence of Canadian content online today. Connecting databases to the Internet, via other networks or databases, creates a global community of readers. Where many other countries have an online presence produced by their own national archives, there is no Canadian cognate. Instead, there are provincial and local initiatives, leaving the field wide open for private corporations, or publishers themselves, to fill in the gaps at the national level – much like the history of microfilming. But many of these newspaper databases are created and managed by corporations, making access to them contingent on a range of factors that are distinct from the question of access faced by public institutions such as libraries and archives. While access to these privately-owned and operated databases remains an issue, in the context of increasingly jeopardized library budgets and public support for such projects, private corporations are meeting essential needs in our information economy. Making historical records of all kinds more widely available to a larger range of people results not only in greater interest in our shared collective past; it also tends to make people more invested in the processes of archiving our history for the future.

Yet the residual material traces of microfilm and paper remain. Almost all databases of historical newspapers are composed of digital scans of microfilmed copies of bound newspaper volumes.68 This double iteration of paper and microfilm means that, although the database handles the data differently and organizes it in ways that are potentially infinite, access remains limited to choices that were made decades ago. For newspaper historians interested specifically in the ephemeral, in the materiality of newspapers and in the newspaper form as an object of study in its own right (not as a means to learning about something else) databases offer access to only some kinds of data. Colour – a key factor that transformed newspapers in the 1890s – is not restored. Illustrations remain black smears that fail to convey the artistry and technical virtuosity in this form of printing and engraving. Format is no more discernible than it had been previously. Missing sections and inserts of all kinds are irretrievable. That is to say, many of the same limitations that existed with microfilm have persisted – though, of course, many others – important others – have disappeared.

Once filming is done, microfilms can be copied easily and sent far and wide. In the case of papers printed on cheap wood pulp newsprint, microfilm offers what crumbling paper cannot. Volumes of newspapers that are so fragile that they are removed from circulation altogether certainly fail to offer better access. Searchable, networked databases extend access to the newspaper in extraordinary ways, but not necessarily in all of the ways that we need them to.

Tracing the transfiguration of the newspaper page as it has circulated through institutions, ideologies and policies demonstrates that the claims about increasing access through the digital are, in other words, highly circumscribed from the outset. It further demonstrates that a network-archaeological approach to the history of contemporary objects must also attend to the political economy of the residual traces of previous forms. But these issues become further complicated by new values that come to dominate how old newspapers are re-commodified into information products. Databases persist in using microfilm because digitizing from paper copies – assuming there are adequate copies around (and because of microfilming, in many cases there are not) – is prohibitively expensive.

Case Study:

Cold North Wind (CNW), the company that launched PaperOfRecord, Canada’s first website offering digital scans of historical newspapers from Canada and around the world, provides an excellent example. As Bob Huggins, co-founder of CNW, has suggested, digitizing from paper copies is too costly for a for-profit company to undertake.69 Scanning microfilms has the advantage of employing an automated system, with relatively little human intervention in the process, allowing CNW to scan millions of records in a very short period of time.70 The logic of production easily trumps the logic of utility. Some of CNW’s first large-scale scanning was for The Toronto Star (the largest investor in CNW) and The Globe and Mail, work for which it was paid by the page. In order for the process to be cost-effective, automation was essential.

CNW developed its digitization process by working with abandoned microfilms found in a storage room in Calgary owned by the Canadian Library Association (CLA). Huggins describes these microfilms as “widowed” titles: they tended to be single-reel publications of which a library would purchase one copy, without any hope of an ongoing subscription, as with an actively publishing paper. In fact, these were often titles that the CLA deemed to be historically important enough to microfilm but, depending on when they were filmed, may have cost it more money to film than it recouped in sales. They were of little value to the CLA, and thus 252 titles, including Le Journal de Quebec, the Montreal Herald, and the Toronto Telegram, were licensed to CNW, with the option to purchase in future. This arrangement characterized most of the agreements that CNW made with the publishers of the papers it digitized, with the exception of the Toronto Star, Globe and Mail and the National Newspaper Association, who retained exclusive rights to the data and who launched their own websites.((Interestingly, the Globe and Mail and Toronto Star databases are now owned and operated through ProQuest.))

The necessity to exercise the option to purchase the CLA’s holdings came in 2005, when Google began negotiating with CNW for the inclusion of its data in Google’s expanding News Archive. According to Huggins, CNW approached LAC to seek out a partnership, and was at the point of signing a Letter of Intention for the purchase of the database for $1.2 million, before, as he put it, “everything went to hell in a handbasket,” largely because of the problem of guaranteeing public access.71 With competition intensifying from ProQuest, which had far more capital to secure licensing agreements with large, important newspapers, CNW sold all its Canadian digitized material to Google. At the time of sale, CNW had digitized 20 million newspaper pages from around the world. Despite the huge cache of data that was sold to Google, it reportedly paid “much less than $3 million” for it.((Hill, “Google expected to take over Ottawa data firm.”))

When launched in 2001, PaperOfRecord was subscription-based; several years later, its Google-supported version offered free access. After 2006, it largely disappeared, when the content of CNW’s database was sold to Google. Currently, the website is running once again, and is subscription-based, though the costs are quite modest. In May 2011, Google announced it would end its Google News Archive partners program, continuing to offer what had previously been put online, but ceasing the active accumulation of more records. For reasons that remain unclear to this day to most users, even to Huggins, who was involved in negotiating the deal with Google, a large amount of the data never made it online. Users of PaperOfRecord who were in the throes of research suddenly found their source materials gone, with absolutely no explanation from Google about why they had disappeared, or when they might reappear. The promise of integration through networking failed. According to Huggins, the entire situation was puzzling, though a bonanza for its investors. Despite having purchased all of PaperOfRecord’s content, Google allowed PaperOfRecord to re-launch, though its content is once again behind a pay wall. As Huggins described it, “it’s like selling your house, but continuing to live in it for free.”((Bob Huggins, Personal Interview, Montreal, QC. March 20, 2012.))

Unlike the books project, where Google itself scanned the books that it put online (even building some of the hardware necessary to do so), the historical newspaper collection was largely facilitated through partnerships with aggregators, including ProQuest, Heritage Microfilm, and CNW.72 Though there was considerable controversy around the launching of Google’s news aggregation site, Google situated its historical content within its larger motto of “organizing the world’s information and making it universally accessible” by allowing users to “explore history as it unfolded”. Yet, as the criticism mounted, Google also demonstrated that its real value lay in offering information about its users to news institutions. For instance, offering access to historical content would show newspapers the true value of their content, and could permit news organizations to adjust subscription rates according to traffic. Access had its limits.

Driven by a huge market of genealogists, sports enthusiasts, academic and corporate libraries, the opportunity to receive licensing fees has meant that many organizations, libraries and newspapers are enticed to sell off their holdings to entities that will scan, aggregate and re-sell them at a profit. Efficiency is key on every level, from licensing to scanning and processing the pages and adding metadata. Source materials are almost always microfilms precisely because digitizing them is the cheapest way to automate the scanning process. Further, the metadata for these files is not usually written by librarians and archivists, but outsourced to IT workers. The ways that these bits of data behave, including their functions, uses, and relationships to other objects, are determined not by recognized indexing and classification standards, but by the cheapest means possible.((Burrows, “Online Access to Newspaper Content in Canada”; Mallan and Park, “Is Digitization Sufficient for Collective Remembering? Access to and Use of Cultural Heritage Collections.”))

Much like the hierarchical classification systems that dominate traditional archives and libraries, digital systems mobilize hierarchical systems that are equally ideologically loaded.73 Following Siva Vaidhyanathan, I am suggesting that what is desperately needed is an intervention that places the public good at the heart of its venture.74 What governs these private databases is not the logic of the public good, or the logic of scholarship. It is the logic of the marketplace, more specifically, the logic of the junk bond.75 Working with historical materials under the logic of the database requires all of us to consider the epistemological and democratic potential lost and gained by this changing form.

As I’ve argued, the study of cultural policy should be key to the practice of network archaeology for several reasons. It helps to provide explanations for otherwise puzzling decisions about the selection of sources. It also helps to identify the residual material histories of the formal transfigurations that print can undergo before digitization occurs. Historical accounts cannot dig only into a medium, but must look across networks of media as well. This is important because these transfigurations had their own logic – logic sometimes determined decades before digitization occurred by the policies of geographically and culturally specific institutions. Without doing the patient research necessary to make sense of such transfigurations, it can be difficult, if not impossible, to understand that the forms we see onscreen are not so much faithfully captured images of a mass-produced print artifact as they are dense laminations of bureaucratic and commercial decisions.

The author thanks Ashley McAskill and Leslie Corbay for their invaluable assistance on this project.

Article: Creative Commons Attribution-Non-Commercial-NoDerivs 3.0 Unported License
Image: "Chapter 1: Paragraph 16"
From: "Drawings from A Thousand Plateaus"
Original Artist: Marc Ngui
Copyright: Marc Ngui