Events

History of the Future

17 June 2021 History of… the Future? As we saw in Archives of the Future, the fragility of digital storage media and the rise of big data have reversed the historian’s relationship with primary sources. Future historians will have access only to what we choose to save for them. Historians have been trained to stress continuity and gradual change rather than rupture. Technological determinism reminiscent of Elizabeth Eisenstein’s thesis of a putative “print revolution” is an approach most historians view with a skeptical eye. Still, the more historians engage with today’s information technology, the more apparent it becomes that the historical profession is about to experience a radical departure from established practices. Are we prepared to recover data from abandoned data centers? Authenticate digital media sources? Decrypt secure data? Stitch together sharded files? Utilize a supercomputer? Jeffrey Tharsen and Scott Weingart are experienced researchers at the cutting edge of digital humanities research. […]

Big Data Epistemology

20 May 2021 The May session of Big Data and the Historian’s Craft shifts our attention to data science and methodological reflections. Text analysis, GIS, network analysis, and machine learning have become part and parcel of digital methods training. However, historians used cadastral surveys, census records, ledgers, maps, and genealogies in their research long before the digital turn. The long view should help us better understand what distinguishes our current paradigm from previous attempts to incorporate quantitative methods and information visualization strategies into historical research. Nonetheless, most digital history projects at the moment utilize sub-gigabyte digital representations of primary sources originally in print form. What changes when digital history scales up to petabyte-scale databases distributed across multiple servers? Will the goal be to account for “everything” or to create subsets or samples? Will the historical profession turn into an extension of data science? Or will some aspects of the historian’s […]

Energetics of Big Data

15 April 2021 In “Energetics of Big Data,” we will discuss big data as energy. Data centers are energy gobblers that consume anywhere from 1% to 5% of the world’s electricity, depending on how you run the estimate. While historians have known that the carving of woodblocks, mass printing, and climate-controlled archives consume energy, we do not usually associate our research activity with energy use. The personal computers researchers use need relatively little electricity and can be powered with a solar panel the size of a backpack. Booting the hundreds of thousands of servers in a data center, however, requires megawatts of electricity. What are we to make of this? Registration: https://tinyurl.com/bigdatahistory

Securing Big Data Archives

18 March 2021 The topic of our March roundtable is security as physical infrastructure and personal data protection. Future historians will have access to only what we choose to archive for them. We need to consider anonymizing and classifying selections of today’s social media data that may prove useful for researchers of the 2010s and 2020s. Is this possible? What do we earmark for long-term preservation? Writings? Pictures? Location? Biometric and medical data? If we do, how should such big data archives be designed? Anonymized? Encrypted? Classified for a time and then made available for public use after 100 years? And where? In the Arctic vault? Former nuclear bunkers? Is there anything we can do collectively today? With data scientist Shannon Stewart and anthropologist A. R. E. Taylor, we will discuss data anonymization and data center security from the perspective of historical research.

Archives of the Future

18 February 2021 Historians and archivists have expressed concerns about the Digital Dark Age for decades. The short life span of storage hardware (about 7 years for SSDs under stress, 30 years for HDDs, and 100 to 200 years for optical discs, and then only under climate-controlled conditions) inverts our relationship with primary sources. Advanced encryption methods employed in data centers for privacy and security reasons further complicate the challenges of creating digital archives. We are forced to think about long-term preservation today and be prepared to lose access to the original medium. Please join us for the February roundtable of our Big Data and the Historian’s Craft series, featuring Yennie Jun, Ian Milligan, and Jerry Wang. Yennie Jun has researched the challenges of archiving big data at the SNU Big Data Studies Lab and is currently finishing her MSc in Social Data Science at Oxford. Ian Milligan, an Associate Professor of History at […]

Big Data and History: Some Provocations

21 January 2021 The advent of big data challenges long-held assumptions in historical research. The 59 zettabytes of data generated until 2020 reside in fragile storage devices with an average life span of 7 to 30 years. Unlike paper documents, big data exists in networked form across hundreds of thousands of servers in power-hungry data centers, which requires us to consider energy demands and digital information’s environmental impact. Moreover, primary sources are no longer exclusively records left behind by human observers but also include detailed logs and photorealistic data captured by billions of smart devices. From material bibliography to data science and cultural studies, how should we prepare ourselves for this sea change and train the next generation of historians? From January to June 2021, the Department of History at Lingnan University, the Big Data Studies Lab at Seoul National University, and the History Lab at Columbia University are hosting […]
