SXSWi: Preserving our Digital Legacy and the Individual Collector

Preserving our Digital Legacy and the Individual Collector
Moderator: Carrie Bickner Web Developer, The New York Public Library
Josh Greenberg Assoc Dir Research Projects, Center for History & New Media
William Stingone Curator of Manuscripts, The New York Public Library
Megan Winget Professor, UT at Austin

Bickner: (moderator) opened with the story of Zora Neale Hurston: when she passed away she had left no instructions for her possessions, and the custom in that town was to burn them, so they started to. Someone (Deputy Sheriff Duval) came up, realized what was happening, and started pulling things off the fire. Those burned letters and manuscripts sat on his porch for two years before making it to a library. This works with paper; it doesn’t yet work with digital records.

Q from moderator: people are scanning their family albums and putting them on CD to archive. What is the problem?

Winget: digital records are ephemeral; they start decaying immediately upon creation. “The Viewing Problem” – you need technology to read digital documents. Refreshing – moving documents from 5.25” disks to 3.5” disks to flash to network access. Migration – related to software. It is difficult to open documents created more than two versions before the current software. You open a document created in WordPerfect and migrate it to Word 5, to 97, to 2000, to XP, etc. When you have one doc it is OK, but when you have 10,000 you have to write a program that opens each one. It is an inelegant solution. It also changes tiny details like fonts, tabs, etc., which could be a small problem or a large one. Emulation – building software that emulates an older version of the software. The problem with emulation is that you end up running, say, OS X on top of “OS 30.” You are running this software on a super-fast box, so it now behaves differently than it did. For art, or for programs with interaction, these changes alter what the application or artwork was. This is a problem. If you are not using an open source product for emulation, you can’t mess with what you need to mess with.

Software is offensive from a free/rights perspective but also from a usability perspective.
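To make the "write a program that opens each one" point concrete, here is a minimal sketch in Python of what such a batch migration loop might look like. None of this was shown at the panel. The `convert` function is a hypothetical stand-in for whatever tool actually reads the old format, and the `.wpd` and `.odt` suffixes are only example assumptions. The sketch writes converted copies into a separate folder and never touches the originals, since the conversion can change small things like fonts and tabs.

```python
from pathlib import Path


def migrate_tree(src_root, dst_root, convert, old_suffix=".wpd", new_suffix=".odt"):
    """Call convert(src, dst) for every old-format file under src_root.

    convert is a hypothetical, caller-supplied function; this sketch does not
    know how to read any real document format. Originals are never modified.
    Returns (migrated, failed), where failed holds (path, error message) pairs.
    """
    src_root, dst_root = Path(src_root), Path(dst_root)
    migrated, failed = [], []
    for src in sorted(src_root.rglob("*")):
        if not src.is_file() or src.suffix.lower() != old_suffix:
            continue
        # Mirror the original folder layout under dst_root.
        dst = (dst_root / src.relative_to(src_root)).with_suffix(new_suffix)
        dst.parent.mkdir(parents=True, exist_ok=True)
        try:
            convert(src, dst)
            migrated.append(src)
        except Exception as exc:
            # Keep going; one unreadable file should not stop the other 9,999.
            failed.append((src, str(exc)))
    return migrated, failed
```

The list of failures matters as much as the list of successes: those are the documents someone will have to open by hand, or emulate.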

Q from moderator: You recently worked with Sep 11th fund. Can you speak about that?

Stingone: the Sep 11th fund raised 100 million in 3 days. They wanted to give grants to people somehow affected by 9/11. They approached us to take their records. We document things for historians to study later on. It was our first potential donor that had entirely digital records. They had large databases shared over different spaces, they had legal contracts, and they had a network with very informal folder systems. Their office manager handed me a CD with 500 MB of files on it. He had reorganized the files for me – the first violation of archiving, which is to leave things in their original order. It was a relatively small collection and we could open most things, but there were about 50 different formats in those 500 MB.

We have 5.25” floppies sitting in file folders. We may need to revisit that before it is too late if it isn’t already. I am worried about readability. How will we look at these records in the future?
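A first step with a donation like that CD might be simply counting what formats are on it. Here is a small Python sketch, assuming the file extension is a good enough hint of the format (it often is not, and real archives use proper format identification tools). The function name `inventory_formats` is my own invention for illustration. It only reads the folder tree; it does not move or rename anything, which keeps the original order intact.

```python
from collections import Counter
from pathlib import Path


def inventory_formats(root):
    """Count files under root by lowercased extension, leaving files in place."""
    counts = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file():
            # Files with no extension are grouped under one label.
            counts[path.suffix.lower() or "(no extension)"] += 1
    return counts
```

Calling `inventory_formats(folder).most_common()` gives the formats sorted from most to least frequent, which is a quick way to see which ones will need attention first.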

Greenberg: it may seem that new media art is more esoteric than text, images, etc. The web is its own problem: rendering of HTML pages is already problematic, and in the future it will be migration, emulation, etc. We’re not going to have an archive of a Google Maps mashup – it is not just saving the HTML. How do you archive the server and the information it was giving then? It starts to look a lot more like the new media performance art problem than the Word document problem. There are big systems that can store bits for a long time, but what is the lightweight system for storing personal digital information? We are working on an open source library solution, and we need to get it out fast, maybe missing some library standards.

Q from moderator: what is it that we need to save and how do we go about doing that? We have a Declaration of Independence with Thomas Jefferson’s scribbled notes all over it. In many ways that is more important than the final version, as it shows what it might have been. Yet we often keep the final version of things rather than the process, which might be more interesting: the e-mails, the fights, the discussions. What should we save?

Stingone: I try to avoid record collectors, but go for the records creators. I want the record that people created unconsciously while they were doing what they did. People think we don’t want their letters, but that is exactly what I want. I want people to keep records rather than collect them. People keep more records now because they don’t take up space in your house and you don’t have to file them. The problem is they easily go away if you neglect them or get a new computer, etc. One problem is we need to get to people much earlier so that they haven’t gone through 7 laptops before we realize they are an important person.

Greenberg: we need tools that keep track. Versioning in wikis seems very powerful for more apps. It creates historical traces as the wiki page is built. There isn’t a notion yet that once a project is completed you leave the process stuff somewhere in long-term, climate-controlled rooms.

Q from moderator: there is often real ignorance of the value of saving work.

Winget: the idea of preserving digital documents is changing. In the past a document went to the file, then the box, then the closet, then the warehouse. Now it is much more dependent on individuals to make decisions along the way – file it, digitize it, etc.

I talked to a scientist about his lab notebooks. He had enormous negative findings that he wanted to ensure were in the archives, so that people could move on from where he left off. Then I wanted to archive his lab notebooks, and he said all the lab notebooks were totally useless. Those notebooks are exactly what archivists want. He is in charge of them but does not see any value in them. It is all digital now; he would have to download it, put it on CDs, store it, etc. In industry, lab notebooks are a key piece of intellectual property – there are serious methods for how you collect and store them. They prove prior discovery, etc. They are important. People think they are mundane, but that is not true.

Stingone: people want to give us their records and come in and explain them to us. They want us to store their story, but the records are the story. People want to organize things before they give to us. People should just keep them how they are, their natural state.

Q from moderator: how do we deal with privacy? If you archive e-mail, people don’t want all their e-mail read.

Greenberg: The Library of Congress has been interested in the problem of saving digital information. It is mostly library and information scientists building the infrastructures. There is a historian (University of Maryland) studying technological failures, thinking about what is happening to companies from the Internet bust. He saw that a law firm had gone out of business. He went to the bankruptcy trustee’s office and said, let me help you preserve the digital record so that it doesn’t die. It could reflect so much about what happened in those moments. They are legal records; they are private. You don’t know what will happen 150 years from now; perhaps the law will be different. If later on you are allowed to look at them but no longer have them, the point is moot. It ends up in a “dark archive” – an archive you can’t look at. Census records are in a closed room that you can look at, but there are rules about what you can take out. Technical (room) and legal guidelines.

Q from moderator: What does the future of research look like? What will survive, what will we be looking at?

Greenberg: large databases have made larger research analysis possible across multiple locations. We have been building algorithmic approaches to research. Researchers will expect an API that lets them pull in raw materials and then “work” with it to find what they are looking for.

Q from moderator: say I want to look at the National Endowment for the Humanities and I get an archive of e-mail. I am pretty sure that Thomas Jefferson’s letter is his letter, but we will have to have faith in the custodians of the e-mails over the last 50 years not to have changed anything.

Stingone: we have always had this problem. I have to trust my historian colleagues of the past who say this is a diary of so and so.
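One partial, technical aid for the "not to have changed anything" worry is a checksum manifest, often called a fixity check. The panel did not go into this, so take the following as my own minimal Python sketch; the function names are made up for illustration. It records a SHA-256 hash for every file in a folder, and later reports which of those files have changed or gone missing. It only shows that the bits are the same as when the manifest was made. It says nothing about who wrote the e-mail, and you still have to trust whoever kept the manifest.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(root, manifest):
    """List manifest entries that are now missing or have different contents."""
    current = build_manifest(root)
    return sorted(name for name in manifest if current.get(name) != manifest[name])
```

The manifest should be stored somewhere other than the folder it describes, otherwise anyone who can change a file can change its recorded hash too.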

My thoughts: another good panel. It is overwhelming just thinking about the number of files we have in our school across laptops, USB drives, CDs, DVDs, file cabinets, and network drives. How can we possibly keep an archive? We have a new Director of Archives at our school and she is battling the paper records my school has from the 1920s, along with objects, clothes, awards, records and more. How will this intersect?

My big question though is about things like photos. I have photos of my grandparents from 50 years ago. Will my Flickr photos be around in 50 years? I want them to be, but I only have a couple photos of my grandparents, does someone need my 10,000 photos? Anyway, good guidelines from Digital Preservation (Library of Congress) and Managing the Digital University Desktop on how to store your own digital files. Follow their advice!


arvind s. grover

I am a progressive educator, a podcaster, a blogger, and dean of faculty of a JK-11 school (building a high school) in New York City.