Wroclaw University Library in Poland is working to digitally preserve nearly 800,000 pages of rare European manuscripts, books, and maps dating back to the Middle Ages, and share them online for the first time.
The Library was founded in 1811 with the specific mission of preserving these rare documents, some of them among the few surviving copies in the world. The documents include the works of Luther, Shakespeare, and Cervantes, rare hand-drawn maps, as well as important Polish cultural works.
The Wroclaw University Library, founded in 1811, houses unique documents such as medieval manuscripts and old prints, including the works of Martin Luther, Cervantes, and Shakespeare, as well as rare maps and liturgical texts.
We asked Adam Zurek, Head of the Department of Scientific Documentation of Cultural Heritage at the library, to write about how they are applying modern technology to this centuries-old mission. The project is a partnership with IBM, which sponsors Internet Evolution.
Internet Evolution: What special techniques are used to preserve the originals when scanning historical documents of this type?
Adam Zurek: We're talking about exceptionally fragile, sensitive documents, some dating back more than 700 years. Any physical handling and exposure to air, light, and moisture can be damaging to them. So, we have to take great care throughout the process. That's really one of the major goals of the project -- reduce the amount of physical handling of these documents overall, while also allowing greater access to the content.
Our specially trained technicians wear gloves whenever handling the documents and use a variety of scanners and other devices that take into account things like the thickness and fragility of the materials. These devices operate without UV light or IR radiation, which can also damage the manuscripts.
Some of the books are so fragile they can't even be fully opened without the danger of damaging them. One of the devices we use is designed to scan the pages with the book opened to only about 45 degrees.
IE: Have you seen any insights or benefits from the program already?
AZ: Well, beyond the opportunity to better protect and preserve these rare documents, the digitization process allows us to share them in ways we never could before, opening up tremendous new educational opportunities.
In the past, these texts were available only to a handful of students and scholars who would have to reside at or travel to our university to view and study them under very heavily prescribed conditions.
Now that we're making them available online, we can literally share them with the world. Historians, academicians, students, and enthusiasts of everything from old cartography to rare liturgical texts to great literary works can now see and study them instantly online. In just a few months since we first began putting this collection online, we've been getting more than a hundred visits daily to our digital archive. Our site is still a beta version and will officially launch on September 30. But I invite your readers to visit our rare cartography collections -- just press "search" to choose the document you want to see and you can view it right on the screen.
We also hope that our project can serve as an example and inspiration to other universities around the world to do the same thing.
IE: How is the university applying big-data techniques to these documents?
AZ: One of the major challenges of a project like this is the sheer size and volume of data involved. You have to remember that we're not just talking about rendering printed text online. We're capturing high-resolution images of these documents so people can see and experience them as accurately as if they were holding the actual manuscripts.
As you can imagine, that means the individual file sizes are substantial -- most are a minimum of one gigabyte apiece. So, it wasn't just a matter of scanning and filing the images. We needed a system that could provide fast, efficient processing and storage for quick online retrieval.
Our total system capacity is 300 terabytes -- that's a lot of data -- and our IBM System x servers and Storwize and SAN Storage systems provide the robust architecture and performance for the quality of service and fast online access we require, as well as the capacity to add more content as we continue digitizing more texts.
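To put those two figures side by side, here is a rough back-of-envelope sketch, not a description of the library's actual system. It assumes decimal units (1 TB = 1,000 GB) and treats the one-gigabyte figure as a per-file floor, so the result is only an upper bound on how many image files the stated 300 terabytes could hold. The function name `max_files` is made up for illustration.

```python
# Back-of-envelope sketch. Assumption: decimal units, 1 TB = 1000 GB.
GB_PER_TB = 1000


def max_files(capacity_tb: float, min_file_gb: float) -> int:
    """Upper bound on how many files fit if each is at least min_file_gb."""
    return int(capacity_tb * GB_PER_TB // min_file_gb)


# Figures quoted in the interview: 300 TB capacity, files of at least 1 GB.
print(max_files(300, 1))
```

Because many files are larger than the minimum, the real count would be lower than this ceiling.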
Polish University Taps IBM Big Data Tools to Bring Middle Age Manuscripts to the Public