The one-year project “Spanish chapbooks 1700-1900 in CUDL: dating ephemeral literature”, made possible by a Cambridge Humanities Research Grant (CHRG) with the support of Cambridge Digital Humanities, has come to an end (see our earlier blog post on the project).
Cambridge University Library was recently awarded a Cambridge Humanities Research Grant to continue work on the Spanish chapbooks catalogued and digitised under the “Wrongdoing in Spain, 1800-1936” project, as featured in the Cambridge Digital Library. This new year-long project aims to reliably date about 67% of the chapbooks that currently bear only estimated dates, often drawn from the printer’s period of activity. To establish more accurate dates of printing for these items, we aim to run a visual search over the woodcut illustrations within the chapbooks and compare prints made from the same woodblocks.
Printing houses used woodblocks (as well as metal stereotype plates in the nineteenth century) to illustrate the chapbooks. Woodblocks were expensive to produce, so printers often had a limited stock that they reused, sometimes through several generations of printers. Earlier woodblocks were crudely made on softwood, but the technique developed to produce much more detailed woodblocks cut with metal-engraving tools on harder wood. More intricate images are typical of the later period, although many older woodcuts continued to be used in later years to cut costs. Unsurprisingly, given such long reuse, woodblocks deteriorated over time, becoming less sharp and developing cracks. After many printings, the finest lines began to fade, and it is this wear and tear that we hope to use to our advantage to date the Cambridge Digital Library Spanish chapbooks more accurately.
During the first phase of the project (October 2021 to date), images of the chapbooks were run through a machine learning model created by Oxford University’s Visual Geometry Group. The model was pre-trained on similar Scottish chapbooks from the National Library of Scotland. This process recognised the woodcut images and created annotations marking them with bounding boxes, but the result was not perfect. Manual input was needed to ensure that the images gathered suited the parameters of the project. Our aim was to isolate individual woodblock prints (i.e., woodcuts made from a single woodblock). The software did not recognise that some images had been composed by combining two or three separate woodblocks. It also missed borders and garlands and made “false detections”, so manual input was essential not just to serve our purposes for the project, but also to train the machine learning model to make more accurate predictions in the future.
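One way to picture the manual review step is to compare the model's bounding boxes with the manually corrected ones and flag detections that match nothing. The sketch below is purely illustrative and is not the project's actual tooling: the `Box` type, the `unmatched_detections` helper and the 0.5 overlap threshold are all assumptions, and the overlap measure used is the standard intersection-over-union (IoU).

```python
from typing import List, NamedTuple


class Box(NamedTuple):
    """Hypothetical bounding box: top-left corner plus width and height, in pixels."""
    x: float
    y: float
    w: float
    h: float


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two boxes (0.0 = no overlap, 1.0 = identical)."""
    overlap_w = max(0.0, min(a.x + a.w, b.x + b.w) - max(a.x, b.x))
    overlap_h = max(0.0, min(a.y + a.h, b.y + b.h) - max(a.y, b.y))
    inter = overlap_w * overlap_h
    union = a.w * a.h + b.w * b.h - inter
    return inter / union if union > 0 else 0.0


def unmatched_detections(detected: List[Box], corrected: List[Box],
                         threshold: float = 0.5) -> List[Box]:
    """Return model detections that overlap no corrected box by at least `threshold`.

    Such boxes are candidates for "false detections" (assumed threshold of 0.5).
    """
    return [d for d in detected
            if all(iou(d, c) < threshold for c in corrected)]
```

A box that survives this filter is only a candidate for review; a combined image made from two or three woodblocks, for instance, would still need a person to split it into separate annotations.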
In the next phase of the project, all the images and annotations, alongside metadata from Cambridge Digital Library, will be imported into an instance of VISE (the Visual Geometry Group Image Search Engine). VISE will allow us to visually search many images (we annotated a total of 18,757 images out of 26,527 scanned images of chapbooks). By using an image or a metadata field as a search query, we hope to use machine learning and computer vision to explore relationships between the illustrations and not only narrow down the publication dates of the chapbooks, but also open up fields for research in printing and social history.
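The dating idea can be sketched as simple interval narrowing. This is a minimal illustration under stated assumptions, not the project's method: it assumes that prints from one woodblock can be given a numeric wear score (the `wear` field is hypothetical, and higher means more worn), that wear only increases over time, and that some prints come from chapbooks with firm dates. The `Impression` class and `narrow_date` function are invented for this example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Impression:
    """One print taken from a given woodblock (hypothetical record)."""
    chapbook_id: str
    wear: float            # assumed wear score; higher = more worn block
    year: Optional[int]    # firm printing year, or None if only estimated


def narrow_date(target: Impression,
                others: List[Impression]) -> Tuple[Optional[int], Optional[int]]:
    """Bound the target's printing year using firmly dated prints of the same block.

    A dated print showing no more wear than the target gives a lower bound;
    a dated print showing no less wear gives an upper bound.
    Returns (earliest, latest); either may be None if no evidence exists.
    """
    earliest: Optional[int] = None
    latest: Optional[int] = None
    for imp in others:
        if imp.year is None:
            continue  # undated prints cannot anchor the range
        if imp.wear <= target.wear:
            earliest = imp.year if earliest is None else max(earliest, imp.year)
        if imp.wear >= target.wear:
            latest = imp.year if latest is None else min(latest, imp.year)
    return earliest, latest
```

If the returned lower bound is later than the upper bound, the evidence is inconsistent, which would suggest a mismeasured wear score, a recut or copied block, or a wrongly dated chapbook, and would call for manual inspection.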
Sonia Morcillo García
Thanks to strong support from academics and students following the February blog post advertising trial access to the Sovetskaia kul'tura digital archive, the University Library's Accessions Committee agreed to purchase permanent access to the archive, with financial support from money left to the Library by Dr Catherine Cooke. The purchase was made later in the spring, but it is only in the last few weeks that the digital archive has been fully updated from the pre-purchase state it had been in.
The archive contains as full a set as East View have so far been able to amass of the various titles under which the current weekly newspaper Kul'tura has been published. The earliest title was Rabochii i iskusstvo (Worker and art), which started in 1929, followed by Sovetskoe iskusstvo (Soviet art; this title ran from 1931 to 1953, with the exception of part of 1942-1944, when Literatura i iskusstvo (Literature and art) was used instead), and Sovetskaia kul'tura (Soviet culture; this ran to 1991, after which the current name, Kul'tura (Culture), was adopted). Any gaps in the collection are detailed within each title's main page, but East View assure us that the search will continue both for all remaining copies and for better copies of issues which have scanned poorly. As with other East View digital archives, the Sovetskaia kul'tura archive contains scanned pages which are text-searchable in Cyrillic and in transliteration.