I'm putting some thought into the next steps for my algorithmic rotoscope work, which is about training and applying image style transfer machine learning models. I'm talking with Jason Toy (@jtoy) over at Somatic about the variety of use cases, and I want to spend some time thinking about image style transfers from the perspective of a collector or curator of images, brainstorming how they can organize and make available their work(s) for use in image style transfers.
OK, let's start with the basics. What am I talking about when I say image style transfer? I recommend starting with a basic definition of machine learning in this context, provided by my girlfriend and partner-in-crime Audrey Watters. Beyond that, I am just referring to training a machine learning model by directing it to scan an image. This model can then be applied to other images, essentially transferring the style of one image to any other image. There are a handful of mobile applications out there right now that let you apply a few filters to images taken with your mobile phone; Somatic is looking to be the wholesale provider of these features.
Training one of these models isn't cheap. It costs me about $20 per model in GPUs to create, and this doesn't consider my time, just my hard compute costs (AWS bill). Not every model does anything interesting. Not all images, photos, and pieces of art translate into cool features when applied to images. I've spent about $700 training 35 filters. Some of them are cool, and some of them are meh. I've had the most luck focusing on dystopian landscapes, which I can use in my storytelling around topics like immigration, technology, and the election.
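To put those numbers in perspective, here is a rough back-of-the-envelope sketch in Python. The $20-per-model and 35-model figures come from this post; the function names and the `keeper_rate` (the share of trained models that turn out to be interesting) are hypothetical, made up purely for illustration.

```python
# Back-of-the-envelope training budget, using the rough figures from this post.
COST_PER_MODEL_USD = 20  # approximate GPU cost per trained model (from the post)
MODELS_TRAINED = 35      # filters trained so far (from the post)


def total_cost(models: int, cost_per_model: float = COST_PER_MODEL_USD) -> float:
    """Estimated compute spend for training `models` models."""
    return models * cost_per_model


def models_within_budget(budget_usd: float,
                         cost_per_model: float = COST_PER_MODEL_USD) -> int:
    """How many models a given budget covers, rounded down."""
    return int(budget_usd // cost_per_model)


def effective_cost_per_keeper(keeper_rate: float,
                              cost_per_model: float = COST_PER_MODEL_USD) -> float:
    """Average spend per *interesting* model.

    `keeper_rate` is a hypothetical fraction (0 < rate <= 1) of trained
    models that turn out cool rather than meh. It is an assumption, not
    a measured number.
    """
    if not 0 < keeper_rate <= 1:
        raise ValueError("keeper_rate must be in the range (0, 1]")
    return cost_per_model / keeper_rate


if __name__ == "__main__":
    print("Spend so far (USD):", total_cost(MODELS_TRAINED))
    # Assumed keeper rate of one in two, only to show the effect.
    print("Cost per keeper (USD):", effective_cost_per_keeper(0.5))
```

The point of the last function is that the sticker price per model understates the real cost: if only half of the models were worth keeping, each keeper would effectively cost twice as much.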
This work ended up with Jason and me talking about museum and library collections, and about the opportunities for these institutions to think about their collections in terms of machine learning, and specifically algorithmic style transfer. Do you have images in your collection that would translate well for use in graphic design, print, and digital photo applications? I spend hours looking through art books for the right textures, colors, and outlines. I also spend hours looking through graphic design archives for the movie and gaming industry, as well as government collections. I'm looking for just the right set of images that will transfer and produce an interesting look, as well as possibly transfer something meaningful to the new images that I am applying styles to.
Sometimes, style transfers just make a photo look cool, bringing some general colors, textures, and other features to a new photo. There really isn't any value in knowing what image was behind the style transfer; it just looks cool. Other times, the result is enhanced by knowing about the image behind the machine learning model, because you are not just transferring styles between images but potentially transferring some meaning as well. You can see this in action when I took a Nazi propaganda poster and applied it to a photo of Ellis Island, or took an old Russian propaganda poster and applied it to images of the White House. In a sense, I was able to take some of the 1,000 words applied to the propaganda posters and transfer them to new photos I had taken.
It's easy to think you will make a new image into a piece of art by training a model on a piece of art and transferring its characteristics to a new image using machine learning. Where I find the real value is in actually understanding collections of images while also being aware of the style transfer process, and thinking about how models can be trained on images and applied. However, this only gets you so far. There still has to be some value or meaning in how it's being applied, accomplishing a specific objective and delivering some sort of meaning. If you are doing this as part of some graphic design work, it will be different than if you are doing it for fun on a mobile phone app with your friends.
To further stimulate my imagination and awareness, I'm looking through a variety of open image collections from a range of institutions:
- Digital Public Library of America (DPLA): The DPLA is a platform. Developers make apps that use the library’s data in many different ways.
- The British Library: A collection of over one million public domain images from digitized copies of 17th-, 18th-, and 19th-century books.
- Europeana: Explore millions of items from a range of Europe's leading galleries, libraries, archives, and museums.
- The Library of Congress Prints and Photographs Reading Room: Photographs, historical prints, posters, cartoons, documentary drawings, fine prints, and architectural and engineering designs.
- Metropolitan Museum of Art: Selected artworks are available under the Open Access for Scholarly Content (OASC) license.
- National Aeronautics and Space Administration (NASA) Image Galleries: Public domain photos.
- The New York Public Library Digital Collections: Digital collections of high-resolution prints, images, and maps with no known copyright restrictions. Rights statement included for individual items.
- The Ohio State University Health Sciences Library Digital Image Collections: A list of resources available through the OSU Health Science Library.
- Public Health Image Library (PHIL): Provided by the Centers for Disease Control and Prevention.
- Samuel Zeller Archive: Professional photographs available under a Creative Commons Attribution (CC BY) license.
- U.S. Government Photos and Images: A collection of photo and image galleries for multiple federal government agencies.
I am also using some of the usual suspects when it comes to searching for images on the web:
- Google Image Search: The old standby place to work through ideas.
- Flickr: Creative Commons: Many Flickr users have chosen to offer their work under a Creative Commons license, and you can browse or search through content under each type of license.
- Flickr: The Commons: Includes photos with “no known copyright restrictions” uploaded by participating cultural institutions.
- Internet Archive: Contains books, movies, software, music, and more.
- pond5 Public Domain Project: Public domain images. Registration/free account required to download materials.
- Wikimedia Commons: A database of freely usable media files to which anyone can contribute.
I am working on developing specific categories that have relevance to the storytelling I'm doing across my blogs, and sometimes to help power my partner's work as well. I'm currently mining the following areas, looking for interesting images to train style transfer machine learning models:
- Art: The obvious usage for all of this, finding interesting pieces of art that make your photos look cool.
- Video game: I find video game imagery to provide a wealth of ideas for training and applying image style transfers.
- Science fiction: Another rich source of imagery for the training of image style transfer models that do cool things.
- Electrical: I'm finding circuit boards, lighting, and other electrical imagery to be useful in training models.
- Industrial: I'm finding industrial images to work for both sides of the equation in training and applying models.
- Propaganda: These are great for training models and then transferring the texture and the meaning behind them.
- Labor: Similar to propaganda posters, potentially some emotional work here that would transfer significant meaning.
- Space: A new one I'm adding for finding interesting imagery that can train models, and seeing what the effect is.
As I look through more collections and gain experience training and applying style transfer models, I have begun to develop an eye for what looks good. I also develop more ideas along the way for imagery that can help reinforce the storytelling I'm doing across my work. It is a journey I am hoping more librarians, museum curators, and collection stewards will embark on. I don't think you need to learn the inner workings of machine learning, but you should at least develop enough of an understanding that you can think more critically about the collection you are knowledgeable about.
I know Jason would like to help you, and I'm more than happy to help you along in the process. Honestly, the biggest hurdle is money to afford the GPUs for training the models. After that, it is about spending the time finding images to train models on, as well as applying the models to a variety of imagery as part of some sort of meaningful process. I can spend days looking through art collections, then spend a significant amount of AWS budget training machine learning models, but if I don't have a meaningful way to apply them, it doesn't bring any value to the table, and it's unlikely I will be able to justify the budget in the future.
My algorithmic rotoscope work is used throughout my writing and helps influence the stories I tell on API Evangelist, Kin Lane, Drone Recovery, and now Contrafabulists. I invest about $150 each month training image style transfer models, keeping a fresh batch of models coming off the assembly line. I have a variety of tools that allow me to apply the models using Algorithmia and now Somatic. I'm now looking for folks who have knowledge of and access to interesting image collections and who want to learn more about image style transfer, as well as graphic design and print shops, mobile application development shops, and other interested folks who are just curious about WTF image style transfers are all about.