Last fall, I did a presentation at B&H on using your camera as a scanner, based on my book Digitizing Your Photos. The webinar provided a fairly detailed overview of the camera scanning process for prints, slides and negatives. For those unfamiliar with the process, or for anyone who has been struggling to get high-quality scans, there is a lot of good information in it.
I just finished leading my first Maine Media Workshop. It was a week-long intensive that focused on total collection management using Lightroom. Each of my students brought in a large body of work, ranging from 10,000 images up to 400,000. We focused on the processes that would help them preserve, organize and curate their photos. We had a great dinner Thursday night at Skip’s house.
First a quick shout out to the class – Charlotte, Gary, George, Nancy and Skip. They were an outstanding group to spend a week with: passionate about their photos, eager to learn all they could, and patiently allowing me to spend individual time with each person in the class. They were also a delight to spend a week with – interesting, funny and kind.
I’d also like to thank my teaching assistant, Sophie Schwartz, and Alyson, who both helped to keep things running smoothly.
The Maine Media Workshop (MMW) provides a great environment for focusing on an area like collection management. It’s a no-frills camp setting on the outskirts of picturesque Rockport/Camden. The total immersion approach allows the class to push through barriers and make substantial progress. I hope I’ll be back again next year.
True to its new name, there was a lot more at the MMW than photography. The roster of classes included several on video production, podcasting and writing. The end-of-week slideshow included some very nice work done by the different groups.
It’s been a long time since I’ve done workshops, and it was very gratifying. Now that we have two new books out, we are actively working to get some future workshops lined up. If you are interested in taking a workshop, we have a place for you to tell us what you’re interested in on this page.
It is with great pleasure that we can announce the release of the full digital version of The DAM Book 3.0. In the nine years since the last version was published, our use of visual media has become marked by increasing connectivity – between images, people, data, applications and web services. The new book helps you understand this connectivity and how to evaluate which tools are right for you.
The change in covers between TDB2 and TDB3 helps to illustrate the differences between the versions. In 2009, it was fine to think about collection management as “a big ball of my stuff.” In our connected world, that’s not enough. Your image collection will connect to other people, images, services, applications, ideas and information. You will also probably have additional satellite collections on other devices or services. And, yes, like Tinkertoys, these connections often plug into each other in standardized ways.
In the new book, I’ve laid out the elements of the connected media ecosystem. We’re seeing connection and integration in all aspects of the way we make and use images. Connectivity is often born into an image at the moment of capture, and increases as your images move through the lifecycle. Almost everything that hosts or touches your images has some type of connection.
The new world of cloud and mobile workflows is impacting every part of the way we manage and make use of images. File formats, storage, backup and metadata are evolving to incorporate these new capabilities. I’ve rewritten the book from the ground up so that connectivity is built into each topic.
Of course, connectivity is not the only change that has come to digital media over the last nine years. The use and importance of photography has expanded dramatically, and anyone who wants to understand how visual media works can find important context in this book.
Dateline – Athens, Georgia – We’ve released the next set of chapters for The DAM Book 3.0, adding 325 more pages to the initial Chapter 1 release, for a total of 363 pages. These chapters cover some of the most fundamental and important parts of image management.
Chapters released today include:
- Image Objects and File Formats – a discussion of the evolution of digital images and the formats used to embody them.
- How Metadata Works – A deep dive into the nuts and bolts of modern metadata.
- Using Metadata – a guide to the effective use of metadata to achieve goals that are important to you.
- Logical Structure – discussion of the different types of file systems that we now use (computer, mobile, cloud, NAS) and a file and folder strategy for each of these.
- Digital Storage Hardware – a comprehensive look at the current storage options for your digital media.
- Backup, Restoration and Verification – preservation of your archive requires you to think of these processes as part of a unified system.
Anyone who has purchased the pre-release copy should have gotten an email with instructions for downloading the new version. And if you have not ordered yet, you can still get in on the 10% pre-release discount. The discount runs until the release of the final version, scheduled for the end of April.
Huge thanks to the DAM Useful production team for their Herculean effort in getting this release out on time. Elinore Wrigley of Me Jayne Design, Cape Town, South Africa; Dominique le Roux of Moonshine Media, Vientiane, Laos; and Bobbi Kittner of Kittner Design, Takoma Park, Maryland did another outstanding job.
Special thanks to Josie Krogh and Steve Thomas for letting us set up an Athens, GA field office for the final push.
We are pleased to announce that The DAM Book 3.0 is now available for pre-order! As with our previous books, you can pre-order the book at a discount.
Here are the details:
- Electronic book: Regular price $34.95
- Pre-order discount price: $31.46
- All pre-orders will get an advance copy of Chapter 1, Visually Speaking at the time of purchase.
- We will deliver at least 7 of the additional 11 chapters by March 31st, 2018.
- Additional chapters will deliver in April 2018.
Print copies will be available over the summer. Your purchase of an electronic copy can be applied to a print copy, once it’s available.
More info: see The DAM Book 3.0 product page here.
I’ve got a number of appearances scheduled for the coming months. Here’s a list, followed by a link to an interview I did with Photofocus.
APPO Raleigh, NC March 21-24
I’ll be giving a general session at the Association of Professional Photo Organizers on the use of Artificial Intelligence in asset management, as well as a breakout session on using your camera as a scanner.
APPO is an organization for people who help (mostly) private individuals scan, tag, preserve and make use of their photographic legacies. More info here.
Palm Springs Photo Festival
I’m thrilled to be headed back to Palm Springs for the 2018 festival. I’ll be doing two programs. The first, on Wednesday, May 9th, covers scanning with your camera; the second, What’s New in DAM, is on May 10th. More info here.
Maine Media Workshops
I’ll be giving a week-long workshop on managing your image collection with Lightroom the week of June 10th. I’ve never taught there before, but I’m really excited to give it a whirl. I know a number of people who have had life-changing experiences at the workshop. More info here.
Available now! – Web Interview on PhotoFocus
I had a really enjoyable hour speaking with Rich Harrington, Tim Grey and Kevin Ames about getting organized. The interview has been archived and you can find it here.
Editor’s Note: This post combines a couple threads I’ve been writing about. I’ll provide some real-world methods for converting visible text to searchable metadata as discussed in Digitizing Your Photos. In the course of this, I’ll also be fleshing out real-world workflow for Computational Tagging as discussed in The DAM Book 3 Sneak Peeks.
In the book Digitizing Your Photos, I made the case for digitizing textual documents as part of any scanning project. This includes newspaper clippings, invitations, yearbooks and other memorabilia. These items can provide important context for your image archive and the people and events that are pictured.
Ideally, you’ll want to change the visible text into searchable metadata. Instead of typing it out, you can use Optical Character Recognition (OCR) to automate the process. OCR is one of the earliest Machine Learning technologies, and it’s commonly found in scanners, fax machines and PDF software. But there have not been easy ways to automatically convert OCR text to searchable image metadata.
In Digitizing Your Photos, I show how you can manually run images through Machine Learning services and convert any text in the image into metadata through cut-and-paste. And I promised to post new methods for automating this process as I found them. Here’s the first entry in that series.
The Any Vision Lightroom Plugin
I’ve been testing a Lightroom plugin that automates the process of reading visible text in an image and pasting it into a metadata field. Any Vision from developer John Ellis uses Google’s Cloud Vision service to tag your images for several types of information, including visible text. You can tell Any Vision where you want the text to be written, choosing one of four fields, as shown below.
Here is part of the Any Vision interface, with only OCR selected. As you can see, you have the ability to target any found text to the Caption, Headline, Title or Source field. I have opted to use the Headline field myself, since I don’t use it for anything else.
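Under the hood, a plugin like this sends each image to the Cloud Vision service and gets back a JSON response. As a rough illustration (this is my own sketch, not Any Vision's actual code), here is how you might pull the full detected text out of a Vision-style `TEXT_DETECTION` response: by convention the first `textAnnotations` entry holds the complete text block, while later entries hold individual words. The sample response below is trimmed down and hypothetical.

```python
# Sketch: extract the full OCR text from a Cloud Vision TEXT_DETECTION
# style response. Illustrative only -- not Any Vision's implementation.

def extract_full_text(response: dict) -> str:
    """Return the full detected text, or '' if none was found."""
    annotations = response.get("textAnnotations", [])
    if not annotations:
        return ""
    # The first annotation carries the complete text; the rest are per-word.
    return annotations[0].get("description", "").strip()

# A trimmed-down, hypothetical example of the kind of response the API returns:
sample_response = {
    "textAnnotations": [
        {"description": "HOLLINS COLLEGE\nClass of 1958\n", "locale": "en"},
        {"description": "HOLLINS"},
        {"description": "COLLEGE"},
    ]
}

print(extract_full_text(sample_response))
```

Text extracted this way is what ends up pasted into whichever field you target.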
Here are my findings, in brief:
- Text that appears in real-life photos (as opposed to copies of textual documents) might be readable, but the results seem a lot less useful.
- Google does a very good job reading text on many typewritten or typeset documents. If you have scanned clippings or a scrapbook, yearbook or other typeset documents, standard fonts seem to be translated reasonably well.
- Google mostly did a poor job of organizing columns of text. It simply read across the columns as though they were one long line of nonsensical text. Microsoft Cognitive Services does a better job, but I’m not aware of an easy way to bring this into image metadata.
- Handwriting is typically ignored.
- For some reason, the translate function did not work for me. I was scanning some Danish newspapers and the text was transcribed but not translated. I will test this further.
(Click on images to see a larger version)
Let’s start with an image that shows why I’m targeting the Headline field rather than the caption field. This image by Paul H. J. Krogh already has a caption, and adding a bunch of junk to it would not be helping anybody.
You can also see that the sign in the background is partially recognized, but lettering in red is not seen and player numbers are ignored even though they are easily readable.
In the example below, from my mother’s Hollins College yearbook, you can see that the text is read straight across, creating a bit of nonsense. However, since the text is searchable, this would still make it easy to find individual names or unbroken phrases in a search of the archive.
You can also see that the handwriting on the page is not picked up at all.
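The point about column-jumbled text still being useful can be sketched in a few lines of code. Even when OCR reads straight across columns and produces nonsense, a plain case-insensitive substring search still finds individual names. The image names and OCR strings below are hypothetical examples of my own:

```python
# Sketch: jumbled cross-column OCR text is still searchable for names.
# The file names and OCR strings here are hypothetical.

def find_matches(metadata: dict, query: str) -> list:
    """Return the image names whose OCR text contains the query."""
    q = query.lower()
    return [name for name, text in metadata.items() if q in text.lower()]

# Two columns read straight across as one jumbled line:
ocr_text = {
    "yearbook_p12.tif": "Nancy Adams History Club Charlotte Baker Glee Club",
    "clipping_03.tif": "Local team wins championship final",
}

print(find_matches(ocr_text, "charlotte baker"))
```

An unbroken name or phrase matches despite the jumble; only phrases that were split across columns would be lost.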
The Bottom Line
If you have a collection of scanned articles or other scanned textual documents in Lightroom, this is a great way to make visible text searchable. While Google’s OCR is not the best available, thanks to Any Vision it’s the easiest way I know of to add the text to image metadata automatically.
AnyVision is pretty geeky to install and use, but the developer has laid out very clear instructions for getting it up and running and for signing up for your Google Cloud Vision account. Read all about it here.
Google’s Cloud Vision is very inexpensive – it’s priced at $0.0015 per image (which works out to $1.50 for 1,000 images). Google will currently give you a $300 credit when you create an account, so you can test this very thoroughly before you run up much of a bill.
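To put that pricing in perspective, here is a back-of-the-envelope calculation at the per-image price quoted above (ignoring the free credit and any tiered pricing):

```python
# Back-of-the-envelope cost estimate at $0.0015 per image,
# the Cloud Vision price quoted above.

PRICE_PER_IMAGE = 0.0015  # USD

def estimated_cost(image_count: int) -> float:
    """Estimated cost in USD, before any free credit is applied."""
    return round(image_count * PRICE_PER_IMAGE, 2)

print(estimated_cost(1000))    # $1.50 for 1,000 images
print(estimated_cost(100000))  # a 100,000-image archive for $150
```

At those rates, even a very large scanned collection costs less to tag than a single hour of manual transcription would.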
Watch for another upcoming post where I outline some of the other uses of Any Vision‘s tagging tools.
I had the pleasure of making a visit to the National Archives and Records Administration early this month, meeting with Steve Greene and Cary McStay, who are in charge of scanning the official photos from the Nixon administration. They are using a digital camera to do the scanning, in much the same way as I outline in Digitizing Your Photos.
The film from the White House Photo Office was transferred to NARA so that it would be sure to be preserved. (This was done along with the audio tapes which are still being digitized.) There were 258,318 images on 14,526 rolls. Most of it is 35mm b&w or C22 negative film. There is also some 4×5 film.
NARA first used conventional scanners for the project, but it was clear that conventional scanning was going to take too long to accomplish. They began testing camera scans, and became very comfortable that the quality coming out of a D810 was high enough for the vast number of uses of the archive.
Shown below are some of the photos that were scanned as part of this project. Eventually, the images will be transferred to the Nixon Library in Yorba Linda, where they will be available to scholars, authors and others interested in this period in our history.
PS – NARA is also digitizing the Nixon audio tapes. These are going more slowly due to a number of factors, and are still in production.
The window into a hallway at NARA counts down the remaining Nixon tapes to digitize. I had not noticed it at the time, but there is a nice collection of Nixon-themed figurines sitting on the window sill, including the presidential series of Pez dispensers and Futurama figurines.
This post is adapted from the forthcoming The DAM Book 3.
There is a lot of hype and hazy discussion about the future of AI, but it’s often very loosely defined. In a previous blog post, I made the case for lumping a lot of this into a category I’m calling Computational Tagging. In the second post, I made a distinction between Artificial Intelligence, Machine Learning, and Deep Learning. In this post, I’ll outline a number of the capabilities that fall under the rubric of Computational Tagging.
What can computers tag for?
The subject matter will be an ever-growing list, determined in large part by the willingness of people and companies to pay for these services. But as of this writing, the following categories are becoming pretty common.
- Objects shown – This was one of the first goals of AI services, and has come a long way. Most computational tagging services can identify objects, landscapes and other generically identifiable elements.
- People and activities shown – AI services can usually identify if a person appears in a photo, although they may not know who it is unless it is a celebrity or unless the service has been trained for that particular person. Many activities can now be recognized by AI services, running the gamut from sports to work to leisure.
- Specific People – Some services can be trained to recognize specific people in your library. Face tagging is part of most consumer-level services and is also found in some trainable enterprise services.
- Species shown – Not long ago, it was hard for Artificial Intelligence to tell the difference between a cat and a dog. Now it’s common for some services to be able to tell you which breed of cat or dog (as well as many other animals and plants). This is a natural fit for a machine learning project, since plants and animals are a well-categorized training set and there are many apparent use cases.
- Adult content – Many computational tagging services can identify adult content, which is quite useful for automatic filtering. Of course, notions of what constitutes adult content vary greatly by culture.
- Readable text – Optical Character Recognition has been a staple of AI services since the very beginning. This is now being extended to handwriting recognition.
- Natural Language Processing – It’s one thing to be able to read text; it’s another thing to understand its meaning. Natural Language Processing (NLP) is the study of the way that we use language. NLP allows us to understand slang and metaphors in addition to strict literal meaning (e.g. we can understand what the phrase “how much did those shoes set you back?” actually means). NLP is important in tagging, but even more important in the search process.
- Sentiment analysis – Tagging systems may be able to add some tags that describe sentiments. (e.g. It’s getting common for services to categorize facial expressions as being happy, sad or mad.) Some services may also be able to assign an emotion tag to images based upon subject matter, such as adding the keyword “sad” to a photo of a funeral.
- Situational analysis – One of the next great leaps in Computational Tagging will be true machine learning capability for situational analysis. Some of this is straightforward (e.g. “this is a soccer game”). Some is more difficult (“this is a dangerous situation”). At the moment, a lot of situational analysis is actually rule-based (e.g. add the keyword “vacation” when you see a photo of a beach).
- Celebrities – There is a big market of celebrity photos, and there are excellent training sets.
- Trademarks and products – Trademarks are also easy to identify, and there is a ready market for trademark identification (e.g. alert me whenever our trademark shows up in someone’s Instagram feed). When you get to specific products, you probably need to have a trainable system.
- Graphic elements – ML services can evaluate images according to nearly any graphic component, including the shapes and colors in an image. These can be used to find similar images across a single collection or on the web at large. This was an early capability of rule-based AI services, and remains an important goal for both ML and DL services.
- Aesthetic ranking – Computer vision can do some evaluation of image quality. It can find faces, blinks and smiles. It can also check for color, exposure and composition and make some programmatic ranking assessments.
- Image Matching services – Image matching as a technology is pretty mature, but the services built on image matching are just beginning. Used on the open web, for instance, image matching can tell you about the spread of an idea or meme. It can also help you find duplicate or similar images within your own system, company or library.
- Linked data – There is an unlimited body of knowledge about the people, places and events shown in an image collection – far more than could ever be stuffed into a database. Linking media objects to data stacks will be a key tool for understanding the subject matter of the photo in a programmatic context.
- Data exhaust – I use this term to mean the personal data that you create as you move through the world, which could be used to help understand the meaning and context of an image. Your calendar entries, texts or emails all contain information that is useful for automatically tagging images. There are lots of difficult privacy issues related to this, but it’s the most promising way to attach knowledge specific to the creator to the object.
- Language Translation – We’re probably all familiar with the ability to use Google Translate to change a phrase from one language to another. Building language translation into image semantics will help to make it a truly transcultural communication system.
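As noted in the list above, a lot of situational analysis today is rule-based rather than learned. A minimal sketch of that kind of rule (the rules below are hypothetical examples of my own, not any particular service's):

```python
# Sketch of rule-based situational tagging: map tags a vision service
# has already detected to higher-level keywords. Rules are hypothetical.

RULES = {
    frozenset({"beach"}): "vacation",
    frozenset({"ball", "goal"}): "soccer game",
}

def apply_rules(detected_tags: set) -> set:
    """Add a situational keyword whenever all of a rule's triggers appear."""
    added = set()
    for triggers, keyword in RULES.items():
        if triggers <= detected_tags:  # all trigger tags were detected
            added.add(keyword)
    return detected_tags | added

print(sorted(apply_rules({"beach", "ocean", "sand"})))
```

The limits are obvious: a rule like this will happily tag a local beach cleanup as a “vacation,” which is exactly why true machine learning for situational analysis would be such a leap.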
It is with a healthy dose of chagrin that I report that the publication of The DAM Book 3 will be postponed yet again. I have been working on the book full time for the last three months (and quite a bit before that), and it is simply taking a long time to get it done properly.
When I announced an outline and publication date in early September, I was assuming that I could reuse as much as 40% of the copy in the book. As it currently stands, that number is hovering at close to 1%. Changes in the digital photography ecosystem and in the book’s scope have driven a need to rewrite everything.
Not only has the rewriting been time consuming, but the changes in imaging and associated technologies have required a lot of research. I’ve been chasing down a lot of details on topics like artificial intelligence and machine learning, new technologies like depth mapping, and the state of the art in emerging metadata standards. It’s been a lot more work than I anticipated.
We saw a couple of late-breaking changes that have been very important to include in the book. October’s release of a cloud-native version of Lightroom helps to complete the puzzle of where imaging and media management are headed.
Complicating matters, I’m going in for ankle replacement surgery in early December. I’ll be finishing the book while my leg is healing. But the pace at which I can work while recuperating is unknown, so I’m not prepared to make another announcement about publication dates.
In the end, I’ve had to choose between hitting a deadline and making the book be as good as possible. I’ve opted for quality.
Sneak Peek blog posts
I’ve been working with my editor to identify and publish content from the new book as we continue in production. The first series of these posts will provide some insight on Computational Tagging, a subject I first posted about last month.