Anna Dickson and I have, once again, made a proposal for SXSW. This time it’s called the Machine Learning Bake-off. In this presentation we’ll do some real-world comparisons of Machine Learning services for analyzing photos. We will test services and present findings on the good, the bad and the ugly.
Here’s the proposal, including the link to vote.
And here’s the proposal info.
Machine Learning Bake-off
Is ML the solution for making sense of vast collections of images? In demo form, it looks amazing. But does it really provide actionable information for you, or does it junk up your tags with a lot of low-value (and wrong) information? Time for a taste test! In this presentation, you’ll see the results of real-world testing from leading services – Google, Amazon, Microsoft and Clarifai. Our test set includes a wide variety of images representing multiple industries and tagging challenges. We’ll show you where each service shines, where each misfires, and how the services have evolved. Armed with our evidence and conclusions, you can decide if it’s delicious, or not yet ready to eat. As a bonus, we’ll show you how to easily run your own test on tens of thousands of images for under $200.
• Get a solid idea of the info that Machine Learning can currently add to image collections. Understand what it’s good and bad for.
• Get a handle on the differences between ML services and how each can help you. Get a better idea of how to evaluate your options.
• There is no substitute for some real-world testing on your own material – at scale – if you want to determine the value of a service. Here’s how.
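To show how the “under $200” figure can pencil out, here is a hedged back-of-the-envelope cost estimator. The per-1,000-image prices and the free tier below are placeholder assumptions for illustration only – check each vendor’s current pricing before budgeting.

```python
# Rough cost estimator for bulk image tagging.
# PRICE_PER_1000 values are placeholder assumptions, not vendor quotes.
PRICE_PER_1000 = {
    "label_detection": 1.50,     # assumed USD per 1,000 images
    "ocr": 1.50,                 # assumed
    "landmark_detection": 1.50,  # assumed
}

def estimate_cost(num_images, features, free_tier=1000):
    """Estimate total USD cost of running `features` on `num_images` images."""
    billable = max(0, num_images - free_tier)  # assumed free tier per feature
    return sum(billable / 1000 * PRICE_PER_1000[f] for f in features)

if __name__ == "__main__":
    cost = estimate_cost(40000, ["label_detection", "ocr", "landmark_detection"])
    print(f"Estimated cost for 40,000 images: ${cost:.2f}")
```

Under these assumed rates, 40,000 images through three features comes to $175.50 – comfortably under $200.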
These notes are prepared for the attendees of my talk at Henry Stewart DAM Europe summer 2019. In this talk I show how you can use Lightroom and the Anyvision plugin to run a collection of images through a Machine Learning tagging service (Google Cloud Vision) and evaluate whether the tags may be of use for your collection and your users.
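For context on what a plugin like Anyvision is doing behind the scenes: each image is sent to Google Cloud Vision’s `images:annotate` endpoint as a JSON request. This sketch builds (but does not send) such a request body; the exact feature list and the API key handling are assumptions, and Anyvision’s internals may differ.

```python
import base64
import json

def build_annotate_request(image_bytes, max_labels=50):
    """Build a Cloud Vision images:annotate request body for one image."""
    return {
        "requests": [{
            "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
            "features": [
                {"type": "LABEL_DETECTION", "maxResults": max_labels},
                {"type": "TEXT_DETECTION"},      # OCR
                {"type": "LANDMARK_DETECTION"},
            ],
        }]
    }

# To actually send it, you would POST the JSON to
# https://vision.googleapis.com/v1/images:annotate?key=YOUR_API_KEY
body = build_annotate_request(b"\x89PNG...fake image bytes...")
print(json.dumps(body)[:80])
```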
If you don’t already have it, you’ll need to get Adobe Lightroom Classic (or a previous version of Lightroom, 5.7 or later). This comes with an Adobe Creative Cloud subscription. There is also a “photographer’s plan”, which is £9.98/month in the UK and gives you Lightroom and Photoshop. Here’s a link: https://www.adobe.com/uk/creativecloud/photography.html
The plugin is licensed on a “pay what you think is fair” model. It’s a very nice piece of work. If you’re using it for a corporate collection, $30 or $40 seems fair.
Prior to testing, make sure you have Lightroom Classic (or another compatible version of Lightroom), then download and install the plugin.
Create a sample collection of at least a few thousand images to test with. I suggest a broad range of subject matter and sources.
Add these images to a Lightroom catalog dedicated to the test.
If you want to test ML tags only, strip all other info first.
Close the catalog and make a duplicate of the entire catalog. This will be useful in later testing.
Now let’s run the first test to see the entire universe of tags that Google might assign.
Select all images and run Plugins>Anyvision>Analyze
Set the options per the screenshot below.
Some notes on the settings:
I have set all thresholds to 0 to get the largest number of tags. In all likelihood, we’re going to want to set these to a higher number like 75 (with the exception of Landmarks, which seems to include very few false positives).
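To preview what a higher threshold would keep, the filtering step can be sketched like this. The tag names and scores are made up for illustration, and Anyvision’s actual output format may differ:

```python
def filter_tags(tagged, threshold=75):
    """Keep only tags whose confidence score meets the threshold (0-100)."""
    return [tag for tag, score in tagged if score >= threshold]

# Hypothetical scores as returned for one image
tags = [("dog", 98), ("mammal", 91), ("snout", 74), ("carnivore", 52)]
print(filter_tags(tags, threshold=75))  # → ['dog', 'mammal']
```

Raising the threshold trades recall (fewer tags overall) for precision (fewer junk tags), which is exactly the trade-off the multi-catalog test below is meant to measure.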
I have this set to write any OCR’d text to the Headline field, which is often empty. You could also write it to the Caption (also known as Description) field; Caption is a more broadly accessible field.
I have included the scores, which will only show up in the Anyvision panel in Lightroom’s metadata panel.
I have checked the box to have Anyvision make letter-based subgroups of returned results to help keep the tags visually organized in the keywords panel.
I’ve also asked it to add GPS data whenever it recognizes a landmark.
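For reference, Cloud Vision returns landmark coordinates inside each `landmarkAnnotations` entry, and extracting them looks roughly like this. The sample response below is fabricated for illustration:

```python
def extract_gps(annotation):
    """Pull (latitude, longitude) pairs from a Vision landmark annotation."""
    return [
        (loc["latLng"]["latitude"], loc["latLng"]["longitude"])
        for loc in annotation.get("locations", [])
    ]

# Fabricated example in the shape of a Vision API landmark annotation
sample = {
    "description": "Eiffel Tower",
    "score": 0.97,
    "locations": [{"latLng": {"latitude": 48.8584, "longitude": 2.2945}}],
}
print(extract_gps(sample))  # → [(48.8584, 2.2945)]
```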
I’ve checked the Reanalyze box, although this is only of use when running these images through a second time for comparison purposes.
I only run the translation on the OCR text, but if you need to make the keywords available in multiple languages, you could do that here.
Making multiple catalogs
Once you’ve run the images through Anyvision, you can repeat the process at different confidence levels to see what level is optimal for your own collection and metadata usage. I did that by running it at 0, 50, 75 and 90. To run again, here’s what I suggest:
Take the duplicate catalog made above, and duplicate it again.
Rename the catalog to reflect the confidence level at which you would like to run the process.
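Once you’ve run each catalog and exported its keyword list (for example, as one list per threshold), a quick comparison of tag volume and overlap can be sketched as follows. The run names and keyword lists here are hypothetical:

```python
def compare_tag_sets(tag_sets):
    """Report tag counts per run and the tags shared by all runs."""
    counts = {name: len(tags) for name, tags in tag_sets.items()}
    shared = set.intersection(*(set(tags) for tags in tag_sets.values()))
    return counts, shared

# Hypothetical keyword lists from catalogs run at different thresholds
runs = {
    "threshold_0":  ["dog", "mammal", "snout", "fur", "carnivore"],
    "threshold_75": ["dog", "mammal", "snout"],
    "threshold_90": ["dog", "mammal"],
}
counts, shared = compare_tag_sets(runs)
print(counts)          # tag volume per run
print(sorted(shared))  # → ['dog', 'mammal']
```

Seeing which tags survive every threshold, and how quickly volume drops as the threshold rises, is a simple way to pick the level that is optimal for your collection.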