Category Archives: The DAM Book 3

Where does “the truth” live?

This post is adapted from The DAM Book 3.0. In this post, I outline the structural approaches for media management and how they are changing in the cloud/mobile era.  

Back in the early digital photography days, there was a debate about where the authoritative version of a file’s metadata should live. People who liked file browsers would say “the truth should be in the file.” People like me who advocated for database management would say “the truth should be in the database.”

The argument here was how to store and manage metadata, and especially how to handle changes and conflicts between different versions of image metadata. This is a fundamental DAM architecture question.

For a number of years, the argument was largely settled – effectively managing a large collection required a catalog database to serve as the source of truth. This still holds true for most of my readers. But there’s a new paradigm for managing metadata, versions and collaboration, and eventually it’s going to be the best way forward.

The truth can also live in the cloud. And that’s the way that app-managed library software is being designed. It’s what we see with Lightroom CC, Google Photos, and Apple Photos. Because the cloud is connected to all versions of a collection, it can resolve differences between them and keep different instances synchronized. Typically, it does this by letting the most recent change “win,” and propagating that change to the other versions.

Allowing a cloud-based application to synchronize versions and resolve conflicts is really the only way to provide access across multiple devices or to multiple users while keeping everything unified.
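The “most recent change wins” policy described above can be sketched in a few lines. This is an illustrative Python sketch, not the actual sync logic of Lightroom CC, Google Photos, or any other product; the field names and timestamp scheme are assumptions.

```python
from dataclasses import dataclass

@dataclass
class MetadataRecord:
    """One device's copy of an image's metadata."""
    fields: dict          # e.g. {"title": "...", "rating": 4}
    modified_at: float    # Unix timestamp of the last edit

def last_write_wins(copies):
    """Resolve conflicting copies by keeping the most recent edit,
    then propagate that result back to every copy."""
    truth = max(copies, key=lambda c: c.modified_at)
    for copy in copies:
        copy.fields = dict(truth.fields)
        copy.modified_at = truth.modified_at
    return truth
```

So if you change a star rating on your phone after changing it on your laptop, the phone’s edit wins, and the cloud pushes that value back down to the laptop’s copy.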

The truth in the cloud is also the paradigm for enterprise cloud DAM like Widen and Bynder. It’s fast becoming the preferred method to allow distributed collaboration, even for people in the same office.

But there’s a rub, at least for now.

Cloud-based applications will not work for some people – at least not yet. The library may be so large that it’s too costly to store it in the cloud. Or you may not have enough bandwidth to upload and download everything in a reasonable time frame. Or storing stuff on other people’s computers may make you uncomfortable. Some of these problems will be solved by the march of technology, and some may never be solved.

At the moment, it’s often best to take a hybrid approach, where the ultimate source of truth lives in a private archive stored on hardware in your own possession. Files can be pushed to the cloud component to be used for distribution and collaboration.

As you decide which system best suits your needs, understanding where “the truth” lives is an essential component for creating distributed access to your collection.

The DAM Book 3.0 Index

We’ve created an index for The DAM Book 3.0. While this was not terribly necessary for the electronic versions of the book, it’s quite helpful for the print version (at the printer now – expected delivery before the end of July).

I’ve never personally created an index before, so this was a learning experience for me. It ended up being a tremendous amount of work – maybe 50 hours of combing through the book, making entries, organizing information and then reorganizing it.

If you have already bought the PDF, you’ll soon get an announcement of the update along with a download link. If you don’t have a copy of the book, the index will give you a very good idea of the breadth and depth of the content it includes.

Here’s a PDF of the Index. You can click the top right to see it full screen, or download it onto your computer.


Embedded photos as platforms for information or commerce

This post is adapted from The DAM Book 3.0. In that book, I describe the ways that connectivity is changing the way we use visual images. In this post, I outline how embedded media can enable new kinds of connections between people, ideas and commerce. 

As connected images become more essential for communication and engagement, image embedding creates a new opportunity to gather and disseminate information. A traditional web page uses images packaged up as JPEGs and sent out as freestanding files. But images can also be displayed using embedding techniques. Embedded images, like embedded videos, reside on a third-party server and are displayed in a frame or window on another site’s web page.

Embedded media offers a direct connection from the server, through the web page or application, all the way to the end user. This can provide a two-way flow of information, as well as the ability to customize the embedded media to suit the needs of the end user with updates, custom advertising or other messaging.

Let’s call these embedded objects, because they are actually more complicated than freestanding images. A YouTube video embedded on a web page is an example of an embedded object. The web page draws a box and asks the YouTube media server to fill that box with a video stream.

There is a live link which runs through the web page, between the viewer’s device and the YouTube server. Because there is a link between YouTube and the viewer, there is a two-way flow of data back and forth. This allows YouTube to gather all kinds of information, and it also allows YouTube to push out customized information through the window.

The media server can know who sees an image, how they got there, what they are interested in, who they interact with, what other sites they go to, what they search on and more. And the media server can present customized information to the end viewers based on what it knows about them. Remember, these windows are basically open pipelines that serve up the media on-demand.
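Mechanically, an embed is just a small piece of HTML that delegates a region of the page to the media server. Here’s a minimal Python sketch of how a publishing system might generate one; the `media.example.com` domain and the `ref` tracking parameter are invented for illustration and don’t correspond to any real service’s embed API.

```python
def build_embed_snippet(media_id, width=640, height=360,
                        referrer_tag=None):
    """Build an HTML snippet that asks a (hypothetical) media server
    to fill a box on the page with a media stream.

    The iframe's src points back at the server, so every page view
    opens a live connection the server can log and customize."""
    src = f"https://media.example.com/embed/{media_id}"
    if referrer_tag:
        # Tracking parameter: tells the server where the view came from.
        src += f"?ref={referrer_tag}"
    return (f'<iframe src="{src}" '
            f'width="{width}" height="{height}" '
            f'frameborder="0" allowfullscreen></iframe>')
```

The key design point is that the page itself contains no pixels – only a request. Everything the viewer actually sees arrives through the open pipeline, which is why the server can swap in updated images, ads or purchase links at display time.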

Once only for video, now for still images too
Of course, the practice outlined above has been part of the business model for video services for a long time. Videos on web pages have historically been hosted by third-party servers, and we have been accustomed to YouTube ads for a decade. But it’s relatively new for still images, which could always be easily and cheaply added to web pages as JPEGs. The most significant marker for change was the introduction of free embedding by Getty Images.

When the stock photography giant decided to make vast numbers of images available for free embedding, it signaled that embedded objects were going to be an important part of its strategy moving forward. Getty has opened up millions of individual pipelines through blogs and other web pages, with the ability to collect and serve information in service of new business strategies.

The use case for images as platforms for two-way communication should be favorable moving forward. Mobile devices increasingly rely on photos instead of text headlines, and methods for connectivity are improving. In the last few years, we’ve seen several companies hang their business models on embedded image objects.

At this writing, Getty has gotten the most traction with such a service, but others are trying. Retailers are using embedded images as mini storefronts, and mission-driven organizations can use them to spread their messages in a viral manner.

What can you do with Embedded objects?
There are several valuable things you can do with embedded objects that are much harder or impossible with standard JPEGs.
• You can add a level of copyright protection that disables right-click saving.
• You can enable deep zoom features that are managed by the server.
• You can add purchase buttons or “more info” links directly onto the image.
• You can update the image when something changes (e.g. product updates.)

Okay, I’m interested – now what?
Making use of embedded media for still photos is an emerging capability. Several companies have taken a run at it, but none has fully cracked the code yet (and even Getty has not publicly disclosed how it intends to monetize the technology). SmartFrame is offering this embedding as a service that bolts on to your DAM. The thing I like about their business model is that it works in service of the image owner, rather than the middleman, as Getty’s and YouTube’s do.

SmartFrame can help you with security, sharing, tracking and monetizing.

And the International Image Interoperability Framework is also building around this concept. (“Come for the deep zoom, stay for the great metadata interchange.”) I’ll have more on this project in another post.

I’m keeping close watch on this capability, and I’ll report as more information comes in. I first wrote about this topic in 2013 in this post.


The DAM Book 3.0 released

It is with great pleasure that we can announce the release of the full digital version of The DAM Book 3.0. In the nine years since the last version was published, our use of visual media has become marked by increasing connectivity – between images, people, data, applications and web services. The new book helps you understand this connectivity and how to evaluate which tools are right for you.

The change in covers between TDB2 and TDB3 helps to illustrate the differences between the versions. In 2009, it was fine to think about collection management as “a big ball of my stuff.” In our connected world, that’s not enough. Your image collection will connect to other people, images, services, applications, ideas and information. You will also probably have additional satellite collections on other devices or services. And, yes, like Tinkertoys, these connections often plug into each other in standardized ways.

In the new book, I’ve laid out the elements of the connected media ecosystem. We’re seeing connection and integration in  all aspects of the way we make and use images. Connectivity is often born into an image at the moment of capture, and increases as your images move through the lifecycle. Almost everything that hosts or touches your images has some type of connection.

The new world of cloud and mobile workflows is impacting every part of the way we manage and make use of images. File formats, storage, backup and metadata are evolving to incorporate these new capabilities. I’ve rewritten the book from the ground up so that connectivity is built in to each topic.

Of course, connectivity is not the only change that has come to digital media over the last nine years. The use and importance of photography has expanded dramatically, and anyone who wants to understand how visual media works can find important context in this book.

The DAM Book 3.0 Release 2

Dateline – Athens, Georgia – We’ve released the next set of chapters for The DAM Book 3.0, adding 325 more pages to the initial Chapter 1 release, for a total of 363 pages. These chapters cover some of the most fundamental and important parts of image management.
Chapters released today include:

  • Image Objects and File Formats – a discussion of the evolution of digital images and the formats used to embody them.
  • How Metadata Works – A deep dive into the nuts and bolts of modern metadata.
  • Using Metadata – a guide to the effective use of metadata to achieve goals that are important to you.
  • Logical Structure – discussion of the different types of file systems that we now use (computer, mobile, cloud, NAS) and a file and folder strategy for each of these.
  • Digital Storage Hardware – a comprehensive look at the current storage options for your digital media.
  • Backup, Restoration and Verification – preservation of your archive requires you to think of these processes as part of a unified system.

Anyone who has purchased the pre-release copy should have gotten an email with instructions for downloading the new version. And if you have not ordered yet, you can still get in on the 10% pre-release discount. The discount runs until the release of the final version, scheduled for the end of April.

Huge thanks to the DAM Useful production team for their Herculean effort in getting this release out on time. Elinore Wrigley of Me Jayne Design (Cape Town, South Africa), Dominique le Roux of Moonshine Media (Vientiane, Laos) and Bobbi Kittner of Kittner Design (Takoma Park, Maryland) did another outstanding job.

Special thanks to Josie Krogh and Steve Thomas for letting us set up an Athens, GA field office for the final push.

The DAM Book 3.0 now available for pre-order!

We are pleased to announce that The DAM Book 3.0 is now available for pre-order! As with our previous books, you can pre-order the book at a discount.

Here are the details:

  • Electronic book:  Regular price $34.95
  • Pre-order discount price:  $31.46
  • All pre-orders will get an advance copy of Chapter 1, Visually Speaking at the time of purchase.
  • We will deliver at least 7 of the additional 11 chapters by March 31st, 2018.
  • Additional chapters will deliver in April 2018.

Print Copies
Print copies will be available over the summer. Your purchase of an electronic copy can be applied to a print copy, once it’s available.
More Info: Click The DAM Book 3.0 product page here.

Computational Tagging – What is it good for? (Absolutely something!)

This post is adapted from the forthcoming The DAM Book 3.

There is a lot of hype and hazy discussion about the future of AI, but it’s often very loosely defined. In a previous blog post, I made the case for lumping a lot of this into a category I’m calling Computational Tagging. In the second post, I made a distinction between Artificial Intelligence, Machine Learning, and Deep Learning. In this post, I’ll outline a number of the capabilities that fall under the rubric of Computational Tagging.

What can computers tag for?

The subject matter will be an ever-growing list, determined in large part by the willingness of people and companies to pay for these services. But as of this writing, the following categories are becoming pretty common.

  • Objects shown – This was one of the first goals of AI services, and has come a long way. Most computational tagging services can identify objects, landscapes and other generically identifiable elements.
  • People and activities shown – AI services can usually identify if a person appears in a photo, although they may not know who it is unless it is a celebrity or unless the service has been trained for that particular person. Many activities can now be recognized by AI services, running the gamut from sports to work to leisure.
  • Specific People – Some services can be trained to recognize specific people in your library. Face tagging is part of most consumer-level services and is also found in some trainable enterprise services.
  • Species shown – Not long ago, it was hard for Artificial Intelligence to tell the difference between a cat and a dog. Now it’s common for some services to be able to tell you which breed of cat or dog (as well as many other animals and plants.) This is a natural fit for a machine learning project, since plants and animals form a well-categorized training set and there are a lot of apparent use cases.
  • Adult content – Many computational tagging services can identify adult content, which is quite useful for automatic filtering. Of course, notions of what constitutes adult content vary greatly by culture.
  • Readable text – Optical Character Recognition has been a staple of AI services since the very beginning. This is now being extended to handwriting recognition.
  • Natural Language Processing – It’s one thing to be able to read text, it’s another thing to understand its meaning. Natural Language Processing (NLP) is the study of the way that we use language. NLP allows us to understand slang and metaphors in addition to strict literal meaning (e.g. we can understand what the phrase “how much did those shoes set you back?” means). NLP is important in tagging, but even more important in the search process.
  • Sentiment analysis – Tagging systems may be able to add some tags that describe sentiments. (e.g. It’s getting common for services to categorize facial expressions as being happy, sad or mad.) Some services may also be able to assign an emotion tag to images based upon subject matter, such as adding the keyword “sad” to a photo of a funeral.
  • Situational analysis – One of the next great leaps in Computational Tagging will be true machine learning capability for situational analysis. Some of this is straightforward (e.g. “this is a soccer game”.) Some is more difficult (“This is a dangerous situation.”) At the moment, a lot of situational analysis is actually rule-based. (e.g. Add the keyword vacation when you see a photo of a beach.)
  • Celebrities – There is a big market of celebrity photos, and there are excellent training sets.
  • Trademarks and products – Trademarks are also easy to identify, and there is a ready market for trademark identification (e.g. alert me whenever our trademark shows up in someone’s Instagram feed). When you get to specific products, you probably need to have a trainable system.
  • Graphic elements – ML services can evaluate images according to nearly any graphic component, including the shapes and colors in an image. These can be used to find similar images across a single collection or on the web at large. This was an early capability of rule-based AI services, and remains an important goal for both ML and DL services.
  • Aesthetic ranking – Computer vision can do some evaluation of image quality. It can find faces, blinks and smiles. It can also check for color, exposure and composition and make some programmatic ranking assessments.
  • Image Matching services – Image matching as a technology is pretty mature, but the services built on image matching are just beginning. Used on the open web, for instance, image matching can tell you about the spread of an idea or meme. It can also help you find duplicate or similar images within your own system, company or library.
  • Linked data – There is an unlimited body of knowledge about the people, places and events shown in an image collection – far more than could ever be stuffed into a database. Linking media objects to data stacks will be a key tool to understanding the subject matter of the photo in a programmatic context.
  • Data exhaust – I use this term to mean the personal data that you create as you move through the world, which could be used to help understand the meaning and context of an image. Your calendar entries, texts or emails all contain information that is useful for automatically tagging images. There are lots of difficult privacy issues related to this, but it’s the most promising way to attach knowledge specific to the creator to the object.
  • Language Translation – We’re probably all familiar with the ability to use Google Translate to change a phrase from one language to another. Building language translation into image semantics will help to make it a truly transcultural communication system.
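However a service produces its tags, you still have to decide which of them to trust before writing them into your metadata. Here’s a brief Python sketch of that filtering step; the response shape is hypothetical – real services (Google Vision, AWS Rekognition and others) each use their own schema.

```python
def accept_tags(service_response, min_confidence=0.8,
                blocked_categories=("adult",)):
    """Keep only tags the service is confident about, and drop
    any category we don't want written into our metadata.

    service_response is assumed to look like:
        {"tags": [{"label": ..., "confidence": ..., "category": ...}]}
    """
    accepted = []
    for tag in service_response["tags"]:
        if tag["confidence"] < min_confidence:
            continue  # service isn't sure enough
        if tag.get("category") in blocked_categories:
            continue  # e.g. filter adult-content tags
        accepted.append(tag["label"])
    return accepted
```

The confidence threshold matters: most services return a ranked list of guesses, and blindly accepting low-confidence tags is how “Labrador retriever” ends up on a photo of a sofa cushion.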

Update on DAM Book 3

It is with a healthy dose of chagrin that I report that the publication of The DAM Book 3 will be postponed yet again. I have been working on the book full time for the last three months (and quite a bit before that), and it is simply taking a long time to get it done properly.

When I announced an outline and publication date in early September, I was assuming that I could reuse as much as 40% of the copy in the book. As it currently stands, that number is hovering at close to 1%. Changes in the digital photography ecosystem and in the book’s scope  have driven a need to rewrite everything.

Not only has the rewriting been time consuming, but the changes in imaging and associated technologies has required a lot of research. I’ve been chasing down a lot of details on topics like artificial intelligence and machine learning, new technologies like depth mapping, and the state of the art in emerging metadata standards. It’s been a lot more work than I anticipated.

We saw a couple late-breaking changes that have been very important to include in the book. October’s release of a cloud-native version of Lightroom helps to complete the puzzle of where imaging and media management are headed.

Complicating matters, I’m going in for ankle replacement surgery in early December. I’ll be finishing the book while my leg is healing. But the pace at which I can work while recuperating is unknown, so I’m not prepared to make another announcement about publication dates.

In the end, I’ve had to choose between hitting a deadline and making the book be as good as possible. I’ve opted for quality.

Sneak Peek blog posts

I’ve been working with my editor to identify and publish content from the new book as we continue in production. The first series of these posts will provide some insight on Computational Tagging, a subject I first posted about last month.

Computational Tagging – Artificial Intelligence, Machine Learning, and Deep learning

This post is adapted from the forthcoming The DAM Book 3.

There is a lot of hype and hazy discussion about the future of AI, but it’s often very loosely defined.  In a previous blog post, I made the case for lumping a lot of this into a category I’m calling Computational Tagging. In this post, I’ll split that into some large component parts. (Read the next post here).

What’s the difference between Computational Tagging, Artificial Intelligence, Machine Learning, and Deep Learning?

While the definitions of these processes have a lot of overlap, we can draw some useful distinctions. Let’s use a Venn diagram to illustrate the relationships.

Computational tagging refers to any system of automated tagging done by a computer. This includes the metadata added by your camera. It also includes network-accessible information, like a Wikipedia page, that could be added by simple linking.

Artificial Intelligence (AI) encompasses any computer technology that appears to emulate human reasoning. AI could be as simple as a set of rules that can create an intelligent looking behavior (e.g. a self-driving car could be taught the “rule” that you don’t want to cross a double yellow line.) AI also includes the more complex services  outlined below.

Machine Learning (ML) is a subset of AI that is more complex. Instead of just following an established  set of rules, in an ML environment, the system can be trained to discover the rules. An ML system for identifying species, for instance, uses a training set of tagged images to figure out what a Labrador retriever looks like.

Deep Learning (DL) is a specific type of ML that makes use of a predictive model in its learning process. This process actually mimics the way the brain works. In Deep Learning, the system does not just look at results; it uses a predictive model to train itself. It is constantly testing a hypothesis against results, and adjusting the hypothesis according to those results.
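The rules-versus-training distinction can be made concrete with a toy example. This Python sketch is purely illustrative – a one-feature “classifier” using an invented brightness value, nothing like a real ML system – but it shows the difference in kind: in the first function a human wrote the rule, while in the second the system discovers the rule from tagged examples.

```python
def rule_based_tagger(brightness):
    """Rule-based AI: a human hard-coded the threshold.
    brightness is a made-up feature in the range 0.0-1.0."""
    return "daytime" if brightness > 0.5 else "night"

def train_tagger(examples):
    """Toy ML: discover the threshold from a training set of
    (brightness, tag) pairs instead of hard-coding it."""
    day = [b for b, tag in examples if tag == "daytime"]
    night = [b for b, tag in examples if tag == "night"]
    # Put the decision boundary halfway between the two groups.
    threshold = (min(day) + max(night)) / 2
    return lambda brightness: "daytime" if brightness > threshold else "night"
```

A real species classifier works on millions of pixel-derived features rather than one number, but the shape of the process is the same: the training set of tagged images, not a programmer, determines what a Labrador retriever “looks like.”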

Here’s how it works in your brain. The central nervous system is providing constant  input stimulus. Your brain then makes constant predictions about what the next input should be. When the input does not match the prediction, it recalibrates. You experience this process when you taste something you expect to be sweet and it’s salty, or when you take a step and the level of the ground is not where you expect it to be.

Read the next post here.

Computational Tagging

In my SXSW panel this year, Ramesh Jain and Anna Dickson and I delved into the implications of Artificial Intelligence (AI) becoming a commodity, which will be a commonplace reality by the end of 2017.  We looked at several classes of services and considered what they were good for.

I’ve been spending a lot of time on the subject over the last few months writing The DAM Book 3. Clearly AI will be important in collection management and the deployment of images for various types of communication.

But I hate using the term AI to describe the array of services that help you make sense of your photos. There’s actually a bunch of useful stuff that is not technically AI. Adding date or GPS info is definitely not AI. And linking to other data (like a Wikipedia page) is not really AI (it’s actually just linking). Machine Learning and programmatic tagging come in a lot of forms – some are really basic, and some are complex.

The term Computational Imaging was pretty obscure when the last version of The DAM Book was published, but it’s become a very common term. I think this is a useful concept to extend to the whole AI/Machine Learning/Data Scraping/Programmatic Tagging stack.

In The DAM Book 3, I’m using the term Computational Tagging to refer to all the computer-based tagging methods that involve some level of automation. This runs from the tags made by the computer in my camera to the sophisticated AI environments of the future. At the moment, it’s not a widely used term (Google shows 138 instances on the web), but I think it’s the best general description for the automatic and computer-assisted tagging that is becoming an essential part of working with images.