The DAM Forum
June 19, 2013, 11:25:00 AM

Topic: Controlled vocabulary, keywords, software
themandarin
Newbie
« on: February 17, 2006, 03:39:49 AM »

Hi,
   I would like to start adding keywords to my pictures, but in an organised way, using this idea of a controlled vocabulary. There seems to be some software that has lists of this controlled vocabulary, based on the site of the same name...

What I would like is some software that lets me choose keywords from this controlled vocabulary, create some of my own, and then write them straight to my photo files.

iView and Bridge allow you to create and write keywords; unfortunately, there is no integrated database to choose my keywords from.

Has anyone come across this sort of thing?

Andrew
« Last Edit: February 18, 2006, 10:18:15 AM by peterkrogh »
peterkrogh
Administrator
Hero Member
« Reply #1 on: February 17, 2006, 06:00:50 AM »

Andrew,
We are getting close to this being a reality inside DAM software.  No one's quite there yet.  This is a VERY frequent request.
Peter
G-Force
Newbie
« Reply #2 on: February 17, 2006, 08:55:19 AM »

What about the vocabulary builder in MediaPro 3? You can define a list of terms and then constrain the editing to the defined terms if you choose.

peterkrogh
Administrator
Hero Member
« Reply #3 on: February 17, 2006, 11:34:38 AM »

It does have a controlled vocab, but it is not really hierarchical, at least in the interface.  I'd hold off a little while before putting much work into the current format.
Peter
roberte
Sr. Member
« Reply #4 on: February 17, 2006, 05:08:29 PM »

Hi,

IMatch (Windows) can do this using its categories which can be mapped to IPTC fields such as keywords. However IMatch has a GUI only a mother could love.

I just mentioned Breeze Browser (yes, Windows again) in another post. It can assign keywords as a hierarchical structure. However, being an image browser, BB doesn't let you search them. The keyword list is a tab-delimited file that is easily edited outside the app. Breeze Browser Pro comes with a sample Controlled Vocabulary from David Riecks.

So it is possible. Now iView just needs to implement it in MediaPro.

-- Robert.

havezet
Full Member
« Reply #5 on: February 17, 2006, 05:15:26 PM »

This seems like something I can add to idImager. How would you expect such a feature to work? What would be the purpose of the "hierarchical" vocabulary? Is the hierarchy based on input properties?

Thanks

Hert

Author of IDimager
http://www.idimager.com
peterkrogh
Administrator
Hero Member
« Reply #6 on: February 17, 2006, 05:36:18 PM »

Hert,
This could be very useful. A couple of issues:

The IPTC keywords field does not support hierarchy; it is a flat set.  I would think the best way to do this would be to define the hierarchy in the catalog, and be able to export that hierarchy information in XMP form.  So the same information would live as a flat set in the IPTC fields, and as a relational set in the XMP data.

You need to define several relationships: parent/child, synonyms, and see-alsos are the main ones.
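
A minimal sketch of those three relationships, in Python, might look like the following. The class name, field names, and sample terms are all assumptions made for illustration; no DAM package in this thread is known to store its catalog this way. The hierarchy lives in the structure, and `flat_keywords` produces the flat set that would go into the IPTC keywords field.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Term:
    """One vocabulary term; every field name here is hypothetical."""
    name: str
    synonyms: list[str] = field(default_factory=list)   # alternative names
    see_also: list[str] = field(default_factory=list)   # related, non-equivalent terms
    children: list[Term] = field(default_factory=list)  # parent/child relationship

    def add(self, child: Term) -> Term:
        self.children.append(child)
        return child


def find_path(root: Term, name: str, trail: tuple = ()) -> tuple | None:
    """Return the chain of terms from the root down to the named term."""
    trail = trail + (root,)
    if root.name == name:
        return trail
    for child in root.children:
        found = find_path(child, name, trail)
        if found:
            return found
    return None


def flat_keywords(root: Term, name: str) -> list[str]:
    """Flatten a term, its ancestors, and their synonyms into one flat list."""
    path = find_path(root, name)
    if path is None:
        raise KeyError(name)
    keywords: list[str] = []
    for term in path:
        for kw in [term.name, *term.synonyms]:
            if kw not in keywords:
                keywords.append(kw)
    return keywords


# Illustrative data only.
usa = Term("United States")
dc = usa.add(Term("Washington DC", synonyms=["District of Columbia"]))
mall = dc.add(Term("National Mall", see_also=["Lincoln Memorial"]))
```

See-also entries are kept out of the flat list on purpose: they point to related terms rather than naming the same thing, so a tool would more likely offer them as suggestions than write them automatically.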

This is apart from the ability to import and use a predefined controlled vocabulary dictionary. The conventional wisdom has been that one would access some very large CV document that would encompass all place names, genus and species, concepts, and the like.  I would suggest that it would be better to break this up into subsets.  This could be distributed Wiki style (user created). So I might make one for Washington DC place names, and make it available to other people who might need it.

This brings up the question of interchange of the CV dictionaries, which is a whole different subject.

Peter
havezet
Full Member
« Reply #7 on: February 18, 2006, 06:27:40 AM »

Hi Peter,

Quote from: peterkrogh
The IPTC keywords field does not support hierarchy; it is a flat set.  I would think the best way to do this would be to define the hierarchy in the catalog, and be able to export that hierarchy information in XMP form.  So the same information would live as a flat set in the IPTC fields, and as a relational set in the XMP data.

Thanks for this suggestion. idImager already does this. When storing catalog info to XMP, it automatically flattens existing keywords, stores them as flat XMP keywords (dc:subject), and also maps them to IPTC. It also stores the hierarchical catalog assignments (in hierarchical form) in the "idImager Core XMP Schema". This hierarchy is accessible from any good XMP-supporting tool and should enable other tools to reconstruct the catalog structure if one decides to migrate away from IDI.



This example shows an image's full XMP data. You see that there are 3 labels assigned and I have expanded the first one in full so you see how the label "Hertwig" is organized in the catalog tree.
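
Separately from the screenshot described above, the flattening step itself can be sketched in a few lines of Python. This is not idImager's code, and the catalog paths are invented for illustration; the only part taken from the XMP conventions is that dc:subject holds an unordered rdf:Bag of keyword strings. The idImager-specific hierarchical schema is not reproduced here because its layout is not given in this thread.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("rdf", RDF)
ET.register_namespace("dc", DC)


def flatten(paths):
    """Collapse hierarchical catalog paths into a de-duplicated flat list."""
    flat = []
    for path in paths:
        for part in path:
            if part not in flat:
                flat.append(part)
    return flat


def dc_subject_xml(keywords):
    """Serialise flat keywords as a dc:subject rdf:Bag inside an rdf:Description."""
    desc = ET.Element(f"{{{RDF}}}Description", {f"{{{RDF}}}about": ""})
    subject = ET.SubElement(desc, f"{{{DC}}}subject")
    bag = ET.SubElement(subject, f"{{{RDF}}}Bag")
    for kw in keywords:
        ET.SubElement(bag, f"{{{RDF}}}li").text = kw
    return ET.tostring(desc, encoding="unicode")


# Hypothetical catalog assignments for one image.
paths = [("People", "Family", "Hertwig"), ("People", "Friends")]
keywords = flatten(paths)
xml_text = dc_subject_xml(keywords)
```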

FYI: it also stores the versioning (derived images) information to XMP.

Quote from: peterkrogh
You need to define several relationships: parent/child, synonyms, and see-alsos are the main ones.

Quote from: peterkrogh
This is apart from the ability to import and use a predefined controlled vocabulary dictionary. The conventional wisdom has been that one would access some very large CV document that would encompass all place names, genus and species, concepts, and the like.  I would suggest that it would be better to break this up into subsets.  This could be distributed Wiki style (user created). So I might make one for Washington DC place names, and make it available to other people who might need it.

This brings up the question of interchange of the CV dictionaries, which is a whole different subject.

Thanks for this info Peter. I will add such a feature soon.

Regards

Hert

PS: I would appreciate it if you could move this topic out of the "iView Media Pro" forum. It is not related at all, and I feel uncomfortable posting in a forum that is dedicated to another product.
« Last Edit: February 18, 2006, 06:37:06 AM by havezet »

Author of IDimager
http://www.idimager.com
ianw
Full Member
« Reply #8 on: February 18, 2006, 11:48:19 AM »

Time to stop lurking...

What I think is required is a defined format for controlled vocabularies that can then be used by the DAM package of your choice to enable quick classification of images etc.  It should be XML based and open-source and, importantly, not controlled by a single entity but rather a committee of interested parties.

I've quickly knocked up a couple of examples that would be useful to me.  The first is for US states / cities.  I would see the 'Terms' as what is selected within the DAM package and added to the keywords / pre-defined fields.  The 'Use' details are extra keywords to be added as appropriate, so if you select the term New York City you also get Big Apple thrown in.  To be universal you could add language-specific extras, for example adding the French name for the country as well, if that was the chosen language.  With more information you could specify which IPTC fields each level maps to, so in this case details would go in Country / State / City fields.  It could be extended by someone with local knowledge to produce a New York specific version with an extra level for location.  Then you could have Manhattan, Queens, Bronx, Staten Island etc. and then further down to 5th Avenue, Wall Street etc.  The number of levels should be limited only by requirements / imagination.

Code:
<Vocabulary>
  <Description>United States : Country / State / City</Description>
  <Levels>
    <Level1>Country</Level1>
    <Level2>State</Level2>
    <Level3>City</Level3>
  </Levels>
  <Term1>
    <Value>United States Of America</Value>
    <Term2>
      <Value>New York</Value>
      <Term3>
        <Value>New York City</Value>
        <Use>Big Apple</Use>
      </Term3>
      <Term3>
        <Value>Albany</Value>
        <Use>Albany</Use>
      </Term3>
      <Use>New York</Use>
      <Use>Empire State</Use>
    </Term2>
    <Term2>
      <Value>California</Value>
      <Term3>
        <Value>San Francisco</Value>
      </Term3>
      <Term3>
        <Value>Los Angeles</Value>
      </Term3>
      <Use>Golden State</Use>
    </Term2>
    <Use lang="fr">Etats Unis</Use>
  </Term1>
</Vocabulary>
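
To show how a DAM package might consume a file like this, here is a minimal sketch in Python using only the standard library. The function names and the optional `lang` parameter are assumptions, and the embedded sample is a trimmed copy of the vocabulary above. Selecting a term returns its own value, the values of its ancestors, and every 'Use' entry along the way; language-tagged 'Use' entries are included only when that language is requested.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the vocabulary above, for illustration.
SAMPLE = """
<Vocabulary>
  <Term1>
    <Value>United States Of America</Value>
    <Term2>
      <Value>New York</Value>
      <Term3>
        <Value>New York City</Value>
        <Use>Big Apple</Use>
      </Term3>
      <Use>New York</Use>
      <Use>Empire State</Use>
    </Term2>
    <Use lang="fr">Etats Unis</Use>
  </Term1>
</Vocabulary>
"""


def find_path(node, value, trail=()):
    """Return the chain of TermN elements leading to the term with this Value."""
    for child in node:
        if not child.tag.startswith("Term"):
            continue
        here = trail + (child,)
        if child.findtext("Value") == value:
            return here
        found = find_path(child, value, here)
        if found:
            return found
    return None


def keywords_for(root, value, lang=None):
    """Collect Values and matching Use entries from the top level down to the term."""
    path = find_path(root, value)
    if path is None:
        raise KeyError(value)
    keywords = []
    for term in path:
        candidates = [term.findtext("Value")]
        for use in term.findall("Use"):
            use_lang = use.get("lang")
            # Untagged entries always apply; tagged ones only for the chosen language.
            if use_lang is None or use_lang == lang:
                candidates.append(use.text)
        for kw in candidates:
            if kw not in keywords:
                keywords.append(kw)
    return keywords


root = ET.fromstring(SAMPLE)
default_keywords = keywords_for(root, "New York City")
french_keywords = keywords_for(root, "New York City", lang="fr")
```

The duplicate check matters because the sample repeats "New York" as both a Value and a Use.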

Another example, using the exact same structure as above, is for bird classification.  This only has two levels of detail.  The 'Use' details here are for the Latin order and family names at level 1 and the Latin species name at level 2.  Again there are language-specific values as well.

Code:
<Vocabulary>
  <Description>Birds</Description>
  <Levels>
    <Level1>Family</Level1>
    <Level2>Species</Level2>
  </Levels>
  <Term1>
    <Value>Grebes</Value>
    <Term2>
      <Value>Little Grebe</Value>
      <Use>Podiceps ruficollis</Use>
      <Use lang="en">Dabchick</Use>
      <Use lang="fr">Grèbe Castagneux</Use>
      <Use lang="it">Tuffetto</Use>
      <Use lang="es">Zampullín Chico</Use>
    </Term2>
    <Term2>
      <Value>Black-necked Grebe</Value>
      <Use>Podiceps nigricollis</Use>
      <Use lang="fr">Grèbe a cou noir</Use>
      <Use lang="it">Svasso Piccolo</Use>
      <Use lang="es">Zampullín Cuellinegro</Use>
    </Term2>
    <Use>Podicipediformes</Use>
    <Use>Podicipedidae</Use>
  </Term1>
  <Term1>
    <Value>Herons and Bitterns</Value>
    <Term2>
      <Value>American Bittern</Value>
      <Use>Botaurus lentiginosus</Use>
      <Use lang="fr">Butor D'Amerique</Use>
      <Use lang="it">Tarabuso Americano</Use>
      <Use lang="es">Avetoro Lentiginoso</Use>
    </Term2>
    <Term2>
      <Value>Grey Heron</Value>
      <Use>Ardea cinerea</Use>
      <Use lang="fr">Héron Cendré</Use>
      <Use lang="it">Airone Cenerino</Use>
      <Use lang="es">Garza Real</Use>
    </Term2>
    <Use>Ciconiiformes</Use>
    <Use>Ardeidae</Use>
  </Term1>
</Vocabulary>

I'd welcome it if my DAM package implemented something 'open' like this.  I have thousands of pictures to classify and while category sets are a way to do this I'm going to have a rather large number to deal with.  For example I've all my images geographically tagged, but only down to City level.  I have hundreds of pictures of New York that I'd like to classify a bit better.  If I could use an existing Vocabulary set to assist me I'm sure it would be much quicker, more accurate and easier.

Of course, I'm full of good ideas; I just need someone else to implement them!

Regards,

Ian
peterkrogh
Administrator
Hero Member
« Reply #9 on: February 18, 2006, 02:19:46 PM »

Ian,
That's quite a load to drop without properly introducing yourself. ;-)
(Forgive me if I am spacing out on a previous introduction, or if we correspond on another list...)
I agree wholeheartedly with the approach. 
Now how can we get the ball rolling on something like this...
Peter
johnbeardy
Administrator
Hero Member
« Reply #10 on: February 18, 2006, 02:42:34 PM »

Ian/Peter

That's exactly how I envisage this all working, though I'm a little sceptical about getting DAM packages to adopt such an approach when certainly iView and Portfolio have gone for a more flat file approach.

But storing controlled vocabularies' base data file as XML would make it pretty easy to transform it with an XSL file into the format required by iView for its Vocabulary Editor or into a predefined list file for Portfolio etc.

To do this, as well as repositories of XML data files, people will need a store of XSL files specific to each cataloguing application. While users could supply these, the onus should perhaps be on the vendor to write stylesheets for their required formats and perhaps a web form where the user loads the XML data file, selects the desired output file type, clicks OK and receives an output file in the format required for the application.
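
As a rough illustration of the transformation step, here is a minimal sketch in Python standing in for an XSL stylesheet (Python's standard library has no XSLT processor). It assumes Ian's element names from earlier in the thread and produces a tab-indented outline, one term per line. That output format is an assumption for illustration only; it is not the actual file format of iView's Vocabulary Editor or of Portfolio.

```python
import xml.etree.ElementTree as ET

# Trimmed sample using the element names proposed earlier in the thread.
SAMPLE = """
<Vocabulary>
  <Description>United States : Country / State / City</Description>
  <Term1>
    <Value>United States Of America</Value>
    <Term2>
      <Value>New York</Value>
      <Term3><Value>New York City</Value></Term3>
      <Term3><Value>Albany</Value></Term3>
    </Term2>
    <Term2>
      <Value>California</Value>
      <Term3><Value>San Francisco</Value></Term3>
    </Term2>
  </Term1>
</Vocabulary>
"""


def to_outline(node, depth=0):
    """Return one line per term, indented with one tab per hierarchy level."""
    lines = []
    for child in node:
        if child.tag.startswith("Term"):
            lines.append("\t" * depth + child.findtext("Value", default=""))
            lines.extend(to_outline(child, depth + 1))
    return lines


outline = to_outline(ET.fromstring(SAMPLE))
outline_text = "\n".join(outline)
```

A vendor-specific converter would differ only in the line-building step, which is what makes one shared XML source plus one small transform per application attractive.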

How does that sound?

John
ianw
Full Member
« Reply #11 on: February 18, 2006, 04:47:58 PM »

Peter/John,

I rarely post on forums as I either get ignored or shot down in flames!  I've bought the Rank and File tool, so that's my only communication with Peter, and based on where John is from I've probably ignored him on a train a few times as I'm also from Sarf (East) London!  I'm just an amateur photographer who's too serious about his hobby!  By chance I happened to come across a certain book recently and have since had to rethink everything I do after the shot is taken.

I work in IT for a bank, currently on messaging systems for within the bank and between institutions. Almost gone are the days when every communication was proprietary and usually flat in format, though they still exist in large numbers.  Now everything is to an industry standard, in my case trying to implement ISO standard XML messages.  From my first post you can see I maybe think in messages or data structures more than is probably healthy?  I spend too much time in front of a computer during the week, only to come home and do the same.  This is mainly due to me converting 2+ years of pictures to a DNG workflow!

How does this move forward?  Well how many DAM packages are there?  I've used 2 - excluding Photoshop and Bridge - but dropped one for being too hard to use. The one I use is from a small (?) company; the other from a one-man band. How easy would it be for these two companies to agree to use a 'standard'?  It's probably very difficult, if only because they are based in different countries.  As you then include more packages it gets exponentially harder to achieve. That is unless there is something in it for them.  In the banking world it is (now) easier to get standards adopted, as you can always dangle cost savings in front of managers.  Problem is there's no equivalent carrot for DAM packages.  I see two ways of it being taken up.  The first is that a big gun such as Adobe does it.  When they jump everyone else jumps, albeit with varying delays and levels of success.  The other is that a particular package comes up with the killer solution – the problem is convincing them to use an open standard.

Short term, that means that John’s suggestion is probably the more realistic way to move forward.  If you have just the hierarchical data such as Country / State / City / Location in an XML document, then XSL should easily convert this to any format required by a DAM package.  However, how do you get this to add extra data, such as New York = Big Apple?  It also makes internationalisation difficult, if not impossible, unless you have separate English, French, German etc. vocabularies.

In the medium term would it not be possible to develop a Bridge script that can handle this?  Select your image(s) then select the vocabulary set and finally the term(s) and your fields and keywords get populated.  Didn't I say I’m full of ideas!  I’m sure that it’s not easy to do.  It would also exclude those who don’t use Bridge, but how many people would that be?

How do you then create the vocabularies?  Once a 'standard' is agreed then this is probably the easy part.  There are too many ways of doing it, from manually in Notepad and upwards.  If they correctly implement it and the resulting vocabulary is valid against the schema then they would all be right, so let the best one win.

Two posts in a day is just too much so I'm off to bed!

Regards,

Ian
johnbeardy
Administrator
Hero Member
« Reply #12 on: February 19, 2006, 04:11:14 AM »

Yeah, probably ignored you too, or didn't move down the carriage and let you squeeze on the train!

I was going for a realistic solution and for the sort of reasons you give. Software vendors aren't likely to be quick enough changing the format of the lists they use to populate drop down boxes or autofill lists, and that's not really necessary. I've had no trouble converting the XML of IPTC news codes and ISO country codes into the formats needed by Portfolio and iView. So, software vendors only need to make those formats user-editable and users can make XML files available and figure out the rest.

I doubt there can be a standard XML structure format except where there's something as obvious as your country - location structure. Even then, remember all the changes we've undergone in Europe in the past 16 years, plus you add in internationalisation and synonyms. (Incidentally, the NY=Big Apple question is probably easy enough to resolve for iView, since its vocabulary list is two-dimensional with comma-separated synonyms. Type in NY and all the synonyms are filled in, too.)
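
A minimal sketch of that synonym idea, in Python: each term becomes one row, with the term first and its 'Use' entries after it, separated by commas. The element names come from Ian's example; the exact row layout iView expects is not shown in this thread, so the output format here is an assumption.

```python
import xml.etree.ElementTree as ET

# Trimmed sample using the element names proposed earlier in the thread.
SAMPLE = """
<Vocabulary>
  <Term1>
    <Value>United States Of America</Value>
    <Term2>
      <Value>New York</Value>
      <Term3>
        <Value>New York City</Value>
        <Use>Big Apple</Use>
      </Term3>
      <Use>New York</Use>
      <Use>Empire State</Use>
    </Term2>
    <Use lang="fr">Etats Unis</Use>
  </Term1>
</Vocabulary>
"""


def synonym_rows(root):
    """Return one 'term, synonym, synonym' row per term, in document order."""
    rows = []
    for element in root.iter():
        if not element.tag.startswith("Term"):
            continue
        names = [element.findtext("Value")]
        for use in element.findall("Use"):
            # Skip a Use that merely repeats the term's own name.
            if use.text not in names:
                names.append(use.text)
        rows.append(", ".join(names))
    return rows


rows = synonym_rows(ET.fromstring(SAMPLE))
```

This version treats language-tagged entries as ordinary synonyms, which sidesteps the internationalisation question rather than answering it.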

Another reason we can't hope for an agreed standard structure is that rich and structured XML data is going to be available in larger quantities from people with completely different interests. For instance, I just tried to find bird species in XML and quickly found a couple of large files which convinced me and looked easy enough to transform into keywords for iView or Bridge. Another incidental point is that Bridge stores its keywords as XML but you can't swap XML files on the fly. Also you can transform XML into Bridge's metadata templates. Once the data's there in XML format, the world's your lobster.

So to me it seems it's going to be most practical to couple an XML collecting exercise (maybe Peter's suggestion of a Wiki style distributed effort) with advocacy of the transformation techniques. Even small things like pointing people to the best XML editing tools might make a big difference.

John
« Last Edit: February 19, 2006, 04:38:56 AM by johnbeardy »
havezet
Full Member
« Reply #13 on: February 19, 2006, 04:38:19 AM »

I like the concept of Ian's XML above. But one question that instantly pops into my mind is: why not use XMP for this instead of plain XML? The advantage is that XMP is becoming generally accepted in DAM solutions, it offers standardized support for internationalization/localization, and it makes working with bags etc. very easy, also in a standard manner. To me, XMP is the better base for a voc solution.

My guess is that the different vendors are not going to be open to supporting a generic voc solution in the short term. The main reason is that DAM vendors are not "revolutionary" enough and don't want to be the party that introduces new (open) standards. This is something that will need to be pushed from the market or, as Ian said, from a big gun like Adobe (and only Adobe). Such a big gun causes the push from the market, so the DAM vendors must follow. If you ever want Adobe to even consider a standardized voc, you'll need to walk their path and use technological standards that they consider strategic. XMP is one of them.

If you don't agree that XMP would be a good solution, please say so, so that we can have that discussion at the beginning.

Anyway, I am currently defining a concept for how a voc could look in XMP, also based on Ian's XML. I will share my results here for you all to shoot at.
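
Until that concept is posted, here is one hedged guess at what a single term could look like, sketched in Python. It is not Hert's design. The `voc` namespace URI and the `label` and `synonyms` property names are hypothetical; what is standard XMP practice is the use of an rdf:Alt with xml:lang attributes (and an "x-default" entry) for localized text, and an rdf:Bag for an unordered list.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
XML_NS = "http://www.w3.org/XML/1998/namespace"
VOC = "http://example.com/ns/voc/1.0/"  # hypothetical namespace, for illustration

ET.register_namespace("rdf", RDF)
ET.register_namespace("voc", VOC)


def term_to_xmp(labels, synonyms):
    """Serialise one term: localized labels as rdf:Alt, synonyms as rdf:Bag."""
    desc = ET.Element(f"{{{RDF}}}Description", {f"{{{RDF}}}about": ""})

    label = ET.SubElement(desc, f"{{{VOC}}}label")
    alt = ET.SubElement(label, f"{{{RDF}}}Alt")
    for lang, text in labels.items():
        item = ET.SubElement(alt, f"{{{RDF}}}li", {f"{{{XML_NS}}}lang": lang})
        item.text = text

    syn = ET.SubElement(desc, f"{{{VOC}}}synonyms")
    bag = ET.SubElement(syn, f"{{{RDF}}}Bag")
    for text in synonyms:
        ET.SubElement(bag, f"{{{RDF}}}li").text = text

    return ET.tostring(desc, encoding="unicode")


# Sample values taken from Ian's bird vocabulary.
xmp = term_to_xmp(
    {"x-default": "Little Grebe", "fr": "Grèbe Castagneux", "it": "Tuffetto"},
    ["Dabchick", "Podiceps ruficollis"],
)
```

Parent/child links are left out of this sketch; they would need either nested structures or references between terms, which is exactly the part that would have to be agreed on.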

Important aspects are:
Terminology used, outline the scope.

Afterwards:
Getting this generally accepted in the market will not be simple. We need to think about the constraints for this. Myself I'd say for now: 1. there "must" be a freeware tool allowing people to create new "domains", 2. there must be a website/community where existing voc domains can be downloaded/shared, 3. there must be a board that "accepts" new domains to prevent redundant overkill

Hert
« Last Edit: February 19, 2006, 04:40:30 AM by havezet »

Author of IDimager
http://www.idimager.com
Muzza
Jr. Member
« Reply #14 on: February 19, 2006, 05:54:57 AM »

I too am interested in keywords with a hierarchy. I'm at the stage of adding all the hierarchical keywords to my files in a flat structure, not because I want to, but because it's the only way I know. So I got the idea of adjusting Bridge's Keywords palette with the help of a friend. Below is an example of a *.xsd file which is referenced by the keyword *.xml file in Bridge. This would still be a flat structure, but I'm only offering an extra idea, so please take it for what it's worth.

<?xml version="1.0" encoding="utf-8"?>
<xs:schema attributeFormDefault="unqualified" elementFormDefault="qualified" xmlns:xs="http://www.w3.org/2001/XMLSchema">
   <xs:element name="keywords">
      <xs:complexType>
         <xs:sequence>
            <xs:element name="Events" type="xs:string" />
            <xs:element name="People" type="xs:string" />
            <xs:element name="Places">
               <xs:complexType mixed="true">
                  <xs:sequence>
                     <xs:element name="Suburbs">
                        <xs:complexType mixed="true">
                           <xs:sequence>
                              <xs:element name="Streets" type="xs:string" />
                           </xs:sequence>
                        </xs:complexType>
                     </xs:element>
                  </xs:sequence>
               </xs:complexType>
            </xs:element>
            <xs:element name="OtherKeywords" type="xs:string" />
         </xs:sequence>
      </xs:complexType>
   </xs:element>
</xs:schema>

All the best Muzza!!!