The DAM Forum
December 22, 2014, 12:52:41 PM

Author Topic: Great Windows Utility For Verifying Archive Photos  (Read 16417 times)
danaltick
« Reply #15 on: July 23, 2009, 07:27:01 PM »

Peter,

I see.  I've been fixated on the DNGs, but the derivatives are also relevant here.  Macro validation sounds like a worthwhile pursuit.  Thanks for clearing that up.  I guess the only concern here, though, might be the longevity of the validation software or the proprietary nature of the hash codes.

Dan
« Last Edit: July 23, 2009, 07:31:39 PM by danaltick » Logged

WindowsXP, ImageIngester Pro, RapidFixer, IVMP 3, ACR4, Photoshop CS4, Controlled Keyword Catalog, Canon EOS50D
peterkrogh
« Reply #16 on: July 24, 2009, 01:20:24 PM »

Sounds like the hashes are open-source MD5 (or other).  I assume the mechanics of tracking how the hashes correspond to the directory structure are proprietary.

Peter
Logged
Dan Zemke
« Reply #17 on: July 24, 2009, 03:04:04 PM »

Peter's essentially right except for the proprietary assumption.

MD5 is freely available from many sources.  Linux and Mac systems ship with an MD5 tool.  Programming languages such as Python and Ruby also have MD5 routines as part of their standard libraries.  And MD5 is publicly specified: the Internet Engineering Task Force (IETF), the group that publishes the documents describing how the internet is to work, documents the MD5 algorithm in RFC 1321.
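
To illustrate the standard-library point, here is a minimal Python sketch that uses the built-in hashlib module. The helper name md5_hex is made up for this example; the two inputs are test cases listed in RFC 1321.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the MD5 hash of a byte string as 32 hex characters."""
    return hashlib.md5(data).hexdigest()


# Two of the test inputs from the RFC 1321 test suite.
print(md5_hex(b""))     # d41d8cd98f00b204e9800998ecf8427e
print(md5_hex(b"abc"))  # 900150983cd24fb0d6963f7d28e17f72
```

Any other MD5 implementation should print the same two values, which is what makes the hash codes themselves portable between tools.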

ExactFile outputs a single text line for every file it computes a hash code for.  The line starts with 32 hex characters, which represent the computed hash code, followed by the path to the file relative to the text file that ExactFile creates.  This is an example of one line from my system (the star before the path appears to follow the md5sum digest format, where it marks a file hashed in binary mode):

382b6a99e0632abf51cd9b36684b3c2d *DRZ_Photos\2009\2009_01\Ice Storm\DRZ_090107-c0443-O1.CR2
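
Because the format is this simple, other tools can read it. The following is a rough Python sketch of splitting such a line into its parts. It assumes the layout seen in the example line above (32 hex characters, a space, a one-character marker that is a star or a space, then the path); the names DigestEntry and parse_digest_line are invented for illustration and are not part of ExactFile.

```python
import re
from typing import NamedTuple


class DigestEntry(NamedTuple):
    hash_hex: str  # the 32 hex characters, lower-cased
    marker: str    # the single character before the path ("*" or " ")
    path: str      # the path exactly as stored in the digest line


# Assumed layout: <32 hex chars><space><marker><path>
_LINE = re.compile(r"^([0-9a-fA-F]{32}) ([ *])(.+)$")


def parse_digest_line(line: str) -> DigestEntry:
    """Split one digest line into hash, marker, and path."""
    match = _LINE.match(line.rstrip("\r\n"))
    if match is None:
        raise ValueError(f"not a digest line: {line!r}")
    return DigestEntry(match.group(1).lower(), match.group(2), match.group(3))


entry = parse_digest_line(
    r"382b6a99e0632abf51cd9b36684b3c2d *DRZ_Photos\2009\2009_01\Ice Storm\DRZ_090107-c0443-O1.CR2"
)
print(entry.hash_hex)
print(entry.path)
```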

Dan
Logged
JoeThePhotographer
« Reply #18 on: November 22, 2009, 05:03:39 AM »

Forgive my ignorance, but does this go beyond the capabilities of Syncback SE?

Joe
Logged
peterkrogh
« Reply #19 on: November 22, 2009, 07:25:01 AM »

Joe,
there are several kinds of data validation.
Syncback can tell you that a file was copied properly.
That does not tell you anything about the file even one instant after the copying has finished.
A checksum or hash can be used to tell if the file changes at any point in the future.

http://www.dpbestflow.com/data-validation/data-validation-overview
Peter
Logged
Jake Livni
« Reply #20 on: October 24, 2010, 02:51:42 PM »

Dan,

I want to try out ExactFile; it sounds like it does (did?) some of the things I need at the moment.
(I want to confirm the integrity of a copy of a photo archive of 250 - 300 GB before deleting the old archive.  It seems that ExactFile can do this.)

Do you have any updated advice on this since you last wrote about it?
Any warnings or suggestions?
Alternatives?

It sounds simple enough.

Thanks.

Jake

Logged

Canon, Nikon Coolscan, Win XP
JoeThePhotographer
« Reply #21 on: February 13, 2011, 06:43:27 PM »

Any updates on ExactFile?  It looked promising, but development seems to have slowed.

Would it actually be hard to write a script that creates an MD5 hash (which I understand is freely transferable) in a separate file for every file on a disk?
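
For a sense of scale, a script like that could look roughly like the Python sketch below. It is only an illustration under a few assumptions: the function names are made up, each hash goes into a sidecar file named after the original with ".md5" appended, and the line inside uses the "hash, space, star, file name" layout quoted earlier in this thread.

```python
import hashlib
from pathlib import Path


def file_md5(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large image files need not fit in memory."""
    md5 = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            md5.update(chunk)
    return md5.hexdigest()


def write_sidecars(root: Path) -> int:
    """Write '<name>.md5' next to every file under root; return how many."""
    count = 0
    # sorted() lists everything first, so sidecars created during the loop
    # are not visited in the same run.
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix != ".md5":
            sidecar = path.with_name(path.name + ".md5")
            sidecar.write_text(f"{file_md5(path)} *{path.name}\n", encoding="utf-8")
            count += 1
    return count
```

Calling write_sidecars(Path("D:/Photos")) (a hypothetical folder) would leave one small text file beside each image. Existing ".md5" files are skipped, so a second run overwrites the sidecars rather than hashing them.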

Joe
Logged
Jake Livni
« Reply #22 on: February 14, 2011, 01:14:29 AM »

I have been using ExactFile in the last few months and like it very much.  It works very nicely and does what I want - it gives me confidence that data on disk is precisely the data I put there and what I expect it to be.  Every single bit.

In principle, it's quite simple.  It calculates a Hash (MD5 or your choice of many others) for each file in a directory structure (including all sub-directories, if you choose) and saves the calculated hash together with the filespec on a single line of a simple text file (which they call a Digest).  The result is a text file, the Digest, (easily read in Notepad) that lists the files with the hashes, one line per file.  You can choose where to save this Digest. 
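
The Digest step described above can be sketched in a few lines of Python. This is not ExactFile's code, just an illustration of the idea: the function names are hypothetical, MD5 is hard-coded, and paths are written relative to the folder being scanned (which matches ExactFile's behavior only when the Digest is saved in that folder).

```python
import hashlib
from pathlib import Path


def file_md5(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files need not fit in memory."""
    md5 = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            md5.update(chunk)
    return md5.hexdigest()


def create_digest(root: Path, digest_path: Path) -> int:
    """Write one 'hash *relative/path' line per file under root."""
    lines = []
    for path in sorted(root.rglob("*")):
        # Skip the digest itself in case it is stored inside root.
        if path.is_file() and path.resolve() != digest_path.resolve():
            lines.append(f"{file_md5(path)} *{path.relative_to(root)}\n")
    digest_path.write_text("".join(lines), encoding="utf-8")
    return len(lines)
```

The return value is the number of files listed, which is handy to note down for a later file-count comparison.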

Then, at any time in the future, you can run ExactFile again against the Digest and it will again create hashes for each of the files listed in the Digest and also compare them to the hashes that were produced in the original run.  If everything matches, every bit is in the right place and ExactFile will report this.  If there are any discrepancies or missing files, ExactFile will alert you to which was the failing (or missing) file.  (If you've added a file to the directory somewhere and it isn't listed in the Digest, ExactFile will not spot this, so it isn't a complete mirror-checker.  However a simple "properties" check in Windows to check number of files and space-on-disk can confirm complete mirroring.)
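
The checking pass can be sketched the same way. The Python below is an illustration, not ExactFile's implementation: verify_digest is a made-up name, it assumes "hash *relative/path" lines written on the same operating system, and it adds an "unlisted" category for files that are on disk but not in the Digest, which is the gap described above.

```python
import hashlib
from pathlib import Path


def file_md5(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files need not fit in memory."""
    md5 = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            md5.update(chunk)
    return md5.hexdigest()


def verify_digest(digest_path: Path, root: Path) -> dict:
    """Re-hash every file listed in the digest and sort the results."""
    report = {"ok": [], "changed": [], "missing": [], "unlisted": []}
    listed = set()
    for line in digest_path.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue
        # The hash is pure hex, so the first " *" is the separator.
        expected, _, rel = line.partition(" *")
        listed.add(Path(rel))
        target = root / rel
        if not target.is_file():
            report["missing"].append(rel)
        elif file_md5(target) != expected.strip().lower():
            report["changed"].append(rel)
        else:
            report["ok"].append(rel)
    # Extra step: files present on disk but absent from the digest.
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.resolve() == digest_path.resolve():
            continue
        if path.relative_to(root) not in listed:
            report["unlisted"].append(str(path.relative_to(root)))
    return report
```

An archive is a bit-for-bit match for the Digest when "changed" and "missing" are both empty, and a complete mirror when "unlisted" is empty as well.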

MD5 is a good all-purpose hash method, though mathematically it is a touch short of 100% certain: two different files can in principle produce the same hash.  Other hash methods may be more reliable or less so and may take more or less CPU to calculate.  For my purposes in photography, MD5 is more than satisfactory.  (I am not a mathematician by training.)
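
To give a feel for what choosing a different hash method means in practice, here is a small Python sketch that hashes the same bytes with three algorithms from the standard hashlib module. The helper name hash_hex and the choice of SHA-1 and SHA-256 as comparison points are my own; which methods ExactFile offers is not shown here.

```python
import hashlib


def hash_hex(data: bytes, algorithm: str = "md5") -> str:
    """Hash data with the named algorithm and return hex characters."""
    return hashlib.new(algorithm, data).hexdigest()


# Longer hashes leave less room for two different files to collide,
# at the cost of somewhat more computation and longer digest lines.
for name in ("md5", "sha1", "sha256"):
    digest = hash_hex(b"abc", name)
    print(f"{name:7s} {len(digest) * 4:3d} bits  {digest}")
```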

Something I haven't done but which is available with ExactFile is the ability to also create an applet to store with the Digest, say on optical media.  This way, you can distribute (or save) burned DVDs and let someone in another place (or time) run the applet to confirm that the files are all valid, even without installing the application again.  Neat.

I have used ExactFile a number of times to confirm the integrity of archive files after transfer from disk to disk and before deleting the originals.  I have used it on hundreds of GB of data - and it works just fine even when run in a single run with thousands of image files in a deep directory structure and with a hundred or so GB of data.  In my experience, it will on rare occasion raise a false alarm when there are sub-directory names in a foreign language (right-to-left text), claiming a file to be missing; a quick check of the Digest will show a one-char spelling error and that the file is indeed present.  (You can also create a single-file hash to check things manually on a single file.)  If you're using English-language file names (and character-set) throughout, I wouldn't expect this problem to come up.

When creating the Digest, be careful to UNCHECK the "Include Full Paths in Output" so that you can compare the source files to the target files stored in ANOTHER location.  Otherwise, when checking files, it will check the originals, only.  If that's what you want to do (e.g. on DVD), fine.  For file transfers, though, you want this UNCHECKED.
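
The reason the setting matters can be shown with a small path-handling sketch in Python. This only models the behavior described above, not ExactFile's internals: resolve_entry is a hypothetical name, and the drive letters and folders in the usage note are invented.

```python
from pathlib import PureWindowsPath


def resolve_entry(stored_path: str, verify_root: str) -> PureWindowsPath:
    """Work out which file a digest entry points at during a check."""
    stored = PureWindowsPath(stored_path)
    if stored.is_absolute():
        # A full path always leads back to the original location,
        # no matter where the check is run from.
        return stored
    # A relative path is resolved against the folder being checked,
    # so the same digest can validate a copy stored elsewhere.
    return PureWindowsPath(verify_root) / stored
```

With a relative entry such as "2009\a.CR2", checking from "E:\Backup" looks at "E:\Backup\2009\a.CR2". With a full-path entry such as "C:\Photos\2009\a.CR2", the same check still looks at the original on C:, which is why full paths are unsuitable for validating a transfer.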

Yes, the copy I downloaded a few months ago seems to be Beta but it works very nicely.  The confidence it inspires in file transfers (or storage) is valuable.

Jake

Logged

Canon, Nikon Coolscan, Win XP