If some of your duplicate files have had metadata embedded *since* the duplication, then a checksum-based approach won't help: change one bit in the file (including its metadata) and the checksum will differ. Likewise, checksums won't help if you're also trying to sort out derivative images which may have found their way back into your catalog.
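To make that concrete, here is a minimal Python sketch of a checksum pass. The function names and the choice of SHA-256 are my own assumptions, not anything specific to your catalog software. It groups files that are byte-for-byte identical, and by construction a copy that differs by even one byte (say, after a metadata write) lands in a different group.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_checksum(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_exact_duplicates(root):
    """Group files under root by checksum; return groups of 2+ files.

    Hypothetical helper: only byte-identical files end up together.
    """
    groups = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            groups[file_checksum(path)].append(path)
    return [paths for paths in groups.values() if len(paths) > 1]
```

Calling `find_exact_duplicates("/path/to/catalog")` (a placeholder path) returns a list of lists, each holding the paths that share a checksum.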
Without knowing more about your situation or how technically oriented you are (are you comfortable with the command line and scripting?), I can offer these bits of advice:
1. Checksums are a great, easy first pass for catching byte-for-byte identical copies
2. If you've got multiple copies of a DNG, perhaps even with different settings/metadata applied, look into the DNG tag "RawDataUniqueID" as reported by exiftool; its value should remain the same across multiple edits of the DNG
3. If you have a mix of different file formats, sizes, etc., and need a visual similarity tool, I've found VSDIF (Windows only) to be EXCELLENT. It's available at http://www.mindgems.com/products/VS-Duplicate-Image-Finder/VSDIF-About.htm
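
If you are comfortable with scripting, tip 2 can be sketched roughly as follows. This assumes exiftool is installed and on your PATH; the function names are hypothetical, and the script relies only on exiftool's `-j` (JSON output) option plus the `-RawDataUniqueID` tag request. Files that don't report the tag are simply skipped.

```python
import json
import subprocess
from collections import defaultdict


def read_raw_ids(paths):
    """Ask exiftool for RawDataUniqueID on each file (assumes exiftool on PATH)."""
    result = subprocess.run(
        ["exiftool", "-j", "-RawDataUniqueID", *map(str, paths)],
        capture_output=True,
        text=True,
    )
    # exiftool -j prints a JSON array with one object per file
    return json.loads(result.stdout) if result.stdout.strip() else []


def group_by_raw_id(records):
    """Group exiftool JSON records by RawDataUniqueID; keep IDs seen 2+ times."""
    groups = defaultdict(list)
    for record in records:
        raw_id = record.get("RawDataUniqueID")
        if raw_id:  # skip files that do not carry the tag
            groups[str(raw_id)].append(record["SourceFile"])
    return {raw_id: files for raw_id, files in groups.items() if len(files) > 1}
```

Something like `group_by_raw_id(read_raw_ids(list_of_dng_paths))` then gives you a dictionary mapping each shared ID to the files that carry it, which you can review by hand before deleting anything.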