Distances: my personal experience

Just a few suggestions based on my own experience. For a more formal and detailed treatment, see Wikipedia or a mathematics textbook.

Quantitative Data Arrays

For quantitative data (e.g. transcription signals from microarray experiments) you can use

Binary Arrays

In my opinion, Hamming distance is a poor choice when the strings being compared have a very unequal 1/0 content and the 1s and 0s encode set membership (as in the example above). Hamming counts a position where both strings are 0 as agreement, so shared absences can dominate the result.

E.g., consider these strings, and the choice made by Hamming and Jaccard:

This is particularly important because some computational people would probably pick Hamming as their first choice without checking whether it is valid in the context of use.
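A minimal sketch of how the two measures can disagree on sparse membership strings. The strings `ref`, `a` and `b` are made up for illustration (they are not taken from this page), and the helper names `hamming` and `jaccard_distance` are my own; Jaccard distance is taken here as 1 minus the ratio of shared 1s to positions where at least one string has a 1.

```python
def hamming(x, y):
    """Count the positions where two equal-length binary strings differ."""
    if len(x) != len(y):
        raise ValueError("strings must have equal length")
    return sum(cx != cy for cx, cy in zip(x, y))


def jaccard_distance(x, y):
    """1 - |intersection| / |union|, reading each '1' as set membership."""
    if len(x) != len(y):
        raise ValueError("strings must have equal length")
    both = sum(cx == "1" and cy == "1" for cx, cy in zip(x, y))
    either = sum(cx == "1" or cy == "1" for cx, cy in zip(x, y))
    # Two empty sets are treated as identical (an assumption of this sketch).
    return 0.0 if either == 0 else 1.0 - both / either


# Hypothetical strings: mostly 0s, with 1 meaning "is a member of the set".
ref = "1100000000"
a = "0010000000"  # shares no member with ref
b = "1111110000"  # contains both members of ref, plus four others

for name, s in (("a", a), ("b", b)):
    print(name, hamming(ref, s), round(jaccard_distance(ref, s), 3))
```

With these strings Hamming ranks `a` as the closer one (3 mismatches against 4), although `a` shares no member with `ref`. Jaccard ranks `b` closer, because it ignores the positions where both strings are 0.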

DanieleMerico/HowtoDirectory/Distances (last edited 2007-11-15 23:25:56 by DanieleMerico)
