in reply to Edit distance between regular expression
The mathematical concept of "distance" is broad, yet rigorously defined: a distance function (or metric) must satisfy certain properties. The Levenshtein Distance (LD) is one such metric. It measures the difference between two strings as the minimum number of single-character insertions, deletions and substitutions needed to turn one into the other.
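For reference, the properties usually required of a metric d, for all x, y and z in the set being measured, can be written as below. These are the standard textbook axioms, stated here only as background:

```latex
\begin{align*}
d(x, y) &\ge 0 \\
d(x, y) &= 0 \iff x = y \\
d(x, y) &= d(y, x) \\
d(x, z) &\le d(x, y) + d(y, z)
\end{align*}
```

The LD satisfies all four when x, y and z are strings.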
Your question is different. You are not comparing two strings but a string and a glob. By definition the LD is not defined for such a pair, unless the glob expands to exactly one string.
In your example you suggest that LD(*cxd*,adcdef)=1. But that holds only if *cxd* expands to a single string that is one edit operation away from adcdef, for instance adcxdef (delete the x).
Maybe you could generalize the distance function for your particular application. The result might no longer be a proper distance function, but it may still be useful depending on your problem. After expanding the glob into {s1, s2, …, sn}, you could compute the LD between each of those strings and the original string, giving {d1, d2, …, dn}. Depending on what it means for your application, which I don't know, you can then take the lowest, highest or average value, or whatever else makes sense to you.
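The idea above could be sketched roughly as follows. This is only an illustration in Python, not a definitive implementation: the function names `levenshtein` and `glob_distances` are made up for this sketch, and the list of expansions is a hand-picked, hypothetical finite sample. A glob such as *cxd* matches infinitely many strings, so in practice you would have to restrict the candidates somehow (for example to the entries of a word list).

```python
from statistics import mean


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming Levenshtein distance, one row at a time."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute ca -> cb (or keep)
            ))
        prev = curr
    return prev[-1]


def glob_distances(expansions, target):
    """LD between each expanded string s1..sn and the target: d1..dn."""
    return [levenshtein(s, target) for s in expansions]


# Hypothetical, hand-picked strings that match *cxd* (an assumption,
# not a real glob expansion).
expansions = ["adcxdef", "acxdef", "cxd"]
target = "adcdef"

distances = glob_distances(expansions, target)

# Pick whichever aggregate is meaningful for the application.
lowest = min(distances)   # 1, reached by "adcxdef"
highest = max(distances)
average = mean(distances)
```

Taking the lowest value corresponds to "how close can any matching string get", which is probably what LD(*cxd*,adcdef)=1 was meant to express, but that choice depends on the application.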
I wonder what the particular application is you're after.
Re^2: Edit distance between regular expression
by amegcita (Initiate) on Sep 24, 2008 at 08:40 UTC