PerlMonks  

Re^2: Most specific pattern

by thor (Priest)
on Jul 01, 2005 at 17:02 UTC ( [id://471773]=note )


in reply to Re: Most specific pattern
in thread Most specific pattern

I agree with this...however, the devil's in the details. Given an arbitrary regex, how does one create a metric around how many non-wildcarded characters are in it? Is there a module that takes care of this? Or at least one that one could bend to make it fit?

thor

Feel the white light, the light within
Be your own disciple, fan the sparks of will
For all of us waiting, your kingdom will come

Replies are listed 'Best First'.
Re^3: Most specific pattern
by jhourcle (Prior) on Jul 02, 2005 at 15:04 UTC

    That was just one of the metrics I could think of. Unless the number of regexes is so large that you can't rank them yourself, I'd rank them by hand (going with the assumption that I know more about the process than a regex does).

    If I had to go completely on just odds of matching, I would think it'd be easiest to take a representative sample of inputs, and test them against each of the regexes, and build a table with the odds.
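A minimal sketch of that sampling idea, written in Python for illustration (the same approach carries over to Perl). The function names `match_rates` and `rank_by_specificity`, the sample inputs, and the patterns are all hypothetical; the only thing taken from the post is the idea of testing a representative sample against each regex and tabulating how often each one matches.

```python
import re

def match_rates(patterns, samples):
    """Return {pattern: fraction of samples that the pattern matches}."""
    if not samples:
        raise ValueError("need at least one sample input")
    rates = {}
    for pat in patterns:
        rx = re.compile(pat)
        hits = sum(1 for s in samples if rx.search(s))
        rates[pat] = hits / len(samples)
    return rates

def rank_by_specificity(patterns, samples):
    """Order patterns from most specific (matches least) to least specific."""
    rates = match_rates(patterns, samples)
    return sorted(patterns, key=lambda p: rates[p])

# Hypothetical sample inputs and candidate patterns.
samples = ["foo.txt", "foo.log", "bar.txt", "README"]
patterns = [r".*", r"\.txt$", r"^foo\.txt$"]

rates = match_rates(patterns, samples)
ranking = rank_by_specificity(patterns, samples)
```

With these made-up inputs, `^foo\.txt$` matches one sample in four and ranks as most specific, while `.*` matches everything and ranks last. The result is only as good as the sample: a pattern that never matches the sample looks maximally specific even if it is simply wrong.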

    If you don't have a log of those inputs for testing, then we'd have to get more creative. I might use something like the following:

    • Any character or zero-width assertion gets 1 point (unless the assertion is pointless, like the \b in '\W\b\w').
    • Any character class of n characters gets f(n) points, where f(n) yields a number less than one, and decreases as n increases (maybe 1/n, or sqrt(1/n) )
    • Quantifiers reduce the value of the items they modify, perhaps as multipliers ( ? = 0.5; + = 0.6; * = 0.25; +? = 0.7; *? = 0.35 ). (I'm just pulling numbers out of the air; you'd want to tweak them until you get good results for your situation.)
    • An alternation scores something less than the point value of each of its alternatives. (I have no clue on a formula for this one...)
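The point scheme above can be roughed out as a small scorer. This is a sketch under stated assumptions, not a real regex parser: it is written in Python for illustration, handles only literals, escaped literals, `^`, `$`, `\b`, `.`, simple non-negated character classes, and the five quantifiers listed, and it rejects everything else (including alternation, for which the post gives no formula). The names `score`, `class_size`, `QUANT`, and `DOT_SIZE` are hypothetical, `f(n) = 1/n` is one of the two options suggested, and treating `.` as a class of 256 characters is my assumption.

```python
# Multipliers quoted from the post, where they are described as guesses to tune.
QUANT = {"?": 0.5, "+": 0.6, "*": 0.25, "+?": 0.7, "*?": 0.35}
DOT_SIZE = 256  # assumption: score '.' like a class of 256 characters

def class_size(body):
    """Count the characters in a simple class body such as 'a-z0-9_'."""
    n, i = 0, 0
    while i < len(body):
        if i + 2 < len(body) and body[i + 1] == "-":
            n += ord(body[i + 2]) - ord(body[i]) + 1  # range like a-z
            i += 3
        else:
            n += 1
            i += 1
    return n

def f(n):
    """Points for a class of n characters; 1/n is one suggested choice."""
    return 1.0 / n

def score(pattern):
    """Heuristic specificity score: higher means more specific."""
    total, i = 0.0, 0
    while i < len(pattern):
        c = pattern[i]
        if c in "()|{}?*+":
            raise ValueError(f"unsupported construct {c!r} at {i}")
        if c == "[":
            end = pattern.index("]", i + 1)
            body = pattern[i + 1:end]
            if not body or body[0] == "^" or "\\" in body:
                raise ValueError("only simple, non-negated classes supported")
            points = f(class_size(body))
            i = end + 1
        elif c == ".":
            points = f(DOT_SIZE)
            i += 1
        elif c == "\\":
            nxt = pattern[i + 1:i + 2]
            if not nxt or (nxt.isalnum() and nxt != "b"):
                raise ValueError(f"unsupported escape at {i}")
            points = 1.0  # escaped literal, or the \b assertion
            i += 2
        else:
            points = 1.0  # literal character, or the ^ / $ assertions
            i += 1
        # Apply a trailing quantifier, checking two-character forms first.
        for q in ("+?", "*?", "?", "+", "*"):
            if pattern.startswith(q, i):
                points *= QUANT[q]
                i += len(q)
                break
        total += points
    return total

# Hypothetical patterns, ranked from most to least specific.
ranked = sorted([".*", "[a-z]+", "foo"], key=score, reverse=True)
```

Here `foo` scores 3 points, `[a-z]+` scores (1/26) × 0.6, and `.*` scores (1/256) × 0.25, so `ranked` puts `foo` first and `.*` last. The sketch does not detect pointless assertions such as the `\b` in `\W\b\w`; they still get a full point.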

    I'm not aware of a module to do this sort of thing, but that doesn't mean that there isn't one out there.
