LanX has asked for the wisdom of the Perl Monks concerning the following question:
I'm meditating on a regex-based heuristic to roughly detect whether a text paragraph (multiple lines, delimited by '\n\n') is more likely Perl source code than normal text.
The best idea I've had so far is to use regexes to count the lines ending with ';' or '}', possibly followed by a '#' comment.
Another is to check the frequency of words starting with a sigil.
I'm not talking about a valid parser, just a fuzzy detector.
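To make the two ideas concrete, here is a rough sketch in JavaScript. The function names, the sigil pattern and the thresholds are all my own assumptions for illustration; none of them are tuned against real posts.

```javascript
// Hypothetical helpers; names and thresholds are assumptions, not tested values.

// Fraction of non-blank lines ending in ';' or '}',
// optionally followed by whitespace and a '#' comment.
function codeLineRatio(paragraph) {
  const lines = paragraph.split('\n').filter((l) => l.trim() !== '');
  if (lines.length === 0) return 0;
  const codeEnding = /[;}]\s*(#.*)?$/;
  const hits = lines.filter((l) => codeEnding.test(l)).length;
  return hits / lines.length;
}

// Fraction of whitespace-separated words starting with a Perl sigil ($, @, %)
// followed by a letter, '_' or '{'. This skips "$5" but still matches "@name".
function sigilRatio(paragraph) {
  const words = paragraph.split(/\s+/).filter((w) => w !== '');
  if (words.length === 0) return 0;
  const sigilWord = /^[$@%][A-Za-z_{]/;
  const hits = words.filter((w) => sigilWord.test(w)).length;
  return hits / words.length;
}

// Fuzzy verdict: either signal above its (arbitrary) threshold counts as code.
function looksLikePerl(paragraph, lineThreshold = 0.5, sigilThreshold = 0.15) {
  return (
    codeLineRatio(paragraph) >= lineThreshold ||
    sigilRatio(paragraph) >= sigilThreshold
  );
}
```

Combining the two signals with a plain "or" is just the simplest choice; a weighted score might well behave better on short paragraphs.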
Any better ideas?
One use case could be a JS snippet that checks the contents of a posting in the monastery and warns about missing <code> tags, offering to insert them.
(I'm a bit tired of unreadable posts here, and of all the edit considerations and replies that follow.)
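A minimal sketch of how such a check might walk through a posting, again in JavaScript. `findUntaggedCode` and `naiveDetector` are hypothetical names, and the detector is only a stand-in so the example runs on its own; any fuzzy detector with the same signature could be passed in instead.

```javascript
// Hypothetical function: returns the paragraphs outside <code>...</code>
// that the supplied detector flags as probable code.
function findUntaggedCode(postText, looksLikeCode) {
  // Ignore everything the author has already wrapped in <code> tags.
  const outside = postText.replace(/<code>[\s\S]*?<\/code>/gi, '\n\n');
  return outside
    .split(/\n\s*\n/) // paragraphs are separated by blank lines
    .map((p) => p.trim())
    .filter((p) => p !== '' && looksLikeCode(p));
}

// Stand-in detector (an assumption for illustration): at least half of the
// non-blank lines end in ';' or '}', optionally followed by a '#' comment.
function naiveDetector(paragraph) {
  const lines = paragraph.split('\n').filter((l) => l.trim() !== '');
  const hits = lines.filter((l) => /[;}]\s*(#.*)?$/.test(l)).length;
  return lines.length > 0 && hits / lines.length >= 0.5;
}
```

The page script would then only need to show a warning when the returned list is non-empty, and could offer to wrap exactly those paragraphs.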
Cheers Rolf
PS: I'm not sure whether this thread would belong better in PM-Discussions.
Re: heuristic to detect (perl) code
by tobyink (Canon) on Jan 19, 2013 at 11:45 UTC
by LanX (Saint) on Jan 19, 2013 at 12:03 UTC
by 7stud (Deacon) on Jan 20, 2013 at 09:22 UTC
by Anonymous Monk on Jan 20, 2013 at 10:25 UTC
by LanX (Saint) on Jan 20, 2013 at 10:46 UTC

Re: heuristic to detect (perl) code
by Anonymous Monk on Jan 19, 2013 at 08:40 UTC
by LanX (Saint) on Jan 19, 2013 at 08:48 UTC
by Anonymous Monk on Jan 19, 2013 at 09:02 UTC
by LanX (Saint) on Jan 19, 2013 at 09:53 UTC