Conceptually, to separate the sheep from the goats, you either need to really know what a sheep looks like, or really know what a goat looks like.
Do the traps that you want to keep generally start with a newline?
Do all the lines you want start with a time tag?
Is there a reliable way to recognize the end of a trap?
Do the junk lines have any regularity to them at all?
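If the answers to those questions turn out to be "yes", the filter can be quite small. Here is a minimal sketch in Python (the regex would work nearly unchanged in Perl). It rests on two assumptions that the thread does not confirm: that every trap you want begins with a time tag shaped like `12:34:56`, and that a blank line marks the end of a trap. The name `keep_traps` and the sample lines are made up for illustration.

```python
import re

# Assumption: a wanted trap starts with a time tag such as "12:34:56".
# The real log format is not given in the thread; adjust the pattern to match.
TIME_TAG = re.compile(r"^\d{2}:\d{2}:\d{2}\b")


def keep_traps(lines):
    """Collect records that begin at a time-tagged line and end at a blank line."""
    traps = []
    current = None  # lines of the trap being collected, or None between traps
    for line in lines:
        line = line.rstrip("\n")
        if TIME_TAG.match(line):
            if current is not None:
                traps.append(current)  # a new tag also closes the previous trap
            current = [line]
        elif current is not None and line.strip() == "":
            traps.append(current)      # blank line: assumed end-of-trap marker
            current = None
        elif current is not None:
            current.append(line)       # continuation line of the current trap
        # otherwise the line sits outside any trap and is dropped as junk
    if current is not None:
        traps.append(current)          # input ended while a trap was still open
    return traps


# Hypothetical sample data, invented for this sketch.
sample = [
    "garbage banner",
    "12:00:01 trap linkDown",
    "  ifIndex 3",
    "",
    "more junk",
    "12:00:05 trap linkUp",
    "  ifIndex 3",
]
for trap in keep_traps(sample):
    print(" | ".join(trap))
```

If it is the junk that is regular rather than the traps, the same loop can be turned around: match the junk pattern and keep everything else.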
More generally, how good does your filter have to be?
How big a deal is it if some of the junk slips through?
How big a deal is it if your filter throws a few real traps away?
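One way to put numbers on those last two questions is to hand-label a small sample of lines and count both kinds of mistake. The sketch below is only an illustration: `score_filter`, the deliberately crude `starts_with_digit` predicate, and the labeled lines are all hypothetical, not taken from the original data.

```python
def score_filter(is_trap, labeled_lines):
    """Count junk that leaks through and real traps that get thrown away."""
    leaked = sum(1 for line, wanted in labeled_lines if is_trap(line) and not wanted)
    lost = sum(1 for line, wanted in labeled_lines if wanted and not is_trap(line))
    return leaked, lost


def starts_with_digit(line):
    """Hypothetical, deliberately crude filter: keep lines that start with a digit."""
    return line[:1].isdigit()


# Hand-labeled sample: (line, True if it is a trap you want to keep).
labeled = [
    ("12:00:01 trap linkDown", True),
    ("3 retries exhausted", False),  # junk that happens to start with a digit
    ("trap coldStart", True),        # real trap without a leading time tag
    ("garbage banner", False),
]

leaked, lost = score_filter(starts_with_digit, labeled)
print(f"junk kept: {leaked}, traps lost: {lost}")
```

Which of the two counts matters more depends on what happens downstream, and that is what decides how strict the pattern should be.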
throop
In reply to "Re: text parsing question" by throop, in thread "text parsing question" by perlAffen