I am currently working on anti-spam filtering with Perl.

I want to remove the header of each email (subject line, From, Delivered-To, Received, etc.) to increase my accuracy.

How can I do it? Can anyone help me?

I want to get the content starting from "Martin A posted:" (the bold words) through the rest of the mail, so that the header words are not included in the total word count of the mail.

Below is some of my code, which gets the full word count (698 words in total) of the mail.

I copied and pasted the mail into a text file and use that text file as the input.

--------------------------------------------------------------------------------
#!/usr/local/bin/perl
use strict;
use warnings;

my $count = 0;
my $file  = "C:/Perl/testfile.txt";
open(my $fh, '<', $file) or die "Cannot open $file: $!";
while (<$fh>) {    # count the total words
    $count++ while m/[a-zA-Z]\w*/g;
}
close($fh);
print "total word = $count \n";
--------------------------------------------------------------------------------

This is the sample mail:

From Steve_Burt@cursor-system.com Thu Aug 22 12:46:39 2002
Return-Path: <Steve_Burt@cursor-system.com>
Delivered-To: zzzz@localhost.netnoteinc.com
Received: from localhost (localhost [127.0.0.1])
    by phobos.labs.netnoteinc.com (Postfix) with ESMTP id BE12E43C34
    for <zzzz@localhost>; Thu, 22 Aug 2002 07:46:38 -0400 (EDT)
Received: from phobos [127.0.0.1]
    by localhost with IMAP (fetchmail-5.9.0)
    for zzzz@localhost (single-drop); Thu, 22 Aug 2002 12:46:38 +0100 (IST)
Received: from n20.grp.scd.yahoo.com (n20.grp.scd.yahoo.com [66.218.66.76])
    by dogma.slashnull.org (8.11.6/8.11.6) with SMTP id g7MBkTZ05087
    for <zzzz@example.com>; Thu, 22 Aug 2002 12:46:29 +0100
X-Egroups-Return: sentto-2242572-52726-1030016790-zzzz=example.com@returns.groups.yahoo.com
Received: from [66.218.67.196] by n20.grp.scd.yahoo.com with NNFMP;
    22 Aug 2002 11:46:30 -0000
X-Sender: steve.burt@cursor-system.com
X-Apparently-To: zzzzteana@yahoogroups.com
Received: (EGP: mail-8_1_0_1); 22 Aug 2002 11:46:29 -0000
Received: (qmail 11764 invoked from network); 22 Aug 2002 11:46:29 -0000
Received: from unknown (66.218.66.217) by m3.grp.scd.yahoo.com with QMQP;
    22 Aug 2002 11:46:29 -0000
Received: from unknown (HELO mailgateway.cursor-system.com) (62.189.7.27)
    by mta2.grp.scd.yahoo.com with SMTP; 22 Aug 2002 11:46:29 -0000
Received: from exchange1.cps.local (unverified)
    by mailgateway.cursor-system.com (Content Technologies SMTPRS 4.2.10) with ESMTP
    id <T5cde81f695ac1d100407d@mailgateway.cursor-system.com>
    for <forteana@yahoogroups.com>; Thu, 22 Aug 2002 13:14:10 +0100
Received: by exchange1.cps.local with Internet Mail Service (5.5.2653.19)
    id <PXX6AT23>; Thu, 22 Aug 2002 12:46:27 +0100
Message-Id: <5EC2AD6D2314D14FB64BDA287D25D9EF12B4F6@exchange1.cps.local>
To: "'zzzzteana@yahoogroups.com'" <zzzzteana@yahoogroups.com>
X-Mailer: Internet Mail Service (5.5.2653.19)
X-Egroups-From: Steve Burt <steve.burt@cursor-system.com>
From: Steve Burt <Steve_Burt@cursor-system.com>
X-Yahoo-Profile: pyruse
MIME-Version: 1.0
Mailing-List: list zzzzteana@yahoogroups.com; contact
    forteana-owner@yahoogroups.com
Delivered-To: mailing list zzzzteana@yahoogroups.com
Precedence: bulk
List-Unsubscribe: <mailto:zzzzteana-unsubscribe@yahoogroups.com>
Date: Thu, 22 Aug 2002 12:46:18 +0100
Subject: [zzzzteana] RE: Alexander
Reply-To: zzzzteana@yahoogroups.com
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit

Martin A posted:
Tassos Papadopoulos, the Greek sculptor behind the plan, judged that the
limestone of Mount Kerdylio, 70 miles east of Salonika and not far from the
Mount Athos monastic community, was ideal for the patriotic sculpture.
As well as Alexander's granite features, 240 ft high and 170 ft wide, a
museum, a restored amphitheatre and car park for admiring crowds are
planned
---------------------
So is this mountain limestone or granite?
If it's limestone, it'll weather pretty fast.

------------------------ Yahoo! Groups Sponsor ---------------------~-->
4 DVDs Free +s&p Join Now
http://us.click.yahoo.com/pt6YBB/NXiEAA/mG3HAA/7gSolB/TM
---------------------------------------------------------------------~->

To unsubscribe from this group, send an email to:
forteana-unsubscribe@egroups.com

Your use of Yahoo! Groups is subject to http://docs.yahoo.com/info/terms/
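
One possible way to leave the header out of the count, sketched under the assumption that the saved text file still has the empty line that separates the header from the body (standard for mail messages, but copy and paste may lose it): skip every line up to and including the first empty line, then count words with the same regex as above. The sub name count_body_words and the short in-memory sample are made up for illustration; they are not part of my original script.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;

# Hypothetical helper: count words in the body of one mail.
# Assumption: the header ends at the first empty (whitespace-only) line.
sub count_body_words {
    my ($fh) = @_;
    my $in_body = 0;
    my $count   = 0;
    while (my $line = <$fh>) {
        if (!$in_body) {
            # Still in the header: look for the separating empty line.
            $in_body = 1 if $line =~ /^\s*$/;
            next;    # header lines and the separator are not counted
        }
        # Same word pattern as the original script.
        $count++ while $line =~ m/[a-zA-Z]\w*/g;
    }
    return $count;
}

# Small in-memory sample so the sketch runs without any file.
my $sample = <<'MAIL';
From: Steve Burt <Steve_Burt@cursor-system.com>
Subject: [zzzzteana] RE: Alexander

Martin A posted:
So is this mountain limestone or granite?
MAIL

open(my $fh, '<', \$sample) or die "Cannot open in-memory sample: $!";
print "total word = ", count_body_words($fh), "\n";
close($fh);
```

To run it on the real mail, the in-memory open could be replaced with open(my $fh, '<', "C:/Perl/testfile.txt"). If the empty line is missing from the file, this sketch counts nothing, so a different marker (for example the last known header name) would be needed.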
