in reply to regex help please

Hi sitnalta,

Before jumping to any conclusions as to why the first response to your question should or should not work, please show us what you've tried, and what you expect as the results. I'm especially interested in why you would want to match only the first <0a> and not the second, especially considering your original statement is that you want them completely removed. Are you using a multi-line regex and you have data that you do NOT want to delete between those lines?

Please see "I know what I mean. Why don't you?" for more information.



--chargrill
s**lil*; $*=join'',sort split q**; s;.*;grr; &&s+(.(.)).+$2$1+; $; = qq-$_-;s,.*,ahc,;$,.=chop for split q,,,reverse;print for($,,$;,$*,$/)

Re^2: regex help please
by sitnalta (Initiate) on Oct 27, 2006 at 19:29 UTC
    What I am attempting to do is take a log file, read it into perl, output it in a more readable form, and eventually do some form of reporting on it. Who knows, maybe even drop some info into a database. This is all for the learning experience and simply for the challenge, since perl is fun even though I am not too good with it.

    Here is what the original message looks like. Keep in mind I changed the original text message and dstaddr for personal reasons:

    15025 0 0 Note;MMG_MDR:ref=0:mdrname=DELIVERED:sysid=localhost/16503:svcname=GSM:host=atlmmg01:proc=SMEGwy-concen:rcvtime=20 06102600002184816-:sndtime=2006102600002199516-:rcvuser=SMPP-XMIT7:snduser=CHI_RT:rcvtype=SERVER:sndtype=CLIENT:rcvnet=SMPP SRV:sndnet=:rcvacc=localhost/16551:sndacc=CHI_RT:dstaddr=1234567890:orgaddr=1010100001:intmrf=0CD90A2E420D454032D51D5:extmrf= ABB0nyJk:msgstat=[0] Delivered:msgoper=SUBMIT:msgtype=NORMAL:usrinfo=:msglen=100:msgtext=FRM\3aJodi<0a>SUBJ\3alater<0a>MSG\3a TEXT OF MESSAGE GOES HERE - www .maybe-even-a-url .com –

    So what I noticed is that the common delimiter is “:”, so I figured I should learn how to use split and take advantage of that. After doing so, I wanted to see what it looked like if I searched for keywords in the messages within the log, which I pumped into an array. This is where you will see if (/[Ss]hare/, /[Tt]ime/) {, which I eventually want to learn how to turn into command line arguments. After searching for keywords, the following output was given:

    rcvtime=2006102611380086416-sndtime=2006102611380004316-msgtext=FRM\3aDang<0a>SUBJ\3alater<0a>MSG\3aMessage I looked for goes here - www.maybeevenaURL .com –
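The split-on-colon idea described above can be taken one step further by splitting each chunk on "=" as well, which gives a hash of named fields to pick from. This is only a minimal sketch: the `parse_fields` helper and the shortened `$record` string are made up for illustration, and it assumes that every field of interest has the shape key=value and that literal colons inside values are always escaped (as `\3a` appears to be in msgtext).

```perl
use strict;
use warnings;

# Shortened, made-up record in the same key=value:key=value shape as the log line.
my $record = 'rcvtime=2006102600002184816-:sndtime=2006102600002199516-:'
           . 'dstaddr=1234567890:msglen=100:'
           . 'msgtext=FRM\3aJodi<0a>SUBJ\3alater<0a>MSG\3aTEXT';

# Hypothetical helper: turn one record into a hash of field name => value.
sub parse_fields {
    my ($line) = @_;
    my %field;
    for my $pair (split /:/, $line) {
        # A limit of 2 keeps any later "=" inside the value.
        my ($key, $value) = split /=/, $pair, 2;
        next unless defined $value;    # skip chunks that contain no "="
        $field{$key} = $value;
    }
    return %field;
}

my %field = parse_fields($record);
print "$_ => $field{$_}\n" for qw(rcvtime sndtime msgtext);
```

With the fields in a hash, printing only rcvtime, sndtime and msgtext becomes a matter of naming the keys rather than deleting everything else with substitutions.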

    So that’s cool, I learned how cool split is and how to semi work with it. Now I wanted to chop out the junk, so to say. This is when I started to slowly carve it out with the following, before I got bored of doing it and realized I should just learn to do it correctly. Here are the examples of what I tried, which are commented out in the script:

    # s/rcvtime=[0-9]+//;
    # s/sndtime=[0-9]+//;
    # s/msgtext//;
    # s/FRM//;
    # s/3a[a-z,[A-Z]+//;
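An alternative to carving pieces out one substitution at a time is to decode the escapes in msgtext in a single pass. The sketch below assumes that `\3a` and `<0a>` are two-digit hex codes for a colon (0x3a) and a newline (0x0a); the log format is not documented here, so that reading is a guess, and `decode_msgtext` is a made-up helper name.

```perl
use strict;
use warnings;

# Hypothetical helper: replace \HH and <HH> escapes with the character they encode.
sub decode_msgtext {
    my ($text) = @_;
    # One pass over both escape styles, so decoded output is never re-scanned.
    $text =~ s{ < ([0-9a-fA-F]{2}) > | \\ ([0-9a-fA-F]{2}) }
              { chr(hex(defined $1 ? $1 : $2)) }gex;
    return $text;
}

my $raw = 'FRM\3aJodi<0a>SUBJ\3alater<0a>MSG\3aTEXT OF MESSAGE GOES HERE';
print decode_msgtext($raw), "\n";
```

If the `<0a>` markers should be dropped rather than turned into line breaks, a plain `s/<0a>//g` on the field would do that instead.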

    Here is where I am at right now. All I am trying to do so far is learn to script with perl. Suggestions are most welcome.

    #!/usr/bin/perl -w
    use strict;

    open(SPAMMESSEGES, "spammessages");
    while (<SPAMMESSEGES>) {
        my $spam = <SPAMMESSEGES>;
        my @spammer = split (/:/);
        foreach (@spammer) {
            if (/[Ss]hare/, /[Tt]ime/) {
                # s/rcvtime=[0-9]+//;
                # s/sndtime=[0-9]+//;
                # s/msgtext//;
                # s/FRM//;
                # s/3a[a-z,[A-Z]+//;
                s/rcvtime(.*)\SBJ>//;
                print $_;
            }
        }
    }
      Your problem appears to be that
      while (<SPAMMESSEGES>) {
      assigns one line to $_, and then
      my $spam = <SPAMMESSEGES>;
      assigns the next line to $spam, so you are only processing every second line.
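The effect of the double read can be seen with a small self-contained sketch. It uses an in-memory filehandle and six made-up lines in place of the real "spammessages" file, and the two subroutine names are invented for the comparison.

```perl
use strict;
use warnings;

# Made-up sample data standing in for the "spammessages" file.
my $data = join '', map { "line $_\n" } 1 .. 6;

# Pattern from the script: the while condition reads one line into $_,
# then the body reads (and consumes) the following line as well.
sub count_with_double_read {
    my ($text) = @_;
    open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
    my $seen = 0;
    while (<$fh>) {
        my $skipped = <$fh>;    # this line never reaches the loop body via $_
        $seen++;
    }
    close $fh;
    return $seen;
}

# One read per iteration, into a named variable.
sub count_with_single_read {
    my ($text) = @_;
    open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
    my $seen = 0;
    while (my $line = <$fh>) {
        $seen++;
    }
    close $fh;
    return $seen;
}

printf "double read processed %d of 6 lines\n", count_with_double_read($data);
printf "single read processed %d of 6 lines\n", count_with_single_read($data);
```

In the original script, one way to apply this would be to drop the `my $spam = <SPAMMESSEGES>;` line, since `split (/:/)` already works on `$_`.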