gbwien has asked for the wisdom of the Perl Monks concerning the following question:

Hi,

I am trying to get back into Perl and have a question about how best to parse the following.

I have a file that contains many entries like this one:

Time: 1/9/2015-09:01:43.010
Protocol: SMPP
ESME: ehttp_rknoe
Direction: Outgoing
From: 10.247.231.212/2775
To: 10.247.231.212/35173
PDU Type: Full PDU
PDU Length: 16
PDU Data: 00000010800000040000005800029a92
Decode Error 0x00000000
Decoded PDU:
[ smpp hdr 16 octets ]
00000000: 00 00 00 10 command_length: 16
00000004: 80 00 00 04 command_id: 0x80000004 submit_sm_resp
00000008: 00 00 00 58 command_status: 0x00000058 ESME_RTHROTTLED
0000000C: 00 02 9A 92 sequence_number: 170642

I want to parse the complete file in perl and print out time and only those entries that have ESME_RTHROTTLED.

The file may contain many other entries, but I only want to print out entries of this kind.

Thanks, Tom

Replies are listed 'Best First'.
Re: Parsing text string in Perl
by BrowserUk (Patriarch) on Sep 01, 2015 at 22:18 UTC

    Something like this?:

    #! perl -slw
    use strict;

    $/ = 'Time:';
    while( <DATA> ) {
        chomp;
        m[\s+(\S+).+?ESME_RTHROTTLED] and print "Time:", $_;
    }
    __DATA__
    Time: 1/9/2015-09:01:43.010
    Protocol: SMPP
    ESME: ehttp_rknoe
    Direction: Outgoing
    From: 10.247.231.212/2775
    To: 10.247.231.212/35173
    PDU Type: Full PDU
    PDU Length: 16
    PDU Data: 00000010800000040000005800029a92
    Decode Error 0x00000000
    Decoded PDU:
    [ smpp hdr 16 octets ]
    00000000: 00 00 00 10 command_length: 16
    00000004: 80 00 00 04 command_id: 0x80000004 submit_sm_resp
    00000008: 00 00 00 58 command_status: 0x00000058 ESME_RTHROTTLED
    0000000C: 00 02 9A 92 sequence_number: 170642
    Time: 1/9/2015-09:01:43.010
    Protocol: SMPP
    ESME: ehttp_rknoe
    Direction: Outgoing
    From: 10.247.231.212/2775
    To: 10.247.231.212/35173
    PDU Type: Full PDU
    PDU Length: 16
    PDU Data: 00000010800000040000005800029a92
    Decode Error 0x00000000
    Decoded PDU:
    [ smpp hdr 16 octets ]
    00000000: 00 00 00 10 command_length: 16
    00000004: 80 00 00 04 command_id: 0x80000004 submit_sm_resp
    00000008: 00 00 00 58 command_status: 0x00000058
    0000000C: 00 02 9A 92 sequence_number: 170642
    Time: 1/9/2015-09:01:43.010
    Protocol: SMPP
    ESME: ehttp_rknoe
    Direction: Outgoing
    From: 10.247.231.212/2775
    To: 10.247.231.212/35173
    PDU Type: Full PDU
    PDU Length: 16
    PDU Data: 00000010800000040000005800029a92
    Decode Error 0x00000000
    Decoded PDU:
    [ smpp hdr 16 octets ]
    00000000: 00 00 00 10 command_length: 16
    00000004: 80 00 00 04 command_id: 0x80000004 submit_sm_resp
    00000008: 00 00 00 58 command_status: 0x00000058 ESME_RTHROTTLED
    0000000C: 00 02 9A 92 sequence_number: 170642
    Time: 1/9/2015-09:01:43.010
    Protocol: SMPP
    ESME: ehttp_rknoe
    Direction: Outgoing
    From: 10.247.231.212/2775
    To: 10.247.231.212/35173
    PDU Type: Full PDU
    PDU Length: 16
    PDU Data: 00000010800000040000005800029a92
    Decode Error 0x00000000
    Decoded PDU:
    [ smpp hdr 16 octets ]
    00000000: 00 00 00 10 command_length: 16
    00000004: 80 00 00 04 command_id: 0x80000004 submit_sm_resp
    00000008: 00 00 00 58 command_status: 0x00000058
    0000000C: 00 02 9A 92 sequence_number: 170642

    Outputs:

    C:\test>1140723.pl
    Time: 1/9/2015-09:01:43.010
    Protocol: SMPP
    ESME: ehttp_rknoe
    Direction: Outgoing
    From: 10.247.231.212/2775
    To: 10.247.231.212/35173
    PDU Type: Full PDU
    PDU Length: 16
    PDU Data: 00000010800000040000005800029a92
    Decode Error 0x00000000
    Decoded PDU:
    [ smpp hdr 16 octets ]
    00000000: 00 00 00 10 command_length: 16
    00000004: 80 00 00 04 command_id: 0x80000004 submit_sm_resp
    00000008: 00 00 00 58 command_status: 0x00000058 ESME_RTHROTTLED
    0000000C: 00 02 9A 92 sequence_number: 170642

    Time: 1/9/2015-09:01:43.010
    Protocol: SMPP
    ESME: ehttp_rknoe
    Direction: Outgoing
    From: 10.247.231.212/2775
    To: 10.247.231.212/35173
    PDU Type: Full PDU
    PDU Length: 16
    PDU Data: 00000010800000040000005800029a92
    Decode Error 0x00000000
    Decoded PDU:
    [ smpp hdr 16 octets ]
    00000000: 00 00 00 10 command_length: 16
    00000004: 80 00 00 04 command_id: 0x80000004 submit_sm_resp
    00000008: 00 00 00 58 command_status: 0x00000058 ESME_RTHROTTLED
    0000000C: 00 02 9A 92 sequence_number: 170642

Re: Parsing text string in Perl
by 1nickt (Canon) on Sep 01, 2015 at 22:14 UTC

    Update: My earlier answer (preserved below) was written in haste on the phone. Back where I can type now, and I see that the two answers provided in the meantime, focusing on setting the record delimiter variable $/, both fail to meet the specs you gave, if I understand them. I'm taking

    I want to parse the complete file in perl and print out time and only those entries that have ESME_RTHROTTLED.

    to mean that you only want to print the time, and only for the records that have 'ESME_RTHROTTLED'. If I'm wrong about that, sorry.

    This should do what you want:

    $/ = 'Time: ';
    while (<DATA>) {
        print( (split "\n")[0] ) if /ESME_RTHROTTLED/;
    }

    Output:

    $ perl 1140724.pl
    There were RTHROTTLED incidents at:
     2015-09-01T09:02:43.010
     2015-09-17T09:03:43.634
     2015-09-22T09:05:17.007
    $

    Complete program:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use feature qw/ say /;
    use DateTime::Format::Strptime qw/ strptime /;

    $/ = 'Time: ';

    say 'There were RTHROTTLED incidents at:';

    while (<DATA>) {
        say ' ' . strptime('%d/%m/%Y-%T.%3N', (split "\n")[0])->strftime('%FT%T.%3N')
            if /ESME_RTHROTTLED/;
    }

    __END__
    Time: 1/9/2015-09:01:43.010
    Protocol: SMPP
    ESME: ehttp_rknoe
    Direction: Outgoing
    From: 10.247.231.212/2775
    To: 10.247.231.212/35173
    PDU Type: Full PDU
    PDU Length: 16
    PDU Data: 00000010800000040000005800029a92
    Decode Error 0x00000000
    Decoded PDU:
    [ smpp hdr 16 octets ]
    00000000: 00 00 00 10 command_length: 16
    00000004: 80 00 00 04 command_id: 0x80000004 submit_sm_resp
    00000008: 00 00 00 58 command_status: 0x00000058
    0000000C: 00 02 9A 92 sequence_number: 170642
    Time: 1/9/2015-09:02:43.010
    Protocol: SMPP
    ESME: ehttp_rknoe
    Direction: Outgoing
    From: 10.247.231.212/2775
    To: 10.247.231.212/35173
    PDU Type: Full PDU
    PDU Length: 16
    PDU Data: 00000010800000040000005800029a92
    Decode Error 0x00000000
    Decoded PDU:
    [ smpp hdr 16 octets ]
    00000000: 00 00 00 10 command_length: 16
    00000004: 80 00 00 04 command_id: 0x80000004 submit_sm_resp
    00000008: 00 00 00 58 command_status: 0x00000058 ESME_RTHROTTLED
    0000000C: 00 02 9A 92 sequence_number: 170642
    Time: 17/9/2015-09:03:43.634
    Protocol: SMPP
    ESME: ehttp_rknoe
    Direction: Outgoing
    From: 10.247.231.212/2775
    To: 10.247.231.212/35173
    PDU Type: Full PDU
    PDU Length: 16
    PDU Data: 00000010800000040000005800029a92
    Decode Error 0x00000000
    Decoded PDU:
    [ smpp hdr 16 octets ]
    00000000: 00 00 00 10 command_length: 16
    00000004: 80 00 00 04 command_id: 0x80000004 submit_sm_resp
    00000008: 00 00 00 58 command_status: 0x00000058 ESME_RTHROTTLED
    0000000C: 00 02 9A 92 sequence_number: 170642
    Time: 1/9/2015-09:04:43.987
    Protocol: SMPP
    ESME: ehttp_rknoe
    Direction: Outgoing
    From: 10.247.231.212/2775
    To: 10.247.231.212/35173
    PDU Type: Full PDU
    PDU Length: 16
    PDU Data: 00000010800000040000005800029a92
    Decode Error 0x00000000
    Decoded PDU:
    [ smpp hdr 16 octets ]
    00000000: 00 00 00 10 command_length: 16
    00000004: 80 00 00 04 command_id: 0x80000004 submit_sm_resp
    00000008: 00 00 00 58 command_status: 0x00000058
    0000000C: 00 02 9A 92 sequence_number: 170642
    Time: 22/9/2015-09:05:17.007
    Protocol: SMPP
    ESME: ehttp_rknoe
    Direction: Outgoing
    From: 10.247.231.212/2775
    To: 10.247.231.212/35173
    PDU Type: Full PDU
    PDU Length: 16
    PDU Data: 00000010800000040000005800029a92
    Decode Error 0x00000000
    Decoded PDU:
    [ smpp hdr 16 octets ]
    00000000: 00 00 00 10 command_length: 16
    00000004: 80 00 00 04 command_id: 0x80000004 submit_sm_resp
    00000008: 00 00 00 58 command_status: 0x00000058 ESME_RTHROTTLED
    0000000C: 00 02 9A 92 sequence_number: 170642

    Earlier answer: It depends on what you have for "record" delimiters. If the first line of each record is always the time, you could store it in a variable and print it if you hit your token before the next record starts, or reload the variable with the new time if you reach the next record first.
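
    A minimal sketch of that line-by-line approach, for illustration only. It assumes every record starts with a line of the form "Time: <timestamp>"; the helper name throttled_times and the abbreviated sample log are hypothetical, not taken from the original post.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the timestamps of records that contain $token.
# Assumes every record starts with a line of the form "Time: <timestamp>".
sub throttled_times {
    my ($fh, $token) = @_;
    my ($time, @hits);
    while ( my $line = <$fh> ) {
        if ( $line =~ /^Time:\s+(\S+)/ ) {
            $time = $1;            # a new record starts: reload the variable
        }
        elsif ( defined $time && index($line, $token) >= 0 ) {
            push @hits, $time;     # token seen before the next "Time:" line
            undef $time;           # report each record at most once
        }
    }
    return @hits;
}

# Abbreviated sample log (most fields omitted for brevity).
my $log = <<'END_LOG';
Time: 1/9/2015-09:01:43.010
ESME: ehttp_rknoe
00000008: 00 00 00 58 command_status: 0x00000058
Time: 1/9/2015-09:02:43.010
ESME: ehttp_rknoe
00000008: 00 00 00 58 command_status: 0x00000058 ESME_RTHROTTLED
END_LOG

# Read the sample through an in-memory filehandle instead of a real file.
open my $fh, '<', \$log or die "Cannot open in-memory log: $!";
print "$_\n" for throttled_times($fh, 'ESME_RTHROTTLED');
close $fh;
```

    With the sample above this prints only 1/9/2015-09:02:43.010. Unlike the $/ approach, it leaves the input record separator alone and keeps just one timestamp in memory at a time.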

    The way forward always starts with a minimal test.

      Thank you very much for your help. I would also like to print the value of ESME; in the example below it would be ehttp_rknoe:

      ESME: ehttp_rknoe

      $/ = 'Time: ';
      while (<DATA>) {
          print( (split "\n")[0] ) if /ESME_RTHROTTLED/;
      }

      How does split work in the context of your program?

      Thanks, Tom

        Sorry, not at a computer. You have records separated by "Time: ". As you loop through the records, you put the lines of each one in an array by splitting on the newline character. The first element, or line, will contain the timestamp. You would need to loop through the elements of the array and do a regexp match, capturing the part of the line you want. But leave the existing test in there so you only loop through the records that have the interesting line.

        Update: added code example

        Here's a simplified example. I would not split into lines but use capturing regexps on the whole record, but doing the split into lines may make it simpler for you to see what's going on.

        #!/usr/bin/perl
        use strict;
        use warnings;
        use feature qw/ say /;
        use DateTime::Format::Strptime qw/ strptime /;

        say 'RTHROTTLED incidents:';

        $/ = 'Time: ';

        while ( my $record = <DATA> ) {
            if ( $record =~ /ESME_RTHROTTLED/ ) {
                my @lines = split "\n", $record;

                # We know the time is in the first line
                my $time = strptime('%d/%m/%Y-%T.%3N', $lines[0])->strftime('%FT%T.%3N');

                my $ESME;
                foreach my $line ( @lines ) {
                    $ESME = $1 if $line =~ /ESME:\s+(.+$)/;
                }
                say ' ' . "$time $ESME";
            }
        }
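
        For comparison, the alternative mentioned above (capturing regexps on the whole record, with no split into lines) could be sketched roughly as follows. This is an illustration under assumptions, not the poster's code: the helper name throttled_records and the abbreviated sample log are hypothetical, and the timestamp is printed as found rather than reformatted with strptime.

```perl
use strict;
use warnings;

# Hypothetical helper: returns [time, esme] pairs for records that contain
# ESME_RTHROTTLED. Assumes records are separated by the string "Time: ".
sub throttled_records {
    my ($fh) = @_;
    local $/ = 'Time: ';           # each <$fh> read now returns one record
    my @found;
    while ( my $record = <$fh> ) {
        chomp $record;             # strips the trailing "Time: " separator
        next unless $record =~ /ESME_RTHROTTLED/;
        my ($time) = $record =~ /\A(\S+)/;          # first token of the record
        my ($esme) = $record =~ /^ESME:\s+(\S+)/m;  # /m: ^ matches at each line start
        push @found, [ $time, $esme ];
    }
    return @found;
}

# Abbreviated sample log (most fields omitted for brevity).
my $log = <<'END_LOG';
Time: 1/9/2015-09:01:43.010
ESME: ehttp_rknoe
00000008: 00 00 00 58 command_status: 0x00000058
Time: 1/9/2015-09:02:43.010
ESME: ehttp_rknoe
00000008: 00 00 00 58 command_status: 0x00000058 ESME_RTHROTTLED
END_LOG

# Read the sample through an in-memory filehandle instead of a real file.
open my $fh, '<', \$log or die "Cannot open in-memory log: $!";
printf "%s %s\n", @$_ for throttled_records($fh);
close $fh;
```

        With the sample above this prints 1/9/2015-09:02:43.010 ehttp_rknoe. Anchoring the ESME pattern with ^ and the /m flag keeps it from matching anything other than the "ESME:" header line.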
Re: Parsing text string in Perl
by Anonymous Monk on Sep 01, 2015 at 22:17 UTC

    Tom

    Does the header 'Time:' always begin each record?

    If so, you can set the record delimiter, $/, to the value 'Time:' and then,

    {
        local $/ = 'Time:';
        while(my $record = <$fh>) {
            # Skip over everything that doesn't have ESME_RTHROTTLED
            next unless $record =~ /ESME_RTHROTTLED/;

            # If we made it here, then this is a record that matches.
            # ... so do something with it ...
        }
    }
Re: Parsing text string in Perl
by shadowsong (Pilgrim) on Sep 02, 2015 at 09:29 UTC

    Tom,

    Assuming your log file is called throttle_log.txt

    #! perl -slw
    use strict;

    open FH_INPUT, "<throttle_log.txt" or die $!;

    $/ = 'Time:';
    while ( <FH_INPUT> ) {
        chomp;
        m{ESME_RTHROTTLED} and (m{^\s+([0-9/\.\:\-]+)\n} and print "THROTTLING DETECTED: $1");
    }

    will give an output of:

    THROTTLING DETECTED: 31/8/2015-09:01:43.010
    THROTTLING DETECTED: 1/9/2015-09:01:43.010

    Provided your file looks like this:

    Time: 31/8/2015-09:01:43.010
    Protocol: SMPP
    ESME: ehttp_rknoe
    Direction: Outgoing
    From: 10.247.231.212/2775
    To: 10.247.231.212/35173
    PDU Type: Full PDU
    PDU Length: 16
    PDU Data: 00000010800000040000005800029a92
    Decode Error 0x00000000
    Decoded PDU:
    [ smpp hdr 16 octets ]
    00000000: 00 00 00 10 command_length: 16
    00000004: 80 00 00 04 command_id: 0x80000004 submit_sm_resp
    00000008: 00 00 00 58 command_status: 0x00000058 ESME_RTHROTTLED
    0000000C: 00 02 9A 92 sequence_number: 170642
    Time: 1/9/2015-09:01:43.010
    Protocol: SMPP
    ESME: ehttp_rknoe
    Direction: Outgoing
    From: 10.247.231.212/2775
    To: 10.247.231.212/35173
    PDU Type: Full PDU
    PDU Length: 16
    PDU Data: 00000010800000040000005800029a92
    Decode Error 0x00000000
    Decoded PDU:
    [ smpp hdr 16 octets ]
    00000000: 00 00 00 10 command_length: 16
    00000004: 80 00 00 04 command_id: 0x80000004 submit_sm_resp
    00000008: 00 00 00 58 command_status: 0x00000059 ESME_XTHROTTLED
    0000000C: 00 02 9A 92 sequence_number: 170642
    Time: 1/9/2015-09:01:43.010
    Protocol: SMPP
    ESME: ehttp_rknoe
    Direction: Outgoing
    From: 10.247.231.212/2775
    To: 10.247.231.212/35173
    PDU Type: Full PDU
    PDU Length: 16
    PDU Data: 00000010800000040000005800029a92
    Decode Error 0x00000000
    Decoded PDU:
    [ smpp hdr 16 octets ]
    00000000: 00 00 00 10 command_length: 16
    00000004: 80 00 00 04 command_id: 0x80000004 submit_sm_resp
    00000008: 00 00 00 58 command_status: 0x00000058 ESME_RTHROTTLED
    0000000C: 00 02 9A 92 sequence_number: 170642