in reply to Parsing text string in Perl

Update: My earlier answer (preserved below) was written in haste on my phone. Now that I'm back at a keyboard, I see that the two answers posted in the meantime, both of which focus on setting the record delimiter variable $/, fail to meet the specs you gave, if I understand them correctly. I'm taking

I want to parse the complete file in perl and print out time and only those entries that have ESME_RTHROTTLED.

to mean that you only want to print the time, and only for the records that have 'ESME_RTHROTTLED'. If I'm wrong about that, sorry.

This should do what you want:

    $/ = 'Time: ';
    while (<DATA>) {
        print( (split "\n")[0] ) if /ESME_RTHROTTLED/;
    }

Output:

    $ perl 1140724.pl
    There were RTHROTTLED incidents at:
     2015-09-01T09:02:43.010
     2015-09-17T09:03:43.634
     2015-09-22T09:05:17.007
    $

Complete program:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use feature qw/ say /;
    use DateTime::Format::Strptime qw/ strptime /;

    $/ = 'Time: ';

    say 'There were RTHROTTLED incidents at:';

    while (<DATA>) {
        say ' ' . strptime('%d/%m/%Y-%T.%3N', (split "\n")[0])->strftime('%FT%T.%3N')
            if /ESME_RTHROTTLED/;
    }

    __END__
    Time: 1/9/2015-09:01:43.010
    Protocol: SMPP
    ESME: ehttp_rknoe
    Direction: Outgoing
    From: 10.247.231.212/2775
    To: 10.247.231.212/35173
    PDU Type: Full PDU
    PDU Length: 16
    PDU Data: 00000010800000040000005800029a92
    Decode Error 0x00000000
    Decoded PDU: [ smpp hdr 16 octets ]
    00000000: 00 00 00 10 command_length: 16
    00000004: 80 00 00 04 command_id: 0x80000004 submit_sm_resp
    00000008: 00 00 00 58 command_status: 0x00000058
    0000000C: 00 02 9A 92 sequence_number: 170642
    Time: 1/9/2015-09:02:43.010
    Protocol: SMPP
    ESME: ehttp_rknoe
    Direction: Outgoing
    From: 10.247.231.212/2775
    To: 10.247.231.212/35173
    PDU Type: Full PDU
    PDU Length: 16
    PDU Data: 00000010800000040000005800029a92
    Decode Error 0x00000000
    Decoded PDU: [ smpp hdr 16 octets ]
    00000000: 00 00 00 10 command_length: 16
    00000004: 80 00 00 04 command_id: 0x80000004 submit_sm_resp
    00000008: 00 00 00 58 command_status: 0x00000058 ESME_RTHROTTLED
    0000000C: 00 02 9A 92 sequence_number: 170642
    Time: 17/9/2015-09:03:43.634
    Protocol: SMPP
    ESME: ehttp_rknoe
    Direction: Outgoing
    From: 10.247.231.212/2775
    To: 10.247.231.212/35173
    PDU Type: Full PDU
    PDU Length: 16
    PDU Data: 00000010800000040000005800029a92
    Decode Error 0x00000000
    Decoded PDU: [ smpp hdr 16 octets ]
    00000000: 00 00 00 10 command_length: 16
    00000004: 80 00 00 04 command_id: 0x80000004 submit_sm_resp
    00000008: 00 00 00 58 command_status: 0x00000058 ESME_RTHROTTLED
    0000000C: 00 02 9A 92 sequence_number: 170642
    Time: 1/9/2015-09:04:43.987
    Protocol: SMPP
    ESME: ehttp_rknoe
    Direction: Outgoing
    From: 10.247.231.212/2775
    To: 10.247.231.212/35173
    PDU Type: Full PDU
    PDU Length: 16
    PDU Data: 00000010800000040000005800029a92
    Decode Error 0x00000000
    Decoded PDU: [ smpp hdr 16 octets ]
    00000000: 00 00 00 10 command_length: 16
    00000004: 80 00 00 04 command_id: 0x80000004 submit_sm_resp
    00000008: 00 00 00 58 command_status: 0x00000058
    0000000C: 00 02 9A 92 sequence_number: 170642
    Time: 22/9/2015-09:05:17.007
    Protocol: SMPP
    ESME: ehttp_rknoe
    Direction: Outgoing
    From: 10.247.231.212/2775
    To: 10.247.231.212/35173
    PDU Type: Full PDU
    PDU Length: 16
    PDU Data: 00000010800000040000005800029a92
    Decode Error 0x00000000
    Decoded PDU: [ smpp hdr 16 octets ]
    00000000: 00 00 00 10 command_length: 16
    00000004: 80 00 00 04 command_id: 0x80000004 submit_sm_resp
    00000008: 00 00 00 58 command_status: 0x00000058 ESME_RTHROTTLED
    0000000C: 00 02 9A 92 sequence_number: 170642

Earlier answer: It depends on what you have for "record" delimiters. If the first line of each record is always the time, you could store it in a variable and print it if you hit your token before the next record starts, or reload the variable with the new time if you hit that first.
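
For what it's worth, the line-by-line idea from the earlier answer might look roughly like the sketch below. It assumes every record starts with a line beginning "Time: ". The helper name throttled_times and the shortened in-memory sample log are made up for illustration, and the default line-at-a-time $/ is left alone.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns the timestamps of the records that contain
# $token. Assumes each record begins with a "Time: ..." line.
sub throttled_times {
    my ( $fh, $token ) = @_;
    my ( $time, @hits );
    while ( my $line = <$fh> ) {
        if ( $line =~ /^Time:\s+(\S+)/ ) {
            $time = $1;              # new record: reload the stored time
        }
        elsif ( defined $time && index( $line, $token ) >= 0 ) {
            push @hits, $time;       # token seen before the next record
            undef $time;             # report each record at most once
        }
    }
    return @hits;
}

# Shortened sample data (an assumption, not the full log).
my $log = <<'END_LOG';
Time: 1/9/2015-09:01:43.010
ESME: ehttp_rknoe
00000008: 00 00 00 58 command_status: 0x00000058
Time: 1/9/2015-09:02:43.010
ESME: ehttp_rknoe
00000008: 00 00 00 58 command_status: 0x00000058 ESME_RTHROTTLED
END_LOG

open my $fh, '<', \$log or die "Cannot open in-memory log: $!";
print "$_\n" for throttled_times( $fh, 'ESME_RTHROTTLED' );
close $fh;
```

This prints the raw timestamp as it appears in the log; reformatting it is a separate step.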

The way forward always starts with a minimal test.

Re^2: Parsing text string in Perl
by gbwien (Sexton) on Sep 02, 2015 at 14:18 UTC

    Thank you very much for your help. I would also like to print the value of ESME; in the example below it would be ehttp_rknoe.

    ESME: ehttp_rknoe

        $/ = 'Time: ';
        while (<DATA>) {
            print( (split "\n")[0] ) if /ESME_RTHROTTLED/;
        }

    How does split work in the context of your program?

    Thanks tom

      Sorry, not at a computer. You have records separated by "Time: ". As you loop through the records, you put the lines of each one in an array by splitting on the newline character. The first element, or line, will contain the timestamp. You would need to loop through the elements of the array and do a regexp match, capturing the part of the line you want. But leave the existing test in there so you only loop through the records that have the interesting line.
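
To see what the record separator and the split are doing, a small sketch along these lines may help. The helper name records_as_lines and the two-record sample string are assumptions for illustration only. Note that the very first read returns nothing but the delimiter, because the file starts with "Time: ".

```perl
use strict;
use warnings;

# Hypothetical helper: read records delimited by "Time: " and return each
# non-empty record as an array ref holding its lines.
sub records_as_lines {
    my ($fh) = @_;
    local $/ = 'Time: ';             # a "line" now ends at the next "Time: "
    my @records;
    while ( my $record = <$fh> ) {
        chomp $record;               # chomp strips the trailing "Time: "
        next unless length $record;  # first read is only the delimiter
        push @records, [ split /\n/, $record ];
    }
    return @records;
}

# Shortened sample data (an assumption, not the full log).
my $log = "Time: 1/9/2015-09:01:43.010\nProtocol: SMPP\nESME: ehttp_rknoe\n"
        . "Time: 1/9/2015-09:02:43.010\nProtocol: SMPP\nESME: ehttp_rknoe\n";

open my $fh, '<', \$log or die "Cannot open in-memory log: $!";
for my $lines ( records_as_lines($fh) ) {
    print "first line: $lines->[0]\n";
    print "      rest: ", join( ' | ', @{$lines}[ 1 .. $#{$lines} ] ), "\n";
}
close $fh;
```

Element 0 of each array is the timestamp, which is why (split "\n")[0] picks it out in the one-liner.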

      Update: added code example

      Here's a simplified example. I would not split into lines but use capturing regexps on the whole record, but doing the split into lines may make it simpler for you to see what's going on.

          #!/usr/bin/perl
          use strict;
          use warnings;
          use feature qw/ say /;
          use DateTime::Format::Strptime qw/ strptime /;

          say 'RTHROTTLED incidents:';

          $/ = 'Time: ';

          while ( my $record = <DATA> ) {
              if ( $record =~ /ESME_RTHROTTLED/ ) {
                  my @lines = split "\n", $record;

                  # We know the time is in the first line
                  my $time = strptime('%d/%m/%Y-%T.%3N', $lines[0])->strftime('%FT%T.%3N');

                  my $ESME;
                  foreach my $line ( @lines ) {
                      $ESME = $1 if $line =~ /ESME:\s+(.+$)/;
                  }
                  say ' ' . "$time $ESME";
              }
          }
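
The variant without the split, using capturing regexps on the whole record, could be sketched as follows. The helper name time_and_esme and the shortened sample log are hypothetical, and the timestamp is printed as it appears in the log rather than reformatted, so no DateTime module is needed here.

```perl
use strict;
use warnings;

# Hypothetical helper: given one record (the text between two "Time: "
# delimiters), return its timestamp and ESME name, or an empty list if
# the record does not mention ESME_RTHROTTLED.
sub time_and_esme {
    my ($record) = @_;
    return unless $record =~ /ESME_RTHROTTLED/;
    my ($time) = $record =~ /\A(\S+)/;          # record starts with the time
    my ($esme) = $record =~ /^ESME:\s+(\S+)/m;  # /m: ^ matches at line starts
    return ( $time, $esme );
}

# Shortened sample data (an assumption, not the full log).
my $log = <<'END_LOG';
Time: 1/9/2015-09:01:43.010
ESME: ehttp_rknoe
00000008: 00 00 00 58 command_status: 0x00000058
Time: 1/9/2015-09:02:43.010
ESME: ehttp_rknoe
00000008: 00 00 00 58 command_status: 0x00000058 ESME_RTHROTTLED
END_LOG

open my $fh, '<', \$log or die "Cannot open in-memory log: $!";
{
    local $/ = 'Time: ';
    while ( my $record = <$fh> ) {
        my ( $time, $esme ) = time_and_esme($record) or next;
        print "$time $esme\n";
    }
}
close $fh;
```

The pattern /^ESME:\s+/ does not match the ESME_RTHROTTLED status text, since that has an underscore rather than a colon after ESME.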
Re^2: Parsing text string in Perl
by 1nickt (Canon) on Sep 02, 2015 at 03:12 UTC
    dupe