how use map and grep or several loops?

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hello monks,

I have a communication log like the one below:

Message:
11 started at: 2018-06-29 16:20:07

Transmit: 
ATV1[0D]


Transmit: 
[01]10179311000=[03]

Receive: 
[01]20179321157>[02]0000006880140000000000000000000000000000000000688014000000
000000006880140000000000000000000040000000004000000000400000000040000000000000
000000000000000000[03][00]

Transmit: 
[01]10179312000>[03]

Receive: 
[01]20179331157?[02]0000006880140000000000000000000000000000000000688014000000
000000006880140000000000000000000040000000004000000000400000000040000000000000
000000000000000000[03][00]

Transmit: 
[01]10179313000?[03]

Receive: 
[01]201793411578[02]0000006880140000000000000000000000000000000000688014000000
000000006880140000000000000000000040000000004000000000400000000040000000000000
000000000000000000[03][00]

Transmit: 
[01]101793140008[03]

Receive: 
[01]201793511579[02]0000006880140000000000000000000000000000000000688014000000
000000006880140000000000000000000040000000004000000000400000000040000000000000
000000000000000000[03][00]

Transmit: 
[01]101793150009[03]

Receive: 
[01]001793611578[02]0000006880140000000000000000000000000000000000688014000000
000000006880140000000000000000000040000000004000000000400000000040000000000000
000000000000000000[03][00]



Message:
reading of spontaneous buffer not ordered

Message:
Periodic Buffer: Start: 2018-06-29 13:15:00 End: 2018-06-29 16:10:00
Periods: 36  Dec: 8  Points: 15  bytes collected: 5564  estimated: 5564

Message:
amount of bytes collected ok, data accepted 

Message:
11 ended at: 2018-06-29 16:20:46

The log has three kinds of section: Message, Transmit and Receive. Every section consists of three parts: a head (such as Message: or Transmit:), the data (a mix of bracketed hex codes such as [01], plain ASCII characters, and line breaks, because no data line can be longer than 80 bytes, so one received record may be split across two or three lines), and a tail (a single blank line).

My job is to retrieve all the data in the Receive sections, remove every line break from the data, and convert the ASCII characters to hex strings, like below:

Receive: 
[01]201793511579[02]0000006880140000000000000000000000000000000000688014000000
000000006880140000000000000000000040000000004000000000400000000040000000000000
000000000000000000[03][00]                              # old

01 32 30 31...........30 03 00                 # new: a single line made up only of hex strings, with no []

What I do is create two temporary arrays to deal with it:

my $start_str = "Receive"; my $end_str = "\n";
my @first_loop =
  grep {
     /$start_str/../^$end_str$/ ? 1 : 0;   # keep lines from a head to the next blank line
  } @logs;
my @second_loop; my $data = ""; my $k = 0; my $j = 0;
for (@first_loop) {                        # first loop removes returns, head and tail
    do { $k = 1, next } if $_ =~ /$start_str/;
    do { $k = 0; push @second_loop, $data; $data = ""; next; } if $_ =~ /^$end_str$/;
    do { $data .= $_; chomp $data; next } if $k == 1;
}
my @strings; my @refined_logs; my $hex_str = "";
for (@second_loop) {                       # second loop converts and removes []
    my $data_str = $_;
    chomp $data_str;
    for (split(//, $data_str)) {
        do { $j = 1, next } if $_ =~ /\[/;                  # entering a [..] code
        do { $j = 0; push @strings, $hex_str; $hex_str = ""; next; } if $_ =~ /\]/;
        do { $hex_str .= $_; next } if $j == 1;             # inside [..]: already hex
        push @strings, sprintf("%02X", ord($_));            # plain character: convert
    }
    my $last_str = join( " ", @strings );
    @strings = ();
    push @refined_logs, $last_str;
}

It works, but I'd like to use only map and grep to deal with it, something like @output_array = map {} map {} grep {} grep {} @input_array. I think that is tidier and easier to understand. Could you enlighten me with a few examples? Or are map and grep not a good fit in this scenario? Thanks for your help!


Replies are listed 'Best First'.
Re: how use map and grep or several loops? by kcott (Archbishop) on Jul 05, 2018 at 07:19 UTC
If you're dealing with log files, they're often large, and copying all their data into arrays, and then creating more arrays from that data, is generally not a good idea: it's likely to be very slow and you could run into memory issues. Instead, read the files one line at a time or, in your case here, one paragraph at a time. The following line (which you'll see in the script I've provided below) turns on paragraph mode: `local $/ = '';`

I'll also just comment on all those `do` statements. Is there a reason you coded it that way? Instead of `do { STATEMENTS } if CONDITION;` why not write `if (CONDITION) { STATEMENTS }`

Anyway, I truncated your data quite substantially but left the main features: head, multiline data and tail. The data you posted had a space after every "Receive:"; I've left that in but you should check if it's there in the original data (it could be an artefact of copy/paste, HTML rendering, etc.).

Here's the script to process it:

#!/usr/bin/env perl

use strict;
use warnings;

{
    local $/ = '';
    my $wanted = "Receive: \n";
    my $re = qr{(\[\d\d\]|.)};

    while (<DATA>) {
        chomp;
        next unless substr($_, 0, length $wanted, '') eq $wanted;
        #print "|$_|\n"; # For demo only - see current $_ value
        my @hex;
        while (/$re/g) {
            push @hex, length $1 == 1 ? sprintf '%02X', ord $1 : substr($1, 1, 2);
        }
        print "@hex\n";
    }
}
__DATA__
Message:
11 started at: 2018-06-29 16:20:07

Transmit: 
ATV1[0D]

Transmit: 
[01]10179311000=[03]

Receive: 
[01]20179321157>[02]00
00
00[03][00]

Transmit: 
[01]10179312000>[03]

Receive: 
[01]20179331157?[02]00
00
00[03][00]

Message:
amount of bytes collected ok, data accepted

Message:
11 ended at: 2018-06-29 16:20:46

Output:

01 32 30 31 37 39 33 32 31 31 35 37 3E 02 30 30 30 30 30 30 03 00
01 32 30 31 37 39 33 33 31 31 35 37 3F 02 30 30 30 30 30 30 03 00

If you uncomment that demo print line, you'll probably find it a bit easier to compare the data being processed and the data being output. It changes the output to this:

|[01]20179321157>[02]00
00
00[03][00]|
01 32 30 31 37 39 33 32 31 31 35 37 3E 02 30 30 30 30 30 30 03 00
|[01]20179331157?[02]00
00
00[03][00]|
01 32 30 31 37 39 33 33 31 31 35 37 3F 02 30 30 30 30 30 30 03 00

-- Ken
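For readers who have not met paragraph mode before, here is a minimal sketch of what `local $/ = '';` does, kept separate from Ken's script. The shortened log text and the names `$log` and `@heads` are made up for illustration, and the log is read from an in-memory filehandle only so that the snippet is self-contained.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical, shortened log text; blank lines separate the records.
my $log = "Message:\nstart\n\n"
        . "Transmit: \n[01]1[03]\n\n\n"
        . "Receive: \n[01]2[02]00\n00[03][00]\n\n";

my @heads;
{
    # Paragraph mode: each read returns one blank-line-delimited record,
    # and a run of consecutive blank lines counts as a single separator.
    local $/ = '';
    open my $fh, '<', \$log or die "Cannot open in-memory log: $!";
    while (my $record = <$fh>) {
        chomp $record;                        # in this mode, strips all trailing newlines
        my ($head) = split /\n/, $record, 2;  # first line of the record is its head
        push @heads, $head;
    }
    close $fh;
}

print "[$_]\n" for @heads;
```

Because `$/` is localized inside the bare block, ordinary line-at-a-time reading is restored as soon as the block ends.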
Re^2: how use map and grep or several loops? by Anonymous Monk on Jul 06, 2018 at 02:36 UTC
Thanks kcott! The reason I prefer map/grep to a while loop for something like this is the feel that map/grep gives me. Imagine you are driving on a freeway: if the road has only one exit, you can concentrate on driving; otherwise you have to watch every road sign. I think the performance you sacrifice is worth it. ;)
Re: how use map and grep or several loops? by johngg (Canon) on Jul 05, 2018 at 12:09 UTC
Some might disagree with you regarding the clarity of the grep and map approach and a solution using loops of various flavours might be easier to maintain in the long run. However, here's a solution for you which wraps everything in a do block to contain the localization of paragraph mode.

use 5.022;
use warnings;

open my $logFH, q{<}, \ <<__EOD__ or die $!;
Message:
11 started at: 2018-06-29 16:20:07

Transmit: 
ATV1[0D]


Transmit: 
[01]10179311000=[03]

Receive: 
[01]20179321157>[02]0000006880140000000000000000000000000000000000688014000000
000000006880140000000000000000000040000000004000000000400000000040000000000000
000000000000000000[03][00]

Transmit: 
[01]10179312000>[03]

Receive: 
[01]20179331157?[02]0000006880140000000000000000000000000000000000688014000000
000000006880140000000000000000000040000000004000000000400000000040000000000000
000000000000000000[03][00]

Transmit: 
[01]10179313000?[03]

Receive: 
[01]201793411578[02]0000006880140000000000000000000000000000000000688014000000
000000006880140000000000000000000040000000004000000000400000000040000000000000
000000000000000000[03][00]

Transmit: 
[01]101793140008[03]

Receive: 
[01]201793511579[02]0000006880140000000000000000000000000000000000688014000000
000000006880140000000000000000000040000000004000000000400000000040000000000000
000000000000000000[03][00]

Transmit: 
[01]101793150009[03]

Receive: 
[01]001793611578[02]0000006880140000000000000000000000000000000000688014000000
000000006880140000000000000000000040000000004000000000400000000040000000000000
000000000000000000[03][00]



Message:
reading of spontaneous buffer not ordered

Message:
Periodic Buffer: Start: 2018-06-29 13:15:00 End: 2018-06-29 16:10:00
Periods: 36 Dec: 8 Points: 15 bytes collected: 5564 estimated: 5564

Message:
amount of bytes collected ok, data accepted

Message:
11 ended at: 2018-06-29 16:20:46
__EOD__

my @records = do {
    local $/ = q{};
    map {
        join q{ },
        map {
            m{\[\d\d\]}
                ? substr $_, 1, 2
                : map { sprintf q{%02x}, ord } split m{};
        } @$_;
    }
    map {
        s{^Receive:}{};
        s{\s+}{}g;
        [ split m{(\[\d\d\])} ];
    }
    grep { m{^Receive:} }
    <$logFH>;
};

say qq{$_\n} for @records;

I don't show the output here as it will wrap horribly. Despite giving you this solution I would recommend sticking with looping constructs.

Update: Got rid of the middle `map` by moving the `split` into the first `map`.

Update 2: The first update would have made more sense if I'd included the original code so here it is :-

map {
    join q{ },
    map {
        m{\[\d\d\]}
            ? substr $_, 1, 2
            : map { sprintf q{%02x}, ord } split m{};
    } @$_;
}
map { [ split m{(\[\d\d\])} ] }
map {
    s{^Receive:}{};
    s{\s+}{}g;
    $_;
}
grep { m{^Receive:} }
<$logFH>;

Cheers, JohnGG
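The step that makes this chain work is `split` with a capturing group, which returns the delimiters along with the pieces between them. The following is a small sketch of that step on its own; the helper name `to_hex` and the shortened string in `$record` are illustrative assumptions, not part of JohnGG's post.

```perl
use strict;
use warnings;

# Hypothetical helper: turn one cleaned-up record (head and whitespace
# already removed) into a space-separated hex string.
sub to_hex {
    my ($record) = @_;
    return join q{ },
        map {
            m{^\[(\d\d)\]$}
                ? $1                                        # [NN] is already a hex code
                : map { sprintf q{%02x}, ord } split m{};   # plain text: one code per character
        }
        # The capturing group keeps each [NN] delimiter in the returned list.
        # Empty fields between adjacent delimiters expand to nothing above.
        split m{(\[\d\d\])}, $record;
}

my $record = '[01]2>[02]00[03][00]';    # made-up, shortened data
my $hex    = to_hex($record);
print "$hex\n";
```

For this input the sketch should print `01 32 3e 02 30 30 03 00`. Like the reply above, it only recognises two-digit decimal codes such as `[01]`; a code like `[0D]` would need `[0-9A-Fa-f]{2}` in place of `\d\d`.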
Re^2: how use map and grep or several loops? by Anonymous Monk on Jul 06, 2018 at 02:25 UTC
Many thanks, johngg! Although you recommend loops for something like this, I prefer the map/grep way. It's like a FIFO queue, so you can easily see how the original data is transformed into the data I want.