NothingInCommon has asked for the wisdom of the Perl Monks concerning the following question:

Hello, I recently got back into Perl after several years without writing a single line of code, so please bear with me. I will lay out my dilemma and hopefully someone can point me in the right direction.
Output I am required to process:
    AAA_AA1_DDD (0% / 0 of 1024 | event 0% / 0 of 1024)
    AAA_AA2_DDD (0% / 0 of 1024 | event 0% / 0 of 1024)
    BS: 111_DDD, QE: QQQ_DDD (additionnal QE)
    BBB_BB1_DDD (0% / 0 of 1024 | event 0% / 0 of 1024)
    BBB_BB2_DDD (0% / 0 of 1024 | event 0% / 0 of 1024)
    BS: 222_DDD, QE: QQQ_DDD (additionnal QE)
    CCC_CC1_DDD (0% / 0 of 1024 | event 0% / 0 of 1024)
    CCC_CC2_DDD (0% / 0 of 1024 | event 0% / 0 of 1024)
    BS: 333_DDD, QE: QQQ_DDD (additionnal QE)
Let's call the lines with a % "queue lines". AAA_AA1_DDD has two sets of queue values separated by the pipe, as does every other line except the ones that start with "BS:". In this example AAA_AA1_DDD belongs to BS: 111_DDD, displayed two lines below it, and CCC_CC2_DDD belongs to BS: 333_DDD. Unfortunately, the output would have been nicer if it were reversed, because then I could check for a BS line, mark it, and add what the BS owns to a hash of arrays holding the important values (% and # of both queues). Something like:
BS 111_DDD => (AAA_AA1_DDD,%,#,%,#)
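A minimal sketch of that structure, assuming each BS can own several queues, so every hash value is a list of array references rather than a single array. The name %bs_queues and the zero values are only illustrative.

```perl
use strict;
use warnings;

# Hypothetical layout: BS name => list of queue records,
# each record being [ queue name, in %, in #, out %, out # ].
my %bs_queues = (
    '111_DDD' => [
        [ 'AAA_AA1_DDD', 0, 0, 0, 0 ],
        [ 'AAA_AA2_DDD', 0, 0, 0, 0 ],
    ],
);

# Add a queue to a BS entry; the entry is created on first use.
push @{ $bs_queues{'222_DDD'} }, [ 'BBB_BB1_DDD', 0, 0, 0, 0 ];

# Walk the structure and build one tab-separated row per queue.
my @rows;
for my $bs ( sort keys %bs_queues ) {
    for my $queue ( @{ $bs_queues{$bs} } ) {
        push @rows, join "\t", $bs, @$queue;
    }
}
print "$_\n" for @rows;
```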
Why do I want to do this? Well, I think it would be the easiest way to populate a MySQL database.
table queue
BS       queue_name   in_q_per  in_q_num  out_q_per  out_q_num
--------------------------------------------------------------
111_DDD  AAA_AA1_DDD  0%        0         0%         0
111_DDD  AAA_AA2_DDD  0%        0         0%         0
222_DDD  BBB_BB1_DDD  0%        0         0%         0
222_DDD  BBB_BB2_DDD  0%        0         0%         0
333_DDD  CCC_CC1_DDD  0%        0         0%         0
333_DDD  CCC_CC2_DDD  0%        0         0%         0
I thought of one way to do this: put the output into an array and reverse the array to make it easier to process, so the lines would look like this:
    BS: 333_DDD, QE: QQQ_DDD (additionnal QE)
    CCC_CC2_DDD (0% / 0 of 1024 | event 0% / 0 of 1024)
    CCC_CC1_DDD (0% / 0 of 1024 | event 0% / 0 of 1024)
    BS: 222_DDD, QE: QQQ_DDD (additionnal QE)
    BBB_BB2_DDD (0% / 0 of 1024 | event 0% / 0 of 1024)
    BBB_BB1_DDD (0% / 0 of 1024 | event 0% / 0 of 1024)
    BS: 111_DDD, QE: QQQ_DDD (additionnal QE)
    AAA_AA2_DDD (0% / 0 of 1024 | event 0% / 0 of 1024)
    AAA_AA1_DDD (0% / 0 of 1024 | event 0% / 0 of 1024)
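A rough sketch of that reversal idea, assuming the output is already held in an array of lines. The names @lines, %bs_queues and $current_bs are made up for illustration, and only two of the three groups are included to keep it short.

```perl
use strict;
use warnings;

# Sample lines copied from the output shown earlier (two groups only).
my @lines = (
    'AAA_AA1_DDD (0% / 0 of 1024 | event 0% / 0 of 1024)',
    'AAA_AA2_DDD (0% / 0 of 1024 | event 0% / 0 of 1024)',
    'BS: 111_DDD, QE: QQQ_DDD (additionnal QE)',
    'BBB_BB1_DDD (0% / 0 of 1024 | event 0% / 0 of 1024)',
    'BS: 222_DDD, QE: QQQ_DDD (additionnal QE)',
);

my ( %bs_queues, $current_bs );
for my $line ( reverse @lines ) {
    if ( $line =~ /^BS:\s+(\w+)/ ) {
        # After reversal the BS line comes before its queues, so remember it.
        $current_bs = $1;
    }
    else {
        # Capture queue name, in %, in #, out %, out #.
        my @fields = $line =~
            m{^(\w+) \((\d+)% / (\d+) of \d+ \| event (\d+)% / (\d+) of \d+\)};
        next unless @fields and defined $current_bs;

        # unshift restores the original top-to-bottom queue order.
        unshift @{ $bs_queues{$current_bs} }, \@fields;
    }
}

for my $bs ( sort keys %bs_queues ) {
    print join( "\t", $bs, @$_ ), "\n" for @{ $bs_queues{$bs} };
}
```

Reversing is not strictly needed: reading top to bottom and holding queue lines until the next BS line appears reaches the same grouping.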
Please help me (with code examples if possible) decide on a plan for this.

Replies are listed 'Best First'.
Re: Plan of Attack.
by ikegami (Patriarch) on Apr 20, 2009 at 17:19 UTC
    Accumulate lines until you reach a BS line, then output what you have accumulated.
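    my @queues;
    while (<$fh>) {
        if (my @queue = /(\w+) \((\d+)% \/ (\d+) of \d+ \| event (\d+)% \/ (\d+) of \d+\)/) {
            # Queue line: hold its fields until the owning BS line shows up.
            push @queues, \@queue;
        }
        elsif (my ($group) = /^BS: (\w+)/) {
            # BS line: everything accumulated so far belongs to this group.
            while (@queues) {
                my $queue = shift(@queues);
                print(join("\t", $group, @$queue), "\n");
            }
        }
    }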
Re: Plan of Attack.
by linuxer (Curate) on Apr 20, 2009 at 17:31 UTC

    General Hint:

    You can read your data source linewise and process each queue line and store them temporarily in a data structure.

    If you encounter a line which starts with BS:, take the data structure and combine it with the data of the current BS: line and clear the temporary data structure.

    Example of work:

    #!/usr/bin/perl
    # vi:ts=4 sw=4 et:
    use strict;
    use warnings;

    use Data::Dumper qw();

    my %result;
    my %queue;

    while ( my $line = <DATA> ) {
        chomp $line;

        # \w+ rather than \S+, so the trailing comma is not part of the key
        if ( $line =~ m/^BS:\s+(\w+)/ ) {
            $result{$1} = { %queue };
            %queue = ();
        }
        else {
            my ( $name, $in_percent, $in_num, $out_percent, $out_num ) =
                split m{ \(| / | of 1024 \| event | of 1024\)}, $line;

            $queue{$name} = {
                in_q_per  => $in_percent,
                in_q_num  => $in_num,
                out_q_per => $out_percent,
                out_q_num => $out_num,
            };
        }
    }

    # show resulting structure; it's up to you to go on from here ;o)
    print Data::Dumper->Dump( [ \%result ], [ '*result' ] );

    __DATA__
    AAA_AA1_DDD (0% / 0 of 1024 | event 0% / 0 of 1024)
    AAA_AA2_DDD (0% / 0 of 1024 | event 0% / 0 of 1024)
    BS: 111_DDD, QE: QQQ_DDD (additionnal QE)
    BBB_BB1_DDD (0% / 0 of 1024 | event 0% / 0 of 1024)
    BBB_BB2_DDD (0% / 0 of 1024 | event 0% / 0 of 1024)
    BS: 222_DDD, QE: QQQ_DDD (additionnal QE)
    CCC_CC1_DDD (0% / 0 of 1024 | event 0% / 0 of 1024)
    CCC_CC2_DDD (0% / 0 of 1024 | event 0% / 0 of 1024)
    BS: 333_DDD, QE: QQQ_DDD (additionnal QE)
      This worked absolutely flawlessly. :) Thank you so much!!
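One possible way to go on from a nested structure like %result is to flatten it into rows in the column order of the queue table from the question. This is only a sketch: the %result literal below is a hand-written excerpt shaped like the Dumper output, and @columns and @rows are names invented for the example.

```perl
use strict;
use warnings;

# Hand-written excerpt: BS name => queue name => hash of the four values.
my %result = (
    '111_DDD' => {
        'AAA_AA1_DDD' => { in_q_per => '0%', in_q_num => 0, out_q_per => '0%', out_q_num => 0 },
        'AAA_AA2_DDD' => { in_q_per => '0%', in_q_num => 0, out_q_per => '0%', out_q_num => 0 },
    },
    '222_DDD' => {
        'BBB_BB1_DDD' => { in_q_per => '0%', in_q_num => 0, out_q_per => '0%', out_q_num => 0 },
    },
);

# Value columns, in the order used by the table in the question.
my @columns = qw(in_q_per in_q_num out_q_per out_q_num);

my @rows;
for my $bs ( sort keys %result ) {
    for my $name ( sort keys %{ $result{$bs} } ) {
        # A hash slice pulls the four values out in column order.
        push @rows, [ $bs, $name, @{ $result{$bs}{$name} }{@columns} ];
    }
}

print join( "\t", 'BS', 'queue_name', @columns ), "\n";
print join( "\t", @$_ ), "\n" for @rows;
```

Each row could then be handed to a prepared INSERT statement, for example through DBI placeholders. That step is left out here because it needs a live database connection.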
Re: Plan of Attack.
by bichonfrise74 (Vicar) on Apr 20, 2009 at 18:38 UTC
    I went a different route. Basically I changed the input record separator to grab the 'BS' and then restore the input record separator to its default value.

    After that, I process the data line by line again. Here's my code.
    #!/usr/bin/perl
    use strict;

    # Read one record per BS group: each record ends at the closing "QE)".
    local $/ = "QE)";
    my %hash;

    while (<DATA>) {
        # Strip the BS line from the record and remember its name.
        next unless s/BS:\s(\w+).*QE\)//;
        $hash{'BS'} = $1;

        {
            local $/ = "\n";    # default separator restored inside this block
            # Consume the queue lines left in the record, one per pass.
            while ( s/(\w+) \((\d+)% \/ (\d+) of \d+ \| event (\d+)% \/ (\d+) of \d+\)// ) {
                print "$hash{'BS'}\t$1\t$2%\t$3\t$4%\t$5\n";
            }
        }
    }

    __DATA__
    AAA_AA1_DDD (0% / 0 of 1024 | event 0% / 0 of 1024)
    AAA_AA2_DDD (0% / 0 of 1024 | event 0% / 0 of 1024)
    BS: 111_DDD, QE: QQQ_DDD (additionnal QE)
    BBB_BB1_DDD (0% / 0 of 1024 | event 0% / 0 of 1024)
    BBB_BB2_DDD (0% / 0 of 1024 | event 0% / 0 of 1024)
    BS: 222_DDD, QE: QQQ_DDD (additionnal QE)
    CCC_CC1_DDD (0% / 0 of 1024 | event 0% / 0 of 1024)
    CCC_CC2_DDD (0% / 0 of 1024 | event 0% / 0 of 1024)
    BS: 333_DDD, QE: QQQ_DDD (additionnal QE)