Re: Manually incrementing @ array during for
by hippo (Bishop) on Mar 16, 2020 at 16:05 UTC
#!/usr/bin/env perl
use strict;
use warnings;
my @array = (<DATA>);
my $i = 0;
while ($i <= $#array) {
    # Increment in a separate statement: reading $i and $i++ inside one
    # string leaves the order of evaluation unspecified.
    print "Line $i is $array[$i]";
    $i++;
    while ($i <= $#array && $array[$i] =~ /^ /) {
        print "Line $i is a continuation: $array[$i]";
        $i++;
    }
}
__DATA__
keyword1 data1 data2 data3
keyword2 data1 data2 data3
 data4 data5
 data6
keyword1 data1 data2 data3 data4
keyword3 data1
Thank you for the prompt reply. I was kind of hoping Perl had a way to advance within the array from inside the for loop without a shift. But I can change to a while loop (or maybe a classic C-style for loop) using an index, and your reply contains a nice template!
Re: Manually incrementing @ array during for
by tybalt89 (Monsignor) on Mar 16, 2020 at 17:39 UTC
#!/usr/bin/perl
use strict; # https://perlmonks.org/?node_id=11114346
use warnings;
my @array = <DATA>;
use Data::Dump 'dd'; dd \@array;
my @combinedarray = (join "", @array, '') =~ /^.*\n(?: .*\n)*/gm;
dd \@combinedarray; # do the 'for' over @combinedarray
__DATA__
keyword1 data1 data2 data3
keyword2 data1 data2 data3
 data4 data5
 data6
keyword1 data1 data2 data3 data4
keyword3 data1
Outputs:
[
  "keyword1 data1 data2 data3\n",
  "keyword2 data1 data2 data3\n",
  " data4 data5\n",
  " data6\n",
  "keyword1 data1 data2 data3 data4\n",
  "keyword3 data1\n",
]
[
  "keyword1 data1 data2 data3\n",
  "keyword2 data1 data2 data3\n data4 data5\n data6\n",
  "keyword1 data1 data2 data3 data4\n",
  "keyword3 data1\n",
]
Re: Manually incrementing @ array during for
by Fletch (Bishop) on Mar 16, 2020 at 16:09 UTC
You need to get fancier in your parsing. Examine each line as it comes in and determine whether it's a continuation (presuming leading whitespace indicates this, going from your example data); if so, append it to the "current line". Once you're sure you have a full line, process it and clear out the current line. Handwavy, vague outline:
my $current_line = q{};
while( defined( my $line = <> ) ) {
    chomp( $line );
    if( $line =~ m{^ \s+ \w+ }x ) {
        ## Continuation: append to the record being built.
        $current_line .= $line;
        next;
    } else {
        ## New record: flush the previous one (nothing is pending on the first pass).
        _process_line( $current_line ) if length $current_line;
        $current_line = $line;
    }
}
if( length $current_line ) {
    _process_line( $current_line );
}
sub _process_line {
    my( $line ) = shift;
    ## do whatever . . .
}
Update: Fuller example with sample data and fixing a bugglet first time through loop.
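The fuller example referred to here is not reproduced in this thread. As a stand-in, here is a minimal, self-contained sketch of the same accumulate-and-flush idea. It is not Fletch's original code: the sample data is copied from the other replies, an in-memory filehandle is assumed in place of <>, and process_record and @records are hypothetical names used only so the result can be inspected.

```perl
use strict;
use warnings;

# Sample input; continuation lines start with whitespace.
my $sample = join '',
    "keyword1 data1 data2 data3\n",
    "keyword2 data1 data2 data3\n",
    " data4 data5\n",
    " data6\n",
    "keyword1 data1 data2 data3 data4\n",
    "keyword3 data1\n";

# An in-memory filehandle stands in for <> so the sketch runs without a file.
open my $in, '<', \$sample or die "Cannot open in-memory filehandle: $!";

my @records;    # filled by process_record() so the result can be inspected

# Hypothetical stand-in for the real per-record work.
sub process_record {
    my ($record) = @_;
    push @records, $record;
}

my $current = q{};
while ( defined( my $line = <$in> ) ) {
    chomp $line;
    if ( $line =~ /^\s+\S/ ) {
        $current .= $line;      # continuation: extend the pending record
        next;
    }
    process_record($current) if length $current;    # nothing pending on the first pass
    $current = $line;
}
process_record($current) if length $current;        # flush the final record
close $in;

print "$_\n" for @records;
```

The two length checks are what keep an empty record from being processed, both on the first pass and when the input is empty.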
The cake is a lie.
The cake is a lie.
The cake is a lie.
Thank you for your prompt reply. I agree the steps necessary require more sophisticated parsing. I like the approach where a split data line is joined, then just process the entire line.
However, this is part of a much bigger body of code, and I receive the already created @array, so I don't think I can shift since the @array may be used elsewhere.
That being the case, do you think I should use an array index instead? Sort of a combination of your reply and the previous one. I could peek ahead to the next line and, if it starts with blanks (or is not a keyword), append it to the current line as per your approach... Unless you say otherwise, I think I will go this route. Thanks again!
Several other monks have been suggesting ways to handle this that end up modifying @array. If your part of the program receives a reference to @array, you can use the dclone function from the core module Storable to copy the array and then mangle the copy however you want.
Adapted from the Storable POD:
use Storable qw(dclone);
# ...
my $arrayref = dclone($provided_arrayref);
As long as @array is small enough to copy, this should be very efficient; Storable is an XS module.
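To make the copy-then-mangle idea concrete, here is a small sketch. The contents of $provided_arrayref are made up for illustration. It also shows a plain list copy, which is enough when every element is a simple string, as the lines in this thread appear to be; dclone matters once the elements are themselves references.

```perl
use strict;
use warnings;
use Storable qw(dclone);

# Hypothetical stand-in for the array reference handed over by the larger program.
my $provided_arrayref = [
    "keyword1 data1\n",
    "keyword2 data1\n",
    " data2\n",
];

my $deep_copy    = dclone($provided_arrayref);   # recursive copy, needed for nested data
my @shallow_copy = @{$provided_arrayref};        # enough when every element is a plain string

# Destructive processing on the copies leaves the original untouched.
shift @{$deep_copy};
chomp @shallow_copy;

print scalar( @{$provided_arrayref} ), " lines still in the original\n";
```

Either copy can then be shifted, spliced or chomped freely while the caller's array keeps all three lines, newlines included.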
Re: Manually incrementing @ array during for
by johngg (Canon) on Mar 16, 2020 at 16:48 UTC
It might be simpler to do a first pass concatenating continuation lines via splice before your main processing.
johngg@shiraz:~/perl/Monks$ perl -Mstrict -Mwarnings -E '
open my $inFH, q{<}, \ <<__EOD__ or die $!;
keyword1 data1 data2 data3
keyword2 data1 data2 data3
 data4 data5
 data6
keyword1 data1 data2 data3 data4
keyword3 data1
__EOD__
my @dataLines = <$inFH>;
chomp @dataLines;
close $inFH or die $!;
for my $idx ( reverse 0 .. $#dataLines )
{
    next if $dataLines[ $idx ] =~ m{^keyword};
    $dataLines[ $idx - 1 ] .= splice @dataLines, $idx, 1;
}
say for @dataLines;'
keyword1 data1 data2 data3
keyword2 data1 data2 data3 data4 data5 data6
keyword1 data1 data2 data3 data4
keyword3 data1
I hope this is of interest.
Re: Manually incrementing @ array during for
by Marshall (Canon) on Mar 17, 2020 at 01:22 UTC
I demo a common parsing pattern below. You figure out what is special about the start of a "new record". If you see that "special thing" and you are already working on a record, then you process the previous record and start a new one. Otherwise you are continuing the current record. Note that since the start of a new record triggers the output of the previous record, there is a need to output the final record once the data ends.
use strict;
use warnings;
$|=1;
my $data_lines =
('keyword1 data1 data2 data3
keyword2 data1 data2 data3
 data4 data5
 data6
keyword1 data1 data2 data3 data4
keyword3 data1
');
my @lines = split (/\n/,$data_lines);
print "To show array of text lines as per spec:\n";
foreach (@lines)
{
    print " $_\n";
}
print "\n";
print "Showing data array's per combined input lines:\n\n";
my @array = ();
foreach my $line (@lines)
{
    if ($line =~ /^\S/ and @array>0) # Finish previous record
    {
        process_array (@array);
        @array = (); #start new record
        push (@array,$_) foreach (split ' ',$line);
    }
    else # new or continuing record
    {
        push (@array, $_) foreach (split ' ',$line);
    }
}
process_array (@array); # the last record
sub process_array
{
    my @array = @_;
    print "process array in some sub = @array\n";
}
__END__
To show array of text lines as per spec:
 keyword1 data1 data2 data3
 keyword2 data1 data2 data3
  data4 data5
  data6
 keyword1 data1 data2 data3 data4
 keyword3 data1

Showing data array's per combined input lines:

process array in some sub = keyword1 data1 data2 data3
process array in some sub = keyword2 data1 data2 data3 data4 data5 data6
process array in some sub = keyword1 data1 data2 data3 data4
process array in some sub = keyword3 data1
Re: Manually incrementing @ array during for
by kcott (Archbishop) on Mar 17, 2020 at 08:25 UTC
G'day cniggeler,
From your description, you're reading lines from a file and adding them to an array,
then reading all the same lines from the array and processing them.
You're doing the same work twice and you've provided no explanation why you need to do this.
Is there a reason you're not just processing the lines as you read them from the file?
If your keyword lines all start the same -- e.g. "ID1", "ID2", etc. -- you can do something like this:
#!/usr/bin/env perl
use strict;
use warnings;
{
    local $/ = 'keyword';
    while (<DATA>) {
        next if $. == 1;
        chomp;
        y/\n//d;
        print "$/$_\n";
    }
}
__DATA__
keyword1 data1 data2 data3
keyword2 data1 data2 data3
 data4 data5
 data6
keyword1 data1 data2 data3 data4
keyword3 data1
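To see why the script above skips the first record and then prints $/ in front of each remaining one, it can help to look at the raw records that come back when $/ is set to 'keyword'. The following is only an illustrative sketch: the short sample string and the names $text, $fh and @raw are made up, and an in-memory filehandle is assumed in place of DATA.

```perl
use strict;
use warnings;

# Small made-up sample; the continuation line starts with a space.
my $text = "keyword1 a b\nkeyword2 c\n d\n";
open my $fh, '<', \$text or die "Cannot open in-memory filehandle: $!";

my @raw;
{
    local $/ = 'keyword';       # the old value of $/ returns when this block ends
    while ( defined( my $record = <$fh> ) ) {
        push @raw, $record;     # unchomped, so the trailing separator is visible
    }
}
close $fh;

for my $n ( 0 .. $#raw ) {
    ( my $shown = $raw[$n] ) =~ s/\n/\\n/g;    # make newlines visible
    printf "record %d: [%s]\n", $n + 1, $shown;
}
```

This prints:

record 1: [keyword]
record 2: [1 a b\nkeyword]
record 3: [2 c\n d\n]

The first record holds nothing but the separator, which is what "next if $. == 1" discards. In the later records chomp strips the trailing 'keyword', so it has to be put back at the front when printing.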
You may need to refer to local and, for $. and $/,
"perlvar: Variables related to filehandles".
If the only way to differentiate keyword lines from continuation lines is by whitespace, you can do something like this:
#!/usr/bin/env perl
use strict;
use warnings;
my $multiline = '';
while (<DATA>) {
    chomp;
    if (0 == index $_, ' ') {
        $multiline .= $_;
    }
    else {
        print "$multiline\n" if length $multiline;
        $multiline = $_;
    }
}
print "$multiline\n";
__DATA__
keyword1 data1 data2 data3
keyword2 data1 data2 data3
 data4 data5
 data6
keyword1 data1 data2 data3 data4
keyword3 data1
Both of those scripts produce identical output:
keyword1 data1 data2 data3
keyword2 data1 data2 data3 data4 data5 data6
keyword1 data1 data2 data3 data4
keyword3 data1
If there's something else going on here, you'll need to tell us.
For instance, keyword may have some associated pattern, in which case a regex solution might be more appropriate.
Please include some code with any follow-up questions;
along with output, even if that's only error messages.
Is there a reason you're not just processing the lines as you read them from the file?
According to this, the code for which cniggeler is responsible "... is part of a much bigger body of code, and I receive the already created @array, ... the @array may be used elsewhere." Then IIUC, it's not possible for cniggeler to parse the data at the point of access (which I agree would likely be simpler and more efficient).
Give a man a fish: <%-{-{-{-<
Fair enough. I obviously missed the later post where the goal posts were moved. :-)
The second of my two solutions would work equally well for an array.
The chomp may not be necessary:
it's hard to tell, as the example input data presented in the OP looks more like file data than array data.
Re: Manually incrementing @ array during for
by cniggeler (Sexton) on Mar 18, 2020 at 02:20 UTC
Thanks for the many great replies! I ended up using a while loop and index to step through the @array, and coalescing multiple lines into one line, which was then parsed. The index had to be incremented for each "extra" line so the outer while loop didn't re-process the coalesced lines.
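The code that was actually used is not shown in the thread, so the following is only a sketch of the approach as described, under some assumptions: continuation lines are recognised by leading whitespace, the sample @array is copied from the earlier replies, and @coalesced is a hypothetical name for where the joined lines end up. The received @array is only read, never modified.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the @array received from the surrounding code.
my @array = (
    "keyword1 data1 data2 data3\n",
    "keyword2 data1 data2 data3\n",
    " data4 data5\n",
    " data6\n",
    "keyword1 data1 data2 data3 data4\n",
    "keyword3 data1\n",
);

my @coalesced;
my $i = 0;
while ( $i <= $#array ) {
    chomp( my $record = $array[$i] );     # copy, so @array itself is never modified
    $i++;

    # Absorb every following continuation line, advancing the index past each one
    # so the outer loop does not see it again.
    while ( $i <= $#array && $array[$i] =~ /^\s/ ) {
        chomp( my $extra = $array[$i] );
        $record .= $extra;
        $i++;
    }

    push @coalesced, $record;             # parse $record here instead, if preferred
}

print "$_\n" for @coalesced;
```

Keeping each increment in its own statement avoids reading and changing $i within one expression, where the order of evaluation is not defined.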
#!/usr/bin/perl
use strict;
use warnings;
my @data = <DATA>;
for ((my $i, local $_, my $next) = (0, @data[0, 1]);
     $i < @data;
     ($_, $next) = ($next, $data[++$i + 1])) {
    $next and $next =~ /^\s/ and ($_, $next) = ($_ . $next, $data[++$i + 1]) and redo;
    # processing goes here
    print "#$i: $_";
}
__DATA__
keyword1 data1 data2 data3
keyword2 data1 data2 data3
 data4 data5
 data6
keyword1 data1 data2 data3 data4
keyword3 data1
which gives:
#0: keyword1 data1 data2 data3
#3: keyword2 data1 data2 data3
 data4 data5
 data6
#4: keyword1 data1 data2 data3 data4
#5: keyword3 data1
Greetings, -jo
$gryYup$d0ylprbpriprrYpkJl2xyl~rzg??P~5lp2hyl0p$