writing two files (different in length) to one output

ic23oluk has asked for the wisdom of the Perl Monks concerning the following question:

Re: writing two files (different in length) to one output by choroba (Cardinal) on May 28, 2017 at 11:14 UTC
See seek on how to rewind a file handle. Also note that `<READ1>.<READ2>` doesn't check that both return values are defined. Moreover, chomping a line only to append `\n` to it on the next line is pointless. Chomping a file name after having used it makes no sense: chomp it before you use it, and use 3-arg open (which doesn't remove whitespace from its 3rd argument). Don't rewind the second file after having rewound the first one, otherwise you'll get an infinite loop.
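The advice above can be sketched roughly as follows. This is only an illustration, not code from the thread: `open_input` and `merge_cycling` are hypothetical names, the tab separator is an arbitrary choice, and in-memory filehandles stand in for real files so that the example runs on its own.

```perl
use strict;
use warnings;

# Hypothetical helper: chomp the file name BEFORE using it, then use 3-arg open.
sub open_input {
    my ($name) = @_;
    chomp $name;
    open my $fh, '<', $name or die "Cannot open '$name': $!";
    return $fh;
}

# Hypothetical helper: pair up lines from two seekable, non-empty handles.
# The handle that runs out first is rewound with seek; the loop stops once
# both handles have reached their end at least once, so it cannot run forever.
sub merge_cycling {
    my @fh   = @_[0, 1];
    my @done = (0, 0);
    my @out;

    while (1) {
        my @pair;
        for my $i (0, 1) {
            my $line = readline($fh[$i]);
            if (!defined $line) {               # explicit definedness check
                $done[$i] = 1;
                return @out if $done[0] && $done[1];
                seek($fh[$i], 0, 0) or die "Cannot rewind input $i: $!";
                $line = readline($fh[$i]);
                die "Input $i is empty\n" unless defined $line;
            }
            chomp $line;
            push @pair, $line;
        }
        push @out, join("\t", @pair);
    }
}

# Demo data (an assumption; real code would call open_input on file names).
my $data1 = "1\n2\n3\n";
my $data2 = "a\nb\nc\nd\ne\n";
open my $one, '<', \$data1 or die "Cannot open in-memory file: $!";
open my $two, '<', \$data2 or die "Cannot open in-memory file: $!";
print "$_\n" for merge_cycling($one, $two);
```

With real files, `open_input($ARGV[0])` and `open_input($ARGV[1])` would replace the two in-memory opens.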
Re: writing two files (different in length) to one output by hippo (Archbishop) on May 28, 2017 at 11:27 UTC
> If one input file is shorter than the other, it should return to the first line, which is the difficulty to me.

If you store the input files as arrays then it is less tricky:

    #!/usr/bin/env perl
    #===============================================================================
    #
    #         FILE: zip.pl
    #
    #        USAGE: ./zip.pl input1 input2 output
    #
    #  DESCRIPTION: Zip 2 input files into one output file, looping the
    #               shorter one.
    #
    # REQUIREMENTS: Path::Tiny
    #        NOTES: See http://www.perlmonks.org/?node_id=1191418
    #===============================================================================

    use strict;
    use warnings;

    use Path::Tiny;

    my @in  = ( path ($ARGV[0]), path ($ARGV[1]) );
    my $out = $ARGV[2];

    my @first  = $in[0]->lines;
    my @second = $in[1]->lines;
    my $max    = $#first > $#second ? $#first : $#second;

    open my $outfh, '>', $out or die "Cannot open $out for writing: $!";
    for (0 .. $max) {
        print $outfh $first[$_ % @first] . $second[$_ % @second]
    }

Bad argument trapping is left as an exercise.
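One possible take on that exercise, as a sketch only: the modulo indexing is the same as above, but it is wrapped in a hypothetical `zip_cycling` function that refuses empty inputs, since `$_ % 0` would otherwise die with "Illegal modulus zero". The function name and the error message are assumptions, and the arrays are filled by hand rather than read with Path::Tiny.

```perl
use strict;
use warnings;

# Hypothetical helper: zip two array refs of lines, cycling the shorter one.
sub zip_cycling {
    my ($first, $second) = @_;

    # Trap the bad-argument case before the modulo can blow up.
    die "Both inputs must contain at least one line\n"
        unless @$first && @$second;

    my $max = @$first > @$second ? $#{$first} : $#{$second};

    my @out;
    for my $i (0 .. $max) {
        # The index wraps around to 0 once it passes the end of the shorter array.
        push @out, $first->[ $i % @$first ] . $second->[ $i % @$second ];
    }
    return @out;
}

# Demo data (an assumption); each element keeps its trailing newline.
my @first  = ("a\n", "b\n");
my @second = ("1\n", "2\n", "3\n");
print zip_cycling(\@first, \@second);
```

Because the lines keep their newlines, each output element is two lines long, just as in the script above.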
Re: writing two files (different in length) to one output by LanX (Saint) on May 28, 2017 at 12:32 UTC
hmm ... TIMTOWTDI! :)

What about combining `eof` with `seek`?

    open ONE, '<' ,'one.txt';
    open TWO, '<' ,'two.txt';

    my $go=2;

    while ($go) {
      my $line1 = <ONE>;
      my $line2 = <TWO>;
      # without parentheses chomp only applies to $line1;
      # $line2 keeps its newline, which ends each output line
      chomp $line1,$line2;
      print "$line1 \t $line2";
      $go--, seek ONE,0,0 if eof ONE;
      $go--, seek TWO,0,0 if eof TWO;
    }

one.txt has 3 lines and two.txt has 5; this is the output:

    1    1
    2    2
    3    3
    1    4
    2    5

HTH! :)

Cheers Rolf
(addicted to the Perl Programming Language and ☆☆☆☆ :)
Je suis Charlie!
Re^2: writing two files (different in length) to one output by Marshall (Canon) on May 28, 2017 at 19:43 UTC
I really liked your solution! However, I think there is a small problem with the loop conditional. $go gets decremented every time a seek to the beginning of a file is done. This worked fine in your test case, but what if file1 needs to get "rewound" many times? $go could get decremented a bunch of times before an eof on TWO happens.

I re-coded with the idea that the loop is over when an EOF has been seen at least once on both files. I don't have any big issue with the use of the comma statement, but since this could be confusing, I expanded that part of the code into 2 discrete statements.

    #!/usr/bin/perl
    use strict;
    use warnings;

    open ONE, '<' ,'one.txt' or die "Ooops $!";
    open TWO, '<' ,'two.txt' or die "Ooops $!";

    my $EOFseenfile1=0;
    my $EOFseenfile2=0;

    # Loop finished when EOF has been seen at least once
    # on both files.
    while (not $EOFseenfile1 or not $EOFseenfile2) #go until both EOF seen
    {
       my $line1 = <ONE>;
       my $line2 = <TWO>;
       chomp ($line1, $line2);
       print "$line1 \t $line2\n";
       if (eof ONE)
       {
          seek ONE,0,0;
          $EOFseenfile1++;
       }
       if (eof TWO)
       {
          seek TWO,0,0;
          $EOFseenfile2++;
       }
    }
    __END__
    file 1 line 1    file 2 line 1
    file 1 line 2    file 2 line 2
    file 1 line 3    file 2 line 3
    file 1 line 1    file 2 line 4
    file 1 line 2    file 2 line 5
    file 1 line 3    file 2 line 6
    file 1 line 1    file 2 line 7
    file 1 line 2    file 2 line 8

    file 1 has 3 lines
    file 2 has 8 lines

"eof" is also nice because you get that status right after reading the last valid line. The next read on that file handle would produce an undefined line, which is what ends many while() input loops, e.g. `while ($line = <IN>){}`. By using "eof" instead of an undefined read, no "re-reading" is necessary. ++, very nice.

Update: Geez, I guess `while (not $EOFseenfile1 or not $EOFseenfile2){}` could be `while (!($EOFseenfile1 and $EOFseenfile2)){}`. I don't know why I coded the first version. These two while conditionals are equivalent.
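The point about `eof` reporting its status right after the last valid line can be checked with a small sketch. The helper name `eof_after_each_read` and the sample data are assumptions, and an in-memory filehandle is used so that no file on disk is needed.

```perl
use strict;
use warnings;

# Hypothetical helper: read every line from a handle and record what
# eof() reports immediately after each successful read.
sub eof_after_each_read {
    my ($fh) = @_;
    my @report;
    while (defined(my $line = readline($fh))) {
        chomp $line;
        push @report, [ $line, eof($fh) ? 1 : 0 ];
    }
    return @report;
}

# Demo data (an assumption): three lines held in memory.
my $data = "first\nsecond\nthird\n";
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
printf "%-6s eof=%d\n", @$_ for eof_after_each_read($fh);
```

The flag should come back as 0 for every line except the last one, which is what lets the loops above rewind a handle without first performing a read that returns undef.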
Re^3: writing two files (different in length) to one output by LanX (Saint) on May 29, 2017 at 11:38 UTC
You are absolutely right! My fault ... :)

> "eof" is also nice because you get that status right after reading the last valid line.

Yep, not sure how efficient the implementation is, but rotating over iterators is usually not that much fun...

I recently stumbled over eof while reading `perlfunc` for the "Overview scalar vs list context?" thread.

FWIW, my approach would be something like:

    open ONE, '<' ,'one.txt';
    open TWO, '<' ,'two.txt';

    my ($finished_1, $finished_2);

    until ( $finished_1 and $finished_2) {
      my $line1 = <ONE>;
      my $line2 = <TWO>;
      # without parentheses chomp only applies to $line1;
      # $line2 keeps its newline, which ends each output line
      chomp $line1, $line2;
      print "$line1 \t $line2";
      ++$finished_1, seek ONE,0,0 if eof ONE;
      ++$finished_2, seek TWO,0,0 if eof TWO;
    }

out (ONE shortened to 2 lines):

    1    1
    2    2
    1    3
    2    4
    1    5

Cheers Rolf
Re^4: writing two files (different in length) to one output by Marshall (Canon) on May 30, 2017 at 18:54 UTC
Re^5: writing two files (different in length) to one output by LanX (Saint) on May 30, 2017 at 20:48 UTC
Re: writing two files (different in length) to one output by kcott (Archbishop) on May 29, 2017 at 07:28 UTC
G'day ic23oluk,

Here's my take on a solution (`pm_1191418_file_merge.pl`):

    #!/usr/bin/env perl -l

    use strict;
    use warnings;
    use autodie;

    die "Usage: $0 file1 file2\n" unless @ARGV == 2;

    my (@fhs, $shorter);

    for (@ARGV) {
        die "'$_' is zero-length\n" if -z;
        open my $fh, '<', $_;
        push @fhs, $fh;
    }

    LOOP: while (1) {
        my @out_line;

        for (0 .. 1) {
            my $line = readline $fhs[$_];

            if (! defined $line) {
                $shorter = $_ unless defined $shorter;
                last LOOP if $shorter != $_;
                seek $fhs[$_], 0, 0;
                $line = readline $fhs[$_];
            }

            push @out_line, $line;
        }

        chomp @out_line;
        print @out_line;
    }

I created four input files: one empty; one with just a single newline; the other two with data.

    $ ls -al pm_1191418_data_*
    -rw-r--r--  1 ken  staff   1 May 29 16:57 pm_1191418_data_blank.txt
    -rw-r--r--  1 ken  staff  24 May 29 16:14 pm_1191418_data_five.txt
    -rw-r--r--  1 ken  staff   8 May 29 16:14 pm_1191418_data_two.txt
    -rw-r--r--  1 ken  staff   0 May 29 16:47 pm_1191418_data_zero.txt
    $ cat pm_1191418_data_zero.txt
    $ cat pm_1191418_data_blank.txt

    $ cat pm_1191418_data_two.txt
    ONE
    TWO
    $ cat pm_1191418_data_five.txt
    one
    two
    three
    four
    five

These first two runs just exercise the sanity tests:

    $ pm_1191418_file_merge.pl
    Usage: ./pm_1191418_file_merge.pl file1 file2
    $ pm_1191418_file_merge.pl pm_1191418_data_zero.txt pm_1191418_data_two.txt
    'pm_1191418_data_zero.txt' is zero-length

The remaining runs test blank lines; files with a different number of lines in either order; and files with the same number of lines:

    $ pm_1191418_file_merge.pl pm_1191418_data_blank.txt pm_1191418_data_two.txt
    ONE
    TWO
    $ pm_1191418_file_merge.pl pm_1191418_data_two.txt pm_1191418_data_blank.txt
    ONE
    TWO
    $ pm_1191418_file_merge.pl pm_1191418_data_two.txt pm_1191418_data_five.txt
    ONEone
    TWOtwo
    ONEthree
    TWOfour
    ONEfive
    $ pm_1191418_file_merge.pl pm_1191418_data_five.txt pm_1191418_data_two.txt
    oneONE
    twoTWO
    threeONE
    fourTWO
    fiveONE
    $ pm_1191418_file_merge.pl pm_1191418_data_blank.txt pm_1191418_data_blank.txt

    $ pm_1191418_file_merge.pl pm_1191418_data_two.txt pm_1191418_data_two.txt
    ONEONE
    TWOTWO

-- Ken