Two Column Data

PilotinControl has asked for the wisdom of the Perl Monks concerning the following question:

Re: Two Column Data by kcott (Archbishop) on May 11, 2015 at 23:33 UTC
G'day PilotinControl,

Firstly, I think you need to step back from the question of one or two loops and consider what data you're looping over. Currently, you're opening a file, reading all its data into an array, closing the file, then reading all the same data again from the array. Beyond the waste of CPU cycles to read the data twice, you've now copied all the data from disk into memory: with a large file this could be problematic.

A better approach is to read and process each line from the file without ever creating an intermediate array. Here's one way to do it.

```
#!/usr/bin/env perl -l
use strict;
use warnings;

my $format = "| %-15s | %-15s |\n";
my $format_width = 37;

print '=' x $format_width;
printf $format => 'Inbound Track 1', 'Inbound Track 2';
print '-' x $format_width;

my ($file1, $file2) = qw{pm_1126327_inbndtrk1.txt pm_1126327_inbndtrk2.txt};

open my $fh1, '<', $file1 or die "Can't open '$file1' for reading: $!";
open my $fh2, '<', $file2 or die "Can't open '$file2' for reading: $!";

while (<$fh1>) {
    printf $format => get_data($_), get_data(scalar <$fh2>);
}

close $fh1;
close $fh2;

print '=' x $format_width;

sub get_data {
    chomp(my $line = shift);
    join ' ' => (split /:/ => $line)[1,2];
}
```

Output:

```
$ pm_1126327_combine_file_data.pl
=====================================
| Inbound Track 1 | Inbound Track 2 |
-------------------------------------
| B&O 101         | CSXT 1001       |
| B&O 102         | CSXT 1002       |
=====================================
```

Notes:

Your printf formats look fine; although, note that you can put common text into the format, e.g. the pipe separators in `"| %-15s | %-15s |\n"`. However, the syntax is `printf FORMAT, LIST`; your first three `printf` statements use `printf FORMAT, STRING`! Had you used warnings, you would have been advised of this problem: "`Missing argument in printf at ...`"

The call to `get_data()` provides a list context for its arguments. `<$fh2>` in list context will read the entire file; `scalar <$fh2>` reads just one record (which is what we want). See perlintro: Files and I/O.

The data you show is ideal, in that both files are the same length, all records have data, no data is corrupt. Real world data can rarely be relied upon to be ideal. In a `while (1) {...}` loop, you can code whatever loop exit conditions you want. In the following example, I've simply checked for an end-of-file (eof) condition in one file; in your real-world code, you'd probably want a lot more checking than this.

```
while (1) {
    last if eof $fh1;
    printf $format => get_data(scalar <$fh1>), get_data(scalar <$fh2>);
}
```

You can replace the `while` loop in my first script with this one and get exactly the same output.

Update: I had three instances of the literal '`37`' hard-coded. This is not good! What the hell does '`37`' refer to? You'll now find `$format_width` instead of '`37`': much better.

-- Ken
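The list-versus-scalar context point in the notes above can be seen in isolation with a minimal sketch. The sample data, the in-memory filehandle, and the `count_args()` helper are assumptions made for illustration; they are not part of the original script.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical three-record input; an in-memory filehandle stands in for a file.
my $data = "1:CSXT:1001\n2:CSXT:1002\n3:CSXT:1003\n";

# Illustrative helper: reports how many arguments it was handed.
sub count_args { return scalar @_ }

# A sub call gives its arguments list context, so <$fh> reads every record.
open my $fh, '<', \$data or die "Can't open in-memory file: $!";
my $list_count = count_args(<$fh>);
close $fh;

# Forcing scalar context reads exactly one record and leaves the rest unread.
open $fh, '<', \$data or die "Can't open in-memory file: $!";
my $scalar_count = count_args(scalar <$fh>);
my @remaining    = <$fh>;
close $fh;

print "list context passed $list_count argument(s)\n";
print "scalar context passed $scalar_count argument(s)\n";
print "records left after the scalar read: ", scalar(@remaining), "\n";
```

With the three sample records, the first call receives all three lines at once, while the second receives one line and leaves two in the filehandle. That is why `get_data(scalar <$fh2>)` keeps the two files in step.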
Re^2: Two Column Data by PilotinControl (Pilgrim) on May 12, 2015 at 00:14 UTC
Thank you Ken! I was looking for a while loop to use, as in another instance I am reading data from several files and displaying the information in one column. This is my first attempt at reading several files, putting the data into two separate columns, and figuring out how to format the data. This is a very good working example that I can now build on for future subroutines. Thanks again!
Re: Two Column Data by edimusrex (Monk) on May 11, 2015 at 19:09 UTC
I don't know if this will help or not, but this is my approach, assuming the files are relatively small and both files have the same number of lines.

```
#!/usr/bin/perl
use warnings;
use strict;

open my $a, "<", "file1.txt" or die "Failed to open File : $!";
open my $b, "<", "file2.txt" or die "Failed to open File : $!";
open my $out, ">", "output.txt" or die "Failed to open File : $!";

# Read both files into memory, stripping the trailing newlines
chomp(my @file1=<$a>);
chomp(my @file2=<$b>);

close $a;
close $b;

my $max = @file1;   # number of lines in file1
my $min = 0;        # current line index

while ($min < $max) {
    print $min."\n";    # progress: prints the current index to STDOUT
    print $out "$file1[$min] : $file2[$min]\n";
    $min++;
}
close $out;
```

Of course, with the `print $out` statement you can use any separator you choose (I chose a colon). Hope it helps.
Re^2: Two Column Data by PilotinControl (Pilgrim) on May 11, 2015 at 19:31 UTC
My apologies...the output needs to be sent to STDOUT not to another file. I'll give you a vote for another approach which I see I will be using here shortly.
Re^3: Two Column Data by edimusrex (Monk) on May 11, 2015 at 20:04 UTC
Ok, try this then:

```
#!/usr/bin/perl
use warnings;
use strict;

open my $a, "<", "file1.txt" or die "Failed to open File : $!";
open my $b, "<", "file2.txt" or die "Failed to open File : $!";

chomp(my @file1=<$a>);
chomp(my @file2=<$b>);

close $a;
close $b;

my $max = @file1;
my $min = 0;

while ($min < $max) {
    print "$file1[$min] : $file2[$min]\n";
    $min++;
}
```

That should print to standard out.
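The array-based approach above assumes both files have the same number of lines. If that cannot be guaranteed, one possible variation is to loop up to the length of the longer array and substitute an empty string where the shorter one has run out. This is only a sketch: the sample lines and the `pair_lines()` helper are hypothetical, and the `//` (defined-or) operator requires Perl 5.10 or later.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample lines standing in for the chomped contents of the two files.
my @file1 = ('1:B&O:101', '2:B&O:102', '3:B&O:103');
my @file2 = ('1:CSXT:1001', '2:CSXT:1002');

# pair_lines() is an illustrative helper, not part of the posts above.
# It takes two array references and returns one "left : right" string per row.
sub pair_lines {
    my ($left, $right) = @_;

    # Use the longer array's length so that no line is silently dropped.
    my $rows = @$left > @$right ? @$left : @$right;

    my @paired;
    for my $i (0 .. $rows - 1) {
        # Past the end of the shorter array the element is undef;
        # the defined-or operator substitutes an empty string.
        push @paired, join ' : ', $left->[$i] // '', $right->[$i] // '';
    }
    return @paired;
}

my @paired = pair_lines(\@file1, \@file2);
print "$_\n" for @paired;
```

Because the rows are built before printing, the same list can be sent to STDOUT, as here, or to a filehandle.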
Re^4: Two Column Data by PilotinControl (Pilgrim) on May 11, 2015 at 22:11 UTC
Re^5: Two Column Data by Laurent_R (Canon) on May 12, 2015 at 18:04 UTC
Re: Two Column Data by Laurent_R (Canon) on May 11, 2015 at 18:53 UTC
You are talking about files, but seem to be using arrays. Can we assume that the data files have been loaded into the arrays? Note that you are using the `printf` function where you can probably use the easier `print` function. As for the content of your question, it might be easier to answer if you gave us some samples of your data. Je suis Charlie.
Re^2: Two Column Data by PilotinControl (Pilgrim) on May 11, 2015 at 19:01 UTC
Yes, the data has already been put into the arrays and the data from file 1 needs to be in column one and data from file 2 needs to be in column 2. My issue has to do with column formats.
Re^3: Two Column Data by Laurent_R (Canon) on May 11, 2015 at 19:44 UTC
What you want to do is still not clear to me.

"the data from file 1 needs to be in column one and data from file 2 needs to be in column 2"

Yes, but your code seems to be splitting the data from both files. Do you want only one part of the data from both files? If so, which one? Do your files have the same number of elements, and can it be assumed that we can just marry the nth line of file 1 with the nth line of file 2 to produce the nth line of the output file? Je suis Charlie.
Re^4: Two Column Data by PilotinControl (Pilgrim) on May 11, 2015 at 19:54 UTC
Re^5: Two Column Data by Laurent_R (Canon) on May 11, 2015 at 20:12 UTC
Re^3: Two Column Data by Laurent_R (Canon) on May 11, 2015 at 19:36 UTC
Please do not change your original post after you have received some answers, or if you do, please indicate clearly with an update tag what you have added. You are now making me look like a fool because I asked questions whose answers appear to be in the original post, but were not there when I asked the questions. Update: Thank you for having now marked your updates and having done so quickly. Je suis Charlie.
Re: Two Column Data by jeffa (Bishop) on May 12, 2015 at 17:06 UTC
Unix has a tool called `join` that solves these kinds of problems for you. Given the following data files:

inbndtrk1.txt:

```
1:B&O:101
2:B&O:102
```

inbndtrk2.txt:

```
1:CSXT:1001
2:CSXT:1002
```

The following command

```
join -t: -o{1,2}.{2,3} inbndtrk1.txt inbndtrk2.txt
```

will yield this output:

```
B&O:101:CSXT:1001
B&O:102:CSXT:1002
```

Now, you can more easily format the results to whatever `$client` wants:

```
join -t: -o{1,2}.{2,3} inbndtrk1.txt inbndtrk2.txt \
 | perl -F: -ane'printf "%s %s | %s %s",@F'
```

Output:

```
B&O 101 | CSXT 1001
B&O 102 | CSXT 1002
```

jeffa

L-LL-L--L-LL-L--L-LL-L--
-R--R-RR-R--R-RR
B--B--B--B--B--B--B--B--
H---H---H---H---H---H---
(the triplet paradiddle with high-hat)
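Where `join` is not available, roughly the same key-based pairing can be sketched in Perl with a hash. This is an illustration only, not a full reimplementation of `join`: the `join_on_key()` helper is a hypothetical name, each key is assumed to appear at most once per file, and records without a partner are skipped (which is also what `join` does by default).

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# The sample records from the post above, held in arrays for a self-contained demo.
my @track1 = ('1:B&O:101', '2:B&O:102');
my @track2 = ('1:CSXT:1001', '2:CSXT:1002');

# join_on_key() is an illustrative helper, not a standard function.
# It pairs records whose first colon-separated field matches.
sub join_on_key {
    my ($left, $right) = @_;

    # Index the right-hand records by their first field (the key).
    my %right_by_key;
    for my $record (@$right) {
        my ($key, @fields) = split /:/, $record;
        $right_by_key{$key} = \@fields;
    }

    # Walk the left-hand records in order and append the matching fields.
    my @joined;
    for my $record (@$left) {
        my ($key, @fields) = split /:/, $record;
        next unless exists $right_by_key{$key};   # skip unpaired records
        push @joined, join ':', @fields, @{ $right_by_key{$key} };
    }
    return @joined;
}

my @joined = join_on_key(\@track1, \@track2);
print "$_\n" for @joined;
```

Unlike pairing by line position, matching on the key still lines up the right records when one file has gaps or is ordered differently.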