Hi all! I have some data that look like the sample below, after some formatting. Thanks for helping me fix my code (in another thread) and format my data nicely.

-59.077 89.301 115.664 7
-61.251 77.435 117.760 -6
-60.950 71.712 116.061 -7
-56.247 83.685 114.576 1
-59.263 76.107 112.555 -2
-59.895 65.296 111.185 3
-60.141 63.694 111.257 -3
-61.667 63.707 116.937 2
-58.722 60.429 111.307 -1
-57.511 42.922 112.108 6

(^ 10 lines)

Now what I want to do is the following:

1. Split my data file into separate data files, depending on the absolute value of the last column. As an example, from the input above I want to get the following files:

File 1: output_7.txt:

7 -59.077 89.301 115.664
-7 -60.950 71.712 116.061

File 2: output_6.txt:

-6 -61.251 77.435 117.760
6 -57.511 42.922 112.108

File 3: output_1.txt:

-1 -58.722 60.429 111.307
1 -56.247 83.685 114.576

And so on, for all the values of the last column.

Notes:

- I don't have a fixed number of columns, and

- I don't have a standard list of the last column's values; however, they always come in pairs of positive and negative integers, as described above.

2. I also want to print the line number, so the outputs above would become the following (the second column is the line number in the input file above):

File 1: output_7.txt:

7 1 -59.077 89.301 115.664
-7 3 -60.950 71.712 116.061

File 2: output_6.txt:

-6 2 -61.251 77.435 117.760
6 10 -57.511 42.922 112.108

File 3: output_1.txt:

-1 9 -58.722 60.429 111.307
1 4 -56.247 83.685 114.576

And so on.
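A minimal sketch of the two steps above, under some loudly stated assumptions: fields are separated by whitespace (so no fixed column count is needed), the last field is always an integer, and the sub name group_by_abs_last_column and the three-line sample are made up for illustration. Lines are kept in input order within each file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: group lines by the absolute value of their last column.
# Returns a hash ref: abs(value) => [ "value line_number rest-of-columns", ... ].
sub group_by_abs_last_column {
    my @lines = @_;
    my %groups;
    my $line_no = 0;
    for my $line (@lines) {
        $line_no++;
        my @fields = split ' ', $line;    # assumption: whitespace-separated fields
        next unless @fields >= 2;         # skip blank or one-field lines
        my $id = pop @fields;             # last column, sign preserved
        next unless $id =~ /^-?\d+$/;     # skip lines whose last column is not an integer
        push @{ $groups{ abs($id) } }, join(' ', $id, $line_no, @fields);
    }
    return \%groups;
}

# First three lines of the sample input.
my @sample = (
    '-59.077 89.301 115.664 7',
    '-61.251 77.435 117.760 -6',
    '-60.950 71.712 116.061 -7',
);

my $groups = group_by_abs_last_column(@sample);

# Write one file per absolute value, e.g. output_7.txt.
for my $key (sort { $a <=> $b } keys %$groups) {
    my $name = "output_$key.txt";
    open my $out, '>', $name or die "Cannot write $name: $!";
    print {$out} "$_\n" for @{ $groups->{$key} };
    close $out or die "Cannot close $name: $!";
}
```

To run it on a real file, the lines could be read with something like `open my $in, '<', 'input_file' or die $!; my @lines = <$in>;` and passed to the sub; the whitespace split also discards the trailing newline. If the data are strictly fixed-width and fields can touch each other, unpack would be the safer way to get the fields.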

This is what I've done so far:

Code 1 looks for one specific number, and it works fine. Well, it's probably very awkwardly written, but it works. As a test, I matched against the value "-7" to see if my code works.

#!/usr/bin/perl
use warnings;
use strict;

open my $target, '>', "test-out-1" or die $!;
open my $FILE, '<', 'input_file' or die $!;

while (<$FILE>) {
    chomp;
    my @columns = unpack('a8 a8 a8 a6');
    #print join(" ",map {$_} @columns), "\n";
    #print "@columns[3] \n";
    foreach (@columns[$#columns]) {
        print "$_ \n";
        if ($_ =~ /-7/) {
            my $ID = $_;
            my $IDform = sprintf ("%4s", $ID);
            my $currentline = $.;
            my $currentlineform = sprintf ("%7s", $currentline);
            my @selection = (@columns[0..$#columns-1]);
            my $layout = "%10s"x(@selection) . "\n";
            printf $target $IDform . $currentlineform . $layout, @selection;
        }
    }
}

This is part of my output:

-7 418 -17.459 -3.557 123.002
-7 419 -19.119 -2.327 121.948
-7 421 -18.172 -5.439 122.677
-7 423 -21.239 -5.003 128.245
-7 424 -17.575 -3.567 124.891
-7 425 -19.519 1.088 136.199
-7 426 -17.135 -5.042 124.510
-7 427 -19.539 -2.356 127.619
-7 429 -16.867 0.671 123.725
-7 430 -19.638 8.992 126.487
-7 431 -19.731 13.090 129.183
-7 432 -17.846 15.834 128.342
-7 440 -20.265 16.101 127.072

First column: the value of the input file's fourth column.

Second column: the line number where the matching 4th column pattern was, in the input file.

Rest of the columns: the rest of the columns of my input file, that correspond to the 4th column value.

Now I want code that works for all possible values of the input file's 4th column. The values always come in positive/negative pairs and are highly unlikely to fall outside the range 1-7, but to be safe I broadened the range and made an array with the values from -10 to 10 (this is @match), in an attempt to have my code work for every possible value.
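As a small aside on building that list, here is a sketch using Perl's range operator, together with a numeric comparison on the absolute value instead of a regex. The helper name same_abs is hypothetical, chosen only for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# The range operator builds the same list as typing -10, -9, ..., 10 by hand.
my @match = (-10 .. 10);

# Hypothetical helper: true when $id has exactly the absolute value $value.
# A numeric comparison avoids the regex pitfall where /7/ also matches 17 or -70.
sub same_abs {
    my ($id, $value) = @_;
    return abs($id) == $value;
}

print scalar(@match), " candidate values\n";
print "-7 matches 7\n"        if same_abs(-7, 7);
print "17 does not match 7\n" unless same_abs(17, 7);
```

Comparing numerically also keeps the signed value available for printing, since abs() returns a new number and leaves its argument unchanged.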

This is what I've done so far:

#!/usr/bin/perl
use warnings;
use strict;

my $match;
my @match = ();
# I fear this is what I'm doing wrong - I'm not putting $match in @match correctly.
push (@match, $match);
@match = (-10, -9, -8, -7, -6, -5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
# Here I'm trying to establish a variable that will basically be the absolute
# value of the fourth column of my input file, this is what I want to match.
my $value = abs($match);

open my $target, '>', "test-out-$value" or die $!;
open my $FILE, '<', 'input_file' or die $!;

while (<$FILE>) {
    chomp;
    my @columns = unpack('a8 a8 a8 a6');
    #print join(" ",map {$_} @columns), "\n";
    #print "@columns[3] \n";
    foreach (@columns[$#columns]) {
        if ($_ =~ /$value/) {
            my $ID = $_;
            # Also here I'm confused and I can't think about how to write what I want.
            # I want to match the absolute value (a few lines above), but print the
            # actual value (in my output file), not the absolute one. This worked
            # easily in my first code (where I matched for specific number) but now
            # I can't think how to write the general version.
            my $IDform = sprintf ("%4s", $ID);
            my $currentline = $.;
            my $currentlineform = sprintf ("%7s", $currentline);##
            my @selection = (@columns[0..$#columns-1]);
            my $layout = "%10s"x(@selection) . "\n";
            printf $target $IDform . $currentlineform . $layout, @selection;
        }
    }
}

From this, I get an uninitialized variable error for $match. I have tried to see how to fix this, but I'm doing it wrong and can't figure out what's wrong. Also, the output I'm getting is a zero-size file named "test-out-0".

This is a bit too much to be edited, I know, but if someone could at least let me know how I'm managing to populate the array wrong? I checked out the array functions (shift, unshift, pop, etc) and "push" seemed like the right way to go. If I get that corrected, I may be able to continue to fix the whole thing. Thank you so much for taking the time to read this and looking forward to some kindly offered hints and suggestions.
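On the array question, a short sketch of what appears to be going on: $match is declared but never assigned, so push appends a single undef to @match, the later list assignment replaces the array without ever touching $match, and abs($match) then warns and yields 0, which would explain the "test-out-0" file name. The warning handler and the %abs_seen hash below are only there for the demonstration, not part of the original code.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };   # collect warnings instead of printing them

my $match;                  # declared, never assigned: its value is undef
my @match = ();
push @match, $match;        # appends one undef element to @match
@match = (-10 .. 10);       # replaces the array contents; $match itself is still undef

my $value = abs($match);    # abs(undef) warns and yields 0
print "value = $value\n";
print "warning: $_" for @warnings;

# One way to take each array element in turn: loop over the array.
my %abs_seen;
for my $m (@match) {
    $abs_seen{ abs($m) } = 1;
}
my @abs_values = sort { $a <=> $b } keys %abs_seen;
print "absolute values: @abs_values\n";
```

The array and the scalar share a name but are unrelated variables in Perl, so filling @match never gives $match a value; a loop variable such as $m is one way to visit each element.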


In reply to divide multi-column input file into sub-files depending on specific column's value by angela2
