biobee07 has asked for the wisdom of the Perl Monks concerning the following question:

I have around 150 files, each with 4 columns but a different number of rows, that I need to combine column-wise. The files have no column in common. I want to use the Unix "paste" command to do it, but first I have to make all my files the same length.
Below are examples of my input files:

01-16A1-325 01-16A1-325 01-16A1-325 01-16A1-325
01-16A1-325 01-16A1-325 01-16A1-325 01-16A1-325
A T G C
11 47 0 1
11 47 0 0
11 48 0 0
12 50 0 0
12 53 0 0
13 56 0 0
13 60 0 0
13 62 0 0
13 63 0 0
13 64 0 0
13 66 0 0
14 68 0 0
14 70 0 0
14 72 0 0
=====================================
01-17B7-325 01-17B7-325 01-17B7-325 01-17B7-325
01-17B7-325 01-17B7-325 01-17B7-325 01-17B7-325
A T G C
52 57 56 9
53 59 58 9
54 62 62 10
57 67 69 10
61 73 80 10
68 81 95 11
74 87 112 11
81 92 127 11
86 95 141 11
92 99 157 11
98 105 178 12
104 112 205 12
110 119 237 13
116 126 271 13
121 131 307 12
126 136 346 11
133 142 389 11
141 151 439 10
151 162 494 10
161 177 556 11
171 196 622 11

I would appreciate any help. Thanks, Biobee

Replies are listed 'Best First'.
Re: filling up rows in files to make them of equal length
by johngg (Canon) on Aug 16, 2011 at 09:15 UTC

    You should probably consider using printf. You may need to scan all of your files first to determine maximum column widths.

    knoppix@Microknoppix:~/data$ perl -E '
    > open my $inFH, q{<}, \ <<EOD or die $!;
    > 01-16A1-325 01-16A1-325 01-16A1-325 01-16A1-325
    > 01-16A1-325 01-16A1-325 01-16A1-325 01-16A1-325
    > A T G C
    > 11 47 0 1
    > 11 47 0 0
    > 11 48 0 0
    > EOD
    >
    > while ( <$inFH> )
    > {
    >     printf qq{%15s%15s%15s%15s\n}, split;
    > }'
        01-16A1-325    01-16A1-325    01-16A1-325    01-16A1-325
        01-16A1-325    01-16A1-325    01-16A1-325    01-16A1-325
                  A              T              G              C
                 11             47              0              1
                 11             47              0              0
                 11             48              0              0
    knoppix@Microknoppix:~/data$
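
    The fixed width of 15 above was picked by eye. If the files need scanning first, as suggested, the maximum column widths could be collected along these lines. This is only a sketch: the helper names column_widths and format_for, the gap of 2 spaces, and the inline sample rows are assumptions made for illustration, not anything taken from the original data set.

```perl
use strict;
use warnings;
use List::Util qw(max);

# Hypothetical helper: returns the widest field length seen in each
# whitespace-separated column across all the given lines.
sub column_widths {
    my @lines = @_;
    my @width;
    for my $line (@lines) {
        my @field = split ' ', $line;
        for my $i ( 0 .. $#field ) {
            $width[$i] = max( $width[$i] // 0, length $field[$i] );
        }
    }
    return @width;
}

# Hypothetical helper: builds a printf format such as "%13s%13s\n",
# adding $gap spaces of padding to each column width.
sub format_for {
    my ( $gap, @width ) = @_;
    return join( q{}, map { '%' . ( $_ + $gap ) . 's' } @width ) . "\n";
}

# Small inline sample standing in for lines read from the real files.
my @sample = (
    '01-16A1-325 01-16A1-325 01-16A1-325 01-16A1-325',
    'A T G C',
    '11 47 0 1',
);

my $fmt = format_for( 2, column_widths(@sample) );
printf( $fmt, split ' ', $_ ) for @sample;
```

    With real data, the lines from all 150 files would be passed to column_widths before any file is printed, so that every file ends up with the same format string.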

    I hope this is helpful.

    Cheers,

    JohnGG

Re: filling up rows in files to make them of equal length
by duyet (Friar) on Aug 16, 2011 at 05:09 UTC

    Run wc -l * to find the line count of the longest file, then pad every shorter file with dummy rows until it reaches that length. Pseudo code:

    loop thru all files
        open file
        append dummy data to end of file
        close file
    end
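
    The pseudo code above might be fleshed out in Perl roughly as follows. Treat it as a sketch under stated assumptions: the filler row "NA NA NA NA" is a made-up placeholder, the helper names are hypothetical, and every file is assumed to end with a newline. It appends to the files in place, so it is safest to try it on copies first.

```perl
use strict;
use warnings;
use List::Util qw(max);

# Hypothetical filler row: four placeholder fields, one per column.
my $FILLER = 'NA NA NA NA';

# Count the lines in one file (the per-file equivalent of `wc -l`).
sub line_count {
    my ($file) = @_;
    open my $fh, '<', $file or die "Cannot read $file: $!";
    my $count = 0;
    $count++ while <$fh>;
    close $fh;
    return $count;
}

# Append filler rows to $file until it has $target lines.
# Assumes the file already ends with a newline.
sub pad_file {
    my ( $file, $target, $filler ) = @_;
    my $missing = $target - line_count($file);
    return if $missing <= 0;
    open my $fh, '>>', $file or die "Cannot append to $file: $!";
    print {$fh} "$filler\n" for 1 .. $missing;
    close $fh or die "Cannot close $file: $!";
    return;
}

# Pad every given file to the length of the longest one and
# return that length.
sub pad_all {
    my @files = @_;
    return unless @files;
    my $longest = max( map { line_count($_) } @files );
    pad_file( $_, $longest, $FILLER ) for @files;
    return $longest;
}

# Files to pad are taken from the command line.
pad_all(@ARGV);
```

    Saved under a name such as pad.pl (hypothetical), it could be run as perl pad.pl followed by the list of data files, after which paste can combine them row for row.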