snape has asked for the wisdom of the Perl Monks concerning the following question:
Hi, I have a file in the following format:
A class 1 1 1 1 1 1
A id 12 12 15 15 16 16
B class 0 0 0 0 0 0
B id 0 0 0 0 0 0 0
C id 0 0 0 0 0 0 0
M X1 1 2 2 2 2 2
M X2 2 1 1 1 1 1
M X3 1 2 2 2 2 2
M X4 2 2 2 1 1 2
M X5 1 1 1 1 1 1
M X6 1 1 1 2 2 1
M X7 1 1 1 1 1 1
M X8 1 1 1 1 1 1
M X9 1 1 1 1 1 1
M X10 1 1 1 1 1 1
M X11 1 2 1 1 1 1
M X12 2 2 2 2 2 2
M X13 2 1 2 2 2 2
M X14 2 1 2 2 2 2
M X15 1 2 1 1 1 1
M X16 1 1 2 2 2 2
M X17 1 2 2 2 2 2
This is a subset of a much larger file. I need some help writing a Perl program to manipulate it.

1. I would like to delete all rows EXCEPT the "A id" row followed by numbers and the "M X<number>" rows followed by numbers. The problem is how to read the file: it is too big to read entirely into an array and then loop over with foreach. How do I do this with a while loop, reading line by line? The output should be as follows:
A id 12 12 15 15 16 16
M X1 1 2 2 2 2 2
M X2 2 1 1 1 1 1
M X3 1 2 2 2 2 2
M X4 2 2 2 1 1 2
M X5 1 1 1 1 1 1
M X6 1 1 1 2 2 1
M X7 1 1 1 1 1 1
M X8 1 1 1 1 1 1
M X9 1 1 1 1 1 1
M X10 1 1 1 1 1 1
M X11 1 2 1 1 1 1
M X12 2 2 2 2 2 2
M X13 2 1 2 2 2 2
M X14 2 1 2 2 2 2
M X15 1 2 1 1 1 1
M X16 1 1 2 2 2 2
M X17 1 2 2 2 2 2
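One possible way to do the filtering line by line is sketched below. It is only a sketch under some assumptions: each row sits on its own line, fields are separated by whitespace, and the values after the label are non-negative integers. The names `keep_row` and `filter_rows` are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return 1 for rows to keep: "A id <numbers>" or "M X<number> <numbers>".
# Assumption: values are whitespace-separated non-negative integers.
sub keep_row {
    my ($line) = @_;
    return $line =~ /^\s*(?:A\s+id|M\s+X\d+)(?:\s+\d+)+\s*$/ ? 1 : 0;
}

# Copy only the wanted rows from one filehandle to another.
# The while loop reads a single line at a time, so the whole
# file is never held in memory.
sub filter_rows {
    my ($in, $out) = @_;
    while (my $line = <$in>) {
        print {$out} $line if keep_row($line);
    }
    return;
}

# Hypothetical usage: perl filter.pl input.txt > output.txt
if (!caller && @ARGV) {
    my $path = shift @ARGV;
    open my $in, '<', $path or die "Cannot open $path: $!";
    filter_rows($in, \*STDOUT);
    close $in;
}
```

Because the loop only ever looks at the current line, memory use stays flat regardless of file size. If the real file uses a different separator or allows other characters in the values, the regular expression in `keep_row` would need adjusting.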
2. Since I have only two values, i.e. 1 or 2, in both columns, I would like to compare the values that are identical at the same row position but belong to different columns. For example:
A id 12(F.C.) 12(S.C.) 15(F.C.) 15(S.C.) 16(F.C.) 16(S.C.)
M X1 1 2 2 2 2 2
M X2 2 1 1 1 1 1
M X3 1 2 2 2 2 2

This is a subset of the above data, where F.C. stands for First Column and S.C. for Second Column (included only to be more descriptive). Here we see that the second column of 12 is identical to the first and second columns of 15 and 16. Therefore, I would like to know the longest stretch over which two columns are identical. Similarly, I would like to do this for all the other columns. Note that I can't compare the first column of 12 with the second column of 12.

I would be obliged if you could help with this. Any ideas, and if possible snippets of code, will be highly appreciated. Thanks a lot.
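The column comparison could be approached along the following lines. This is a sketch under a stated interpretation, not a definitive answer: "longest stretch" is taken to mean the longest run of consecutive M rows in which two columns hold the same value, and only columns with different ids are paired. The sub name `longest_runs` and the `"i,j"` key format (zero-based column positions) are invented for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Streaming sketch: for every pair of columns with different ids, track
# the longest run of consecutive M rows where both columns are equal.
# Only per-pair counters are kept, never the rows themselves.
sub longest_runs {
    my ($in) = @_;
    my (@ids, %current, %best);
    while (my $line = <$in>) {
        my @f = split ' ', $line;
        next unless @f > 2;
        if ($f[0] eq 'A' && $f[1] eq 'id') {
            @ids = @f[2 .. $#f];    # column ids, e.g. 12 12 15 15 16 16
            next;
        }
        next unless $f[0] eq 'M' && @ids;
        my @v = @f[2 .. $#f];
        for my $i (0 .. $#ids - 1) {
            for my $j ($i + 1 .. $#ids) {
                next if $ids[$i] eq $ids[$j];    # skip F.C. vs S.C. of one id
                my $key = "$i,$j";
                if (defined $v[$i] && defined $v[$j] && $v[$i] eq $v[$j]) {
                    $current{$key}++;
                    $best{$key} = $current{$key}
                        if $current{$key} > ($best{$key} || 0);
                }
                else {
                    $current{$key} = 0;          # run broken, start over
                }
            }
        }
    }
    return (\@ids, \%best);
}

# Hypothetical usage: perl runs.pl filtered.txt
if (!caller && @ARGV) {
    open my $in, '<', $ARGV[0] or die "Cannot open $ARGV[0]: $!";
    my ($ids, $best) = longest_runs($in);
    close $in;
    for my $key (sort keys %$best) {
        my ($i, $j) = split /,/, $key;
        print "column $i (id $ids->[$i]) vs column $j (id $ids->[$j]): ",
              "$best->{$key} row(s)\n";
    }
}
```

The work per row grows with the square of the number of columns, so this suits files with many rows and a moderate number of columns. Pairs that never match do not appear in the result. If the start and end rows of each stretch are needed as well, the counters could be extended to remember the row label where the current run began.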
Re: Column Comparison of a File in Perl
by bv (Friar) on Jan 19, 2010 at 21:43 UTC
Re: Column Comparison of a File in Perl
by toolic (Bishop) on Jan 19, 2010 at 21:46 UTC
by snape (Pilgrim) on Jan 19, 2010 at 22:06 UTC