Re: Combining matching lines in a TAB seperated textfile

Re^2: Combining matching lines in a TAB seperated textfile by Wobbel (Acolyte) on Apr 27, 2011 at 14:04 UTC
Correct! But what is the right tool for this problem? I've seen fantastic Perl one-liners, but it's probably more elegant to work with a "new" technique (I'm afraid of hashes :-) ). But seriously, I don't mind learning something new, as long as I don't waste several days wandering in the fog. It's more about "the noble art of programming". Show me the way, and I'll try something new! (But elegant Perl one-liners are always welcome...) Summary: skip the correct single lines, recognize the right pairs (match on 4 columns), and transform each pair into one correct single line (with the right vrt, lng, lat values). The output text file contains only correct single lines.
Re^3: Combining matching lines in a TAB seperated textfile by Eliya (Vicar) on Apr 27, 2011 at 16:00 UTC
You still haven't really answered what exactly the problem is, so I'm not going to provide a directly usable solution either :)

It's okay to be looking for new, elegant, or whatever techniques, but before that keeps you from getting the work done, you could start with the basics you're familiar with and see how far you get... If you run into a roadblock or feel things are getting unwieldy, you can still look for fancier ways around it. And if you'd like to know whether there's a more idiomatic/elegant/faster/etc. way than what you've eventually come up with, nothing keeps you from presenting your work here and asking for comments.

That said, here's my take on it as a starting point. It handles a simplified case (fewer columns), and as I wasn't entirely sure whether you want to skip or keep the 'single' lines, I chose to pass them through. (The columns in the __DATA__ section are tab-separated.)

    #!/usr/bin/perl -w
    use strict;

    use constant {   # column indices
        FOO => 0,
        BAR => 1,
        LEN => 2,
    };

    my @col1;   # '1st-line-of-pair' buffer

    while (<DATA>) {                 # read line
        chomp;
        my @col2 = split /\t/;       # split line on tabs

        if (@col1) {                 # two lines read, i.e. pair available?

            if (     $col1[FOO] eq $col2[FOO]
                 and $col1[BAR] eq $col2[BAR] ) {   # is pair matching?

                # average length
                $col1[LEN] = sprintf "%.1f", ($col1[LEN] + $col2[LEN]) / 2;

                write_out(@col1);    # write out modified/merged line
                @col1 = ();          # clear buffer
                next;                # skip rest
            } else {
                write_out(@col1);    # write out non-paired line
            }
        }
        @col1 = @col2;               # store line (previous=current)
    }
    write_out(@col1) if @col1;       # take care of last line

    sub write_out {
        print join("\t", @_), "\n";
    }

    __DATA__
    abc	def	3.5
    abc	def	4.5
    ghi	jkl	13.2
    mno	pqr	2.8
    mno	pqr	2.4
    stu	vwx	10.0

Output:

    abc	def	4.0
    ghi	jkl	13.2
    mno	pqr	2.6
    stu	vwx	10.0
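For the "match on 4 columns" case from the summary, the same adjacent-pair idea could be driven by lists of column indices instead of named constants. This is only a sketch: the names `@KEY_COLS`, `@AVG_COLS`, `same_key`, `merge_pair` and `combine` are made up for illustration, the column positions are guesses, and averaging is just carried over from the simplified example above (the thread never says how the vrt, lng, lat values should really be combined).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical layout: the first four columns identify a pair and the
# fifth is averaged. The real column positions are not given in the
# thread, so both lists are assumptions to adjust.
my @KEY_COLS = (0, 1, 2, 3);
my @AVG_COLS = (4);

# True if two rows (array refs) agree on every key column.
sub same_key {
    my ($x, $y) = @_;
    for my $i (@KEY_COLS) {
        return 0 if $x->[$i] ne $y->[$i];
    }
    return 1;
}

# Return a copy of the first row with each averaged column replaced.
sub merge_pair {
    my ($x, $y) = @_;
    my @out = @$x;
    $out[$_] = sprintf "%.1f", ($x->[$_] + $y->[$_]) / 2 for @AVG_COLS;
    return \@out;
}

# Walk the rows in order, merging adjacent matching pairs and
# passing unpaired rows through unchanged.
sub combine {
    my @rows = @_;
    my @result;
    while (@rows) {
        my $cur = shift @rows;
        if (@rows and same_key($cur, $rows[0])) {
            $cur = merge_pair($cur, shift @rows);
        }
        push @result, $cur;
    }
    return @result;
}

# Small demo with made-up data: one matching pair, one single line.
my @rows = map { [ split /\t/ ] } (
    "a\tb\tc\td\t3.5",
    "a\tb\tc\td\t4.5",
    "e\tf\tg\th\t13.2",
);
print join("\t", @$_), "\n" for combine(@rows);
```

With 23 columns only the two index lists would change; the comparison loop stays the same. Dropping the single lines instead of keeping them would mean skipping the `push` when no merge happened.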
Re^4: Combining matching lines in a TAB seperated textfile by Wobbel (Acolyte) on Apr 27, 2011 at 19:27 UTC
Wow! I think I can see through the mist (sorry, my English is very bad). I recognize most of the code, but there is some syntax I have to investigate (sprintf "%.1f"). Is it no problem to use 23 column indices and 10,000 lines? I have a "comparable" Perl snippet that reads a text logfile and generates an HTML/CSS page, so I think the hill is not too steep. Thanks for the useful advice! And if it works... what kind of construction would a Perl expert use? I'll never be a pro, but I'm eager to learn a little bit more every day! (Wobbel, buy a Navigator...)
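On the sprintf "%.1f" question, a tiny sketch may help: the format string asks for a fixed-point number with exactly one digit after the decimal point. The helper name `one_decimal` is invented here purely for illustration. As for the 10,000 lines, the script above only ever holds the previous and the current line in memory, so the line count by itself should not be a concern.

```perl
use strict;
use warnings;

# "%.1f" formats a number as fixed-point with one digit after the
# decimal point; one_decimal is a hypothetical wrapper for the demo.
sub one_decimal {
    return sprintf "%.1f", $_[0];
}

print one_decimal((3.5 + 4.5) / 2), "\n";   # prints 4.0
print one_decimal(3.14159), "\n";           # prints 3.1
print one_decimal(13.2), "\n";              # prints 13.2
```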
Re^5: Combining matching lines in a TAB seperated textfile by Eliya (Vicar) on Apr 27, 2011 at 19:59 UTC
Re^5: Combining matching lines in a TAB seperated textfile by Wobbel (Acolyte) on May 11, 2011 at 11:10 UTC