in reply to Perl solution for current batch file to extract specific column text

Hi oryan,

A number of people will tell you that this site is not a free code-writing service and that you should show some effort at doing things yourself, and I agree.

Having said that, even if you did not show any Perl code, I appreciate that you have shown code in another scripting language. I consider this a real effort. And, although I do not have any benchmark, I am fairly sure that a Perl program will be much, much faster than a DOS batch script (you don't say so, but I guess your script is a DOS batch file).

So, you should really look at the links that were provided to you, but I am willing to help and I can give you the skeleton of a Perl program for what you want (quick draft, untested):

    use strict;
    use warnings;

    my $input = "input.txt";
    open my $IN, "<", $input or die "cannot open $input $!";
    my $output = "output.txt";
    open my $OUT, ">", $output or die "cannot open $output $!";

    while (my $line = <$IN>) {
        chomp $line;    # remove newline character from end of line
        if ($line =~ /INTERPOLATED HYDROGRAPH AT (\w+)$/){
            print $OUT $1;
            next for 1..5;    # skip 5 lines
            my $val2 = (split / /, $line)[1];    # get the second column
            print $OUT $val2;    # a separator may be needed here
            next;    # skip one line
            my $val3 = (split / /, $line)[-1];    # get the last column
            print $OUT "$val3\n";    # a separator may be needed here
        }
    }

This is very rough and untested, but I hope it will get you going.

Update: There are two errors in the code above (thanks to poj for pointing them out). Please see the corrected version below (Re^3: Perl solution for current batch file to extract specific column text).

Re^2: Perl solution for current batch file to extract specific column text
by poj (Abbot) on Aug 04, 2015 at 10:15 UTC
    This prints line 1 twice:

        #!perl
        use strict;
        while (my $line = <DATA>){
          if ($line =~ /line 1/){
            print $line;
            next for 1..5; # skip 5 lines
            print $line;
          }
        }
        __DATA__
        line 1
        line 2
        line 3
        line 4
        line 5
        line 6
        line 7

    Perhaps you meant

        #!perl
        use strict;
        while (my $line = <DATA>){
          if ($line =~ /line 1/){
            print $line;
            <DATA> for 1..5;
            $line = <DATA>;
            print $line;
          }
        }

    poj
      Yes, poj, you're right, there were two mistakes.

      This is the amended (and tested) version:

          use strict;
          use warnings;

          my $input = "input.txt";
          open my $IN, "<", $input or die "cannot open $input $!";
          my $output = "output.txt";
          open my $OUT, ">", $output or die "cannot open $output $!";

          while (my $line = <$IN>) {
              chomp $line;    # remove newline character from end of line
              if ($line =~ /INTERPOLATED HYDROGRAPH AT (\w+)$/){
                  print $OUT $1;
                  $line = <$IN> for 1..6;    # skip 5 lines and read the 6th
                  my $val2 = (split / /, $line)[1];    # get the second column
                  print $OUT " $val2";
                  $line = <$IN> for 1..2;    # skip one line and read the next
                  chomp $line;
                  my $val3 = (split / /, $line)[-1];    # get the last column
                  print $OUT " $val3\n";
              }
          }

      And this is the output:
          $ cat output.txt
          CAC40 1223. 1456.

      I should add that this is not very robust code; it will probably break with any irregularity in the input data. But we would need to know more about the data (at least three or four samples of line groups, instead of only one) to be able to do something more robust.
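
For what it is worth, here is a minimal sketch of what a slightly more defensive variant might look like. It is only an illustration under stated assumptions: the sub names `read_nth_line` and `extract_records` are made up for this example, the line offsets (6 and 2) are simply copied from the tested version above, and the sample data is invented, since only one real line group has been shown. It also uses `split ' '` rather than `split / /`, which ignores leading whitespace and collapses runs of spaces, so the column indices may need adjusting for the real file.

```perl
use strict;
use warnings;

# Read $n lines from $fh and return the last one, chomped.
# Returns undef if the file ends before $n lines could be read.
sub read_nth_line {
    my ($fh, $n) = @_;
    my $line;
    for (1 .. $n) {
        $line = <$fh>;
        return unless defined $line;    # premature end of file
    }
    chomp $line;
    return $line;
}

# Return one "name val2 val3" string per complete line group in $fh.
sub extract_records {
    my ($fh) = @_;
    my @records;
    while (my $line = <$fh>) {
        chomp $line;
        next unless $line =~ /INTERPOLATED HYDROGRAPH AT (\w+)\s*$/;
        my $name = $1;

        # Offsets copied from the tested version above (assumption).
        my $line2 = read_nth_line($fh, 6);
        last unless defined $line2;     # truncated group: stop quietly
        my $line3 = read_nth_line($fh, 2);
        last unless defined $line3;

        my @f2 = split ' ', $line2;     # whitespace-tolerant split
        my @f3 = split ' ', $line3;
        if (@f2 < 2 or !@f3) {
            warn "skipping malformed group for $name\n";
            next;
        }
        push @records, "$name $f2[1] $f3[-1]";
    }
    return @records;
}

# Invented sample data, read through an in-memory filehandle.
my $sample = <<'END';
INTERPOLATED HYDROGRAPH AT CAC40
filler 1
filler 2
filler 3
filler 4
filler 5
PEAK 1223. other
filler 7
TOTAL 5 1456.
END

open my $fh, '<', \$sample or die "cannot open in-memory file: $!";
print "$_\n" for extract_records($fh);
```

With the real file, you would open `input.txt` as before and pass that filehandle to `extract_records` instead. Whether a truncated or malformed group should be skipped, as here, or treated as a fatal error depends on the data, which we still have not seen.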