WeeDie has asked for the wisdom of the Perl Monks concerning the following question:

I'm trying to merge several .ktf track files for Kartex. The script is supposed to take the first file in a directory and append the other files to it, excluding their headers.

This is what a .ktf file looks like:
//Kartex TrackFil skapad av Kartex 5.4
&KTF 2.0,sweref 99 lat long,1
%,0,N59°3.577' E13°2.460',53.97,2009-09-16 10:05:38,,,$
%,1,N59°3.577' E13°2.460',55.89,2009-09-16 10:05:41,,,$
%,2,N59°3.579' E13°2.457',55.41,2009-09-16 10:05:45,,,$
%,3,N59°3.581' E13°2.432',55.41,2009-09-16 10:06:02,,,$
%,4,N59°3.579' E13°2.422',56.37,2009-09-16 10:06:09,,,$
%,5,N59°3.579' E13°2.417',56.37,2009-09-16 10:06:13,,,$

I've come up with the following code:
use File::Find;
my $dir = Cwd::getcwd();
find(\&wanted,$dir);

sub wanted {
    if ($_ != /\.ktf$/) {
        print "$File::Find::name";
        open (GPXDATA, "$File::Find::name") || die("Could not open file! $File::Find::name");
        my(@raw_data)=<GPXDATA>;
        open (MYFILE, '>>D:\mergedfile.ktf');
        for (@raw_data) {
            print MYFILE m"(%,0,.+)"mgs
        }
        close (MYFILE);
    }
}

This takes the first line after the header from every file and appends it to D:\mergedfile.ktf. I was thinking I could add the header to the merged file manually later. How do I best get it to take everything following the header:
//Kartex TrackFil skapad av Kartex 5.4
&KTF 2.0,sweref 99 lat long,1
until the end of the file, and merge it all into one file?

Replies are listed 'Best First'.
Re: Match multiple lines
by ikegami (Patriarch) on Sep 19, 2009 at 15:25 UTC

    How do I best get it to take everything following the header:

    <$fh_in>;  # Skip header
    print $fh_out $_ while <$fh_in>;

    Or if you don't mind loading the entire input file into memory:

    <$fh_in>;  # Skip header
    {
        local $/;
        print { $fh_out } <$fh_in>;
    }

    By the way, '.' indicates the current directory, so there's no need to load Cwd.

    By the way, $_ != /\.ktf$/ makes no sense. You probably meant $_ !~ /\.ktf$/ (long for !/\.ktf$/). But isn't that inverted? I think you want /\.ktf$/ (short for $_ =~ /\.ktf$/).
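
A possible explanation for why the original test seemed to work, offered as a sketch rather than a definitive diagnosis: `!=` compares numbers, a name such as `track.ktf` is treated as the number 0, and a match used as a number is 1 on success and 0 on failure. So the comparison is `0 != 1` (true) for names that match and `0 != 0` (false) for names that don't. The helper name `accidental_test` and the file names are made up for illustration.

```perl
use strict;
use warnings;
no warnings 'numeric';  # the != comparison would otherwise warn about non-numeric strings

# Hypothetical helper: evaluates the original condition for one file name.
sub accidental_test {
    local $_ = shift;
    return ($_ != /\.ktf$/) ? 1 : 0;
}

# Hypothetical file names, for illustration only.
for my $name ('track.ktf', 'notes.txt', '1track.ktf') {
    my $accidental = accidental_test($name) ? 'yes' : 'no';
    my $intended   = $name =~ /\.ktf\z/     ? 'yes' : 'no';
    print "$name: != says $accidental, =~ says $intended\n";
}
```

The two tests agree for the first two names but disagree for `1track.ktf`, because that name is treated as the number 1 and `1 != 1` is false. That is why the regex match on its own is the safer test.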

    So you get:

    use strict;
    use warnings;
    use File::Find qw( find );

    open(my $fh_out, '>', 'D:\\mergedfile.ktf')
        or die("Can't create output file \"D:\\mergedfile.ktf\": $!\n");

    my $first = 1;
    find(sub {
        return if !/\.ktf\z/;
        print("$File::Find::name\n");
        open(my $fh_in, '<', $File::Find::name)
            or die("Can't open file \"$File::Find::name\": $!\n");
        <$fh_in> if !$first;  # Skip header of every file but the first
        $first = 0;
        print $fh_out $_ while <$fh_in>;
    }, '.');

      I think other Kartex files also have headers, and that each header counts as two lines. So something like:
      find(sub {
          return if !/\.ktf\z/;
          print("$File::Find::name\n");
          open(my $fh_in, '<', $File::Find::name)
              or die("Can't open file \"$File::Find::name\": $!\n");
          while (my $line = <$fh_in>) {
              next if $. <= 2;  # Skip the two header lines
              print $fh_out $line;
          }
      }, '.');

        Yes, exactly. The header is the first two lines of each file.

        I want a script that takes all lines following the header, from all .ktf files in the current directory, and merges them into one file, ultimately with the header added to it.

        I'm sorry, I don't understand the $fh_in and $fh_out variables. This is my first Perl script >_<

        If someone could give me a more detailed reply on how to read and merge all lines following the first two lines of every .ktf file in a directory, and write them to a file, I would be eternally grateful! :)
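
On the question about `$fh_in` and `$fh_out`: they are ordinary scalar variables that hold filehandles, doing the same job as the bareword handles `GPXDATA` and `MYFILE` in the original script. The following is a minimal sketch of the whole merge, not a definitive implementation. It assumes every .ktf file starts with exactly two header lines, that the header of the first file should be kept, and that files are merged in sorted name order. The name `merge_ktf` is hypothetical.

```perl
use strict;
use warnings;

# merge_ktf is a hypothetical helper name, not part of any module.
# Assumes every .ktf file starts with exactly two header lines.
sub merge_ktf {
    my ($dir, $out_path) = @_;

    # List the directory and keep only names ending in .ktf.
    opendir(my $dh, $dir) or die("Can't read directory \"$dir\": $!\n");
    my @files = sort map { "$dir/$_" } grep { /\.ktf\z/ } readdir($dh);
    closedir($dh);

    # Never use the merged file as one of its own inputs (plain string
    # comparison, so pass the output path in the same "$dir/name" form).
    @files = grep { $_ ne $out_path } @files;

    # $fh_out is a write handle, the counterpart of MYFILE.
    open(my $fh_out, '>', $out_path)
        or die("Can't create \"$out_path\": $!\n");

    my $first = 1;
    for my $file (@files) {
        # $fh_in is a read handle, the counterpart of GPXDATA.
        open(my $fh_in, '<', $file)
            or die("Can't open \"$file\": $!\n");

        while (my $line = <$fh_in>) {
            # $. is the line number within the handle last read from.
            # Skip the two header lines in every file except the first.
            next if $. <= 2 && !$first;
            print $fh_out $line;
        }
        close($fh_in);
        $first = 0;
    }

    close($fh_out) or die("Can't finish writing \"$out_path\": $!\n");
    return scalar @files;  # number of files merged
}
```

Under these assumptions, calling `merge_ktf('.', './merged.ktf')` would merge the track files in the current directory. Opening the output with `'>'` rather than `'>>'` means a second run replaces the merged file instead of appending to it.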

      Thank you for your reply, but I think you misunderstood me. I need a regex pattern that matches everything from the end of the header to the end of the file. The file contains multiple lines. The script currently matches only the first line following the header in each file and appends those lines to a single file. I don't want the pattern to stop at the first newline; I want it to match all characters, including newlines. I've also been confused by the if ($_ != /\.ktf$/) code, but it's the only thing that worked. It returns true if the filename ends with .ktf. It looks inverted to me though... anyway, it works so...
        For example, these are the results of running the script in a directory containing 10 ktf files:
        %,0,N59°3.248' E13°2.656',65.99,2009-09-16 09:51:42,,,$
        %,0,N59°3.577' E13°2.460',53.97,2009-09-16 10:05:38,,,$
        %,0,N59°3.576' E13°2.462',55.89,2009-09-16 10:07:51,,,$
        %,0,N59°3.648' E13°2.924',55.89,2009-09-16 10:14:10,,,$
        %,0,N59°3.652' E13°2.937',55.41,2009-09-16 10:19:17,,,$
        %,0,N59°3.454' E13°3.092',55.89,2009-09-16 10:25:05,,,$
        %,0,N59°3.453' E13°3.467',58.30,2009-09-16 10:31:30,,,$
        %,0,N59°3.451' E13°3.466',57.34,2009-09-16 10:36:06,,,$
        %,0,N59°3.473' E13°3.605',60.70,2009-09-16 10:39:35,,,$
        %,0,N59°3.452' E13°3.467',56.37,2009-09-16 10:42:49,,,$

        The regex pattern stops at newline instead of matching until the end of file. I've tried the m modifier with no luck...

        Example: print MYFILE /^(%,0,.*)/ms
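
A likely reason the modifiers make no difference: `for (@raw_data)` runs the match once per array element, and each element holds a single line, so there is nothing past the newline for the pattern to reach. Reading the file into one string first gives `.*` with `/s` something to span. This is a minimal sketch assuming the first record always begins with `%,0,`; the name `records_after_header` is hypothetical.

```perl
use strict;
use warnings;

# records_after_header is a hypothetical name used for illustration.
sub records_after_header {
    my ($fh) = @_;

    # With $/ undefined, a single read returns the rest of the file as one string.
    my $content = do { local $/; <$fh> };
    return '' unless defined $content;

    # /m lets ^ match at the start of any line; /s lets . match newlines,
    # so .* runs from the first "%,0," record to the end of the string.
    return $content =~ /^(%,0,.*)/ms ? $1 : '';
}
```

The returned string could then be printed to the output handle in one go. Skipping the header lines by count, as suggested elsewhere in this thread, avoids depending on the `%,0,` prefix at all.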

Re: Match multiple lines
by zude (Scribe) on Sep 20, 2009 at 01:00 UTC
    Try:
    my(@raw_data)=<GPXDATA>;  # read file into array, one line per element
    splice @raw_data,0,2;     # delete the first two lines

    --------
    ~%{${@_[0]}}->{0}&&+++ NO CARRIER

      Thank you! This is what I was able to come up with:
      use File::Find;
      my $dir = Cwd::getcwd();
      my $string = "//Kartex TrackFil skapad av Kartex 5.4\n&KTF 2.0,sweref 99 lat long,1\n";

      open (MYFILE, '>>merged.ktf');
      print MYFILE $string;
      close (MYFILE);

      find(\&wanted,$dir);

      sub wanted {
          if ($_ != /\.ktf$/) {
              print "$File::Find::name\n";
              open (KTFDATA, "$File::Find::name") || die("Could not open file! $File::Find::name");
              my (@raw_data) = <KTFDATA>;
              splice @raw_data,0,2;
              close (KTFDATA);
              open (MYFILE, '>>merged.ktf');
              print MYFILE @raw_data;
              close (MYFILE);
          }
      }