in reply to Extracting coordinates

Using strictures (use strict; use warnings;) is very good. Checking the result of open is very good.

Declaring all your variables up front in one hit is bad. Not using the three parameter version of open is bad. Not using lexical file handles is bad. Using global variables in subs is very bad.
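A minimal sketch of the three habits named above: the three-parameter open, a lexical file handle, and a sub that is handed what it needs as a parameter instead of reading a global. The sub name countLines and the scratch file are made up for illustration and are not from the original script.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical scratch file, created only so the example has something to read
my ($tmpHandle, $tmpName) = tempfile(UNLINK => 1);
print {$tmpHandle} "first line\nsecond line\n";
close $tmpHandle or die "Can't close $tmpName: $!";

# The file name arrives as a parameter rather than through a global variable
sub countLines {
    my ($fileName) = @_;

    # Three-parameter open with a lexical file handle
    open my $inFile, '<', $fileName or die "Can't open $fileName: $!";

    my $count = 0;
    while (defined (my $line = <$inFile>)) {
        $count++;
    }

    close $inFile;
    return $count;
}

print countLines($tmpName), "\n";
```

Variables are declared where they are first needed, so each one has the smallest scope that works.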

In fact the fileswitch() sub isn't needed. Unless you have multiple files with the same name (which most OSes don't allow), each pass through the outer loop handles a new file.

The regex can be simplified - no need to capture 6 different parts that you are just going to glue back together again.
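To illustrate the point, here is a small comparison. The six-capture pattern is only a guess at the general shape of such a regex (sign, integer part and fraction for each number), not the original poster's actual code. The two-capture pattern takes each signed number whole.

```perl
use strict;
use warnings;

my $line = "Der.7-539/99.10 5348.3300 -7020.0900 0.0000\n";

# Hypothetical six-capture version: the pieces have to be glued back together
my $pieces = '';
if ($line =~ /\s+(-?)(\d+)\.(\d+)\s+(-?)(\d+)\.(\d+)/) {
    $pieces = "$1$2.$3 || $4$5.$6";
}

# Two captures, each holding a complete signed number
my $whole = '';
if ($line =~ /\s+(-?\d+\.\d+)\s+(-?\d+\.\d+)/) {
    $whole = "$1 || $2";
}

print "$pieces\n$whole\n";
```

Both versions build the same string for this line, so the extra captures buy nothing here.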

Cleaning the code up and changing the file handling to suit sample code we get:

#!/usr/bin/perl
use strict;
use warnings;
use diagnostics;

# Fake up a couple of data files
my %dataFiles = (
    data1 => <<DATA1,
DER.7-767/04.7 5194.5700 -6772.5200 0.0000
DER.7-767/04.8 5194.7400 -6776.3200 0.0000
DER.7-767/04.9 5192.1000 -6776.4300 0.0000
Der.7-539/99.1 5337.9000 6997.1200 0.0000
Der.7-539/99.10 5348.3300 -7020.0900 0.0000
Der.7-539/99.11 5348.4400 -7021.1100 0.0000
DATA1
    data2 => <<DATA2,
Kredyt3.27 5789322.3040 7500854.8800 0.0000
Kredyt3.27a -124.9646 373.4666 0.0000
Kredyt3.28 5789295.3170 7500857.7380 0.0000
Kredyt3.28a -151.9768 376.3191 0.0000
Kredyt3.29 5789298.8620 7500874.6180 0.0000
Kredyt3.29a -148.4337 393.2154 0.0000
Kredyt3.2a -63.0262 297.6930 0.0000
Kredyt3.3 5789369.8750 7500785.7170 0.0000
Kredyt3.30 5789303.2010 7500873.9300 0.0000
Kredyt3.30a -144.0905 392.5281 0.0000
Kredyt3.31 5789302.7240 7500869.9080 0.0000
Kredyt3.31a -144.5668 388.5023 0.0000
Kredyt3.32 5789307.5930 7500869.2210 0.0000
Kredyt3.32a -139.6932 387.8161 0.0000
Kredyt3.33 5789307.9110 7500871.6550 0.0000
Kredyt3.33a -139.3756 390.2524 0.0000
DATA2
);

my $listA = "data1\ndata2\n";
my @coords;

#opens a list of the files that need processing
open my $inFile, '<', \$listA or die "(L1)We've got a problem: $!";

while (defined (my $filename = <$inFile>)) {
    chomp $filename;
    next if !length $filename;
    push @coords, "Path: $filename\n";

    #opens the actual file that will be processed
    open my $inData, '<', \$dataFiles{$filename} or die "(L2)We've got a problem: $!";

    while (defined (my $line = <$inData>)) {
        next if $line !~ /\s+(-?\d+\.\d+)\s+(-?\d+\.\d+)/;

        #Append the coordinates (with the '-' sign where appropriate)
        push (@coords, "$1 || $2 \n");
    }

    close ($inData);
}

close ($inFile);
print @coords;

Prints:

Path: data1
5194.5700 || -6772.5200
5194.7400 || -6776.3200
5192.1000 || -6776.4300
5337.9000 || 6997.1200
5348.3300 || -7020.0900
5348.4400 || -7021.1100
Path: data2
5789322.3040 || 7500854.8800
-124.9646 || 373.4666
5789295.3170 || 7500857.7380
-151.9768 || 376.3191
5789298.8620 || 7500874.6180
-148.4337 || 393.2154
-63.0262 || 297.6930
5789369.8750 || 7500785.7170
5789303.2010 || 7500873.9300
-144.0905 || 392.5281
5789302.7240 || 7500869.9080
-144.5668 || 388.5023
5789307.5930 || 7500869.2210
-139.6932 || 387.8161
5789307.9110 || 7500871.6550
-139.3756 || 390.2524

True laziness is hard work

Re^2: Extracting coordinates
by Ignas (Novice) on Mar 21, 2010 at 11:02 UTC

    Woah mister, thank you greatly.

    I don't quite understand what some of this does, but I believe I'll figure it out with some docs and google.

    Again, thank you.

      Don't get too worried over the less obvious bits of the file handling stuff. Perl lets you use a variable as a file by using a reference to it in the open:

      my $stringBeingAFile = "This is the contents of the stringy file\n";
      open my $inFile, '<', \$stringBeingAFile;

      which I use in the sample script to avoid having to create temporary files for demonstration purposes. In the current case it's slightly more complicated because you want to deal with multiple files so I use a hash that is pretending to be a directory of files (really a hash of file name => file content pairs).

      But, as I suggested, you needn't worry about that. So long as you can see past the first 36 lines and ignore the "use a string as a file" syntax in the opens, the remainder of the code is the important bit.


Re^2: Extracting coordinates
by wfsp (Abbot) on Mar 21, 2010 at 10:17 UTC
    In cases like this, imo, it may be worth declaring/compiling the regex outside of any loops.
    my $re = qr{\s+(-?\d+\.\d+)\s+(-?\d+\.\d+)};

    # and then later, inside the loops
    next if $line !~ $re;
    You would need to do benchmarks to know whether it was actually worth it. The docs are a tad circumspect. :-)
      You would need to do benchmarks to know whether it was actually worth it.

      No, you wouldn't. Since there are no variables in the regex, it makes no difference.
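For anyone who wants to measure it rather than take either side on faith, here is a rough sketch using the core Benchmark module. The sub names matchLiteral and matchCompiled, the sample line, and the repeat count of 1000 are all made up for illustration. Timings will vary by machine and Perl version, so no result is claimed here.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Made-up test data: one sample line repeated 1000 times
my @lines = ("Kredyt3.27a -124.9646 373.4666 0.0000\n") x 1000;

# Regex compiled once, outside any loop
my $re = qr{\s+(-?\d+\.\d+)\s+(-?\d+\.\d+)};

# Literal regex written inside the loop
sub matchLiteral {
    my $matched = 0;
    for my $line (@lines) {
        $matched++ if $line =~ /\s+(-?\d+\.\d+)\s+(-?\d+\.\d+)/;
    }
    return $matched;
}

# Precompiled qr// object used inside the loop
sub matchCompiled {
    my $matched = 0;
    for my $line (@lines) {
        $matched++ if $line =~ $re;
    }
    return $matched;
}

# A negative count asks Benchmark to run each sub for at least that many CPU seconds
cmpthese(-1, {
    literal  => \&matchLiteral,
    compiled => \&matchCompiled,
});
```

cmpthese prints a small table of the rate for each sub and the percentage difference between them.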