physi has asked for the wisdom of the Perl Monks concerning the following question:

Brothers,
I'm currently 'perling' a classic perl task, but I'm not quite sure if there is a much better way. Maybe some of you have another idea to do this:
I got a big file with about 20 different 'lists' in it. Each list must be broken into pieces, exactly by a given template. I got the 'coordinates' of the different Tag-Regions.
i.e.:
List 100:
W01 625 Z01 WWGZU Q19 0.343
S02 W222G02 XXXX9 B11 Mr.
        G03 uu 1      Adams
This has to be converted into :
W01 625
Z01 WWGZU 
Q19 0.343
S02 W222
G02 XXXX9 
B11 Mr.|Adams
G03 uu 1
So my idea was to take one whole list, put it into an $hash->{col}->{row} hash, and take out all the defined tags by their coordinates. This works, but it's a bit slow. The hash-filling takes a lot of time, I guess:
my $line = 0;
while (<FILE>) {
    my @linesplit = split //, $_;
    my $i = 0;
    for my $column (@linesplit) {
        $list->{$line}->{$i} = $column;
        $i++;
    }
    $line++;
}
But maybe there is a much better way for this kind of task, or even a cpan-module?
Thanks for any suggestions.
----------------------------------- --the good, the bad and the physi-- -----------------------------------


updated by boo_radley: Title change

Re: Extracting from a File
by tadman (Prior) on May 15, 2002 at 17:18 UTC
    When you are extracting from something that uses fixed length columns, sometimes unpack is the way to go. Using the example you have given, you can get the column data with something like this:
    my @cols = unpack ("A4 A4 A4 A6 A4 A5", $_);
    This is probably a lot easier than working with individual characters, which is what you get when using split (//).
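
A minimal sketch of that unpack call applied to the first line of the sample list. The pairing of fields into a %tag hash is only an assumption about how the result might be used; it is not part of the original suggestion.

```perl
use strict;
use warnings;

# First line of "List 100" from the question.
my $line = 'W01 625 Z01 WWGZU Q19 0.343';

# Each "A<n>" takes n characters and strips trailing spaces.
my @cols = unpack 'A4 A4 A4 A6 A4 A5', $line;

# Assumption for illustration: the fields alternate tag, value.
my %tag = @cols;

print "$_ $tag{$_}\n" for qw(W01 Z01 Q19);
```

With the sample line this prints the three pairs "W01 625", "Z01 WWGZU" and "Q19 0.343", one per line.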

    As for how you choose to organize your data, that's up to you.
      Thanks, but I have to work with individual chars, because the problem is that some 'tags' are like blocks. For example: starting in line 5, column 12, length 10 and continuing on line 6, column 12, length 10, and so on. So there is no chance to do it by an unpack :-(
      cheers
      ----------------------------------- --the good, the bad and the physi-- -----------------------------------
        I wouldn't dismiss unpack so soon. If you can develop a specification like the one you just said there, then you can use it. As an alternative, if things really are quite wacky, why not specify the position of elements in a hash and use that in conjunction with substr?
        my @frags = (
            { line => 5, column => 12, length => 10, name => 'foo' },
            { line => 6, column => 12, length => 10, name => 'foo' },
        );

        # ...

        my %var;

        # @chunks is an array of arrays, where each contains a
        # block of the file.
        foreach my $chunk (@chunks) {
            foreach my $frag (@frags) {
                push(@{$var{$frag->{name}}},
                     substr($chunk->[$frag->{line}],
                            $frag->{column},
                            $frag->{length}));
            }
        }
        It will probably be a lot faster to use substr or unpack than to glue individual characters together, especially when you are pulling them out of a complex data structure and not just a string.
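
To make the block case concrete, here is a rough sketch of the substr approach run against the sample list from the question. The %template layout, the 0-based line and column numbers, and the '|' join for multi-line tags are all assumptions chosen to reproduce the "B11 Mr.|Adams" output; the real template format may well differ.

```perl
use strict;
use warnings;

# Sample list from the question, one string per line (indices are 0-based).
my @lines = (
    'W01 625 Z01 WWGZU Q19 0.343',
    'S02 W222G02 XXXX9 B11 Mr.',
    '        G03 uu 1      Adams',
);

# Hypothetical template: tag => list of [line, column, length] pieces.
# A block tag such as B11 simply has one piece per line it covers.
my %template = (
    B11 => [ [1, 22, 5], [2, 22, 5] ],
    G03 => [ [2, 12, 4] ],
);

my %value;
for my $tag (sort keys %template) {
    my @parts;
    for my $piece (@{ $template{$tag} }) {
        my ($line, $col, $len) = @$piece;

        # Skip pieces that start beyond the end of a short line.
        next if $col >= length $lines[$line];

        my $text = substr($lines[$line], $col, $len);
        $text =~ s/\s+\z//;    # drop trailing padding
        push @parts, $text;
    }
    $value{$tag} = join '|', @parts;
}

print "$_ $value{$_}\n" for sort keys %value;
```

Keeping each line as a plain string avoids building one hash entry per character, which is where the original approach most likely spends its time.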