powerhouse has asked for the wisdom of the Perl Monks concerning the following question:

I am parsing a database field in which every line looks like this:
22009^1^52.90
22010^1^42.90
22011^1^32.90
# Always an empty line at the end...
The script reads the field into the variable $itDetail, then splits it to get one line at a time:
while($itDetail) {
    (my $_line, $itDetail) = split /\015\012/, $itDetail, 2;
    ...
Then inside of that while statement, I am getting each of the 3 details I need...
    (my $Num, $_line) = split /\^/, $_line, 2;
    my ($Qt, $_trashit) = split /\^/, $_line, 2;
    $_line = ""; # Reset the variable, just for my sake.
    # Do some things....
}
Now my problem is that $Num has a value of Num^Qt^xx.xx

It is NOT splitting those for me; each variable just has the whole value of the original $_line.

Is there a known problem in trying to split "^"?

I know it's doing that because I inserted a piece of code to have it print each field for me, like this:
open(FH,">>/some/path/debug.txt") or die "could not open file for writing: $!";
print FH '$Num = ', $Num, "\n";
close(FH);
That is how I figured out it is having a problem.

Should I just change the ^ to something different? I have tried it escaped and just as ^ in the split // field. Both have the same problem.

Thanks for any advice you have for me.

thx,
Richard

Replies are listed 'Best First'.
Re: split problem...
by graff (Chancellor) on Sep 27, 2004 at 06:59 UTC
    If you're starting with a multi-line string (1^a^x\r\n2^b^y\r\n3^c^z\r\n...) stored in a scalar variable ($itDetail), then the easier way would be:
    open( FH, ">/some/path/debug.txt" ) or die "can't write debug.txt: $!";
    for my $_line ( split /[\r\n]+/, $itDetail ) {
        my ( $Num, $Qt, $_trashit ) = split /\^/, $_line;
        print FH "Num = $Num\n";
        # do other things...
    }
Re: split problem...
by tachyon (Chancellor) on Sep 27, 2004 at 06:40 UTC

    You are making an easy task look hard. It isn't. Try this:

    while(<DATA>) {
        s/\s+$//; # trim off any \015 chars on EOL
        my ($Num, $Qt, $_trashit) = split /\^/, $_;
        print "$Num || $Qt || $_trashit\n";
        # blah
    }
    __DATA__
    22009^1^52.90
    22010^1^42.90
    22011^1^32.90

    Just replace DATA with the database filehandle.

    Now the split on ^ will work, so if it *apparently does not*, it is because your data is not what you think it is. Your debugging code is not adequate, as you really want to look at the data you are trying to process (i.e. the line) as well. You can save typing effort simply by using warn, i.e. warn "Got ($line)\n". If you want it in a file, just do ./script.pl 2>debug.log to redirect STDERR to a file.
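
    One way to look at the data as it really is: a minimal sketch that prints each line with control characters shown as escapes, so a stray \r or an unexpected delimiter becomes visible. The helper name visible() and the sample field contents are assumptions made up for illustration; Data::Dumper and its Useqq, Terse and Indent settings are part of the standard distribution.

```perl
use strict;
use warnings;
use Data::Dumper;

# Render a string with control characters shown as escapes.
# visible() is a hypothetical helper name, not from the thread.
sub visible {
    my ($text) = @_;
    local $Data::Dumper::Useqq  = 1;   # double-quoted output with \r, \n escapes
    local $Data::Dumper::Terse  = 1;   # no "$VAR1 = " prefix
    local $Data::Dumper::Indent = 0;   # keep the output on one line
    return scalar Dumper($text);
}

# Assumed sample data, built to match the format described in the question.
my $itDetail = "22009^1^52.90\015\01222010^1^42.90\015\012";

for my $line ( split /\015\012/, $itDetail ) {
    my ( $Num, $Qt, $price ) = split /\^/, $line;
    warn "line=", visible($line), " Num=", visible($Num), "\n";
}
```

    Run with 2>debug.log as described above to collect the output in a file. If the line shown here does not contain the delimiters you expect, the problem is in the data rather than in split.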

    cheers

    tachyon

      He/she has multiple lines in a single database field; I don't see how your code applies to this situation.

      I actually thought the:

      while ($input) {
          ($field, $input) = split /delim/, $input, 2;
          # process $field
      }
      was a very interesting idiom, and one I hadn't seen before.
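
      A minimal sketch of that idiom as a standalone function, for anyone who wants to try it. The name peel_fields and the sample input are assumptions for illustration. The limit of 2 makes split return at most two pieces: the first field and the untouched remainder. This version tests the remainder with defined and length rather than plain truth, because a remainder of "0" is false in Perl and would end a while ($input) loop early.

```perl
use strict;
use warnings;

# Peel fields off the front of a string one at a time.
# peel_fields() is a hypothetical name used only for this sketch.
sub peel_fields {
    my ( $input, $delim ) = @_;
    my @fields;
    while ( defined $input and length $input ) {
        # Limit 2: first field, then everything after the first delimiter.
        ( my $field, $input ) = split /\Q$delim\E/, $input, 2;
        push @fields, $field;
    }
    return @fields;
}

# Assumed sample data in the format from the question.
my @lines = peel_fields( "22009^1^52.90\015\01222010^1^42.90\015\012", "\015\012" );
print "$_\n" for @lines;
```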

        Perhaps I should have read a little closer, although the cause of his issue is not immediately evident. I agree it is an unusual idiom. While its use in the loop could be rationalised, its use in the two splits on ^ seems pretty dubious at first glance. This would be a pretty usual sort of syntax, assuming the data is in blocks as you suggest:

        local $/ = ""; # set paragraph mode to read one record at a time
        while (my $record = <DATA>) {
            for ( split /[\n\r]+/, $record ) {
                my ($Num, $Qt, $_trashit) = split /\^/, $_;
                print "$Num || $Qt || $_trashit\n";
            }
        }
        __DATA__
        22009^1^52.90
        22010^1^42.90
        22011^1^32.90

        22009^1^52.90
        22010^1^42.90
        22011^1^32.90

        cheers

        tachyon

Re: split problem...
by TedPride (Priest) on Sep 27, 2004 at 06:50 UTC
    Oddly enough, when I tested your code with the following:
    $itDetail = '22009^1^52.90' . "\015\012" .
                '22010^1^42.90' . "\015\012" .
                '22011^1^32.90' . "\015\012";
    while($itDetail) {
        (my $_line, $itDetail) = split /\015\012/, $itDetail, 2;
        (my $Num, $_line) = split /\^/, $_line, 2;
        my ($Qt, $_trashit) = split /\^/, $_line, 2;
        print "$Num $Qt $_trashit\n";
    }
    I got:
    22009 1 52.90
    22010 1 42.90
    22011 1 32.90
    So it's working for me. An alternate, neater method of doing this would be, however:
    $itDetail = '22009^1^52.90' . "\015\012" .
                '22010^1^42.90' . "\015\012" .
                '22011^1^32.90' . "\015\012";
    while ($itDetail =~ /(\d+)\^(\d+)\^(\d+\.\d+)\015\012/g) {
        print "$1 $2 $3\n";
    }
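
    On the escaped-versus-unescaped question from the original post, here is a small sketch of how the two patterns differ. It is not a diagnosis of the original problem, since the poster reports trying both forms. The sample line is assumed from the format in the question.

```perl
use strict;
use warnings;

# Assumed sample line in the format from the question.
my $line = "22009^1^52.90";

# Escaped: \^ matches a literal caret, giving three fields.
my @escaped = split /\^/, $line;

# Unescaped: ^ is a start-of-line anchor. split treats /^/ as /^/m,
# so it breaks the string at line starts, and a one-line string
# comes back whole.
my @unescaped = split /^/, $line;

print scalar(@escaped),   " fields: @escaped\n";
print scalar(@unescaped), " field: @unescaped\n";
```

    With the escaped pattern this prints "3 fields: 22009 1 52.90"; with the bare ^ it prints "1 field: 22009^1^52.90".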