keraam has asked for the wisdom of the Perl Monks concerning the following question:

OK, I am new to Perl and not exactly sure how to do this, so I am asking here. I have a file in which each line looks something like 011349786:200:Singer:Samuel:6

I want to break it apart, check whether the department number (the 200) equals 100, and then format it like

Singer Samuel 011349786 6
Here's what I have so far, after looking through man perlre and seeing this method to parse:
#!/usr/bin/perl -w
open(EMPLOYEES,"employees") || die "Cannot Open";
while(<EMPLOYEES>){
    if(/(..):(..):(..):(..):(..)/){   #Suppost to parse $_
        $ID = $1;
        $DP = $2;
        $LN = $3;
        $FN = $4;
        $NM = $5;}
    print $ID; #
    print $DP; #
    print $LN; # So I can see if a value is there
    print $FN; #
    print $NM; #
    print $LN."\t".$FN."\t".$ID."\t".$NM;
    }
}
close(EMPLOYEES)
The error I get is: Use of uninitialized value at program line 15, <EMPLOYEES> chunk 26.

I get that for every line where I print out the values. Thanks for any help.

Edit kudra, 2001-12-22 Changed title, added p breaks

Replies are listed 'Best First'.
(Ovid) Re: Parsing
by Ovid (Cardinal) on Dec 19, 2001 at 06:52 UTC

    Here's a quick test case to show you how this can be done:

    #!/usr/bin/perl -w
    use strict;

    while(<DATA>){
        chomp;   # drop the trailing newline so it does not end up in $nm
        my ( $id, $dp, $ln, $fn, $nm ) = split /:/;
        print "$id $dp $ln $fn $nm\n";
    }
    __DATA__
    011349786:200:Singer:Samuel:6

    Your code cannot be the actual code because you have an extra '}' in there ($NM = $5;}). If you show us a fuller code fragment, along with a few more lines from your logs, I'm sure we can do something better. I suspect you have a scoping issue.

    Cheers,
    Ovid

    Join the Perlmonks Setiathome Group or just click on the link and check out our stats.
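
The split approach above stops at printing the raw fields. A minimal sketch of the remaining steps from the question (the department check and the reordered, tab-separated output) might look like the following. The helper name format_employee and the second sample record are made up for illustration, and the string comparison on the department is an assumption about how the data should be matched.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the reformatted line, or undef when the
# record is malformed or belongs to a different department.
sub format_employee {
    my ($line, $wanted_dept) = @_;
    chomp $line;
    my @f = split /:/, $line;
    return unless @f == 5;                 # skip malformed records
    my ($id, $dept, $last, $first, $num) = @f;
    return unless $dept eq $wanted_dept;   # string compare avoids numeric warnings
    return join "\t", $last, $first, $id, $num;
}

# Sample records; only the first one comes from the question.
my @sample = (
    "011349786:200:Singer:Samuel:6\n",
    "022458897:100:Baker:Alice:3\n",
);

for my $line (@sample) {
    my $out = format_employee($line, 100);
    print "$out\n" if defined $out;
}
```

With these two records only the department-100 line is printed, in last name, first name, ID, number order.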

Rescued by a Perl Superhero
by chip (Curate) on Dec 19, 2001 at 12:13 UTC
    This looks like a job for Array Slice Man!

    while (<EMPLOYEES>) {
        chomp;
        (my @f = split(/:/)) == 5 or next;
        print join("\t", @f[2, 3, 0, 4]), "\n";
    }

    No, that's all right, citizen! Your gratitude is reward enough!

        -- Chip Salzenberg, Free-Floating Agent of Chaos

Re: Parsing
by hossman (Prior) on Dec 19, 2001 at 12:58 UTC
    Everybody else has posted some great comments on how to fix your code or do what you want in a cleaner way, but I don't think anyone has actually explained the "Use of uninitialized value" errors you've been getting, so here goes...

    That warning appears when you try to do something with a value that is undef. In your case, because each (..) matches exactly two characters, your regex never matched, so $1..$5 were never set and the variables you copy them into stayed undef; hence "Use of uninitialized value" every time you try to print one of those variables.
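
To see the failure described above in isolation, here is a small sketch that runs the original two-character pattern and only uses the captures when the match succeeded. The helper name two_char_fields and the second test string are hypothetical, chosen so that one input fails and one matches.

```perl
use strict;
use warnings;

# Hypothetical helper: in list context a match returns its captures on
# success and an empty list on failure, so the caller never has to
# touch $1..$5 after a failed match.
sub two_char_fields {
    my ($line) = @_;
    return $line =~ /(..):(..):(..):(..):(..)/;
}

for my $line ("011349786:200:Singer:Samuel:6", "ab:cd:ef:gh:ij") {
    my @fields = two_char_fields($line);
    if (@fields) {
        print "matched: @fields\n";
    } else {
        print "no match for '$line', so nothing to print\n";
    }
}
```

The line from the question takes the "no match" branch, because no four of its colons are exactly two characters apart; the made-up second line matches.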

Re: Parsing
by Vavoom (Scribe) on Dec 19, 2001 at 10:31 UTC
    First off, let me say that Ovid's approach is the correct one. Split is much nicer than a regex in this situation.

    That said, here's your code rewritten to do what you seem to have wanted:
    #!/usr/bin/perl -w
    use strict;

    my ($ID, $DP, $LN, $FN, $NM);
    open(EMPLOYEES,"<employees") || die "Cannot Open";
    while(<EMPLOYEES>) {
        if(/(.*):(.*):(.*):(.*):(.*)/) {   #Supposed to parse $_
            $ID = $1;
            $DP = $2;
            $LN = $3;
            $FN = $4;
            $NM = $5;
            print $ID; #
            print $DP; #
            print $LN; # So I can see if a value is there
            print $FN; #
            print $NM; #
            print $LN."\t".$FN."\t".$ID."\t".$NM;
        }
    }
    close(EMPLOYEES)

    The (..) sequence you had matches exactly 2 characters; (.*) matches as many as possible within the constraints of the rest of the expression. The if statement now includes the print statements, so they are only run if the values are accepted.
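
One consequence of "as many as possible" is that (.*) will happily swallow a colon when a line has more fields than expected. The sketch below compares the greedy pattern with an anchored variant that excludes the delimiter from each field. The anchored pattern and both helper names are suggestions of this example, not something from the thread, and the six-field line is made up.

```perl
use strict;
use warnings;

# Greedy pattern from the reply above: each (.*) may include colons.
sub greedy_fields {
    my ($line) = @_;
    return $line =~ /(.*):(.*):(.*):(.*):(.*)/;
}

# Assumed alternative: anchored, and no field may contain the delimiter.
sub strict_fields {
    my ($line) = @_;
    return $line =~ /^([^:]*):([^:]*):([^:]*):([^:]*):([^:]*)$/;
}

for my $line ("011349786:200:Singer:Samuel:6", "a:b:c:d:e:f") {
    my @g = greedy_fields($line);
    my @s = strict_fields($line);
    print "greedy: ", join("|", @g), "\n";
    print "strict: ", (@s ? join("|", @s) : "(no match)"), "\n";
}
```

For the well-formed line both patterns return the same five fields. For the six-field line the greedy pattern still matches and puts "a:b" into the first capture, while the anchored one rejects the line.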

    Vavoom
Re: Parsing
by dragonchild (Archbishop) on Dec 19, 2001 at 19:56 UTC
    Parsing text is something Perl's really good at, so Perl has lots of ways to do it. Some of the more popular are:
    • split - Useful only if you have some sort of delimiter, like ':' or ','. If you do, then split is preferred. If you don't, then split is useless.
    • unpack - This is what you would use if you had some sort of fixed-width columns. Very, very powerful. (In fact, parsing is sort of a by-product of what unpack can do.)
    • regex - Regexes will do every sort of line-by-line parsing that you could ever imagine doing (and then some). However, good luck reading what you did tomorrow. (This is where Perl got its reputation for being line noise.)
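
Since split and regexes are already shown elsewhere in the thread, here is a minimal sketch of the unpack option from the list above. The fixed-width layout (9, 3, 10, 10 and 2 characters, space-padded) is an assumption for illustration only; the file in the question is colon-delimited, so the record is built with pack first. The name parse_fixed is hypothetical.

```perl
use strict;
use warnings;

# Assumed layout (not from the thread): id 9 chars, department 3,
# last name 10, first name 10, number 2, each padded with spaces.
my $TEMPLATE = "A9 A3 A10 A10 A2";

# Hypothetical helper: split one fixed-width record into its fields.
sub parse_fixed {
    my ($record) = @_;
    return unpack $TEMPLATE, $record;   # "A" strips trailing spaces
}

# Build a fixed-width record from the question's sample values.
my $record = pack $TEMPLATE, "011349786", "200", "Singer", "Samuel", "6";

my ($id, $dept, $last, $first, $num) = parse_fixed($record);
print join("\t", $last, $first, $id, $num), "\n";
```

Here the same template drives both pack and unpack, which keeps the column widths in one place.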

    ------
    We are the carpenters and bricklayers of the Information Age.

    Don't go borrowing trouble. For programmers, this means Worry only about what you need to implement.