Grygonos has asked for the wisdom of the Perl Monks concerning the following question:

#!/Perl/bin/perl -w use strict; use IO::File; use constant{KB => 2**10, MB => 2**20, GB => 2**30}; print "Enter the pathname and filename of the file to split: "; my $filename = <STDIN>; my $valid_block_type; my $block_size; my $block_type; until($valid_block_type) { print "Enter the block size \n (example: 400KB,10MB,1GB): "; $block_size = <STDIN>; chomp($block_size); $block_size =~ m{([a-zA-Z]{2}$)}; $block_type = $1; BLOCK_TYPE: { $block_type eq 'KB' && do{$block_type = KB; $valid_block_type = 1; last BLOCK_TYPE;}; $block_type eq 'MB' && do{$block_type = MB; $valid_block_type = 1; last BLOCK_TYPE;}; $block_type eq 'GB' && do{$block_type = GB; $valid_block_type = 1; last BLOCK_TYPE;}; DEFAULT_BLOCK_TYPE: {print "Invalid block type specified!\n\n" +;} } }
Generates the following warning msg. the line numbers correspond to the regex checks in the code block.
Use of uninitialized value in string eq at C:\Documents and Settings\x +xx\Desktop \deasm.pl line 32, <STDIN> line 3. Use of uninitialized value in string eq at C:\Documents and Settings\x +xx\Desktop \deasm.pl line 36, <STDIN> line 3. Use of uninitialized value in string eq at C:\Documents and Settings\x +xx\Desktop \deasm.pl line 40, <STDIN> line 3.
I understand that Perl wants me to decalre these strings in variables and reference them that way... but can someone explain why this is a bad practice?

20030728 Edit by Corion: Retitled as per authors intention

Replies are listed 'Best First'.
Re: why is this code bad under -w option?
by Corion (Patriarch) on Jul 28, 2003 at 13:20 UTC

    The -w option generates warnings. They are not critical errors, but more a hint that some things might not be what you think they are. If you feel confident in your code, you can choose to ignore them, and your program will continue running as it did before, but in most cases, you get the warnings because of some deeper errors within your programming.

    In the program, perl warns you because you are trying to compare a scalar with no value to a scalar with a string value, and this happens in lines 32, 36 and 40. This is most likely, because one of the two sides of a comparision contains a scalar that was never initialized properly.

    In your case, you are trying to capture a match and then using the captured string, without checking whether the match actually succeeded :

    $block_size =~ m{([a-zA-Z]{2}$)}; $block_type = $1;
    A better way whenever you are matching things is :
    $block_size =~ m{([a-zA-Z]{2}$)} or die "I could not find the block type in the string '$block_si +ze'"; $block_type = $1;
    </code> because here, you can always be sure that $block_size was what you expected it to be and $block_type contains a meaningfull value.

    perl -MHTTP::Daemon -MHTTP::Response -MLWP::Simple -e ' ; # The $d = new HTTP::Daemon and fork and getprint $d->url and exit;#spider ($c = $d->accept())->get_request(); $c->send_response( new #in the HTTP::Response(200,$_,$_,qq(Just another Perl hacker\n))); ' # web
Re: why is this code bad under -w?
by halley (Prior) on Jul 28, 2003 at 13:22 UTC
    Use of uninitialized value in string eq at sourcefile, line 32.

    'Use of uninitialized value' means that there's a variable that has no value (is undef) when that's not appropriate. Without -w, undef is usually converted to '' as necessary, but with -w, it complains.

    Where is the complaint? Line 32 of your source code. Go there and see what you see. It might be an approximation, given the loose syntax of statements that take up multiple lines.

    What operator is complaining? 'string eq'. One side of the 'eq' is undef. $block_type eq 'KB' You should wager it's the side that is a variable, since the constant is never undef.

    What should you do about it? Go back and look at where you last modified the variable in question. What if $1 is undefined? What should you do?

    $block_size =~ m{([a-zA-Z]{2}$)}; $block_type = $1 || '';

    Your pattern captures any two adjacent letters at the end of the string. If there are not two adjacent letters at the end of the string, then $1 is undef. The above uses $1 if defined, or an empty string '' otherwise. You might decide to fall back on 'KB' as a default instead of an empty (invalid) block type.

    --
    [ e d @ h a l l e y . c c ]

      This code now generates no warnings... thanks for all your insight.. I thought it was griping because I used a quoted string for comparisons. I think this code has much better error checking after some insight from fellow monks... thanks for all your help. You guys are amazing as per usual.
      #!/Perl/bin/perl -w use strict; use IO::File; use constant{KB => 2**10, MB => 2**20, GB => 2**30}; print "Enter the pathname and filename of the file to split: "; my $filename = <STDIN>; #Format filename chomp($filename); $filename =~ s{\\}{\\\\}g; #Declare variables for use in until loop my $valid_block_type= 0; my $block_size = 0; #Generate a valid block size number until($valid_block_type) { #Declare local variables my $block_quantity = 0; my $block_type = 'B'; print "Enter the block size \n (example: 400KB,10MB,1GB): "; $block_size = <STDIN>; chomp($block_size); if($block_size =~ m{(^[0-9]+)}) { $block_quantity = $1; } if($block_size =~ m{([a-zA-Z]{2}$)}) { $block_type = $1; } BLOCK_TYPE: { $block_type eq 'KB' && do{$block_size = $block_quantity * KB; + $valid_block_type = 1; last BLOCK_TYPE;}; $block_type eq 'MB' && do{$block_size = $block_quantity * MB; $valid_block_type = 1; last BLOCK_TYPE;}; $block_type eq 'GB' && do{$block_size = $block_quantity * GB; $valid_block_type = 1; last BLOCK_TYPE;}; DEFAULT_BLOCK_TYPE: {print "Invalid block type specified!\n\n" +;} } if($block_quantity == 0) { print "Invalid block size specified!\n\n"; $valid_block_type = 0; } }
Re: why is this code bad under -w option?
by linux454 (Pilgrim) on Jul 28, 2003 at 14:37 UTC
    Here is another way of writing this code in perhaps a more concise way.

    #!/usr/bin/perl -w use strict; use IO::File; my %multipliers = ( 'KB' => 2**10, 'MB' => 2**20, 'GB' => 2**30, ); print "Enter the pathname and filename of the file to split: "; my $filename = <STDIN>; my $valid_block_type; my $block_size; my $block_type; until($valid_block_type) { print "Enter the block size \n (example: 400KB,10MB,1GB): "; $block_size = <STDIN>; chomp($block_size); ($block_type) = ($block_size =~ m/([a-zA-Z]{2})$/); $block_type ||= 'KB'; $block_type = uc $block_type; if( grep($block_type eq $_, keys %multipliers) ) { $block_type = $multipliers{$block_type}; $valid_block_type = 1; } else { print "Invalid block type specified!\n\n"; } }
    The reason this is better, is because now you only have a single branch of logic, and you can now control this logic in one place instead of two. Anytime you have the oppurtunity to control program flow/logic at a single point in the code, take it. This approach makes the code much more maintainable.
      Color me crazy, but why do you want to spend the time calculating 2**30 when you may not even use it?!? That's just a waste of time and space.

      Also, get rid of the $valid_block_type crud.

      use strict; use warnings; my %is_valid_type = do { my $i = 1; map { $_ => 10 * $i++ } qw( KB MB GB ) }; print "Enter the pathname and filename of the file to split: "; my $filename = <STDIN>; chomp($filename); my %block = map { $_ => undef } qw( size type amount ); while (1) { print "Enter the block size \n (example: 400KB,10MB,1GB): "; chomp($block{size} = <STDIN>); @block{qw(amount type)} = ($block{size} =~ m/^\s*([\d.]+)\s*([a-zA +-Z]{2})\s*$/); $block{type} ||= 'KB'; $block{type} = uc $block_type; unless ( $is_valid_type{$block{type}} ) { print "Invalid block type specified!\n\n"; next; } $block{size} = 2 ** $is_valid_type{$block{type}}; last; }
      This way, you defer all the calculation until you actually use it. You also get rid of excess variables and you can treat your block information as a whole instead of three related variables. (Anytime two things are related, make that relationship explicit with either an array or hash. Don't use parallel naming to tell your reader what's going on.)

      Also, you weren't storing the amount of KB that the user want to work with. Did you even write a spec for this? This code looks quite spec-less ...

      ------
      We are the carpenters and bricklayers of the Information Age.

      Don't go borrowing trouble. For programmers, this means Worry only about what you need to implement.

      Please remember that I'm crufty and crochety. All opinions are purely mine and all code is untested, unless otherwise specified.

        Color me crazy, but why do you want to spend the time calculating 2**30 when you may not even use it?!?

        % perl -MO=Deparse -e 'print 2**30' print 1073741824; -e syntax OK

        (1) Calculating 2 ** $n, where n is a human-scale integer, is ridiculously easy and cheap for computers to do. It's just a bit shifting operation.

        (2) Sub-expressions consisting of constants, such as (2**30) are computed during the compilation stage, not the runtime stage, as demonstrated above. In Perl this doesn't matter that much for runtime savings, since you must compile on every run, but why not use Perl's compiler to do the work instead of expressing yet more (possibly buggy) Perl code to be interpreted to accomplish the same calculation?

        (3) Yes, I said possibly buggy. In your first map definition, you define KB, MB and GB as consecutive multiples of ten, not consecutive powers of 1000. If a maintenance programmer would find the numeric literal or simple expression more readable than the code that derives those values, please, use the numbers!

        Update: Okay, you wanted 10, 20, 30 for KB, MB, GB, just to do a runtime 2**$e later. I still say this doesn't save resources and it doesn't make it more readable. But I'll grant it's what you intended, and not a bug. :)

        --
        [ e d @ h a l l e y . c c ]

        ok two things, first are you refering to my code, or the "improved code". I did store the amount the user wants to work with and the code works just fine. It may not be the best code ever and that's why I brought it here...constructive criticism...

        secondly.. no I didn't write a spec for this code, because it had to be done quick.. sue me

        Some of the things you said make sense and are good practice. There are nicer ways to say some of the other things you said. If I wanted lashings about code, I would have trolled a C++ board.
Re: why is this code bad under -w option?
by Skeeve (Parson) on Jul 28, 2003 at 13:21 UTC
    No. Not declare but initialize. You just do a $block_type=$1. But $1 is sometimes undef.
Re: why is this code bad under -w option?
by Not_a_Number (Prior) on Jul 28, 2003 at 18:25 UTC

    A bit late, I imagine, but here's another way of doing it if the aim, as it appears to be, is just to convert the input string into a number:

    use strict; use warnings; my %convert = ( KB => 10, MB => 20, GB => 30 ); my $num; { print "Enter the block size\n (example: 400KB,10MB,1GB): "; my $block_size = uc <STDIN>; print "Invalid block type specified!\n\n" and redo unless $block_size =~ /^(\d+)([KMG]B)$/; $num = $1 * 2 ** $convert{$2}; } print $num;

    dave

      thanks dave... that's a super cool way of doing it...

        Just after posting, I realised that I could make it more maintainable:

        my %convert = ( KB => 10, MB => 20, GB => 30 ); my $num; { print "Enter the block size\n (example: 400KB, 10MB, 1GB): "; my $block_size = uc <STDIN>; print "Invalid block type specified!\n\n" and redo unless $block_size =~ /^(\d+)([A-Z]B)$/ # Changed and defined $convert{$2}; # here $num = $1 * 2 ** $convert{$2}; } print $num;

        That way, it could be more easily be adapted to include terabytes, by just adding TB => 50 to the %convert hash, and then, a few months later :), the same for whatever comes after terabytes...

        dave

Re: why is this code bad under use strict?
by Grygonos (Chaplain) on Jul 28, 2003 at 13:10 UTC
    please forgive me I meant to title this post ...bad under -w option.