dr_joe has asked for the wisdom of the Perl Monks concerning the following question:

Esteemed Monks,

This is a script I wrote to select only the five-letter and six-letter words from a list of words.

    #!/usr/bin/perl
    #trimwords.pl
    #This script selects only the five-letter and six-letter words from a list of words
    use warnings;
    use strict;

    open FH1, "all_words.txt" or die $!;
    open (FH2, '>>trimmed_words.txt') or die $!;
    my @all_words = <FH1> ;
    my $word_size;
    while (@all_words){
        my $word_size = length($_);
        unless ($word_size==5 or $word_size==6){
            shift(@all_words);
        }
    }
    print FH2 @all_words

For the last 24 hours, I have been banging my head on this error message:

Use of uninitialized value in length at trimwords_3.pl line 16, <FH1> line 224714.

Any guidance will be much appreciated by this novice. The following information may be useful:

The input file (all_words.txt) is 224714 lines long

The output file (trimmed_words.txt) comes out as empty as my wallet

I am running perl, v5.8.8 built for x86_64-linux-thread-multi

Thanks,
dr_joe

Replies are listed 'Best First'.
Re: stuck at "Use of uninitialized value in length at ..."
by toolic (Bishop) on Jun 04, 2009 at 16:52 UTC
    Change while to for because for sets $_, but while does not.

    Update: You can verify that by adding print "$_\n"; inside your while loop. See also Basic debugging checklist.

    You'll also want to chomp @all_words; otherwise length counts the trailing newline on each word.

    Update: Here is a relevant doc link: Foreach Loops

    If VAR is omitted, $_ is set to each value.
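
A minimal sketch of the for-based fix suggested above, assuming the words are already in an array. The sub name select_by_length and the sample words are made up for illustration; they are not from the original script. Instead of shifting elements out of the array while looping over it, the sketch collects the matches in a new array.

```perl
use strict;
use warnings;

# Hypothetical helper: return the words whose length lies between $min and $max.
sub select_by_length {
    my ($min, $max, @words) = @_;
    chomp @words;    # drop trailing newlines so length() counts letters only
    my @kept;
    for (@words) {   # for sets $_ to each element in turn
        my $word_size = length($_);
        push @kept, $_ if $word_size >= $min and $word_size <= $max;
    }
    return @kept;
}

# Sample input standing in for the lines read from all_words.txt
my @all_words = ("cat\n", "apple\n", "banana\n", "avocado\n");
print "$_\n" for select_by_length(5, 6, @all_words);
```

With the sample list this should print apple and banana, one per line.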
Re: stuck at "Use of uninitialized value in length at ..."
by Sue D. Nymme (Monk) on Jun 04, 2009 at 16:58 UTC

    You never assign anything to $_. Therefore, it is uninitialized.

    "while" isn't suited for looping over an array. When looping over a filehandle, as:

        while (<$fh>)

    the while implicitly assigns each line to $_ in turn. However, in the following:

        while (@foo)

    the contents of the while expression are taken to be a boolean expression, which is true as long as there are elements in @foo, and doesn't assign anything to $_.

    You may want to use the foreach keyword:

        foreach (@foo)

    This will assign each element of @foo, in turn, to $_.
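
To make the difference concrete, here is a small sketch under the same assumptions as the explanation above. The sub names and the sample list are hypothetical. The foreach version gets each element through $_, while the while version only tests whether the array is non-empty, so the loop body has to fetch each element itself.

```perl
use strict;
use warnings;

# Hypothetical helper: visit every element through the implicit $_ of foreach.
sub visited_by_foreach {
    my @foo = @_;
    my @seen;
    foreach (@foo) {
        push @seen, $_;            # $_ is set by foreach
    }
    return @seen;
}

# Hypothetical helper: drain a copy of the list with while and shift.
sub drained_by_while {
    my @foo = @_;
    my @seen;
    while (@foo) {                 # only a boolean test: true while @foo has elements
        my $item = shift @foo;     # nothing is assigned to $_, so take the element explicitly
        push @seen, $item;
    }
    return @seen;
}

print join(" ", visited_by_foreach(qw(a b c))), "\n";
print join(" ", drained_by_while(qw(a b c))), "\n";
```

Both calls walk the same three elements; the only difference is who is responsible for fetching them.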

      "You may want to use the foreach keyword:"

      In general, that's a good solution. In this case, you'd end up with

          my @all_words = <FH1>;
          foreach (@all_words) {

      which is a needless waste of memory. The following is much more appropriate:

          while (<FH1>) {

      which is short for

          while (defined($_ = <FH1>)) {
Re: stuck at "Use of uninitialized value in length at ..."
by johngg (Canon) on Jun 04, 2009 at 17:53 UTC

    Instead of using length, which counts the line terminator as Marshall discovered, you could test with a regular expression for 5 or 6 characters between beginning and end of string anchors. The task could be done as a one-liner like this.

    $ perl -ne 'print if m{^.{5,6}$};' all_words.txt >> trimmed_words.txt

    Change the >> to > if you are not appending to the file. I hope this is of interest.

    Cheers,

    JohnGG
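
For anyone who prefers the regex inside a script, here is a minimal sketch of the same test as the one-liner above. The sub name is_five_or_six and the sample lines are hypothetical. No chomp is needed here because . does not match a newline by default and $ can match just before a trailing newline.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the line holds exactly 5 or 6 characters,
# whether or not it still carries its trailing newline.
sub is_five_or_six {
    my ($line) = @_;
    return $line =~ m{^.{5,6}$} ? 1 : 0;
}

# Sample lines standing in for the contents of all_words.txt
for my $line ("cat\n", "apple\n", "banana\n", "avocado\n") {
    print $line if is_five_or_six($line);
}
```

With the sample lines this should print apple and banana.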

Re: stuck at "Use of uninitialized value in length at ..."
by scorpio17 (Canon) on Jun 04, 2009 at 17:08 UTC

    Here's my version:

        use strict;

        my $input  = "words.txt";
        my $output = "output.txt";

        open my $ifh, '<', $input  or die "can't open $input : $!\n";
        open my $ofh, '>', $output or die "can't open $output : $!\n";

        while (my $line = <$ifh>) {
            chomp $line;
            my $word_size = length($line);
            if ($word_size == 5 || $word_size == 6) {
                print $ofh "$line\n";
            }
        }

        close $ifh;
        close $ofh;
Re: stuck at "Use of uninitialized value in length at ..."
by Marshall (Canon) on Jun 04, 2009 at 17:03 UTC
    Updated: use values of 6, 7 as below to account for \n in length().
        #!/usr/bin/perl -w
        use strict;
        #This script selects only the five-letter and six-letter words from a list of words
        open FH1, "all_words.txt" or die $!;
        open (FH2, '>>trimmed_words.txt') or die $!;
        while (<FH1>) {
            if (length ==5 or length==6) {print FH2 };
        }
    Ooops the deadly "off by one problem!". I guess length counts the "\n" at the end of line!
        while (<DATA>) {
            if (length ==6 or length==7) {print;}
        }
        __DATA__
        123
        1234
        12345
        123456
        1234567
        12345678
        ==============
        prints:
        12345
        123456
Re: stuck at "Use of uninitialized value in length at ..."
by dr_joe (Initiate) on Jun 04, 2009 at 22:42 UTC
    I would like to express my gratitude to all the learned monks who showed me the light in different ways. The use of a regex was a really nice touch.
Re: stuck at "Use of uninitialized value in length at ..."
by mje (Curate) on Jun 04, 2009 at 16:51 UTC

    What is on line 224714 of all_words.txt? Is it empty by any chance?