lampros21_7 has asked for the wisdom of the Perl Monks concerning the following question:

Hi monks. I am struggling to understand what's happening here. I have an array of about 50 elements that contain strings. Each element is made up of many sentences together. I know what is inside, and I know 4 of those elements start with the words "The website has been restructured". Now, I have a loop and a regex to remove the elements of the array that start with these words. For some reason it removes 2 of the elements and leaves the other 2. I have tried various things, like subtracting 1 from my $value variable when I meet such a case, or removing the ^ so that the element is removed whenever the phrase appears anywhere in the string, but still nothing. For some reason only 2 of the 4 elements are removed. I don't know if this will be of any help, but the 4 elements are one after another. Your help will be greatly appreciated.

$value = 0;
foreach my $text2 (@text) {
    if ($text2 =~ m/^The website has been restructured/) {
        splice(@text, $value, 1);
    }
    $value++;
}

Replies are listed 'Best First'.
Re: Regexp works with some elements and doesn't with others
by rhesa (Vicar) on Feb 07, 2006 at 01:23 UTC
    Modifying the array while looping over it can cause strange behaviour. It upsets the internal iterator. Can you verify that it skips every second item of two consecutive occurrences?

    I suggest you use grep instead:

    @text = grep { !m/^The website has been restructured/ } @text;
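A minimal sketch of the grep approach on made-up data. The sample strings are assumptions for illustration, not the poster's real array; two of the matching elements are placed back to back, as in the question.

```perl
use strict;
use warnings;

# Hypothetical sample data: two consecutive matching elements.
my @text = (
    'Welcome to the site.',
    'The website has been restructured. Please update your bookmarks.',
    'The website has been restructured. Old links may fail.',
    'Contact us for details.',
);

# grep returns the elements for which the block is true, so nothing
# is removed from @text while it is being walked; the filtered list
# is assigned back only after grep has finished.
@text = grep { !m/^The website has been restructured/ } @text;

print "$_\n" for @text;
```

Because the array is rebuilt in one step, there is no index to keep in sync and no element can be skipped.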
Re: Regexp works with some elements and doesn't with others
by Fletch (Bishop) on Feb 07, 2006 at 01:28 UTC

    Yup, diddling what you're iterating over with foreach is asking for trouble. Quoth perldoc perlsyn:

    If any part of LIST is an array, "foreach" will get very confused if you add or remove elements within the loop body, for example with "splice". So don't do that.
      Okay, I understand that.

      Another thing is that I have a second array, and I want to remove from it the element in the same position as the one I remove from @text.

      $value = 0;
      foreach my $text2 (@text) {
          if ($text2 =~ m/^The website has been restructured/) {
              splice(@text, $value, 1);
              splice(@links, $value, 1);
          }
          $value++;
      }

      Is there a way to do this with a loop? Because I don't think it would be possible to know with grep which element is the one that needs removing. Thanks

        Use an index explicitly instead of the easily broken, implicit one you've coded. It's important to iterate in reverse so your index will still make sense after removing data. That you have these two arrays where elements exist in pairs indicates that your data structure is mismatched. This would be better served by having an array of pairs.

        foreach my $ix ( reverse 0 .. $#text ) {
            if ( not length $text[ $ix ] ) {
                splice @text,  $ix, 1;
                splice @links, $ix, 1;
            }
        }

        The previous loop is optimal only on very recent perls; older ones may build the whole reversed index list in memory before iterating. This is a great reason to use a C-style, three-part for loop.

        for ( my $ix = $#text; $ix >= 0; -- $ix ) {
            # same loop contents.
        }
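One possible reading of the "array of pairs" suggestion, sketched on made-up data. The sample values and the name @pairs are assumptions, and the empty-string test simply mirrors the condition used in the loop above; swap in whatever regex applies.

```perl
use strict;
use warnings;

# Hypothetical parallel arrays standing in for @text and @links.
my @text  = ( 'Intro page', '', 'News page', '' );
my @links = ( '/intro', '/empty1', '/news', '/empty2' );

# Combine them into one array of [ text, link ] pairs, so that each
# text travels together with its link.
my @pairs = map { [ $text[$_], $links[$_] ] } 0 .. $#text;

# Dropping a pair removes the text and its link in one step.
@pairs = grep { length $_->[0] } @pairs;

print "$_->[0] => $_->[1]\n" for @pairs;
```

With the data kept in pairs there is only one array to filter, so the two halves cannot drift out of step.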

        ⠤⠤ ⠙⠊⠕⠞⠁⠇⠑⠧⠊

        Something I use sometimes is the Data::Dumper module. It helps especially with complex lists, e.g. hashes of hashes of hashes, but the same concept applies here. By using it, you can literally see how contents of variables change, or even if the original values are what you expect them to be.

        In this case, for example, you could modify the code to be -

        use Data::Dumper;
        .... <snip> ....
        if ($text2 =~ m/^The website has been restructured/) {
            print "text array before is " . Dumper(\@text, $value);
            splice(@text, $value, 1);
            print "text array after is " . Dumper(\@text, $value);
            splice(@links, $value, 1);
        }

        This way, you can see for yourself how @text is changing, and how it's different from what you expect. Note that you have to pass a reference to the array (\@text) to the Dumper function call: @text is only one of the two things you want to show, and passing it directly would flatten its elements into the argument list alongside $value.

        Although you have to go through and comment out or delete the Dumper lines once you've got it working, if you like actually 'seeing' changes as they happen, this may work very well for you, especially once you start using complex data structures.
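A small self-contained sketch of the same debugging idea, on made-up data (the three sample strings and the starting value of $value are assumptions). It captures the Dumper output before and after a single splice so the change is visible.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical sample data; index 1 holds the element to remove.
my @text  = ( 'first', 'The website has been restructured.', 'last' );
my $value = 1;

# \@text is dumped as one array ($VAR1); $value shows up as $VAR2.
my $before = Dumper( \@text, $value );
splice( @text, $value, 1 );
my $after = Dumper( \@text, $value );

print "text array before is $before";
print "text array after is $after";
```

Comparing the two dumps shows that every element after the removed one has moved down by one position, which is what trips up the original loop.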

        Well ... wrong assumption!

        You can do it with 'grep' with the help of 'map':

        my @tmp = grep { $_->[0] !~ /^The website has been restructured/ }
                  map  { [ $text[$_], $arr2[$_], $_ ] } 0 .. $#text;
        @text = map { $_->[0] } @tmp;
        @arr2 = map { $_->[1] } @tmp;
        undef @tmp;

        Note: this way you create temporary structures (map and grep), so I wouldn't use it on large arrays, but since you wrote that it's a 50-element array, I guess you can consider this solution.

        Enjoy,
        Mickey
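A related variation, sketched here as an assumption rather than something proposed in the thread: grep over the indices instead of the elements, then take the same array slice from both arrays. The sample data and the name @keep are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical parallel arrays; @links stands in for the second array.
my @text = (
    'Welcome to the site.',
    'The website has been restructured. Please update your bookmarks.',
    'The website has been restructured. Old links may fail.',
    'Contact us for details.',
);
my @links = ( '/welcome', '/moved1', '/moved2', '/contact' );

# Collect the indices of the elements worth keeping ...
my @keep = grep { $text[$_] !~ /^The website has been restructured/ } 0 .. $#text;

# ... and apply the same slice to both arrays so they stay aligned.
@text  = @text[@keep];
@links = @links[@keep];

print "$text[$_] ($links[$_])\n" for 0 .. $#text;
```

This answers the "which element needs removing" concern directly: grep sees positions, not values, so the same positions can be reused for any number of parallel arrays.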