On Removing Elements from Arrays

footpad has asked for the wisdom of the Perl Monks concerning the following question:

The apprentice tries to make certain he's drawing appropriate conclusions...

I was reading the various answers to this Q&A and recalled my earlier meditation on arrays. I also recalled that Camel3 noted some differences between delete and splice when used against arrays. So, I decided to poke at things for a bit.

Based on some experimentation, it appears the most appropriate answer to the aforementioned Q&A might have been, "it depends on whether or not you want to keep that array element's slot in the array." Consider, for example, the following:

#! /usr/bin/perl -w
use strict;

my @ary = ( 1, 2, 3, 4, 5 );
my @alt = @ary;
my $ndx = getArrayIndex( 3, @ary );

# Here's how delete handles it
print "Delete - Before: ", arrayToString( @ary ), "\n";
delete $ary[ $ndx ];
print "Delete - After: ", arrayToString( @ary ), "\n";

print "\n";  # A touch of white space.

# Now, let's try splice
print "Splice - Before: ", arrayToString( @alt ), "\n";
splice( @alt, $ndx, 1 );
print "Splice - After: ", arrayToString( @alt ), "\n";

exit 0;  # 0 signals success; a nonzero status would report failure


sub arrayToString
# --------------------------------------------------------------
# Joins an array's values without the warning for uninitialized
# elements.  Also, replaces undef values with an appropriate
# string.
# --------------------------------------------------------------
{
   my @values = @_;
   my $output = "";
   for (@values) 
   { 
      if ( defined( $_ ) )
         { $output .= $_ }
      else 
         { $output .= "undef" }

      # comment this out if you don't want trailing spaces.
      $output .= " ";  
   }
   return $output;
}

sub getArrayIndex
# --------------------------------------------------------------
# When given a scalar and an array, this either returns the 
# index of the scalar in the array or -1 (not found).
# --------------------------------------------------------------
{
  my $value = shift;
  my @values = @_;
  my $result = -1;

  for ( my $index = 0; $index < @values; $index++ )
  {
    if ( $value eq $values[ $index ] )
    { 
      $result = $index;
      last;
    }
  }
    return $result;
}

If you run this, you'll note a couple of things. Deleting an array element simply undefs the value at that index, whereas splice removes it entirely. (Actually, I suspect that splice is creating a new array without that particular element.)
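One quick way to see the difference is to compare the array lengths afterwards. The following is only a minimal sketch; `remove_at` is a hypothetical helper that is not part of the script above:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the original script): removes one element
# by index with either delete or splice and returns the resulting list.
sub remove_at {
    my ( $how, $index, @items ) = @_;
    if ( $how eq 'delete' ) {
        delete $items[$index];          # slot stays; later indices unchanged
    }
    else {
        splice( @items, $index, 1 );    # slot goes; later elements shift down
    }
    return @items;
}

my @after_delete = remove_at( 'delete', 2, 1 .. 5 );
my @after_splice = remove_at( 'splice', 2, 1 .. 5 );

printf "delete: %d elements, index 3 holds %s\n",
    scalar @after_delete, $after_delete[3];
printf "splice: %d elements, index 3 holds %s\n",
    scalar @after_splice, $after_splice[3];
```

After delete the array still has five slots and the 4 stays at index 3; after splice there are four elements and the 5 has moved down to index 3.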

My questions are these:

1. Is this a fair understanding of things? Is there, perhaps, a better way?
2. I'm using arrayToString() to avoid the "Use of uninitialized value in join" warning that appears when you print @array; (and the array contains undef values). Is this a reasonable (read: idiomatic) way to do that?
3. Are there other issues I'm not taking into consideration?

Thanks in advance...

--f


Re: On Removing Elements from Arrays by japhy (Canon) on Sep 23, 2001 at 01:59 UTC
The `splice()` function slides elements of the array around as necessary. As far as question 2 goes, I'd use:

`join "", map defined($_) ? $_ : "undef", @array;`

Jeff`[japhy]`Pinyan: Perl, regex, and perl hacker.
`s++=END;++y(;-P)}y js++=;shajsj<++y(p-q)}?print:??;`
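For comparison, here is a minimal sketch that wraps the idiom in a sub. The sub names are hypothetical, the separator is a single space (an assumption, to mirror arrayToString() without the trailing space), and the second variant assumes Perl 5.10 or later for the defined-or operator:

```perl
use strict;
use warnings;

# Hypothetical wrapper around the join/map idiom shown above.
sub array_to_string {
    return join ' ', map { defined $_ ? $_ : 'undef' } @_;
}

# Same idea using the defined-or operator (assumes Perl 5.10 or later).
sub array_to_string_dor {
    return join ' ', map { $_ // 'undef' } @_;
}

my @array = ( 1, undef, 3 );
print array_to_string(@array),     "\n";    # 1 undef 3
print array_to_string_dor(@array), "\n";    # 1 undef 3
```

Neither version warns, because every undef is replaced before join ever sees it.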
Re: Re: On Removing Elements from Arrays by danger (Priest) on Sep 23, 2001 at 06:00 UTC
Then again, depending on what you want to test for, you may now also test whether an element is undefined because it was assigned an undefined value (explicitly or implicitly) or because it was deleted (or never explicitly used in an assignment):

my @array;
@array[3..9] = (3,4,undef,undef,7);
delete @array[4,8];
print join"\n",map{defined $array[$_] ? "$_: $_" :
                   exists $array[$_]  ? "$_: undef-but-exists" :
                                        "$_: undef-not-exists"} 0 .. 10;
__END__
OUTPUT:
0: undef-not-exists
1: undef-not-exists
2: undef-not-exists
3: 3
4: undef-not-exists
5: undef-but-exists
6: undef-but-exists
7: 7
8: undef-not-exists
9: undef-but-exists
10: undef-not-exists

This output may seem confusing. 0-2 don't exist because they were never assigned anything. 4 and 8 don't exist because we deleted them. 5 and 6 exist because we explicitly assigned the undef value to them. Element 9 does exist because when you assign a smaller list to a larger slice, the remainder of the slice is implicitly assigned undef values. 10 doesn't exist because we've never assigned anything to it (we are beyond the end of the array).
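One related wrinkle, as documented in `perldoc -f delete`: deleting elements at the end of an array shrinks it, while deleting in the middle only leaves a hole. A minimal sketch, with illustrative variable names:

```perl
use strict;
use warnings;

my @middle = ( 'a', 'b', 'c' );
delete $middle[1];      # hole in the middle: length is unchanged

my @tail = ( 'a', 'b', 'c' );
delete $tail[2];        # hole at the end: the array shrinks

printf "middle: length %d, exists[1]? %s\n",
    scalar @middle, exists $middle[1] ? 'yes' : 'no';
printf "tail:   length %d\n", scalar @tail;
```

Here @middle keeps its length of 3 with a slot that no longer exists, while @tail drops to a length of 2.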
(Ovid) Re: On Removing Elements from Arrays by Ovid (Cardinal) on Sep 23, 2001 at 02:16 UTC
From `perldoc -f delete`:

> Deleting an array element effectively returns that position of the array to its initial, uninitialized state. Subsequently testing for the same element with exists() will return false. Note that deleting array elements in the middle of an array will not shift the index of the ones after them down--use splice() for that. See the exists entry elsewhere in this document.

splice and delete are inherently different and many uses of delete should actually be a splice. So, to answer your first question, yes, I think you have a fair understanding of things (though I don't recall whether or not splice is creating a new array).

Regarding whether or not `arrayToString()` is a Good Thing will really depend upon your programming needs. I would think carefully about using it because, essentially, you're throwing away information (the warning, in this case). Part of the standards that I have developed for my work state "all production code must run without warnings unless preapproved by the project lead".

As for "idiomatic" ways to do what you want, I rewrote your two subroutines:

sub array_to_string
# --------------------------------------------------------------
# Joins an array's values without the warning for uninitialized
# elements.  Also, replaces undef values with an appropriate
# string.
# --------------------------------------------------------------
{
    my @values = @_;
    my $output = "";
    for (@values)
    {
        $output .= defined $_ ? $_ : "undef";

        # comment this out if you don't want trailing spaces.
        $output .= " ";
    }
    return $output;
}

sub get_array_index
# --------------------------------------------------------------
# When given a scalar and an array, this either returns the
# index of the scalar in the array or -1 (not found).
# --------------------------------------------------------------
{
    my $value  = shift;
    my @values = @_;
    for my $index ( 0 .. $#values )
    {
        return $index if $value eq $values[ $index ];
    }
    return -1;  # not found, as the comment above promises
}

I think those are more "idiomatic", but that could just be a matter of taste. Also, I changed the sub names from studly caps as `thisIsHarderToRead than_this_is` :)

Cheers,
Ovid
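Another way to write the index lookup is with `first` from the core List::Util module. This is only a sketch of an alternative, not something proposed in the thread, and `index_of` is a hypothetical name:

```perl
use strict;
use warnings;
use List::Util qw(first);

# Hypothetical alternative: returns the index of the first element that is
# string-equal to $value, or -1 when nothing matches.
sub index_of {
    my ( $value, @values ) = @_;
    my $index = first { defined $values[$_] && $values[$_] eq $value }
                0 .. $#values;
    # first() returns undef on no match; index 0 is valid but false,
    # so test with defined rather than truth.
    return defined $index ? $index : -1;
}

print index_of( 3, 1 .. 5 ), "\n";    # 2
print index_of( 9, 1 .. 5 ), "\n";    # -1
```

The `defined` check inside the block skips undef slots so that the comparison itself never triggers an uninitialized-value warning.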
Re: (Ovid) Re: On Removing Elements from Arrays by footpad (Abbot) on Sep 23, 2001 at 06:32 UTC
> Regarding whether or not arrayToString() is a Good Thing will really depend upon your programming needs. I would think carefully about using it because, essentially, you're throwing away information (the warning, in this case). Part of the standards that I have developed for my work state "all production code must run without warnings unless preapproved by the project lead".

Precisely why I asked. I'm more than aware that warnings and other messages are designed to let you know that there might be a better way to do things, one that doesn't generate the message. My routine doesn't generate the warning, so I presumed it was a more appropriate solution. Would a different approach be better?

I like your rewrites, but humbly submit that the different naming conventions are more of a stylistic convention than an idiomatic one. As far as I can tell, the underscore separation seems to be preferred by *nix folks while Windows folks appear to prefer the camelCap convention.

Now, I know that perlstyle submits:

> While short identifiers like $gotit are probably ok, use underscores to separate words. It is generally easier to read $var_names_like_this than $VarNamesLikeThis, especially for non-native speakers of English. It's also a simple rule that works consistently with VAR_NAMES_LIKE_THIS.

I personally disagree with it, just as I disagree that the following is easier to read:

`if (condition){ &task1(); &task2(); &task3(); } else { &task4(); &task5(); }`

Than this:

`if ( condition ) { task1(); task2(); task3(); } else { task4(); task5(); }`

Now, many folks will disagree with me on that. That's fine. As long as we're each consistent in how we do things, I don't see there's much reason to quibble over presentation.¹

Remember: coding standards need to be flexible enough to evolve as the team does. They should not turn into straitjackets that cannot adapt to new ideas or allow the programmer a little bit of creativity. Each programmer has a set of personal preferences and the wise standards-keeper will find ways of allowing each programmer to bring their skills and ideas to the table. Otherwise, you risk dissension because the programmer spends more time fitting into (and frequently griping about) your vision of what good code looks like.²

Getting back on track. I suppose a more relevant question might be: OK, you've generated a warning. Now what? (Me? I write a subroutine that tests for the condition the warning is trying to alert you to and then responds accordingly.)

--f

¹ - TIMTOWTDI

² - I once got marked down on a performance review because, in a different language that used IF..ENDIF as block keywords, I accidentally wrote If ... EndIf instead of the "standard" if...endIf. As you might guess, I'm a bit...uh...passionate on this point. And, yes, that was enough to adversely impact my increase that year. (As you might expect, I quit shortly thereafter.)
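To make the "Now what?" question concrete, here is a rough sketch comparing three responses to the uninitialized-value warning: ignore it, test for the condition first, or switch off that one warning category in a small scope. The helper `warnings_from` is hypothetical; it just collects whatever a code block warns about by way of `$SIG{__WARN__}`:

```perl
use strict;
use warnings;

# Hypothetical helper: runs a code block and returns the warnings it emitted.
sub warnings_from {
    my ($code) = @_;
    my @caught;
    local $SIG{__WARN__} = sub { push @caught, @_ };
    $code->();
    return @caught;
}

my @array = ( 1, undef, 3 );

# Plain join: warns about the undef element.
my @plain = warnings_from( sub { my $s = join ' ', @array } );

# Test for the condition first: nothing left to warn about.
my @checked = warnings_from(
    sub { my $s = join ' ', map { defined $_ ? $_ : 'undef' } @array } );

# Or silence just this category, in just this block.
my @scoped = warnings_from(
    sub { no warnings 'uninitialized'; my $s = join ' ', @array } );

printf "plain: %d, checked: %d, scoped: %d\n",
    scalar @plain, scalar @checked, scalar @scoped;
```

Only the plain join produces a warning. Testing first keeps the information (the undef shows up in the output), whereas the scoped `no warnings` simply hides it, which is the trade-off being debated above.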
Re (tilly) 3: On Removing Elements from Arrays by tilly (Archbishop) on Sep 23, 2001 at 22:13 UTC
About the stylistic note, Code Complete chapter 18.2 cites a study, Program Indentation and Comprehensibility, from 1983 which found that a 6-space indent resulted in lower comprehension than a 2- or 4-space indent did. But the same study reported that many of the people in the study self-evaluated the 6-space indent as being the easiest to read, even as they scored worse on it! (The same study found that an indent in the range 2-4 spaces was optimal. Which one you use doesn't matter, that you use one does.)

Your pair of 3-space indents amounts to a 6-space indent. Exactly the combination that had people's self-reporting about usability conflicting with their actual performance. In my books aesthetic judgements lose to performance every time! Therefore I would suggest that you either change your overall layout style, or change from 3 characters at each step to 2.

Steve McConnell goes on in 18.4 to say why he thinks that a double-indentation style is a bad idea due to the fact that the resulting code winds up looking far more complex than it is. However the actual research he found didn't really show whether or not there was a large real difference. Therefore while I agree with him that you are making the mind process a large amount of irrelevant structure at a glance, I wouldn't say that item is nearly as important as making your overall indentation smaller.
(tye)Re: On Removing Elements from Arrays by tye (Sage) on Sep 24, 2001 at 00:10 UTC
Using delete and exists on array elements is something you should discourage in the vast majority of cases. The greatest impact it will have is confusion. The most likely cases you will see of this use are by people who don't understand what they are doing.

When this was originally proposed, the majority of the Perl5 Porters were against it, many of them quite vehemently, including many of the porters that I most respect. The discussion of the issue was ended because Larry Wall said "Put it in. Blame me." (only slightly paraphrased). So I am obliging him on that second request. (:

So now you can effectively store in each element of an array not only "any scalar value" (including undef) but also the newly standardized "nonexistent" "value" which cannot be stored in a scalar variable. So if you want to make use of this new "meta value" and then you decide that you want to copy an array, you have to go to extreme lengths:

my @x;
$x[6]= 1;
print "\$x[1] exists? ", ( exists $x[1] ? "yes" : "no" ), $/;
my @y= @x;
print "\$y[1] exists? ", ( exists $y[1] ? "yes" : "no" ), $/;
__END__
Produces:
$x[1] exists? no
$y[1] exists? yes

I would really like the use of exists and delete on arrays to require a pragma, like `use existArray` (no particularly good names came to mind). But in the meantime, I'll "just say 'no'"!

- tye (but my friends call me "Tye")
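For what it's worth, the "extreme lengths" might look something like the sketch below. It is an illustration only: `copy_preserving_holes` is a hypothetical helper, and it works through array references because copying through a plain list assignment is exactly what fills in the holes.

```perl
use strict;
use warnings;

# Hypothetical helper: copies the array behind $source into a new array,
# assigning only the slots for which exists() is true.
sub copy_preserving_holes {
    my ($source) = @_;
    my @copy;
    $#copy = $#{$source};    # same length, without assigning any slot
    for my $i ( 0 .. $#{$source} ) {
        $copy[$i] = $source->[$i] if exists $source->[$i];
    }
    return \@copy;           # a reference, so the holes survive the return
}

my @x;
$x[6] = 1;
my $y = copy_preserving_holes( \@x );
print "\$y->[1] exists? ", ( exists $y->[1] ? "yes" : "no" ), "\n";
print "\$y->[6] exists? ", ( exists $y->[6] ? "yes" : "no" ), "\n";
```

With this copy, slot 1 still reports "no" and slot 6 reports "yes". That it takes a loop and a reference to get there rather supports the point about confusion; current `perldoc -f delete` likewise warns that calling delete on array values is strongly discouraged.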