A Better Way to Find the Position of the Last Non-Whitespace Character in the Last Element of an Array.

NateTut has asked for the wisdom of the Perl Monks concerning the following question:

Re: A Better Way to Find the Position of the Last Non-Whitespace Character in the Last Element of an Array. by moritz (Cardinal) on Apr 22, 2009 at 16:57 UTC
`$MessageLines[$#MessageLines] =~ m/^.*\S/sg; my $CursorCol = pos $MessageLines[$#MessageLines];` Then substr can be used to obtain `$Prompt`.
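A minimal sketch of the `pos` approach run end to end, including the `substr` step. The helper name `cursor_col` and the sample data are assumptions for illustration, not part of the reply above.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the offset just past the last non-whitespace
# character of $line, or undef when the line has no such character.
sub cursor_col {
    my ($line) = @_;
    # /g in scalar context records where the match ended, readable via pos().
    # /s lets "." cross embedded newlines.
    return $line =~ m/^.*\S/sg ? pos($line) : undef;
}

# Sample data (assumed): the last element carries trailing whitespace.
my @MessageLines = ('Message Line 1', 'Message Line 2', 'Message Line 3', 'Prompt:   ');

my $CursorCol = cursor_col($MessageLines[-1]);
my $Prompt    = substr $MessageLines[-1], 0, $CursorCol;

print "CursorCol=$CursorCol Prompt='$Prompt'\n";
```

Because the match happens on a copy inside the helper, the `pos` of the array element itself is left untouched, which may matter if other `/g` matches run against the same string later.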
Re^2: A Better Way to Find the Position of the Last Non-Whitespace Character in the Last Element of an Array. by roubi (Hermit) on Apr 22, 2009 at 17:06 UTC
The code can also be simplified a bit, like this: `$MessageLines[-1] =~ m/^.*\S/sg; my $CursorCol = pos $MessageLines[-1];`
Re^2: A Better Way to Find the Position of the Last Non-Whitespace Character in the Last Element of an Array. by MidLifeXis (Monsignor) on Apr 22, 2009 at 17:34 UTC
I would have used '-1' as the subscript. *Would* have. As of about 5 minutes ago I changed my mind as to the appropriateness of using -1 as a subscript. Consider the following blocks... `$[ = -4; @a = (qw(a b c d e f g h i j)); print $a[-1], "\n"; print $a[$#a], "\n"; __DATA__ d j` and `$[ = 0; @a = (qw(a b c d e f g h i j)); print $a[-1], "\n"; print $a[$#a], "\n"; __DATA__ j j` There is some DWIMmery with the `-1` option that could cause problems, especially if you are working on old code, or on someone else's code where the usage or localization of the `$[` variable is unknown. And yes, I have read the disclaimers not to use the `$[` variable in the fine manual. I didn't say that I would use the variable. :-) Update: This is perl, v5.8.8 built for PA-RISC2.0. Update #2: Comparison of the perl binaries available on this machine: `for x in 100 1 0 -1 -6 -7 -11 -12 -99; do perl -le "\$[=$x; my @ra=qw(a b c d e f); print \$[, ': -1:' , \$ra[-1], ': $#: ', \$ra[\$#ra]"; done` This is perl, version 5.005_02 built for PA-RISC1.1 `100: -1: f: $#: f 1: -1: f: $#: f 0: -1: f: $#: f -1: -1: f: $#: f -6: -1: f: $#: f -7: -1: f: $#: e -11: -1: f: $#: a -12: -1: f: $#: -99: -1: f: $#:` This is perl, v5.8.8 built for PA-RISC2.0 `100: -1: f: $#: f 1: -1: f: $#: f 0: -1: f: $#: f -1: -1: a: $#: f -6: -1: f: $#: f -7: -1: : $#: e -11: -1: : $#: a -12: -1: : $#: -99: -1: : $#:` Conclusion: Don't use `$[` as anything other than 0 (as presented by a couple of other monks so far). --MidLifeXis The tomes, scrolls etc are dusty because they reside in a dusty old house, not because they're unused. --hangon in this post
Re^3: A Better Way to Find the Position of the Last Non-Whitespace Character in the Last Element of an Array. by AnomalousMonk (Archbishop) on Apr 23, 2009 at 00:34 UTC
Interesting. Running the same test on my ActiveState Win32 Perl 5.8.2 build 808, I get results indicating that `-1` more reliably accesses the last element of an array. The following one-liner was run repeatedly with `$[` initialized to different values: `>perl -wMstrict -le "$[ = 100; my @ra = qw(a b c d e f); print '$[: ', $[, ' -1: ', $ra[-1], ' $#: ', $ra[$#ra];" $[: 100 -1: f $#: f` Output of successive runs of the one-liner: `$[: 100 -1: f $#: f $[: 1 -1: f $#: f $[: 0 -1: f $#: f $[: -1 -1: f $#: f $[: -6 -1: f $#: f $[: -7 -1: f $#: e $[: -11 -1: f $#: a Use of uninitialized value in print at -e line 1. $[: -12 -1: f $#: Use of uninitialized value in print at -e line 1. $[: -99 -1: f $#:` Perhaps more support for the *don't do that* part of the discussion in the docs about assigning to `$[` a value other than `0`? (BTW, the latest perlvar sez wrt `$[` that "As of release 5 of Perl, assignment to $[ is treated as a compiler directive, and cannot influence the behavior of any other file", so old code that plays too fast and loose with `$[` may be broken anyway.)
Re^3: A Better Way to Find the Position of the Last Non-Whitespace Character in the Last Element of an Array. by AnomalousMonk (Archbishop) on Apr 28, 2009 at 22:54 UTC
I think `$[` was originally intended to allow use of either 0-based or 1-based arrays, and then things kinda got out of hand. Sanity is slowly being restored.
Re^4: A Better Way to Find the Position of the Last Non-Whitespace Character in the Last Element of an Array. by MidLifeXis (Monsignor) on Apr 29, 2009 at 15:05 UTC
Re^2: A Better Way to Find the Position of the Last Non-Whitespace Character in the Last Element of an Array. by NateTut (Deacon) on Apr 22, 2009 at 21:16 UTC
Thanks, I tried pos originally on my regex but it kept coming back with 20, which makes sense now that I think about it.
Re: A Better Way to Find the Position of the Last Non-Whitespace Character in the Last Element of an Array. by Fletch (Bishop) on Apr 22, 2009 at 17:07 UTC
`rindex( $MessageLines[ -1 ], ':' )` comes immediately to mind. If that's -1 there's no ':' in it (but then reading the rindex docs would tell one that). The cake is a lie. The cake is a lie. The cake is a lie.
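A short sketch of what `rindex` returns in the found and not-found cases. The wrapper name `last_colon_index` and the sample strings are assumptions for illustration. Note that this only helps when the character to look for is known in advance.

```perl
use strict;
use warnings;

# Hypothetical wrapper: 0-based offset of the last ':' in $line,
# or -1 when $line contains no ':' at all (that is rindex's own convention).
sub last_colon_index {
    my ($line) = @_;
    return rindex($line, ':');
}

for my $line ('Prompt:   ', 'a:b:c', 'no colon here') {
    my $idx = last_colon_index($line);
    print $idx == -1
        ? "'$line': no colon found\n"
        : "'$line': last colon at offset $idx\n";
}
```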
Re^2: A Better Way to Find the Position of the Last Non-Whitespace Character in the Last Element of an Array. by NateTut (Deacon) on Apr 22, 2009 at 21:18 UTC
I originally tried rindex, but unfortunately the last non-whitespace character won't always be a colon.
Re^3: A Better Way to Find the Position of the Last Non-Whitespace Character in the Last Element of an Array. by Fletch (Bishop) on Apr 22, 2009 at 21:24 UTC
Aaah, I missed that qualification (got stuck on the specific example). `$idx = $+[1] if $MessageLines[-1]=~/(\S)\s*$/` then. The cake is a lie. The cake is a lie. The cake is a lie.
Re: A Better Way to Find the Position of the Last Non-Whitespace Character in the Last Element of an Array. by kyle (Abbot) on Apr 22, 2009 at 17:06 UTC
You could easily eliminate a variable: `my @MessageLines = ( 'Message Line 1', 'Message Line 2', 'Message Line 3', 'Prompt: ' ); if ( $MessageLines[-1] =~ m{ \A ( .* \S ) \s* \z }xms ) { my $CursorCol = length $1; print "\$CursorCol\[$CursorCol\]\n"; }` If all you want is a one liner, this works: `my $CursorCol = grep { /\S/ .. undef } reverse split //, $MessageLines[-1];` I think what you have is easier to understand, however.
Re: A Better Way to Find the Position of the Last Non-Whitespace Character in the Last Element of an Array. by bichonfrise74 (Vicar) on Apr 22, 2009 at 21:17 UTC
Try this... `#!/usr/bin/perl use strict; my @MessageLines = ('Message Line 1', 'Message Line 2', 'Message Line3', 'Prompt: '); $MessageLines[-1] =~ s/\s+$//; my $last_char = substr( $MessageLines[-1], -1 ); print $last_char;`
Re: A Better Way to Find the Position of the Last Non-Whitespace Character in the Last Element of an Array. by AnomalousMonk (Archbishop) on Apr 22, 2009 at 23:08 UTC
It is also possible to take advantage of the @LAST_MATCH_START (in English, `@-` in Perlish) array (see perlvar) associated with the capture group variables `$1`, `$2`, etc. `>perl -wMstrict -le "$ARGV[$#ARGV] =~ m{ (\S) \s* \z }xms; my ($lastchr, $lastpos) = defined $1 ? ($1, $-[1]) : ('absent', -1); print '----- output -----'; print qq{last non-ws char is '$lastchr' at index $lastpos}; " "Message Line 1" "Message Line 2" "Message Line 3" "Prompt: " ----- output ----- last non-ws char is ':' at index 6` Update: Changed link to use English version of `@-` to avoid rendering problems.
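The thread uses both `$-[1]` (here) and `$+[1]` (Fletch's reply), so a side-by-side sketch may help. `$-[1]` is the offset where capture group 1 starts, and `$+[1]` is the offset just past its end, so for a one-character capture they differ by one. The helper name `last_nonws` is an assumption for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (char, start offset, end offset) for the last
# non-whitespace character of $line, or an empty list if there is none.
sub last_nonws {
    my ($line) = @_;
    return unless $line =~ m{ (\S) \s* \z }xms;
    # $-[1] is the 0-based index of the captured character;
    # $+[1] is one past it, i.e. the column where a cursor would sit.
    return ($1, $-[1], $+[1]);
}

my ($chr, $start, $end) = last_nonws('Prompt: ');
print "last non-ws char '$chr' starts at $start, ends at $end\n";
```

Which of the two offsets is wanted depends on whether the goal is the index of the character itself or the cursor column after it.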
Re: A Better Way to Find the Position of the Last Non-Whitespace Character in the Last Element of an Array. by johngg (Canon) on Apr 23, 2009 at 09:44 UTC
How about a look-ahead containing a capture? `$ perl -le ' > @strs = ( q{abcd }, q{abcdef}, q{ } ); > print qq{String: >$_< - }, > m{(?=(\S)\s*$)}g > ? qq{found $1 at offset @{ [ pos() ] }} > : q{no match} > for @strs;' String: >abcd < - found d at offset 3 String: >abcdef< - found f at offset 5 String: > < - no match $` I hope this is of interest. Cheers, JohnGG
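The look-ahead idea can be wrapped in a small function, sketched here under a few assumptions: the name `last_nonws_offset` is hypothetical, the capture is dropped because only the offset is returned, and `\z` is used in place of `$` as the end anchor. Since a look-ahead consumes no characters, `pos` reports where the zero-width match succeeded, which is the index of the last non-whitespace character itself.

```perl
use strict;
use warnings;

# Hypothetical helper: 0-based offset of the last non-whitespace character,
# or undef if the string contains none.
sub last_nonws_offset {
    my ($str) = @_;
    # The look-ahead succeeds at the first position whose character is
    # non-whitespace and is followed only by whitespace up to the end.
    return $str =~ m{(?=\S\s*\z)}g ? pos($str) : undef;
}

for my $str ('abcd ', 'abcdef', ' ') {
    my $offset = last_nonws_offset($str);
    print "String: >$str< - ",
        defined $offset ? "offset $offset" : 'no match', "\n";
}
```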