Perl gives you many facilities for processing strings, and it is generally (though not always) unidiomatic to reach for low-level routines like 'index'.

Often processing using regular expressions is more suitable.

For example, take code like this (not tested, but you get the idea):

# Pull token from line (not good Perl style)
$end   = index( $line, ' ' );       # index() takes a string, not a regex
$token = substr( $line, 0, $end );
$line  = substr( $line, $end );     # still need to lose the leading whitespace

You could instead write:

( $token, $line ) = split( /\s+/, $line, 2 );

or even:

( $token, $line ) = $line =~ /^(\S+)\s+(.*)$/;  # (.*) so the rest can hold more than one word
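
The two idioms are not quite interchangeable, and running both on a few sample lines shows where they differ. This is only a sketch: the helper names first_token_split and first_token_regex and the sample strings are made up for illustration, and the regex version uses (.*) for the rest of the line so that it can hold more than one word.

```perl
use strict;
use warnings;

# Hypothetical helpers, named only for this illustration.
sub first_token_split {
    my ($line) = @_;
    # Limit of 2: the first whitespace-delimited field, then the untouched rest.
    return split( /\s+/, $line, 2 );
}

sub first_token_regex {
    my ($line) = @_;
    # In list context a failed match returns an empty list, so a line
    # without "token, whitespace, rest" yields nothing at all.
    return $line =~ /^(\S+)\s+(.*)$/;
}

for my $line ( 'foo bar baz', 'single', ' leading space' ) {
    my @s = first_token_split($line);
    my @r = first_token_regex($line);
    print "split: [", join( '|', @s ), "]  regex: [", join( '|', @r ), "]\n";
}
```

On a one-word line split still hands back the token, while the regex fails to match. With leading whitespace, split returns an empty first field and the anchored regex again does not match, so pick whichever behaviour suits your input.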

It perhaps looks a little odd if you are used to other languages, and you are of course free to code how you want. But IMHO you don't get the warm fuzzy Perl feeling unless you are working in the idiom.

Now someone is going to tell me how I should be doing the above in a much more efficient way ;-)

Hmmm... one last thought: if you don't have to deal with quoting issues (or you do, and you are red hot at regexps ;-) you get to do things like:

@words = split( /\s+/, $line );  # Split all words into the array (pattern first, then string)
foreach my $word ( @words ) {
  # drive state machine
}

And lastly, if you do have to deal with quoted whitespace etc. (isn't the real world a tough place), you might find what you need in the Text::ParseWords module. You shouldn't even have to go to CPAN for that: it's part of the base install.
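
As a rough sketch of what Text::ParseWords buys you, here is its shellwords function applied to a line containing quoted whitespace. The sample line is invented for the example; check the module's documentation for how it treats backslashes and unbalanced quotes before relying on it.

```perl
use strict;
use warnings;
use Text::ParseWords qw(shellwords);

# Sample input invented for this example.
my $line = q{set title "Hello World" 'single quoted' plain};

# shellwords() splits on whitespace, keeps each quoted string together
# as one token, and strips the quote characters themselves.
my @tokens = shellwords($line);

print "$_\n" for @tokens;
```

Here "Hello World" and 'single quoted' each come back as a single token, which a plain split( /\s+/, $line ) would have broken in two.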

In reply to Re: Character Index by jbert
in thread Character Index by wyvern
