comment on

Hello hbm. Thanks for taking time to tear this apart and show me where I can tighten things up.

For /^(?:The|An?) / vs. /^(The|A|An) /, the only reason I can give you is I have not yet grokked extended patterns in perlre. I should stop capturing when all I want is a cluster to save memory. (The|A|An) is one of the first things I learned for writing regexes. I still have to force myself to use [] for single characters like [ _-] and [yt1] instead of () (( |_|-) and (y|t|1) respectively). Another thing, you not knowing that /(The|A|An)/ worked is far better than me not knowing how to use a whole section of perlre.

For my @abbr = $name =~ /(?:_|\b)(\w)/g; vs. a for loop and substr, all I can say it that this began while I was teaching myself substr and helping someone else get it at the same time one really early morning. Until two days ago, this subroutine was a lot tinier.

sub initials {
  my $name = shift;
  for my $word (split(/( |_)/,$name) {
    push @abbr, substr($word,0,1);
  }
  print join('',@abbr);
}
[download]

Two days ago I looked at it and decided to add a few things. Little things went through my head like...

What if the user wants periods after each initial?
What if the abbreviation is all caps in spite of the grammar rules making certain words lowercase in names and titles?
HTML has an abbr tag, so I'll just add it in just in case I want to use it in my HTML code later.

Also, I did not know that I could use a regex like that to split a scalar into a list. Until now all I knew was split.

For join('.',@abbr) . '.' vs. join('',map { $_ =~ s/$/./; $_; } @abbr), all I can say is that I overcomplicated it. I did think of join('.',@abbr) at first, then thought but that won't put a period at the end, I guess I'll have to map it. The idea of concatenating a period on the end of join('.',@abbr) did not even cross my mind. eeps.

Now onto your update. I see that you are directly modifying $opt{name} to remove articles instead of assigning it to another variable. When I am modifying a variable with a regex, I almost always assign it to another variable first to preserve the original. If you are getting the HTML for the abbreviation of "The International House of Pancakes", in the title= part of the HTML, you might want the article to be there. Also, I am not seeing the single word test in your code. If I am abbreviating musicians names, I do not think I want Bono, Cher, Madonna, or Sting returned as B, C, M, or S; but I would want Olivia Newton-John returned as ONJ. Am I misreading it?

I will update this post with other questions I may have. I need to study the code more.

Have a cookie and a very nice day!

Lady Aleena

In reply to Re^2: Getting abbreviations or initials by Lady_Aleena
in thread Getting abbreviations or initials by Lady_Aleena

Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!

Titles consisting of a single word are discouraged, and in most cases are disallowed outright.

Read Where should I post X? if you're not absolutely sure you're posting in the right place.

Please read these before you post! —

Posts may use any of the Perl Monks Approved HTML tags:

a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr

You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)

	For:		Use:
	&		`&`
	<		`<`
	>		`>`
	[		`[`
	]		`]`

Link using PerlMonks shortcuts! What shortcuts can I use for linking?

See Writeup Formatting Tips and other pages linked from there for more info.