in reply to Re: Getting abbreviations or initials
in thread Getting abbreviations or initials
Hello hbm. Thanks for taking time to tear this apart and show me where I can tighten things up.
For /^(?:The|An?) / vs. /^(The|A|An) /, the only reason I can give you is that I have not yet grokked the extended patterns in perlre. To save memory, I should stop capturing when all I want is a cluster. (The|A|An) is one of the first things I learned for writing regexes. I still have to force myself to use [] for single characters, like [ _-] and [yt1], instead of () with alternation, like ( |_|-) and (y|t|1). Another thing: you not knowing that /(The|A|An)/ worked is far better than me not knowing how to use a whole section of perlre.
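To make the cluster vs. capture difference concrete for myself, here is a minimal sketch. The sample strings and variable names are my own, chosen only for illustration.

```perl
use strict;
use warnings;

# Non-capturing cluster: groups the alternatives without setting $1.
# The copy-then-substitute form leaves the original string alone.
(my $stripped = 'The International House of Pancakes') =~ s/^(?:The|An?) //;
print "$stripped\n";

# Character class for single-character alternatives.
# The hyphen is placed last so it is taken literally.
my @parts = split /[ _-]/, 'Olivia Newton-John';
print join('|', @parts), "\n";
```

An? matches both "A" and "An", so (?:The|An?) covers the same three articles as (The|A|An) without capturing anything.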
For my @abbr = $name =~ /(?:_|\b)(\w)/g; vs. a for loop and substr, all I can say is that this began while I was teaching myself substr and helping someone else get it at the same time, very early one morning. Until two days ago, this subroutine was a lot tinier.
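sub initials {
    my $name = shift;
    my @abbr;
    # [ _] rather than ( |_): capturing parens would make split return the separators too
    for my $word (split(/[ _]/, $name)) {
        push @abbr, substr($word, 0, 1);
    }
    print join('', @abbr);
}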
Two days ago I looked at it and decided to add a few things. Little things went through my head like...
Also, I did not know that I could use a regex like that to split a scalar into a list. Until now all I knew was split.
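Here is how I understand the list-context match to work, as a small sketch. The sample name is made up for the example.

```perl
use strict;
use warnings;

my $name = 'the_quick brown-fox';

# In list context, a /g match returns every captured (\w): the first
# word character after an underscore or after a word boundary.
# The underscore needs its own alternative because _ counts as a word
# character, so there is no \b between "e" and "_".
my @abbr = $name =~ /(?:_|\b)(\w)/g;

print join('', @abbr), "\n";    # prints "tqbf"
```

No split or substr is needed; the regex picks out the initials directly.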
For join('.',@abbr) . '.' vs. join('',map { $_ =~ s/$/./; $_; } @abbr), all I can say is that I overcomplicated it. I did think of join('.',@abbr) at first, then thought "but that won't put a period at the end, I guess I'll have to map it." The idea of concatenating a period onto the end of join('.',@abbr) did not even cross my mind. Eeps.
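A quick side-by-side of the two approaches, as a sketch with a made-up list of initials. The map variant here builds new strings instead of using s/$/./, which is my own substitution for the comparison.

```perl
use strict;
use warnings;

my @abbr = qw(O N J);

# Join with periods, then add the trailing period by concatenation.
my $simple = join('.', @abbr) . '.';

# Map variant that appends a period to a copy of each element.
my $mapped = join('', map { "$_." } @abbr);

print "$simple\n$mapped\n";    # both lines print "O.N.J."
```

One thing worth noting about the s/$/./ version: inside map, $_ is an alias to each element, so the substitution also changes @abbr itself. The two forms also differ on an empty list, where the concatenation version yields a lone period and the map version yields an empty string.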
Now onto your update. I see that you are directly modifying $opt{name} to remove articles instead of assigning it to another variable. When I am modifying a variable with a regex, I almost always assign it to another variable first to preserve the original. If you are getting the HTML for the abbreviation of "The International House of Pancakes", you might want the article to be there in the title= part of the HTML. Also, I am not seeing the single-word test in your code. If I am abbreviating musicians' names, I do not think I want Bono, Cher, Madonna, or Sting returned as B, C, M, or S; but I would want Olivia Newton-John returned as ONJ. Am I misreading it?
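To show what I mean by preserving the original and testing for single words, here is a rough sketch. The sub name initials_or_name is hypothetical, and this is neither your code nor my full subroutine, just the two ideas in isolation.

```perl
use strict;
use warnings;

# Hypothetical helper for illustration only.
sub initials_or_name {
    my ($name) = @_;

    # Copy first, then strip a leading article from the copy,
    # so $name is still available unchanged (e.g. for title=).
    (my $work = $name) =~ s/^(?:The|An?) //;

    my @abbr = $work =~ /(?:_|\b)(\w)/g;

    # Single-word test: leave names like Bono or Cher alone.
    return $name if @abbr < 2;

    return join('', @abbr);
}

print initials_or_name('Olivia Newton-John'), "\n";    # prints "ONJ"
print initials_or_name('Bono'), "\n";                  # prints "Bono"
```

In this sketch a name that is a single word after its article is removed also comes back unchanged, article included; whether that is the right behavior is a judgment call.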
I will update this post with other questions I may have. I need to study the code more.
Re^3: Getting abbreviations or initials
by hbm (Hermit) on Aug 21, 2012 at 00:54 UTC