PerlMonks  

Re: Re: Re: matching every occurrence of a regex

by ihb (Deacon)
on Jan 07, 2003 at 16:44 UTC


in reply to Re: Re: matching every occurrence of a regex
in thread matching every occurrence of a regex

What most other posters here didn't consider is that the whole sequence is stored in a string. This string might be read in from a file but doesn't have to be, so all these while (<FILE>) {...} approaches might not work.

If the real string differs from the given example and the lines contain e.g. more 'whitespace surrounded numbers', then the regex has to be massaged accordingly. Nevertheless the general idea should still work.

In the second quote you speak of a general idea; in the first you complain that people show general ideas and don't bother with the details. One could just as easily have attached your own caveat, slightly modified, to each of the other replies: "If the input is obtained or stored in any other way, then the loop construct has to be massaged accordingly. Nevertheless the general idea should still work."

Many posts here are about general ideas (and that's good). It's silly to always have to point out that input read from a file should use while (...) { ... } whereas input held in some form of list should use for (...) { ... }. The idea is still that the problem is solved by looping through every sequence. How that is done is up to the final implementor. Imho, the questioner should be skilled enough to know how to read a file line by line, or how to loop through an array.

Personally I would use the same approach as you did, but that's irrelevant right now.

What's more important to note is that none of the replies that used while (...) { ... } local()ized $_!
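To make the concern concrete, here is a minimal sketch of how an unlocalized while (<...>) in a subroutine can damage the caller's data. The subroutine names and the in-memory filehandle are my own assumptions for illustration; none of this is taken from the replies in the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: counts lines, but assigns to the global $_.
sub count_lines_unsafe {
    my ($text) = @_;
    open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
    my $n = 0;
    while (<$fh>) { $n++ }    # implicitly: while (defined($_ = <$fh>))
    close $fh;
    return $n;
}

# Same helper, with $_ localized for the duration of the call.
sub count_lines_safe {
    my ($text) = @_;
    open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
    local $_;
    my $n = 0;
    while (<$fh>) { $n++ }
    close $fh;
    return $n;
}

# In a for loop $_ is an alias to each array element, so whatever the
# callee leaves in $_ ends up in the array itself.
my @unsafe = ('a', 'b');
count_lines_unsafe("x\ny\n") for @unsafe;

my @safe = ('a', 'b');
count_lines_safe("x\ny\n") for @safe;

printf "unsafe: %d defined, safe: %d defined\n",
    scalar(grep { defined } @unsafe),
    scalar(grep { defined } @safe);
```

The elements of @unsafe end up undefined because the final failed read assigns undef through the alias, while @safe is left alone.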

ihb

Re4: matching every occurrence of a regex
by Hofmator (Curate) on Jan 07, 2003 at 17:13 UTC

    Well, the original messages stated that the protein sequence is "just a string of letters". From that I drew the conclusion that the whole thing is stored in a scalar. On rereading, that's not so obvious anymore, and it may be my misinterpretation.

    Concerning the localizing of $_, that's the right thing to do in certain circumstances. However, we are showing snippets of code without context here, presenting general ideas as you write yourself. We can't know whether it's reasonable to localize $_ or not; the author of the script has to decide that himself. You could point out, though, (as a general remark) that it might be a good idea to use local $_ if you suspect that the person asking might not be aware of that.

    Update: When I'm talking about 'certain circumstances' I'm thinking about short scripts that act like a unix filter with one main while (<>) {} loop. Localizing makes (most of the time) no sense there. However, ihb makes a very valid point for the more general case below.

    I'd agree that we mostly agree ;-)

    -- Hofmator

      Concerning the localizing of $_, that's the right thing to do in certain circumstances.

      I'd argue that it's almost always the right thing to do. I'd say that it's an exception not to localize $_, and so I really think that it should be put in demonstrative code--especially code targeting not-too-advanced Perl programmers. I'd go as far as putting a "Just do it unless you understand why you usually should do it and have a good reason not to" mark on this issue.

      In practice, not localizing $_ amounts to doing $_ = undef, unless something breaks out of the loop, because the loop continues until $_ is undefined. If you actually want $_ to be undefined after the loop then you're probably better off explicitly undefining $_ after the loop. Or you should probably ask yourself why you want to explicitly undefine it.
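A small sketch of the three outcomes described above, assuming an in-memory filehandle and made-up sample data (both are my own choices for illustration):

```perl
use strict;
use warnings;

my $data = "alpha\nbeta\ngamma\n";

# 1. The loop runs to end of file: the last, failed read leaves $_ undefined.
$_ = 'before';
open my $fh, '<', \$data or die "open failed: $!";
while (<$fh>) { }
close $fh;
my $after_eof = $_;

# 2. The loop is left early with last: $_ keeps the line that was current.
$_ = 'before';
open $fh, '<', \$data or die "open failed: $!";
while (<$fh>) { last if /^beta/ }
close $fh;
my $after_last = $_;

# 3. $_ is localized in an enclosing block: the old value comes back.
$_ = 'before';
{
    local $_;
    open $fh, '<', \$data or die "open failed: $!";
    while (<$fh>) { }
    close $fh;
}
my $after_local = $_;

print defined $after_eof ? "defined\n" : "undef\n";    # case 1
print $after_last;                                      # case 2
print "$after_local\n";                                 # case 3
```

Case 1 leaves $_ undefined, case 2 leaves it holding "beta\n", and case 3 restores 'before'.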

      Afaik (but I have no reliable source at the moment--and I started Perling right about v5.6's release so I have no historical perspective of my own), foreach didn't always localize its associated variable. At some time the porters (or whoever it was) decided that foreach really ought to localize its variable. And I think everyone agrees that's a good thing. Why while wasn't given the same treatment I can only speculate about. I find it somewhat likely, though, that constructs like the one below could've had something to do with it. (If such decision-making ever took place.)
      /pat/ or last while <FOO>; print;
      You could point out, though, (as a general remark) that it might be a good idea to use local $_

      ... and I sure will. ;) (Actually, this post isn't directly targeted towards you, as you seem to mostly agree with me.)

      This is such a serious topic it's even been mentioned in Sins of Perl Revisited.

      Personally I'm quite paranoid about modules that are likely to use this particular while loop. I always (unless I'm familiar with the author and know that such mistakes aren't likely to happen) check the source to make sure that the module won't make me scratch my head for hours and hours because of a destroyed $_. Actually, I entertained myself for a little while looking for places where $_ is improperly left unlocalized in my Perl installation. Out of 1076 scanned .pm files I found 28 that matched the very simple pattern /while \s* \(? \s* </x. Out of these 28 modules 13 (!) were not localizing $_ properly. I checked all 28 modules briefly to make sure that the match was indeed code (not in quotes, pod, or what-not), and I also tried to see if $_ was localized in the current subroutine, and if not, whether it was used in some particular way. (In Pod::Functions the code was at file scope, but I still marked it as improper use.) But I don't claim to be perfect, so some cases below might have legitimate reasons for not localizing $_.
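For anyone who wants to try a similar scan, here is a rough sketch of what the pattern above picks up. The sample lines are invented for illustration; this is not the script used for the numbers above, and it only flags candidates. Whether $_ is localized still has to be checked by reading the surrounding code.

```perl
use strict;
use warnings;

# The pattern quoted above; with /x the spaces inside it are ignored.
my $pattern = qr/while \s* \(? \s* </x;

# Made-up source lines standing in for the contents of .pm files.
my @samples = (
    'while (<FH>) { ... }',               # reads into $_: flagged
    'print while <FH>;',                  # statement modifier: flagged
    'while (my $line = <FH>) { ... }',    # lexical variable: not flagged
    'while ($i < 10) { $i++ }',           # numeric comparison: not flagged
);

my @hits = grep { $_ =~ $pattern } @samples;
print "$_\n" for @hits;
```

Only the first two sample lines are reported. To scan real files, the same grep could be applied to the lines of each .pm file found under the directories in @INC.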

      (It seems like the authors were consistent. Those that localized $_ did that for all whiles, whereas those that didn't localize for one case didn't localize for any case.)

      ihb

      Curious about the 13 modules?
