PerlMonks  

Re: Question regarding a regex

by Anonymous Monk
on Jul 22, 2021 at 19:26 UTC ( [id://11135317]=note )


in reply to Question regarding a regex

No, the caret only negates the character set if it appears immediately after the left square bracket. Outside square brackets, it matches the beginning of the string or immediately after a newline. Also: the " -~" sequence (space, hyphen, tilde) inside square brackets means 'any character between space and tilde, inclusive.'

So in words, the regular expression specifies 'all characters in the first 4096 (or end-of-file, whichever comes first) are "\r", "\n", "\t", or characters in the range space to tilde, inclusive, in your machine's native encoding'.
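As a rough illustration of that reading, here is a minimal sketch. The pattern below is an assumed reconstruction of the kind of expression being discussed, not the exact one from the question, and looks_like_ascii_text is a hypothetical helper name.

```perl
use strict;
use warnings;

# Assumed pattern: every character must be CR, LF, TAB, or in the
# range space (0x20) through tilde (0x7E). Not the original regex.
my $printable = qr/\A[\r\n\t -~]*\z/;

# Hypothetical helper: returns 1 if the whole chunk matches, else 0.
sub looks_like_ascii_text {
    my ($chunk) = @_;
    return $chunk =~ $printable ? 1 : 0;
}

print looks_like_ascii_text("Hello, world!\r\n"), "\n";   # prints 1
print looks_like_ascii_text("caf\xE9"),           "\n";   # prints 0

# A caret right after the left bracket negates the class instead:
# this matches only if some character falls OUTSIDE space..tilde.
print "has other chars\n" if "abc\x00" =~ /[^ -~]/;
```

The \A and \z anchors are used here so that the whole chunk has to match regardless of any modifiers.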

Off-topic comments:

  • The -T operator will tell you if your file is ASCII/UTF-8 text (it is a heuristic guess based on the first block of the file). Just say return -T $_[0];.
  • The special file handle _ (i.e. underscore) means "whatever file was last tested" under any recent Perl, and can be faster because it makes use of the same stat() structure.
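  • The three-argument form of open() is preferred because it handles file names with strange characters better: the mode is a separate argument, so a name that starts with "<", ">" or "|" is not interpreted as a mode. In your case it would be open FH, "<", $_[0];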
  • You should probably ensure that your open() succeeded. Something like open FH, "<", $_[0] or die "Failed to open $_[0]: $!"; is the usual idiom.
  • People usually use lexical file handles rather than bareword file handles these days because they get closed automatically when they go out of scope (say, if you throw an unexpected exception). In your code that would look like open my $fh, "<", $_[0] ....
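Putting the points above together, a sketch might look like the following. The names is_text_file and read_head are hypothetical, and the 4096 limit is simply the figure mentioned earlier in this post; this is one way to arrange it, not necessarily how your code should be structured.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the path is a plain file that -T
# guesses to be text.
sub is_text_file {
    my ($path) = @_;
    return 0 unless -e $path;   # this test performs the stat()
    return 0 unless -f _;       # _ reuses the stat() data from the -e test
    return -T $path ? 1 : 0;    # heuristic guess: ASCII/UTF-8 text?
}

# Hypothetical helper: read up to $len bytes from the start of a file.
sub read_head {
    my ($path, $len) = @_;
    # Three-argument open, lexical handle, and an error check.
    open my $fh, "<", $path or die "Failed to open $path: $!";
    binmode $fh;
    my $buf = '';
    my $got = read $fh, $buf, $len;
    die "Failed to read $path: $!" unless defined $got;
    return $buf;                # $fh is closed when it goes out of scope
}

# Only runs when a file name is given on the command line.
if (@ARGV) {
    my $path = $ARGV[0];
    print "$path: ", (is_text_file($path) ? "text" : "not text"), "\n";
    print "first bytes: ", length(read_head($path, 4096)), "\n"
        if -f $path;
}
```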

Replies are listed 'Best First'.
Re^2: Question regarding a regex
by AnomalousMonk (Archbishop) on Jul 23, 2021 at 00:05 UTC
    Outside square brackets, it [caret] matches the beginning of the string or immediately after a newline.

    By default, ^ (caret) outside a character class matches only at the beginning of a string, exactly as \A does. Caret (outside a character class) also matches immediately after an embedded newline if the /m modifier is asserted. See Modifiers in perlre.
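The difference can be seen in a small sketch; the sample string and variable names are made up for illustration.

```perl
use strict;
use warnings;

my $text = "first line\nsecond line\n";

# Without /m, ^ anchors only at the very start of the string.
my @default = $text =~ /^(\w+)/g;

# With /m, ^ also matches immediately after each embedded newline.
my @multiline = $text =~ /^(\w+)/mg;

# \A always means start of string, even under /m.
my @absolute = $text =~ /\A(\w+)/mg;

print "default:   @default\n";     # default:   first
print "multiline: @multiline\n";   # multiline: first second
print "absolute:  @absolute\n";    # absolute:  first
```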


    Give a man a fish:  <%-{-{-{-<
