in reply to Re: Function for reading file
in thread Function for reading file
How can I use it without the word regex \W?
Re^3: Function for reading file
by davido (Cardinal) on Mar 28, 2020 at 23:36 UTC
Is your goal really to match every character in the ASCII character set except for 0-9 and _ (underscore)? I couldn't suggest a better alternative without knowing exactly what you want. However, again if you're only working with ASCII and not Unicode, these expressions do the same thing, in slightly different ways:

Of these options the second one is certainly the easiest to read. But my point in my previous post was that I'm doubtful this is exactly what you want. It seems very suspect to allow matching \t, \n, (, ), ^, -, A, B, z, ' ' (space), ',' (comma), and so on, but to disallow any numeric digit and the underscore. It doesn't seem like it's doing what you want it to be doing. But you didn't make clear to me what it is that you actually want to do.

Furthermore, if you are dealing with Unicode semantics, the number of characters that are matched by that pattern is enormous, and even weirder.

If you suggested what you're trying to match we might be able to help come up with a more specific expression.

Dave
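For anyone who wants to check the ASCII claim for themselves, here is a minimal sketch. It assumes the two classes being compared are [A-Za-z\W] and [^_\d], which are the forms quoted in the follow-up reply in this thread; the variable names are made up for illustration. It walks the 128 ASCII code points with the /a modifier (Perl 5.14 or later) and then tries one non-ASCII character without /a to show where the two classes part ways.

```perl
use strict;
use warnings;

# Assumed classes under discussion; /a restricts \W and \d to ASCII rules.
my $with_W  = qr/\A[A-Za-z\W]\z/a;
my $negated = qr/\A[^_\d]\z/a;

my (@disagree, @rejected);
for my $code (0 .. 127) {
    my $char = chr $code;
    my $m1 = $char =~ $with_W  ? 1 : 0;
    my $m2 = $char =~ $negated ? 1 : 0;
    push @disagree, $code if $m1 != $m2;   # code points where the classes differ
    push @rejected, $char unless $m1;      # characters the first class refuses
}

printf "ASCII disagreements: %d\n", scalar @disagree;
print  "rejected: @rejected\n";

# Under Unicode rules (no /a) the classes are no longer equivalent.
# U+03B1 GREEK SMALL LETTER ALPHA is a \w character, so \W rejects it,
# while [^_\d] accepts it because it is neither '_' nor a digit.
my $alpha       = "\x{3b1}";
my $uni_with_W  = $alpha =~ /\A[A-Za-z\W]\z/ ? 1 : 0;
my $uni_negated = $alpha =~ /\A[^_\d]\z/     ? 1 : 0;
print "alpha: [A-Za-z\\W]=$uni_with_W [^_\\d]=$uni_negated\n";
```

Within ASCII the only characters refused are the ten digits and the underscore, which is exactly the odd-looking set questioned above.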
Re^3: Function for reading file
by AnomalousMonk (Archbishop) on Mar 29, 2020 at 01:42 UTC
... the word regex \W ...

It's important to understand what you're dealing with. The \W (that's big-W) character class (see perlrecharclass) matches any character that is not a \w (little-w) character. The \w characters are sometimes called "word" characters, but IIRC they originate with the set of characters that are allowed in a C- or Perl-language identifier; that's why _ (underscore) is included, but - (hyphen), for instance, is not. So \W is better described as the anti-word regex!

And I agree with davido's point here that if [A-Za-z\W] really does the trick for you, then [^_\d] is clearer, more readable, more maintainable, and IMHO preferable.

Update: Made "identifier" into a Wikipedia link.

Give a man a fish: <%-{-{-{-<
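To see concretely what \w covers, a short sketch like the following lists the ASCII characters it matches and checks the underscore and hyphen cases mentioned above. The /a modifier (Perl 5.14 or later) is an assumption made here to keep \w ASCII-only; without it, and with Unicode strings, the set is far larger.

```perl
use strict;
use warnings;

# Collect every ASCII character that \w matches; /a keeps \w ASCII-only.
my @word = grep { /\A\w\z/a } map { chr($_) } 0 .. 127;

printf "%d word characters: %s\n", scalar @word, join '', @word;

# Underscore is a "word" character, hyphen is not.
for my $char ('_', '-') {
    printf "'%s' is matched by %s\n", $char, $char =~ /\w/a ? '\w' : '\W';
}
```

Everything not in that list, including whitespace and punctuation, is what \W matches.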