PerlMonks  

Re: Regex to get file name from the path with spaces (updated)

by haukex (Archbishop)
on Sep 08, 2021 at 07:57 UTC


in reply to Regex to get file name from the path with spaces

Don't use a regex; use File::Spec, which is a core module (or you can use e.g. Path::Class or Path::Tiny from CPAN):

      use File::Spec; my (undef, undef, $filename) = File::Spec->splitpath($fullpath);

splitpath returns a list of volume, directories, and file name, so the assignment above takes the third element. If this script is running on a non-Windows system but you need to handle Windows filenames, write File::Spec::Win32 instead of File::Spec.
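As a minimal sketch of the suggestion above, the following uses File::Spec::Win32 so that Windows path rules apply regardless of the host OS. The sample path is an ASCII variant of the one quoted later in the thread, and the variable names are illustrative only.

```perl
use strict;
use warnings;
use File::Spec::Win32;   # core module; applies Windows path rules on any OS

# Sample path (assumption, modeled on the thread's example).
# Single quotes keep the backslashes literal.
my $fullpath = 'C:\temp\test abc.txt';

# splitpath returns a three-element list: volume, directories, file name.
my ($volume, $dirs, $filename) = File::Spec::Win32->splitpath($fullpath);

print "volume:    $volume\n";    # C:
print "directory: $dirs\n";      # \temp\
print "filename:  $filename\n";  # test abc.txt
```

The space in the file name needs no special handling, because splitpath only looks at the volume prefix and the path separators.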

Update: After rereading I realize you're also trying to extract the path from a string that looks like "TEXT: \"...\"". I feel like we might be missing some context, because the file format is unclear to me. Is this some standardized file format you're trying to extract a part of? If so, what format? Usually quoted strings also have some kind of escaping mechanism; is that the case here? All of these things will affect what the best solution is. If the format really is as simple as it seems, then I would combine Corion's solution to extract the string from the quotes with my suggestion above to extract the filename.
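A rough sketch of that two-step combination might look like the following. The quote-extraction pattern here is an assumption standing in for Corion's solution, which is not reproduced in this node, and it only holds if the format has no escaped quotes inside the string.

```perl
use strict;
use warnings;
use File::Spec::Win32;

# Hypothetical input line modeled on the example from the thread.
my $line = 'TEXT: "C:\temp\test abc.txt"';

my $filename;
# Step 1 (assumed pattern): capture everything between the double quotes.
if ( $line =~ /TEXT:\s+"([^"]*)"/ ) {
    my $fullpath = $1;
    # Step 2: let File::Spec split the path; the file name is element 2.
    $filename = ( File::Spec::Win32->splitpath($fullpath) )[2];
    print "$filename\n";
}
else {
    warn "no quoted path found in: $line\n";
}
```

Keeping the two steps separate means that a change to the quoting rules only touches the first pattern, while the path handling stays with File::Spec.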

Replies are listed 'Best First'.
Re^2: Regex to get file name from the path with spaces
by szpt9m (Novice) on Sep 08, 2021 at 08:04 UTC
    Thanks for the response. I have many other lines to capture in the regex, and to make it easier I have shown only the part of the regex where I am facing the issue. So I need a fix in the regex so that I can adapt the existing code.

      Maybe you can think of your capture problem in a different way then. Most likely, you want everything after the last "path separator" up until the double quotes:

      TEXT:\s+".*?[\\/]([^\\/"]+)"

      The filename can contain everything except a path separator (\ or /) and double quotes, and must be followed by double quotes.
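To illustrate the description above, here is a small sketch that runs the pattern against a few sample lines. The inputs are made up for the example (the first is an ASCII variant of the path from the thread), and the last one shows a limitation: a quoted name with no path separator at all does not match.

```perl
use strict;
use warnings;

# The pattern from the reply above; qr{} avoids having to escape "/".
my $re = qr{TEXT:\s+".*?[\\/]([^\\/"]+)"};

# Hypothetical sample lines; single quotes keep backslashes literal.
my @lines = (
    'TEXT: "C:\temp\test abc.txt"',
    'TEXT: "/home/user/my file.txt"',
    'TEXT: "no_separator.txt"',
);

my @results;
for my $line (@lines) {
    if ( $line =~ $re ) {
        push @results, $1;       # text after the last separator
        print "matched:  $1\n";
    }
    else {
        push @results, undef;    # nothing before the name to anchor on
        print "no match: $line\n";
    }
}
```

If bare file names can occur in the input, the separator part would have to be made optional, which is one more reason the two-step approach with File::Spec may be easier to maintain.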

        Thanks, this worked for me: .*TEXT:\s+".*?[\\\/]([^\\\/"]+)"
        Somehow it is not working for me, or am I doing something wrong? :( I used it like this: .*TEXT:\s+".*?[\/]([\/"]+)" because \\ was not recognized, and for my input TEXT: "C:\temp\test äbc.txt" no matches were found.
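One possible reading of the failure, sketched below: the modified pattern drops both the `\\` and the `^`, so `[\/]` accepts only a forward slash and `([\/"]+)` captures only slashes and quotes, while the input contains only backslashes. The third pattern is an assumption, not something proposed in the thread: `\x5C` is another way to write a backslash in a Perl regex, which may help if the surrounding tool mangles `\\`. The input is an ASCII variant of the one quoted above.

```perl
use strict;
use warnings;

# ASCII variant of the input quoted in the thread (backslash-separated path).
my $input = 'TEXT: "C:\temp\test abc.txt"';

my @patterns = (
    [ 'as posted'   => qr{.*TEXT:\s+".*?[\\\/]([^\\\/"]+)"}   ],
    [ 'as modified' => qr{.*TEXT:\s+".*?[\/]([\/"]+)"}        ],
    # Assumed alternative: hex escape for the backslash.
    [ 'hex escape'  => qr{.*TEXT:\s+".*?[\x5C/]([^\x5C/"]+)"} ],
);

my %captured;
for my $p (@patterns) {
    my ($label, $re) = @$p;
    # Store the captured file name, or undef when the pattern fails.
    $captured{$label} = $input =~ $re ? $1 : undef;
    printf "%-12s %s\n", $label, $captured{$label} // '(no match)';
}
```

Whether `\x5C` survives the tool that rejected `\\` cannot be told from the thread, so it would need to be tried in that environment.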
