in reply to file name parsing to get the original file name

You can also do the following. It may be a little less efficient, but at least you won't need to load another module:
my $filename = 'xxx/yyy/zzz.a';
my ($tmp1, $tmp2, $name) = split(/\//, $filename);
print "$name";
$name from above will now contain zzz.a

Re: file name parsing to get the original file name
by Abigail-II (Bishop) on Aug 19, 2003 at 14:11 UTC
    That's not very flexible. I can understand not needing to care about separators from different platforms, but your approach only works if you have two directories and then the file. It will fail if there's just one directory, or three.

    my $filename = "one/two/three/four/five/six.a";
    my $name = (split m{/} => $filename) [-1];
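To make the difference concrete, here is a small sketch that runs both approaches over paths of different depths. The helper names name_fixed and name_last and the sample paths are made up for illustration; only the two split expressions come from the posts above.

```perl
use strict;
use warnings;

# Hypothetical helper: the fixed three-variable split from the parent post.
sub name_fixed {
    my ($path) = @_;
    my ($tmp1, $tmp2, $name) = split /\//, $path;
    return $name;    # undef when the path has fewer than three fields
}

# Hypothetical helper: take the last field, whatever the depth.
sub name_last {
    my ($path) = @_;
    return (split m{/}, $path)[-1];
}

for my $path ('xxx/yyy/zzz.a', 'yyy/zzz.a', 'one/two/three/four/five/six.a') {
    my $fixed = name_fixed($path);
    printf "%-30s fixed=%-8s last=%s\n",
        $path, defined $fixed ? $fixed : '(undef)', name_last($path);
}
```

With one directory the fixed split leaves $name undefined, and with more than two it picks up a directory name instead of the file.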

    Abigail

      If we are trying for 'best UNIX-only solution that requires no modules', I vote for:

      my($name) = $path =~ /([^\/]+)\z/;

      I second Abigail-II's suggestion that a module be used, though. These sorts of problems are generic in nature, and it is very scary to see hundreds of different solutions to the same problem, each with its own independent set of failings.

      At least if a single module is used by everybody, then the code is being exercised in a higher percentage of the possible contexts, and problems will be fixed sooner, rather than being discovered much later.
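For reference, a minimal sketch of the module route. The thread does not name a module at this point, so File::Basename (which ships with Perl) is an assumption here, and the sample paths are illustrative.

```perl
use strict;
use warnings;
use File::Basename qw(basename);   # core module; assumed, not named in the thread

# basename() returns the final path component, with or without directories.
for my $path ('xxx/yyy/zzz.a', '/etc/passwd', 'file') {
    print basename($path), "\n";
}
```

The module also deals with the platform's own separator conventions, which is the part the hand-rolled versions deliberately ignore.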

      UPDATE: Optimizing the above expression, we can see the speed improve by a factor of 6:

      $path =~ /(?:.*\/)?(.+)/s; my $name = $1;

      It seems that the Perl regular expression engine does a poor job of dealing with matching a pattern at the end of a string. This is not surprising given that most regular expression engines start searching from the beginning of the string.
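A short sketch that puts the two expressions side by side, to check that they pick the same name on ordinary paths. The sample paths are taken from the benchmark elsewhere in this thread; the comments describe the matching behaviour as suggested above, not a measured profile.

```perl
use strict;
use warnings;

my @paths = qw( /etc/passwd one/two/three/four/five/six.a file );

for my $path (@paths) {
    # Anchored at the end: the engine may have to try several start
    # positions before [^/]+ runs all the way to \z.
    my ($anchored) = $path =~ /([^\/]+)\z/;

    # Greedy prefix: .* runs to the end and backs up to the last slash,
    # then the capture takes whatever follows it.
    my ($greedy) = $path =~ /(?:.*\/)?(.+)/s;

    print "$path: anchored=$anchored greedy=$greedy\n";
}
```

The two are not identical on every input (a path ending in a slash is one case where they differ), so the comparison only covers plain file paths.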

        Enter sexeger.

        Add this to Abigail's benchmark.

        aristotle => 'foreach my $f (@files) { my ($fn) = reverse($f) =~ m!^(.*?)/?!s; $fn = reverse $fn; }',
        
                      Rate     markm   abigail    markm2 aristotle
        markm      39625/s        --      -56%      -58%      -61%
        abigail    89688/s      126%        --       -4%      -11%
        markm2     93877/s      137%        5%        --       -7%
        aristotle 100885/s      155%       12%        7%        --
        
        Reversing the string (twice!) may be costly, but the simplicity of the regex offsets this. Note that [^/]+ would have been much slower. .*? has been treated to special optimizations.
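As a standalone illustration of the reverse, match, reverse idea, here is a minimal sketch. The helper name last_component is hypothetical, and the pattern ends the lazy match on an explicit slash or the end of the string, which is an assumption of this sketch rather than the exact expression benchmarked above.

```perl
use strict;
use warnings;

# Hypothetical helper illustrating "reverse, match, reverse".
sub last_component {
    my ($path) = @_;
    my $rev = reverse $path;    # scalar context: reverses the characters

    # In the reversed string the file name comes first, so a lazy match
    # from the start up to the first slash (or the end) captures it.
    my ($rev_name) = $rev =~ m!^(.*?)(?:/|\z)!s;

    return scalar reverse $rev_name;
}

print last_component($_), "\n"
    for 'one/two/three/four/five/six.a', '/etc/passwd', 'file';
```

The pattern always matches, so a path with no slash at all comes back unchanged.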

        Makeshifts last the longest.

        Some quick benchmarking shows your solution to be about half as fast as mine.
        #!/usr/bin/perl
        use strict;
        use warnings;
        use Benchmark qw /cmpthese/;

        our @files = qw {
            /etc/passwd
            one/two/three/four/five/six.a
            file
            a/very/deep/file/indeed/deeper/than/you/may/think/really
        };

        cmpthese -5 => {
            abigail => 'foreach my $f (@files) { my $fn = (split m{/} => $f) [-1] }',
            markm   => 'foreach my $f (@files) { my ($fn) = $f =~ /([^\/]+)\z/ }',
        };

        __END__
        Benchmark: running abigail, markm for at least 5 CPU seconds...
           abigail:  5 wallclock secs ( 5.19 usr +  0.00 sys =  5.19 CPU) @ 68404.24/s (n=355018)
             markm:  6 wallclock secs ( 5.23 usr +  0.00 sys =  5.23 CPU) @ 34427.53/s (n=180056)
                   Rate   markm abigail
        markm   34428/s      --    -50%
        abigail 68404/s     99%      --

        Abigail

        Wouldn't it be easier just to do:

        $path =~ /.*\/(.*)$/; $name = $1;

        Basically, nuke everything in the way and grab what's left?
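One thing to watch with this form: the benchmark list elsewhere in the thread includes a bare "file" with no slash, and a pattern that requires a slash will not match such input, so $1 is left unset or stale. A minimal sketch, with a hypothetical helper name strip_dirs, that checks the match result and falls back to the whole string:

```perl
use strict;
use warnings;

# Hypothetical helper: same "drop everything up to the last slash" idea,
# but the match result is checked instead of reading $1 unconditionally.
sub strip_dirs {
    my ($path) = @_;
    return $path =~ m{.*/(.*)\z}s ? $1 : $path;
}

print strip_dirs($_), "\n" for 'xxx/yyy/zzz.a', 'file';
```

The \z anchor and the /s flag are small tightenings over the one-liner above, so that a newline in the path does not change what gets captured.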