in reply to Read a list of files into an array

Dearest monks, how do I read a list of text files from a particular directory into an array?

You can use the glob function:

perl -e 'my @arr=glob ("*.*");print "$_\n" for (@arr)'

HTH

citromatik

Update: Or even more compressed

perl -e 'print "$_\n" for (glob ("*.*"))'

Update2: And let's add the test for text files:

perl -e 'for (glob ("*.*")){print "$_\n" if (-T $_);}'
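The one-liners above only print the names. Since the question asks for an array from a particular directory, here is a minimal sketch of the same glob-plus-`-T` idea as a script. The sub name `text_files_in` is made up for illustration, and using `bsd_glob` from the core File::Glob module (rather than plain `glob`) is my own choice, because it does not split the pattern on whitespace in the directory name. Treat `-T` as a heuristic, not a guarantee.

```perl
use strict;
use warnings;
use File::Glob qw(bsd_glob);

# Hypothetical helper: return the paths of plain files in $dir
# that Perl's -T heuristic considers to be text.
sub text_files_in {
    my ($dir) = @_;
    # -f skips directories and other non-regular entries,
    # -T then guesses whether the contents look like text.
    return grep { -f $_ && -T $_ } bsd_glob("$dir/*");
}

# Directory from the command line, defaulting to the current one.
my $dir = @ARGV ? $ARGV[0] : '.';
my @arr = text_files_in($dir);
print "$_\n" for @arr;
```

Run it as `perl script.pl /some/dir`; the returned names keep the directory prefix, so they can be passed straight to `open`.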

Re^2: Read a list of files into an array
by halley (Prior) on Jun 22, 2007 at 16:22 UTC
    For portability, you should get out of the habit of thinking of "*.*" as meaning "all entries in a directory." Just use "*" instead.

    On Un*x varieties including Linux and MacOS X, the period is just another character which might be in a filename. On those platforms, "*.*" finds only those entries that just happen to include at least one period.

    Windows thankfully expects "*" to mean "*.*", so using a single star parameter to glob() is portable on Windows and Un*x varieties both.

    my @subdirectories = grep { not /^\.\.?$/ } grep { -d } glob("*");
    my @files = grep { -f } glob("*");
    There is also a -T check which sniffs the head of a file for anything that doesn't smell like text, but I tend not to trust that kind of contents-check. It's not sufficiently clear if it would say true or false for a few UTF-8 or Latin high-bit characters, which I would still call a "text" file.
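To make the "*.*" versus "*" point concrete, here is a small sketch that builds a scratch directory and expands both patterns in it. The file names and the helper `names_matching` are invented for the demonstration, and it assumes Perl's default File::Glob behaviour as found on Un*x-like systems.

```perl
use strict;
use warnings;
use File::Basename qw(basename);
use File::Glob qw(bsd_glob);
use File::Temp qw(tempdir);

# Scratch directory with hypothetical file names, removed at exit.
my $dir = tempdir(CLEANUP => 1);
for my $name (qw(README Makefile notes.txt)) {
    open(my $fh, '>', "$dir/$name") or die "Cannot create $dir/$name: $!";
    close($fh);
}

# Hypothetical helper: expand $pattern inside $dir and
# return the bare file names in sorted order.
sub names_matching {
    my ($where, $pattern) = @_;
    my @names = map { basename($_) } bsd_glob("$where/$pattern");
    return sort @names;
}

my @dotted = names_matching($dir, '*.*');   # only names containing a period
my @all    = names_matching($dir, '*');     # every non-hidden entry

print "*.* matched: @dotted\n";
print "*   matched: @all\n";
```

Here "*.*" picks up only notes.txt, while "*" also returns README and Makefile.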

    --
    [ e d @ h a l l e y . c c ]

      Funny how you and Argel are slicing the Camel's hair over glob and file systems :-)

      I would say, for portability, "get out of the bad habit of using glob" to get a list of files. After all, what do the docs say about glob?

      In list context, returns a (possibly empty) list of filename expansions on the value of EXPR such as the standard Unix shell /bin/csh would do. In scalar context, glob iterates through such filename expansions, returning undef when the list is exhausted.

      There. That dratted csh - glob clearly has a UNIX bias. To know what glob really does, you have to know how the filename expansion of csh works - it uses the shell's wildcard syntax to filter filenames, not perl regular expressions. That's enough reason for me to almost never use glob.

      So, for the sake of portability, use opendir and readdir.
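A minimal sketch of the opendir/readdir approach, for comparison with the glob versions. The sub name `plain_files_in` is hypothetical, and the choice to return sorted bare names (without the directory prefix) is an assumption on my part, not something the thread prescribes.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: return the names of all plain files in $dir,
# hidden ones included.
sub plain_files_in {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "Cannot open directory $dir: $!";
    # readdir returns bare names, so rebuild the path before testing with -f.
    # The -f test also drops "." and "..", since both are directories.
    my @files = grep { -f File::Spec->catfile($dir, $_) } readdir($dh);
    closedir($dh);
    return sort @files;
}

# Directory from the command line, defaulting to the current one.
my $dir = @ARGV ? $ARGV[0] : '.';
print "$_\n" for plain_files_in($dir);
```

No shell-style pattern is involved here; any filtering is done with ordinary Perl tests or regular expressions inside the `grep`.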

      --shmem

      _($_=" "x(1<<5)."?\n".q·/)Oo.  G°\        /
                                    /\_¯/(q    /
      ----------------------------  \__(m.====·.(_("always off the crowd"))."·
      ");sub _{s./.($e="'Itrs `mnsgdq Gdbj O`qkdq")=~y/"-y/#-z/;$e.e && print}
      Except *.* is not portable either as it does not match UNIX hidden files -- files and directories that begin with a dot. In shells I usually use '.??*' to hit those (where the '??' is a kludge to make sure I avoid '.' and '..' -- the current and parent directory).
        The glob() won't find hidden files on Windows, either. Thus, the concepts are portable. If you need hidden files also, you should use an opendir() approach instead, and this goes for all platforms.

        I see the shell hack ".??*" a lot, but it has a nasty bug in there: it ignores directories or files that have names like ".a" or ".3". While this may seem like a rare case, there's a huge difference between making something that will break rarely, and making something that will work completely consistently. Just imagine trying to figure out why that directory didn't get added to a backup tape, and of course, it's too late when you discover this.
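The ".??*" gap is easy to reproduce. The sketch below creates a scratch directory with a few invented dot-file names, then lists hidden entries once with the ".??*" pattern and once with readdir plus a regular expression. It uses `bsd_glob` from File::Glob so the pattern can carry a directory prefix safely; that is my choice for the demo, not part of the original advice.

```perl
use strict;
use warnings;
use File::Basename qw(basename);
use File::Glob qw(bsd_glob);
use File::Temp qw(tempdir);

# Scratch directory with hypothetical file names, removed at exit.
my $dir = tempdir(CLEANUP => 1);
for my $name (qw(.a .3 .bashrc visible.txt)) {
    open(my $fh, '>', "$dir/$name") or die "Cannot create $dir/$name: $!";
    close($fh);
}

# Hidden entries according to the ".??*" idiom:
# a dot followed by at least two more characters.
my @glob_names = map { basename($_) } bsd_glob("$dir/.??*");
my @by_glob    = sort @glob_names;

# Hidden entries according to readdir:
# every name starting with a dot, except "." and "..".
opendir(my $dh, $dir) or die "Cannot open directory $dir: $!";
my @dot_names = grep { /^\./ && !/^\.\.?\z/ } readdir($dh);
closedir($dh);
my @by_readdir = sort @dot_names;

print "glob .??* : @by_glob\n";
print "readdir   : @by_readdir\n";
```

The glob line reports only .bashrc, while the readdir line also reports .3 and .a, which are exactly the entries the shell hack silently skips.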

        Regarding your other comment about filename extensions, Windows FAT always has an extension, even if it is blank. The pattern "*." is not the same thing as the pattern "*"; globbing "*." would get all files with no extension (and no visible period) on Windows, while it would get all files whose names end in a period on Un*x.

        --
        [ e d @ h a l l e y . c c ]

      Now that I think about it, even on Windows, files are not required to have extensions. Some examples are %SystemRoot%\system32\drivers\etc\hosts and the registry hive files.