Melly has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks

I'm trying to understand the different behaviour of file-handles and similar (e.g. '<DATA>').

I don't think I'd ever really thought about them - they just worked - but I ran into a problem with CSV_XS and getline. Basically, why does a FH variable work (e.g. $IN), but <IN> and <DATA> don't?

The following code is non-functional without commenting out stuff, but should illustrate what I mean.

use Text::CSV_XS;

my $csv = Text::CSV_XS->new();

open(my $IN, '<', 'test.csv'); # ok
open(IN, '<', 'test.csv');     # nope

my $row = $csv->getline(<DATA>); # nope - Usage: Text::CSV_XS::getline(self, io) at test_02.pl line 11, <IN> line 2.
my $row = $csv->getline(<IN>);   # nope - Usage: Text::CSV_XS::getline(self, io) at test_02.pl line 11, <IN> line 2.
my $row = $csv->getline($IN);    # ok

print ${$row}[0];

__DATA__
a,b,c
d,e,f

So, what is the difference between "open(IN..." and "open($IN..."? (and is there a way to alias <DATA> to, e.g., $DATA?)

UPDATE

Ah - \*DATA or \*IN - err, what does '\*' imply?

map{$a=1-$_/10;map{$d=$a;$e=$b=$_/20-2;map{($d,$e)=(2*$d*$e+$a,$e**2
-$d**2+$b);$c=$d**2+$e**2>4?$d=8:_}1..50;print$c}0..59;print$/}0..20

Tom Melly, pm (at) cursingmaggot (stop) co (stop) uk


Re: Filehandles and CSV_XS (updated) by haukex (Archbishop) on Sep 01, 2023 at 10:46 UTC
`<DATA>` is not a filehandle; it's the `<>` I/O operator (aka readline) applied to the bareword filehandle `DATA`. `$csv->getline(DATA)` may work in your code because you don't have strict turned on (which can cause strange and hard-to-analyze bugs), but since you should always Use strict and warnings, you should use the glob notation to pass bareword filehandles as arguments, i.e. `$csv->getline(*DATA)`. For files you open, nowadays you should always use lexical filehandles; see "open" Best Practices.

Minor edits shortly after posting.

Update in response to your edit: Relevant reading is Symbol Tables, Typeglobs and Filehandles, and Passing Symbol Table Entries (typeglobs) (and perlref for globrefs).
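To make the distinction concrete, here is a minimal sketch that does not need Text::CSV_XS. The helper `first_field` is hypothetical and only stands in for a method like `getline` that expects exactly one handle argument; the in-memory file stands in for test.csv. It shows that `\*IN` and `*IN` both pass the handle, while `<IN>` in an argument list reads every remaining line and passes those strings instead, which would explain a usage error like the one in the question.

```perl
use strict;
use warnings;

# In-memory "file" so the sketch is self-contained (stands in for test.csv).
my $text = "a,b,c\nd,e,f\n";

# Hypothetical helper: like getline, it wants exactly one argument, a handle.
sub first_field {
    die "usage: first_field(io)\n" unless @_ == 1;
    my ($io) = @_;
    my $line = readline($io);
    return unless defined $line;
    chomp $line;
    return ( split /,/, $line )[0];
}

open( IN, '<', \$text ) or die "open: $!";
my $via_ref  = first_field( \*IN );    # reference to the glob: reads "a,b,c"
my $via_glob = first_field(*IN);       # the glob itself: reads "d,e,f"
close(IN);

open( IN, '<', \$text ) or die "open: $!";
# <IN> in list context returns all remaining lines, so the helper
# receives two strings rather than one handle and rejects the call.
my $ok  = eval { first_field(<IN>); 1 };
my $err = $@;
close(IN);

print "$via_ref $via_glob\n";
print "passing <IN> failed: $err" unless $ok;
```

The same notation should apply to `DATA`: `my $DATA = \*DATA;` gives a scalar that can be handed to anything expecting a handle.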
Re^2: Filehandles and CSV_XS by Melly (Chaplain) on Sep 01, 2023 at 10:55 UTC
Ah! That (sort-of) makes sense - in all honesty, my brain gives up on at least three things - quantum physics, Trump-supporters, and type-globs...
Re^3: Filehandles and CSV_XS by eyepopslikeamosquito (Archbishop) on Sep 01, 2023 at 13:09 UTC
"in all honesty, my brain gives up on at least three things - quantum physics, Trump-supporters, and type-globs..."

Thanks for making me laugh. :) I'm relieved you didn't mention your brain giving up on `use strict`, `use warnings` and lexical variables. You really need to embrace all three. To give a simple example why, notice that this code:

    use strict;
    use warnings;

    sub fred {
        my $fname = 'f.tmp';
        open( FH, '<', $fname ) or die "error: open '$fname': $!";
        print "file '$fname' opened ok\n";
        # ... process file here
        die "oops";    # if something went wrong
        close(FH);
    }

    eval { fred() };
    if ($@) { print "died: $@\n" }
    # oops, handle FH is still open if an exception was thrown.
    my $line = <FH>;
    print "oops, FH is still open:$line\n";

is not exception-safe because the ugly global file handle `FH` is not closed when die is called. A simple remedy, as noted at Exceptions and Error Handling References, is to replace the ugly global `FH` with a lexical file handle `my $fh`, which is auto-closed at end of scope (RAII):

    use strict;
    use warnings;

    sub fred {
        my $fname = 'f.tmp';
        open( my $fh, '<', $fname ) or die "error: open '$fname': $!";
        print "file '$fname' opened ok\n";
        # ... process file here
        die "oops";    # if something went wrong
        close($fh);
    }

    eval { fred() };
    if ($@) { print "died: $@\n" }
    print "ok, \$fh is auto-closed when sub fred exits (normally or via die)\n";
Re^3: Filehandles and CSV_XS by NERDVANA (Priest) on Sep 02, 2023 at 06:50 UTC
I know that feeling :-) and yeah I also had that experience when first encountering typeglobs. But, here's the simple explanation I arrived at for my own understanding:

Perl allows you to create different types of global things using the same name, like '$foo', '@foo', '%foo', 'sub foo' and so on. To store them internally, one logical design would have been to have a hash table of "ScalarGlobals", another hash table of "ArrayGlobals", another hash table of "HashGlobals", and so on. In perl-speak, it might look like `$globals{$package_name}[Scalars]{$scalar_name}`. But, that makes a lot of hash tables per package name. Larry chose instead to create one hash table per package, and then store a struct which has a "slot" for each type of thing. In perl-speak, it might look like `$globals{$package_name}{$thing_name}[ScalarSlot]`. The end result is lower use of memory by having fewer hash tables.

In the original language, NAME referred to the file handle, $NAME the scalar, @NAME the array, %NAME the hash, and &NAME the subroutine. But, for reasons I am uninformed of, it was decided that people needed access to these slots in the perl language (probably so that Exporter can be written in Perl instead of written in C) and so `*NAME` gives you access to this C struct that has a slot for each type of thing. The notation `*NAME{IO}` is how you access the file handle slot of that struct.

Now that bare-word file handles are discouraged, that implementation detail of the C-side typeglob structs (and the awkward Perl syntax for it) is more in-our-face than ever, especially since there is no native '$' variable for STDIN, STDOUT, STDERR, or DATA.
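The "one struct with a slot per type" picture can be poked at from Perl itself. The following is only a sketch: the package variables and sub named `thing` are made up for illustration, and the `*thing{SLOT}` syntax returns a reference to whatever sits in that slot.

```perl
use strict;
use warnings;

# Four different global things sharing one name (hypothetical example name).
our $thing = 'scalar value';
our @thing = ( 1, 2, 3 );
our %thing = ( key => 'value' );
sub thing { return 'from sub' }

# Each slot of the typeglob *thing yields a reference to one of them.
my $scalar_ref = *thing{SCALAR};
my $array_ref  = *thing{ARRAY};
my $hash_ref   = *thing{HASH};
my $code_ref   = *thing{CODE};

print "${$scalar_ref}\n";
print "@{$array_ref}\n";
print "$hash_ref->{key}\n";
print $code_ref->(), "\n";

# The IO slot is where the file handle lives; STDOUT has no '$' variable,
# so the glob is the way to get at it.
my $io = *STDOUT{IO};
print {$io} "printed via the IO slot of *STDOUT\n";
```

The reference taken from the SCALAR slot is the same one that `\$thing` returns, which fits the description of the glob as a container of slots rather than a copy of the values.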
Re: Filehandles and CSV_XS by ikegami (Patriarch) on Sep 01, 2023 at 13:36 UTC
"Ah - \*DATA or \*IN - err, what does '\*' imply?"

`*NAME` refers to the symbol table entry for "NAME". It contains `$NAME` (`*NAME{SCALAR}`), `@NAME` (`*NAME{ARRAY}`), `%NAME` (`*NAME{HASH}`), `&NAME` (`*NAME{CODE}`) and a number of other things, including a file handle (`*NAME{IO}`) and a directory handle.

So when you are doing `print FH ...`, the `FH` is effectively a shorthand for `*FH`, which is effectively a shorthand for `*FH{IO}`.

Things commonly accepted as file handles:

- IO object (`*FH{IO}`).
- Reference to glob (`\*FH`).
- Glob (`*FH`).
- Name of the glob (`"FH"`). This is less supported than the others.

`open( my $fh, ... )` assigns a reference to a glob to `$fh` (like `\*FH`).
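As a rough check of the forms listed above, this sketch writes through the first three of them to the same in-memory file and then inspects what a lexical `open` produces. The names `FH`, `$buffer` and `@handles` are illustrative only. The name-string form is left out on purpose, since `use strict 'refs'` generally rejects a plain string used as a handle.

```perl
use strict;
use warnings;

# Write to an in-memory file through one bareword handle.
my $buffer = '';
open( FH, '>', \$buffer ) or die "open: $!";

my @handles = (
    *FH{IO},    # IO object
    \*FH,       # reference to glob
    *FH,        # glob
);

# All three refer to the same underlying handle.
for my $h (@handles) {
    print {$h} "line\n";
}
close(FH);

# A lexical handle is a reference to an anonymous glob.
open( my $fh, '<', \$buffer ) or die "open: $!";
my $kind  = ref($fh);
my @lines = <$fh>;
close($fh);

print "lexical handle is a $kind reference\n";
print "lines written: ", scalar(@lines), "\n";
```

Since all three values end up at the same IO slot, code that accepts "a file handle" can usually take any of them, and a lexical `$fh` is just the globref form without a named glob behind it.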