Re: slurping styles
by ikegami (Patriarch) on Aug 01, 2008 at 10:55 UTC
my $raw = do {
open(my $fh, '<', $file)
or die("Could not read file: $!\n");
binmode $fh;
local $/;
<$fh>
};
Cleaned up (without do ⇒ saves memory):
my $raw;
{
open(my $fh, '<', $file)
or die("Could not read file: $!\n");
binmode $fh;
local $/;
$raw = <$fh>;
}
| [reply] [d/l] [select] |
|
|
my $raw = do {
open(my $fh, '<', $file)
or die("Could not read file: $!\n");
binmode $fh;
local $/;
\<$fh>
};
$$raw =~ ...; ## etc.
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
Would you care to explain why the do version uses 2x as much memory?
[]s, HTH, Massa (κς,πμ,πλ)
No can do. I don't know why.
Update: Actually, it's quite basic. The assignment of do's return value causes it to be copied. You'll notice the same effect from subroutines.
use Devel::Peek qw( Dump );
{
my $x = do {
my $y = 'abcdef';
Dump($y);
$y
};
Dump($x);
}
{
my $x = sub {
my $y = 'abcdef';
Dump($y);
$y
}->();
Dump($x);
}
__END__
SV = PV(0x226df4) at 0x226ce0
REFCNT = 1
FLAGS = (PADBUSY,PADMY,POK,pPOK)
PV = 0x1822b04 "abcdef"\0
CUR = 6
LEN = 8
SV = PV(0x226e24) at 0x226d34
REFCNT = 1
FLAGS = (PADBUSY,PADMY,POK,pPOK)
PV = 0x187a93c "abcdef"\0 <-- new buffer
CUR = 6
LEN = 8
SV = PV(0x226e0c) at 0x1830320
REFCNT = 1
FLAGS = (PADBUSY,PADMY,POK,pPOK)
PV = 0x182c84c "abcdef"\0
CUR = 6
LEN = 8
SV = PV(0x226e60) at 0x226d70
REFCNT = 1
FLAGS = (PADBUSY,PADMY,POK,pPOK)
PV = 0x1855fac "abcdef"\0 <-- new buffer
CUR = 6
LEN = 8
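Since subroutine return values are copied in the same way, one option is to have the sub fill the caller's scalar through a reference rather than return the contents. This is only a minimal sketch: slurp_into is a hypothetical name, not something from this thread or from any module, and the sample data goes through a temporary file so the snippet can run on its own. It avoids the copy of the return value; I have not checked whether perl still uses an intermediate buffer for the readline itself in this form.

```perl
use strict;
use warnings;
use File::Temp qw( tempfile );

# Hypothetical helper (not from the thread): fills the caller's scalar
# through a reference instead of returning the file contents.
sub slurp_into {
    my ($file, $out_ref) = @_;
    open(my $fh, '<', $file)
        or die("Could not open $file: $!\n");
    binmode $fh;
    local $/;              # undefined $/ means "read the whole file"
    $$out_ref = <$fh>;     # assigned inside the sub; nothing large is returned
    close $fh;
    return;
}

# Sample input, written to a temporary file for the demonstration.
my ($tmp_fh, $tmp_name) = tempfile( UNLINK => 1 );
binmode $tmp_fh;
print {$tmp_fh} "abcdef\n" x 3;
close $tmp_fh or die("Could not close $tmp_name: $!\n");

my $raw;
slurp_into($tmp_name, \$raw);
print length($raw), " bytes read\n";
```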
Re: slurping styles
by pjotrik (Friar) on Aug 01, 2008 at 10:35 UTC
Re: slurping styles
by jettero (Monsignor) on Aug 01, 2008 at 10:47 UTC
In the second block, you never localize *FILE, which is a dubious practice anyway now that we can use this form:
open my $fh, "<", $file or die "crap: $!";
my $raw = do { local $/; <$fh> }; # sexy
close $fh;
Personally, my favorite slurping method these days is slurp() from File::Slurp. It's up to version 9_999.x, so you know it's good.
Re: slurping styles
by moritz (Cardinal) on Aug 01, 2008 at 10:55 UTC
Re: slurping styles
by jwkrahn (Abbot) on Aug 01, 2008 at 11:45 UTC
open my $FILE, '<:raw', $file or die "could not open $file: $!";
-s $FILE == read $FILE, my $raw, -s $FILE or die "could not read $file: $!";
close $FILE;
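The one-shot read above treats a short read as an error. A more defensive variant, sketched here under my own assumptions, loops until read reports end of file and only dies when read returns undef. slurp_with_read is a hypothetical name, and the 64 KiB fallback chunk size is an arbitrary choice for files whose size is zero or unknown.

```perl
use strict;
use warnings;
use File::Temp qw( tempfile );

# Hypothetical helper (not from the thread): slurp with read() in a loop.
sub slurp_with_read {
    my ($file) = @_;
    open(my $fh, '<:raw', $file)
        or die("could not open $file: $!\n");
    my $size = (-s $fh) || 0;    # 0 when the size is zero or unknown
    my $buf  = '';
    while (1) {
        # Ask for whatever is still missing, or a 64 KiB chunk otherwise.
        my $want = $size - length $buf;
        $want = 65536 if $want <= 0;
        # The fourth argument makes read() append at the end of $buf.
        my $got = read($fh, $buf, $want, length $buf);
        die("could not read $file: $!\n") unless defined $got;
        last if $got == 0;       # end of file
    }
    close $fh;
    return $buf;
}

# Sample binary input, written to a temporary file for the demonstration.
my ($tmp_fh, $tmp_name) = tempfile( UNLINK => 1 );
binmode $tmp_fh;
my $data = join '', map { chr($_ % 256) } 0 .. 999;
print {$tmp_fh} $data;
close $tmp_fh or die("could not close $tmp_name: $!\n");

my $raw = slurp_with_read($tmp_name);
print length($raw), " bytes read\n";
```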
Re: slurping styles
by ysth (Canon) on Aug 02, 2008 at 02:37 UTC
When you say <FILE>, the readline result is stored in a special variable called a "targ" (target) associated with that particular readline call (it's more complicated if there's recursion or threading) and the assignment copies the value. The special variable holds on to its allocated space for reuse the next time the readline is executed (if ever). This is why you see twice the space allocated in the do{} case.
Two optimizations can prevent this. First, a number of operations that normally use a targ instead write directly into an arbitrary scalar when their result is assigned to that scalar by scalar assignment. Compare the assignment with an intermediary operation vs. the direct assignment:
$ perl -MO=Concise,-exec -we'my $foo = scalar <STDIN>'
1 <0> enter
2 <;> nextstate(main 1 -e:1) v
3 <#> gv[*STDIN] s
4 <1> readline[t3] sK/1
5 <0> padsv[$foo:1,2] sRM*/LVINTRO
6 <2> sassign vKS/2
7 <@> leave[1 ref] vKP/REFC
-e syntax OK
$ perl -MO=Concise,-exec -we'my $foo = <STDIN>'
1 <0> enter
2 <;> nextstate(main 1 -e:1) v
3 <0> padsv[$foo:1,2] sRM*/LVINTRO
4 <#> gv[*STDIN] s
5 <1> readline[t3] sKS/1
6 <@> leave[1 ref] vKP/REFC
-e syntax OK
(The scalar() operation itself is optimized away, but nevertheless interferes with the other optimization.) Note that the sassign operation disappears and readline takes the sv to read into as an extra argument (to which it is alerted by the extra S (STACKED) flag).
The other optimization that prevents readline from using its targ is specific to readline. When you concatenate onto a buffer, the readline and concatenation operations are joined into a single rcatline operation:
$ perl -MO=Concise,-exec -we' $foo.=<STDIN>'
Name "main::foo" used only once: possible typo at -e line 1.
1 <0> enter
2 <;> nextstate(main 1 -e:1) v
3 <#> gvsv[*foo] s
4 <#> rcatline[*STDIN] sS
5 <@> leave[1 ref] vKP/REFC
-e syntax OK
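Applied to slurping, the append form might look like the sketch below. The lexical names and the temporary sample file are my own, and I have not verified that this exact form (lexical scalar, lexical filehandle) compiles to rcatline or how much memory it saves on a given perl; checking with B::Concise as above would settle that.

```perl
use strict;
use warnings;
use File::Temp qw( tempfile );

# Sample input, written to a temporary file for the demonstration.
my ($tmp_fh, $tmp_name) = tempfile( UNLINK => 1 );
binmode $tmp_fh;
print {$tmp_fh} "line 1\nline 2\n";
close $tmp_fh or die("Could not close $tmp_name: $!\n");

open(my $fh, '<', $tmp_name)
    or die("Could not open $tmp_name: $!\n");
binmode $fh;

my $raw = '';          # start with an empty buffer to append to
{
    local $/;          # undefined $/ means "read the whole file"
    $raw .= <$fh>;     # append form, as in the $foo .= <STDIN> example
}
close $fh;

print length($raw), " bytes read\n";
```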
Re: slurping styles
by Anonymous Monk on Aug 01, 2008 at 11:46 UTC
my $raw = slurp_file( $file);
sub slurp_file {
return do {
local ( *ARGV, $/, $^I );
use open qw' IN :bytes ';
@ARGV = @_;
<>;
};
}