bv has asked for the wisdom of the Perl Monks concerning the following question:
I was messing around trying to get the offset of a string within a file with this code:
#!/usr/bin/perl
use strict;
use warnings;
open my $fh, '<', shift or die "Error: $!";
my $chunk;
my $find = 'regf';
while (my $bytes = read $fh, $chunk, 1024)
{
    unless ((my $idx = index $chunk, $find) < 0)
    {
        printf "Found %s at offset %x\n", $find, tell $fh + $idx - $bytes;
    }
}
There is a bug in my code at tell $fh + $idx, which is fixed by writing tell($fh) + $idx. I understand that part. What I don't understand is why it segfaults. What is the result of adding an integer to a filehandle?
This is perl, v5.10.0 built for i486-linux-gnu-thread-multi
Update: fixed logic to give actual offset
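For reference, here is a minimal sketch of the corrected loop with tell($fh) parenthesised. The find_offsets helper and the in-memory test data are made up for illustration and are not part of the original script. Like the original, it only reports the first match in each 1024-byte chunk and would miss a match that straddles a chunk boundary.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns the file offset of the first occurrence
# of $find in each 1024-byte chunk read from $fh.
sub find_offsets {
    my ($fh, $find) = @_;
    my @offsets;
    my $chunk;
    while (my $bytes = read $fh, $chunk, 1024) {
        my $idx = index $chunk, $find;
        next if $idx < 0;
        # The parentheses make tell() take only $fh as its argument.
        # tell($fh) is the position after the chunk, so subtract $bytes
        # to get the chunk start, then add the index within the chunk.
        push @offsets, tell($fh) + $idx - $bytes;
    }
    return @offsets;
}

# Demo on an in-memory file so the sketch runs without an input file.
my $data = ('x' x 2000) . 'regf' . ('y' x 100);
open my $fh, '<', \$data or die "Error: $!";
my @found = find_offsets($fh, 'regf');
printf "Found %s at offset %x\n", 'regf', $_ for @found;
```

With a real file, the open line would take the file name from the command line as in the original.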
print pack("A25",pack("V*",map{1919242272+$_}(34481450,-49737472,6228,0,-285028276,6979,-1380265972)))
Re: Perl segfaults: Why?
by BrowserUk (Patriarch) on Sep 15, 2009 at 16:25 UTC
What is the result of adding an integer to a filehandle?
Maybe this explains it?
open $fh, '<', 'junk.dat';;
print $fh, $fh + 1;;
GLOB(0x3d39d90) 64200081
Note: 0x3d39d90 => 64,200,080
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
Re: Perl segfaults: Why?
by cdarke (Prior) on Sep 15, 2009 at 16:33 UTC
I tried your code on 5.10.1 on Windows, 5.10.0 on Cygwin, and 5.8.8 on CentOS: all OK. I did reproduce the segfault on 5.10.0 on Ubuntu:
(gdb) bt
#0 Perl_do_tell (my_perl=0x9c01008, gv=0x0) at doio.c:1038
#1 0x080fac6a in Perl_pp_tell (my_perl=0x9c01008) at pp_sys.c:2077
#2 0x080b20f9 in Perl_runops_standard (my_perl=0x9c01008) at run.c:38
#3 0x080b0560 in perl_run (my_perl=0x9c01008) at perl.c:2391
#4 0x08063ebd in main (argc=3, argv=0xbfff4dd4, env=0xbfff4de4)
at perlmain.c:113
|
1031 Off_t
1032 Perl_do_tell(pTHX_ GV *gv)
1033 {
1034 dVAR;
1035 register IO *io = NULL;
1036 register PerlIO *fp;
1037
1038 if (gv && (io = GvIO(gv)) && (fp = IoIFP(io))) {
1039 #ifdef ULTRIX_STDIO_BOTCH
1040 if (PerlIO_eof(fp))
1041 (void)PerlIO_seek(fp, 0L, 2); /* ultrix 1.2 workaround */
1042 #endif
1043 return PerlIO_tell(fp);
1044 }
1045 if (ckWARN2(WARN_UNOPENED,WARN_CLOSED))
1046 report_evil_fh(gv, io, PL_op->op_type);
1047 SETERRNO(EBADF,RMS_IFI);
1048 return (Off_t)-1;
1049 }
I don't see how passing NULL to do_tell (as shown in your stack trace) would cause any problems. The line where the segfault occurs in your stack trace (line 1038) explicitly checks whether gv is NULL before using it. Could this be a compiler optimisation problem? I could confirm that if I saw the assembler code for that function on a machine where it crashes.
As an aside, I discovered that Perl will treat a number passed to tell as the name of a glob.
$ perl -MO=Concise -e'tell 1234'
6 <@> leave[1 ref] vKP/REFC ->(end)
1 <0> enter ->2
2 <;> nextstate(main 1 -e:1) v ->3
5 <1> tell[t2] vK/1 ->6
4 <1> rv2gv sK*/1 ->5
3 <#> gv[*1234] s ->4
-e syntax OK
$ perl -MO=Concise -e'tell *1234'
6 <@> leave[1 ref] vKP/REFC ->(end)
1 <0> enter ->2
2 <;> nextstate(main 1 -e:1) v ->3
5 <1> tell[t2] vK/1 ->6
4 <1> rv2gv sKR/1 ->5
3 <#> gv[*1234] s ->4
-e syntax OK
(And not just when it's a constant. It's just easier to see there.)
That means the following are all equivalent:
my $fh = \*STDOUT;
tell( 0+$fh );
my $fh = \*STDOUT;
no strict 'refs';
tell( *{ ''.(0+$fh) } );
# Assuming STDOUT is still located at address 0x814ec28
tell( *135588904 );
|
0000000000103a10 <Perl_do_tell>:
103a10: 48 89 5c 24 e8 mov %rbx,-0x18(%rsp)
103a15: 4c 89 64 24 f8 mov %r12,-0x8(%rsp)
103a1a: 48 89 f3 mov %rsi,%rbx
103a1d: 48 89 6c 24 f0 mov %rbp,-0x10(%rsp)
103a22: 48 83 ec 18 sub $0x18,%rsp
103a26: 80 7e 0c 09 cmpb $0x9,0xc(%rsi)
103a2a: 49 89 fc mov %rdi,%r12
103a2d: 74 41 je 103a70 <Perl_do_tell+0x60>
103a2f: 31 ed xor %ebp,%ebp
103a31: be 0b 06 00 00 mov $0x60b,%esi
103a36: 4c 89 e7 mov %r12,%rdi
103a39: e8 12 49 f3 ff callq 38350 <Perl_ckwarn@plt>
103a3e: 84 c0 test %al,%al
103a40: 75 6e jne 103ab0 <Perl_do_tell+0xa0>
103a42: e8 29 34 f3 ff callq 36e70 <__errno_location@plt>
103a47: c7 00 09 00 00 00 movl $0x9,(%rax)
103a4d: 48 c7 c0 ff ff ff ff mov $0xffffffffffffffff,%rax
103a54: 48 8b 1c 24 mov (%rsp),%rbx
103a58: 48 8b 6c 24 08 mov 0x8(%rsp),%rbp
103a5d: 4c 8b 64 24 10 mov 0x10(%rsp),%r12
103a62: 48 83 c4 18 add $0x18,%rsp
103a66: c3 retq
103a67: 66 0f 1f 84 00 00 00 nopw 0x0(%rax,%rax,1)
103a6e: 00 00
103a70: 48 8b 46 10 mov 0x10(%rsi),%rax
103a74: 48 85 c0 test %rax,%rax
103a77: 74 b6 je 103a2f <Perl_do_tell+0x1f>
103a79: 48 8b 68 08 mov 0x8(%rax),%rbp
103a7d: 48 85 ed test %rbp,%rbp
103a80: 74 af je 103a31 <Perl_do_tell+0x21>
103a82: 48 8b 45 00 mov 0x0(%rbp),%rax
103a86: 48 8b 70 30 mov 0x30(%rax),%rsi
103a8a: 48 85 f6 test %rsi,%rsi
103a8d: 74 a2 je 103a31 <Perl_do_tell+0x21>
103a8f: 48 8b 1c 24 mov (%rsp),%rbx
103a93: 48 8b 6c 24 08 mov 0x8(%rsp),%rbp
103a98: 4c 8b 64 24 10 mov 0x10(%rsp),%r12
103a9d: 48 83 c4 18 add $0x18,%rsp
103aa1: e9 ba 14 f3 ff jmpq 34f60 <Perl_PerlIO_tell@plt>
103aa6: 66 2e 0f 1f 84 00 00 nopw %cs:0x0(%rax,%rax,1)
103aad: 00 00 00
103ab0: 49 8b 44 24 08 mov 0x8(%r12),%rax
103ab5: 48 89 ea mov %rbp,%rdx
103ab8: 48 89 de mov %rbx,%rsi
103abb: 4c 89 e7 mov %r12,%rdi
103abe: 0f b7 48 20 movzwl 0x20(%rax),%ecx
103ac2: 81 e1 ff 01 00 00 and $0x1ff,%ecx
103ac8: e8 e3 3e f3 ff callq 379b0 <Perl_report_evil_fh@plt>
103acd: e9 70 ff ff ff jmpq 103a42 <Perl_do_tell+0x32>
103ad2: 66 66 66 66 66 2e 0f nopw %cs:0x0(%rax,%rax,1)
103ad9: 1f 84 00 00 00 00 00
Update: AFAICT (which might not be all that far :) since I stopped doing assembly around ten years ago), the "if (gv" check has not been optimised away:
103a70: mov 0x10(%rsi),%rax # fetch gv from stack
103a74: test %rax,%rax # gv == 0 ?
103a77: je 103a2f <Perl_do_tell+0x1f> # if so, continue with if (ckWARN2...
Re: Perl segfaults: Why?
by ikegami (Patriarch) on Sep 15, 2009 at 16:36 UTC
I suspect it's tell (or something it calls) that's faulting, not the addition. (cdarke has since shown this to be true.) This would be the result of tell receiving a number as its argument instead of a file handle.
There should be a validation check that catches this. I don't have access to 5.10.1 at this very moment to check if it's been fixed there. Could someone check?
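To see that tell really receives the sum rather than the handle, the parse can be inspected with B::Deparse and its -p option, which adds explicit parentheses. This is only a sketch: the two anonymous subs are made up for illustration and are never called, so nothing here can segfault.

```perl
use strict;
use warnings;
use B::Deparse;

# -p asks Deparse to print all the parentheses perl implied.
my $deparse = B::Deparse->new('-p');

# The expression as written in the original post: tell is a named unary
# operator, which binds less tightly than +, so the sum is its argument.
my $buggy = $deparse->coderef2text(sub { my ($fh, $idx); tell $fh + $idx });

# The corrected form: tell is applied to $fh only, then $idx is added.
my $fixed = $deparse->coderef2text(sub { my ($fh, $idx); tell($fh) + $idx });

print "As written:\n$buggy\n\nWith parentheses:\n$fixed\n";
```

The first deparse should show the addition inside the argument list of tell, the second should show it outside.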
|
Could someone check?
Seems to be:
open $fh, '<', $0;;
print scalar <$fh>;;
#! perl -slw
print tell( $fh );;
14
print tell( $fh + 1 );;
tell() on unopened filehandle at (eval 10) line 1.
-1
print $];;
5.010001
|
Segfaulting here with 5.10.0 and 5.10.1, but not with 5.8.9. All tested with BrowserUk's script.
$ cat y.pl
#!/usr/bin/perl
open $fh, '<', $0;;
print scalar <$fh>;;
print tell( $fh );;
print tell( $fh + 1 );;
$ perl y.pl # system perl: 5.10.0
#!/usr/bin/perl
Segmentation fault
$ perl5.10.0 y.pl
#!/usr/bin/perl
Segmentation fault
$ perl5.10.1 y.pl
#!/usr/bin/perl
Segmentation fault
$ perl5.8.9 y.pl
#!/usr/bin/perl
16-1
|
After digging some more, I think this is a compiler problem, not a Perl problem.
Re: Perl segfaults: Why?
by ikegami (Patriarch) on Sep 15, 2009 at 16:57 UTC
What is the result of adding an integer to a filehandle?
The addition operator will ask for the numeric representation of its arguments, so they will be coerced into numbers if they're not already. So the real question is: "What is the result of numifying a filehandle?"
Many things are considered file handles in Perl:
- IO
- References to IO
- Glob
- References to glob
- Variable names (sometimes)
In this case, you have a reference to a glob:
$ perl -MScalar::Util=reftype -le'open my $fh, "<", \$buf; print reftype($fh)'
GLOB
When you numify a reference, you get the memory address at which the referenced object is located.
$ perl -le'open my $fh, "<", \$buf; print $fh; printf "0x%x\n", 0+$fh'
GLOB(0x814ec28)
0x814ec28
So you are adding an integer to the memory address of the glob associated with the file handle, and the result is a plain number.
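The same thing as a small script, as a rough sketch. The variable names are made up for illustration, and refaddr from Scalar::Util is used only to cross-check the numified value.

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr reftype);

# In-memory file, as in the one-liners above.
my $buf = '';
open my $fh, '<', \$buf or die "Error: $!";

my $addr = 0 + $fh;   # numifying the reference yields the glob's address
my $sum  = $fh + 1;   # same coercion, then ordinary integer addition

printf "%s\n0x%x\n%s\n", $fh, $addr, $sum;
print "reftype: ", reftype($fh), "\n";
print "sum is ", (ref $sum ? "still a reference" : "a plain number"), "\n";
```

The addresses printed will differ from run to run. The point is only that $sum is no longer a reference, so anything expecting a handle gets a bare integer.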