ambrus has asked for the wisdom of the Perl Monks concerning the following question:

Dear fellow monks,

I get a Segmentation Fault (SIGSEGV) from perl when running the below snippet.

I see nothing in this code that should cause that. Do you have any explanation?

perl -we '$_ = ""; $/ = \4096; while (defined($_.=<>)) { /([\x20-\x7e]*\z)/ and $_ = $1; }' /dev/urandom

As its input is random, sometimes I have to wait a while to get the segv, and sometimes I get it very soon. This is a stripped-down example, so it does not do anything useful. When given a finite file instead of /dev/urandom, it might get into an infinite busy-loop (which is the expected behaviour) or it might segfault.

I am running perl v5.8.5 built for i686-linux, but I can get the same error from v5.8.2 too.

Update: removed a line break from the command. Probably does not matter at all, but you can never know in such cases.

Update: please report whether you can reproduce the segfault at all, even if you have no idea why it happens. Maybe it's just a fault on my side.
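
For readers who find the one-liner hard to parse, here is a rough unrolled sketch of what it appears to do. This is not ambrus's code: the names printable_tail and scan_records are hypothetical, the tail is copied into a lexical before the buffer is overwritten, and (unlike the one-liner, which busy-loops on a finite file) this version stops at end of file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the run of printable ASCII (0x20-0x7e) at the very end of $buf.
# The pattern can match the empty string, so it always succeeds.
sub printable_tail {
    my ($buf) = @_;
    my ($tail) = $buf =~ /([\x20-\x7e]*)\z/;
    return $tail;
}

# Read $fh in fixed-size records, carrying the printable tail of each
# record over into the next one, as the one-liner does with $_.
# Assumption: stopping at EOF is acceptable for experimentation.
sub scan_records {
    my ($fh, $record_size) = @_;
    local $/ = \$record_size;          # fixed-length record reads
    my ($buf, $records) = ('', 0);
    while (defined(my $chunk = <$fh>)) {
        $buf .= $chunk;
        $buf = printable_tail($buf);   # keep only the trailing printable run
        $records++;
    }
    return ($records, $buf);
}

# Usage sketch: perl scan.pl some_finite_file
if (!caller and @ARGV) {
    open my $fh, '<:raw', $ARGV[0] or die "open $ARGV[0]: $!";
    my ($n, $tail) = scan_records($fh, 4096);
    print "$n records read, final tail is ", length($tail), " bytes\n";
}
```

If this version runs cleanly on a dataset where the one-liner crashes, that would point at the in-place `$_ = $1` assignment rather than at the record reads.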

Replies are listed 'Best First'.
Re: Segmentation fault when doing /match/ and $_=$1
by diotalevi (Canon) on Sep 28, 2004 at 19:15 UTC

    perl -Dt's failure shows that the last operation to run is the .=. So either the concatenation or the defined() check is killing this. More info may follow.

    (../seg:10) unstack
    (../seg:10) gvsv(main::_)
    (../seg:10) rcatline
    (../seg:10) defined
    (../seg:10) and
    (../seg:10) nextstate
    (../seg:10) match
    (../seg:10) and
    (../seg:10) enter
    (../seg:10) nextstate
    (../seg:12) gvsv(main::1)
    (../seg:12) gvsv(main::_)
    (../seg:12) sassign
    (../seg:12) leave
    (../seg:10) unstack
    (../seg:10) gvsv(main::_)
    (../seg:10) rcatline
    (../seg:10) defined
    (../seg:10) and
    (../seg:10) nextstate
    (../seg:10) match
    (../seg:10) and
    (../seg:10) enter
    (../seg:10) nextstate
    (../seg:12) gvsv(main::1)
    (../seg:12) gvsv(main::_)
    (../seg:12) sassign
    (../seg:12) leave
    (../seg:10) unstack
    (../seg:10) gvsv(main::_)
    (../seg:10) rcatline
    (../seg:10) defined
    (../seg:10) and
    (../seg:10) nextstate
    (../seg:10) match
    (../seg:10) and
    (../seg:10) enter
    (../seg:10) nextstate
    (../seg:12) gvsv(main::1)
    (../seg:12) gvsv(main::_)
    (../seg:12) sassign
    (../seg:12) leave
    (../seg:10) unstack
    (../seg:10) gvsv(main::_)
    (../seg:10) rcatline
    (../seg:10) defined
    (../seg:10) and
    (../seg:10) nextstate
    (../seg:10) match
    (../seg:10) and
    (../seg:10) enter
    (../seg:10) nextstate
    (../seg:12) gvsv(main::1)
    (../seg:12) gvsv(main::_)
    (../seg:12) sassign
    (../seg:12) leave
    (../seg:10) unstack
    (../seg:10) gvsv(main::_)
    (../seg:10) rcatline
    Segmentation fault

    perl -Ds's failure.

    =>
    =>
    =>  SV_YES
    =>
    =>
    =>
    =>  PVMG("k|"\0)
    =>  PVMG("k|"\0)  PVMG("k|lf\254\4\335\303\306^NQ\365\242.\271Q\3766y\364\263\260Iy\202\333\307d\233|\372\12^\202x\232\236\37\14\205\311N\13\247\313\265"...\0)
    =>  PVMG(""\0)
    =>
    =>
    =>  PVMG(""\0)
    =>  PVMG("\277W\346`\307I\246\35\346]d\223\17\20\335ac7\233\225\351\262\377\250|\225\351\343\212l^Du\211q\235D+M\350M\320%\341\3263\311\334"...\0)
    =>  SV_YES
    =>
    =>
    =>  SV_YES
    =>
    =>
    =>
    =>  PVMG(""\0)
    =>  PVMG(""\0)  PVMG("\277W\346`\307I\246\35\346]d\223\17\20\335ac7\233\225\351\262\377\250|\225\351\343\212l^Du\211q\235D+M\350M\320%\341\3263\311\334"...\0)
    =>  PVMG(""\0)
    =>
    =>
    =>  PVMG(""\0)
    =>  PVMG("\274v,D\332\313\330\21C\31\232\206?\247[\220 Q\2lyU }\330\346\6m\241\270\365%\272\242\213\14\1\\\226\264d\205fN\315A\374880]]|"...\0)
    =>  SV_YES
    =>
    =>
    =>  SV_YES
    =>
    =>
    =>
    =>  PVMG(""\0)
    =>  PVMG(""\0)  PVMG("\274v,D\332\313\330\21C\31\232\206?\247[\220 Q\2lyU }\330\346\6m\241\270\365%\272\242\213\14\1\\\226\264d\205fN\315A\374880]]|"...\0)
    =>  PVMG(""\0)
    =>
    =>
    =>  PVMG(""\0)
    =>  PVMG("\25^\277\11UJ[\221j\353@n\230\341\237\312&y\244\247p\276K;\307\215\236\214`\207i\353~\347\0142\302[\315K\251\6'\255\1o\374\4\376"...\0)
    =>  SV_YES
    =>
    =>
    =>  SV_YES
    =>
    =>
    =>
    =>  PVMG(""\0)
    =>  PVMG(""\0)  PVMG("\25^\277\11UJ[\221j\353@n\230\341\237\312&y\244\247p\276K;\307\215\236\214`\207i\353~\347\0142\302[\315K\251\6'\255\1o\374\4\376"...\0)
    =>  PVMG("+?)~CH"\0)
    =>
    =>
    =>  PVMG("+?)~CH"\0)
    =>  PVMG("+?)~CH\33\222 \306\22\260\240t\365\16\320\315/ \266\333\213\372\327\375\325et\366\314\0332[\6\7\225\34k\201x\312\324\11p\30\346"...\0)
    =>  SV_YES
    =>
    =>
    Segmentation fault

Re: Segmentation fault when doing /match/ and $_=$1
by graff (Chancellor) on Sep 29, 2004 at 01:01 UTC

    I'm running the test with 5.8.1 on darwin (macosx 10.3.5), and it can chug away happily (or maybe with intense boredom) for as long as I let it run. I added "$|=1" and a counter that prints out every 1K iterations -- I decided to stop watching after 64K cycles through the while loop... (should I have waited longer?)

    (update: tried it with 5.8.5 on a freebsd5.3-BETA/intel -- again, no seg fault after 64K iterations, but it ran noticeably faster than 5.8.1 on the mac.)
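
The instrumentation graff describes (unbuffered output plus a counter reported every 1K iterations) could look roughly like the sketch below. The helper name make_progress_counter is hypothetical and graff's actual code was not posted, so treat this as one possible way to do it.

```perl
use strict;
use warnings;

# Build a callback that counts calls and reports every $every-th one.
# make_progress_counter is a hypothetical helper, not from the thread.
sub make_progress_counter {
    my ($every, $out) = @_;
    my $count = 0;
    return sub {
        $count++;
        print {$out} "$count iterations\n" if $count % $every == 0;
        return $count;
    };
}

$| = 1;    # unbuffered STDOUT, so the last report is visible before a crash
my $tick = make_progress_counter(1024, \*STDOUT);

# Inside the loop under test, call $tick->() once per iteration, e.g.
#   while (defined($_ .= <>)) { /([\x20-\x7e]*\z)/ and $_ = $1; $tick->() }
```

Without `$| = 1`, buffered output can be lost when the process dies on SIGSEGV, which would make the last reported count misleading.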

Re: Segmentation fault when doing /match/ and $_=$1
by hv (Prior) on Sep 29, 2004 at 12:04 UTC

    I was able to reproduce the coredump with various 5.8.x perls and with the latest bleadperl. I suspect there may be a bug in the optimisation that stores $1 and family only as a pair of pointers into the target string, but within $_ = $1; I'd expect the substring always to be extracted before the superstring is released so I'm not sure where the problem is likely to lie.

    To reduce this to something perlbuggable, I'd suggest copying a few KB from /dev/urandom into a file to get something reproducible - probably do this a few times until you get a dataset that cores within the first 2 or 3 iterations - then start playing with the bytes in the file to simplify the description of the problem. For example, you may be able to replace each byte in the dataset with 'x' or 'y', such that the 'y's occur only where the regexp matches. Once the construction of the dataset can be reduced to a one-liner, debugging the problem should be straightforward.

    Hugo
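
A minimal sketch of the reduction Hugo suggests: capture a fixed sample from /dev/urandom, then rewrite it so that only 'x' and 'y' remain. The function names, the output file name, and the 8192-byte sample size are assumptions for illustration. Note also that this maps every printable byte to 'y', not only those in a matching tail, and that the test pattern would then need to become /(y*\z)/, since both 'x' and 'y' are printable themselves.

```perl
use strict;
use warnings;

# Copy the first $bytes bytes of $source into $dest and return them.
sub capture_sample {
    my ($source, $dest, $bytes) = @_;
    open my $in, '<:raw', $source or die "open $source: $!";
    my $got = read($in, my $data, $bytes);
    die "read $source: $!" unless defined $got;
    close $in;
    open my $out, '>:raw', $dest or die "open $dest: $!";
    print {$out} $data;
    close $out or die "close $dest: $!";
    return $data;
}

# Map printable ASCII bytes (0x20-0x7e) to 'y' and all other bytes to 'x'.
# Length is preserved, so record boundaries stay where they were.
sub simplify_bytes {
    my ($data) = @_;
    return join '', map { /[\x20-\x7e]/ ? 'y' : 'x' } split //, $data;
}

# Usage sketch: perl reduce.pl sample.bin   (file name is an assumption)
if (!caller and @ARGV) {
    my $raw = capture_sample('/dev/urandom', $ARGV[0], 8192);
    open my $out, '>:raw', "$ARGV[0].xy" or die "open $ARGV[0].xy: $!";
    print {$out} simplify_bytes($raw);
    close $out or die "close $ARGV[0].xy: $!";
}
```

Whether the simplified file still triggers the crash is exactly what the experiment would show; if it does not, the byte values themselves matter and a gentler reduction is needed.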

Re: Segmentation fault when doing /match/ and $_=$1
by hossman (Prior) on Sep 28, 2004 at 20:26 UTC

    reproduced segfault on...

    chrish@asimov:~$ uname -a
    Linux asimov 2.6.6.hoss1 #1 SMP Tue Jul 6 16:31:01 PDT 2004 i686 GNU/Linux

Re: Segmentation fault when doing /match/ and $_=$1
by BrowserUk (Patriarch) on Sep 28, 2004 at 17:04 UTC

    Seems likely you are blowing a stack in the regex engine somewhere?


    Examine what is said, not who speaks.
    "Efficiency is intelligent laziness." -David Dunham
    "Think for yourself!" - Abigail
    "Memory, processor, disk in that order on the hardware side. Algorithm, algorithm, algorithm on the code side." - tachyon