vladb has asked for the wisdom of the Perl Monks concerning the following question:

I have a small script which takes some html file as its input and strips out certain chunks of the html using a search and replace regexp prior to spitting it out.

Here's a largely reduced and simplified version of the script (cut down to isolate the actual buggy spot):
    use strict;

    open(IN,"page.html");
    $/ = undef;
    my $content = <IN>;
    close(IN);

    print "size: ". length($content) ."\n";

    # either of the two regexp cause seg fault!
    # $content =~ s/^Set-Cookie: .*$//mgi; # CORE DUMP

    # Also trying to remove the /g option here.
    # Still core dumps.
    # $content =~ s/^Set-Cookie: .*$//mi; # CORE DUMP

    # these work fine though...
    $content =~ s/^Cache-Control: .*$//mi; # OK

    print "Done.\n";
Try uncommenting either of the two lines marked # CORE DUMP and the script will core dump.
----------------
size: 67877
Segmentation fault (core dumped)
I'm also puzzled why the second CORE DUMP line still fails with the /g regexp option removed. Comparing it with the OK line just below it (which also has no /g option), I can't see any significant difference.

This script would only core dump on a server where I have perl 5.6.0 installed. The same script works just fine with perl 5.6.1.

Aware of any difference?

Thanks for any help.

Update: I just played with the page.html file: after adding a 'Set-Cookie: foobar' line, the script didn't core dump! So it only crashes when there is no 'Set-Cookie: .*' line in the source file. (Note: there was already a 'Cache-Control: .*' line there. I didn't try removing it to see whether the script would then core dump.)
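
Given the update, one workaround worth trying on 5.6.0 is to skip the substitution entirely when the header is absent, testing for it with index() rather than with a regexp. This is only a minimal sketch: whether it actually avoids the 5.6.0 crash is untested, and strip_header is a hypothetical helper name, not something from the original script.

```perl
use strict;

# Hypothetical helper: blank out every "Name: value" line in $text,
# but only run the substitution if the header name occurs at all.
sub strip_header {
    my ($text, $name) = @_;

    # Plain substring test (case-insensitive via lc), no regexp involved.
    return $text if index(lc $text, lc("$name:")) < 0;

    # Without /s, [^\n]* matches the same text as .* and is just more explicit.
    $text =~ s/^\Q$name\E: [^\n]*$//mgi;
    return $text;
}

my $page = "Cache-Control: no-cache\nSet-Cookie: foobar\n<html></html>\n";
print strip_header($page, 'Set-Cookie');
```

Like the original s///, this leaves an empty line where the header used to be, because the newline itself is not part of the match.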

"There is no system but GNU, and Linux is one of its kernels." -- Confession of Faith

Re: Mystereous regexp core dump
by gellyfish (Monsignor) on Mar 25, 2002 at 22:20 UTC

    It could be this:

    [  7076] By: jhi                                   on 2000/09/14  14:42:56
            Log: Fix for a parsing bug, not for the original bug.
                 Subject: Re: [ID 20000910.005] Another segfault with regexes.
                 From: Spider Boardman <spider@web.zk3.dec.com>
                 Date: Wed, 13 Sep 2000 17:52:20 -0400
                 Message-Id: <200009132152.RAA24029@leggy.zk3.dec.com>
         Branch: perl
               ! regcomp.c
    But there are loads of changes in reg*.c between 5.6.0 and 5.6.1 ....

    /J\

Re: Mystereous regexp core dump
by talexb (Chancellor) on Mar 25, 2002 at 22:11 UTC
    My guess is that the dot star (.*) in your regexp is eating large chunks of memory. If you're trying to delete the line with the Set-Cookie header, create a regexp that does just that.

    I'm not sure if there's a module to parse an HTTP header but if there is, I'd suggest using that instead.
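
One module-free way to act on this suggestion is to filter the text line by line, so that each match is tried against a single short line instead of the whole slurped file. The sketch below is an assumption about what such a filter could look like, not something tested against the 5.6.0 bug; drop_header_lines is a hypothetical name, and the lines come from split only to keep the example self-contained (reading the file with a while loop would do the same job).

```perl
use strict;

# Hypothetical helper: return $text without lines that start with "$name: ".
sub drop_header_lines {
    my ($text, $name) = @_;

    # Split after each newline so every piece keeps its own line ending.
    my @lines = split /(?<=\n)/, $text;

    # Keep only the lines that do not begin with the header name.
    return join '', grep { !/^\Q$name\E: /i } @lines;
}

print drop_header_lines(
    "Set-Cookie: a=b\nCache-Control: no-cache\n<html>\n",
    'Set-Cookie',
);
```

Unlike the s/// in the original script, this removes the whole line, newline included, rather than leaving an empty line behind.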

    --t. alex

    "Here's the chocolates, and here's the flowers. Now how 'bout it, widder hen, will ya marry me?" --Foghorn Leghorn

Re: Mystereous regexp core dump
by IlyaM (Parson) on Mar 25, 2002 at 22:30 UTC
    This script would only core dump on a server where I have perl 5.6.0 installed. The same script works just fine with perl 5.6.1.

    Aware of any difference?

    5.6.0 is very buggy. Either upgrade to 5.6.1 or downgrade to 5.005.
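
Until the upgrade happens, a script can at least notice that it is running on the affected interpreter. The special variable $] holds the version as a number (5.006 for 5.6.0, 5.006001 for 5.6.1). A minimal sketch, with is_perl_560 as a hypothetical helper name and the warning text purely illustrative:

```perl
use strict;

# Hypothetical helper: true when the given $] value denotes perl 5.6.0.
# Comparing the formatted string avoids floating-point equality surprises.
sub is_perl_560 {
    my ($version) = @_;
    return sprintf('%.6f', $version) eq '5.006000';
}

warn "Running on perl 5.6.0; regexp substitutions may segfault\n"
    if is_perl_560($]);
```

To refuse to run at all on anything older than 5.6.1, putting `require 5.006_001;` at the top of the script is the shorter option.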

    --
    Ilya Martynov (http://martynov.org/)