Well, that certainly is a lot of detail -- most of which doesn't really shed light on the basic problem. We do at least get to see that you really are pulling a regex out of a database, and applying it to the full content of a data file. I presume the regex at the top of the thread indicates what comes from the database, but you haven't shown us what is in the file, or how you built the regex that went into the database. That might matter.

Splitting it into lines is a major design change

I assume this is because of how the regexes are being loaded into the database. So what is the point of trying to store and use monster regexes this way? Is that really necessary? There is so much redundant stuff in that big regex. If you really need the match to extend over the entire data file (which I doubt), it would make more sense to construct most of the regex on the fly in the perl script, rather than storing it all in a table (numerous times, presumably, with minor, systematic variations). Maybe a different design would actually be better.
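To make "construct the regex on the fly" concrete, here is a minimal sketch. The field names, their patterns, and the pipe-delimited sample line are all invented for illustration, since the actual data file and stored regex were not shown in the thread. The idea is that only a short list of field names would need to be stored per rule, and the script assembles the full pattern from small qr// pieces:

```perl
use strict;
use warnings;

# Hypothetical building blocks: these names and patterns are assumptions,
# not taken from the original regex or data file.
my %field_pattern = (
    id    => qr/\d+/,
    name  => qr/[A-Za-z ]+/,
    price => qr/\d+\.\d{2}/,
);

# Assemble one anchored, pipe-delimited pattern with a capture group
# per requested field.
sub build_line_regex {
    my @fields = @_;
    my $body = join '\|', map { "($field_pattern{$_})" } @fields;
    return qr/^$body\z/;
}

my $re       = build_line_regex(qw( id name price ));
my @captures = '42|Blue widget|19.99' =~ $re;

print join( ', ', @captures ), "\n";
```

With something like this, a systematic variation between rules becomes a different list of field names rather than another copy of a huge pattern in the table.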

As for simplifying the code, you still have further to go on that. Preparing the DBI statement handles in advance and using placeholders would be a good start. Consider an idiom like this, especially for the queries that you are using over and over again:

my @map_cols = qw( a.mapping a.path_prefix a.parent_mapping
                   a.parent_element a.rule a.data_nature
                   a.table_suffix b.element b.column_name
                   b.priority_local b.priority_global );

my $map_sql = 'SELECT ' . join( ',', @map_cols ) .
  ' FROM cor_ekl_map a, cor_ekl_map_dfn b
  WHERE a.mapping = ? and a.mapping = b.mapping';

my $map_sth = $dbh->prepare( $map_sql );

my @proc_cols = qw(a.process_id a.process_ts a.process_stage
                   b.ekl_set b.mapping);

my $proc_sql = 'SELECT ' . join( ',', @proc_cols ) .
  ' FROM sys_ekl_ipt_001 a, cor_ekl_set_dfn b
  WHERE a.ekl_set = b.ekl_set';

my $proc_sth = $dbh->prepare( $proc_sql );

# don't need table prefixes now
s/^[ab]\.// for ( @map_cols, @proc_cols );

my $regex_sth = $dbh->prepare( 'SELECT regex FROM cor_ekl_rul WHERE rule = ?' );
my $cdata_sth = $dbh->prepare( 'SELECT column_name, column_type
  FROM cor_dat_col WHERE data_nature = ? AND table_suffix = ?' );

$proc_sth->execute;

while ( my $proc_row = $proc_sth->fetchrow_arrayref ) 
{
    my %procdata;
    @procdata{@proc_cols} = @$proc_row;

    $map_sth->execute( $procdata{"mapping"} );

    while ( my $map_row = $map_sth->fetchrow_arrayref ) 
    {
        my %mapdata;
        @mapdata{@map_cols} = @$map_row;

        my $table_name = join( '_', 'dat',
                               $mapdata{data_nature},
                               $mapdata{table_suffix} );

        $regex_sth->execute( $mapdata{rule} );
        my ( $regex ) = $regex_sth->fetchrow_array;

        # and so on...
    }
}
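
In case the hash-slice step in that loop is unfamiliar, here is a small stand-alone sketch of just that part, with no database involved. The rows are made-up stand-ins for what fetchrow_arrayref would return; only the column names are borrowed from the query above:

```perl
use strict;
use warnings;

# Column list as it would appear in the SELECT, table prefixes included.
my @select_cols = qw( a.process_id a.process_stage b.mapping );

# Copy the list and strip the prefixes so the names can serve as hash keys.
my @keys = @select_cols;
s/^[ab]\.// for @keys;

# Made-up rows standing in for fetchrow_arrayref results.
my @rows = (
    [ 101, 'loaded', 'map_x' ],
    [ 102, 'parsed', 'map_y' ],
);

my @records;
for my $row (@rows) {
    my %rec;
    @rec{@keys} = @$row;    # hash slice: pair each key with its column value
    push @records, \%rec;
    print "$rec{process_id} uses mapping $rec{mapping}\n";
}
```

Because the SELECT list and the hash keys come from the same array, adding or reordering columns only has to be done in one place.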

I think that going in this direction will make your code a lot shorter, simpler, and easier to maintain. Boiling down the regexes to just the stuff that matters will help too. I suspect that you don't really need to store regexes in the database at all.

In reply to Re^5: regex causing segmentation fault (core dump) by graff
in thread regex causing segmentation fault (core dump) by Otogi
