Thanks a lot for your advice. I played with it, built on it a bit, and came up with this:

#! /usr/bin/perl -w

use strict ;
use warnings ;
use diagnostics ;
use Data::Dumper ;

$|++ ;

my $sql_query = q(
/*<QUERY>*/
SELECT
    /*<FIELD>*/ ASSET_ID /*</FIELD>*/ ,
    /*<FIELD>*/ ASSET_DESC /*</FIELD>*/ ,
    /*<FIELD>*/ ASSET_COST /*</FIELD>*/
FROM
    /*<FULL_TBL>*/ SYNERGEN.SA_ASSET@/*<DATABASE>*/SGENQA/*</DATABASE>*/ /*</FULL_TBL>*/,
    /*<FULL_TBL>*/ SYNERGEN.SA_WORK_ORDER@/*<DATABASE>*/SGENTEST/*</DATABASE>*/ /*</FULL_TBL>*/
WHERE
    /*<CONDITION>*/ ASSET_ID IS NOT NULL /*</CONDITION>*/
    AND /*<CONDITION>*/ ASSET_DESC IS NOT NULL /*</CONDITION>*/
    AND /*<CONDITION>*/ ASSET_COST > 100 /*</CONDITION>*/
/*</QUERY>*/
) ;

my @type_list = qw( FIELD DATABASE CONDITION FULL_TBL QUERY ) ;

foreach my $type ( @type_list )
{
    my @list = &get_match_list( $type, $sql_query ) ;
    print Dumper( \@list ), "\n" ;
}

exit( 0 ) ;

#----- F U N C T I O N S ----------------------------------------

sub start_tag
{
    my $tag = shift ;
    my $start = "\\/\\*\\s*<$tag>\\s*\\*\\/" ;
    return $start ;
}

sub end_tag
{
    my $tag = shift ;
    my $end = "\\/\\*\\s*<\\/$tag>\\s*\\*\\/" ;
    return $end ;
}

sub get_match_list
{
    my ( $tag, $text ) = @_ ;
    my @match_list = () ;

    # create the start & end tags
    my ( $start, $end ) = ( &start_tag( $tag ), &end_tag( $tag ) ) ;

    # as long as you're finding tag pairs...
    while ( $text =~ m/$start\s*(.*?)\s*$end/gi )
    {
        my $new_match = $1 ;

        # strip out any comments, and replace spaces on either end
        # with a single space (leaving line breaks alone)
        my $spc = $new_match =~ /[ \t]+\/\*.*?\*\/|\/\*.*?\*\/[ \t]+/ ? ' ' : '' ;
        $new_match =~ s/[ \t]*\/\*(.*?)\*\/[ \t]*/$spc/g ;

        # Strip any whitespace off the ends
        $new_match =~ s/^\s*(.*?)\s*$/$1/ ;

        push( @match_list, $new_match ) ;
    }

    return @match_list ;
}

The output looks like this:

$VAR1 = [
          'ASSET_ID',
          'ASSET_DESC',
          'ASSET_COST'
        ];

$VAR1 = [
          'SGENQA',
          'SGENTEST'
        ];

$VAR1 = [
          'ASSET_ID IS NOT NULL',
          'ASSET_DESC IS NOT NULL',
          'ASSET_COST > 100'
        ];

$VAR1 = [
          'SYNERGEN.SA_ASSET@SGENQA',
          'SYNERGEN.SA_WORK_ORDER@SGENTEST'
        ];

$VAR1 = [];

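...very nearly what I was aiming for, but I can't seem to get the regex to match across multiple lines. The QUERY block is the only tag pair whose contents span more than one line, and it is the only one that comes back empty. What do I need to do to make the QUERY tags match as intended?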
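
In case it helps anyone reproduce this, here is a reduced test case. It is only a sketch: the two-line sample text and the variable names are made up for illustration, and the tag patterns are hard-coded for QUERY rather than built by start_tag() and end_tag(). It compares the match with and without the /s modifier, which perlre documents as letting "." match a newline. I'm assuming that is the relevant difference, but I haven't confirmed that it is the whole story for the full script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical multi-line sample, standing in for the QUERY block above.
my $text = "/*<QUERY>*/\nSELECT ASSET_ID\nFROM SA_ASSET\n/*</QUERY>*/";

# Tag patterns equivalent to start_tag('QUERY') and end_tag('QUERY').
my $start = qr{/\*\s*<QUERY>\s*\*/};
my $end   = qr{/\*\s*</QUERY>\s*\*/};

# Without /s, "." does not match a newline, so the lazy .*? can never
# stretch from the first line to the closing tag.
my $without_s = $text =~ m/$start\s*(.*?)\s*$end/i ? 1 : 0;

# With /s, "." also matches newlines, so the capture may span lines.
my ($captured) = $text =~ m/$start\s*(.*?)\s*$end/is;
my $with_s = defined $captured ? 1 : 0;

print "without /s: $without_s\n";
print "with /s:    $with_s\n";
print "captured:   [$captured]\n" if $with_s;
```

If that holds for the real script, the equivalent change would be adding s to the modifiers on the m/$start\s*(.*?)\s*$end/gi match in get_match_list.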


_______________
D a m n D i r t y A p e

In reply to Re: Re: In search of regex advice by DamnDirtyApe
in thread In search of regex advice by DamnDirtyApe
