
Thanks a lot for your advice. I played with it, built on it a bit, and came up with this:

#! /usr/bin/perl -w

use strict ;
use warnings ;
use diagnostics ;
use Data::Dumper ;
$|++ ;

my $sql_query = q(
  /*<QUERY>*/
    SELECT
      /*<FIELD>*/ ASSET_ID /*</FIELD>*/ ,
      /*<FIELD>*/ ASSET_DESC /*</FIELD>*/ ,
      /*<FIELD>*/ ASSET_COST /*</FIELD>*/
    FROM
      /*<FULL_TBL>*/
        SYNERGEN.SA_ASSET@/*<DATABASE>*/SGENQA/*</DATABASE>*/
      /*</FULL_TBL>*/,
      /*<FULL_TBL>*/
        SYNERGEN.SA_WORK_ORDER@/*<DATABASE>*/SGENTEST/*</DATABASE>*/
      /*</FULL_TBL>*/
    WHERE
      /*<CONDITION>*/ ASSET_ID IS NOT NULL /*</CONDITION>*/ AND
      /*<CONDITION>*/ ASSET_DESC IS NOT NULL /*</CONDITION>*/ AND
      /*<CONDITION>*/ ASSET_COST > 100 /*</CONDITION>*/
  /*</QUERY>*/
) ;


my @type_list = qw( FIELD DATABASE CONDITION FULL_TBL QUERY ) ;

foreach my $type ( @type_list )
{
    my @list = &get_match_list( $type, $sql_query ) ;
    print Dumper( \@list ), "\n" ;
}

exit( 0 ) ;

#----- F U N C T I O N S ----------------------------------------


sub start_tag
{
    my $tag = shift ;
    my $start = "\\/\\*\\s*<$tag>\\s*\\*\\/" ;
    return $start ;
}

sub end_tag
{
    my $tag = shift ;
    my $end = "\\/\\*\\s*<\\/$tag>\\s*\\*\\/" ;
    return $end ;
}

sub get_match_list
{
    my ( $tag, $text ) = @_ ;
    my @match_list = () ;

    # create the start & end tags
    my ( $start, $end ) = ( &start_tag( $tag ), &end_tag( $tag ) ) ;

    # as long as you're finding tag pairs...
    while ( $text =~ m/$start\s*(.*?)\s*$end/gi )
    {
        my $new_match = $1 ;

        # strip out any comments, and replace spaces on either end
        # with a single space (leaving line breaks alone)
        my $spc = $new_match =~ /[ \t]+\/\*.*?\*\/|\/\*.*?\*\/[ \t]+/
            ? ' '
            : '' ;
        $new_match =~ s/[ \t]*\/\*(.*?)\*\/[ \t]*/$spc/g ;

        # Strip any whitespace off the ends
        $new_match =~ s/^\s*(.*?)\s*$/$1/ ;

        push( @match_list, $new_match ) ;
    }
    return @match_list ;
}

The output looks like this:

$VAR1 = [
          'ASSET_ID',
          'ASSET_DESC',
          'ASSET_COST'
        ];

$VAR1 = [
          'SGENQA',
          'SGENTEST'
        ];

$VAR1 = [
          'ASSET_ID IS NOT NULL',
          'ASSET_DESC IS NOT NULL',
          'ASSET_COST > 100'
        ];

$VAR1 = [
          'SYNERGEN.SA_ASSET@SGENQA',
          'SYNERGEN.SA_WORK_ORDER@SGENTEST'
        ];

$VAR1 = [];

...very nearly what I was aiming for, but I can't get the regex to match when the tagged text spans several lines, so the QUERY list comes back empty. What do I need to do to make the QUERY tags match as intended?
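One likely explanation, offered as a sketch rather than a tested fix for the full script: by default `.` in a Perl regex does not match a newline, so `(.*?)` can never span the lines between the QUERY tags. FULL_TBL still works because the surrounding `\s*` absorbs the line breaks and the text in between sits on one line. Adding the `/s` modifier lets `.` match newlines. The sample text and the variable names `@without_s`, `@with_s`, and `$trimmed` below are made up for illustration; only the pattern shape comes from the code above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical multi-line sample, standing in for the full query above.
my $text = "/*<QUERY>*/\n  SELECT 1\n  FROM DUAL\n/*</QUERY>*/";

# Tag patterns equivalent to start_tag('QUERY') / end_tag('QUERY'),
# precompiled with qr// (an assumption; the original builds strings).
my $start = qr{/\*\s*<QUERY>\s*\*/}i;
my $end   = qr{/\*\s*</QUERY>\s*\*/}i;

# Without /s, "." stops at the first newline, so a body that spans
# several lines never matches.
my @without_s = $text =~ m/$start\s*(.*?)\s*$end/gi;

# With /s, "." also matches newlines, so the body can span lines.
my @with_s = $text =~ m/$start\s*(.*?)\s*$end/gis;

# The final trim in get_match_list has the same limitation: without /s
# it leaves multi-line text untouched. With /s it trims both ends.
( my $trimmed = "  $with_s[0]  \n" ) =~ s/^\s*(.*?)\s*$/$1/s;

print scalar(@without_s), " match(es) without /s\n";
print scalar(@with_s),    " match(es) with /s\n";
print "[$trimmed]\n";
```

If that holds, the change to the original would be `m/$start\s*(.*?)\s*$end/gis` in the `while` condition, plus `/s` on the two comment-stripping patterns and the trailing trim if comments or matches can themselves span lines.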

_______________
D a m n D i r t y A p e

In reply to Re: Re: In search of regex advice by DamnDirtyApe
in thread In search of regex advice by DamnDirtyApe
