I've never been that good at regular expressions. What I want to do is parse a large number of log entries and, ultimately, extract words from the SQL-like expressions they contain.

For example:

$line = "05/04/2010 13:09:45 - A - somebody - ( ( my.my id >= 1 ) ) and ( ( is-relative.to code = 'sister' ) or ( is-relative.to code = 'brother' ) or ( is-mother.to code = 'dog' ) )";

What I ultimately need out of these strings is:

my.my id 
is-relative.to code 
is-relative.to code 
is-mother.to code

but something like either of these would be great!

( my.my id >= 1 )
( is-relative.to code = 'sister' )
( is-relative.to code = 'brother' )
( is-mother.to code = 'dog' )

my.my id >= 1
is-relative.to code = 'sister'
is-relative.to code = 'brother'
is-mother.to code = 'dog'

I have been looking for a while for hints toward an elegant solution to this problem. There is plenty of discussion about using Text::Balanced, but the documentation doesn't have enough examples for my little brain to solve the riddle.

I have an example here that just pulls out the expressions; I know what to do from there. I would like some ideas or code examples for a more elegant solution using one of the CPAN modules, if that is possible.

What it basically does is:

Split the text at the first close parens
Parse the expression out of this "before" text

Remove everything up to and including the last open paren
Remove any beginning or trailing spaces

Split the "after" text this time, and repeat the above operations

Here is a snippet of code that pulls out the expressions:

$text = "05/04/2010 13:09:45 - A - somebody - ( ( my.my id >= 1 ) ) and ( ( is-relative.to code = 'sister' ) or ( is-relative.to code = 'brother' ) or ( is-mother.to code = 'dog' ) )";

my $new = $text;
my @list;
while ( 1 ) {
   my $ind = index($new, ')');
   last if ( $ind < 0 );    # no close paren left, nothing more to extract

   # Split the text at the first close paren
   my $before = substr($new,0,$ind);
   my $after = substr($new,$ind);
   last if ( $before eq "" );

   # Clean up the before string
   #  Remove everything up to and including the last open paren
   #  Remove any beginning or trailing spaces
   $before = substr($before,rindex($before,'(')+1);
   $before =~ s/^\s+//;
   $before =~ s/\s+$//;
   push(@list,$before);

   if ( $after =~ /\)/ ) {
      # Discard chars up to and including the first open paren
      $after = substr($after,index($after,'(')+1);
      $new = $after;
   } else {
      last;
   }
}

# Print each extracted expression, one per line
foreach my $i (@list) {
   print "--".$i."--\n";
}
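
For comparison, the same extraction can probably be done with a single global regex match instead of the index/substr loop. This is only a rough sketch, and it assumes that the innermost expressions never contain parentheses themselves (for example inside a quoted value). The helper names leaf_expressions and field_name are made up for illustration, and the operator list in field_name is a guess at what the logs contain.

```perl
use strict;
use warnings;

# Hypothetical helper: return every innermost parenthesised expression,
# with surrounding whitespace trimmed.
sub leaf_expressions {
    my ($text) = @_;
    # [^()]+? cannot cross a paren, so only innermost groups can match.
    return $text =~ /\(\s*([^()]+?)\s*\)/g;
}

# Hypothetical helper: return the part of an expression to the left of its
# comparison operator (the operator list is an assumption).
sub field_name {
    my ($expr) = @_;
    my ($name) = $expr =~ /^(.*?)\s*(?:>=|<=|<>|!=|=|<|>)/;
    return $name;
}

my $line = "05/04/2010 13:09:45 - A - somebody - ( ( my.my id >= 1 ) ) and ( ( is-relative.to code = 'sister' ) or ( is-relative.to code = 'brother' ) or ( is-mother.to code = 'dog' ) )";

for my $expr ( leaf_expressions($line) ) {
    my $name = field_name($expr);
    print "--$expr-- field: ", ( defined $name ? $name : '(none)' ), "\n";
}
```

This does not handle parentheses inside quoted strings, which is where something like Text::Balanced might still be the better tool.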

In reply to Elegant examples to parse parenthesised strings by back-n-black
