need help in extracting lines

stanleysj has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks

I need help in creating a regex to extract the following lines from a text file.

Table: GO Terms
[entity}

Table: Aliases
[Alias]    
MAL1P1.18
pfmalp012
alap11

Table: Y2H Interactions

What I need are the lines between "[Alias]" and "Table: Y2H Interactions". I tried to use the range operator, but for some reason it did not work. Maybe I got the syntax wrong. Sometimes there are no entries in between, but if there are aliases then I need to pick them up.

I tried code similar to the following:

 if /\[Alias\]/../^\s*\s$/ {
push (@alias, $_);
}

Replies are listed 'Best First'.
Re: need help in extracting lines by Corion (Patriarch) on Jan 13, 2009 at 13:18 UTC
If Perl thinks there is something wrong with the syntax, Perl tells you so. What did Perl tell you and what did you do about it? Maybe you want to just run your code using the diagnostics pragma? I used `perl -Mdiagnostics -le "if /\[Alias\]/../^\s*\s$/ { push (@alias, $_); }"` and found the message pretty to the point.
Re^2: need help in extracting lines by AnomalousMonk (Archbishop) on Jan 13, 2009 at 16:30 UTC
Another possible conceptual problem with stanleysj's approach is that the regex `/^\s*\s$/` is looking for a line consisting only of one or more whitespace characters, i.e., it is equivalent to `/^\s+$/`. This may or may not be what the OP really wants to terminate the text block with.
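For what it's worth, here is a minimal sketch of the range-operator approach with the `if` condition wrapped in parentheses and `/^\s+$/` as the terminator. The lines in `@input` are made up to mirror the data in the question, and skipping the two boundary lines with `next` is an assumption about what is wanted.

```perl
use strict;
use warnings;

# Hypothetical sample input mirroring the data in the question;
# each line keeps its trailing newline, as it would when read from a file.
my @input = map { "$_\n" } (
    'Table: GO Terms', '[entity]', '',
    'Table: Aliases', '[Alias]    ', 'MAL1P1.18', 'pfmalp012', 'alap11', '',
    'Table: Y2H Interactions',
);

my @alias;
for (@input) {
    # The flip-flop becomes true on the [Alias] line and false again
    # after the first whitespace-only line.
    if ( /\[Alias\]/ .. /^\s+$/ ) {
        next if /\[Alias\]/ or /^\s+$/;    # drop the two boundary lines
        chomp;
        push @alias, $_;
    }
}

print "$_\n" for @alias;
```

Note that the terminator only fires on a blank line while the line still carries its newline (or other whitespace); a chomped empty line would not match `/^\s+$/`.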
Re: need help in extracting lines by jdporter (Paladin) on Jan 13, 2009 at 14:44 UTC
Others have already addressed your explicit question, but I'd like to suggest a different approach to your problem. Of course, I'm making some assumptions about the real nature of your problem, so correct me if I'm wrong. Or simply disregard. :-)

It appears that your data is a series of chunks separated by empty lines, and that each chunk begins with a `Table: ...` line. If so, then we could use perl's "paragraph" mode of reading input records:

    local $/ = ''; # read paragraphs
    while (<>) {
        # each paragraph is multiple lines, the first of which is "Table: ..."
        # and the second is some kind of tag enclosed in brackets.
        my( $table ) = /^Table: (.*)/ or die "Hm... bad paragraph:\n$_";
        my( undef, $tag, @lines ) = split /\n/;
        if ( $tag eq '[Alias]' ) {
            push @aliases, \@lines;
        }
    }

Between the mind which plans and the hands which build, there must be a mediator... and this mediator must be the heart.
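A self-contained sketch of the paragraph-mode idea that can be run without an input file. The sample text in `$data`, the in-memory filehandle, and the `%lines_for` hash keyed by table name are all assumptions made for illustration; they are not part of the code posted above.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for the real file.
my $data = join "\n",
    'Table: GO Terms', '[entity]', '',
    'Table: Aliases', '[Alias]', 'MAL1P1.18', 'pfmalp012', 'alap11', '',
    'Table: Y2H Interactions', '';

open my $fh, '<', \$data or die "Cannot open in-memory file: $!";

my %lines_for;    # table name => array ref of the lines after the tag line
{
    local $/ = '';    # paragraph mode: a record ends at one or more blank lines
    while ( my $para = <$fh> ) {
        # First line is the "Table: ..." header, second line is the tag.
        my ( $header, undef, @lines ) = split /\n/, $para;
        my ($table) = $header =~ /^Table: (.*)/ or next;
        $lines_for{$table} = \@lines;
    }
}
close $fh;

print "$_\n" for @{ $lines_for{'Aliases'} };
```

Keying on the table name sidesteps any stray whitespace on the tag line, and a table with no entries simply ends up with an empty list.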
Re^2: need help in extracting lines by Wiggins (Hermit) on Jan 13, 2009 at 20:46 UTC
I tried this code out and needed to make a change or two:
-- the `/^Table` regex needs a trailing 's' modifier.
-- in the 'if ( $tag' condition I changed the 'eq' to '=~', because it turns out that the '[Alias]' line actually ends with a trailing space, and is '[Alias] '.
But the paragraph mode is great! I once used a customer-specific language that did list processing in paragraphs, but that was in the '70s. It is a great processing mode. It is always better to have seen your target for yourself, rather than depend upon someone else's description.
Re^3: need help in extracting lines by jdporter (Paladin) on Jan 13, 2009 at 21:25 UTC
"/^Table regex needs an 's' trailing modifier"

Depends on what you want in the `$table` variable. I was intending to get the name of the table, i.e. only what follows '`Table:`' on that line. Adding the `s` modifier would put the entire rest of the paragraph into the variable.

"the '[Alias]' line actually ends with a trailing space, and is '[Alias] '"

In that case, I'd write `if ( $tag eq '[Alias] ' )` :-) But more importantly, square brackets are special in regular expressions, so you'd want to escape them if you go that route: `if ( $tag =~ /\[Alias\]/ )`
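A small sketch of the points in this exchange, run against a made-up paragraph in `$para` whose tag line has trailing spaces. The anchored pattern `/^\[Alias\]\s*$/` is one possible way to tolerate the trailing whitespace, offered as an assumption rather than as what either poster used.

```perl
use strict;
use warnings;

# Hypothetical paragraph; note the trailing spaces after [Alias].
my $para = "Table: Aliases\n[Alias]    \nMAL1P1.18\n";

my ($name_only) = $para =~ /^Table: (.*)/;     # "." stops at the first newline
my ($rest)      = $para =~ /^Table: (.*)/s;    # with /s, "." also matches newlines

my ( undef, $tag ) = split /\n/, $para;

my $exact    = $tag eq '[Alias]'           ? 1 : 0;    # fails: trailing spaces
my $anchored = $tag =~ /^\[Alias\]\s*$/    ? 1 : 0;    # tolerates trailing spaces

# Unescaped, [Alias] is a character class: any one of A, l, i, a, s.
my $class_hit = 'Table: GO Terms' =~ /[Alias]/ ? 1 : 0;

print "without /s: '$name_only'\n";
print "exact eq:   $exact\n";
print "anchored:   $anchored\n";
print "unescaped pattern matches a Table line: $class_hit\n";
```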
Re: need help in extracting lines by Bloodnok (Vicar) on Jan 13, 2009 at 13:35 UTC
Just goes to show that you can see exactly what you want to (see, that is...). Without doing the one-liner (coz Corion had already done that:-), it took 3 scans before I noticed the missing parens. A user level that continues to overstate my experience :-))