Here's a go at it without using a big regex. It looks for \begin{...} and \end{...} markers and stores the text between each pair in an anonymous array, held in a hash keyed by the environment name.

    use strict;
    use warnings;

    #-- hash that will be keyed by Latex element
    my %in = ();

    #-- input is read from the script's __DATA__ section (not shown here)
    while ( <DATA> ) {
        if ( /\\begin\{(\S+?)\}/ ) {
            #-- tell that we're in a block
            $in{$1}->{Status} = "in";
            #-- add a new element to the anon array containing this info
            push @{$in{$1}->{Text}}, "";
        }
        if ( /\\end\{(\S+?)\}/ ) {
            $in{$1}->{Status} = "pending";
        }
        #-- now loop on all keys, see if we are in that element.
        foreach my $key ( keys %in ) {
            my $status = $in{$key}->{Status};
            if ( $status eq "in" || $status eq "pending" ) {
                #-- add text to last element of the array
                $in{$key}->{Text}->[$#{$in{$key}->{Text}}] .= $_;
            }
            #-- the \end line itself has been stored, so close the block
            $in{$key}->{Status} = "out" if $status eq "pending";
        }
    }

    #-- write it out. Here loop over all possible keys. Could be restricted to
    #   'exertext', 'answers', 'soln'
    foreach my $key ( keys %in ) {
        my $file = $key . ".tex";
        print "Creating file '$file'\n";
        open( my $fh, '>', $file )
            or die "Cannot open file '$file' for writing: $!\n";
        #-- write each element of array
        my $i = 1;
        foreach my $text ( @{$in{$key}->{Text}} ) {
            print $fh "%% Starting Element $i\n";
            print $fh $text;
            $i++;
        }
        close $fh;
    }

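The output is one .tex file per environment name found. With nested tags, the sub-tags are included in the enclosing tag's file (so that exer.tex includes the contents of exertext.tex, soln.tex, answer.tex). If you ran it on a full doc, presumably "document.tex" would hold essentially your whole input doc (everything from \begin{document} to \end{document}).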
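
To make the nesting behaviour concrete, here is a minimal sketch of the same scan run on an in-memory string instead of <DATA>. The helper name collect_blocks and the sample text are made up for illustration and are not from the original post. The sketch also skips an \end that has no earlier \begin, which the script above does not do.

```perl
use strict;
use warnings;

# Hypothetical helper (not in the original post): the same line-by-line
# scan, but reading from a string and returning the collected blocks
# as { environment name => [ text of each occurrence ] }.
sub collect_blocks {
    my ($source) = @_;
    my %in;
    for my $line ( split /^/, $source ) {    # split /^/ keeps the newlines
        if ( $line =~ /\\begin\{(\S+?)\}/ ) {
            $in{$1}{Status} = "in";
            push @{ $in{$1}{Text} }, "";     # start a new occurrence
        }
        if ( $line =~ /\\end\{(\S+?)\}/ ) {
            # Assumption: ignore an \end that was never opened.
            $in{$1}{Status} = "pending" if exists $in{$1};
        }
        for my $key ( keys %in ) {
            my $status = $in{$key}{Status};
            next unless $status eq "in" || $status eq "pending";
            $in{$key}{Text}[-1] .= $line;    # append to the open occurrence
            $in{$key}{Status} = "out" if $status eq "pending";
        }
    }
    return { map { $_ => $in{$_}{Text} } keys %in };
}

# Made-up sample input, for illustration only.
my $sample = <<'END';
\begin{exer}
\begin{exertext}
What is 2 + 2?
\end{exertext}
\begin{soln}
4
\end{soln}
\end{exer}
END

my $blocks = collect_blocks($sample);

for my $key ( sort keys %$blocks ) {
    my $n = 1;
    for my $text ( @{ $blocks->{$key} } ) {
        print "== $key, element $n ==\n$text";
        $n++;
    }
}
```

On this sample the exer entry holds all eight lines, while exertext and soln each hold only their own three lines, which is the "sub-tags are included" effect described above.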

- j


In reply to Re: Parsing Exercise set by jimbojones
in thread Parsing Exercise set by David Arnold
