in reply to Re: Efficiently Extracting a Range of Lines (was: Range This)
in thread Efficiently Extracting a Range of Lines (was: Range This)

i'm curious why you overlooked my solution. although it may not be the "best" way, my solution was created from lessons i learned from posts like Code Smarter, and Death to Dot Star!

although my solution breaks on multiple START/STOP tags, this requirement was not specified in the question. i would add this functionality for a more general solution, but i'd also need to know if it should handle nested tags or not.

my solution will, however, match START/STOP tags anywhere in the input stream, as was specified by the code in the original post. it will work if the STOP tag does not exist, which is how i read the original data (granted, this might be a typo). and it matches the behaviour of including the START/STOP tags in the results. mine outputs a string instead of a list, but that is easily remedied with split, either in the return statements or outside the find_between_tags() function.
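
to make the behaviour described above concrete, here is a minimal sketch. it is not the code from my earlier post: the argument list, the tag names, and the sample data are assumptions made for illustration. it finds the first START tag with index, keeps both tags in the result, falls back to the end of the input when STOP is missing, and leaves the split to the caller.

```perl
use strict;
use warnings;

# Hypothetical reconstruction for illustration only: the signature and
# the tag names are assumptions, not the code from the earlier post.
sub find_between_tags {
    my ($text, $start, $stop) = @_;

    # Locate the first START tag anywhere in the input.
    my $from = index($text, $start);
    return '' if $from < 0;

    # Search for the STOP tag only after the end of the START tag.
    my $to = index($text, $stop, $from + length($start));

    # No STOP tag: return everything from START to the end of the input.
    return substr($text, $from) if $to < 0;

    # Include both tags in the returned string.
    return substr($text, $from, $to + length($stop) - $from);
}

# Assumed sample data.
my $stuff = "junk\nSTART\nline 1\nline 2\nSTOP\nmore junk\n";
my $found = find_between_tags($stuff, 'START', 'STOP');

# Turn the string into a list of lines outside the function.
my @lines = split /\n/, $found;
print "$_\n" for @lines;
```

as written, this still handles only the first START/STOP pair and does nothing special for nested tags.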

~Particle

Re3: Efficiently Extracting a Range of Lines (was: Range This)
by Hofmator (Curate) on Jul 11, 2001 at 19:45 UTC

    particle, I overlooked your solution on purpose ;-). That has absolutely nothing to do with the quality - all of them work fine on single START/STOP tags.

    I just was not sure how quick index is compared to the regex solutions. The other two both use regexes and are thus easy to compare with each other. I thought index should be quicker than a regex on a fixed string:

    $i = index $stuff, 'START'; # compared to $stuff =~ /START/;
    but the benchmark suggested otherwise. The difference might be because of the function overhead and the surrounding code in your solution. But I was too lazy to test this with some proper benchmarks ...
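
    A proper test could isolate the two operations with the core Benchmark module, so that function-call overhead and surrounding code do not enter the comparison. The following is only a sketch: the test string is made-up data, and no timings are claimed here.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Assumed test data: a long filler string with the fixed tag near the end.
my $stuff = ("some line of filler text\n" x 10_000) . "START\npayload\nSTOP\n";

# Run each candidate for at least one CPU second, then print a
# table comparing their rates.
cmpthese(-1, {
    index => sub { my $i   = index($stuff, 'START'); },
    regex => sub { my $hit = $stuff =~ /START/; },
});
```

    Results will depend on the Perl version and on where the tag sits in the data, so it is worth varying both before drawing conclusions.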

    -- Hofmator