Re: Split Function

Update: Ignore this! It is much slower.

This might work out a little faster if performance is an issue, which it usually is with genome-related stuff.

#! perl -slw
use strict;

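# One mandatory capture followed by 100 optional ones. 'x' binds tighter
# than '.', so only the optional group is repeated.
my $re = '([^X]+)X*' . '(?:([^X]+)X*)?' x 100;
$re = qr[$re];

my $screen = "ATCGATCGXXXXXATCGATXXXACTGCTACGGTACXXXAATTATXGCGCGXXT";

$screen =~ $re;

# @- holds the start offset of each capture group; $#- is the last group that matched.
print for @-[ 1 .. $#- ];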

__END__
P:\test>test2
0
13
22
38
45
52

Examine what is said, not who speaks.

"Efficiency is intelligent laziness." -David Dunham
"Think for yourself!" - Abigail

Re: Re: Split Function - Positions by duff (Parson) on Jun 02, 2004 at 03:10 UTC
It is much slower. Surely it must be so; look at the size of your RE! :-)

duff
Re: Re: Re: Split Function - Positions by BrowserUk (Patriarch) on Jun 02, 2004 at 03:28 UTC
Agreed. Though I had thought that by grabbing the matches with a standard `m[([^X]+)]g` first, I would know how big to make the big re. Then a second pass would populate `@-`.

As it turns out, `push @posns, pos($screen) - length $1 while $screen =~ /([^X]+)/g;` is substantially faster than `push @posns, $-[ 0 ] while $screen =~ m[([^X]+)]g;` which surprised me.

I'm not sure why that would be. My best guess is that `@-` uses tie-style magic and isn't populated when the regex runs, only when it is accessed. Perhaps the captures are made in the form of LVALUE refs, and `@-` and `@+` are derived from those if and when they are called for?
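For anyone who wants to check the comparison on their own perl, here is a minimal sketch that wraps the two one-liners in subs and hands them to the core Benchmark module. The sub names `by_pos` and `by_minus` are made up for illustration, and the test string is the one from the original post. Relative timings may well differ between perl versions and input sizes, so treat whatever it prints as a data point rather than a verdict.

```perl
#! perl
use strict;
use warnings;
use Benchmark qw( cmpthese );

my $screen = "ATCGATCGXXXXXATCGATXXXACTGCTACGGTACXXXAATTATXGCGCGXXT";

# Start offsets computed from pos() and the length of the capture.
sub by_pos {
    my @posns;
    push @posns, pos( $screen ) - length $1 while $screen =~ /([^X]+)/g;
    return @posns;
}

# Start offsets read directly from @-.
sub by_minus {
    my @posns;
    push @posns, $-[ 0 ] while $screen =~ m[([^X]+)]g;
    return @posns;
}

# Both approaches should report the same positions.
print "by_pos:   @{[ by_pos() ]}\n";
print "by_minus: @{[ by_minus() ]}\n";

# Run each sub for at least one CPU second and print a comparison table.
cmpthese( -1, {
    by_pos   => sub { my @p = by_pos() },
    by_minus => sub { my @p = by_minus() },
} );
```

A short string like this one mostly measures per-match overhead; substituting a much longer sequence for `$screen` would be closer to real genome data.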