Re: greping big numbers

There is a trick for managing Unicode character ranges: store where each sequence starts and then where it stops. You can employ this trick here as well. If you look at all your numbers in order, you'll find that you start a sequence at the first number, count up until you reach the next number, where the sequence stops, then start again with the number after that, and so on:

23    900   8000  10000 ...
start stop  start stop  ...
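As a rough illustration of why this representation is handy, here is a minimal Python sketch of a membership test on such a list. The function name `contains` is made up for this example, and it assumes each start is inclusive and each stop is exclusive, which the post has not settled at this point.

```python
from bisect import bisect_right

# Boundaries from the example above: start, stop, start, stop, ...
# Assumption: each start is inclusive and each stop is exclusive.
boundaries = [23, 900, 8000, 10000]

def contains(boundaries, n):
    """Return True if n falls inside one of the start/stop runs."""
    # Count the boundaries that are <= n. An odd count means the last
    # boundary passed was a start, so n lies inside a run.
    return bisect_right(boundaries, n) % 2 == 1

print(contains(boundaries, 500))   # True
print(contains(boundaries, 900))   # False (stop is exclusive here)
print(contains(boundaries, 5000))  # False
```

The lookup is a binary search, so it stays cheap even when the ranges themselves span millions of numbers.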

The thing to realize is that the inverted sequence (which you want) is constructed by simply prepending a 0 to that list:

      23    900   8000  10000 ...
      start stop  start stop  ...
0     23    900   8000  10000 ...
start stop  start stop  start ...
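The inversion step can be sketched in a few lines of Python. The name `invert` is hypothetical, and the handling of a list that already begins with 0 is an assumption added here (the post only describes prepending), so that inverting twice gives back the original list.

```python
def invert(boundaries):
    """Return the boundary list of the complement set."""
    # Assumption: values are non-negative integers, so 0 is the smallest
    # possible boundary.
    if boundaries and boundaries[0] == 0:
        # The set already starts at 0, so the complement drops that marker.
        return boundaries[1:]
    # Otherwise prepending 0 turns every start into a stop and vice versa.
    return [0] + boundaries

original = [23, 900, 8000, 10000]
inverse = invert(original)
print(inverse)          # [0, 23, 900, 8000, 10000]
print(invert(inverse))  # [23, 900, 8000, 10000]
```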

The only thing you still need to know about is whether the boundaries should be included in your list or not. From your description above, it seems that the list should start with 901 again, so you'll have to adjust the end markers for the first list (and thus the start markers for the inverse) by +1:

      23    901   8000  10001 ...
      start stop  start stop  ...
0     23    901   8000  10001 ...
start stop  start stop  start ...

Writing the code for this is left as an exercise to the reader.
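For readers who would rather compare notes than start from scratch, one possible take on the exercise is sketched below in Python. It is only a sketch under stated assumptions: the input ranges are inclusive, sorted, and non-overlapping, the function names and the `upper` parameter are invented for this example, and the sample ranges are hypothetical rather than the numbers from the thread.

```python
def ranges_to_inversion_list(ranges):
    """Flatten sorted, non-overlapping inclusive (first, last) ranges."""
    boundaries = []
    for first, last in ranges:
        boundaries.append(first)     # start marker: first value inside
        boundaries.append(last + 1)  # stop marker: first value outside
    return boundaries

def complement_ranges(ranges, upper):
    """Inclusive ranges of 0..upper that the input ranges do not cover."""
    boundaries = ranges_to_inversion_list(ranges)
    # Invert: prepend 0, or drop it if the input already starts at 0.
    if boundaries and boundaries[0] == 0:
        inverse = boundaries[1:]
    else:
        inverse = [0] + boundaries
    # An odd number of markers means the last run is open-ended; close it.
    if len(inverse) % 2 == 1:
        inverse.append(upper + 1)
    pairs = zip(inverse[0::2], inverse[1::2])
    # Convert exclusive stops back to inclusive ends, skipping empty runs.
    return [(start, stop - 1) for start, stop in pairs if start < stop]

# Hypothetical sample data, not the numbers from the thread.
print(complement_ranges([(10, 19), (30, 39)], upper=50))
# [(0, 9), (20, 29), (40, 50)]
```

The +1 on each end marker is the boundary adjustment described above: it makes the stop marker the first number that belongs to the inverse.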

I learned this trick from demerphq as a good way to implement the (vast) Unicode character ranges, together with quick inversion.

Update: moritz found an earlier mention of the concept described by IBM at Cultured Perl: Inversion lists with Perl. The technical term seems to be "inversion lists", and there is Algorithm::InversionList, which implements the concept. The original interest in the data structure is attributed there to Unicode as well.


Replies are listed 'Best First'.

Re^2: greping big numbers
by wol (Hermit) on Mar 24, 2009 at 13:03 UTC

Ingenious. It's possible to deal with overlapping regions fairly easily on top of this algorithm. Once you have your initial inversion list and you've adjusted your boundaries, just remove any resulting ranges which are "back to front".

E.g. (for a complete range of 0-20):
input: (4-12), (8-16)
initial inversion: (0-4), (12-8), (16-20)
tweak boundaries: (0-3), (13-7), (17-20)
remove back-to-fronts: (0-3), (17-20)

Writing the code is (once again) left as an exercise for the reader (or an instruction to a minion, as applicable).
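For anyone without a minion to hand, the three steps from the example might look roughly like this in Python. The function name `invert_overlapping` and its `lo`/`hi` parameters are invented for this sketch, and it assumes inclusive ranges that are sorted by their start and never fully nested inside an earlier range.

```python
def invert_overlapping(ranges, lo, hi):
    """Gaps in lo..hi left uncovered by inclusive (first, last) ranges."""
    # Flatten to first, last, first, last, ... and add the outer bounds.
    flat = [lo]
    for first, last in ranges:
        flat.extend((first, last))
    flat.append(hi)
    # Initial inversion: pair the markers up again, shifted by one place.
    inverted = list(zip(flat[0::2], flat[1::2]))
    # Tweak boundaries so the gaps exclude the input ranges' own endpoints.
    tweaked = []
    for i, (start, end) in enumerate(inverted):
        if i > 0:
            start += 1  # step past the end of the previous input range
        if i < len(inverted) - 1:
            end -= 1    # stop just before the next input range begins
        tweaked.append((start, end))
    # Remove the back-to-front ranges, which come from overlaps.
    return [(start, end) for start, end in tweaked if start <= end]

print(invert_overlapping([(4, 12), (8, 16)], lo=0, hi=20))
# [(0, 3), (17, 20)]
```

A range nested entirely inside an earlier one, such as (8-10) after (4-16), would produce a gap that is not really free, so input like that needs merging first.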

--
use JAPH;
print JAPH::asString();
