in reply to Regex "(un)Knowledge"

First, let us do something a bit easier. We'll use the character offset instead of the line number as key to the hash:

    my %hash;
    my $re= qr{
        ( /\* .*? \*/)
      | ( \/\/[^\n]*)
      | " (?: [^"\\]* | \\. )* "
      | ' (?: [^'\\]* | \\. )* '
      | . [^/"']*
    }xs;
    while( /$re/g ) {
        $hash{pos($_)}= $1;
    }

Then there are several ways to convert character offsets into line numbers. If none of your patterns spanned lines, then I'd probably update the regex to match newlines separately so I could increment a line number count in the same loop. But /* */ can span lines so I think I'd instead do a merge-sort-ish thing similar to:

    my @nl;
    while( /\n/g ) {
        push @nl, pos($_);
    }
    my $ln= 1;
    while( /$re/g ) {
        $ln++ while $nl[$ln-1] < pos($_);
        $hash{$ln}= $1;
    }

Except I think there is probably at least one off-by-one error in that code. For example, pos($_) might need to be replaced with something from @- or @+ in one or both of those places.
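
One way to settle the off-by-one question is to key on where each match starts ($-[0]) rather than where it ends (pos). What follows is only a sketch under assumptions that are not in the code above: the sub name comments_by_line is hypothetical, both comment styles share a single capture group so that $1 is set for either kind, non-comment matches are skipped, and a bounds check stops the line counter once the newline list is exhausted.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): map the line number
# on which each comment starts to the text of that comment.
sub comments_by_line {
    my ($src) = @_;

    my $re = qr{
        ( /\* .*? \*/ | // [^\n]* )     # group 1: block or line comment
      | " (?: [^"\\]+ | \\. )* "        # double-quoted string, skipped
      | ' (?: [^'\\]+ | \\. )* '        # single-quoted string, skipped
      | . [^/"']*                       # any other text, skipped
    }xs;

    # Offset just past each newline; entry N-1 is where line N+1 begins.
    my @nl;
    push @nl, pos($src) while $src =~ /\n/g;

    my %by_line;
    my $ln = 1;
    while ( $src =~ /$re/g ) {
        next unless defined $1;         # only comments are recorded
        my $start = $-[0];              # offset where this match begins
        # Step over every line that ends at or before the match start.
        $ln++ while $ln <= @nl && $nl[ $ln - 1 ] <= $start;
        $by_line{$ln} = $1;             # a later comment on the same line overwrites
    }
    return \%by_line;
}
```

Calling comments_by_line($text) on the slurped file contents returns a hash reference keyed by line number. A block comment that spans several lines is filed under the line where it opens.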

I hope it gives you an idea where to start to get what you are looking for.

                - tye

Re: Re: Regex "(un)Knowledge" (loop)
by nofernandes (Beadle) on Jul 15, 2003 at 18:20 UTC

    But in this case, how can I read the file?

    I must read the whole content of the file at once in order to catch multi-line comments.

      I think you misunderstood something.

      You can/should read the file all at once with my approach (I already assumed you were doing this based on comments elsewhere in the thread and because, as you note above, you need to match multi-line comments).

      Just use your existing file-reading code. I only rewrote the code you provided, so you only need to replace that part of your code (except that I didn't bother to repeat the close statement). Update: nor did I repeat the file-reading code (my mind simply blocked out that part of your code).
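
      To make that concrete: a bare /$re/g matches against $_, so the file contents have to be in $_, or in a variable bound with =~, before the loop runs. Here is a minimal sketch of the slurp-then-scan step. The sub names slurp_file and comments_by_offset are hypothetical, and the regex is a lightly adjusted copy of the one above (one capture group for both comment styles, keyed on the match start rather than pos).

```perl
use strict;
use warnings;

# Hypothetical helper: read a whole file into a single string.
sub slurp_file {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    local $/;                           # undef only within this sub: read to EOF
    my $text = <$fh>;
    close $fh;
    return defined $text ? $text : '';
}

# Hypothetical helper: map the start offset of each comment to its text.
sub comments_by_offset {
    my ($src) = @_;
    my $re = qr{
        ( /\* .*? \*/ | // [^\n]* )     # group 1: block or line comment
      | " (?: [^"\\]+ | \\. )* "        # double-quoted string, skipped
      | ' (?: [^'\\]+ | \\. )* '        # single-quoted string, skipped
      | . [^/"']*                       # any other text, skipped
    }xs;

    my %found;
    while ( $src =~ /$re/g ) {          # bound to the slurped text, not to $_
        $found{ $-[0] } = $1 if defined $1;
    }
    return \%found;
}
```

      Usage would look like my $found = comments_by_offset( slurp_file($file) ); where $file holds the path to your source file.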

                      - tye

        Hmm, I see... but I cannot make this run! I'm a bit of a newbie in the Perl language!

        Can you explain why my code doesn't work?

        $file="Finger.java";
        open(F,"$file");
        undef $/;
        my %hash;
        my $re= qr{
            ( /\* .*? \*/)
          | ( \/\/[^\n]*)
          | " (?: [^"\\]* | \\. )* "
          | ' (?: [^'\\]* | \\. )* '
          | . [^/"']*
        }xs;
        while( /$re/g ) {
            $hash{pos($_)}= $1;
        }
        my @nl;
        while( /\n/g ) {
            push @nl, pos($_);
        }
        my $ln= 1;
        while( /$re/g ) {
            $ln++ while $nl[$ln-1] < pos($_);
            $hash{$ln}= $1;
        }
        @keys=sort {$a<=>$b} (keys %hash);
        foreach $key (@keys) {
            $value=$hash{$key};
            $hash_ordenada{$key}=$value;
            print "Line: $key\t$value\n";
        }

        Thank you very much!

        NUNO