Re^3: Working with regexes

well, you just can't apply a regex that is supposed to match across multiple lines to single lines, one line at a time. that won't ever work.

this will do the job:

#!/usr/bin/perl

use warnings;
use strict;

my $infile  = "input.txt";
my $outfile = "output.txt";
my $text;

{
    local $/ = undef;    # undef the input record separator locally to slurp the whole file
    open(my $in, '<', $infile) or die "died opening $infile for input: $!\n";
    $text = <$in>;
    close($in);
}

open(OUT, '>', $outfile) or die "died opening $outfile for output: $!\n";

# repeatedly match the searched text, deleting each found occurrence
# to avoid an endless loop

while ( $text =~ s/([A-Z]+) *- *([^-]+) *- *([^"]+)"([^"]+)".+?([0-9]+)$//sm )
{
    print OUT "CATEGORY $1\n\n",
              "KEYWORDS $2\n\n",
              "Summary text: $3\n\n",
              "Reference: $4\n\n",
              "Id: $5\n\n";
}

close OUT;

i suggest reading perlre
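
To see why the line-by-line approach cannot work, here is a minimal sketch. The sample record is made up (the real input.txt is not shown in this thread), so treat its layout as an assumption; only the regex is taken from the script above.

```perl
#!/usr/bin/perl

use warnings;
use strict;

# hypothetical two-line record, NOT the original poster's data
my $record = "NEWS - perl, regex - A short summary \"Some Journal\"\nId: 42\n";

# same pattern as in the script above
my $re = qr/([A-Z]+) *- *([^-]+) *- *([^"]+)"([^"]+)".+?([0-9]+)$/sm;

# line by line: no single line holds the whole record, so nothing matches
my $line_hits = 0;
for my $line ( split /^/m, $record ) {
    $line_hits++ if $line =~ $re;
}

# whole text at once: the record spans both lines and matches;
# in list context the match returns the five captures
my @fields = $record =~ $re;

print "line by line: $line_hits match(es)\n";
print "whole text:   ", scalar(@fields), " captured field(s)\n";
```

The first line ends before the pattern ever reaches the trailing digits, and the second line has no "CATEGORY - keywords -" part at all, which is why the file has to be read in as a whole.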

Update:

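this one is better because it uses m//g with \G, so each search resumes where the previous match ended. that avoids modifying the string and rescanning it from the start on every iteration.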

#!/usr/bin/perl

use warnings;
use strict;

my $infile  = "input.txt";
my $outfile = "output.txt";
my $text;

{
    local $/ = undef;    # undef the input record separator locally to slurp the whole file
    open(my $in, '<', $infile) or die "died opening $infile for input: $!\n";
    $text = <$in>;
    close($in);
}

open(OUT, '>', $outfile) or die "died opening $outfile for output: $!\n";

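# repeatedly match the searched text; with /g in scalar context each match
# resumes at pos($text), and \G anchors it there, so the loop terminates
# without deleting anything from $text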

while ( $text =~ m/\G.*?([A-Z]+) *- *([^-]+) *- *([^"]+)"([^"]+)".+?([0-9]+)$/gsm )
{
    print OUT "CATEGORY $1\n\n",
              "KEYWORDS $2\n\n",
              "Summary text: $3\n\n",
              "Reference: $4\n\n",
              "Id: $5\n\n";
}

close OUT;
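
For anyone wondering why the m//g loop terminates even though nothing is deleted, here is a small sketch of how pos() moves along the string. The sample text and the shortened pattern are made up for illustration; they are not the original data or the full regex.

```perl
#!/usr/bin/perl

use warnings;
use strict;

# hypothetical sample text, only for illustration
my $text = "id 10\nid 20\nid 30\n";
my $copy = $text;    # kept to show that $text is never modified

my @ids;
while ( $text =~ m/\G.*?([0-9]+)$/gsm ) {
    # in scalar context /g remembers where the match ended in pos($text);
    # \G anchors the next attempt at exactly that offset
    push @ids, $1;
    printf "found %s, next search starts at offset %d\n", $1, pos($text);
}

print "string unchanged\n" if $text eq $copy;
```

Once no further match is possible, the failed match resets pos($text) and the while condition is false, so the loop ends on its own.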
