Hi Dave,
thank you for your answer.
I have just tried your advice again, on another computer and without the "big file", and the patterns with the "second part" work now (I added two lines of text without a dot to DATA below). Perhaps I made a typo a couple of hours earlier. The only change I made was adding \b to the start of the patterns (see below), since otherwise they also matched
12-22a **This is trash***
I'll try this again tomorrow with the "big file" and report back here. Thank you very much again.
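To make the \b point concrete, here is a minimal sketch (an illustration only, separate from the script below; the helper name first_item is made up for this demo). It runs the first pattern with and without the leading \b against the trash line and against a genuine item line:

```perl
use strict;
use warnings;
use 5.010;

# The same item pattern, with and without the leading \b anchor.
my $with_b    = qr/(\b\d-\d\w{2}(?:\.\w+)?)/;
my $without_b = qr/(\d-\d\w{2}(?:\.\w+)?)/;

# Hypothetical helper: return the first captured item code, or 'no match'.
sub first_item {
    my ( $re, $line ) = @_;
    return $line =~ $re ? $1 : 'no match';
}

for my $line ( '12-22a **This is trash***', '1-22a.b Just another text' ) {
    say $line;
    say '  without \b: ', first_item( $without_b, $line );
    say '  with \b:    ', first_item( $with_b,    $line );
}
```

Without \b the pattern can start in the middle of "12" and picks up "2-22a" from the trash line. With \b it has to start at a word boundary, so the trash line is skipped, while "1-22a.b" is still found either way.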
VE
use strict;
use warnings;
use 5.010;

my $pat1 = qr '(\b\d-\d\w{2}(?:\.\w+)?)';
my $pat2 = qr '(\b[A-Z]\d{2}(?:\.\d+)?)';
my $title;

while ( <DATA> ) {
    chomp;
    if ( /(.+)(\(.+\))$/ ) { # Better regex for 'title' lines??
        $title = "$1$2;";
    }
    else {
        next unless /$pat1|$pat2/;
        my @items = grep length, split /$pat1|$pat2/;
        say $title, splice @items, 0, 2 while @items;
    }
}
__DATA__
Titel Text (A12-3)
3-123.7 Just another (small) text 3-123.8 Some more text
1-234 Text without dot 2-345 More text without dot
A35 Another text without dot
A12.34 Another item B56.78 Yet another item
Another Titel Text (B23-9)
Some trash here
12-22a **This is trash***
1-22a.b Just another text
2-3cd.e Some more text
W12.34 Another item
Z56.78 Yet another item Z56.78 And another!! Z56.7a And another!!!
Some trash