When I run your code, I get a quite different set of results from those you posted.
P:\test>test
Xq27-q28
22q12.1
19q
1q25
11p11.2
10q23.31
8p22
7p22
19q12-q13.11
52
15
19q12-
That makes it difficult to answer your two questions.
In any case, it is probably better to specify the regex more accurately so that you don't capture the unwanted values in the first place. This seems to do the trick on the sample you supplied.
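#! perl -w
use strict;
# Read the sample text (a single line) from the DATA section below.
my $text = <DATA>;
# One band: optional chromosome (digits, X or Y), arm (p or q), band number,
# optional sub-band. The dot is escaped so it matches only a literal '.'.
my $re_1chrom = '[\dXY]*[pq]\d+(?:\.\d+)?';
# Either a single band or a range of two bands joined by a hyphen.
my $re = qr[\b$re_1chrom(?:-$re_1chrom)?\b]i;
my @chroms = ( $text =~ /$re/g );
print $_, $/ for @chroms;
__END__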
P:\test>test
Xq27-q28
Xq11-q12
22q12.1
1q42.2-q43
17p11
1q25
13q12.3
11p11.2
10q25
10q23.31
8p22
8p22
7q11.23
7p22
19q12-q13.11
19q12-q13.11
The sequential double reference to 8p22 seems odd in context, but eliminating that and the other duplicate(s) is a simple step.
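For what it's worth, here is a minimal sketch of that de-duplication step, using the usual `%seen` hash idiom. The `@chroms` list is a hypothetical sample copied from a few of the matches above, not your real data, and `@unique` is just a name picked for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample: a few of the matches listed above, duplicates included.
my @chroms = qw( 8p22 8p22 7q11.23 7p22 19q12-q13.11 19q12-q13.11 );

# Keep only the first occurrence of each value, preserving the original order.
my %seen;
my @unique = grep { !$seen{$_}++ } @chroms;

print "$_\n" for @unique;
```

This drops every repeat, adjacent or not. If you only want to collapse back-to-back repeats, compare each value with the previous one instead of using a hash.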