Hi,

I'm processing the text below and trying to extract all mentions of chromosomal bands (the tokens made of numbers, p's and q's). I'm using this code:

$c='[\d\-\.pqxy]'; #regexp
(@chroms)=($text=~/\s$c*?\s/sig); #extract all
for ($i=0; $i<@chroms; $i++) {
    splice(@chroms, $i, 1) if (!($chroms[$i]=~/[pqxy]/i));
} #eliminate pure numbers
print "$_\n" foreach (@chroms);

The print statement yields:

Xq27-q28 q11q12 22q121 1q422q43 19q 17p11 1q25 13q123 11p112 1q25 1q2331 8p22 8p22 7q1123 7p22 19q12q1311 52 15 19q12q1311

Two questions: why doesn't the splice statement get rid of the "52" and "15" elements? Why do none of the dots or hyphens appear for the captured elements?

Whole-genome scan studies recently identified a locus on Xq27-q28 Xq11-q12 22q12.1 1q42.2-q43 19q 17p11 1q25 13q12.3 11p11.2 10q25 10q23.31 8p22 8p22 7q11.23 7p22 chromosome segments 19q12-q13.11 linked to prostate tumor aggressiveness by use of the Gleason score as a quantitative trait. We have now completed finer-scale linkage mapping across this region that confirmed and narrowed the candidate region to 2 cM, with a peak between markers D19S875 and D19S433. We also performed allelic imbalance (AI) studies across this region in primary prostate tumors from 52 patients unselected for family history or disease status. A high level of AI was observed, with the highest rates at markers D19S875 (56%) and D19S433 (60%). Furthermore, these two markers defined a smallest common region of AI of 0.8 Mb, with 15 (29%) prostate tumors displaying interstitial AI involving one or both markers. In addition, we noted a positive association between AI at marker D19S875 and extension of tumor beyond the margin (P = 0.02) as well as a higher Gleason score (P = 0.06). These data provide strong evidence that we have mapped a prostate tumor aggressiveness locus to chromosome segments 19q12-q13.11 that may play a role in both familial and non-familial forms of prostate cancer.
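
On the first question, one likely cause is the forward loop itself: each time splice removes the element at index $i, the following element slides into that slot, and the $i++ then steps past it without testing it. Two pure numbers in a row would therefore leave the second one behind. Below is a minimal sketch of that effect next to a grep-based filter. The sample list and the names @looped and @filtered are made up for illustration and are not taken from the code above.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the captured matches (not the real @chroms).
my @chroms = ('19q12', '2', '52', '0.8', '15', '19q13');

# Forward loop with splice, as in the post: after a removal the next
# element moves into slot $i and is skipped when $i is incremented.
my @looped = @chroms;
for (my $i = 0; $i < @looped; $i++) {
    splice(@looped, $i, 1) if $looped[$i] !~ /[pqxy]/i;
}

# grep builds a new list and tests every element exactly once.
my @filtered = grep { /[pqxy]/i } @chroms;

print "loop: @looped\n";    # loop: 19q12 52 15 19q13
print "grep: @filtered\n";  # grep: 19q12 19q13
```

If the loop has to stay, walking the indices from the end of the array down to 0 should avoid the skipping as well, since a removal then only shifts elements that have already been checked.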


In reply to regexp and substitute operator problem by dannoura
