I don't see any "missing text". What do you mean by that?

Perhaps what the regex is doing is surprising you?

To debug something like this, capture and print the other terms in the regex.

#!/usr/bin/perl -w
use strict;

while (<DATA>) {
    if ($_ =~ /(<.+>)(<.+>)(.*)/) {
        print "$1\n\n";
        print "$2\n\n";
        print "$3\n\n";
    }
}
$1 is:
<TITLE><![CDATA[<p>Dogs may not smarter than 6-year-olds, but researchers suggest canines might be on par with 2-year-olds.< Psychologist Stanley Coren says, "We do know that dogs understand far more than we credit them with, from about 165 words to 250 words." Even better than understanding our words, dogs know our hand gestures and body postures. Dogs may, in fact, far exceed 2-year-olds when it comes to reading emotions.<BODY>

$2 is:
<![CDATA[<p>Developmentally, 2-year-olds are generally more interested in themselves, while dogs do care how their people feel, and instantly recognize a change in emotion.< "While your dog can't comprehend that you just received a traffic violation, he can tell that you're upset the second you walk through the door," Coren says. "In fact, dogs can detect some subtle changes which even adults can't," adds Coren. "We can't smell cancer or predict seizures, as dogs can."< When I posted this story on my Facebook Fan page recently (<a href=" http://www.new.facebook.com/pages/ Steve-Dale/50057343596?ref=ts">

$3 is:
www.new.facebook.com/pages/Steve-Dale/50057343596?ref=ts, or simply type Steve Dale into the Facebook search), I received some interesting responses:< Kelle: "Heck, my Italian Greyhound is smarter than most college students."< Karen: "Depends on how you define smart.
What I called $3 is what you called $1. Remember that regex quantifiers are "greedy" by default, meaning that an expression matches the longest possible string that still allows the rest of the regex to match. So each of these <.+> terms matches as much stuff as possible between angle brackets. The second term extends to the last > on the line (update: while still allowing the first term to match), the first term gets all the angle-bracket stuff preceding that, and the third term, (.*), gets whatever is left after the second term.
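The same behaviour is easier to see on a short string. This is only a sketch: the sample line and the greedy_split helper are made up for illustration and are not part of your DATA.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical short sample, standing in for one line of the real DATA.
my $line = '<A>x<B><C>y<D>tail';

# Apply the greedy regex and return its three captures,
# or an empty list if the line does not match.
sub greedy_split {
    my ($s) = @_;
    return $s =~ /(<.+>)(<.+>)(.*)/ ? ($1, $2, $3) : ();
}

my @parts = greedy_split($line);
for my $i (0 .. $#parts) {
    printf "\$%d is: %s\n", $i + 1, $parts[$i];
}
# $1 is: <A>x<B>
# $2 is: <C>y<D>
# $3 is: tail
```

The first term backs off only as far as the last place where a > is directly followed by a <, and the second term then runs to the last > on the line.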

Update: Try:

#!/usr/bin/perl -w
use strict;

while (<DATA>) {
    if ($_ =~ /(<.+>)(.*)/) {
        print "$1\n\n";
        print "$2\n\n";
    }
}

You are still going to get the same result for the (.*) term, which is what I called $3 above.

Basically every char of text is "accounted for", nothing is "missing". We know what you called $1 matches. What are you trying to match?

Another Update with a minimal match example:

The regex below uses the ? modifier on the quantifier to say: match the shortest thing possible between the angle brackets. Here those are the first two angle-bracket things in your DATA. $3 would be everything else following.

#!/usr/bin/perl -w
use strict;

while (<DATA>) {
    if ($_ =~ /(<.+?>)(<.+?>)(.*)/) {
        print "$1\n\n";    # prints <TITLE>
        print "$2\n\n";    # prints <![CDATA[<p>
    }
}
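
To compare the two forms side by side, here is a small sketch that runs both regexes over the same string. The sample string and the captures helper are assumptions for illustration only, not your DATA.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample line, not the original DATA.
my $sample = '<A><B>text<C>tail';

# Return the list of captures for a compiled regex,
# or an empty list when the string does not match.
sub captures {
    my ($re, $s) = @_;
    my @c = $s =~ $re;
    return @c;
}

my @patterns = (
    [ 'greedy',  qr/(<.+>)(<.+>)(.*)/   ],
    [ 'minimal', qr/(<.+?>)(<.+?>)(.*)/ ],
);

for my $pair (@patterns) {
    my ($name, $re) = @$pair;
    print "$name: ", join(' | ', captures($re, $sample)), "\n";
}
# greedy: <A> | <B>text<C> | tail
# minimal: <A> | <B> | text<C>tail
```

Note that minimal matching is still not "one tag at a time": if something sits between the first two tags, the first term will stretch past it until the second term can start on a <.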

In reply to Re: Some portion of the text missing by Marshall
in thread Some portion of the text missing by Anonymous Monk
