I tried using your code with a couple of changes, primarily substituting in the regex I had for my sentences. The data I'm using has a ton of punctuation, such as ... and "" and so on, so the regex in your example wasn't what I needed and was cutting out a lot of the data. However, when I put my regex in, I got the same issue I had before: it prints the first paragraph with the sentence brackets around the sentences, but then prints the same paragraph again, this time with the paragraph brackets around the whole paragraph...
Just so you can have an idea of what I changed, I've included the code below, with an excerpt of the text file I'm using.
local $/ = "";
open $fh, $ARGV[0] or die "File $ARGV[0] not found!\n";
$scount = 0;
$pcount=0;
@paragraphs;
while ($paragraph = <$fh>){
    @sentences;
    while($paragraph =~ /\s*(([A-Z][A-Za-z]*)(((([A-Za-z]|[0-9])*((\'*|\-*)[A-Za-z]*))\s*(\.{3})*\!*\"*\(*\)*\,*\:*\s*)*(([A-Za-z]|[0-9])*))(\.|\?|\!))/g){
        push @sentences, "<s>$1</s>";
        $scount++;
    }
    push @paragraphs, "<p>\n\t" . join("\n\t", @sentences) . "\n</p>\n";
    $pcount++;
}
print for @paragraphs;
print "\n Total Lines: $scount\n";
print "\n Total Paragraphs: $pcount\n";
Data:
But the truth is that in the short run, markets can occasionally be pushed, especially when so many decisions to buy or sell are keyed off what everyone else in the market is doing. Chain reactions are not much harder to start (in fact, given how quickly price moves get noticed, they may be easier) than they were 70 years ago.
All that notwithstanding, the interesting thing about the Greenspan resignation rumor was that it raised an obvious question: Would it really matter? As Jacob Weisberg just pointed out in " Ballot Box," Steve Forbes is apparently the only American who doesn't think Greenspan has done a terrific job as Fed chairman. And most of us would be happy to have Greenspan stay in office even after his current term expires in the middle of next year. But it's interesting to note that in the past couple of months there have been more than a few voices--including those of economists Greg Mankiw and Robert Barr--suggesting that Greenspan has been more the beneficiary of good economic fundamentals than the creator of them.
That position may be a bit overstated, particularly since Greenspan has shown an unusual ability to let his thinking on inflation, productivity, and the economy's possible growth rate evolve in response to changing data. But the essential point, that the soundness of this economy does not depend on Greenspan's presence at the head of the Fed, is right. That might not be the case if Greenspan's successor were either an inflation dove like William Greider or a perma-bear like Jim Grant. But whoever would succeed Greenspan would be nothing of the sort. He or she would be, in a word, Greenspanian, still concerned about the possibility of an overheating economy but also convinced that important technological changes have allowed this economy to grow faster than in the past without sparking inflation.
If anything, in fact, the bond market should have rallied on news that Greenspan might be stepping down, since he has long since stopped being paranoid enough for bondholders, who seem perpetually convinced that the United States is about to become Brazil. There are certainly Fed governors out there who would be far more likely to raise interest rates aggressively at the first hint of price pressures than Greenspan.
In reply to Re^4: How to match regex over multiline file
by kyaloupe
in thread How to match regex over multiline file
by kyaloupe