Hello guys,

I am trying to use Perl to extract paragraphs from a text. However, the code does not produce the results I expect. Here is the code I wrote:

my $string = <<'TEXT';
Assembly and Manufacturing The Company's assembly and manufacturing operations include PCB assembly and the manufacture of subsystems and complete products. Its PCB assembly activities primarily consist of the placement and attachment of electronic and mechanical components on printed circuit boards using both SMT and traditional pin-through-hole ("PTH") technology. The Company also assembles subsystems and systems incorporating PCBs and complex electromechanical components, and, increasingly, manufactures and packages final products for shipment directly to the customer or its distribution channels. The Company employs just-in-time, ship-to-stock and ship-to-line programs, continuous flow manufacturing, demand flow processes and statistical process control. The Company has expanded the number of production lines for finished product assembly, burn-in and test to meet growing demand and increased customer requirements. In addition, the Company has invested in FICO, a producer of injection molded plastic for Asia electronics companies with facilities in Shenzhen, China. As OEMs seek to provide greater functionality in smaller products, they increasingly require advanced manufacturing technologies and processes. Most of the Company's PCB assembly involves the use of SMT, which is the leading electronics assembly technique for more sophisticated products. SMT is a computer-automated process which permits attachment of components directly on both sides of a PCB. As a result, it allows higher integration of electronic components, offering smaller size, lower cost and higher reliability than traditional manufacturing processes. By allowing increasingly complex circuits to be packaged with the components placed in closer proximity to each other, SMT greatly enhances circuit processing speed, and therefore board and system performance.
The Company also provides traditional PTH electronics assembly using PCBs and leaded components for lower cost products.;
TEXT

local $/ = "";
open my ($str_fh), '<', \$string;
while ( <$str_fh> ) {
    print "New Paragraph: $_\n", "*" x 40, "\n";
}
close $str_fh;

The text is part of the company's annual report, which is available at https://www.sec.gov/Archives/edgar/data/32272/0000950147-97-000151.txt.

I expected the code to return the paragraphs one at a time; instead, I got the whole text back as a single block. I am quite confused by this result.

Moreover, is it still possible to get the paragraphs separately even if the current "blank" lines do not count as paragraph separators? Could anyone help me with this issue?
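
One possible explanation, offered only as a guess: paragraph mode ($/ = "") splits records on lines that are completely empty, so a "blank" line that still holds spaces, tabs, or a carriage return would not act as a separator. The sketch below assumes that is what is happening. The sample text and the names $clean and @paragraphs are made up for illustration; it empties out whitespace-only lines first and then reads in paragraph mode as before.

```perl
use strict;
use warnings;

# Hypothetical sample text: the "blank" separator lines hold spaces or a tab.
# This is an assumption about what the real input looks like.
my $string = join "\n",
    'First paragraph, line one.',
    'First paragraph, line two.',
    '   ',
    'Second paragraph.',
    "\t",
    'Third paragraph.',
    '';

# Empty out lines that contain only spaces, tabs, or carriage returns,
# so that paragraph mode can recognize them as separators.
(my $clean = $string) =~ s/^[ \t\r]+$//mg;

my @paragraphs;
{
    local $/ = "";    # paragraph mode: records end at one or more empty lines
    open my $fh, '<', \$clean or die "Cannot open in-memory file: $!";
    while ( my $para = <$fh> ) {
        $para =~ s/\n+\z//;    # drop the trailing newlines kept by paragraph mode
        push @paragraphs, $para;
    }
    close $fh;
}

print "New Paragraph: $_\n", "*" x 40, "\n" for @paragraphs;
```

If the separator lines in the real file turn out to be genuinely empty, the cause lies elsewhere and this cleanup step would change nothing.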

Thanks so much!!! Best Regards

In reply to Extract Paragraph From Text by perlbeginneraaa
