perlbeginneraaa has asked for the wisdom of the Perl Monks concerning the following question:
I am trying to use Perl to extract paragraphs from a text. However, the code does not generate the results I expect. Here is the code I wrote:
my $string = <<'TEXT';
Assembly and Manufacturing

The Company's assembly and manufacturing operations include PCB assembly
and the manufacture of subsystems and complete products. Its PCB assembly
activities primarily consist of the placement and attachment of electronic
and mechanical components on printed circuit boards using both SMT and
traditional pin-through-hole ("PTH") technology. The Company also assembles
subsystems and systems incorporating PCBs and complex electromechanical
components, and, increasingly, manufactures and packages final products for
shipment directly to the customer or its distribution channels. The Company
employs just-in-time, ship-to-stock and ship-to-line programs, continuous
flow manufacturing, demand flow processes and statistical process control.
The Company has expanded the number of production lines for finished
product assembly, burn-in and test to meet growing demand and increased
customer requirements. In addition, the Company has invested in FICO, a
producer of injection molded plastic for Asia electronics companies with
facilities in Shenzhen, China.

As OEMs seek to provide greater functionality in smaller products, they
increasingly require advanced manufacturing technologies and processes.
Most of the Company's PCB assembly involves the use of SMT, which is the
leading electronics assembly technique for more sophisticated products.
SMT is a computer-automated process which permits attachment of components
directly on both sides of a PCB. As a result, it allows higher integration
of electronic components, offering smaller size, lower cost and higher
reliability than traditional manufacturing processes. By allowing
increasingly complex circuits to be packaged with the components placed in
closer proximity to each other, SMT greatly enhances circuit processing
speed, and therefore board and system performance.
The Company also provides traditional PTH electronics assembly using PCBs
and leaded components for lower cost products.;
TEXT

local $/ = "";
open my ($str_fh), '<', \$string;
while ( <$str_fh> ) {
    print "New Paragraph: $_\n","*" x 40, "\n" ;
}
close $str_fh;
The text is part of this company's annual report, which is available at https://www.sec.gov/Archives/edgar/data/32272/0000950147-97-000151.txt.
I expected the code to return the paragraphs one at a time; however, I got the whole text back as a single block. I am quite confused by this result.
Moreover, is it possible to get the paragraphs separately even if the current "blank" lines do not count as paragraph separators? Could anyone help me with this issue?
Thanks so much!!! Best Regards
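One possible explanation, offered here only as an assumption about the input: paragraph mode (`$/ = ""`) treats only truly empty lines as separators, so a "blank" line that holds a space or a tab does not end a paragraph. The sketch below shows one way to split on such lines anyway. The sample text and the helper name `split_paragraphs` are hypothetical and are not taken from the original post.

```perl
use strict;
use warnings;

# Hypothetical sample: the separator lines look blank but are not empty.
my $text = join "\n",
    'Assembly and Manufacturing',
    ' ',                            # holds a single space
    'First paragraph, line one.',
    'First paragraph, line two.',
    "\t",                           # holds a tab
    'Second paragraph.',
    '';

# Split wherever a newline is followed by a line containing only
# whitespace (or nothing at all), then trim each piece.
sub split_paragraphs {
    my ($input) = @_;
    my @parts = split /\n\s*\n/, $input;
    s/\A\s+|\s+\z//g for @parts;    # strip leading/trailing whitespace
    return grep { length } @parts;  # drop any empty pieces
}

my @paragraphs = split_paragraphs($text);
print "New Paragraph: $_\n", '*' x 40, "\n" for @paragraphs;
```

An alternative under the same assumption would be to empty the whitespace-only lines first, for example with `s/^\h+$//mg`, and then keep reading in paragraph mode as in the posted code. If the separator lines turn out to be genuinely empty, neither step should be needed.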
Re: Extract Paragraph From Text
by kcott (Archbishop) on Sep 08, 2015 at 07:20 UTC
by perlbeginneraaa (Novice) on Sep 08, 2015 at 15:18 UTC
by 1nickt (Canon) on Sep 08, 2015 at 15:28 UTC
by perlbeginneraaa (Novice) on Sep 08, 2015 at 15:37 UTC
by 1nickt (Canon) on Sep 08, 2015 at 15:50 UTC
by AnomalousMonk (Archbishop) on Sep 08, 2015 at 15:51 UTC

Re: Extract Paragraph From Text
by shadowsong (Pilgrim) on Sep 08, 2015 at 08:35 UTC
by perlbeginneraaa (Novice) on Sep 08, 2015 at 15:27 UTC

Re: Extract Paragraph From Text
by 2teez (Vicar) on Sep 08, 2015 at 07:31 UTC
by perlbeginneraaa (Novice) on Sep 08, 2015 at 15:28 UTC

Re: Extract Paragraph From Text
by CountZero (Bishop) on Sep 08, 2015 at 21:17 UTC

Re: Extract Paragraph From Text
by locked_user sundialsvc4 (Abbot) on Sep 08, 2015 at 12:22 UTC
by perlbeginneraaa (Novice) on Sep 08, 2015 at 15:31 UTC
by locked_user sundialsvc4 (Abbot) on Sep 09, 2015 at 22:11 UTC