I'm trying to extract some lines from a .doc file into an Excel sheet.

The format of the document is:

1. Hi,perl monks.Have a good day.
2. Checking whether the output is coming.Perlmonks.org
3. HTML formatting is good. Use while formatting.

I'm splitting the document into text and extracting the lines. The output currently appears in the Excel sheet like this:

First cell: Hi,perl monks.
Second cell: Have a good day
Third cell: Checking whether the output is coming.
Fourth cell: Perlmonks.org
Fifth cell: HTML formatting is good.
Sixth cell: Use while formatting.

But I need the output to look like this instead, i.e. each numbered point should be extracted as a whole into a single cell:

First cell: Hi,perl monks. Have a good day
Second cell: Checking whether the output is coming.perlmonks.org
Third cell: HTML formatting is good.Use while formatting.

Here is my code:

    # Excerpt: $filename, $Sheet, $row, $col, $pattern and $s are set up elsewhere in the script.
    use Win32::OLE;
    use Win32::OLE::Enum;

    @files = glob('*.doc');
    foreach my $file (@files) {
        $i = 0;
        $j = 0;
        my $var = $filename . $file;
        print $var;
        my $document = Win32::OLE->GetObject($var);
        print "Extracting Text ...\n";
        my @array;
        my $paragraphs = $document->Paragraphs();
        my $enumerate  = Win32::OLE::Enum->new($paragraphs);
        while (my $paragraph = $enumerate->Next()) {
            my $text = $paragraph->{Range}->{Text};
            $text =~ s/[\n\r\t]//g;
            $text =~ s/\x0B/\n/g;
            $text =~ s/\x07//g;
            chomp $text;
            my $Data .= $text;
            @array = split(/\.$/, $Data);
            foreach my $line (@array) {
                # Start of a new document section: record the file name.
                if ($line =~ m/^Document/si) {
                    $i = 1;
                    $j = 0;
                    $Sheet->Cells($row, $col - 1)->{'Value'} = $file;
                }
                if ($i == 1) {
                    $j = $j + 1;
                }
                if ($line =~ m/$pattern/) {
                    $s = 0;
                }
                # Write each extracted line to its own row.
                if ($j > 1 && $s != 0) {
                    $Sheet->Cells($row, $col + 6)->{'Value'} = $line;
                    $row = $row + 1;
                }
            }
        }
    }

Please help me out, monks.
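One possible way to get one cell per point is to collect paragraphs until the next numbered marker appears, instead of writing every paragraph to its own row. The sketch below is a minimal illustration, not a tested fix for the script above. It assumes that the "1.", "2.", "3." markers are literal text at the start of a paragraph (Word's automatic list numbering is not part of the Range text, in which case this will not match), and `group_points` is a hypothetical helper name.

```perl
use strict;
use warnings;

# Hypothetical helper: group paragraph texts into numbered points.
# Assumption: each point starts with a literal "N." marker in the text.
sub group_points {
    my @paragraphs = @_;    # copy, so the caller's data is not modified
    my @points;
    for my $text (@paragraphs) {
        $text =~ s/[\r\n\t\x07]//g;    # strip control characters from the paragraph text
        next unless length $text;
        if ($text =~ s/^\s*\d+\.\s*//) {
            push @points, $text;       # a numbered marker starts a new point
        }
        elsif (@points) {
            # Anything else continues the current point.
            $points[-1] .= length($points[-1]) ? " $text" : $text;
        }
    }
    return @points;
}

# Sample paragraphs modelled on the document in the question.
my @cells = group_points(
    "1. Hi,perl monks.\r",
    "Have a good day.\r",
    "2. Checking whether the output is coming.\r",
    "Perlmonks.org\r",
    "3. HTML formatting is good.\r",
    "Use while formatting.\r",
);
print "$_\n" for @cells;
```

In the original loop, the paragraph texts would be pushed onto an array inside the `while`, and each element returned by `group_points` would then be written with `$Sheet->Cells($row, $col + 6)->{'Value'}` after the loop finishes. A sentence that itself begins with a number and a period (for example "3.5 is ...") would be mistaken for a new point, so the marker pattern may need tightening for real documents.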


In reply to Extract lines from document to excel by rajkrishna89
