Thanks for the guidance. I do have some code, but I didn't figure it would help to show it without the file it's trying to extract data from. I'll post it here anyway.

while(<STDIN>) {
    @section = split /Class: Invoice/, $_;
    @AdminData = split /\n/, $section[0];
    @BodyTemp = split /Administrative Data:/, $_;
    @Body = split /Reply: click here/, $BodyTemp[0];
    @Splitterhold = split/Payment Detail - Payment ID /, $_;
    foreach $Splitterhold(@Splitterhold) {
        $Splitterhold =~ s/InvoiceDate /Invoice Dateĉ /g;
        $Splitterhold =~ s/Customer ID /CustomerIDĉ /g;
        $Splitterhold =~ s/^Phone /Phoneĉ /g;
        $Splitterhold =~ s/Txn Type Post Day Amount \(USD\)\n/InvoiceDateĉ /g;
        $Splitterhold =~ s/Card Type Card Number Exp Date BIN\n/CreditCardĉ /g;
        $Splitterhold =~ s/Name /Nameĉ /g;
        $Splitterhold =~ s/Address Line 1 /Addressĉ /g;
        $Splitterhold =~ s/City /Cityĉ /g;
        $Splitterhold =~ s/State /Stateĉ /g;
        $Splitterhold =~ s/Email Address /EmailAddressĉ /g;
        $Splitterhold =~ s/Home phone number /Homephonenumberĉ /g;
        $Splitterhold =~ s/Last modified on /Lastmodifiedonĉ /g;
    }
    #@sector = split /Payment Detail -/, $section[1], /administration>/;
    if ($#Splitterhold > 0) {
        for ($x = 0; $x < $#Splitterhold; $x++) {
            @Split = split/\n/, $Splitterhold[$x];
            @parse = split /ĉ/, @Split;
            if ($#parse > 0) {
                $parse[0] =~ s/\W//g;
                $parse[1] =~ s/\-//g;
                @AO{$parse[0]} = $parse[1];
            }
            if ($#parsezero > 0) {
                $parsezero[1]=~ s/\-//g;
                $IV{$parsezero[0]} = $parsezero[1];
                @IVone = push (@IV, @IV);
                print $IV;
            }
        }
    }
    $Body[1] =~ s/^one$/1/gi;
    $Body[1] =~ s/^two$/2/gi;
    $Body[1] =~ s/^three$/3/gi;
    $Body[1] =~ s/^four$/4/gi;
    $Body[1] =~ s/^five$/5/gi;
    $Body[1] =~ s/^six$/6/gi;
    $Body[1] =~ s/^seven$/7/gi;
    $Body[1] =~ s/^eight$/8/gi;
    $Body[1] =~ s/^nine$/9/gi;
    $Body[1] =~ s/^zero$/0/gi;
    $Body[1] =~ s/0ne/1/gi;
    @PostingBody = split/\n/, $Body[1];
    for ($x = 0; $x < $#PostingBody; $x++) {
        $PostingBody[$x] =~ s/\s//gi;
        $PostingBody[$x] =~ s/\W//gi;
        $MC = NULL;
        if ($PostingBody[$x] =~ m/\d{3}.*\d{3}.*\d{4}/) {
            $PostingBody[$x] =~ s/\D//gi;
            $PostingBody[$x] =~ s/\W//g;
            $MC{'Digits'} = $PostingBody[$x];
        }
    }
    @elements=('Digits');
    for($x=0; $x< @elements; $x++) {
        print ($MC{$elements[$x]}."\t\t");
        $MC = "";
    }
    @elements=("PostID","Location","posted","Reply","Postersage","Partner",
               "AdType","PaidAd","AdPrice","Whitelisted","Name","Phone","Email","UserCreated","Settings",
               "Referrer","IP","AdCreated");
    for($x=0; $x< @elements; $x++) {
        print(@AO{$elements[$x]}."\t");
        $AO = "";
    }
    @elements=("Lastmodifiedon", "InvoiceDate", "CreditCard", "Name",
               "Address", "City", "State", "EmailAddress", "Homephonenumber", "CustomerID");
    for($x=0; $x< @elements; $x++) {
        print (@IVone{$elements[$x]}."\t");
        $IV = "";
    }
    {
        print "\n";
    }
}

In reply to Re^2: Extracting data from a PDF to a spreadsheet by Anonymous Monk
in thread Extracting data from a PDF to a spreadsheet by NonProgrammer
