Last week I asked about reading a web page and working on it. Kindly Monks suggested I use WWW::Mechanize and/or Web::Scraper.
I could not easily try either of these since my Perl distribution did not include them. However, LWP::Simple was available, so I tried that. It let me do what I wanted:
1. Reading a web page
2. Storing it as an HTML file
3. Opening the HTML file in Excel
4. Using Perl to drive Excel and to read and store the figures I needed from the web page.
It would have been useful to store the result as a real spreadsheet, so I used the line $workbook -> SaveAs ($spsh_file); (see trial code below). The file name did include .xlsx as the extension, and the file was written. However, when I tried to open it in Excel I was told that the file extension or format was not valid. If I give no file extension, the file is stored with .htm as the extension.
Is there a way to choose the format used when saving an open Excel workbook? For example, in some circumstances it would be good to store the spreadsheet as a comma-delimited file. In other circumstances a PDF file would be good.
use strict;
use warnings;
use Win32::OLE;
use Win32::OLE::Const "Microsoft Excel";
use LWP::Simple;
my ($url, $url_file, $spsh_file);
my ($excel, $workbook, $sheet);
$url = "http://www.oreilly.com/catalog";
$url_file = "C:\\oreilly_file.html";
$spsh_file = "C:\\oreilly_spreadsheet.xlsx";
#___ FETCH THE PAGE AND STORE IT AS AN HTML FILE
is_success(getstore($url, $url_file)) or die "Could not fetch $url";
#___ START EXCEL
$excel = Win32::OLE->new("Excel.Application") or die "Could not start Excel";
$excel -> {Visible} = 1;
$excel -> {DisplayAlerts} = 0;
#___ OPEN EXISTING WORKBOOK
$workbook = $excel -> Workbooks -> Open($url_file);
$sheet = $workbook -> Worksheets(1);
$sheet -> Activate;
#___ SAVE UNDER THE .xlsx NAME (this is the call that gives the invalid file)
$workbook -> SaveAs ($spsh_file);
$excel -> Quit;
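For what it is worth, Excel's SaveAs method takes a second argument, FileFormat, and when it is left out a workbook that was opened from HTML appears to keep its HTML format whatever extension the name carries. The sketch below is only a minimal illustration of passing that argument, not tested against every Excel version. It assumes Excel 2007 or later (needed for xlOpenXMLWorkbook and for PDF export), and the path variables $html_file and $base are hypothetical names made up for the example.

```perl
use strict;
use warnings;
use Win32::OLE;
use Win32::OLE::Const 'Microsoft Excel';   # imports xlCSV, xlOpenXMLWorkbook, xlTypePDF, ...

# Hypothetical paths, chosen only for illustration.
my $html_file = 'C:\\oreilly_file.html';
my $base      = 'C:\\oreilly_spreadsheet';

# 'Quit' is called automatically if the object is destroyed early.
my $excel = Win32::OLE->new('Excel.Application', 'Quit')
    or die "Cannot start Excel: ", Win32::OLE->LastError;
$excel->{DisplayAlerts} = 0;               # overwrite existing files without prompting

my $workbook = $excel->Workbooks->Open($html_file)
    or die "Cannot open $html_file: ", Win32::OLE->LastError;

# The second argument to SaveAs is FileFormat.
$workbook->SaveAs("$base.xlsx", xlOpenXMLWorkbook);     # real .xlsx workbook

# PDF is not a SaveAs format; it goes through ExportAsFixedFormat(Type, Filename).
$workbook->ExportAsFixedFormat(xlTypePDF, "$base.pdf");

# CSV is saved last because it holds the active sheet only.
$workbook->SaveAs("$base.csv", xlCSV);                  # comma-delimited text

$workbook->Close(0);                       # close without saving again
$excel->Quit;
```

The constants come from the Excel type library loaded by Win32::OLE::Const, so no numeric codes need to be hard-wired into the script.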