merrymonk has asked for the wisdom of the Perl Monks concerning the following question:

Last week I asked about reading a web page and working on it. Kindly Monks suggested I used either WWW::Mechanize and/or Web::Scraper.
I could not easily try either of these since my Perl distribution did not include them. However, LWP::Simple was available and I tried this. This let me do what I want of:
1. Reading a web page
2. Storing it as an html file
3. Opening the html file in Excel
4. Using Perl Excel to read and store the figures from the web page that I needed.

It would have been useful to store the spreadsheet as a spreadsheet so I used the line (see trial code below) $workbook -> SaveAs ($spsh_file);. The file name did include .xlsx as the extension.
This did store the spreadsheet. However, when I tried to open it within Excel I was told that the file extension or format was not valid.
If I do not have a file extension, it is stored as with .htm as the extension.

Is there a way that you can choose what format will be used when storing an open Excel file?
For example, in some circumstances it would be good to store the spreadsheet as a comma delimited file.
In yet other circumstances a pdf file would be good.
use strict; use OLE; use Win32::OLE::Const "Microsoft Excel"; use LWP::Simple; my ($catalog, $url, $url_file, $spsh_file); my ($excel, $workbook, $sheet); $url = "http://www.oreilly.com/catalog"; $url_file = "C:\\oreilly_file.html"; $spsh_file = "C:\\oreilly_spreadsheet.xlsx"; getstore($url, $url_file); $excel = CreateObject OLE "Excel.Application"; $excel -> {Visible} = 1; #___ OPEN EXISTING WORKBOOK $excel->{DisplayAlerts}=0; $workbook = $excel -> Workbooks -> Open("$url_file"); $sheet = $workbook -> Worksheets(1) -> {Name}; $sheet = $workbook -> Worksheets($sheet); $sheet -> Activate; $workbook -> SaveAs ($spsh_file); $excel -> Quit;

Replies are listed 'Best First'.
Re: Reading and storing a web page as a spreadsheet query
by planetscape (Chancellor) on Sep 17, 2010 at 08:19 UTC
Re: Reading and storing a web page as a spreadsheet query
by moritz (Cardinal) on Sep 17, 2010 at 09:39 UTC
    Using Perl Excel to read and store the figures from the web page that I needed. It would have been useful to store the spreadsheet as a spreadsheet so I used the line (see trial code below) $workbook -> SaveAs ($spsh_file);. The file name did include .xlsx as the extension. This did store the spreadsheet. However, when I tried to open it within Excel I was told that the file extension or format was not valid.

    What exactly is "Perl Excel"?

    If you're referring to the Spreadsheet::WriteExcel module (if so, why not call it by name?), it writes old-fashioned excel files, so give them a .xls extension, not .xlsx.

    Perl 6 - links to (nearly) everything that is Perl 6.
      By Perl Excel I meant just the functions you get with when you 'say' (see the first lines of the code)
      use Win32::OLE::Const "Microsoft Excel";
      I am sorry if this obscured what I was asking.
Re: Reading and storing a web page as a spreadsheet query
by dasgar (Priest) on Sep 17, 2010 at 13:48 UTC

    merrymonk, what version of Excel are you using and which version of Excel file are you wanting to save it as? Based on the error message that you described, it sounds like you saved the Excel file in one version format and when you manually opened the file, Excel was expected a different version format.

    • If you're using Excel 2003:
      • You can only save the file as Excel 2003 format, which means you need to use .xls file extension.
    • If you're using Excel 2007 and want to save it in Excel 2003 format:
      • You need to use the .xls file extension and need to add FileFormat parameter to the SaveAs method.
      • Example: $workbook->SaveAs(Filename => "C:\\test.xls", Fileformat => xlExcel8);
    • If you're using Excel 2007 and want to save in Excel 2007 format:
      • You need to use the .xlsx file extension.
      • I forgot the full reasons behind it and it might not apply to your situation, but one of my scripts has an additional parameter for the SaveAs method in this scenario.
        Example: $workbook->SaveAs(Filename => "C:\\test.xls", ConflictResolution => xlLocalSessionChanges);

    Hopefully this information will help point you in the right direction in debugging the problem.

      That was just the sort of thing I was looking for! Many thanks.
      Do you happen to know of a link that gives more details?
      I will of course be 'googling' for some more about this.