I ran into a strange case when I tried to download the web page "http://securities.stanford.edu/1008/UTIIQ96/" using LWP::Simple::get() and save $content to a text file.
The strange thing is that if I open the text file in an editor (e.g., UltraEdit), it looks perfectly normal:
<HTML><HEAD><TITLE>Unitech Industries, Inc. - Securities Class Action</TITLE>
However, if I use "print $content" during the download, the log shows something different:
< H T M L > < H E A D > < T I T L E > U n i t e c h I n d u s t r i e s , I n c . - S e c u r i t i e s C l a s s A c t i o n < / T I T L E >
It looks as if a space has been added after every character.
When I try to use a regex to extract information, the space issue keeps haunting me, because Perl always reads the text file as if it had the extra spaces!
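One guess, which I have not confirmed for this page, is that the content is UTF-16 encoded, so the "spaces" are really NUL bytes that the editor hides. Below is a minimal sketch of how that could be checked. It uses a made-up UTF-16LE sample in place of the real $content, and the names $nul_count and $text are only for illustration:

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical stand-in for the downloaded $content: a title tag encoded
# as UTF-16LE with a byte order mark (an assumption, not the real page).
my $content = "\xFF\xFE"
    . encode('UTF-16LE', '<TITLE>Unitech Industries, Inc.</TITLE>');

# Count NUL bytes; plain ASCII or Latin-1 HTML should contain none.
my $nul_count = () = $content =~ /\x00/g;
print "NUL bytes: $nul_count\n";

# With a BOM present, 'UTF-16' lets Encode work out the byte order.
my $text = $nul_count ? decode('UTF-16', $content) : $content;

# After decoding, an ordinary regex matches without the extra "spaces".
if ($text =~ m{<TITLE>(.*?)</TITLE>}i) {
    print "Title: $1\n";
}
```

If the NUL count for the real $content comes out as zero, this guess is wrong and the cause lies elsewhere.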
I would appreciate it if someone could give me a hint about the cause of the problem and a solution.
Thank you!
In reply to LWP problem by coltman