Thank you, zentara, it is working.
It took me a while to download and install the modules (this is my first time installing modules :) using the Software Center on Ubuntu 12...
The remaining work is about designing the output and making the lines readable...
The link here was for US; I also have to collect data from these sublinks:
NASDAQ: http://finance.yahoo.com/actives?e=o
AMEX: http://finance.yahoo.com/actives?e=aq
NYSE: http://finance.yahoo.com/actives?e=nq
Is it possible to copy and paste this script (the one you wrote for me) below itself and change the URL to another one (one of the three links above)? I'm planning to take data from four URLs in one script; is that possible?
"Is it possible to make copy-paste this script (what you wrote for me) under the script and change the url to another url (one of above three links)? In one script I'm planning take data from four urls, is it possible?"
Sure, it should be as simple as putting it all in a loop. Just put your URLs into single-quoted strings, separated by commas, as shown below.
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Request::Common qw(GET);
use HTML::TokeParser::Simple;
my $ua = LWP::UserAgent->new;
# Define user agent type
$ua->agent('MyApp/0.1 ');
my @requests = (
'http://finance.yahoo.com/actives?e=us',
'http://finance.yahoo.com/actives?e=o',
'http://finance.yahoo.com/actives?e=aq',
'http://finance.yahoo.com/actives?e=nq',
);
# loop through them
foreach my $requested ( @requests ) {
    print "STARTING $requested ###########################\n\n\n\n\n";
    # Request object
    my $req = GET $requested;
    # Make the request
    my $res = $ua->request($req);
    # Skip this URL if the request failed
    unless ( $res->is_success ) {
        warn "Could not fetch $requested: ", $res->status_line, "\n";
        next;
    }
    my $con = $res->content;
    #print "$con\n";
    my $p = HTML::TokeParser::Simple->new( \$con );
    while ( my $token = $p->get_token ) {
        # This prints all text in an HTML doc (i.e., it strips the HTML)
        next unless $token->is_text;
        print $token->as_is, "\n";
    }
    print "ENDING $requested ###########################\n\n\n\n\n\n";
} # end of loop
exit 0;
Yes, this is what I need... thank you, zentara, all the code is working... now I'm working on the remaining regexp parts.
Finally, I (and zentara) finished scripting this project. I have pasted the code below.
There is one problem: how can I also add links to the NYSE.csv file? The links (Chart, Profile, More) of the first ten lines must be added. At the moment the text (Chart, Profile and More) has no links, and in the NYSE.csv file this text must carry its links for the first ten rows. Any help will move this project forward. Thanks.
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Request::Common qw(GET);
use HTML::TokeParser::Simple;
my $ua = LWP::UserAgent->new;
# Define user agent type
$ua->agent('MyApp/0.1 ');
my @requests = (
'http://finance.yahoo.com/actives?e=us',
'http://finance.yahoo.com/actives?e=o',
'http://finance.yahoo.com/actives?e=aq',
'http://finance.yahoo.com/actives?e=nq',
);
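my @lin = "";    # $lin[0] is an empty placeholder, so the matches start at $lin[1]
my $file = "trash";    # temporary file, deleted at the end
open (FH, ">$file") || die "Can't open $file for writing: $!\n";
select(FH);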
# loop through them
foreach my $requested ( @requests ) {
    print "STARTING $requested ###########################\n\n\n\n\n";
    # Request object
    my $req = GET $requested;
    # Make the request
    my $res = $ua->request($req);
    my $con = $res->content;
    #print "$con\n";
    my $p = HTML::TokeParser::Simple->new( \$con );
    while ( my $token = $p->get_token ) {
        # This prints all text in an HTML doc (i.e., it strips the HTML)
        next unless $token->is_text;
        #push(@lin, $token->as_is, "|");
        print $token->as_is, ";";
    }
    print "ENDING $requested ###########################\n\n\n\n\n\n";
} # end of loop
close (FH);
select (STDOUT);
open (FG, "<$file") || die "Can't open $file for reading: $!\n";
while (<FG>) {
    push(@lin, $2) if $_ =~ /(Volume Leaders;US;NASDAQ;AMEX;NYSE;)(Symbol;Name.*)/g;
}
close(FG);
foreach my $vf (@lin) {
    # Collapse empty fields, then switch the separator from ';' to '|'
    $vf =~ s/; ;/;/g;
    $vf =~ s/;/|/g;
    # print $vf, "\n";
    # Put the header row and each "Chart, Profile, More" row on its own line
    $vf =~ s/Symbol\|Name\|Last Trade\|Change\|Volume\|Related Info\|/SYMBOL\|NAME\|LAST TRADE\|CHANGE\|VOLUME\|RELATED INFO\n/g;
    $vf =~ s/Chart\|, \|Profile\|, \|More\|/Chart, Profile, More\n/g;
    # Rejoin fields that were split in the wrong place
    $vf =~ s/(\&)\|/\&/g;
    $vf =~ s/\| \(/ \(/g;
    $vf =~ s/\|(\d|\d\d|\d\d\d)\.(\d|\d\d|\d\d\d)\|(\d|\d\d|\d\d\d)\:(\d|\d\d|\d\d\d)/\|$1\.$2 $3\:$4/g;
}
my $date = localtime;
$date =~ s/ /_/g;
my $us = "US-$date.csv";
my $nasdaq = "NASDAQ-$date.csv";
my $amex = "AMEX-$date.csv";
my $nyse = "NYSE-$date.csv";
open (US, ">$us") || die "US.csv: $!\n";
open (NASDAQ, ">$nasdaq") || die "NASDAQ.csv: $!\n";
open (AMEX, ">$amex") || die "AMEX.csv: $!\n";
open (NYSE, ">$nyse") || die "NYSE.csv: $!\n";
print US $lin[1];
print NASDAQ $lin[2];
print AMEX $lin[3];
print NYSE $lin[4];
close (US);
close (NASDAQ);
close (AMEX);
close (NYSE);
unlink $file; # delete a file 'trash'
exit 0;
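On the question of keeping the links for Chart, Profile and More: the script above only looks at text tokens, so the href values are thrown away. One possible approach is to remember the href of the enclosing <a> tag while walking the tokens. The following is only a rough sketch: the HTML snippet, the sub name labelled_links and the base URL argument are made up for illustration, and the real Yahoo page markup may well differ.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TokeParser::Simple;

# Hypothetical markup standing in for one row of the page;
# the real page layout is an assumption and may differ.
my $html = <<'HTML';
<tr><td><a href="/q?s=BAC">BAC</a></td>
<td><a href="/q/bc?s=BAC">Chart</a>, <a href="/q/pr?s=BAC">Profile</a>, <a href="/q/more?s=BAC">More</a></td></tr>
HTML

# Link labels whose URLs should be kept
my %wanted = map { $_ => 1 } qw(Chart Profile More);

# Return "Label <url>" strings for every wanted link in the HTML.
# $html_ref is a reference to the HTML string, $base is prepended to each href.
sub labelled_links {
    my ( $html_ref, $base ) = @_;
    my $p = HTML::TokeParser::Simple->new($html_ref);
    my @links;
    my $href;    # href of the <a> we are currently inside, if any
    while ( my $token = $p->get_token ) {
        if ( $token->is_start_tag('a') ) {
            $href = $token->get_attr('href');
        }
        elsif ( $token->is_end_tag('a') ) {
            undef $href;
        }
        elsif ( $token->is_text && defined $href ) {
            my $text = $token->as_is;
            push @links, "$text <$base$href>" if $wanted{$text};
        }
    }
    return @links;
}

print join( ', ', labelled_links( \$html, 'http://finance.yahoo.com' ) ), "\n";
```

To limit this to the first ten rows of the NYSE page, the same loop could count the rows it has seen and stop collecting after ten; where exactly a row starts depends on the real markup, so that part is left out here.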