Web scraping linkedin.com

alexgrimmy has asked for the wisdom of the Perl Monks concerning the following question:

I didn't find it easy to export my contact information from LinkedIn, so I started to look at Mojo::UserAgent for ways to "scrape" my contact info off LinkedIn. To put it mildly, I'm failing miserably. I can get the transactor to pull the default page, but when I tried to post my login information I quickly found I have no idea why what I see returned is different from the "view source" in a browser. Any advice is greatly appreciated.

Re: Web scraping linkedin.com by Your Mother (Archbishop) on May 08, 2015 at 02:19 UTC
(LinkedIn) User Agreement 8.2. Don'ts. You agree that you will not: …Use manual or automated software, devices, scripts robots, other means or processes to access, “scrape,” “crawl” or “spider” the Services or any related data or information;…
I am guessing your actual problem lies with the JavaScript handling of their sessions, in which case you would need a JS-aware agent like WWW::Mechanize::Firefox or WWW::Selenium, but it could be as simple as changing the name of the UserAgent so it’s not a flagged bot/agent name. Still, you’re not supposed to do this here, and I personally wouldn’t help you because it can cast Perl and its fans in a bad light as poor Netizens.
Re: Web scraping linkedin.com by Albannach (Monsignor) on May 08, 2015 at 05:03 UTC
Just use their export connections feature at https://www.linkedin.com/people/connections (select all, then choose export in the lower right).
-- I'd like to be able to assign to an luser
Re: Web scraping linkedin.com by Gangabass (Vicar) on May 08, 2015 at 03:47 UTC
You can try WWW::Mechanize::PhantomJS.
