Hello, this is my first post here at perlmonks.org. I'm glad to be a part of the community.

So I'm working on building a spider in Perl. This spider is supposed to crawl my company's intranet. The goal is to grab all the links from a page and add them to an array called @foundLinks. After populating the array with those links, it will go back through them and collect the links on each of those pages in turn. The ultimate goal is to find .doc and .docx files on the intranet so that they can be converted into PDF documents. The actual conversion is going to be handled by some VBScript. The intranet spider is going to work in conjunction with another spider that will dig through physical disk paths for the same .doc and .docx files.
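To make the crawl loop concrete, here is a minimal sketch of the traversal described above. It is not the actual spider: the %fake_intranet hash, its URLs, and the links_on_page and find_word_documents subs are all made up for illustration, and a real spider would fetch each page and parse its HTML instead of looking links up in a hash.

```perl
use strict;
use warnings;

# Hypothetical stand-in for "fetch a page and extract its links".
my %fake_intranet = (
    'http://intranet/' => [
        'http://intranet/hr/',
        'http://intranet/policy.doc',
    ],
    'http://intranet/hr/' => [
        'http://intranet/',
        'http://intranet/hr/Handbook.DOCX',
        'http://intranet/hr/form.pdf',
    ],
);

sub links_on_page {
    my ($url) = @_;
    return @{ $fake_intranet{$url} || [] };
}

sub find_word_documents {
    my ($start_url) = @_;
    my @foundLinks = ($start_url);          # work queue, named as in the post
    my %seen       = ( $start_url => 1 );   # guards against crawling in circles
    my @documents;

    while ( defined( my $url = shift @foundLinks ) ) {
        # Word documents are results, not pages to crawl further.
        if ( $url =~ /\.docx?$/i ) {
            push @documents, $url;
            next;
        }
        for my $link ( links_on_page($url) ) {
            next if $seen{$link}++;         # skip links that are already queued
            push @foundLinks, $link;
        }
    }
    return @documents;
}

print "$_\n" for find_word_documents('http://intranet/');
```

The %seen hash matters because intranet pages tend to link back to each other; without it the queue would never empty.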

Now that you have some background, let's get to the actual problem that I'm having with this spider. There is a server on our intranet that requests authentication. My script has no idea what to do, so it gets a 401 (Unauthorized) error. What I'm trying to do is get the spider to authenticate so that the 401 goes away.

In an attempt to rectify the issue, I did some digging and came across some code that's supposed to import the Windows API function LogonUser through the Win32::API module.

use Win32::API;

##############################################################
# Add cleanup before using this in production
Win32::API->Import(
    'advapi32',
    'int LogonUser('
      . ' LPTSTR lpszUsername, LPTSTR lpszDomain, LPTSTR lpszPassword,'
      . ' DWORD dwLogon, DWORD dwLogonProvider, PHANDLE phToken )'
) or die "Win32::API->Import for LogonUser failed: $^E";

my $ph = ' ' x 4;    # Required, but we do not use it
##############################################################

This function is supposed to be called from the spider. I'm filtering links through a decision tree built from some regular expressions, so the code only tries to call LogonUser when a link matches the server that requires authentication. The code in the spider subroutine is as follows:

if ( $linkList->[$x] =~ [pattern of the server that needs authentication] ) {
    print("You're in the LogonUser block!");
    LogonUser( '[username goes here]', '', '', 0x000003, 0x000000, $ph )
        or die $^E;
    # ImpersonateLoggedOnUser($ph);
}

When I run the .pl file through a batch file, I manage to do some indexing until I hit the server asking for authentication. Then I get this message:

Logon failure: unknown user name or bad password

I've even passed it on to my supervisor to rule out the credentials I was providing for the first three arguments (username, domain, and password, respectively). He's told me that he has many more permissions on that server than I do. Right now he's busy covering the workload of someone who's out of the office, so he can't look into the problem any further. By the way, yes, I have tried entering all three credentials.

Any ideas on what's wrong? I'm curious whether there's something wrong with the fourth and fifth parameters I'm passing. These are the logon type and logon provider parameters. The documentation gives them named constants, but I don't know how to use those names without getting an error saying they're not "numeric." So, at least according to the starter code I found, I have to pass the raw values instead: 0x000000, 0x000001, and so on.
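For reference, here is a minimal sketch of one way to get readable names for those two parameters. As far as I can tell, Win32::API does not define the LOGON32_* names itself, so Perl sees a bare LOGON32_LOGON_NETWORK as an unknown word rather than a number. Declaring them with the constant pragma avoids that. The numeric values below are the ones from the Windows SDK header (winbase.h); the provider list only covers the three providers that, combined with the seven logon types, give the 21 combinations mentioned below.

```perl
use strict;
use warnings;

# Logon types (fourth argument to LogonUser), values as in winbase.h.
use constant {
    LOGON32_LOGON_INTERACTIVE       => 2,
    LOGON32_LOGON_NETWORK           => 3,
    LOGON32_LOGON_BATCH             => 4,
    LOGON32_LOGON_SERVICE           => 5,
    LOGON32_LOGON_UNLOCK            => 7,
    LOGON32_LOGON_NETWORK_CLEARTEXT => 8,
    LOGON32_LOGON_NEW_CREDENTIALS   => 9,
};

# Logon providers (fifth argument to LogonUser), values as in winbase.h.
use constant {
    LOGON32_PROVIDER_DEFAULT => 0,
    LOGON32_PROVIDER_WINNT40 => 2,
    LOGON32_PROVIDER_WINNT50 => 3,
};

# The constants behave like plain numbers anywhere a number is expected.
printf "logon type = %d, provider = %d\n",
    LOGON32_LOGON_NETWORK, LOGON32_PROVIDER_DEFAULT;
```

With these in scope, the 0x000003 and 0x000000 in the call above could be written as LOGON32_LOGON_NETWORK and LOGON32_PROVIDER_DEFAULT, which are the same two values.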

There are 21 possible combinations of these parameters. Out of these combinations, do you guys think there's any particular one that'd be correct? Would it help if I gathered information about what kind of software or hardware is being used to authenticate users? If a particular mechanism is being used (e.g., Kerberos), would it be wiser to use a different module or approach?

Details about the LogonUser function can be found here: LogonUser function (Windows). The possible logon types and logon providers are documented there as well.

Thanks again for your time. I'm pretty new to Perl, and while I've gotten a decent grasp of the basics, more complex things like this are currently beyond me. Any help is greatly appreciated.

Note: Updates below the break. We've resolved the issue.

Update: I checked my username after unlocking my machine to make sure it was correct. I saw that I was missing the domain name (minus the .com part) in my original tries. I put in the correct username, along with the domain and password. It's still failing, but now it's taking a few seconds after telling me it's in the LogonUser block before it fails. It's gotta be processing something, right? Note that it only delays before failing if I include the domain, despite the domain being "optional" according to the LogonUser documentation linked above.

Update (3/19/2013): We solved our issues! Apparently our server wants us to use digest authentication. Our lead programmer was able to build something that works! Now I just have to set things up so that our spider can invoke that code to overcome any 401 error we get.
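For anyone who lands here with the same 401, this is a rough sketch of what digest authentication can look like from Perl. It is not our lead programmer's code, and it assumes the spider fetches pages with LWP::UserAgent, which may not match your setup. The host and port, realm, username, password, and the fetch_page sub are all placeholders. Once credentials are registered, LWP answers a Basic or Digest challenge from the server on its own.

```perl
use strict;
use warnings;
use LWP::UserAgent;

my $ua = LWP::UserAgent->new( timeout => 10 );

# All four values are placeholders. The realm has to match the one the
# server sends in its WWW-Authenticate header, and the first argument is
# "host:port" of the server that asks for authentication.
$ua->credentials(
    'intranet.example.com:80',
    'Example Realm',
    'some_user',
    'some_password',
);

# Hypothetical helper: returns the page body, or undef if the fetch failed.
sub fetch_page {
    my ($url) = @_;
    my $response = $ua->get($url);

    if ( $response->code == 401 ) {
        # Credentials were missing, wrong, or registered for another realm.
        warn "Still unauthorized for $url: ", $response->status_line, "\n";
        return undef;
    }
    return $response->is_success ? $response->decoded_content : undef;
}
```

The realm is the easy part to get wrong: if it does not match what the server announces, LWP has no credentials to offer and the request still ends in a 401.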


In reply to Authentication Problem - "Logon failure: unknown user name or bad password" by B-Man
