ballJoint has asked for the wisdom of the Perl Monks concerning the following question:

Forget what you see in Explorer.  Here's what my script saw.
Look for "C:/Documents and Settings/S0007617/Local Settings/History/History.IE5"

"."
/".."
/"desktop.ini"
/"index.dat"
/"MSHist012004021620040223"
/"MSHist012004022320040301"
/"MSHist012004030120040308"
/"MSHist012004030820040309"
/"MSHist012004030920040310"
/"MSHist012004031020040311"
/"MSHist012004031120040312"
/"MSHist012004031220040313"
/"MSHist012004031320040314"

The question is, what's the format?  How do I read this?
If I run strings on the file I can see the URL text.


per tachyon - not even an edit.  Thanks.  Works great!
m!(http:[\w\.\-:/]+)!

Re: explorer ie6 - history - reading it
by tachyon (Chancellor) on Mar 13, 2004 at 20:18 UTC

    The format is binary, ill-documented, and proprietary - standard M$. You could use Win32::OLE to get at it via Explorer, but given that the URLs are in plain text within these files you can just do something like this to extract them into a file:

    perl -ne "BEGIN{$/=' '}print qq!$1\n! if m!(http:[\w\.\-:/]+)!" index.dat > outfile.txt

    As it is a binary file we can't depend on the default "\n" input record separator. We use a single space as there are space chars around every URL so we get a single URL per chunk of input. Kinda like newlines but not.
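For anyone who prefers a script to a one-liner, the same idea can be sketched as below. This is only an illustration of the space-delimited read described above, not a parser for the real file format; the sub name extract_urls is made up for this example, and the binmode call is an addition that the one-liner does not have.

```perl
use strict;
use warnings;

# Return every http: URL-like run found in a binary file, reading it in
# space-delimited chunks the way the one-liner does.
# NOTE: extract_urls is a hypothetical helper name, not part of any module.
sub extract_urls {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    binmode $fh;        # assumption: avoid CRLF translation on Windows
    local $/ = ' ';     # input record separator: a single space
    my @urls;
    while (my $chunk = <$fh>) {
        # Same pattern as the one-liner: at most one match per chunk.
        push @urls, $1 if $chunk =~ m!(http:[\w.\-:/]+)!;
    }
    close $fh;
    return @urls;
}

# Print one URL per line for each file named on the command line.
for my $file (@ARGV) {
    print "$_\n" for extract_urls($file);
}
```

Run it as something like `perl extract_urls.pl index.dat > outfile.txt`. Like the one-liner, it only catches http: URLs and relies on a space appearing after each one.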

    I have IE6 on Win2k but I don't have any files on the system called MSHistNNNNN. Eyeballing your files, it looks like the last 8 digits are YYYYMMDD. It matches the way IE displays history: individual files for the last 7 days, then some sort of archive for 2 and 3 weeks ago.

    HTH

    cheers

    tachyon

      It's an Explorer trick. Even with the options to view system files and hidden files, Explorer will not show you your own IE6 History in this format. If you look at the History folder of a user other than the currently logged in user, you'll see these "special" folders. You can also load up cmd.exe and cd "C:\Documents and Settings\username\Local Settings\History\History.IE5". It's really fricked up. Even cmd.exe gets confused with some things here. If you chdir up once to (Local Settings\History) and do a 'dir', it'll tell you file not found. Even though you can still cd to History.IE5. That's Microsoft for you.

        Try a "dir /a" instead. The directory itself is marked as System.
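From Perl, the listing in the original post suggests that plain opendir/readdir returns these entries without any special handling. A minimal sketch of such a listing follows; the sub name list_mshist and the MSHist-plus-digits filter are assumptions made for illustration, and this is not the original poster's script.

```perl
use strict;
use warnings;

# Return the MSHist* entries of a directory, sorted by name.
# readdir hands back raw entry names (including "." and ".."), so we
# filter for names that look like "MSHist" followed by digits.
# NOTE: list_mshist is a hypothetical helper name.
sub list_mshist {
    my ($dir) = @_;
    opendir my $dh, $dir or die "Cannot open $dir: $!";
    my @names = grep { /\AMSHist\d+\z/ } readdir $dh;
    closedir $dh;
    return sort @names;
}

# Pass the History.IE5 path as the first argument.
if (@ARGV) {
    print "$_\n" for list_mshist($ARGV[0]);
}
```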
Re: explorer ie6 - history - reading it
by Anonymous Monk on Mar 13, 2004 at 20:44 UTC

    My best guess is the format goes like this:

    # $c_xxxx means 'current' year, month and date
    # $l_xxxx means 'last' year, month and date
    # (of the previous history file)
    "MSHist01" . $l_year . $l_month . $l_date . $c_year . $c_month . $c_date
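Going the other way, a name can be split back into its two dates. This sketch assumes the guessed layout above ("MSHist01" followed by two YYYYMMDD dates) is right; the sub name parse_mshist_name is made up for the example.

```perl
use strict;
use warnings;

# Split an entry name into its two dates, assuming the guessed layout
# "MSHist01" . YYYYMMDD . YYYYMMDD. Returns an empty list on a mismatch.
# NOTE: parse_mshist_name is a hypothetical helper name.
sub parse_mshist_name {
    my ($name) = @_;
    my @d = $name =~ /\AMSHist01(\d{4})(\d{2})(\d{2})(\d{4})(\d{2})(\d{2})\z/
        or return;
    return (
        sprintf('%04d-%02d-%02d', @d[0 .. 2]),   # first date ('last' above)
        sprintf('%04d-%02d-%02d', @d[3 .. 5]),   # second date ('current' above)
    );
}

for my $name (qw(MSHist012004021620040223 MSHist012004031320040314 index.dat)) {
    my ($from, $to) = parse_mshist_name($name);
    print defined $from
        ? "$name: $from to $to\n"
        : "$name: not an MSHist name\n";
}
```

The names from the original listing fit this pattern, with each name's second date matching the first date of the next one.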
Re: explorer ie6 - history - reading it
by iguanodon (Priest) on Mar 14, 2004 at 01:18 UTC
    Well this is pretty interesting. It seems like Windows is deliberately suppressing the contents of these directories when viewed from explorer or cmd.exe. But I have to wonder... why?