Hi Util, the sample attachment posted is not plain text; it comes as "text/html".
Dumping the attachment content yields the following output (sorry for pasting the HTML, please bear with me):

------=_Part_64_24417480.1094663686411
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: BASE64

------=_Part_64_24417480.1094663686411
Content-Type: application/octet-stream; name=User_Logins_UnitDaily_Test_User_09_08_2004_10_14_45_DAY2004-9-8_18.mhtml
Content-Transfer-Encoding: BASE64
Content-Disposition: attachment; filename=User_Logins_UnitDaily_Test_User_09_08_2004_10_14_45_DAY2004-9-8_18.mhtml

snip...

@^@^@^@^A^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@
^@Admin_Logins_UnitDaily_Test_User_09_08_2004_10_14_45_DAY2004-9-8_18.mhtml^@^@^@^@^@^@^@^@^
@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@Authenti^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@
^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@
^@Admin_Logins_UnitDaily_Test_User_09_08_2004_10_14_45_DAY2004-9-8_18.mhtml
^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@application/octet-stream^@
^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@attachment^@

<snip><snip>...

Content-Type: multipart/related; boundary="----=_NextPart_000_0000_01C0D2F6.0C049AB0"; type="text/html"

This is a multi-part message in MIME format.

------=_NextPart_000_0000_01C0D2F6.0C049AB0
Content-Type:text/html; charset=utf8 Windows-1252
Content-Transfer-Encoding: quoted-printable

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Frameset//EN">
<html>
<head>
<title><font color=3Dred size=3D4>Firewall Reports</font> - Authentication/Login</title>
<LINK REL=3DSTYLESHEET HREF=3D"/sgms/reports/reports.css" TYPE=3D"text/css">
</head>
<body bgcolor=3D"#FFFFFF">
<SCRIPT LANGUAGE=3D"JavaScript">
document.write("");
document.write("<!--");
var sWidth=3D725;
var sHeight=3Dwindow.screen.availHeight-100;
if(navigator.appName=3D=3D'Microsoft Internet Explorer')
{
window.moveTo(15,50);
window.resizeTo(sWidth,sHeight);document.write("<div align=3D'center'><font size=3D'1' face=3D'Verdana, Arial, Helvetica, sans-serif' color=3D'#000000'><b>Scheduled ");
document.write(" report for Cisco at IP address&nbsp;18.187.12.10 (Test1 - 0002321)</b></font></div>");
document.write(" </td>");
document.write(" <td width=3D'35'>&nbsp;</td>");
document.write(" </tr>");
document.write(" <tr>");
document.write(" <td width=3D'20'>&nbsp;</td>");
document.write(" <td width=3D'200'>&nbsp;</td>");
}
document.write(" <tr>");
document.write(" <td bgcolor=3D'#CCCCCC' width=3D'759' align=3D'left'><font color=3D'#000000' face=3D'Verdana, Arial, Helvetica' size=3D'2'><b>&nbsp; Admin Logins for&nbsp;2004-09-08</b></font></td>");
document.write(" </tr>");
if(ver =3D=3D 4)
{
document.write("<td nowrap style=3D'white-space: nowrap;padding-right: 20px;padding-left: 20px;FONT-FAMILY: Helvetica, Arial, Times, Times New Roman;COLOR: #ffffff;BACKGROUND-COLOR: #003399;FONT-SIZE: 8pt;FONT-STYLE: normal;FONT-WEIGHT: normal;height: 20;' type=3D'text/css'> Time </td><td nowrap style=3D'white-space: nowrap;padding-right: 20px;padding-left: 20px;FONT-FAMILY: Helvetica, Arial, Times, Times New Roman;COLOR: #ffffff;BACKGROUND-COLOR: #003399;FONT-SIZE: 8pt;FONT-STYLE: normal;FONT-WEIGHT: normal;height: 20;' type=3D'text/css'>Source</td></tr>");
}
document.write("<td nowrap>09:31:26</td><td>192.168.128.2</td></tr>");
document.write("<td nowrap>09:31:39</td><td>192.168.128.2</td></tr>");
{
document.write("<font class=3DtimezoneDisclaimer color=3D'#000000' face=3D'arial' size=3D'1'>&nbsp;&nbsp;* Reports generated based on data summarized on: 09/08/2004 16:47:38 UTC</font>");
}
document.write("<br>");
document.write("<font face=3D'arial' size=3D'1' color=3D'#999999'>&nbsp;&nbsp;* Report generated in 0.047 secs. </font>");
document.write("");
document.write("");
</SCRIPT>
</body>
</html>------=_NextPart_000_0000_01C0D2F6.0C049AB0
Content-Type: image/gif
Content-Transfer-Encoding: base64
Content-Location: file:///C:/SGMS2/Tomcat/webapps/sgms/images/ubslogo.gif

R0lGODlhYQAmAPQAAP///wMDA5+fn/8AAPLt7UpKSv9eXv+dnSgoKNPT03Nzc/8iIv////97e7u7
u//MzAECAwECAwECAwECAwECAwECAwECAwECAwECAwECAwECAwECAwECAwECAwECAwECAyH/C01T
T0ZGSUNFOS4wGAAAAAxtc09QTVNPRkZJQ0U5LjAQAiDF3gAh/wtNU09GRklDRTkuMBgAAAAMY21Q
<snip><snip>....
Can you let me know how to parse the required data from the above?

Thanks in advance.

In reply to Re^2: Parsing mhtml attachment reports by chanakya
in thread Parsing mhtml attachment reports by chanakya
