gk2010 has asked for the wisdom of the Perl Monks concerning the following question:

I am looking for a way to accurately get the URL data from the tcp->{data} value using NetPacket. What I have right now seems to work, but the data I am getting is sometimes incomplete. Any help on making this more accurate? What I need to do is quickly detect executable, PDF, or other downloads so my other app can use the URL to retrieve the binary, analyze it, and then perform further actions if something is out of the ordinary.
...
my $object = Net::Pcap::open_live($dev, 1024, 1, 0, \$err);
...
sub token_data {
    my $data = shift;
    my @items = split (/\n|\r\n|\r/,$data);
    my $item;
    my $host = '';
    my $uri = '';
    foreach $item (@items) {
        if ($item =~ /GET\s+(.*)\s+HTTP.*/) {
            $uri= $1;
        }
        if ($item =~ /Host:\s+(.*)/) {
            $host = $1;
        }
        if ($item =~ /User-Agent:\s+(.*)/) {
            $agent = $1;
            $agent =~ s/\(//g;
            $agent =~ s/\)//g;
        }
    }
    return "$host,$uri,$agent";
}

sub process_packet {
    my($user_data, $header, $pkt) = @_;
    my $ether_data = NetPacket::Ethernet::strip($pkt);
    my $ip = NetPacket::IP->decode($ether_data);
    my $tcp = NetPacket::TCP->decode(ip_strip(eth_strip($pkt)));
    if ($tcp->{data} =~ /GET|POST|get|post/) {
        if ($tcp->{data} =~ /.*Host:\s+(.*)/) {
            my $result = token_data ($tcp->{data});
            print "DEBG::$ip->{'src_ip'},$ip->{'dest_ip'},$tcp->{'dest_port'},$result\n";
        }
    }
}

Replies are listed 'Best First'.
Re: libpcap, netpacket and decoding http data
by Illuminatus (Curate) on Mar 17, 2010 at 15:42 UTC
    You did not include how you are getting the pcap data. Tools like tcpdump have historically defaulted to a small snaplen (68 or 96 bytes in older versions). That is sometimes not even enough to capture all of the header information (for example, if you are in a tunnel), so your capture may not include all of the headers you are looking for.
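One way to check whether the snaplen is the culprit is to compare the captured length with the on-the-wire length inside the capture callback. This is a minimal sketch, assuming the header hash that Net::Pcap passes to the callback carries `caplen` and `len` keys; the helper name `is_truncated` and the sample values are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: true when the capture cut the packet short.
# $header is shaped like the hash ref handed to the loop callback:
# caplen = bytes actually captured, len = bytes on the wire.
sub is_truncated {
    my ($header) = @_;
    return $header->{caplen} < $header->{len} ? 1 : 0;
}

# Simulated headers, standing in for what process_packet would receive.
my $short = { caplen => 1024, len => 1514 };   # clipped by a 1024-byte snaplen
my $full  = { caplen => 412,  len => 412  };   # captured whole

print "short: ", is_truncated($short), "\n";
print "full: ",  is_truncated($full),  "\n";
```

If truncated packets do show up, raising the snaplen (the second argument to Net::Pcap::open_live, 1024 in the post) to something like 65535 should let whole frames through.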
      Good point. I will go back and edit the post.
Re: libpcap, netpacket and decoding http data
by Corion (Patriarch) on Mar 17, 2010 at 14:59 UTC

    You will need to combine the TCP packets you're receiving to restore the TCP stream. I've done something like that with Sniffer::HTTP. Also note that "requesting" the same resource will not necessarily give you the same response.
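The idea of combining segments can be sketched roughly as follows. This is not how Sniffer::HTTP does it, only an illustration under simplifying assumptions: payloads are buffered per connection, joined in sequence-number order, and parsed once the blank line ending the HTTP headers has arrived. Gaps, retransmissions, and sequence-number wraparound are ignored. The helper names `add_segment` and `parse_request` are hypothetical, and the hash refs only mimic the fields that NetPacket::IP and NetPacket::TCP decode.

```perl
use strict;
use warnings;

my %streams;   # connection key => { seqnum => payload }

# Hypothetical helper: store one segment and return the request header
# block once the terminating blank line has arrived, otherwise undef.
sub add_segment {
    my ($ip, $tcp) = @_;
    return unless defined $tcp->{data} && length $tcp->{data};

    my $key = join ':', $ip->{src_ip}, $tcp->{src_port},
                        $ip->{dest_ip}, $tcp->{dest_port};
    $streams{$key}{ $tcp->{seqnum} } = $tcp->{data};

    # Join payloads in sequence-number order (assumes no gaps or wraparound).
    my $segs = $streams{$key};
    my $buf  = join '', map { $segs->{$_} } sort { $a <=> $b } keys %$segs;

    my ($head) = $buf =~ /\A(.*?\r\n)\r\n/s;
    return unless defined $head;
    delete $streams{$key};      # headers complete, drop the buffer
    return $head;
}

# Hypothetical helper: pull host and URI out of a complete header block.
sub parse_request {
    my ($head) = @_;
    my ($method, $uri) = $head =~ /\A(GET|POST)\s+(\S+)\s+HTTP/;
    my ($host) = $head =~ /^Host:\s*(\S+)/mi;
    return ($host, $uri);
}

# Simulated request split across two segments, mid-header.
my $ip = { src_ip => '10.0.0.5', dest_ip => '192.0.2.10' };
my @segments = (
    { src_port => 40000, dest_port => 80, seqnum => 1000,
      data => "GET /files/setup.exe HTTP/1.1\r\nHo" },
    { src_port => 40000, dest_port => 80, seqnum => 1033,
      data => "st: example.com\r\n\r\n" },
);

for my $tcp (@segments) {
    my $head = add_segment($ip, $tcp) or next;
    my ($host, $uri) = parse_request($head);
    print "http://$host$uri\n";
}
```

Matching each packet on its own, as the posted process_packet does, would miss the Host header in a case like this, because it only becomes visible after the two payloads are joined.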

      I looked at that, but I could not find a way to write the resulting data to disk. Then, when I read through the module, I noticed there was a note explaining why. I need to dump the data to a file while I debug and write the app.

        You can write things to disk, but you will first accumulate the complete download in RAM. The module documentation says so.