in thread how to extract original string from binary files?

Any feedback on http://search.cpan.org/~mhx/Convert-Binary-C-0.76/lib/Convert/Binary/C.pm ?

Regards,
Swapnil

Re^3: how to extract original string from binary files?
by Corion (Patriarch) on May 19, 2016 at 16:51 UTC

    Convert::Binary::C seems like an ill fit for your situation. It is most useful when you want to pack (or unpack) data according to a given C structure and you need to match the alignment that the C compiler has applied to that structure.

    So far, you have not shown us that you already have the existing C structure. Because of that, I can only advise against using this module. Maybe you want to look at pack and perlpacktut instead.

      Hi Corion,

      Thanks for the reply. I have posted the hex dump and the code I am using.



      Regards,
      Swapnil

        What you posted is a hex dump of an ASCII file. It certainly is not a hexdump of a binary file.

        I can only suggest that you look at the output of your hexdump and try to explain to us what you see there, and how what you see there should relate to the data in the binary file.

        Maybe you can show us the exact command you used to create this hexdump. It should be something like:

        od -x /home/swapy/thatbinaryfile.bin
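        One caveat with od -x: it prints two-byte words, so on a little-endian machine each pair of bytes appears swapped. A byte-by-byte dump is easier to match against a field table. The following is only a minimal sketch of such a dump in Python. The helper name hexdump and the sample bytes are made up for illustration and are not taken from your file.

```python
def hexdump(data: bytes, width: int = 16) -> str:
    """Return a byte-by-byte hex dump: offset, hex bytes, printable ASCII."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        # Show printable ASCII as-is and everything else as a dot.
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08x}  {hex_part:<{width * 3 - 1}}  {text_part}")
    return "\n".join(lines)


# Hypothetical sample data; a real run would read the file in binary mode,
# for example: data = open(path, "rb").read()
sample = b"AB\x00\x01\xc8\xc5\xd3\xd3\xd6"
print(hexdump(sample))
```

        If the right-hand column of such a dump shows nothing but hex digits and spaces, the file being dumped is itself a text file containing a hex dump, not the original binary.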

      Hi Corion,

      Thanks for your reply. One more piece of information: the binary file was created by a C program, and the C header file isn't available to me. So can we still parse the binary file if the header isn't available?



      Regards,
      Swapnil

        Why is your file named binary1.txt? That is a highly misleading name.

        If you don't know the format of your binary file and have no documentation from the person who provided the file, you have to work out the data format yourself. This is not an easy task for a beginner in programming, but it needs far less programming than it needs thinking and guesswork.

        The first step is to find out what pieces of data could be in the file and if there are repeating structures in the file. Once you have determined these boundaries, you can then look further.
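        One way to look for repeating structures is to pick a byte sequence that seems to recur in the dump and check whether it shows up at regular distances. This is a rough sketch in Python, not a general solution. The function names, the marker bytes and the sample data are all invented for illustration, and a real file may not have a fixed marker at all.

```python
def find_offsets(data: bytes, marker: bytes) -> list:
    """Return every offset at which marker occurs in data."""
    offsets = []
    pos = data.find(marker)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(marker, pos + 1)
    return offsets


def gaps(offsets: list) -> list:
    """Return the distances between consecutive offsets."""
    return [b - a for a, b in zip(offsets, offsets[1:])]


# Hypothetical data: three 32-byte records that each start with 00 0b.
data = (b"\x00\x0b" + b"x" * 30) * 3
offsets = find_offsets(data, b"\x00\x0b")
print(offsets)
print(gaps(offsets))
```

        If the gaps are all equal, the file probably consists of fixed-length records. If they vary, look for a length field near the start of each candidate record.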

        You mentioned an EBCDIC data structure earlier. Maybe you can find out with a printout of the hexdump in one hand, and a colored marker in the other what parts belong where.

        Byte      Field                     Code
        1 - 11    Record header
        12        Type of activation (1)    BIN
        13        Type of message (2)       BIN
        14 - 19   Name of the process       EBCDIC
        20 - 25   Name of the command       EBCDIC
        26        Type of terminal (3)      BIN
        27        Address of terminal       BIN
        28 - n    Complementary message     EBCDIC

        So, you could mark bytes 1 to 11 with one color, marking them as the header. (If your hexdump counts offsets from zero, these are offsets 0 to 10.)

        The byte number 12 would be the type of activation and the byte number 13 would be the message type.

        The name of the process would be a 6-byte string (bytes 14 to 19), but the bytes are encoded in EBCDIC. You can find a translation table from EBCDIC to ASCII online. Use that table to translate the text into something human readable and to understand the data.
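        Instead of translating by hand, you can let a program apply the table. The sketch below uses Python's built-in cp037 codec. Which EBCDIC code page your file uses is not known from this thread, so cp037 is an assumption. Letters and digits sit at the same positions in the common EBCDIC code pages, while some punctuation differs. The sample bytes are made up.

```python
def ebcdic_to_text(raw: bytes, codepage: str = "cp037") -> str:
    """Decode EBCDIC bytes to a string (cp037 is an assumed code page)."""
    return raw.decode(codepage)


# Hypothetical 6-byte process name as it might appear in a hexdump.
process_name = bytes.fromhex("d7d9d6c3f0f1")
print(ebcdic_to_text(process_name))
```

        If the decoded text looks like plausible names, the guess about the field position and the code page is probably right. If it looks like noise, the field boundaries or the encoding are likely different from what you assumed.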

        Starting with byte 28, there is more EBCDIC encoding another message. Repeat the previous process, using the same colours for the same parts of the message.
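        Once the marker exercise confirms the layout, the same split can be written as a program. The following Python sketch takes the field table above literally. It assumes that you already have exactly one complete record in memory (how records are delimited is not known here), that the text fields use code page cp037, and that the table's byte numbers count from 1. The function name parse_record and the sample record are invented for illustration.

```python
import struct

# Fixed part of the record, bytes 1 to 27 of the table:
# 11-byte header, 2 single bytes, two 6-byte names, 2 single bytes.
# ">" selects standard sizes with no padding between fields.
FIXED = struct.Struct(">11s B B 6s 6s B B")


def parse_record(record: bytes, codepage: str = "cp037") -> dict:
    """Split one record into named fields (code page is an assumption)."""
    (header, activation, msg_type,
     process, command, term_type, term_addr) = FIXED.unpack_from(record)
    return {
        "header": header,
        "activation": activation,
        "message_type": msg_type,
        "process": process.decode(codepage),
        "command": command.decode(codepage),
        "terminal_type": term_type,
        "terminal_address": term_addr,
        # Bytes 28 to n: the rest of the record is the EBCDIC message.
        "message": record[FIXED.size:].decode(codepage),
    }


# Hypothetical record built only to exercise the function.
sample = (bytes(11) + bytes([1, 2])
          + "PROC01".encode("cp037") + "CMD001".encode("cp037")
          + bytes([3, 0x10]) + "HELLO".encode("cp037"))
for name, value in parse_record(sample).items():
    print(name, value)
```

        In Perl, the pack and perlpacktut documentation mentioned earlier covers the equivalent unpack templates.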