eibwen has asked for the wisdom of the Perl Monks concerning the following question:
The other day I found myself on Windows and wanted to use `strings`, which I replicated in Perl readily enough to do what I wanted. On a whim, I decided to finish the port; however, I'm having trouble implementing two of the options from the following excerpts of the `strings` binutils man page:
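For reference, here's roughly what my minimal replication looks like (a sketch; `extract_strings` is just my own helper name, and the 4-character minimum matches the `strings` default):

```perl
use strict;
use warnings;

# extract_strings: return runs of at least $min printable ASCII
# characters (plus tab) from a chunk of binary data. The 4-character
# default matches strings(1); this is my own helper, not a module API.
sub extract_strings {
    my ($data, $min) = @_;
    $min ||= 4;
    return $data =~ /([\x20-\x7e\t]{$min,})/g;
}

# Demo on some fake "binary" data ("hi" is too short to qualify):
my $blob = "\x00\x01hello, world\x00\x02\x03hi\x00version 1.0\xff";
print "$_\n" for extract_strings($blob);
# prints:
#   hello, world
#   version 1.0
```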
    -a --all
        Do not scan only the initialized and loaded sections of object
        files; scan the whole files.
    [...]
    -T bfdname --target=bfdname
        Specify an object code format other than your system's default
        format. See Target Selection, for more information.
    [...]
    -e encoding --encoding=encoding
        Select the character encoding of the strings that are to be
        found. Useful for finding wide character strings. Possible
        values for encoding are:
            's' = single-7-bit-byte characters (ASCII, ISO 8859, etc., default),
            'S' = single-8-bit-byte characters,
            'b' = 16-bit bigendian,
            'l' = 16-bit littleendian,
            'B' = 32-bit bigendian,
            'L' = 32-bit littleendian.
My port presently acts as `strings --all` for all file types, including object files. I am vaguely familiar with general object file structure (ELF in particular), having read several articles and docs recently; however, I still don't understand how to differentiate or identify sections (though that is likely format dependent), much less how to determine whether a section is loaded and/or initialized.
Question 1: How can I access the contents of an initialized and/or loaded section of an object file? How do I identify whether a file is an object file in the first place (file magic, perhaps, e.g. File::MimeInfo::Magic)? Are there modules for any of the various formats, or will I need to manually open, binmode, and regex? Are there any resources I should be aware of for understanding if/when/why a section is initialized and/or loaded?
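To make Question 1 concrete, here is a sketch of what I think the section classification would involve for a 64-bit little-endian ELF file. The constants and field offsets are from the ELF64 spec (32-bit and big-endian files would need different unpack templates, and `Q<` needs a perl with 64-bit integer support); `classify_sections` is my own helper name, and I haven't verified this against binutils' actual behaviour:

```perl
use strict;
use warnings;

use constant {
    SHT_NOBITS => 8,    # occupies no space in the file (e.g. .bss)
    SHF_ALLOC  => 2,    # mapped into memory when the program runs
};

# classify_sections: read the section header table of an ELF64 LE file
# and report, per section, whether it is loaded (SHF_ALLOC set) and
# initialized (type is not SHT_NOBITS). strings(1) without --all scans
# only sections that are both.
sub classify_sections {
    my ($fh) = @_;                      # filehandle opened with :raw
    read($fh, my $ehdr, 64) == 64 or die "short read on ELF header\n";
    die "not an ELF file\n" unless substr($ehdr, 0, 4) eq "\x7fELF";
    die "not 64-bit ELF\n"  unless ord(substr $ehdr, 4, 1) == 2;

    # e_shoff at 0x28, e_shentsize at 0x3a, e_shnum at 0x3c (ELF64)
    my ($shoff)             = unpack 'Q<',  substr($ehdr, 0x28, 8);
    my ($shentsize, $shnum) = unpack 'v v', substr($ehdr, 0x3a, 4);

    my @sections;
    for my $i (0 .. $shnum - 1) {
        seek $fh, $shoff + $i * $shentsize, 0;
        read $fh, my $shdr, $shentsize;
        # sh_name, sh_type, sh_flags, sh_addr, sh_offset, sh_size
        my (undef, $type, $flags, undef, $offset, $size)
            = unpack 'V V Q< Q< Q< Q<', $shdr;
        push @sections, {
            offset      => $offset,
            size        => $size,
            loaded      => ($flags & SHF_ALLOC)  ? 1 : 0,
            initialized => ($type != SHT_NOBITS) ? 1 : 0,
        };
    }
    return @sections;
}
```

The non-`--all` behaviour would then, I assume, amount to scanning only the (offset, size) ranges of sections where both flags are true.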
The second problem I had (and one I actually know a bit about, hence the title's emphasis on the first question) concerns the encoding. From what I understand of the documentation, the encoding of a file can be specified either with the open pragma or by appending an I/O layer to the mode argument of a three-argument open. Since the encoding arrives as a command-line option and I'd rather not move Getopt::Long into a BEGIN block just so I can subsequently use the open pragma, I'll go with the mode argument of the three-arg open, e.g.:
open(my $fh, '<:utf8', 'file') or die "Can't open: $!";
Question 2: Can someone confirm that this is the correct (or at least a valid) approach to supporting the encoding of opened files? Does this affect the [:print:] character class (`strings` returns only printable characters, after all), or is that class constant irrespective of encoding? If the latter, how would I adjust the class to support, e.g., wide printable characters in a variable encoding? If a file is opened with a specified encoding, do regexes maintain that encoding, such that print $1, $/ will print the match in the specified encoding? Lastly, which encodings do the various --encoding options correspond to? (I'm used to seeing encodings expressed by name, e.g. "utf8", rather than by number of bits.)
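To make Question 2 concrete, here is the sort of mapping I have in mind. The layer choices are my guesses, particularly whether the 16/32-bit options really correspond to UTF-16/UTF-32 (`strings` itself scans fixed-width code units rather than doing a real decode, so these are at best approximations), and `find_strings` is my own helper name:

```perl
use strict;
use warnings;

# Guessed mapping from the strings(1) -e letters to PerlIO layers.
my %layer = (
    's' => ':raw',                   # 7-bit: also filter /[\x20-\x7e]/
    'S' => ':encoding(ISO-8859-1)',  # single 8-bit bytes
    'b' => ':encoding(UTF-16BE)',    # 16-bit bigendian
    'l' => ':encoding(UTF-16LE)',    # 16-bit littleendian
    'B' => ':encoding(UTF-32BE)',    # 32-bit bigendian
    'L' => ':encoding(UTF-32LE)',    # 32-bit littleendian
);

# find_strings: slurp $source (a filename or a ref to in-memory data)
# through the layer for encoding letter $enc, then return runs of 4+
# printable characters. Once the data is decoded, the regex operates
# on characters, so \p{Print} matches wide printable characters too,
# and the captures are character strings (encode them again on output).
sub find_strings {
    my ($enc, $source) = @_;
    my $layer = $layer{$enc} or die "unknown encoding letter '$enc'\n";
    open my $fh, "<$layer", $source or die "open: $!\n";
    local $/;
    my $text = <$fh>;
    return $text =~ /(\p{Print}{4,})/g;
}

# Demo on in-memory "binary" data:
binmode STDOUT, ':encoding(UTF-8)';   # in case matches are wide
my $demo = "\x00\x01plain latin text\x00";
print "$_\n" for find_strings('S', \$demo);
# prints:
#   plain latin text
```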
UPDATE: I recently found binary_analyze from the AppArmor project (SUSE wiki, Novell project page), which seems to work with object files from Perl at some length. I have quite a bit more reading and research to do before I come to an appreciable understanding of that aspect; in the interim, however, I would appreciate comments on the availability of object-file modules, or on Question 2.
Thanks!