Thank you for the suggestions! The super long sub is actually an experiment. I wrote a short one, and I noticed that there are a lot of variables, and passing them around to subroutines is no fun. For example, the part that decodes custom-format BMPs would require at least 14 arguments. So I tried to break up this sub into 5-6 smaller subroutines, and the result was not nice. The code didn't look pretty. So, I don't know what to do...

"An excellent rule of thumb is that each subroutine should be fully visible on the screen without scrolling."

Yeah, I know... It's wonderful when you have a small, easy task to do, and you can do it in 35 lines of code and then return with the result. I like those kinds of subs too. Then you have these monster subs that look like never-ending code, and even when you try to divide the code into sections, it looks like a plate of spaghetti. Variables are created in one sub, then they are used in other subs all over. It can be a nightmare.
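
One common way around the long argument list is to keep all the decoder variables in a single hash and pass a reference to it. Here is a minimal sketch of that idea. The sub names (new_bmp_state, row_size, data_size) and the field names are made up for illustration and are not taken from the original script.

```perl
use strict;

# Hypothetical decoder state: one hash instead of a long argument list.
# Field names and defaults are assumptions for this sketch.
sub new_bmp_state {
    my %state = (
        width       => 0,
        height      => 0,
        bpp         => 24,
        compression => 0,
        top_down    => 0,
        @_,                 # caller-supplied values override the defaults
    );
    return \%state;
}

# Each helper takes the same single argument and reads what it needs.
sub row_size {
    my ($bmp) = @_;
    # BMP rows are padded to a multiple of 4 bytes.
    return int(($bmp->{width} * $bmp->{bpp} + 31) / 32) * 4;
}

sub data_size {
    my ($bmp) = @_;
    return row_size($bmp) * $bmp->{height};
}

my $bmp = new_bmp_state(width => 5, height => 3, bpp => 24);
print "row: ", row_size($bmp), " bytes, data: ", data_size($bmp), " bytes\n";
```

A sub that needs to add a new value just stores another key in the hash, so the other subs don't have to change their argument lists.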

The only reason I like to keep my scripts compatible with Perl 5.004 is that it is the earliest Perl interpreter I have in my possession. A script written that way will still run on newer versions of Perl; it just means that it also runs on older ones. And having a script that can run on anything is great. It's one less thing to worry about.

I didn't realize that putting the main program in braces keeps its variables scoped to that block and out of the rest of the program. Actually, I remember reading about it. But I forgot that I can do that.
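
For anyone else reading, here is a small sketch of the bare-block idea. The variable names and the describe sub are invented for the example.

```perl
use strict;

my $summary;

{
    # Everything declared with "my" in here is private to this block.
    my $header = "BM";
    my $width  = 640;
    $summary = describe($header, $width);
}

# $header and $width no longer exist here. Under "use strict",
# mentioning them at this point would be a compile-time error.

sub describe {
    my ($sig, $w) = @_;
    return "$sig image, $w pixels wide";
}

print "$summary\n";
```

The subs defined outside the block cannot accidentally read or change the main program's variables. They only see what is passed in.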

"undef $HEADER;"

I have played around a lot with string variables, and I noticed that if I have a very large string in a sub, exiting the sub will not destroy the big string in memory! That's unusual, but we have talked about this before in the thread "Memory usage double expected": doing undef $VARIABLE causes Perl to free up that memory.
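
A minimal sketch of that pattern is below. The sub name build_report and the string size are made up. Whether the freed memory actually goes back to the operating system depends on the platform and on how Perl was built, so treat this as an illustration and not a guarantee.

```perl
use strict;

sub build_report {
    my $big = 'x' x 5_000_000;   # roughly 5 MB of scratch data
    my $len = length $big;

    # Perl tends to keep a lexical's string buffer around for the next call.
    # undef (as opposed to assigning '') asks Perl to release the buffer now.
    undef $big;

    return $len;
}

print build_report(), "\n";
```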

But let me say a word about the BMP image format as well. Whoever invented this stuff was really not thinking about how hard it's going to be for programmers who just want to decode a simple image. BMP images are usually stored upside down in the file BUT NOT ALWAYS. When the highest bit of the ImageHeight is 1 (that is, the height is negative), the BMP image is actually stored right side up. But it also means that you have to negate the ImageHeight before you can use it.

BMP images also contain padding. And lots of it. It's all over the place. The header itself has a few bytes of empty space. The palette usually has a bunch of empty space in it (about 256 bytes or so). And each line of the image has padding at the end. The padding depends on how wide the image is. In RLE compressed BMP files, the end of each line has an extra null byte SOMETIMES but not always.

The BMP file header contains several different variables to describe the image format. One is used for compression. Okay, this value is stored in a 32-bit unsigned int, but in practice you only see 4 VALUES: 0, 1, 2, 3. Zero means no compression is used at all. One and two mean RLE compression is used. And 3 indicates that it's a custom-format BMP file (which means that the red, green and blue values do not appear where they normally do in a standard uncompressed BMP file. Normally, you have BLUE, then GREEN, then RED in this order. But in a custom-format BMP file, the RGB values can be in any order, and they can take up 1-8 bits or even more. There is total flexibility here.) So, the compression is stored in 32 bits but only the lowest two bits are used. We store the BITS_PER_PIXEL (BPP) in 16 bits but only the lowest 6 bits are ever used. Then there is the COLORS value, which is 32 bits but rarely needs more than 9 bits. The FILESIZE and DATASIZE, on the other hand, use a full 32 bits each to store the obvious, and they are completely unnecessary. In fact, most programs ignore these values, because they CAN BE ZERO sometimes.

The BMP header has a variable size. And it also has a version number. There are specific versions, identified by the header size, and each one is slightly different. There's version 12, 16, 40, 52, 56, 108 and 124. After version 40, which is considered the "standard," there was little change. But the fact that you have so much unused space in a file and so many rules only makes it harder to decode the image. The designers of BMP weren't thinking about SIMPLICITY. Even a 5-year-old could design a better file format.
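
To make the height and padding rules concrete, here is a rough sketch that reads the 14-byte file header plus the start of a 40-byte info header. The sub name parse_bmp_header and the hash keys are my own invention, and the sample header is built by hand with pack; it is not taken from a real file. The sketch only handles the "version 40" layout.

```perl
use strict;

# Parse the file header and the first fields of a 40-byte info header.
sub parse_bmp_header {
    my ($buf) = @_;
    return undef if length($buf) < 54;

    my ($sig, $filesize, $res1, $res2, $offset,
        $hdrsize, $width, $height, $planes, $bpp, $compression)
        = unpack('a2VvvVVVVvvV', $buf);
    return undef unless $sig eq 'BM';

    # The height is a signed 32-bit value, but 'V' reads it as unsigned.
    my $top_down = 0;
    if ($height >= 2147483648) {            # highest bit is set
        $height   = 4294967296 - $height;   # negate the two's complement value
        $top_down = 1;
    }

    # Each row is padded up to a multiple of 4 bytes.
    my $row_size = int(($width * $bpp + 31) / 32) * 4;

    return {
        width       => $width,
        height      => $height,
        bpp         => $bpp,
        compression => $compression,
        top_down    => $top_down,
        header_size => $hdrsize,
        data_offset => $offset,
        row_size    => $row_size,
        padding     => $row_size - int(($width * $bpp + 7) / 8),
    };
}

# Hand-built sample: 5 pixels wide, height -2 (stored top-down), 24 bpp.
my $sample = pack('a2VvvVVVVvvVVVVVV',
    'BM', 0, 0, 0, 54,            # file header, FILESIZE left at zero
    40, 5, 0xFFFFFFFE, 1, 24,     # header size, width, height, planes, bpp
    0, 0, 0, 0, 0, 0);            # compression and the remaining fields

my $info = parse_bmp_header($sample);
printf "%dx%d, %d bpp, %s, %d padding byte(s) per row\n",
    $info->{width}, $info->{height}, $info->{bpp},
    ($info->{top_down} ? "top-down" : "bottom-up"), $info->{padding};
```

Note that the sketch never looks at FILESIZE, since that field can be zero.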

Consider the SUN Raster image file format, for example. It has a 32-byte header which is ALWAYS 32 bytes long and always holds exactly eight 32-bit unsigned integers in big-endian format. They contain the image width and height, the palette size, datasize, compression and a few other things. VERY SIMPLE. You don't need a 300-line sub to decode it. Even the RLE compression algorithm they use is A LOT nicer and more efficient than what BMP uses. RAS files use very little padding (each row is only rounded up to an even number of bytes) and don't waste space, so a simple uncompressed image is usually going to be somewhat smaller than a BMP file. Not only is it smaller, but the decoder program is a lot simpler too. I really wish RAS images were more popular than BMP, but today RAS is hardly even known. It's practically a non-existent format nowadays.
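
For comparison, reading the whole RAS header can be sketched in a few lines. The sub name parse_ras_header is made up, and the hash keys are short forms of the field names as I remember them from Sun's rasterfile.h, so treat the naming as an assumption. The sample header is built by hand with pack.

```perl
use strict;

# The eight header fields, in file order.
my @RAS_FIELDS = qw(magic width height depth length type maptype maplength);

sub parse_ras_header {
    my ($buf) = @_;
    return undef if length($buf) < 32;

    my @values = unpack('N8', $buf);                # eight big-endian 32-bit ints
    return undef unless $values[0] == 0x59A66A95;   # Sun Raster magic number

    my %hdr;
    @hdr{@RAS_FIELDS} = @values;                    # hash slice: name => value
    return \%hdr;
}

# Hand-built sample: 640x480, 24 bits per pixel, standard type, no color map.
my $sample = pack('N8', 0x59A66A95, 640, 480, 24, 640 * 480 * 3, 1, 0, 0);

my $ras = parse_ras_header($sample);
print "$ras->{width}x$ras->{height}, $ras->{depth} bpp, $ras->{length} data bytes\n";
```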


In reply to Re^2: Convert BMP to HTML by harangzsolt33
in thread Convert BMP to HTML by harangzsolt33
