The three file-detection modules already mentioned (File::Type, MIME::Detect and File::MMagic) all report an xlsx file as application/zip because, as choroba wrote, it really is a zip file.

The catch is that even if your uploaded file is a zip file, it is not necessarily an xlsx. So you will probably have to extract the list of files from the archive and check whether certain files that any xlsx document should contain are present. I am not sure there is a 100% accurate way to do that detection, given possible exceptions, short of feeding the file to M$. In fact, I do not know whether the xlsx file format is officially public knowledge or whether one has to reverse-engineer it.
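A minimal sketch of that member check, using only the core module IO::Uncompress::Unzip. The two member names in @EXPECTED are my assumption about what a typical xlsx contains, not something confirmed in this thread, and the sub names are made up for illustration. The input can be a filename, a filehandle or a reference to a scalar holding the uploaded bytes.

```perl
use strict;
use warnings;
use IO::Uncompress::Unzip qw($UnzipError);

# Member names ASSUMED to be present in a typical xlsx (adjust as needed).
my @EXPECTED = ('[Content_Types].xml', 'xl/workbook.xml');

# Return the member names of a zip given as filename, filehandle or scalar ref.
# Returns an empty list if the input cannot be read as a zip archive.
sub zip_member_names {
    my ($input) = @_;
    # Transparent => 0: refuse input that does not start with a zip header.
    my $unzip = IO::Uncompress::Unzip->new($input, Transparent => 0)
        or return;
    my @names;
    my $status;
    for ($status = 1; $status > 0; $status = $unzip->nextStream()) {
        my $info = $unzip->getHeaderInfo();
        push @names, $info->{Name}
            if ref $info eq 'HASH' && defined $info->{Name};
        # Drain this member's data so the next header can be read.
        my $buf;
        while (($status = $unzip->read($buf)) > 0) { }
        last if $status < 0;
    }
    $unzip->close();
    return @names;
}

# True (1) if every assumed xlsx member is present in the archive.
sub looks_like_xlsx {
    my ($input) = @_;
    my %seen = map { $_ => 1 } zip_member_names($input);
    return 0 unless %seen;
    return (grep { !$seen{$_} } @EXPECTED) ? 0 : 1;
}
```

Usage would be something like looks_like_xlsx(\$uploaded_bytes) or looks_like_xlsx('/path/to/upload'). This only says the archive has the expected names in it; it does not prove a spreadsheet program will open it.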

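If you want to go light and module-less: the start of an xlsx file I created with LibreOffice on a linux/intel box, as shown by hexdump, is 4b50 0403 0014 0808. By default hexdump prints 16-bit words in host byte order (little-endian here), so the bytes on disk are 50 4b 03 04 14 00 08 08, and the first four are the zip signature "PK\x03\x04". That is comparable (bar endianness) to, say, the info at https://www.garykessler.net/library/file_sigs.html.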
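The module-less signature check could be sketched like this. It only tests the first four bytes (the zip local file header signature), so it tells you "zip", not "xlsx". The sub name is hypothetical.

```perl
use strict;
use warnings;

# Local file header signature at the start of a non-empty zip: 50 4b 03 04.
use constant ZIP_MAGIC => "PK\x03\x04";

# True (1) if the next four bytes read from $fh are the zip signature.
# Note: this consumes four bytes from the handle.
sub starts_with_zip_magic {
    my ($fh) = @_;
    binmode $fh;
    my $got = read($fh, my $head, 4);
    return (defined $got && $got == 4 && $head eq ZIP_MAGIC) ? 1 : 0;
}
```

If the handle is seekable you can seek($fh, 0, 0) afterwards and hand it on to whatever does the real parsing.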

Then again, even detecting a csv file can be tricky if it contains Unicode. You can't even count on the abundance of commas it should normally contain, because they may be encoded in Unicode as fancy counter-clockwise commas :) They already do that with inverted commas, and even gcc spits out Unicode nowadays.
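For what it is worth, here is a rough heuristic sketch, not a real csv detector. It assumes the data is UTF-8, decodes it first, and then requires every non-empty line to have the same non-zero number of delimiters. Counting the fullwidth comma U+FF0C next to the ASCII comma is purely my assumption, to illustrate the look-alike problem; the sub name is made up.

```perl
use strict;
use warnings;
use Encode ();

# Heuristic only: 1 if $bytes decodes as UTF-8 and every non-empty line
# has the same, non-zero number of comma-like characters.
sub looks_like_csv {
    my ($bytes) = @_;
    # Strict decode: malformed UTF-8 makes decode() die, which eval catches.
    my $text = eval {
        Encode::decode('UTF-8', $bytes,
            Encode::FB_CROAK() | Encode::LEAVE_SRC());
    };
    return 0 unless defined $text;

    my %count_seen;
    for my $line (grep { length } split /\r?\n/, $text) {
        # ASCII comma plus fullwidth comma U+FF0C (an ASSUMED look-alike).
        my $n = ($line =~ tr/,\x{FF0C}//);
        return 0 if $n == 0;
        $count_seen{$n} = 1;
    }
    return scalar(keys %count_seen) == 1 ? 1 : 0;
}
```

Quoted fields that contain commas or newlines will defeat this, so for anything serious a real parser such as Text::CSV is the safer bet.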


In reply to Re: determine file type from data read from filehandle by bliako
in thread determine file type from data read from filehandle by expo1967
