zentara has asked for the wisdom of the Perl Monks concerning the following question:

OK, I'm overwhelmed by all the scripts and Unicode information, and I just need a head start on this.

What I want to do is get a list of all the hex code points covered by a font family, namely the Symbola font collection, though the family name can be generalized.

For instance, this code gives me the descriptive name of the unicode element 0x1F42A, from the Symbola font collection.

    #!/usr/bin/perl
    use warnings;
    use strict;
    use charnames ();
    print charnames::viacode(0x1F42A);    # prints "DROMEDARY CAMEL"

What I am looking for is a way to get all those hex numbers which Symbola provides. How do I dump Symbola to a list of hex numbers?

Thanks, you are saving me a mind-boggle if you know the answer. :-)


I'm not really a human, but I play one on earth.
Old Perl Programmer Haiku ................... flash japh

Re: getting a list of all font elements by font family name
by JavaFan (Canon) on Jul 30, 2011 at 23:07 UTC
    For instance, this code gives me the descriptive name of the unicode element 0x1F42A, from the Symbola font collection.
    Actually, that code gives you the Unicode name for 0x1F42A. charnames::viacode is unaware of fonts.
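
To illustrate the point, here is a minimal sketch that names a few code points without touching any font file. The sample code points and the helper name describe are chosen purely for illustration, and a name for U+1F42A assumes a perl whose bundled Unicode tables already include it (5.14 or later).

```perl
#!/usr/bin/perl
use strict;
use warnings;
use charnames ();

# Arbitrary sample code points; nothing here consults a font.
my @codepoints = (0x0041, 0x263A, 0x1F42A);

# Hypothetical helper: format one code point with its Unicode name.
sub describe {
    my ($cp) = @_;
    # viacode returns undef when perl knows no name for the code point.
    my $name = charnames::viacode($cp);
    return sprintf "U+%04X %s", $cp, defined $name ? $name : '(no name)';
}

print describe($_), "\n" for @codepoints;
```

The same names come back whether or not Symbola (or any other font) is installed, which is why this route cannot tell you what a font covers.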

    Now, following your link and following a link in there gets me the zip file. Unzipping the archive, I find a file called Symbola605.txt. In there, I find a line: "Character repertoire of Symbola", followed by a long list of characters.

    I expect your answer can be found in this file. And ultimately in the file Symbola605.ttf, but that requires knowledge of the ttf format - knowledge I do not have.
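
For the .ttf route, one possible sketch uses the cmap table, which maps code points to glyph IDs. It assumes the CPAN distribution Font::TTF is installed and that find_ms returns a usable Unicode subtable for this font. The sub names hex_codepoints and cmap_of and the script name in the usage comment are made up for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Format a cmap-style hash (code point => glyph ID) as sorted hex strings,
# skipping entries that map to glyph 0 (the "missing glyph").
sub hex_codepoints {
    my ($map) = @_;
    return map  { sprintf "0x%04X", $_ }
           sort { $a <=> $b }
           grep { $map->{$_} } keys %$map;
}

# Read the Unicode cmap subtable of a TrueType file.
# Assumption: Font::TTF is installed. It is loaded lazily so that
# hex_codepoints() can be used without it.
sub cmap_of {
    my ($path) = @_;
    require Font::TTF::Font;
    my $font  = Font::TTF::Font->open($path) or die "Cannot open $path\n";
    my $table = $font->{'cmap'}->read->find_ms
        or die "No Unicode cmap subtable in $path\n";
    return $table->{'val'};
}

# Usage: perl list_codepoints.pl Symbola605.ttf
print join(' ', hex_codepoints(cmap_of($ARGV[0]))), "\n" if @ARGV;
```

Whether code points above 0xFFFF show up depends on which subtables the font carries and which one find_ms picks, so treat the result as a starting point rather than a guaranteed complete list.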

Re: getting a list of all font elements by font family name
by zentara (Cardinal) on Jul 31, 2011 at 15:58 UTC
    Googling suggests using Font::TTF::Name, but the docs are hard to figure out. However, I did get some results with the script ttfwidth.pl from the Examples directory of Font-TTF-Scripts, which gave output like:
        $ ./ttfwidth.pl -u Symbola605.ttf
        There are 6796 glyphs
        font mapping Microsoft id = 3, encoding = 1 (encoding => UGL coding)
        Unicode, Glyph, AdvWidth, LSdBearing, Xmin, Xmax, Ymin, Ymax, XCentre
        0x0020,3,512,0,0,0,0,0,512
        0x0021,10,682,229,229,453,0,1466,341
        0x0022,11,809,108,108,701,771,1339,404
        0x0023,12,1024,39,39,985,-121,1265,512
        0x0024,13,964,108,108,854,-108,1446,483
        0x0025,14,1662,50,50,1612,-134,1578,831
        0x0026,15,1593,95,95,1498,-45,1466,796
        ...etc
        ...etc
    and I did have some good results with
        #!/usr/bin/perl
        use warnings;
        use strict;
        use Font::TTF::Font;

        my $font = shift || "./Symbola605.ttf";
        my $f = Font::TTF::Font->open($font) or die "Cannot open $font: $!\n";

        # The 'post' table holds the glyph names, indexed by glyph ID.
        my @out = @{ $f->{'post'}->read->{'VAL'} };
        #print "@out\n";

        # Keep names like "u1F42A": a "u" followed by a digit.
        # (The original class [\d+] also matched a literal "+".)
        my @unihex = grep { /^u\d/ } @out;
        print "@unihex\n";
    But my head is starting to spin now, and I think I will set it aside for a while, until Unicode support gets better. Even the app gucharmap fails to display all the glyphs, even though my Tk program can.
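
Glyph names pulled from the post table can be turned back into code points when they follow the common "uniXXXX" / "uXXXXX" naming convention. The sketch below assumes Symbola names its glyphs that way. Glyphs with other names (for example "space") are silently skipped, so the list may under-count. The sub name codepoint_from_name is invented for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: map a glyph name to a code point, or return
# nothing if the name does not follow the uniXXXX / uXXXX[XX] pattern.
sub codepoint_from_name {
    my ($name) = @_;
    return hex $1 if $name =~ /^uni([0-9A-F]{4})$/;    # e.g. uni2200
    return hex $1 if $name =~ /^u([0-9A-F]{4,6})$/;    # e.g. u1F42A
    return;
}

# Usage: pass glyph names as arguments, e.g. the output of a script
# that prints the 'post' table names.
if (@ARGV) {
    my @cps = sort { $a <=> $b } map { codepoint_from_name($_) } @ARGV;
    print join(' ', map { sprintf "0x%04X", $_ } @cps), "\n";
}
```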

    I'm not really a human, but I play one on earth.
    Old Perl Programmer Haiku ................... flash japh
Re: getting a list of all font elements by font family name
by Tux (Canon) on Jul 31, 2011 at 17:54 UTC
    $ xlsfonts -lll -fn -misc-symbola-medium-r-semicondensed--0-0-0-0-p-0-*-* |\
      perl -ne'($a,$b)=/^\s+0x\w+\s+\((\d+)\)((?:\s+\d+)+)\s+0x\w+/ and$b=~/[1-9]/ and$fcp{$a}++}END{printf"0x%04X ",$_ for sort{$a<=>$b}keys%fcp'
    0020 0021 0022 0023 0024 0025 0026 0027 0028 0029 002A 002B 002C 002D 002E 002F 0030
    ...
    2E25 2E26 2E27 2E28 2E29 2E2A 2E2B 2E2C 2E2D 2E2E 2E2F
    2E30 2E31 2E32 2E33 2E34 2E35 2E36 2E37 2E38 2E39 2E3A 2E3B 2E3C 2E3D 2E3E 2E3F
    4DC0 4DC1 4DC2 4DC3 4DC4 4DC5 4DC6 4DC7 4DC8 4DC9 4DCA 4DCB 4DCC 4DCD 4DCE 4DCF
    4DD0 4DD1 4DD2 4DD3 4DD4 4DD5 4DD6 4DD7 4DD8 4DD9 4DDA 4DDB 4DDC 4DDD 4DDE 4DDF
    4DE0 4DE1 4DE2 4DE3 4DE4 4DE5 4DE6 4DE7 4DE8 4DE9 4DEA 4DEB 4DEC 4DED 4DEE 4DEF
    4DF0 4DF1 4DF2 4DF3 4DF4 4DF5 4DF6 4DF7 4DF8 4DF9 4DFA 4DFB 4DFC 4DFD 4DFE 4DFF
    FFF9 FFFA FFFB FFFC FFFD

    update: fixed command line and shortened the output.

    FWIW my xlsfonts does not show the code points above 0xFFFF at all :(
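
For readers who find the one-liner dense, here is the same logic unrolled into a small script. The pattern is copied verbatim from the command above. The layout of the xlsfonts output is simply whatever that pattern expects and has not been verified separately. The sub and script names are made up for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the decimal code point from one metrics line if at least one of
# its metrics is non-zero (i.e. the character really has a glyph),
# otherwise return nothing.
sub codepoint_from_line {
    my ($line) = @_;
    my ($code, $metrics) =
        $line =~ /^\s+0x\w+\s+\((\d+)\)((?:\s+\d+)+)\s+0x\w+/
        or return;
    return $metrics =~ /[1-9]/ ? $code : ();
}

# Usage (the trailing "-" makes <> read standard input):
#   xlsfonts -lll -fn '<font pattern>' | perl xls_codepoints.pl -
if (@ARGV) {
    my %seen;
    while (my $line = <>) {
        my ($code) = codepoint_from_line($line);
        $seen{$code}++ if defined $code;
    }
    printf "0x%04X ", $_ for sort { $a <=> $b } keys %seen;
    print "\n";
}
```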


    Enjoy, Have FUN! H.Merijn
      Hi, thanks for that command string, but on my Slackware system, even though I have the Symbola font installed and working, it doesn't show up in the xlsfonts output. Maybe it's because I don't have a font server running?

      Oops, I see I need to take the name out of /usr/share/fonts/truetype/fontdir


      I'm not really a human, but I play one on earth.
      Old Perl Programmer Haiku ................... flash japh

        What I did with the .ttf from that zip:

            $ chdir ~/.fonts
            $ mv /tmp/Symbola605.ttf .
            $ mkfontscale
            $ mkfontdir
            $ fc-cache
            $ sudo /etc/init.d/xfs restart

        And yes, I have my ~/.fonts directory in the font server list (/etc/X11/fs/config).


        Enjoy, Have FUN! H.Merijn