Many epubs come with unprofessional CSS that will not display correctly on some ebook readers. For instance, the font size may be illegibly small on a mobile device, or the user may have dark mode turned on, but the CSS specifies element foreground colors according to an assumed (but not specified) white background, so there is little or no contrast with the actual black background. I recently wrote a script to detect epubs with those problems, then one to detect and fix them.

My first attempt at this used EPUB::Parser, but I soon found that it didn't (as far as I could tell) have the functionality I needed to get at the internal CSS files and edit them. So I fell back on Archive::Zip (which EPUB::Parser uses) -- an epub is a zip file containing css, html, and xml files (and sometimes jpg's, etc.).
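For readers who have not used Archive::Zip this way, here is a minimal sketch of pulling the CSS members out of an epub and writing a changed one back. It is not taken from the full script: the names read_epub_css() and write_epub_css() are hypothetical, and writing to a separate output file is an assumption made here so the original epub is left untouched.

```perl
use strict;
use warnings;
use Archive::Zip qw(:ERROR_CODES);

# Return a hashref of { member name => CSS text } for every .css file
# inside the epub (hypothetical helper, not from the original script).
sub read_epub_css {
    my ($epub_fn) = @_;
    my $zip = Archive::Zip->new();
    $zip->read($epub_fn) == AZ_OK
        or die "Can't read $epub_fn as a zip archive";
    my %css;
    for my $member ( $zip->membersMatching( qr/\.css$/i ) ) {
        # In scalar context, contents() returns the uncompressed data.
        $css{ $member->fileName } = $zip->contents($member);
    }
    return \%css;
}

# Replace one member's contents and save the archive under a new name
# (hypothetical helper; the output filename is the caller's choice).
sub write_epub_css {
    my ($epub_fn, $css_fn, $new_css, $out_fn) = @_;
    my $zip = Archive::Zip->new();
    $zip->read($epub_fn) == AZ_OK
        or die "Can't read $epub_fn as a zip archive";
    $zip->contents( $css_fn, $new_css );
    $zip->writeToFileNamed($out_fn) == AZ_OK
        or die "Can't write $out_fn";
    return;
}
```

A caller would loop over the hash returned by read_epub_css(), pass each CSS text through a fixer, and call write_epub_css() only for members that actually changed.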

The full code and associated files
The documentation

Here, I present two of the trickier functions. inverse_color() is passed a CSS color value in any of several formats (3- or 6-digit hex, rgb(), rgba(), hsl(), hsla(), or a color name), calculates an inverse (complementary) color, and returns it. It uses functions from Graphics::ColorUtils to map CSS color names to RGB values. It is called by fix_css_colors() when that function finds a CSS block containing a color: property but no background-color: property.
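As a small worked example of the arithmetic behind the hex cases: subtracting a color from white (0xFFFFFF) never borrows between digits, so it gives the same result as inverting each channel separately. The sketch below is only an illustration; invert_hex6() and invert_channels() are hypothetical names, not functions from the script.

```perl
use strict;
use warnings;

# Invert a 6-digit hex color by subtracting it from white (0xFFFFFF).
sub invert_hex6 {
    my ($hex) = @_;
    $hex =~ /^#([[:xdigit:]]{6})$/ or die "Expected #rrggbb, got $hex";
    return sprintf "#%06x", 0xFFFFFF - hex $1;
}

# The same inversion done one channel at a time, for comparison.
sub invert_channels {
    my ($hex) = @_;
    my @rgb = map { hex($_) }
        ( $hex =~ /^#([[:xdigit:]]{2})([[:xdigit:]]{2})([[:xdigit:]]{2})$/ );
    die "Expected #rrggbb, got $hex" unless @rgb == 3;
    return sprintf "#%02x%02x%02x", map { 255 - $_ } @rgb;
}

print invert_hex6("#336699"), "\n";      # #cc9966
print invert_channels("#336699"), "\n";  # #cc9966
print invert_hex6("#808080"), "\n";      # #7f7f7f
```

The last line shows a limit of plain inversion: a mid-gray foreground inverts to a nearly identical gray, so inversion alone does not guarantee contrast for colors near the middle of the range.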

# Excerpted from the full script, which is assumed to provide the following:
#   use feature qw(say state);
#   use Graphics::ColorUtils qw(:names);   # name2rgb(), available_names()
#   a file-scoped $verbose flag

sub inverse_color {
    my $color = shift;
    die "Missing argument to inverse_color()" unless $color;

    # Cache the table of known color names across calls.
    state $color_names;
    if ( not $color_names ) {
        #set_default_namespace("www");
        $color_names = available_names();
    }

    $color =~ s/^\s+//;
    $color =~ s/\s+$//;

    if ( $color =~ /^#[[:xdigit:]]{3}$/ ) {
        # 3-digit hex: subtract from #fff
        $color =~ s/#//;
        my $n = hex $color;
        my $i = 0xFFF - $n;
        my $inverse = sprintf "#%03x", $i;
        return $inverse;
    }
    elsif ( $color =~ /^#[[:xdigit:]]{6}$/ ) {
        # 6-digit hex: subtract from #ffffff
        $color =~ s/#//;
        my $n = hex $color;
        my $i = 0xFFFFFF - $n;
        my $inverse = sprintf "#%06x", $i;
        return $inverse;
    }
    elsif ( $color =~ /rgb \s* \( \s* ([0-9]+) \s* , \s* ([0-9]+) \s* , \s* ([0-9]+) \s* \) /x ) {
        my ($r, $g, $b) = ($1, $2, $3);
        my $n = $r * 65536 + $g * 256 + $b;
        printf "converted %s to %06x\n", $color, $n if $verbose;
        my $i = 0xFFFFFF - $n;
        my $inverse = sprintf "#%06x", $i;
        return $inverse;
    }
    elsif ( $color =~ /rgba \s* \( \s* ([0-9]+) \s* , \s* ([0-9]+) \s* , \s* ([0-9]+) \s* , \s* ([0-9.]+) \s* \) /x ) {
        my ($r, $g, $b, $alpha) = ($1, $2, $3, $4);
        my $inverse = sprintf "rgba( %d, %d, %d, %0.2f )",
            255 - $r, 255 - $g, 255 - $b, 1 - $alpha;
        return $inverse;
    }
    elsif ( $color =~ /hsl \s* \( \s* ([0-9]+) \s* , \s* ([0-9]+)% \s* , \s* ([0-9]+)% \s* \) /x ) {
        my ( $hue, $saturation, $lightness ) = ($1, $2, $3);
        my $hue2   = ($hue + 180) % 360;    # opposite side of the color wheel
        my $sat2   = 100 - $saturation;
        my $light2 = 100 - $lightness;
        my $inverse = sprintf "hsl( %d, %d%%, %d%% )", $hue2, $sat2, $light2;
        return $inverse;
    }
    elsif ( $color =~ /hsla \s* \( \s* ([0-9]+) \s* , \s* ([0-9]+)% \s* , \s* ([0-9]+)% \s* , \s* ([0-9.]+) \s* \) /x ) {
        my ( $hue, $saturation, $lightness, $alpha ) = ($1, $2, $3, $4);
        my $hue2   = ($hue + 180) % 360;
        my $sat2   = 100 - $saturation;
        my $light2 = 100 - $lightness;
        my $alpha2 = 1 - $alpha;
        my $inverse = sprintf "hsla( %d, %d%%, %d%%, %0.2f )",
            $hue2, $sat2, $light2, $alpha2;
        return $inverse;
    }
    elsif ( $color =~ /currentcolor/i ) {
        warn "Should have removed currentcolor in fix_css_colors()";
    }
    elsif ( $color =~ /inherit/i ) {
        return "inherit";
    }
    elsif ( $color_names->{ "www:" . $color } or $color_names->{ $color } ) {
        # Named color: look it up, trying the "www:" namespace as a fallback.
        my $hexcolor = name2rgb( $color );
        if ( not $hexcolor ) {
            $hexcolor = name2rgb( "www:" . $color );
            if ( not $hexcolor ) {
                die "Can't resolve color name $color";
            }
        }
        $hexcolor =~ s/#//;
        my $i = 0xFFFFFF - hex($hexcolor);
        my $inverse = sprintf "#%06x", $i;
        return $inverse;
    }
    else {
        die "Color format not implemented: $color";
    }
}

sub fix_css_colors {
    my ($csstext, $css_fn, $epub_fn) = @_;
    return if not $csstext;

    my $errors = 0;
    my $corrections = 0;
    my $printed_filename = 0;

    say "Checking $epub_fn:$css_fn for bad colors\n" if $verbose;

    # this might be a good use of negative lookbehind?
    my @css_blocks = split /(})/, $csstext;

    for my $block ( @css_blocks ) {
        # (?<![-\w]) keeps "color:" from matching inside background-color:,
        # border-color:, and similar properties.
        if ( $block =~ m/(?<![-\w]) color: \s* ( [^;]+ ) \s* (?:;|$) /x ) {
            my $fgcolor = $1;
            print "found color: $fgcolor\n" if $verbose;

            if ( $fgcolor =~ m/currentcolor/i ) {
                $block =~ s/( (?<![-\w]) color: \s* currentcolor \s* ;? \s* ) \n* //xi;
                print "Stripping out $1 as it is a pleonasm\n" if $verbose;
                $corrections++;
                next;
            }

            if ( $block !~ m/background-color:/ ) {
                # Insert a contrasting background just before the color: declaration.
                my $bgcolor = inverse_color( $fgcolor );
                $block =~ s/( (?<![-\w]) color: \s* [^;}]+ \s* (?:;|$) )/background-color: $bgcolor;\n$1/x;
                print "corrected block:\n$block\n}\n" if $verbose;
                $corrections++;
            }
        }
    }

    if ( $corrections ) {
        my $new_css_text = join "", @css_blocks;
        return $new_css_text;
    }
    else {
        return undef;
    }
}
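Two details of fix_css_colors() are easy to check in isolation: splitting on a captured "}" keeps the braces in the list, so joining the pieces reproduces the stylesheet exactly, and a negative lookbehind is one way to match the color: property without also matching background-color: or border-color:. The sketch below illustrates both under my own assumptions; split_blocks(), foreground_color(), and $COLOR_PROP are hypothetical names, and this regex is slightly more permissive than the one in the function (it allows whitespace before the colon).

```perl
use strict;
use warnings;

# Match the `color` property only: the lookbehind rejects a preceding
# hyphen or word character, as in background-color or border-color.
my $COLOR_PROP = qr/(?<![-\w]) color \s* : \s* ( [^;}]+? ) \s* (?: ; | \} | $ )/xi;

# Return the foreground color declared in a CSS block, or undef if none.
sub foreground_color {
    my ($block) = @_;
    return $block =~ $COLOR_PROP ? $1 : undef;
}

# Split on "}" while keeping each "}" as its own list element.
sub split_blocks {
    my ($css) = @_;
    return split /(\})/, $css;
}

my $css = "p { color: #333; }\nh1 { border-color: red }\n";
for my $block ( split_blocks($css) ) {
    my $fg = foreground_color($block);
    print "foreground: $fg\n" if defined $fg;    # prints only "#333"
}
```

Because the delimiters are captured, join "", split_blocks($css) gives back the original text, which is what lets the function edit blocks in place and reassemble them.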

In reply to Fixing bad CSS in EPUB files by jimhenry
