
Favicons

Today I added favicons to the search results, to make them more visually interesting.

Once again I'm using redo to implement the glue logic; it's a bit hairy in places but not too bad. The pipeline looks a bit like this:

  1. Get a list of domains (currently manually obtained from various datasources)
  2. Download /favicon.ico
  3. Convert ico to a 32x32 (or less) PNG:
    1. Pick a frame from the ICO
    2. Figure out the final size
    3. Use imagemagick to extract, resize, and convert to PNG
    4. Apply zopflipng to squish the PNG
  4. Get a list of all PNG images and their sha256sums
  5. Make a SHA-based filename for each PNG and generate a lookup table for the frontend

Written down like that, it seems pretty simple, but there are a lot of complications. Here's the ones I've encountered so far:

Once I've gotten a legitimate ICO file, there's still more fun to be had. ICO files can contain multiple images; the idea is that they can contain, for example, separate hand-drawn icons for different display sizes, or icons at multiple bit-depths depending on graphics quality. (I'm more familiar with the classic Mac approach, which uses ICN#, icl4, icl8, etc. resources, grouped together into a BNDL. ICO files are separate, since Windows doesn't have a resource fork, but more flexible about the kinds of icon they can store.) Favicons are nominally 16x16, but I'm rendering them as 32x32 if possible to support high-resolution screens. (If there's only a 16x16 icon available, I don't bother scaling it up, since all that would do is make a bigger file that looks blurry.)
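To make the "multiple images" point concrete, here is a minimal Python sketch that lists the frames in an ICO file by reading only its directory (a 6-byte header followed by one 16-byte entry per image). It is an illustration and not the code behind this pipeline, which uses imagemagick; the names `IcoFrame` and `list_ico_frames` are made up for this example.

```python
import struct
from typing import NamedTuple


class IcoFrame(NamedTuple):
    """One directory entry of an ICO file (hypothetical helper type)."""
    width: int
    height: int
    bit_count: int
    size: int    # number of bytes of image data
    offset: int  # where the image data starts in the file


def list_ico_frames(data: bytes) -> list[IcoFrame]:
    """Return the directory entries of an ICO file without decoding any image."""
    if len(data) < 6:
        raise ValueError("too short to be an ICO file")
    # Header: reserved (must be 0), type (1 = icon), number of images.
    reserved, kind, count = struct.unpack_from("<HHH", data, 0)
    if reserved != 0 or kind != 1:
        raise ValueError("not an ICO file")
    if len(data) < 6 + 16 * count:
        raise ValueError("truncated ICO directory")
    frames = []
    for i in range(count):
        # Entry: width, height, palette size, reserved, planes, bits per
        # pixel, data size, data offset.
        w, h, _colors, _rsvd, _planes, bpp, size, offset = struct.unpack_from(
            "<BBBBHHII", data, 6 + 16 * i
        )
        # A stored width or height of 0 means 256 pixels.
        frames.append(IcoFrame(w or 256, h or 256, bpp, size, offset))
    return frames
```

The header check also rejects many responses that are not icons at all, for example an HTML error page served at /favicon.ico.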

The heuristic I'm using right now, which works on the sites I have indexed so far, is:

This heuristic has some glaring gaps, but they turn out not to matter, at least not yet. It seems that all of the favicons I have so far fall into two buckets:

The one exception to these two buckets so far is the icon for fishshell.org, which contains two icons, of dimensions 15x13 and 32x27. The intent here appears to be that the browser will fit these into its favicon square; my color-based fallback picks the higher-resolution one and I have some code that handles this appropriately. (Imagemagick doesn't seem to have an option to scale to fit in an aspect ratio—the promising-looking option averages the two values instead—so I have some special code to detect this case and pick the appropriate image size.)
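The size calculation for that case can be sketched as follows. This is a hedged Python illustration of "fit inside a 32x32 square, keep the aspect ratio, never scale up", not the actual special-case code; `fit_within` is a hypothetical name.

```python
def fit_within(width: int, height: int, box: int = 32) -> tuple[int, int]:
    """Return the output size for an icon that must fit in a box x box square.

    The aspect ratio is preserved, and icons that already fit are left
    at their original size instead of being scaled up.
    """
    if width <= 0 or height <= 0:
        raise ValueError("dimensions must be positive")
    if width <= box and height <= box:
        return width, height
    # Pin the longer side to the box and scale the shorter side to match.
    if width >= height:
        return box, max(1, round(height * box / width))
    return max(1, round(width * box / height)), box
```

With this rule, both fishshell.org frames (15x13 and 32x27) keep their original dimensions, while a hypothetical 64x54 icon would be reduced to 32x27.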

Once I've produced a standardized PNG image, I pass it through zopflipng to compress it. zopflipng is a PNG encoder that uses the Zopfli algorithm for the deflate compression inside the PNG; this produces output that's smaller than what most deflate encoders manage, at the expense of more CPU time. These images are so tiny to begin with that the extra CPU time is negligible in comparison to all the other stuff I'm doing, and the resulting files are often 70% smaller than the original imagemagick output. The result is that most images are well under 1K; the smallest (99 bytes) is underscorejs.org, which is 2 colors plus transparency; the largest is gis.stackexchange.com (2,370 bytes), which has a lot of shading, so almost no two colors are alike.

Finally, I take the sha256sum (encoded as base64url) of each icon to use as the filename for the web server. Making the files content-addressed has several benefits:
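For reference, the naming scheme can be sketched in a few lines of Python. This is an illustration under assumptions, not the actual redo script: stripping the base64 `=` padding and appending `.png` are guesses, and `content_filename` and `build_lookup` are hypothetical names.

```python
import base64
import hashlib


def content_filename(png_bytes: bytes) -> str:
    """Name a PNG after the base64url-encoded SHA-256 of its contents."""
    digest = hashlib.sha256(png_bytes).digest()
    name = base64.urlsafe_b64encode(digest).decode("ascii")
    # Assumption: drop the trailing "=" padding and add a .png extension.
    return name.rstrip("=") + ".png"


def build_lookup(icons: dict[str, bytes]) -> dict[str, str]:
    """Map each domain to the content-addressed filename of its icon."""
    return {domain: content_filename(png) for domain, png in icons.items()}
```

Two domains that serve byte-identical icons end up pointing at the same filename, so the file only has to be stored once.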

The end result of all this work is a slightly prettier results page; hopefully the new icons will make it easier to skim the results at a glance.