Nigel Tao

Premultiplied Alpha

In computing, colors are often represented by a four-tuple of numbers: Red, Green, Blue, Alpha. Each of these ranges from zero up to some maximum, such as 1.0 (for floating point RGBA values) or 255 (for uint8_t RGBA values). The maximum value is usually obvious from context.

For example, RGBA(0.0, 0.0, 1.0, 1.0) represents a fully saturated blue that is also fully opaque (it has maximum blue and maximum alpha). Similarly, RGBA(0, 0, 200, 255) represents a mostly saturated blue that is still fully opaque.

These four-tuples are well understood when the Alpha value is maximal (and the color is fully opaque). Fully opaque colors are so common that the four-tuple is often abbreviated as a three-tuple: RGBA(0.4, 1.0, 0.8, 1.0) can also be written as RGB(0.4, 1.0, 0.8) or RGB(0x66, 0xFF, 0xCC) or #66ffcc. Informally, this is on the greenish side of cyan.
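As a quick check of those equivalent spellings, here is a minimal C sketch that maps floating point channels to uint8_t channels and to the #rrggbb form. The helper names (channel_to_u8, rgb_to_hex) are made up for this illustration, and rounding to nearest is an assumption; other code may truncate instead.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: maps a channel in [0.0, 1.0] to the nearest
 * value in [0, 255]. Out-of-range inputs are clamped. */
static uint8_t channel_to_u8(double v) {
    if (v <= 0.0) return 0;
    if (v >= 1.0) return 255;
    return (uint8_t)(v * 255.0 + 0.5);
}

/* Hypothetical helper: writes "#rrggbb" plus a terminating NUL into
 * out, which must have room for 8 bytes. */
static void rgb_to_hex(double r, double g, double b, char out[8]) {
    snprintf(out, 8, "#%02x%02x%02x",
             (unsigned)channel_to_u8(r),
             (unsigned)channel_to_u8(g),
             (unsigned)channel_to_u8(b));
}
```

Calling rgb_to_hex(0.4, 1.0, 0.8, buf) fills buf with "#66ffcc", matching the notation above.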

It’s not so clear how to interpret the RGB values when the color is partially transparent. There are two models (premultiplied alpha and non-premultiplied alpha) that, confusingly, are often written with the same RGBA(...) notation.

Philosophically, this comes down to whether you believe “transparent red” and “transparent blue” are distinguishable colors (the NPA model) or whether every completely invisible color is effectively the same “just transparent” color (the PA model).
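To make that distinction concrete, here is a small C sketch using uint8_t channels. The type and function names are hypothetical, and rounding to nearest is an assumption. Converting NPA to PA multiplies each color channel by alpha, so every fully transparent NPA color collapses to the same PA four-tuple.

```c
#include <stdint.h>

/* Two distinct (hypothetical) types, one per alpha model. */
typedef struct { uint8_t r, g, b, a; } rgba_nonpremul;
typedef struct { uint8_t r, g, b, a; } rgba_premul;

/* Scales each color channel by alpha / 255, rounding to nearest. */
static rgba_premul premultiply(rgba_nonpremul c) {
    rgba_premul out;
    out.r = (uint8_t)((c.r * c.a + 127) / 255);
    out.g = (uint8_t)((c.g * c.a + 127) / 255);
    out.b = (uint8_t)((c.b * c.a + 127) / 255);
    out.a = c.a;
    return out;
}

/* Returns 1 if the two premultiplied colors are identical, else 0. */
static int same_premul(rgba_premul x, rgba_premul y) {
    return x.r == y.r && x.g == y.g && x.b == y.b && x.a == y.a;
}
```

With these definitions, NPA (255, 0, 0, 0) and NPA (0, 0, 255, 0) both premultiply to (0, 0, 0, 0), while their half-transparent counterparts remain distinguishable.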

Computationally, there are good reasons to use PA: the blending formula for the ubiquitous Porter-Duff over operator is simpler and therefore often faster; interpolating “transparent red” doesn’t surprisingly recreate any red; and invalid four-tuples can be re-purposed, the way the standard read function returns “the number of bytes read” in a signed integer but re-purposes negative values for error codes. There are also good reasons to use NPA: NPA’s colors are a superset of PA’s, so a round trip PA to NPA to PA conversion is ‘lossless’ (ignoring truncation errors) but not vice versa; and there are no invalid four-tuples.

My point isn’t that there’s one ‘right’ answer. There are only trade-offs. However:
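The “simpler formula” claim can be seen by writing the over operator both ways. The following is a sketch with float channels in the range 0.0 to 1.0; the type and function names are hypothetical, not taken from any particular library.

```c
/* Hypothetical float color types, one per alpha model. */
typedef struct { float r, g, b, a; } rgba_premul_f;
typedef struct { float r, g, b, a; } rgba_nonpremul_f;

/* Porter-Duff "src over dst" with premultiplied alpha: every channel,
 * alpha included, uses the same multiply-add. */
static rgba_premul_f over_premul(rgba_premul_f src, rgba_premul_f dst) {
    float k = 1.0f - src.a;
    rgba_premul_f out;
    out.r = src.r + dst.r * k;
    out.g = src.g + dst.g * k;
    out.b = src.b + dst.b * k;
    out.a = src.a + dst.a * k;
    return out;
}

/* The same operator with non-premultiplied alpha: the color channels
 * need extra multiplies, a divide and a guard against dividing by zero. */
static rgba_nonpremul_f over_nonpremul(rgba_nonpremul_f src,
                                       rgba_nonpremul_f dst) {
    float k = dst.a * (1.0f - src.a);
    float a = src.a + k;
    rgba_nonpremul_f out = {0.0f, 0.0f, 0.0f, 0.0f};
    if (a > 0.0f) {
        out.r = (src.r * src.a + dst.r * k) / a;
        out.g = (src.g * src.a + dst.g * k) / a;
        out.b = (src.b * src.a + dst.b * k) / a;
        out.a = a;
    }
    return out;
}
```

Both functions describe the same blend. For example, half-transparent red over opaque blue yields an opaque color with Red 0.5 and Blue 0.5 in either model, provided each input is expressed in the model the function expects.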

  1. Document which alpha model you use. Use explicit type names.
  2. Be aware of different alpha models.

Document Which Alpha Model You Use

PNG

The PNG file format specification is clear and exemplary. Section 6.2 Alpha representation explicitly says “PNG does not use premultiplied alpha”. The emphasis is in the original text.

Cairo

The widely used Cairo graphics library also starts well. The CAIRO_FORMAT_ARGB32 documentation explicitly says “Pre-multiplied alpha is used. (That is, 50% transparent red is 0x80800000, not 0x80ff0000.)”
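The arithmetic behind that parenthetical can be sketched in C. Neither function below is part of Cairo’s API; both names are hypothetical. Rounding to nearest is an assumption, and the non-premultiplied packing exists only for contrast.

```c
#include <stdint.h>

/* Hypothetical helper: packs a non-premultiplied color into a 32-bit
 * ARGB value, premultiplying the color channels by alpha on the way. */
static uint32_t pack_argb32_premul(uint8_t r, uint8_t g, uint8_t b,
                                   uint8_t a) {
    uint32_t pr = ((uint32_t)r * a + 127u) / 255u;
    uint32_t pg = ((uint32_t)g * a + 127u) / 255u;
    uint32_t pb = ((uint32_t)b * a + 127u) / 255u;
    return ((uint32_t)a << 24) | (pr << 16) | (pg << 8) | pb;
}

/* Hypothetical helper, for comparison only: the same packing without
 * premultiplying. */
static uint32_t pack_argb32_nonpremul(uint8_t r, uint8_t g, uint8_t b,
                                      uint8_t a) {
    return ((uint32_t)a << 24) | ((uint32_t)r << 16) |
           ((uint32_t)g << 8) | (uint32_t)b;
}
```

For full red (0xFF) at alpha 0x80, the first function returns 0x80800000 and the second returns 0x80ff0000, which are the two values contrasted in the Cairo documentation.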

The four values packed into the 8-hexadecimal-digit number here are listed in a different order (ARGB) because of CPU endianness and other reasons, which is an interesting topic but tangential to this blog post.

However, the cairo_set_source_rgba documentation is not as clear. It just says that the alpha function argument is the “alpha component of color”, without clarifying PA vs NPA.

Even though CAIRO_FORMAT_ARGB32 uses premultiplied alpha, it turns out (see “Be Aware of Different Alpha Models” below) that cairo_set_source_rgba uses non-premultiplied alpha. The discrepancy is unfortunate. It also cannot be fixed, in a backwards compatible way, by changing the semantics of existing cairo API names and functions, only by adding new API.

Cairo’s docs can be easily amended (and new API could be added, less easily), but the general documentation lesson remains. “Alpha” or “RGBA” by itself is ambiguous. Strive to be clearer.

Use Explicit Type Names

Over a decade ago, when I worked on Go’s standard library, I named the standard image/color types RGBA and NRGBA. These are distinct types and the less-recommended but still-supported one has an “N” in its name to denote NPA. However, in hindsight, I made the mistake of naming the PA flavor just RGBA. There have been multiple cases over the years where people were confused (and filed bugs) because they tried to interchange Go’s RGBA with some other library’s RGBA (both types have the same name!), without realizing that the former is PA and the latter is NPA.

With the wisdom of that hindsight, more recently in Wuffs, I’ve named the corresponding concepts RGBA_PREMUL and RGBA_NONPREMUL. The important point is that there is no bare RGBA name. Hopefully anyone trying to interop with some other library’s RGBA concept will have to stop and think about which of PREMUL or NONPREMUL they need, instead of defaulting to the (possibly incorrect) RGBA.

A general API design lesson: if FOO is a widely used but ambiguous term, with two flavors X and Y, consider FOO_X and FOO_Y names instead of FOO and FOO_Y, even if the X flavor is more popular. Users of your API can often alias FOO for FOO_X if they really want a shorter name.
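In C, that lesson might look like the sketch below. Every name here is hypothetical; this is not the Go or Wuffs API. Because the two structs are distinct types, passing one where the other is expected is a compile-time error rather than a silent rendering bug.

```c
#include <stdint.h>

/* FOO_X and FOO_Y: there is deliberately no bare "rgba" type here. */
typedef struct { uint8_t r, g, b, a; } rgba_premul;
typedef struct { uint8_t r, g, b, a; } rgba_nonpremul;

/* A function whose signature documents the alpha model it expects.
 * It writes one pixel as four bytes in R, G, B, A order. */
static void put_rgba8_nonpremul(rgba_nonpremul c, uint8_t out[4]) {
    out[0] = c.r;
    out[1] = c.g;
    out[2] = c.b;
    out[3] = c.a;
}

/* An application that only ever uses one flavor can still opt in to a
 * shorter name of its own. */
typedef rgba_premul rgba;
```

Calling put_rgba8_nonpremul with an rgba_premul (or with the application’s rgba alias) fails to compile, which forces the caller to stop and convert explicitly.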

Be Aware of Different Alpha Models

Cairo also provides a cairo_surface_write_to_png to encode an in-memory pixel buffer in the PNG file format. That’s a relatively high level API function (a one liner) but you can also integrate Cairo’s lower level API functions with the libpng API functions. www.lemoda.net/c/cairo-to-png/ is one example of doing that, also serving as a relatively simple “hello world” example for getting acquainted with libpng’s not-so-obvious API.

However, it makes the mistake of (silently) confusing the two alpha models. As stated above, PNG uses NPA but CAIRO_FORMAT_ARGB32 uses PA. That C program is incorrect: it outputs an incorrect encoding of the pixel buffer.

If 99+% of your pixel buffer consists of fully opaque or fully transparent pixels then it will be hard to see the difference with the naked eye. To make it more obvious, let’s modify that lemoda C program with this patch:

$ diff -u lemoda-original.c lemoda-edited.c
--- lemoda-original.c	2022-03-28 15:39:22.794267941 +1100
+++ lemoda-edited.c	2022-03-28 15:46:00.393372547 +1100
@@ -111,7 +111,7 @@
 {
     int SIZEX = 80;
     int SIZEY = 80;
-    char * fname = "file.png";
+    char * fname = "premultiplied-alpha-by-lemoda.png";
     cairo_t *c;
     cairo_surface_t *cs;
     bitmap_t bitmap;
@@ -130,7 +130,7 @@
     cairo_fill (c);
     cairo_rectangle (c, SIZEX / 3.0, SIZEY / 3.0,
                      (SIZEX) / 3.0, (SIZEY) / 3.0);
-    cairo_set_source_rgb (c, 1.0, 0.0, 0.0);
+    cairo_set_source_rgba (c, 0.5, 0.0, 0.0, 0.75);
     cairo_fill (c);

     /* We have to call this before reading the data. */
@@ -144,6 +144,7 @@
     if (rv != 0) {
 	fprintf (stderr, "Failed to write PNG to file.\n");
     }
+    cairo_surface_write_to_png(cs, "premultiplied-alpha-by-cairo.png");
     cairo_surface_destroy (cs);
     return 0;
 }

The resultant etc-by-lemoda.png and etc-by-cairo.png images are visibly different (look at the central red squares). Running a PNG decoder on those two images gives these NPA four-tuples for the pixel at (45, 45):

Writing those uint8_t values as fractions-of-1.0:

For “etc-by-lemoda.png”, the Red value is 25% smaller than it should be: the premultiplied Red value (0.50 × 0.75 = 0.375) was written out as if it were non-premultiplied. For “etc-by-cairo.png”, the (0.50, 0.00, 0.00, 0.75) four-tuple matches the cairo_set_source_rgba arguments in the patch above, hence the deduction (see “Cairo” above) that cairo_set_source_rgba uses NPA.

To recap, the libpng library (by itself) is correct and the cairo library (by itself) is also correct. For cairo, even though CAIRO_FORMAT_ARGB32 uses PA and PNG uses NPA, cairo_surface_write_to_png will call its internal unpremultiply_data function to correct for this. However, naively combining libpng and cairo is incorrect without explicitly converting between NPA and PA.
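The explicit conversion could look something like the following C sketch. It is not Cairo’s internal unpremultiply_data function. The function name is hypothetical, rounding to nearest is an assumption, and it assumes each source pixel is a native-endian 32-bit ARGB value (as CAIRO_FORMAT_ARGB32 is documented) and that the destination is 8-bit RGBA in byte order, as you would hand to a PNG encoder.

```c
#include <stddef.h>
#include <stdint.h>

/* Un-premultiplies one channel, rounding to nearest. The clamp guards
 * against invalid input where the channel exceeds alpha. */
static uint8_t unpremul_channel(uint32_t c, uint32_t a) {
    uint32_t v = (c * 255u + a / 2u) / a;
    return (uint8_t)(v > 255u ? 255u : v);
}

/* Hypothetical helper: converts n premultiplied ARGB32 pixels in src
 * to 4 * n bytes of non-premultiplied R, G, B, A in dst. */
static void argb32_premul_to_rgba_nonpremul(const uint32_t *src,
                                            uint8_t *dst, size_t n) {
    for (size_t i = 0; i < n; i++) {
        uint32_t p = src[i];
        uint32_t a = (p >> 24) & 0xFFu;
        uint8_t *out = dst + 4 * i;
        if (a == 0) {
            /* Fully transparent: the color channels carry no information. */
            out[0] = out[1] = out[2] = out[3] = 0;
            continue;
        }
        out[0] = unpremul_channel((p >> 16) & 0xFFu, a);
        out[1] = unpremul_channel((p >> 8) & 0xFFu, a);
        out[2] = unpremul_channel(p & 0xFFu, a);
        out[3] = (uint8_t)a;
    }
}
```

As an illustrative input, the premultiplied pixel 0xBF600000 (Alpha 0xBF, Red 0x60) converts to the NPA bytes 0x80, 0x00, 0x00, 0xBF. A program that reads the raw surface data would run each row through a conversion like this before passing it to libpng.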

Again, this particular piece of code can be amended, but the general lesson remains. Be aware which alpha model your graphics libraries use. You may need to convert between them when two libraries meet.


Published: 2022-03-28