How to join a PNG with alpha / transparency onto a frame in realtime

Note: I will explain the general principle and give you an example implementation in Python, as I don’t have the Android development environment set up. It should be fairly straightforward to port this to Java. Feel free to post your code as a separate answer.


You need to do something similar to what the addWeighted operation does, that is, a linear blend of two images:

g(x) = (1 - α) * f0(x) + α * f1(x)

However, in your case, α needs to be a matrix (i.e. we need a different blending coefficient per pixel).
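To illustrate the difference between a scalar and a matrix α, here is a minimal NumPy sketch. The function name linear_blend and the tiny single-channel 2x2 arrays are made up for this example, and the images are assumed to be floating point.

```python
import numpy as np

def linear_blend(f0, f1, alpha):
    """Compute (1 - alpha) * f0 + alpha * f1; alpha may be a scalar or an array."""
    return (1.0 - alpha) * f0 + alpha * f1

# Illustrative single-channel images (hypothetical values)
f0 = np.zeros((2, 2))  # "background", all black
f1 = np.ones((2, 2))   # "foreground", all white

# Scalar alpha: the same blending coefficient for every pixel
uniform = linear_blend(f0, f1, 0.5)

# Matrix alpha: a different blending coefficient per pixel
alpha_matrix = np.array([[0.0, 0.25],
                         [0.5, 1.0]])
per_pixel = linear_blend(f0, f1, alpha_matrix)
```

With a black f0 and a white f1, the per-pixel result simply reproduces the alpha matrix, which makes it easy to check that each pixel got its own weight.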


Sample Images

Let’s use some sample images to illustrate this. We can use the Lena image as a sample face:

Sample Face

This image as an overlay with transparency:

Overlay with Alpha

And this image as an overlay without transparency:

Overlay without Alpha


Blending Matrix

To obtain the alpha matrix, we can either derive the foreground (overlay) and background (face) masks by thresholding, or use the alpha channel of the input image if it has one.

It is useful to do this on floating-point images with values in the range 0.0 .. 1.0, since the masks can then be used directly as blending weights. We can then express the relationship between the two masks as

foreground_mask = 1.0 - background_mask

i.e. the two masks added together result in all ones.
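As a rough illustration of the alpha-channel route, the sketch below derives both masks with NumPy. The helper name masks_from_alpha and the 1x2 sample overlay are invented for this example; it assumes an 8-bit, four-channel image with alpha stored in the last channel.

```python
import numpy as np

def masks_from_alpha(overlay_rgba):
    """Return (foreground_mask, background_mask) as floats in 0.0 .. 1.0."""
    # The fourth channel holds alpha; scale it from 0 .. 255 down to 0.0 .. 1.0
    foreground_mask = overlay_rgba[:, :, 3].astype(np.float32) / 255.0
    background_mask = 1.0 - foreground_mask
    return foreground_mask, background_mask

# Illustrative 1x2 overlay: one fully opaque pixel, one fully transparent pixel
overlay_rgba = np.array([[[10, 20, 30, 255],
                          [40, 50, 60, 0]]], dtype=np.uint8)

foreground_mask, background_mask = masks_from_alpha(overlay_rgba)
```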

For the overlay image in RGBA format we get the following foreground and background masks:

Foreground mask from transparency

Background mask from transparency

For the overlay image in RGB format, thresholding followed by erosion and blurring gives the following foreground and background masks:

Foreground mask from threshold

Background mask from threshold


Weighted Sum

Now we can calculate two weighted parts:

foreground_part = overlay_image * foreground_mask
background_part = face_image * background_mask
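One practical detail: a single-channel mask of shape (H, W) generally does not broadcast against a three-channel image of shape (H, W, 3). The sketch below shows one way around this in NumPy, by appending a channel axis to the mask. The 1x2 arrays and their values are made up for illustration.

```python
import numpy as np

# Illustrative 1x2 float images in 0.0 .. 1.0 (made-up values)
face_image = np.full((1, 2, 3), 0.2)
overlay_image = np.full((1, 2, 3), 0.8)

# Single-channel mask of shape (H, W): left pixel is overlay, right pixel is face
foreground_mask = np.array([[1.0, 0.0]])
background_mask = 1.0 - foreground_mask

# Append a channel axis so each (H, W, 1) mask broadcasts against (H, W, 3)
foreground_part = overlay_image * foreground_mask[..., np.newaxis]
background_part = face_image * background_mask[..., np.newaxis]
```

Converting the mask to three channels, as the code samples further down do with cvtColor, achieves the same thing.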

For RGBA overlay the foreground and background parts look as follows:

Foreground part (RGBA overlay)

Background part (RGBA overlay)

And for RGB overlay the foreground and background parts look as follows:

Foreground part (RGB overlay)

Background part (RGB overlay)


Finally, we add the two parts together and convert the result back to 8-bit integers in the range 0-255.
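A minimal sketch of this last step, assuming both parts are float images in 0.0 .. 1.0. The helper name to_uint8 is hypothetical, and the rounding and clipping are a defensive choice made here (the code samples below simply cast with np.uint8).

```python
import numpy as np

def to_uint8(image_float):
    """Convert a float image in 0.0 .. 1.0 to 8-bit integers in 0 .. 255."""
    scaled = image_float * 255.0
    # Round to nearest and clip, so small floating-point errors cannot
    # push a value outside the valid 8-bit range
    return np.clip(np.rint(scaled), 0, 255).astype(np.uint8)

# Illustrative 1x2 parts (made-up values); each pixel is covered by exactly one part
foreground_part = np.array([[[0.8, 0.8, 0.8], [0.0, 0.0, 0.0]]])
background_part = np.array([[[0.0, 0.0, 0.0], [0.2, 0.2, 0.2]]])

merged = to_uint8(foreground_part + background_part)
```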

The result of the operations looks as follows (RGBA and RGB overlay respectively):

Merged (RGBA overlay)

Merged (RGB overlay)


Code Sample – RGB Overlay

import numpy as np
import cv2

# ==============================================================================

def blend_non_transparent(face_img, overlay_img):
    # Let's find a mask covering all the non-black (foreground) pixels
    # NB: We need to do this on grayscale version of the image
    gray_overlay = cv2.cvtColor(overlay_img, cv2.COLOR_BGR2GRAY)
    overlay_mask = cv2.threshold(gray_overlay, 1, 255, cv2.THRESH_BINARY)[1]

    # Let's shrink and blur it a little to make the transitions smoother...
    overlay_mask = cv2.erode(overlay_mask, cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3)))
    overlay_mask = cv2.blur(overlay_mask, (3, 3))

    # And the inverse mask, that covers all the black (background) pixels
    background_mask = 255 - overlay_mask

    # Turn the masks into three channel, so we can use them as weights
    overlay_mask = cv2.cvtColor(overlay_mask, cv2.COLOR_GRAY2BGR)
    background_mask = cv2.cvtColor(background_mask, cv2.COLOR_GRAY2BGR)

    # Create a masked out face image, and masked out overlay
    # We convert the images to floating point in range 0.0 - 1.0
    face_part = (face_img * (1 / 255.0)) * (background_mask * (1 / 255.0))
    overlay_part = (overlay_img * (1 / 255.0)) * (overlay_mask * (1 / 255.0))

    # And finally just add them together, and rescale it back to an 8bit integer image
    return np.uint8(cv2.addWeighted(face_part, 255.0, overlay_part, 255.0, 0.0))

# ==============================================================================

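# We load both images as 3-channel BGR, since blend_non_transparent
# converts the overlay with COLOR_BGR2GRAY and expects no alpha channel
face_img = cv2.imread("lena.png", cv2.IMREAD_COLOR)
overlay_img = cv2.imread("overlay.png", cv2.IMREAD_COLOR)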

result_1 = blend_non_transparent(face_img, overlay_img)
cv2.imwrite("merged.png", result_1)

Code Sample – RGBA Overlay

import numpy as np
import cv2

# ==============================================================================

def blend_transparent(face_img, overlay_t_img):
    # Split out the transparency mask from the colour info
    overlay_img = overlay_t_img[:,:,:3] # Grab the BGR planes
    overlay_mask = overlay_t_img[:,:,3:]  # And the alpha plane

    # Again calculate the inverse mask
    background_mask = 255 - overlay_mask

    # Turn the masks into three channel, so we can use them as weights
    overlay_mask = cv2.cvtColor(overlay_mask, cv2.COLOR_GRAY2BGR)
    background_mask = cv2.cvtColor(background_mask, cv2.COLOR_GRAY2BGR)

    # Create a masked out face image, and masked out overlay
    # We convert the images to floating point in range 0.0 - 1.0
    face_part = (face_img * (1 / 255.0)) * (background_mask * (1 / 255.0))
    overlay_part = (overlay_img * (1 / 255.0)) * (overlay_mask * (1 / 255.0))

    # And finally just add them together, and rescale it back to an 8bit integer image    
    return np.uint8(cv2.addWeighted(face_part, 255.0, overlay_part, 255.0, 0.0))

# ==============================================================================

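# We load the face as 3-channel BGR and the overlay unchanged,
# so that its alpha channel is kept as a fourth plane
face_img = cv2.imread("lena.png", cv2.IMREAD_COLOR)
overlay_t_img = cv2.imread("overlay_transparent.png", cv2.IMREAD_UNCHANGED) # Load with transparency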

result_2 = blend_transparent(face_img, overlay_t_img)
cv2.imwrite("merged_transparent.png", result_2)
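
Since the question is about doing this in realtime, note that everything that depends only on the overlay can be computed once, outside the frame loop. The following is a NumPy-only sketch of that idea, not a tested realtime implementation: the name make_blender is hypothetical, and it assumes an 8-bit overlay with alpha in the fourth channel and frames of the same size as the overlay.

```python
import numpy as np

def make_blender(overlay_rgba):
    """Precompute the overlay-dependent terms and return a per-frame blend function.

    overlay_rgba: uint8 array of shape (H, W, 4), alpha in the fourth channel.
    """
    # Alpha as float weights in 0.0 .. 1.0, kept as (H, W, 1) so it broadcasts
    alpha = overlay_rgba[:, :, 3:].astype(np.float32) / 255.0
    # The overlay already multiplied by its weights (values stay in 0 .. 255)
    overlay_part = overlay_rgba[:, :, :3].astype(np.float32) * alpha
    background_mask = 1.0 - alpha

    def blend(frame):
        # Per-frame work is one multiply and one add, then back to 8 bits
        merged = frame.astype(np.float32) * background_mask + overlay_part
        return np.clip(np.rint(merged), 0, 255).astype(np.uint8)

    return blend

# Illustrative 1x2 overlay (opaque pixel, transparent pixel) and a grey frame
overlay_rgba = np.array([[[0, 0, 255, 255],
                          [0, 0, 255, 0]]], dtype=np.uint8)
frame = np.full((1, 2, 3), 100, dtype=np.uint8)

blend = make_blender(overlay_rgba)
result = blend(frame)
```

In a capture loop, make_blender would be called once and blend once per frame.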
