
# [Answers] Image Comparison tuned to Human Perception

Problem Detail:

Say I have two systems that take the same drawing commands and start drawing on their respective platforms (e.g., HTML Canvas). After this is done, I want to save and compare the two resulting images to ensure that they are perceptually the same, even though the shapes and colors may be very slightly off (non-perceptible differences).

I tried using perceptual hashes, but unfortunately they do not seem sensitive enough for this task. For example, if you set 1000 pixels to black in one 1920x1080 image and compare it to the other, the difference is perceptually obvious, yet the pHash algorithm still returns a 100% match.
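To see why hash-based schemes can miss localized changes, here is a minimal sketch (pure Python, no libraries) of an *average hash*, a simpler cousin of pHash: both reduce the image to a tiny grid summary, so a small black patch that is obvious to the eye can be averaged away. The 192x104 gradient canvas and the patch position are invented for the demonstration.

```python
def average_hash(img, grid=8):
    """Downscale by block-averaging into a grid x grid summary, then
    threshold each cell at the overall mean. Like pHash, almost all
    of the image's fine detail is discarded before comparison."""
    h, w = len(img), len(img[0])
    ch, cw = h // grid, w // grid
    cells = []
    for cy in range(grid):
        for cx in range(grid):
            block = [img[y][x]
                     for y in range(cy * ch, (cy + 1) * ch)
                     for x in range(cx * cw, (cx + 1) * cw)]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    return tuple(int(c > mean) for c in cells)

# A smooth horizontal gradient stands in for a rendered drawing.
W, H = 192, 104
base = [[255 * x // (W - 1) for x in range(W)] for _ in range(H)]

# Black out a 10x10 patch -- plainly visible to a human viewer.
damaged = [row[:] for row in base]
for y in range(10):
    for x in range(150, 160):
        damaged[y][x] = 0

print(average_hash(base) == average_hash(damaged))  # True: the summary absorbed the patch
```

The patch shifts one cell's average only slightly relative to the threshold, so not a single hash bit flips, mirroring the 1000-black-pixels failure described above.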

I also tried nearest-neighbor averaging over all pixels: for each pixel, I averaged its value with those of a number of its neighbors and wrote the result back to the pixel. This seems to work well for shapes with hard edges and straight lines, because only a very small localized change is needed for the images to be perceptibly different. For long curves it is not so good, since there can be moderate changes along the entire curve while the images are still considered perceptually the same.
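The neighbor-averaging idea can be sketched as a box blur followed by a maximum-pixel-difference check. This is a hypothetical reconstruction of the approach described above (not the actual code), but it reproduces the stated behavior: a hard edge shifted by one pixel leaves a large localized difference, while a small change spread across the whole image stays far below the same threshold.

```python
def box_blur(img, radius=1):
    """Replace each pixel with the average of its (2r+1) x (2r+1) neighborhood."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total, count = 0, 0
            for ny in range(max(0, y - radius), min(h, y + radius + 1)):
                for nx in range(max(0, x - radius), min(w, x + radius + 1)):
                    total += img[ny][nx]
                    count += 1
            out[y][x] = total / count
    return out

def max_diff(a, b):
    """Largest per-pixel difference after smoothing."""
    return max(abs(pa - pb)
               for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))

# A hard vertical edge, and the same edge shifted right by one pixel.
edge_a = [[0 if x < 8 else 255 for x in range(16)] for _ in range(16)]
edge_b = [[0 if x < 9 else 255 for x in range(16)] for _ in range(16)]

# The same image with the dark side lightened slightly everywhere -- a
# change spread over a large area, like a moderate drift along a curve.
drift = [[min(255, p + 5) for p in row] for row in edge_a]

print(max_diff(box_blur(edge_a), box_blur(edge_b)))  # large, localized
print(max_diff(box_blur(edge_a), box_blur(drift)))   # small everywhere
```

The shifted edge scores a blurred difference of 85 at the boundary, while the global +5 drift never exceeds 5 -- exactly the asymmetry the question describes between hard edges and long curves.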

Does there exist an algorithm that can handle perceptually small changes in many possible shapes/curves? Should I use multiple algorithms to handle this task?

Unfortunately there is not likely to be any good general algorithm for this. Human perception is extremely complex. The best you can do is find algorithms that will sometimes recognize two images as perceptually similar, but there will inevitably be some pairs of images that humans would classify as perceptually similar but your algorithm won't detect as similar.

Your best bet is probably to find out exactly what minor differences tend to be introduced by these particular platforms, and then devise an algorithm that is tuned to those particular kinds of differences -- you'll probably have better luck with that than with any general algorithm.

One very simple algorithm is to align the two images (e.g., using SIFT/SURF/etc.), add a bit of Gaussian blur to both, downscale, and then compare using an L2 norm. Another simple algorithm is to align the two images with SIFT/SURF/etc. and then compute a metric based on the number of inliers vs. outliers among the feature points (i.e., how well the set of feature points from the first image matches up with the set from the second). However, it's hard to know how well either will work for your particular application without knowing the details of the differences that tend to arise in your situation.
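The first pipeline (blur, downscale, L2 norm) can be sketched as below, assuming the two renderings are already aligned -- reasonable in the asker's case, since both come from the same drawing commands -- so the SIFT/SURF alignment step is omitted. The Gaussian blur is approximated by repeated box blurs, and the image sizes and thresholds are illustrative only.

```python
import math

def blur(img, radius=1, passes=2):
    """Repeated box blurring approximates a Gaussian blur."""
    h, w = len(img), len(img[0])
    for _ in range(passes):
        out = [[0.0] * w for _ in range(h)]
        for y in range(h):
            for x in range(w):
                total, count = 0.0, 0
                for ny in range(max(0, y - radius), min(h, y + radius + 1)):
                    for nx in range(max(0, x - radius), min(w, x + radius + 1)):
                        total += img[ny][nx]
                        count += 1
                out[y][x] = total / count
        img = out
    return img

def downscale(img, factor=4):
    """Block-average downscaling: each output pixel covers a factor x factor block."""
    h, w = len(img) // factor, len(img[0]) // factor
    return [[sum(img[y * factor + dy][x * factor + dx]
                 for dy in range(factor) for dx in range(factor)) / factor ** 2
             for x in range(w)] for y in range(h)]

def l2_distance(a, b):
    """Root-mean-square pixel difference between equal-sized images."""
    n = len(a) * len(a[0])
    return math.sqrt(sum((pa - pb) ** 2
                         for ra, rb in zip(a, b)
                         for pa, pb in zip(ra, rb)) / n)

def perceptual_distance(img_a, img_b):
    return l2_distance(downscale(blur(img_a)), downscale(blur(img_b)))

# A 64x64 gradient "rendering"; b adds rounding-level noise (plus/minus 1
# per pixel), c has a genuinely missing 8x8 patch.
W, H = 64, 64
a = [[(x * 255) // (W - 1) for x in range(W)] for _ in range(H)]
b = [[max(0, min(255, p + ((x + y) % 3 - 1))) for x, p in enumerate(row)]
     for y, row in enumerate(a)]
c = [row[:] for row in a]
for y in range(20, 28):
    for x in range(20, 28):
        c[y][x] = 0

print(perceptual_distance(a, b))  # tiny: rounding noise blurs away
print(perceptual_distance(a, c))  # clearly larger: the patch survives
```

Blurring and downscaling suppress the sub-pixel rendering differences the asker wants to ignore, while a missing patch keeps enough mass to dominate the L2 distance; the second (inlier/outlier) metric would additionally need a feature detector such as OpenCV's SIFT implementation.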