Sergei Mikhailovich Prokudin-Gorskii (1863-1944) [Сергей Михайлович Прокудин-Горский, to his Russian friends] was a man well ahead of his time. Convinced, as early as 1907, that color photography was the wave of the future, he won the Tzar's special permission to travel across the vast Russian Empire and take color photographs of everything he saw, including the only color portrait of Leo Tolstoy. And he really photographed everything: people, buildings, landscapes, railroads, bridges... thousands of color pictures! His idea was simple: record three exposures of every scene onto a glass plate using a red, a green, and a blue filter. Never mind that there was no way to print color photographs until much later -- he envisioned special projectors to be installed in "multimedia" classrooms all across Russia where the children would be able to learn about their vast country. Alas, his plans never materialized: he left Russia in 1918, right after the revolution, never to return. Luckily, his RGB glass plate negatives, capturing the last years of the Russian Empire, survived and were purchased in 1948 by the Library of Congress. The LoC has recently digitized the negatives and made them available online.
Walkthrough of colorizing Prokudin-Gorskii's pictures
The input image is a negative, as shown above, so we have to split it into its three color channels (Blue, Green, Red -- this is the order of the channels from top to bottom in the negative) and align them. Simply placing the three channels on top of each other does not work without some image processing, because the exposures are translated relative to one another. As seen here:
Once loaded, each image is represented as an array (matrix) in which each element is a pixel's brightness. Since the brightness values of the three channels are relatively similar, we can use
them to align the pictures. To find the best alignment, we can exhaustively search over a window of pixel displacements; I used [-15, 15]. At each displacement, we compare the
two pictures with an alignment metric that tells us how close they are. There are many one could try; the two I tried were the L2 norm (Euclidean distance) and normalized cross-correlation (NCC).
In my tests they gave similar results, so I stuck with the L2 norm.
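The splitting and search steps described above can be sketched roughly as follows. This is a minimal illustration, not the exact code used for the results: the function names are hypothetical, images are assumed to be 2-D float NumPy arrays, shifts are expressed as (dy, dx) (which may not match the order used in the results table), and `np.roll` wraps pixels around the border as a simplification.

```python
import numpy as np

def split_channels(plate):
    """Split a scanned plate into (blue, green, red), stacked top to bottom."""
    h = plate.shape[0] // 3  # any leftover rows at the bottom are dropped
    return plate[:h], plate[h:2 * h], plate[2 * h:3 * h]

def l2_distance(a, b):
    """Euclidean distance between two images; lower means more similar."""
    return np.sqrt(np.sum((a - b) ** 2))

def ncc(a, b):
    """Normalized cross-correlation; higher means more similar."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(np.sum(a * b) / denom) if denom > 0 else 0.0

def align_exhaustive(channel, reference, window=15):
    """Return the (dy, dx) shift in [-window, window] that minimizes L2."""
    best_score, best_shift = np.inf, (0, 0)
    for dy in range(-window, window + 1):
        for dx in range(-window, window + 1):
            # np.roll wraps pixels around the border (a simplification).
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = l2_distance(shifted, reference)
            if score < best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift
```

To use NCC instead, the comparison would flip: keep the shift with the highest `ncc` score rather than the lowest distance.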
As seen above, aligning the pictures with no preprocessing works decently for some pictures and not well at all for others.
The images' borders are darker, chopped off, and so on, which skews the metric and causes misalignment. To improve the alignment,
it is better to look only at the interior pixels of the pictures. We run the search on a cropped version of each picture and apply the displacement found
to the uncropped version.
Ignoring a chunk of the edges is fine, because most of the detail in these pictures is near the center of the image. Thus, we
still get a nicely aligned uncropped image.
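A rough sketch of the crop-then-align idea, under a few assumptions of mine: the crop fraction (10% per side), the helper names, and the (dy, dx) shift convention are illustrative and not taken from the original pipeline. The search runs only on the interior, and the winning shift is then applied to the full, uncropped channel.

```python
import numpy as np

def crop_interior(img, frac=0.1):
    """Drop `frac` of the height and width from each side (assumed value)."""
    h, w = img.shape
    dy, dx = int(h * frac), int(w * frac)
    return img[dy:h - dy, dx:w - dx]

def best_shift(channel, reference, window=15):
    """Exhaustive L2 search over shifts in [-window, window]."""
    best_score, best = np.inf, (0, 0)
    for dy in range(-window, window + 1):
        for dx in range(-window, window + 1):
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = np.sum((shifted - reference) ** 2)
            if score < best_score:
                best_score, best = score, (dy, dx)
    return best

def align_with_crop(channel, reference, frac=0.1, window=15):
    """Find the shift on the cropped interiors, apply it to the full channel."""
    shift = best_shift(crop_interior(channel, frac),
                       crop_interior(reference, frac), window)
    aligned = np.roll(channel, shift, axis=(0, 1))
    return aligned, shift
```

The returned `aligned` array has the same shape as the input channel, so the three channels can still be stacked into one uncropped color image.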
Exhaustive search with a small pixel window such as [-15, 15] works well for lower resolution images, because there are fewer pixels to compare and the channels only need to be shifted by a few pixels to find a match. For higher resolution pictures, this method becomes a problem: the offsets are larger in pixel terms, so we would need to widen the search window, and every comparison touches more pixels, which makes the alignment far too slow.
A solution to this problem is to construct an image pyramid: we repeatedly downscale the image (I downscaled by a factor of 2) and run the exhaustive search on the lowest resolution version first. We then take the offset found at the coarser level (the lower resolution image), scale it up to the next level's resolution, use it to shift the next level's image, and begin a new exhaustive search from there. The idea is to get the broad details of the image aligned at low resolution and carry that shift to the next resolution; as we move to higher and higher resolutions, we align on finer and finer details.
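The coarse-to-fine search might look something like the sketch below. Several details are assumptions on my part rather than the original implementation: 2x2 block averaging as the downscaling filter, doubling the coarse offset when moving to the next level, the stopping size `min_size`, and the small `refine` window used at the finer levels.

```python
import numpy as np

def downscale2(img):
    """Halve the resolution by averaging 2x2 blocks (one simple filter choice)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    return img[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def search(channel, reference, center, window):
    """Exhaustive L2 search over shifts within `window` of `center`."""
    best_score, best_shift = np.inf, center
    for dy in range(center[0] - window, center[0] + window + 1):
        for dx in range(center[1] - window, center[1] + window + 1):
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = np.sum((shifted - reference) ** 2)
            if score < best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift

def pyramid_align(channel, reference, min_size=64, window=15, refine=2):
    """Estimate (dy, dx) coarse-to-fine, halving the resolution at each level."""
    if min(channel.shape) <= min_size:
        # Coarsest level: full exhaustive search around zero shift.
        return search(channel, reference, (0, 0), window)
    coarse = pyramid_align(downscale2(channel), downscale2(reference),
                           min_size, window, refine)
    # One pixel at the coarser level corresponds to two pixels here.
    center = (2 * coarse[0], 2 * coarse[1])
    return search(channel, reference, center, refine)
```

With this structure, a full-resolution offset well outside [-15, 15] can still be recovered, because it shrinks by half at every level until it fits inside the window at the coarsest one.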
While aligning on pixel brightness values works for many images, it fails for images in which a large area is saturated with one color.
For instance, it fails on the emir picture and the church picture because there is a lot of one color in each: blue in the emir's robe and blue in much of the background
of the church.
In such an area the channels have very different brightness values, which throws off the metric during alignment.
A solution to this problem is to align on edges instead. In my image processing pipeline, I chose to employ Canny edge detection.
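To illustrate why edges help, here is a small sketch that aligns on edge maps instead of raw brightness. Note that it does not use Canny: to stay dependency-free it swaps in a plain gradient-magnitude edge map via `np.gradient`, which is a much cruder stand-in. The function names and the (dy, dx) convention are hypothetical.

```python
import numpy as np

def edge_map(img):
    """Gradient magnitude as a simple stand-in for a Canny edge map."""
    gy, gx = np.gradient(img.astype(float))
    return np.hypot(gx, gy)

def align_on_edges(channel, reference, window=15):
    """Exhaustive L2 search on edge maps rather than on raw brightness."""
    chan_edges, ref_edges = edge_map(channel), edge_map(reference)
    best_score, best_shift = np.inf, (0, 0)
    for dy in range(-window, window + 1):
        for dx in range(-window, window + 1):
            shifted = np.roll(chan_edges, (dy, dx), axis=(0, 1))
            score = np.sum((shifted - ref_edges) ** 2)
            if score < best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift
```

Because a constant brightness difference between two channels has zero gradient, it drops out of the edge maps entirely, so a region that is bright in one channel and dark in another no longer dominates the score.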
Below are uncropped versions of all the aligned images in the provided data; the last 4 are other photos from Prokudin-Gorskii's collection.
Image Name | Green Offset | Red Offset |
---|---|---|
cathedral | (2, 5) | (3, 12) |
monastery | (2, -3) | (2, 3) |
tobolsk | (3, 3) | (3, 6) |
train | (8, 40) | (32, 88) |
church | (0, 24) | (-8, 56) |
emir | (24, 48) | (40, 104) |
harvesters | (16, 56) | (24, 88) |
icon | (8, 56) | (8, 112) |
lady | (8, 56) | (8, 112) |
melons | (8, 80) | (16, 176) |
onion_church | (24, 48) | (40, 104) |
sculpture | (-8, 32) | (-24, 136) |
self_portrait | (32, 80) | (40, 176) |
three_generations | (16, 48) | (16, 112) |
desk | (16, 80) | (24, 152) |
machine | (8, 56) | (16, 104) |
picture_frame | (24, 32) | (40, 48) |
boy | (-16, 48) | (-8, 104) |