CS 180 Project 1: Colorizing the Prokudin-Gorskii photo collection
By Rhythm Seth
Introduction
Each subject in the Prokudin-Gorskii collection was captured as three exposures, taken through a red, a green, and a blue filter. The three channels are not perfectly aligned, however, so simply stacking them produces visible color fringing; the channels must first be aligned to produce a clean color photograph. This project uses single-scale and multi-scale algorithms to align both the small and the very large images from the collection. I also explored methods for automatically cropping the borders of the resulting color images.
Single-scale Alignment
Approach
For my single-scale implementation, I took the straightforward approach of applying the alignment method directly to the original image, without downsampling. I began by splitting the image into its red, green, and blue channels, then applied my alignment algorithm, which searches over a range of candidate displacements and scores how well the channels match at each one. For the score I opted for normalized cross-correlation (NCC), as it seemed more effective for this task than the L2 norm. I trialed various optimizations, including edge detection using Sobel filters and border cropping, to focus the alignment on the inner regions of the image and avoid artifacts from the borders. I also expanded the search range so that the algorithm could explore a broader set of possible displacements, which improved accuracy. In the end, this single-scale approach produced fairly accurate results for most images, as seen below.
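The search described above can be sketched roughly as follows. This is a minimal illustration rather than my exact code: the function names `ncc` and `align_single_scale`, the default `search` radius, the `margin` fraction used to ignore the borders, and the use of `np.roll` (which wraps pixels around the edges) are all assumptions made for the example.

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation of two equally sized 2-D arrays (higher is better)."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def align_single_scale(channel, reference, search=15, margin=0.1):
    """Return the (dy, dx) shift of `channel` that maximizes NCC against `reference`.

    `search` and `margin` are illustrative defaults, not tuned values.
    """
    h, w = reference.shape
    my, mx = int(h * margin), int(w * margin)
    # Score only the interior so that border artifacts do not dominate.
    ref_inner = reference[my:h - my, mx:w - mx]
    best_score, best_shift = -np.inf, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # np.roll wraps around the edges; the interior crop hides most of that.
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = ncc(shifted[my:h - my, mx:w - mx], ref_inner)
            if score > best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift
```

The returned shift can then be applied with `np.roll(channel, shift, axis=(0, 1))` before stacking the three channels into a color image. The cost grows with the square of the search radius, which is why this exhaustive search is only practical on the smaller images.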
Results
Multi-scale Alignment
Approach
To handle the larger .tif images, I used an image-pyramid approach to speed up the alignment process. The pyramid lets the image be processed at multiple scales, from the coarsest at level 3 down to the finest at level 0. I built it by downsampling the image by a factor of 2 at every level and then applying my align function to the downscaled images to get the best x and y displacement offsets. I trialed a variety of approaches, including border cropping and applying the Sobel filter, but in the end I settled on just the pyramid approach with the L2 norm as my score function instead of NCC (L2 seemed to perform better for this task), and tuned the hyperparameters for the number of levels and the search range of each align function call. However, I noticed that in most images the green and red channels were still slightly misaligned with the blue. I counteracted this by manually adjusting the x and y displacements returned by the pyramid approach before applying them to each channel.
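A coarse-to-fine search of this kind might look like the sketch below. It is an illustration under stated assumptions, not my exact implementation: the helper names (`l2_score`, `downsample2`, `search`, `align_pyramid`), the 2x2 block averaging used for downsampling, the default `radius`, and the wrap-around shifting via `np.roll` are all hypothetical choices for the example. The manual displacement adjustment mentioned above is not included.

```python
import numpy as np

def l2_score(a, b):
    """Sum of squared differences between two arrays (lower is better)."""
    return float(((a - b) ** 2).sum())

def downsample2(img):
    """Halve each dimension by averaging 2x2 blocks (an odd last row/column is dropped)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    return img[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def search(channel, reference, center, radius):
    """Try every shift within `radius` of `center` and keep the lowest L2 score."""
    best_score, best_shift = np.inf, center
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            score = l2_score(np.roll(channel, (dy, dx), axis=(0, 1)), reference)
            if score < best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift

def align_pyramid(channel, reference, levels=3, radius=4):
    """Estimate the (dy, dx) shift from level `levels` (coarsest) down to level 0."""
    pyramid = [(channel, reference)]
    for _ in range(levels):
        c, r = pyramid[-1]
        pyramid.append((downsample2(c), downsample2(r)))
    shift = (0, 0)
    for c, r in reversed(pyramid):
        # A shift found at one level corresponds to twice that shift one level finer.
        shift = search(c, r, (2 * shift[0], 2 * shift[1]), radius)
    return shift
```

With 3 downsampling steps and a radius of 4, the coarsest search alone covers about 32 pixels at full resolution, while each level only evaluates 81 candidate shifts. That is where the speedup over a single large search window comes from.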
Results
Bells and Whistles
Automatic Border Cropping
In some cases, images had visible black or white borders, which negatively impacted the alignment process. To address this, I implemented an automatic border cropping function that detects uniform regions at the borders and removes them. The function converts the image to grayscale and marks as uniform the areas where the pixel values fall below a set threshold. A bounding box is then computed around the non-uniform pixels, and the image is cropped to it. This method worked well for removing unnecessary borders before alignment, resulting in cleaner, more accurate colorized images. In the images below, if you look carefully, you can see that the outer white borders have been trimmed and the channels have shifted very slightly, giving a slightly better alignment and hence a better colorized image.
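The bounding-box idea can be sketched as below. This is a simplified illustration, not my actual function: the name `auto_crop`, the `threshold` value, the assumption of a float image scaled to [0, 1], and the rule that a pixel counts as border-like when it is within `threshold` of pure black or pure white are all assumptions made for the example.

```python
import numpy as np

def auto_crop(img, threshold=0.05):
    """Crop near-black or near-white borders from an H x W x 3 float image in [0, 1]."""
    gray = img.mean(axis=2)  # simple grayscale: average of the three channels
    # Assumption: "border-like" means within `threshold` of pure black or pure white.
    content = (gray > threshold) & (gray < 1.0 - threshold)
    rows = np.flatnonzero(content.any(axis=1))
    cols = np.flatnonzero(content.any(axis=0))
    if rows.size == 0 or cols.size == 0:
        return img  # only border-like pixels were found, so leave the image unchanged
    # Bounding box around every pixel that is not border-like.
    return img[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]
```

Because the box is drawn around any non-border pixel, a single noisy pixel inside a border is enough to keep that border. A stricter variant could require a minimum fraction of content pixels per row and column before treating it as part of the picture.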
Conclusion
Overall, both the single-scale and multi-scale alignment methods performed well in reconstructing color images from the Prokudin-Gorskii photo collection. The multi-scale approach, in particular, excelled with larger, high-resolution images, reducing processing time while maintaining accuracy. The automatic border cropping method also worked well in removing unnecessary borders before alignment, leading to cleaner, more accurate colorized images. Future improvements could include using a white balance algorithm to achieve more natural color reproduction, and experimenting with non-linear contrast adjustments for even better visual results.