from the how-it's-done dept.
Following up on NCommander's recent "original content" series, I decided to write up a bit about a recent thesis I supervised. Full disclosure: having supervised it, I'd naturally like to see the thesis results used more widely.
In 2013, the nation-wide exams for primary schools somehow leaked onto the streets and were being traded over the internet. One newspaper reporting on it showed a blurred page of the exam questions 5 days before the exam was to take place. This made me wonder: would it be possible to recover the exam questions from that blurred photo?
Fast-forward past one bachelor thesis investigating this (which resulted in a three-quarters-complete plugin) and a few years. A team of BSc students was interested in picking up the deblurring project. Previous experience had shown that asking a single student to take this on for a BSc thesis is a lot, but at my current institute BSc theses are done in teams. Challenging, but why not?
The original assignment was to implement Cho's deblurring algorithm [Cho et al 2013] as a GIMP plugin. The previous bachelor thesis had identified this algorithm as the best available for recovering text. However, time marches on. During the literature review phase, the team came across more recent advances in deblurring. Moreover, the algorithm's description in the paper was incomplete, and the algorithm itself was patented. (Interestingly enough, the patent did not clarify the incompleteness.) There was a newer algorithm by Pan et al [Pan et al 2014] that was simpler, faster, and open source. However, the reference implementation was written in Matlab, which is (1) proprietary, (2) not freely available, and (3) not in much use by people who want to edit pictures.
So, the team investigated both algorithms in great (and hairy) detail, and implemented Pan et al's algorithm as an open source GIMP plugin. This required a working understanding of the maths involved (which is not explicitly taught in the Bachelor programme). Moreover, the end result is a sleek piece of work, showcasing the team's CS creds as well.
Below, a tiny bit about how blurring works, and how to deblur. I'll skip most of the maths, I promise.
Blurring is an operation on a digital image. We can represent each pixel by a value (rgb, grayscale, etc.). For example (marking one pixel for blurring):
For each pixel in the output image, its value is computed as a weighted average of the neighboring pixels in the input image. In a typical Gaussian blur, the pixel under consideration is given the highest weight, and the weights for the surrounding pixels decrease with distance from this pixel.
There are two parameters: how many neighbors to consider, and how to distribute the weights. For example, if you take into account all neighbors one pixel away (including diagonal), you could assign the central pixel a weight of 40%, the neighbors up/down/left/right weights of 10%, and the diagonal neighbors weights of 5%:
|0.05|0.10|0.05|
|0.10|0.40|0.10|
|0.05|0.10|0.05|
This matrix is called the kernel. The one here is for a Gaussian blur ("a bit vagued up" if you will). To blur the marked pixel (original value: 143), we put the kernel's center on the pixel under consideration, multiply each weight by the pixel value beneath it, and add the results together. Overlaying results in the following:
|0.05 * 112|0.10 * 216|0.05 * 255|
|0.10 * 131|0.40 * 143|0.10 * 128|
|0.05 * 167|0.10 * 140|0.05 * 198|
Adding all of these together results in 155.3, which is rounded to 155.
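This worked example can be checked with a few lines of NumPy (the kernel weights and pixel values are copied from the tables above):

```python
import numpy as np

# 3x3 Gaussian-style kernel from the example: 40% for the centre,
# 10% for the four direct neighbours, 5% for the diagonals.
kernel = np.array([[0.05, 0.10, 0.05],
                   [0.10, 0.40, 0.10],
                   [0.05, 0.10, 0.05]])

# The 3x3 neighbourhood around the marked pixel (centre value 143).
patch = np.array([[112, 216, 255],
                  [131, 143, 128],
                  [167, 140, 198]])

# Element-wise multiply and sum: the blurred value of the centre pixel.
blurred = np.sum(kernel * patch)
print(round(blurred, 1))  # ≈ 155.3
print(round(blurred))     # 155
```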
Of course, other kernel sizes and other weight distributions are possible. Another typical class of blur is motion blur - the blur you'd get in the background if you take a photo of a racecar while keeping the racecar focused. While weights in a Gaussian blur trail off (more or less) circularly around the pixel being blurred, in motion blur the main weights are distributed in the direction of motion - usually left to right or vice versa.
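A horizontal motion-blur kernel can be sketched in a few lines (a simplified, illustrative version: real motion-blur kernels can follow an arbitrary path, not just a straight horizontal line):

```python
import numpy as np

def motion_blur_kernel(length, size):
    """A simple horizontal motion-blur kernel: the total weight is spread
    evenly along a horizontal segment through the kernel's centre row."""
    k = np.zeros((size, size))
    start = (size - length) // 2
    k[size // 2, start:start + length] = 1.0 / length
    return k

k = motion_blur_kernel(length=5, size=7)
# The centre row holds five weights of 0.2; all weights sum to 1,
# so the overall image brightness is preserved.
print(k[3])
```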
How to deblur
Blurring can be expressed as: Y = X * K + n.
Here, X is the input image, Y the output image, K the kernel, n a noise factor, and '*' is a mathematical operation called convolution. Deblurring is then a matter of inverting this, i.e. deconvolution. Deblurring when you don't know the original kernel is called blind deconvolution. There's a ton of work on this. What we're interested in is recovering text from blurred images. So perhaps there's a way to use specific properties of text to "help" the blind deconvolution?
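The blur model Y = X * K + n can be made concrete with a naive NumPy implementation (illustrative only: real code would use an FFT-based convolution for speed):

```python
import numpy as np

def convolve2d(X, K):
    """Naive 'same'-size 2D convolution with zero padding."""
    kh, kw = K.shape
    ph, pw = kh // 2, kw // 2
    Xp = np.pad(X, ((ph, ph), (pw, pw)))
    Kf = K[::-1, ::-1]  # convolution flips the kernel
    Y = np.zeros(X.shape)
    for i in range(X.shape[0]):
        for j in range(X.shape[1]):
            Y[i, j] = np.sum(Xp[i:i + kh, j:j + kw] * Kf)
    return Y

rng = np.random.default_rng(0)
X = rng.uniform(0, 255, size=(8, 8))   # the sharp input image
K = np.full((3, 3), 1 / 9)             # a simple box-blur kernel
n = rng.normal(0, 1, size=X.shape)     # sensor noise
Y = convolve2d(X, K) + n               # the observed blurred image
```

Blind deconvolution is the task of recovering X (and K) when only Y is given.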
This is what Cho et al's algorithm did. It used the fact that text has high contrast (e.g. black on white). Of course, not all such stark contrasts are text. Cho et al used an algorithm to determine the "stroke width" - the width of each stroke of the letters. If the stroke width was fairly consistent, the region was probably a letter.
Of course, if you're looking for strokes in the picture, your process slows down quite a bit. Interestingly enough, a follow-up algorithm by Pan et al does not use specific features of text, but still improves upon the algorithm of Cho et al.
Both algorithms follow roughly the same outline: first, a kernel is estimated, and then a preliminary deblurred image is computed. That preliminary result is used to improve the kernel estimate, and so on. In the last step, the final kernel estimate is applied to the original input image -- the preliminary images help reveal details about contrast, but ruin the image and thus cannot be used for this step.
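The alternating structure both algorithms share can be sketched as follows. Note that the two inner steps here are stand-ins: the real update rules in Cho et al's and Pan et al's papers are far more involved, and the function names are mine, not theirs.

```python
import numpy as np

def estimate_kernel(blurred, latent, ksize):
    # Stand-in: the real algorithms solve an optimisation problem on
    # image gradients here. We just return a centred delta kernel.
    k = np.zeros((ksize, ksize))
    k[ksize // 2, ksize // 2] = 1.0
    return k

def deconvolve(blurred, kernel):
    # Stand-in for the deconvolution step.
    return blurred.copy()

def blind_deblur(blurred, ksize=15, iterations=5):
    """Outline of the alternating scheme: refine the kernel estimate
    against a preliminary latent image, re-deblur, and repeat.
    Only the FINAL kernel is applied to the original input."""
    latent = blurred.copy()
    for _ in range(iterations):
        kernel = estimate_kernel(blurred, latent, ksize)
        latent = deconvolve(blurred, kernel)    # preliminary image only
    return deconvolve(blurred, kernel), kernel  # final pass on the input

image = np.random.default_rng(1).uniform(0, 1, (32, 32))
result, kernel = blind_deblur(image)
```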
There's a lot more to this (Fourier transforms, gradients, Laplacian deconvolution, etc), but that goes beyond an already lengthy article. If you're interested, read the thesis or see the papers.
About the plugin
You might think it's easy to convert code from Matlab to a GIMP plugin. Well... no.
You see, Matlab comes packed to the brim with mathematical functions. GIMP plugins are coded in C, C++, or Python. So you have to use libraries for the maths functions you need. Unfortunately, different libraries' implementations of the needed mathematical functions can give different results. The team discovered that the hard way, and then either updated their calls if they could track down the problem, or implemented the function themselves if they couldn't.
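One concrete example of how such mismatches arise (illustrative, not necessarily one the team hit): FFT libraries disagree on normalization conventions. NumPy's default puts the full 1/n factor in the inverse transform, while the "ortho" convention splits it between forward and inverse; code ported between conventions silently comes out scaled.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])

default = np.fft.fft(x)              # unnormalised forward transform
ortho = np.fft.fft(x, norm="ortho")  # orthonormal convention

# Same transform, different scale: off by a factor of sqrt(n).
print(default[0])                   # (10+0j)
print(ortho[0] * np.sqrt(len(x)))   # (10+0j) again
```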
But that's just the mathematics - how does it work, you ask?
Well, the end result is a sleek-looking, mature plugin. You select an area, run the filter, and up pops a dialog window where you can set some parameters (e.g. the kernel size estimate) and even toggle a preview function. What impressed me is that the team managed to implement a meaningful progress bar. Since the algorithm loops, and the amount of processing per iteration varies, this was pretty tricky. They ended up running a lot of test cases with timing controls embedded in the code, and found out how much progress was made in each step.
Sometimes, the progress bar pauses for a bit longer than anticipated - basically when the next portion uses more time than the generic estimate allowed.
All in all, I encourage you to download the plugin and play with it. I'll try to be around in the comments when this story runs.
In conclusion: there is a free, open source GIMP plugin available. The plugin could be (significantly) sped up by making use of GPU processing - but this is a finished academic project, so don't look to me for that. Interestingly, the project took a bit of a side step: Pan's algorithm (like Cho's) focuses mainly on deblurring uniform motion blur, not on deblurring artificially induced blur.
Given that the main difference between these is the kernel used for blurring, it should be possible to update the plugin to accept a custom, specially crafted kernel estimate for deblurring. Basically, instead of the algorithm estimating the blur kernel, you supply it yourself. That would enable the plugin to be used for the originally intended purpose.
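Supplying the kernel turns the problem into non-blind deconvolution, for which classic methods exist. As a sketch of the idea, here is Richardson-Lucy deconvolution with a user-supplied kernel. To be clear: this is not the plugin's algorithm, just a textbook illustration of deblurring once the kernel is known.

```python
import numpy as np

def conv_same(X, K):
    """Naive 'same'-size 2D convolution with edge padding."""
    kh, kw = K.shape
    ph, pw = kh // 2, kw // 2
    Xp = np.pad(X, ((ph, ph), (pw, pw)), mode="edge")
    out = np.empty_like(X)
    for i in range(X.shape[0]):
        for j in range(X.shape[1]):
            out[i, j] = np.sum(Xp[i:i + kh, j:j + kw] * K[::-1, ::-1])
    return out

def richardson_lucy(blurred, kernel, iterations=30):
    """Non-blind deconvolution with a *supplied* kernel -- the idea
    behind injecting your own specially crafted kernel estimate."""
    estimate = np.full_like(blurred, 0.5)
    mirrored = kernel[::-1, ::-1]
    for _ in range(iterations):
        relative = blurred / np.maximum(conv_same(estimate, kernel), 1e-12)
        estimate *= conv_same(relative, mirrored)
    return estimate

# Toy round-trip: blur a white square with a known kernel, then restore it.
sharp = np.zeros((16, 16))
sharp[6:10, 6:10] = 1.0
kernel = np.full((3, 3), 1 / 9)
blurred = conv_same(sharp, kernel)
restored = richardson_lucy(blurred, kernel)
```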
Be that as it may, I'm (more than) happy with the result, and hope that others will make use of this plugin.