Slash Boxes

SoylentNews is people

posted by janrinok on Sunday September 25 2016, @02:09AM   Printer-friendly
from the how-it's-done dept.

Following up on NCommander's recent "original content" series, I decided to write up a bit about a recent thesis I supervised. Full disclaimer: I supervised this thesis and I'd like to see the thesis results being used more widely.

In 2013, the nation-wide exams for primary schools somehow leaked onto the streets and were being traded over the internet. One newspaper reporting on it showed a blurred page of the exam questions 5 days before the exam was to take place. This made me wonder: would it be possible to recover the exam questions from that blurred photo?

Fast-forward one bachelor thesis investigating this (resulting in a 3/4ths complete plugin) and a few years. A team of BSc students is interested in picking up the deblurring project. Previous experience showed that asking one student to pick this up for a BSc thesis is a lot, but at my current institute BSc theses are done in teams. Challenging, but why not?

The original assignment was to implement Cho's algorithm for deblurring [Cho et al 2013] as a GIMP plugin. The previous bachelor thesis had found this algorithm as the best deblurring algorithm for recovering text. However, time marches on. During the literature review phase, the team came across some advances in deblurring. Moreover, the algorithm's description in the paper was incomplete, and patented. (Interestingly enough, the patent did not clarify the incompleteness.) There was a new algorithm by Pan et al [Pan et al 2014] that was simpler, faster, and: open source. However, the original was coded in Matlab, which is (1) proprietary, (2) not freely available, and (3) not in much use by people who want to edit pictures.

So, the team investigated both algorithms in great (and hairy) detail, and implemented Pan et al's algo as an open source GIMP plugin. This required a working understanding of the maths involved (which is not explicitly taught in the Bachelor programme). Moreover, the end result is a sleek piece of work, showcasing the team's CS creds as well.

Below, a tiny bit about how blurring works, and how to deblur. I'll skip most of the maths, promised.

[Continues ...]

Blurring is an operation on a digital image. We can represent each pixel by a value (rgb, grayscale, etc.). For example (marking one pixel for blurring):

145 167 222 255 255 255
109 112 216 255 243 248
124 131 143 128 232 201
145 167 140 198 196 203
113 125 132 162 178 193

For each pixel in the output image, its value is computed by a weighted average of the neighboring pixels in the input image. In typical Gaussian blur, the pixel under consideration is given the heighest weight, and weights for pixels further away lower with distance to this pixel.
There are two parameters: how many neighbors to consider, and how to distribute the weights. For example, if you take into account all neighbors one pixel away (including diagonal), you could assign the central pixel a weight of 40%, the neighbors up/down/left/right weights of 10%, and the diagonal neighbors weights of 0.05%:

0.05 0.10 0.05
0.10 0.40 0.10
0.05 0.10 0.05

This matrix is called the kernel. The one here is for a Gaussian blur ("a bit vagued up" if you will). If we use it to blur the marked pixel (original value: 143), we put the kernel's center on the pixel under consideration and then just multiply all weights and add the result together. Overlaying results in the following:

0.05 * 112 0.10 * 216 0.05 * 255
0.10 * 131 0.40 * 143 0.10 * 128
0.05 * 167 0.10 * 140 0.05 * 198

And adding all these together should result in 155.3 (rounded to 155).

Of course, other kernel sizes and other distributions are possible. Another typical class of blur is motion blur - the blur you'd get in the background if you take a photo of a racecar while keeping the racecar focused. While weights in a Gaussian blur trail off (sort of) circular around the pixel that will be blurred, in motion blurring the main weights are distributed in the direction of motion - usually left to right or vice versa.

How to deblur
Blurring can be expressed as: Y = X * K + n.
Here, X is the input image, Y the output image, K the kernel, n a noise factor and '*' is a mathematical operation called convolution. Deblurring is then a matter of inverting this, i.e. deconvolution. Deblurring if you don't know the kernel used originally is called blind deconvolution. There's tons of works about this. What we're interested in is recovering text from blurred images. So perhaps there's a way to use specific properties of text to "help" the blind deconvolution?

This is what Cho et al's algorithm did. It used the fact that text has high contrast (e.g. black on white). Of course, not all such stark contrasts are text. Cho et al used an algorithm to determine the "stroke width" - the width of each stroke of the letters. If this was fairly consistent, then this was probably a letter.

Of course, if you're looking for strokes in the picture, your process slows down quite a bit. Interestingly enough, a follow-up algorithm by Pan et al does not use specific features of text, but still improves upon the algorithm of Cho et al.

Both algorithms follow roughly the same outline: First, a kernel is estimated and then a preliminary deblur image is computed. The preliminary result is used to improve the kernel estimate, etc. etc.. Finally, the final kernel estimate is applied to the original input image -- the preliminary images help reveal details about contrast, but ruin the image and thus cannot be used for this step.

There's a lot more to this (Fourier transforms, gradients, Laplacian deconvolution, etc), but that goes beyond an already lengthy article. If you're interested, read the thesis or see the papers.

About the plugin
You might think it's easy to convert code from Matlab to a GIMP plugin. Well... no.
You see, Matlab comes packed to the brim with mathematical functions. GIMP plugins are coded in C, C++, or Python. So you have to use libraries for the maths functions you need. Unfortunately, some implementations of the mathematical functions needed give different results than others. The team had to discover that the hard way, and then either update their calls if they could track down the problem or implement the function themselves if they couldn't.

But that's just the mathematics - how does it work, you ask?
Well, the end result is a sleek-looking, mature plugin. You select an area, run the filter, and up pops a dialog window where you can set some parameters (e.g. kernel size estimate), and even toggle a preview function. What impressed me is that the team managed to implement a meaningful progress bar. Since the algorithm loops, and the amount of processing in each loop is not universally determined, this was pretty tricky. They ended up running a lot of test cases with timing controls embedded in the code, and found out how much progress was made in each step.
Sometimes, the progress bar pauses for a bit longer than anticipated - basically when the next portion uses more time than the generic estimate allowed.

All in all, I encourage you to download the plugin and play with it. I'll try to be around in the comments when this story runs.

Final words and links
- The GIMP plugin is available from here
- Pan et al's project page

In conclusion: there is a free, open source GIMP plugin available. The plugin could be (significantly) sped up by making use of GPU processing - but this is a finished academic project, so don't look to me for that. Interesting to note is that the project took a bit of a side step: Pan's algorithm (like Cho's) focuses mainly on deblurring uniform motion blur, not on deblurring artificially induced blur.
Given that the main difference between these is the kernel used for blurring, it's possible to update the plugin to be able to inject your own specially crafted kernel estimate to be used for deblurring. Basically, instead of the algorithm estimating a blur kernel, you just supply it. That would enable it to be used for the originally intended purpose.

Be that as it may, I'm (more than) happy with the result, and hope that others will make use of this plugin.

Original Submission

This discussion has been archived. No new comments can be posted.
Display Options Threshold/Breakthrough Mark All as Read Mark All as Unread
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • (Score: 2) by martyb on Sunday September 25 2016, @01:47PM

    by martyb (76) Subscriber Badge on Sunday September 25 2016, @01:47PM (#406249) Journal

    To the submitter: thanks for a very interesting and informative story.

    And, an apology. I'll be the first to admit that there are aspects of the site's UI that are less than ideal. But high on that list is trying to submit a story where you want to present a grid of information (like in your matrices).

    There is a restricted subset of valid HTML elements which are made available to submitters so as to protect the site from, say, trolls and griefers.

    Even then, there is a gnarly chunk of code (balance_tags IIRC) which attempts to make sure that all elements which are opened in a submission and/or comment is also closed. As the number of permutations goes up as the factorial of the number of permitted elements, this is a non-trivial task.

    tl;dr: Please accept my kudos for braving the task of submitting those matrices!

    Wit is intellect, dancing.
    Starting Score:    1  point
    Karma-Bonus Modifier   +1  

    Total Score:   2  
  • (Score: 2) by Pino P on Sunday September 25 2016, @02:42PM

    by Pino P (4721) on Sunday September 25 2016, @02:42PM (#406266) Journal

    Can't you balance tags by parsing the submission using the HTML5 rules (using something like html5lib []) and then reserializing the result to HTML5 or XHTML? If the objection is that html5lib is written in Python as opposed to Perl, you could run the balance operation in a separate process.

    • (Score: 2) by martyb on Monday September 26 2016, @12:52AM

      by martyb (76) Subscriber Badge on Monday September 26 2016, @12:52AM (#406460) Journal

      Thanks for the thoughtful reply!

      Can't you balance tags by parsing the submission using the HTML5 rules (using something like html5lib []) and then reserializing the result to HTML5 or XHTML? If the objection is that html5lib is written in Python as opposed to Perl, you could run the balance operation in a separate process.

      Would that it were that easy. To add to the complexity, there are site-specific additions such as in this case: <sarcasm>Oh, really?</sarcasm>.

      Further, even if it were necessary to augment/replace balance_tags, that may not be sufficient to support the addition of other HTML elements.

      All of our code is up on GitHub, so if you are interested, you could also take a look there. If you have any other questions, I'd encourage you to ask around on IRC []... issue "/join #dev" to join the channel intended for discussions about site development. If nobody responds there, then try "/join #Soylent" where it is more likely that there is going to be someone around.

      Again, many thanks the feedback and suggestions... we've got a good crew around here and if you are interested in taking a stab at fixing up the code, I'm sure our devs would be more than willing to help you get up to speed!

      Wit is intellect, dancing.