This is the sixth article in a twelve-part series that discusses the twelve finalists and their calculator submissions for the OMGWTF Programming Contest. The entries are being presented in the order submitted, and the winner will be announced on June 18, 2007.
Yes, you read that title correctly. OMG!--O-C-R--CAL. That's right, someone just had to go and implement Optical Character Recognition for their calculator. But if that was all that Entry #100175 (Ivan Milyakov's OMG!OCRCAL) did, it surely wouldn't have made finalist. The OMG!OCRCAL is a bit more... in depth... than what you might think.
First and foremost, a screen shot of me entering "881+456="
You too can enter calculations in the OMG!OCRCAL with a simple download of the executable (omgocrcal.exe) and the configuration file (definition.txt, which must be in the same directory as the executable). I highly recommend checking it out firsthand. There's even a user guide (omgocrcal.pdf) in case you run into difficulties.
What sets OMG!OCRCAL apart even further is that it doesn't actually represent the numbers using numbers: everything is based on abstract shapes. This has the added benefit of making the shapes completely user-configurable. And this is exactly why I noted above that the definition.txt file had to be downloaded as well: this file contains all of the number/shape definitions.
I'll bet you assumed the same thing about definition.txt that I did: a file filled with all sorts of abstract OCR gobbledygook that only Ivan and his OCR cadre can comprehend. Good news! It's not. In fact, the OMG!OCRCAL configuration file is written in English and is completely human readable:
Let's assume that tab equals 8 spaces
Let's draw characters randomly
Let's allow user to make mistakes

Now we'll define numbers

Zero is
    a large circle

One is either
    a long vertical line
or
    a long vertical line
    a very small slash line
    top of first joins top of second

Two is either
    a part of circle 12 to 6
    a long horizontal line
    6 of first joins left of second
or
    a part of circle 9 to 5
    a slash line
    a long horizontal line
    5 of first joins top of second
    bottom of second joins left of third

-- snip --
The OMG!OCRCAL comes with its very own parser that reads this file and builds shapes from it. Go ahead, change it. Make your own shapes that represent numbers.
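To give a feel for what such a parser has to do, here is a minimal sketch in Python. It is not Ivan's parser (his is C++ and far more thorough). It assumes one statement per line, which is my guess at the file layout, and the function name `parse_definitions` and the output structure are made up for illustration. It only groups the raw text into symbols, alternatives, shapes, and joins; it does not build real shapes.

```python
# Sample text in the style of definition.txt; the line layout is an assumption.
SAMPLE = """\
Zero is
    a large circle
One is either
    a long vertical line
or
    a long vertical line
    a very small slash line
    top of first joins top of second
"""

def parse_definitions(text):
    """Return {symbol name: [alternative, ...]} where each alternative is
    a dict holding the raw 'shapes' and 'joins' statements (hypothetical format)."""
    definitions = {}
    current = None  # list of alternatives for the symbol being defined
    for raw in text.splitlines():
        line = raw.strip()
        # Settings ("Let's ...") and banners ("Now ...") are skipped in this sketch.
        if not line or line.startswith("Let's") or line.startswith("Now"):
            continue
        words = line.split()
        if len(words) >= 2 and words[1] == "is":
            # "Zero is" or "One is either" opens a new symbol definition.
            current = definitions.setdefault(words[0], [])
            current.append({"shapes": [], "joins": []})
            rest = words[2:]
            if rest and rest[0] == "either":
                rest = rest[1:]
            if rest:
                # Tolerate a shape written on the same line as the header.
                current[-1]["shapes"].append(" ".join(rest))
        elif current is None:
            continue  # statement outside any definition; ignored here
        elif line == "or":
            # "or" starts another way of drawing the same symbol.
            current.append({"shapes": [], "joins": []})
        elif "joins" in words:
            current[-1]["joins"].append(line)
        else:
            current[-1]["shapes"].append(line)
    return definitions

result = parse_definitions(SAMPLE)
```

With the sample above, `result["One"]` holds two alternatives, and the second one carries the join statement that ties the small slash to the top of the vertical line.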
As for the mathematics, only a handful of things are "hard coded":
- ten symbol names ("Zero" through "Nine")
- order of symbols ("Zero" precedes "One" which precedes "Two" which precedes "Three" ... which precedes "Nine")
- "Zero" acts as a placeholder symbol
The operations work by simple increments and decrements based on the order of symbols. Addition is performed symbol by symbol (à la grade school "long addition") by incrementing "Zero" once for every time each operand can be decremented (not so much like grade school). Multiplication is repeated addition, and subtraction and division work mostly as the inverses of addition and multiplication.
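For the curious, number-free addition along these lines might look something like the Python sketch below. This is my own reconstruction from the description above, not Ivan's code: the only things taken from the entry are the ten symbol names, their order, and the increment/decrement idea. The names `NEXT`, `PREV`, `add_symbols`, and `long_add` are hypothetical, and so is the carry handling.

```python
SYMBOLS = ["Zero", "One", "Two", "Three", "Four",
           "Five", "Six", "Seven", "Eight", "Nine"]

# Successor and predecessor tables built purely from the symbol order.
NEXT = dict(zip(SYMBOLS, SYMBOLS[1:]))  # "Zero" -> "One", ..., "Eight" -> "Nine"
PREV = dict(zip(SYMBOLS[1:], SYMBOLS))  # "One" -> "Zero", ..., "Nine" -> "Eight"

def add_symbols(a, b, carry_in=False):
    """Add two symbols by decrementing each operand down to "Zero" and
    incrementing the result once per decrement. Returns (symbol, carry_out)."""
    result, carry = "Zero", False
    operands = [a, b] + (["One"] if carry_in else [])
    for operand in operands:
        while operand != "Zero":
            operand = PREV[operand]
            if result == "Nine":
                # Stepping past "Nine" wraps to "Zero" and carries (assumed behavior).
                result, carry = "Zero", True
            else:
                result = NEXT[result]
    return result, carry

def long_add(left, right):
    """Grade-school long addition over lists of symbols, most significant first."""
    left, right = list(reversed(left)), list(reversed(right))
    out, carry = [], False
    while left or right:
        a = left.pop(0) if left else "Zero"   # "Zero" pads the shorter operand
        b = right.pop(0) if right else "Zero"
        digit, carry = add_symbols(a, b, carry)
        out.append(digit)
    if carry:
        out.append("One")
    return list(reversed(out))

# The calculation from the screen shot: 881 + 456
total = long_add(["Eight", "Eight", "One"], ["Four", "Five", "Six"])
print(total)  # ['One', 'Three', 'Three', 'Seven']
```

Note that no integer ever touches the operands: the tables are derived from the symbol order alone, which is also why "Zero" has to be special-cased as the stopping point and the padding symbol.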
There is a fair amount of code behind OMG!OCRCAL – over 4,000 lines of code in 370+ functions – and from what I’ve seen, it’s a well thought-out, object-oriented approach to developing this solution. It has a handful of fun parts (e.g. probability.h: “typedef Probability Possibility; //Alternate spelling”), but the value in checking it out is educational. I found it very interesting to see how Ivan did things.
As for how he came up with the concept, I had to ask:
First and foremost, Ivan, what drew you to the contest?
I felt an urge to participate in the contest: somehow it’s much more fun to write something creative and... err, useless. I used to write exclusively C++ code, so I felt qualified to give it a shot.
At what point did you decide OCR was the theme?
Later on, actually. My first idea was to use some complicated finite automata configurable using plain English text. Thinking about it a little longer, I decided to eliminate all numbers from the calculation and operate on entirely abstract entities.
Though the rules said that a custom UI was unnecessary, I felt that a perverse program logic requires a unique UI. I asked myself, multiple windows? Nah, too obvious. 3-D interface? Eh. Handwriting? Too complicated… or, was it?
The left and right side of my brain had a quick meeting: if we can describe the shape of numbers, why can't we recognize the mouse movements based on the same description? And hence, the OMG!OCRCAL was born.
How did you go about developing the OMG!OCRCAL?
The first thing I started with was the configuration file. I defined all of the character shapes in it before even writing a single line of code. I was hoping to be able to implement custom shapes (like a “figure-eight”) instead of using two circles, but just ran out of time. Everything is now just lines, circles, and arcs.
After the config file, I implemented the parser… then the shape recognizer… then the join matcher. And actually, I “cheated” a bit: the first UI was built as a C# program, as it was much easier to implement. I later ported it to C++.
Whatever happened to the original finite automata concept?
Well, after 100,000 bytes of code and a few days left before the deadline, the OMG!OCRCAL was nothing more than a UI: it couldn’t even add one plus one. Goodbye automata, hello elementary school addition! It was a bit tight. I found doing mathematics without any numbers to be a bit of a challenge. I would have really liked to do more, but there’s only so much time.
Do you work with OCR in your job?
Not at all. In my day job, I'm a .NET developer at a software engineering company in Moscow, Russia.
Where did you attend university?
I graduated from Moscow State University of Electronics. I grew up here, and work here, and just can’t ever see leaving. It’s a beautiful city with lots of work opportunities for programmers.
And one last question: would you replace Windows Calculator with OMG!OCRCAL?
Sure. It wouldn't make any difference as I calculate using either Google or a sheet of paper. But then again, my program looks like a sheet, so maybe it's worth trying!
Download Entry #100175, OMG!OCRCAL (ZIP File)
UPDATE: Rebuilt the OMG!OCRCAL executable (omgocrcal.exe). It did work fine on Vista, but not on XP. Now it works on XP. I have no idea why...