2010 February 23 / j d a v i s @ c a r l e t o n . e d u

Exam 2

You will submit three files — exam2a.txt, exam2b.py, and exam2c.py — electronically, as usual.

A: Binary

This question requires you to experiment a little with the Python interpreter. You must run your experiments on the lab machines in CMC 306. Type your answer in a file exam2a.txt. (My answer is a few sentences long.)

We have seen that Python has two different notions of integers — small and large:

>>> 2**21
2097152
>>> 2**100
1267650600228229401496703205376L

Determine the power of 2 at which Python switches from small integers to large integers. Explain precisely why Python switches at that power of 2, rather than at some other power of 2.

B: Recursion

The following images display a certain fractal at four different levels of detail: 0, 1, 2, and 3. The fractal is made up of squares. To get from one level to the next, we subdivide each square into five squares in a sort of checkerboard pattern.

In a file exam2b.py, write a function to draw this fractal. Your function must use recursion in a meaningful way, and its definition must begin as follows.

def drawSquareFractal(t, length, detail):
    """Takes as input a Turtle t, a float length, and a nonnegative integer detail. Uses the turtle to draw the square fractal of the given level of detail, with the given overall side length."""

For example, the four images above were produced using using four different calls to drawSquareFractal(), with details of 0, 1, 2, and 3, all with length 100.0. Don't forget to employ comments, docstrings, good Python style, etc.

C: Text Recognition

In this question we will write a function that "reads" text from images. That is, the function takes in an image and outputs the string of text that is pictured in the image. Such systems are used heavily in industry; for example, the U.S. Postal Service sorts mail using computers that automatically recognize the addresses written on envelopes.

Download these two image files: short.gif and long.gif. These are pictures of text. The long one is more interesting than the short one, but while you are debugging your program you may prefer to work with the short one for the sake of speed. Also download these 26 image files: A.gif, B.gif, C.gif, D.gif, E.gif, F.gif, G.gif, H.gif, I.gif, J.gif, K.gif, L.gif, M.gif, N.gif, O.gif, P.gif, Q.gif, R.gif, S.gif, T.gif, U.gif, V.gif, W.gif, X.gif, Y.gif, Z.gif. These are 7x9 pictures of the 26 uppercase English letters, in the same font as the other text images.

For the sake of speed, you will want to convert each of the 26 letter images into a more convenient format — namely, a grid of integers. Let's give a name to such a grid; let's call it a glyph. That is, a glyph is a tuple of seven tuples, each containing nine integers between 0 and 255.

In exam2c.py, write a function to load letter images into glyphs, like this:

def glyph(image):
    """Given a grayscale image, returns the corresponding glyph."""

Also write a function like this:

def text(image, glyphs, threshold):
    """Given a grayscale image, a list of 26 glyphs corresponding to the 26 uppercase English letters (in order), and a threshold value, returns the string of text that is pictured in the image."""

I recommend the following strategy. A glyph is essentially a 7x9 kernel. Scan through all of the interior pixels in the image (where "interior" is suitably defined). For each interior pixel, compare the 7x9 block of pixels around it to each glyph. If the pixels match any glyph well (as defined by the threshold), then figure out which glyph is matched best; that character belongs in your result string.

Don't forget to employ comments, docstrings, good Python style, etc. Your demo code should load the 26 image files into glyphs and use them to recognize the text in long.gif.