2018 September 19,

Carleton College, Joshua R. Davis

There are five problems labeled A-E.

A. Section 2.1 Exercise 30. (In this problem, part a is harder than part b.)

B. Section 2.1 Exercise 36.

C. Section 2.1 Exercise 57ab.

D. A student is taking a multiple-choice exam, in which four options are given for each question. The probability that the student knows the answer to a question is 2/3. If she knows the answer, then she marks it correctly. If she does not know the answer, then she selects an answer uniformly at random. What is the probability that an answer that was marked correctly was not marked randomly (that is, that the student knew the answer)?
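One way to sanity-check a hand-computed answer to Problem D is a small Monte Carlo simulation. The sketch below is only an illustration, not part of the assignment: the function name `simulate`, the trial count, and the seed are all assumptions chosen for the example.

```python
import random

def simulate(trials=200_000, p_know=2 / 3, options=4, seed=0):
    """Estimate P(student knew the answer | answer was marked correctly)."""
    rng = random.Random(seed)  # fixed seed (an arbitrary choice) for repeatability
    correct = 0
    correct_and_known = 0
    for _ in range(trials):
        knows = rng.random() < p_know
        if knows:
            is_correct = True  # she marks a known answer correctly
        else:
            # uniform guess among the options; call option 0 the right one
            is_correct = rng.randrange(options) == 0
        if is_correct:
            correct += 1
            if knows:
                correct_and_known += 1
    return correct_and_known / correct

estimate = simulate()
print(estimate)
```

The printed estimate should land close to the exact value obtained from Bayes' theorem; it does not replace the derivation.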

In class we discussed a one-word Bayesian junk e-mail filter based on the equation

*P*(*J* | *W*) = *P*(*W* | *J*) *P*(*J*) / [*P*(*W* | *J*) *P*(*J*) + *P*(*W* | *J ^{c}*) *P*(*J ^{c}*)].
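As a numeric illustration of the one-word filter, here is a minimal sketch that evaluates Bayes' theorem with *P*(*J ^{c}*) = 1 - *P*(*J*). The function name `posterior_junk` and the three input probabilities are hypothetical values made up for the example, not data from class.

```python
def posterior_junk(p_w_given_j, p_w_given_not_j, p_j):
    """Return P(J | W) from P(W | J), P(W | J^c), and P(J) by Bayes' theorem."""
    numerator = p_w_given_j * p_j
    # Law of total probability: P(W) = P(W | J) P(J) + P(W | J^c) P(J^c)
    denominator = numerator + p_w_given_not_j * (1.0 - p_j)
    return numerator / denominator

# Hypothetical inputs: the word appears in 10% of junk, 1% of non-junk,
# and half of all messages are junk.
p = posterior_junk(p_w_given_j=0.10, p_w_given_not_j=0.01, p_j=0.50)
print(p)
```

With these made-up inputs, seeing the word raises the probability of junk from the prior 0.5 to 10/11, about 0.909.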

Now let's design a two-word spam filter. Let *W*_{1} be the event that a message contains one suspicious word ("Rolex") and *W*_{2} the event that it contains a different suspicious word ("refinance"). To streamline the problem, we make two assumptions of *conditional independence* (Definition 2.5.7):

*P*(*W*_{1} *W*_{2} | *J*) = *P*(*W*_{1} | *J*) *P*(*W*_{2} | *J*) and *P*(*W*_{1} *W*_{2} | *J ^{c}*) = *P*(*W*_{1} | *J ^{c}*) *P*(*W*_{2} | *J ^{c}*).

E. Show that *P*(*J* | *W*_{1} *W*_{2}) = *x* / [*x* + *y*], where

*x* = *P*(*W*_{1} | *J*) *P*(*W*_{2} | *J*) *P*(*J*) and *y* = *P*(*W*_{1} | *J ^{c}*) *P*(*W*_{2} | *J ^{c}*) *P*(*J ^{c}*).
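A numeric consistency check can make the claim in Problem E concrete, although it is no substitute for the proof. The sketch below builds a full joint distribution over (*J*, *W*_{1}, *W*_{2}) under the two conditional independence assumptions, computes *P*(*J* | *W*_{1} *W*_{2}) directly from the definition of conditional probability, and compares it with *x* / [*x* + *y*], taking *y* = *P*(*W*_{1} | *J ^{c}*) *P*(*W*_{2} | *J ^{c}*) *P*(*J ^{c}*). All function names and probability values are hypothetical.

```python
from itertools import product

def joint_table(p_j, p_w1_j, p_w2_j, p_w1_nj, p_w2_nj):
    """Joint P(J=j, W1=a, W2=b), assuming W1 and W2 are conditionally
    independent given J and given J^c."""
    table = {}
    for j, a, b in product([True, False], repeat=3):
        prior = p_j if j else 1.0 - p_j
        q1 = p_w1_j if j else p_w1_nj
        q2 = p_w2_j if j else p_w2_nj
        table[(j, a, b)] = prior * (q1 if a else 1.0 - q1) * (q2 if b else 1.0 - q2)
    return table

def posterior_direct(table):
    """P(J | W1 W2) = P(J W1 W2) / P(W1 W2), read off the joint table."""
    p_w1_w2 = table[(True, True, True)] + table[(False, True, True)]
    return table[(True, True, True)] / p_w1_w2

def posterior_formula(p_j, p_w1_j, p_w2_j, p_w1_nj, p_w2_nj):
    """x / (x + y) as stated in Problem E."""
    x = p_w1_j * p_w2_j * p_j
    y = p_w1_nj * p_w2_nj * (1.0 - p_j)
    return x / (x + y)

# Hypothetical parameters, chosen only for illustration.
params = dict(p_j=0.4, p_w1_j=0.2, p_w2_j=0.3, p_w1_nj=0.01, p_w2_nj=0.05)
table = joint_table(**params)
direct = posterior_direct(table)
formula = posterior_formula(**params)
print(direct, formula)
```

The two printed numbers agree up to floating-point rounding for any valid choice of parameters, which is what the exercise asks you to justify in general.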