2017 September 20

Math 265: Day 4

Carleton College, Prof. Joshua R. Davis

Due at the start of class on Day 8

Complete these problems. Write them up carefully, in the order assigned, for handing in with the rest of your homework.

2.16, 2.19, 2.24
A student is taking a multiple-choice exam in which four options are given for each question. The probability that the student knows the answer to a question is 2/3. If she knows the answer, then she marks it correctly. If she does not know the answer, then she selects an answer uniformly at random. Given that an answer was marked correctly, what is the probability that it was not marked randomly?
Do the problem described below. [Warning: This problem has been re-written in the past two days.]

In class we discussed a one-word Bayesian spam filter based on the equation

P(S | W) = P(W | S) P(S) / [P(W | S) P(S) + P(W | S^c) P(S^c)].
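
As a quick numerical check of the one-word filter, the equation above can be sketched in Python. The function name `spam_posterior` and the probabilities used below are hypothetical, chosen only for illustration; they are not data from the course.

```python
def spam_posterior(p_w_given_s, p_w_given_not_s, p_s):
    """Return P(S | W) for a single word W, using Bayes' theorem.

    p_w_given_s      -- P(W | S), chance that a spam message contains W
    p_w_given_not_s  -- P(W | S^c), chance that a non-spam message contains W
    p_s              -- P(S), prior probability that a message is spam
    """
    numerator = p_w_given_s * p_s
    # P(S^c) = 1 - P(S), so the denominator is the total probability P(W).
    denominator = numerator + p_w_given_not_s * (1.0 - p_s)
    return numerator / denominator


# Hypothetical inputs (assumptions, not measured values):
# 5% of spam contains "Rolex", 0.1% of non-spam does, and 40% of mail is spam.
posterior = spam_posterior(0.05, 0.001, 0.4)
print(posterior)  # roughly 0.97
```

Under these made-up numbers, seeing the word raises the probability of spam from the prior 0.4 to about 0.97, because the word is fifty times more likely in spam than in non-spam.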

Now let's design a two-word spam filter. Let W₁ be the event that a message contains one suspicious word ("Rolex") and W₂ the event that it contains a different suspicious word ("refinance"). To streamline the problem, we impose a simplifying "independence" assumption, that

P(W₁ W₂ | S) = P(W₁ | S) P(W₂ | S) and P(W₁ W₂ | S^c) = P(W₁ | S^c) P(W₂ | S^c).

Show that P(S | W₁ W₂) = x / [x + y], where

x = P(W₁ | S) P(W₂ | S) P(S) and y = P(W₁ | S^c) P(W₂ | S^c) P(S^c).

Finally, derive an analogous n-word spam filter based on P(S | W₁ W₂ ... Wₙ) and a suitable independence assumption.
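
Once the derivation is done, a numerical check can help catch algebra slips. The sketch below assumes one natural choice of independence assumption, namely that the words are conditionally independent given S and given S^c, so that x and y become products over all n words. The function name `spam_posterior_n` and the sample probabilities are hypothetical.

```python
from math import prod  # available in Python 3.8 and later


def spam_posterior_n(p_words_given_s, p_words_given_not_s, p_s):
    """Return P(S | W_1 ... W_n) under a conditional-independence assumption.

    p_words_given_s      -- list of P(W_i | S), one entry per word
    p_words_given_not_s  -- list of P(W_i | S^c), in the same word order
    p_s                  -- P(S), prior probability that a message is spam
    """
    if len(p_words_given_s) != len(p_words_given_not_s):
        raise ValueError("both lists must have one entry per word")
    # x and y generalize the two-word quantities to n factors each.
    x = prod(p_words_given_s) * p_s
    y = prod(p_words_given_not_s) * (1.0 - p_s)
    return x / (x + y)


# Hypothetical two-word inputs for "Rolex" and "refinance" (assumed values).
two_word = spam_posterior_n([0.05, 0.02], [0.001, 0.004], 0.4)
print(two_word)  # roughly 0.994
```

With a single word in each list, this reduces to the one-word filter; with two words, it computes x / [x + y] as defined in the problem.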