[CS241] Asgn2 questions

Eugene Charniak ec at cs.brown.edu
Fri Sep 21 21:47:02 EDT 2007


Justin,

The files 00.strip and 01.strip were missing the empty lines
that signal a story end.  2-9.strip had them.  I have
replaced 00.strip and 01.strip with correct versions.

Other than that I am picking up the Tree::newline signal
just fine.

Your interpretation of the assignment looks right to me.

Eugene

On Fri, Sep 21, 2007 at 06:27:09PM -0400, Justin Palmer wrote:
> Hi Lenora,
> 
> Here's my take on your questions.  I might be wrong...
> 
> For KL divergence, yes, order matters.  D(p || q) != D(q || p).
> That's why KL divergence is not a distance/metric.  So, I'm using the
> order Eugene asked for in the assignment, e.g., D( p(tense) || p(tense
> | tense in prev sent) ).  Regarding the log(0) issue, I ran into it
> too; my understanding is that for entropy calculations, we define 0 *
> log 0 = 0.
> 
> To compute D( p(tense) || p(tense|tense in prev sent) ), I did:
> 
>    KL += p(tense) * log (p(tense) / p(tense | tense in prev sent))
> 
> for all tenses.  So we get 2 numbers saying how useful the two
> conditional probability distributions are as predictors of tense.
> Does that make sense?
> 
> I understand part d to mean calculate p(tense | tense any but prev
> sent).  I'm computing:
> 
>   P(past | past), P(future | past), etc.
> 
> So if you're at sentence i, and it's past, and a past tense also
> occurred in any of the previous sentences other than the last one,
> that counts.  Also, if a future occurred in any sentence other than
> the previous sentence, add another count.  And so on.
> 
> If you'd like to compare numbers, please let me know.
> 
> Also, is anyone else having problems with the Tree::newstory flag?
> It appears that it never gets set, but I'm probably missing something
> obvious.
> 
> Thanks,
> 
>   -- j
> _______________________________________________
> CS241 mailing list
> CS241 at list.cs.brown.edu
> http://list.cs.brown.edu/mailman/listinfo/cs241

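For reference, the KL sum described in the quoted message could be sketched
roughly as follows. This is only an illustration, not the assignment's
reference solution: the function name, the dictionary representation, the
choice of log base 2, and the example probabilities are all assumptions, and
q stands for one conditional distribution (one fixed previous-sentence tense).

```python
import math

def kl_divergence(p, q):
    """Return D(p || q) in bits for two distributions given as dicts.

    Terms with p[x] == 0 are skipped, following the 0 * log 0 = 0 convention.
    If p puts mass on an outcome where q has none, the divergence is infinite.
    """
    total = 0.0
    for outcome, px in p.items():
        if px == 0.0:
            continue  # 0 * log 0 is defined as 0
        qx = q.get(outcome, 0.0)
        if qx == 0.0:
            return math.inf
        total += px * math.log2(px / qx)
    return total

# Made-up numbers, for illustration only (not taken from the assignment data).
p_tense = {"past": 0.5, "present": 0.5, "future": 0.0}
p_tense_given_prev = {"past": 0.25, "present": 0.5, "future": 0.25}

forward = kl_divergence(p_tense, p_tense_given_prev)
backward = kl_divergence(p_tense_given_prev, p_tense)
print(forward, backward)
```

With these made-up inputs the two directions give different values, which is
the asymmetry the message points out.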

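The counting scheme for part d, as read in the quoted message, might look
something like the sketch below. It assumes each sentence carries exactly one
tense label and that the counts are normalized per earlier tense; both are
assumptions for illustration, as are the function name and the sample story.

```python
from collections import Counter, defaultdict

def earlier_tense_conditionals(story):
    """Estimate P(current tense | tense seen in an earlier, non-adjacent sentence).

    `story` is a list of tense labels, one per sentence (an assumption).
    For sentence i, every distinct tense found in sentences 0..i-2 adds one
    count paired with the tense of sentence i.
    """
    pair_counts = defaultdict(Counter)
    for i, current in enumerate(story):
        # Everything before the immediately preceding sentence.
        earlier = set(story[:max(i - 1, 0)])
        for seen in earlier:
            pair_counts[seen][current] += 1
    # Normalize the counts for each earlier tense into a distribution.
    return {
        seen: {cur: n / sum(counts.values()) for cur, n in counts.items()}
        for seen, counts in pair_counts.items()
    }

# Made-up story, for illustration only.
probs = earlier_tense_conditionals(["past", "past", "future", "past"])
print(probs)
```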