[CS241] data sharing
Skip
dave.hirshberg at gmail.com
Mon Oct 29 06:54:18 EDT 2007
An amendment
Where Avram and I have the same path, we have the same tense-counts for all
reasonable tenses (those where tree.vals[8] matches [01][01][1-4]).
When I include unreasonable tenses, we get different counts.
Because I counted instances with unreasonable tenses when I selected the paths
with 10,000 or more instances, we ended up with some different paths.
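
To make the filtering step concrete, here is a minimal sketch of counting
path-instances with and without the reasonable-tense filter. It assumes each
instance is available as a (path, tense) pair, where the tense is the
three-character string from tree.vals[8]. The names is_reasonable, count_paths
and the sample records are hypothetical, not taken from anyone's actual code.

```python
import re
from collections import Counter

# Pattern quoted in the message: two binary flags followed by a digit 1-4.
REASONABLE_TENSE = re.compile(r"[01][01][1-4]")


def is_reasonable(tense):
    """Return True if the whole tense string matches [01][01][1-4]."""
    return REASONABLE_TENSE.fullmatch(tense) is not None


def count_paths(records, reasonable_only=True):
    """Count instances per path, optionally skipping unreasonable tenses.

    records is an iterable of (path, tense) pairs (an assumed layout).
    """
    counts = Counter()
    for path, tense in records:
        if reasonable_only and not is_reasonable(tense):
            continue
        counts[path] += 1
    return counts


# Made-up sample data, only to show how the two counts can diverge.
records = [
    ("^VP^ S VP", "011"),
    ("^VP^ S VP", "105"),  # unreasonable: last digit is outside 1-4
    ("^VP^ VP", "002"),
    ("^VP^ VP", "114"),
]
filtered = count_paths(records)
unfiltered = count_paths(records, reasonable_only=False)
```

If the 10,000-instance cutoff is applied to the unfiltered counts, a path can
pass the cutoff on the strength of instances that the filtered count would
have dropped, which is one way two people can end up with different path lists.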
The following lists the number of path-instances per path-type, as
(mine, Avram's) pairs, for each of our top 12 path-types.
(128041, 115487),
(125588, 88564),
(93718, 72037),
(39514, 38546),
(35535, 31282),
(32972, 25657),
(25879, 23351),
(21643, 18062),
(19329, 17029),
(17293, 10623),
(11499, 4074),
(10465, 521)
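
As a quick way to see how far apart the two sets of counts are, the pairs
above can be compared rank by rank. This is only a sketch: the variable names
are hypothetical, and it pairs counts by rank, not by matching path-type.

```python
# (mine, Avram's) path-instance counts, copied from the list above.
pairs = [
    (128041, 115487),
    (125588, 88564),
    (93718, 72037),
    (39514, 38546),
    (35535, 31282),
    (32972, 25657),
    (25879, 23351),
    (21643, 18062),
    (19329, 17029),
    (17293, 10623),
    (11499, 4074),
    (10465, 521),
]

# Absolute gap at each rank.
gaps = [mine - avrams for mine, avrams in pairs]

for rank, ((mine, avrams), gap) in enumerate(zip(pairs, gaps), start=1):
    # Avram's count as a fraction of mine, to show the relative gap.
    print(f"{rank:2d}  mine={mine:6d}  avram={avrams:6d}  "
          f"gap={gap:6d}  ratio={avrams / mine:.2f}")
```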
These are the corresponding path-pairs (in my notation, ^X^ => X is the
root):
mine                      Avram's
^VP^ SBAR S VP            VP_SBAR_S_VP_topNode_VP0.txt
^VP^ S VP                 VP_S_VP_topNode_VP0.txt
^VP^ VP                   VP_VP_topNode_VP0.txt
VP S ^S^ S VP             VP_S_S_S_VP_topNode_S2.txt
^VP^ NP SBAR S VP         VP_NP_SBAR_S_VP_topNode_VP0.txt
VP ^S^ S VP               VP_S_SBAR_S_VP_topNode_S3.txt
VP ^S^ SBAR S VP          VP_S_S_VP_topNode_S2.txt
^VP^ PP NP SBAR S VP      VP_PP_NP_SBAR_S_VP_topNode_VP0.txt
^VP^ PP S VP              VP_S_SBAR_NP_S_VP_topNode_S4.txt
VP ^S^ NP SBAR S VP       VP_S_SINV_VP_topNode_SINV2.txt
VP ^SINV^ S VP            VP_NP_VP_topNode_VP0.txt
^VP^ NP VP                VP_PP_S_VP_topNode_VP0.txt
Avram:
You said you hadn't counted some paths, and you attached a new file of
P(vp1|vp0,path)/P(vp1|path) values but not a new file of counts.
If the counts you sent out aren't current, can you attach your new counts in
the same format?
On 10/29/07, Skip <dave.hirshberg at gmail.com> wrote:
>
> I get the same paths you have, but different tense-counts.
> 4 or 5 of our paths differ from Avram's, but where Avram and I have the
> same path (our top 6 or so are the same), we have the same tense-counts.
>
> ?
>
> On 10/28/07, Tim St. Clair <tstclair at cs.brown.edu> wrote:
>
> > Here is my initial set of data. It looks like it is different from
> > juris, but I haven't checked it out that closely yet.
> >
> > The listserv would not let me attach it, so here it is in google
> > document format. Let me know if you want a copy of the csv file.
> >
> > http://spreadsheets.google.com/ccc?key=p3coYwZqOPPzwP5bOiMBrBQ&hl=en
> >
> > --
> > Tim St. Clair
> >
> > (617) 460 - 6497
> > _______________________________________________
> > CS241 mailing list
> > CS241 at list.cs.brown.edu
> > http://list.cs.brown.edu/mailman/listinfo/cs241
> >
> >
>