java - Clause Segmentation using Stanford OpenIE
I'm in search of a tool for segmenting complex sentences into clauses. Since I use the CoreNLP tools for parsing, I got to know that OpenIE deals with clause segmentation in the process of extracting relation triples from a sentence. Currently, I use the sample code provided in the OpenIEDemo class from the GitHub repository, but it doesn't segment the sentence into clauses. Here is the code:
    // Create the Stanford CoreNLP pipeline
    Properties props = PropertiesUtils.asProperties(
            "annotators", "tokenize,ssplit,pos,lemma,parse,natlog,openie");
    StanfordCoreNLP pipeline = new StanfordCoreNLP(props);

    // Annotate a sample sentence
    String text = "I don't think he will be able to handle this.";
    Annotation doc = new Annotation(text);
    pipeline.annotate(doc);

    // Loop over the sentences in the document
    int sentNo = 0;
    for (CoreMap sentence : doc.get(CoreAnnotations.SentencesAnnotation.class)) {
      // Run only the clause splitter on this sentence
      List<SentenceFragment> clauses = new OpenIE(props).clausesInSentence(sentence);
      for (SentenceFragment clause : clauses) {
        System.out.println("Clause: " + clause.toString());
      }
    }

I expect the output to be 3 clauses:
- I don't think
- he will be able
- to handle this
Instead, the code returns the exact same input:
- I do n't think he will be able to handle this
However, the sentence
Obama was born in Hawaii and he is no longer our president.
gets 2 clauses:
- Obama was born in Hawaii and he is no longer our president
- he is no longer our president
(It seems the coordinating conjunction is the segmentation indicator.)
Is OpenIE meant to be used for clause segmentation, and if so, how do I use it properly?
Any other practical approaches/tools for clause segmentation are welcome. Thanks in advance.
So, the clause segmenter is a bit more tightly integrated with OpenIE than the name would imply. The goal of the module is to produce logically entailed clauses, which can then be shortened into logically entailed sentence fragments. Going through your 2 examples:
- I don't think he will be able to handle this.
  - None of the 3 clauses, I think, are entailed by the original sentence:
    - "I don't think" -- you still "think," even if you don't think something is true.
    - "he will be able" -- if you "think the world is flat," that doesn't mean the world is flat. Similarly, if you "think he'll be able," that doesn't mean he'll be able.
    - "to handle this" -- I'm not sure this is a clause... I'd group it as "he will be able to handle this," with "able to" being treated as a single verb.

- Obama was born in Hawaii and he is no longer our president.
  - Naturally, the 2 clauses should be "Obama was born in Hawaii" and "he is no longer our president." Nonetheless, the clause splitter outputs the original sentence in place of the first clause, in the expectation that the next step of the OpenIE extractor will strip off the "conj:and" edge.
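
To make the "conj:and" point a bit more concrete, here is a toy sketch of what stripping that edge does to a dependency tree. It does not use CoreNLP and is not how OpenIE implements this step: the Edge class, the splitOnConjAnd helper, and the hand-written edge list (including its relation labels) are assumptions made for illustration, and a real parser may label the edges differently.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class ConjStripSketch {
    /** A single dependency edge between token indices (hypothetical helper type). */
    static final class Edge {
        final int governor;
        final int dependent;
        final String relation;

        Edge(int governor, int dependent, String relation) {
            this.governor = governor;
            this.dependent = dependent;
            this.relation = relation;
        }
    }

    /** Collects the given node and everything below it, sorted by token index. */
    static Set<Integer> subtree(int node, List<Edge> edges) {
        Set<Integer> seen = new TreeSet<>();
        Deque<Integer> todo = new ArrayDeque<>();
        todo.add(node);
        while (!todo.isEmpty()) {
            int current = todo.poll();
            if (seen.add(current)) {
                for (Edge e : edges) {
                    if (e.governor == current) todo.add(e.dependent);
                }
            }
        }
        return seen;
    }

    /** Joins the tokens at the given indices with single spaces. */
    static String text(String[] tokens, Iterable<Integer> indices) {
        List<String> words = new ArrayList<>();
        for (int i : indices) words.add(tokens[i]);
        return String.join(" ", words);
    }

    /** Splits at the root's "conj:and" edge and returns [main clause, conjunct clause]. */
    static List<String> splitOnConjAnd(String[] tokens, List<Edge> edges, int root) {
        Set<Integer> main = subtree(root, edges);
        Set<Integer> conjunct = new TreeSet<>();
        for (Edge e : edges) {
            if (e.governor != root) continue;
            if (e.relation.equals("conj:and")) conjunct.addAll(subtree(e.dependent, edges));
            if (e.relation.equals("cc")) main.remove(e.dependent); // drop the bare "and"
        }
        main.removeAll(conjunct); // strip the conjoined subtree from the main clause
        return Arrays.asList(text(tokens, main), text(tokens, conjunct));
    }

    /** Runs the split on a hand-written (assumed) parse of the second example sentence. */
    static List<String> demo() {
        String[] tokens = {"Obama", "was", "born", "in", "Hawaii", "and",
                           "he", "is", "no", "longer", "our", "president"};
        List<Edge> edges = Arrays.asList(
            new Edge(2, 0, "nsubjpass"), new Edge(2, 1, "auxpass"),
            new Edge(2, 4, "nmod:in"),   new Edge(4, 3, "case"),
            new Edge(2, 5, "cc"),        new Edge(2, 11, "conj:and"),
            new Edge(11, 6, "nsubj"),    new Edge(11, 7, "cop"),
            new Edge(11, 9, "advmod"),   new Edge(9, 8, "neg"),
            new Edge(11, 10, "nmod:poss"));
        return splitOnConjAnd(tokens, edges, 2); // token 2 ("born") is the root
    }

    public static void main(String[] args) {
        for (String clause : demo()) {
            System.out.println("Clause: " + clause);
        }
    }
}
```

With the edges above, the main clause keeps only the tokens that are not under the "conj:and" edge, and the conjunct subtree becomes the second clause. This also shows why the first sentence in the question is not split: its embedded clause hangs off "think" rather than off a coordinating conjunction, so there is no such edge to strip.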