C# · December 31, 2021

Is there a C# utility for matching patterns in (syntactic parse) trees?

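I'm working on a natural language processing (NLP) project in which I use a syntactic parser to create a syntactic parse tree from a given sentence.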

Sample input: I ran into Joe and Jill and then we went shopping
Sample output: [TOP [S [S [NP [PRP I]] [VP [VBD ran] [PP [IN into] [NP [NNP Joe] [CC and] [NNP Jill]]]]] [S [ADVP [RB then]] [NP [PRP we]] [VP [VBD went] [NP [NN shopping]]]]]]

I'm looking for a C# utility that lets me run complex queries such as:

Get the first VBD related to "Joe"
Get the NP closest to "shopping"

Here is a Java utility that does this; I'm looking for a C# equivalent.
Any help would be greatly appreciated.

Solution

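One option is to parse the output into C# objects and then encode the tree as XML: each node emits string.Format("<{0}>", this.Name) as its opening tag and string.Format("</{0}>", this.Name) as its closing tag, with all of its child nodes placed recursively in between.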
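The conversion described above can be sketched roughly as follows. This is a minimal illustration, not the answerer's actual code: the ParseNode type and its Parse and ToXml members are hypothetical names, and it assumes the parser output uses exactly the "[LABEL child ...]" bracket format shown in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Text;

// Hypothetical node type for illustration; not part of any library.
public sealed class ParseNode
{
    public string Name { get; }
    public string Text { get; private set; } = "";
    public List<ParseNode> Children { get; } = new List<ParseNode>();

    public ParseNode(string name) { Name = name; }

    // Parses a single bracketed tree such as "[NP [NNP Joe]]".
    public static ParseNode Parse(string input)
    {
        // Pad brackets with spaces so a whitespace split yields clean tokens.
        string[] tokens = input.Replace("[", " [ ").Replace("]", " ] ")
            .Split((char[])null, StringSplitOptions.RemoveEmptyEntries);
        int pos = 0;
        ParseNode root = ParseAt(tokens, ref pos);
        if (pos != tokens.Length)
            throw new FormatException("Unexpected input after the root node.");
        return root;
    }

    private static ParseNode ParseAt(string[] tokens, ref int pos)
    {
        if (pos >= tokens.Length || tokens[pos] != "[")
            throw new FormatException("Expected '['.");
        pos++;
        if (pos >= tokens.Length || tokens[pos] == "[" || tokens[pos] == "]")
            throw new FormatException("Expected a node label.");
        var node = new ParseNode(tokens[pos++]);

        while (pos < tokens.Length && tokens[pos] != "]")
        {
            if (tokens[pos] == "[")
            {
                node.Children.Add(ParseAt(tokens, ref pos)); // nested constituent
            }
            else
            {
                // Leaf word, e.g. "Joe"; multiple words are joined with spaces.
                string word = tokens[pos++];
                node.Text = node.Text.Length == 0 ? word : node.Text + " " + word;
            }
        }

        if (pos >= tokens.Length)
            throw new FormatException("Missing ']'.");
        pos++; // consume the closing "]"
        return node;
    }

    // Opening tag, escaped leaf text, all children recursively, closing tag.
    public string ToXml()
    {
        var sb = new StringBuilder();
        sb.Append(string.Format("<{0}>", Name));
        sb.Append(WebUtility.HtmlEncode(Text));
        foreach (ParseNode child in Children)
            sb.Append(child.ToXml());
        sb.Append(string.Format("</{0}>", Name));
        return sb.ToString();
    }
}
```

With this sketch, ParseNode.Parse("[NP [NNP Joe] [CC and] [NNP Jill]]").ToXml() yields <NP><NNP>Joe</NNP><CC>and</CC><NNP>Jill</NNP></NP>. One caveat: part-of-speech labels that are not valid tag names (for example PRP$ or punctuation tags) would need to be mapped to safe names before the result is handed to an XML or HTML parser.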

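After doing this, I would use a tool for querying XML/HTML to query the tree. Thousands of people already use query selectors and jQuery to navigate tree-like structures based on the relationships between nodes. I think this is far superior to TRegex or other outdated and unmaintained Java utilities.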

For example, this answers your first query:

    using System;
    using System.Linq;
    using CsQuery; // CQ and the Cq() extension come from the CsQuery library

    // d is the root of the parsed tree; ToXml() is the encoding step described above
    var xml = CQ.Create(d.ToXml());

    // This can be simpler with CSS selectors, but I chose LINQ since you'll probably find it easier.
    // Find joe: in our case, the node that has the text 'Joe'
    var joe = xml["*"].First(x => x.InnerHTML.Equals("Joe"));

    // Find the last (deepest) element that meets the criteria of containing both "Joe" and a VBD;
    // in our case, the VP
    var closestToVbd = xml["*"].Last(x => x.Cq().Has(joe).Has("VBD").Any());
    Console.WriteLine("Closest node to VBD:\n " + closestToVbd.OuterHTML);

    // If we want the VBD itself, we can just find the VBD inside that element
    Console.WriteLine("\n\n VBD itself is " + closestToVbd.Cq().Find("VBD")[0].OuterHTML);

And this is your second query:

    // Now for the NP closest to 'shopping': find the element with the text 'shopping',
    // then find its closest NP ancestor
    var closest = xml["*"].First(x => x.InnerHTML.Equals("shopping")).Cq()
        .Closest("NP")[0].OuterHTML;
    Console.WriteLine("\n\n NP closest to shopping is: " + closest);