XPath-based Parsing Framework (XPaF) is a simple, convenient, open-source parsing framework that makes it easy to extract relations (subject-predicate-object triples) from HTML and XML documents.
Code example:
<table>
  <tr>
    <td class="name">Aaron</td>
    <td class="occ">Engineer</td>
  </tr>
  <tr>
    <td class="name">Jennifer</td>
    <td class="occ">Archeologist</td>
  </tr>
</table>

parser_name: "my_parser"
relation_tmpls {
  subject: "//td[@class='name']"
  predicate: "occupation"
  object: "//td[@class='occ']"
  subject_cardinality: MANY
  object_cardinality: MANY
}
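To illustrate the kind of triples such a template describes, here is a minimal Python sketch. It does not use XPaF or its API; it only mimics the idea with the standard library's xml.etree.ElementTree. The function name extract_triples is hypothetical, and pairing subjects with objects in document order when both cardinalities are MANY is an assumption made for this sketch.

```python
import xml.etree.ElementTree as ET

# The sample table from the example above (well-formed, so an XML parser can read it).
MARKUP = """
<table>
  <tr><td class="name">Aaron</td><td class="occ">Engineer</td></tr>
  <tr><td class="name">Jennifer</td><td class="occ">Archeologist</td></tr>
</table>
"""


def extract_triples(markup, subject_xpath, predicate, object_xpath):
    """Hypothetical helper: return (subject, predicate, object) tuples.

    Assumption: with MANY subjects and MANY objects, the i-th subject is
    paired with the i-th object, so both lists must have the same length.
    """
    root = ET.fromstring(markup)
    subjects = [el.text for el in root.findall(subject_xpath)]
    objects = [el.text for el in root.findall(object_xpath)]
    if len(subjects) != len(objects):
        raise ValueError("subject and object counts differ")
    return [(s, predicate, o) for s, o in zip(subjects, objects)]


# ElementTree supports only relative paths, hence ".//td" instead of "//td".
triples = extract_triples(
    MARKUP,
    ".//td[@class='name']",
    "occupation",
    ".//td[@class='occ']",
)

for triple in triples:
    print(triple)
# ('Aaron', 'occupation', 'Engineer')
# ('Jennifer', 'occupation', 'Archeologist')
```

Each table row yields one triple, with the literal predicate "occupation" linking the name cell to the occupation cell.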