KQA Pro

A Large-Scale, Diverse, Challenging Dataset of Complex Question Answering over Knowledge Base

What is KQA Pro

KQA Pro is a large-scale dataset for complex question answering over a knowledge base. The questions are diverse and challenging, requiring multiple reasoning capabilities, including compositional reasoning, multi-hop reasoning, quantitative comparison, and set operations. Each question is annotated with strong supervision in the form of a SPARQL query and a program.
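
To make these reasoning types concrete, here is a minimal Python sketch of a question answered by composing small functions over a toy knowledge base. Everything in it is an assumption made for illustration: the facts, the relation names, and the helper functions (find, relate, select_max) are hypothetical and do not reproduce KQA Pro's actual program vocabulary or data format.

```python
# Toy knowledge base: (entity, relation) -> set of values.
# All entities and facts below are invented for illustration only.
TOY_KB = {
    ("Alice", "born_in"): {"Springfield"},
    ("Bob", "born_in"): {"Shelbyville"},
    ("Springfield", "population"): {30000},
    ("Shelbyville", "population"): {45000},
}


def find(name):
    """Start a program from a single named entity (hypothetical helper)."""
    return {name}


def relate(entities, relation):
    """Follow one relation from every entity in the set (one reasoning hop)."""
    result = set()
    for entity in entities:
        result |= TOY_KB.get((entity, relation), set())
    return result


def attribute_value(entity, attribute):
    """Return the single value stored for an attribute of an entity."""
    (value,) = TOY_KB[(entity, attribute)]
    return value


def select_max(entities, attribute):
    """Quantitative comparison: pick the entity with the largest attribute."""
    return max(entities, key=lambda e: attribute_value(e, attribute))


# Question: "Which is more populous, Alice's birthplace or Bob's birthplace?"
# Hop 1 (relate), a set union, then a quantitative comparison.
candidates = relate(find("Alice"), "born_in") | relate(find("Bob"), "born_in")
answer = select_max(candidates, "population")
print(answer)
```

The point of the sketch is only that a complex question decomposes into a short chain of simple, inspectable steps, which is the kind of supervision a program annotation can provide.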

The target knowledge base of KQA Pro is a dense subset of Wikidata. The questions are split into training, validation, and test sets of 94,376, 11,797, and 11,797 questions, respectively.
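
As a quick sanity check on the split sizes quoted above, the following short Python sketch computes the total number of questions and the share of each split. The dictionary keys are just labels chosen for this example.

```python
# Split sizes as stated in the dataset description.
splits = {"train": 94_376, "validation": 11_797, "test": 11_797}

total = sum(splits.values())
shares = {name: count / total for name, count in splits.items()}

print(f"total questions: {total}")
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```

The three splits add up to 117,970 questions, which corresponds to an 80/10/10 division.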

Download

[version 1.0 / size 25MB]

We implement several baselines and release them in our GitHub repository. Try them out for a quick start on KQA Pro!

Citation

@inproceedings{KQAPro,
  title={{KQA P}ro: A Large Diagnostic Dataset for Complex Question Answering over Knowledge Base},
  author={Cao, Shulin and Shi, Jiaxin and Pan, Liangming and Nie, Lunyiu and Xiang, Yutong and Hou, Lei and Li, Juanzi and He, Bin and Zhang, Hanwang},
  booktitle={ACL'22},
  year={2022}
}
