Leaderboard

Submit your predictions and get your ranking.

How to submit

Predict an answer for every question in the test set and write the answers to a text file, in the same order as the questions. Here is an example. Then email the prediction file to caosl19@mails.tsinghua.edu.cn. We will reply with your model's performance as soon as possible.
If you would like to appear on the leaderboard, include the following information in your email:

  • Model name
  • Affiliation
  • Setting: open-ended or multiple-choice
  • Whether SPARQL supervision is used
  • Whether program supervision is used
  • Single model or ensemble
  • (optional) Paper link
  • (optional) Conference or journal where the paper was accepted
  • (optional) Code link
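
As a rough illustration of the first step, the sketch below writes predictions to a text file in test-set order. It assumes one answer per line, which the instructions above do not state explicitly, so check the linked example file for the exact format before submitting. The function name `write_predictions` and the file name `predict.txt` are hypothetical.

```python
from pathlib import Path
from typing import Sequence


def write_predictions(answers: Sequence[str], path: str) -> int:
    """Write predicted answers to a text file, keeping test-set order.

    ASSUMPTION: the expected format is one answer per line. This is a
    guess based on "write them in a text file in order", not a confirmed
    specification.
    """
    lines = []
    for index, answer in enumerate(answers):
        text = str(answer).strip()
        # An answer spanning several lines would shift every later answer
        # by one position, so reject it instead of writing it silently.
        if len(text.splitlines()) > 1:
            raise ValueError(f"answer {index} contains a line break")
        # Empty answers are kept as empty lines so that line i still
        # corresponds to question i.
        lines.append(text)
    Path(path).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return len(lines)


# Usage (hypothetical answers, in the same order as the test questions):
# write_predictions(["Paris", "42", "yes"], "predict.txt")
```

The returned count can be compared with the number of test questions before sending the file, since a missing or extra line would misalign every answer after it.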

Open-Ended Setting
| Rank | Date | Model | Tags | Affiliation / Links | Overall | Multi-hop | Qualifier | Comparison | Logical | Count | Verify | Zero-shot |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| # | | Human | | | 97.50 | 97.24 | 95.65 | 100.00 | 98.18 | 83.33 | 95.24 | 100.00 |
| 1 | 08-Jun-22 | GRC-MTL | Program, Ensemble | Harbin Institute of Technology (ITNLP Lab) | 93.85 | 92.87 | 92.13 | 96.68 | 93.71 | 89.32 | 91.09 | 93.85 |
| 2 | 21-Nov-21 | XT-MTL | Program | Beihang University (ACT Lab) | 92.45 | 91.37 | 87.95 | 96.59 | 90.26 | 87.51 | 94.06 | 91.37 |
| 3 | 14-May-22 | GRC_Model | Program, Ensemble | Harbin Institute of Technology (ITNLP Lab) | 91.94 | 90.78 | 88.94 | 94.76 | 91.13 | 88.56 | 90.26 | 87.37 |
| 4 | 21-Oct-20 | BART Program | Program | [paper][code] | 90.55 | 89.46 | 84.76 | 95.51 | 89.30 | 86.68 | 93.30 | 89.59 |
| 5 | 02-Jun-21 | GtReason | Program | Huazhong University of Science and Technology (CCIIP Lab) | 89.95 | 88.59 | 83.98 | 95.79 | 88.79 | 86.53 | 89.57 | 89.78 |
| 6 | 21-Oct-20 | BART SPARQL | SPARQL | [paper][code] | 89.68 | 88.49 | 83.09 | 96.12 | 88.67 | 85.78 | 92.33 | 87.88 |
| 7 | 21-Oct-20 | RNN Program | Program | | 43.85 | 37.71 | 22.19 | 65.90 | 47.45 | 50.04 | 42.13 | 34.96 |
| 8 | 21-Oct-20 | RNN SPARQL | SPARQL | | 41.98 | 36.01 | 19.04 | 66.98 | 37.74 | 50.26 | 58.84 | 26.08 |
| 9 | 21-Oct-20 | RGCN | | [ESWC18] | 35.07 | 34.00 | 27.61 | 30.03 | 35.85 | 41.91 | 65.88 | 0.00 |
| 10 | 21-Oct-20 | Blind GRU | | | 34.36 | 33.25 | 28.82 | 25.77 | 34.17 | 39.43 | 62.15 | 0.06 |
| 11 | 21-Oct-20 | EmbedKGQA | | [ACL20][code] | 28.36 | 26.41 | 25.20 | 11.93 | 23.95 | 32.88 | 61.05 | 0.06 |
| 12 | 21-Oct-20 | KVMemNet | | [EMNLP16] | 16.61 | 16.50 | 18.47 | 1.17 | 14.99 | 27.31 | 54.70 | 0.06 |

Multiple-Choice Setting
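| Rank | Date | Model | Links | Overall | Multi-hop | Qualifier | Comparison | Logical | Count | Verify | Zero-shot |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| # | | Human | | 97.50 | 97.24 | 95.65 | 100.00 | 98.18 | 83.33 | 95.24 | 100.00 |
| 1 | 21-Oct-20 | RGCN | [ESWC18] | 53.75 | 51.70 | 58.24 | 48.64 | 55.88 | 47.93 | 66.57 | 1.46 |
| 2 | 21-Oct-20 | Blind GRU | | 53.23 | 51.00 | 58.88 | 47.85 | 54.47 | 45.82 | 62.85 | 1.65 |
| 3 | 21-Oct-20 | EmbedKGQA | [ACL20][code] | 45.19 | 42.39 | 54.66 | 23.01 | 42.36 | 36.34 | 61.95 | 8.88 |
| 4 | 21-Oct-20 | KVMemNet | [EMNLP16] | 39.15 | 36.78 | 50.09 | 18.71 | 38.61 | 34.69 | 56.08 | 6.73 |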