You need to predict answers for all questions in the test set and write them to a text file, in order. Here is an
example. Then send the prediction file
to us by email at caosl19@mails.tsinghua.edu.cn. We will reply with the performance
as soon as possible.
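
The step of writing predictions to a text file in order can be sketched as follows. This is a minimal sketch, not the organizers' tooling: the helper name `write_predictions`, the file name `predict.txt`, and the one-answer-per-line layout are assumptions made for illustration, so check them against the linked example before submitting.

```python
from pathlib import Path


def write_predictions(answers, path="predict.txt"):
    """Write one answer per line, keeping the order of the test questions.

    Assumption: the expected layout is one answer per line. The file name
    and this helper are hypothetical, not part of any official tooling.
    """
    lines = []
    for index, answer in enumerate(answers):
        text = str(answer).strip()
        # A line break inside an answer would shift every later line,
        # so reject it instead of silently misaligning the file.
        if "\n" in text or "\r" in text:
            raise ValueError(f"answer {index} contains a line break")
        lines.append(text)
    # Terminate every line so the file ends with a newline.
    Path(path).write_text("".join(line + "\n" for line in lines), encoding="utf-8")
    return len(lines)
```

Calling `write_predictions(answers)` with the answers listed in test-set order returns the number of lines written, which can be compared against the number of test questions before sending the file.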
If you would like to participate in the leaderboard, you must provide the following information
in your email:
| Rank (Date) | Model | Overall | Multi-hop | Qualifier | Comparison | Logical | Count | Verify | Zero-shot |
|---|---|---|---|---|---|---|---|---|---|
| # | Human | 97.50 | 97.24 | 95.65 | 100.00 | 98.18 | 83.33 | 95.24 | 100.00 |
| 1 (8-June-22) | GRC-MTL (Program, Ensemble), Harbin Institute of Technology (ITNLP Lab) | 93.85 | 92.87 | 92.13 | 96.68 | 93.71 | 89.32 | 91.09 | 93.85 |
| 2 (21-Nov-21) | XT-MTL (Program), Beihang University (ACT Lab) | 92.45 | 91.37 | 87.95 | 96.59 | 90.26 | 87.51 | 94.06 | 91.37 |
| 3 (14-May-22) | GRC_Model (Program, Ensemble), Harbin Institute of Technology (ITNLP Lab) | 91.94 | 90.78 | 88.94 | 94.76 | 91.13 | 88.56 | 90.26 | 87.37 |
| 4 (21-Oct-20) | BART Program (Program) [paper][code] | 90.55 | 89.46 | 84.76 | 95.51 | 89.30 | 86.68 | 93.30 | 89.59 |
| 5 (02-Jun-21) | GtReason (Program), Huazhong University of Science and Technology (CCIIP Lab) | 89.95 | 88.59 | 83.98 | 95.79 | 88.79 | 86.53 | 89.57 | 89.78 |
| 6 (21-Oct-20) | BART SPARQL (SPARQL) [paper][code] | 89.68 | 88.49 | 83.09 | 96.12 | 88.67 | 85.78 | 92.33 | 87.88 |
| 7 (21-Oct-20) | RNN Program (Program) | 43.85 | 37.71 | 22.19 | 65.90 | 47.45 | 50.04 | 42.13 | 34.96 |
| 8 (21-Oct-20) | RNN SPARQL (SPARQL) | 41.98 | 36.01 | 19.04 | 66.98 | 37.74 | 50.26 | 58.84 | 26.08 |
| 9 (21-Oct-20) | RGCN [ESWC18] | 35.07 | 34.00 | 27.61 | 30.03 | 35.85 | 41.91 | 65.88 | 0.00 |
| 10 (21-Oct-20) | Blind GRU | 34.36 | 33.25 | 28.82 | 25.77 | 34.17 | 39.43 | 62.15 | 0.06 |
| 11 (21-Oct-20) | EmbedKGQA [ACL20][code] | 28.36 | 26.41 | 25.20 | 11.93 | 23.95 | 32.88 | 61.05 | 0.06 |
| 12 (21-Oct-20) | KVMemNet [EMNLP16] | 16.61 | 16.50 | 18.47 | 1.17 | 14.99 | 27.31 | 54.70 | 0.06 |


| Rank (Date) | Model | Overall | Multi-hop | Qualifier | Comparison | Logical | Count | Verify | Zero-shot |
|---|---|---|---|---|---|---|---|---|---|
| # | Human | 97.50 | 97.24 | 95.65 | 100.00 | 98.18 | 83.33 | 95.24 | 100.00 |
| 1 (21-Oct-20) | RGCN [ESWC18] | 53.75 | 51.70 | 58.24 | 48.64 | 55.88 | 47.93 | 66.57 | 1.46 |
| 2 (21-Oct-20) | Blind GRU | 53.23 | 51.00 | 58.88 | 47.85 | 54.47 | 45.82 | 62.85 | 1.65 |
| 3 (21-Oct-20) | EmbedKGQA [ACL20][code] | 45.19 | 42.39 | 54.66 | 23.01 | 42.36 | 36.34 | 61.95 | 8.88 |
| 4 (21-Oct-20) | KVMemNet [EMNLP16] | 39.15 | 36.78 | 50.09 | 18.71 | 38.61 | 34.69 | 56.08 | 6.73 |