MAVEN

A Massive General Domain Event Detection Dataset.

What is MAVEN?

MAVEN is an event detection dataset with massive, general-domain, human-annotated data. It contains 4,480 documents, 118,732 event mentions, and 168 event types. It was collected by researchers at Tsinghua University and WeChat AI.

For more details, please refer to our EMNLP 2020 paper:

(Wang et al., 2020)

Quick Start

MAVEN is distributed under a CC BY-SA 4.0 License. The dataset can be obtained from the links below:

Tsinghua Cloud
Google Drive

For the reference code and the detailed data format, please refer to our GitHub repository.

GitHub repo

Once you have finished your model, you can submit your predictions to our competition hosted on CodaLab.

CodaLab

If you want your results to appear on the official leaderboard here, please read the following guideline.

Leaderboard Guideline

Citation

If you use MAVEN in your research, please cite our paper.

@inproceedings{ wang2020MAVEN,
  title={ {MAVEN}: A Massive General Domain Event Detection Dataset },
  author={ Wang, Xiaozhi and Wang, Ziqi and Han, Xu and Jiang, Wangyi and Han, Rong and Liu, Zhiyuan and Li, Juanzi and Li, Peng and Lin, Yankai and Zhou, Jie },
  booktitle={ Proceedings of EMNLP 2020 },
  year={2020}
}
            
Leaderboard

Rank  Date         Model                             Institution                 Links            Micro F1  Micro Precision  Micro Recall  Macro F1  Macro Precision  Macro Recall
1     Oct 8, 2020  BERT+CRF (single) 10              Tsinghua University         [paper] [code]   67.8      65               70.9          0         0                0
2     Oct 8, 2020  DMBERT (single) 10                Tsinghua University         [paper] [code]   67.1      62.7             72.3          0         0                0
3     Oct 8, 2020  BiLSTM+CRF (single) 10            Tsinghua University         [paper] [code]   64.1      63.4             64.8          0         0                0
4     Oct 8, 2020  MOGANED (single) (reproduced) 10  Chinese Academy of Science  [paper] [code]   63.8      63.4             64.1          0         0                0
5     Oct 8, 2020  BiLSTM (single) 10                Tsinghua University         [paper] [code]   62.8      59.8             67            0         0                0
6     Oct 8, 2020  DMCNN (single) (reproduced) 10    Chinese Academy of Science  [paper] [code]   60.6      66.3             55.9          0         0                0
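
The leaderboard reports micro- and macro-averaged precision, recall, and F1. The sketch below shows one common way to compute these from per-candidate labels; it is not the official MAVEN scorer, which lives in the GitHub repository and may differ in detail. The function names, the "None" label for non-events, the "TypeA"/"TypeB" labels, and the choice of macro F1 as the mean of per-type F1 scores are all assumptions made for illustration.

```python
from collections import Counter


def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def prf(tp, fp, fn):
    """Return (precision, recall, F1) from raw counts."""
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    return p, r, f1_score(p, r)


def micro_macro(gold, pred, negative="None"):
    """Micro and macro (P, R, F1) over event types.

    gold, pred: equal-length lists of labels, one per candidate trigger.
    negative:   hypothetical label marking "no event" (an assumption).
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            if g != negative:
                tp[g] += 1
        else:
            if p != negative:
                fp[p] += 1  # predicted a type that is not the gold type
            if g != negative:
                fn[g] += 1  # missed the gold type
    types = sorted(set(tp) | set(fp) | set(fn))

    # Micro: pool the counts over all types, then compute P/R/F1 once.
    micro = prf(sum(tp.values()), sum(fp.values()), sum(fn.values()))

    # Macro: compute P/R/F1 per type, then take the unweighted mean.
    per_type = [prf(tp[t], fp[t], fn[t]) for t in types]
    if per_type:
        macro = tuple(sum(s[i] for s in per_type) / len(per_type) for i in range(3))
    else:
        macro = (0.0, 0.0, 0.0)
    return micro, macro


# Toy example with made-up labels (not real MAVEN annotations).
gold = ["TypeA", "TypeA", "TypeB", "None"]
pred = ["TypeA", "TypeB", "TypeB", "TypeA"]
micro, macro = micro_macro(gold, pred)
print("micro P/R/F1:", [round(x, 4) for x in micro])  # [0.5, 0.6667, 0.5714]
print("macro P/R/F1:", [round(x, 4) for x in macro])  # [0.5, 0.75, 0.5833]

# Sanity check against the first leaderboard row (P = 65, R = 70.9).
print(round(f1_score(65, 70.9), 1))  # 67.8
```

Because the leaderboard values are rounded to one decimal place, recomputing F1 from the listed precision and recall can differ from the listed F1 in the last digit for some rows.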