VLUE
Vietnamese Language Understanding Evaluation benchmark
To establish a standardized set of benchmarks for Vietnamese NLU, we introduce the first Vietnamese Language Understanding Evaluation (VLUE) benchmark. VLUE encompasses five datasets covering different NLU tasks, including text classification, span extraction, and natural language inference. The standard version of VLUE is a collection of five language understanding tasks in Vietnamese: UIT-ViQuAD 2.0, ViNLI, VSMEC, ViHOS, and NIIVTB POS. The goal of VLUE is to provide a set of high-quality benchmarks for assessing the Vietnamese language understanding of newly proposed models.
| Rank | Date | Name | Model | Score | UIT-ViQuAD 2.0 (EM) | UIT-ViQuAD 2.0 (F1) | ViNLI (Acc) | ViNLI (F1) | VSMEC (F1) | ViHOS (F1) | NIIVTB POS (F1) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | January 1, 2024 | The UIT NLP Group, University of Information Technology | CafeBERT | 77.51 | 65.25 | 76.36 | 86.11 | 86.16 | 66.12 | 78.56 | 86.04 |
| 2 | January 1, 2024 | | XLM-Roberta large | 76.53 | 64.71 | 75.36 | 85.99 | 86.10 | 62.24 | 77.70 | 83.62 |
| 3 | January 1, 2024 | | PhoBERT large | 73.07 | 57.27 | 70.88 | 80.67 | 80.69 | 65.44 | 77.16 | 79.36 |
| 4 | January 1, 2024 | | PhoBERT base | 69.22 | 51.00 | 64.29 | 78.00 | 78.05 | 59.91 | 75.69 | 77.60 |
| 5 | January 1, 2024 | | XLM-Roberta base | 68.84 | 50.49 | 59.23 | 76.83 | 77.01 | 61.89 | 74.67 | 81.76 |
| 6 | January 1, 2024 | | mBERT | 67.90 | 52.34 | 63.71 | 73.45 | 73.62 | 54.59 | 76.22 | 81.34 |
| 7 | January 1, 2024 | | WikiBERT | 63.95 | 42.16 | 52.62 | 71.18 | 71.49 | 57.64 | 77.05 | 75.52 |
| 8 | January 1, 2024 | | DistilBERT | 58.62 | 35.78 | 53.83 | 44.39 | 66.77 | 53.83 | 75.72 | 80.05 |
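The overall Score appears to be the unweighted mean of the seven per-task metrics (the benchmark page does not state the formula explicitly, so this is an inferred reconstruction). A minimal sketch checking this against the XLM-Roberta large row:

```python
# Hypothetical reconstruction: Score = unweighted mean of the seven
# per-task metrics (ViQuAD EM, ViQuAD F1, ViNLI Acc, ViNLI F1,
# VSMEC F1, ViHOS F1, NIIVTB POS F1), rounded to two decimals.
def vlue_score(metrics):
    """Average the seven task metrics into one leaderboard score."""
    return round(sum(metrics) / len(metrics), 2)

# Metrics for the XLM-Roberta large row, in leaderboard column order.
xlmr_large = [64.71, 75.36, 85.99, 86.10, 62.24, 77.70, 83.62]
print(vlue_score(xlmr_large))  # 76.53, matching the reported Score
```

Most rows round to exactly the reported Score under this formula, which supports the equal-weight reading; rows that differ by a few hundredths may reflect rounding in the published per-task numbers.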
You can download the VLUE evaluation script:
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License
Please cite our paper as below if you use the VLUE benchmark.
For questions, please email vlue.benchmark.vn@gmail.com.