PersianQA: a dataset for Persian Question Answering

Persian Question Answering (PersianQA) Dataset is a reading comprehension dataset on Persian Wikipedia. The crowd-sourced dataset consists of more than 9,000 entries. Each entry is either an impossible-to-answer question or a question with one or more answers spanning the passage (the context) from which the questioner proposed the question. Much like the SQuAD2.0 dataset, the impossible or unanswerable questions can be utilized to create a system which "knows that it doesn't know the answer".

Moreover, the dataset has 900 test entries available. On top of that, the very first models trained on the dataset, Transformers, are available online.

All the crowdworkers of the dataset are native Persian speakers. It is also worth mentioning that the contexts are collected from all categories of the Wiki (Historical, Religious, Geography, Science, etc.).

At the moment, each context has 7 pairs of questions with one answer and 3 impossible questions.

As mentioned before, the dataset is inspired by the famous SQuAD2.0 dataset and is compatible with and can be merged into it. It also has some relative advantages over the original inspiration source, some of which are:

- Increased number of articles (despite having less data).
- More questions per context (7 compared to 5).
- Including informal ("Mohaaverei") entries.
- More varied answers (names, locations, dates and more).

We train a baseline model which achieves an F1 score of 78 and an exact match ...

You can check out an online Demo on Google Colab.

You can find the dataset under the dataset directory and use it like below:

    from datasets import load_dataset
    dataset = load_dataset("SajjadAyoubi/persian_qa")

The dataset is also available at Kaggle.
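To make the F1 and exact-match figures more concrete, here is a minimal sketch of how such scores are typically computed for SQuAD2.0-style data, including the unanswerable case. It is not the authors' evaluation code. The record layout (an `answers` field holding a `text` list that is empty for impossible questions) is an assumption based on the SQuAD2.0 convention, and the three records and predictions are invented for illustration. Unlike the official SQuAD script, which also strips punctuation and English articles, this version only normalizes whitespace.

```python
from __future__ import annotations

from collections import Counter


def normalize(text: str) -> str:
    """Collapse runs of whitespace; a deliberately simple normalization."""
    return " ".join(text.split())


def exact_match(prediction: str, gold_answers: list[str]) -> float:
    """1.0 if the prediction matches any gold answer, else 0.0.

    For an unanswerable question (no gold answers), the only correct
    prediction is the empty string.
    """
    pred = normalize(prediction)
    if not gold_answers:
        return float(pred == "")
    return float(any(pred == normalize(g) for g in gold_answers))


def token_f1(prediction: str, gold: str) -> float:
    """Harmonic mean of token-level precision and recall."""
    pred_tokens = prediction.split()
    gold_tokens = gold.split()
    if not pred_tokens or not gold_tokens:
        return float(pred_tokens == gold_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


def best_f1(prediction: str, gold_answers: list[str]) -> float:
    """Best F1 over all gold answers; empty prediction wins if unanswerable."""
    if not gold_answers:
        return float(normalize(prediction) == "")
    return max(token_f1(prediction, g) for g in gold_answers)


def evaluate(records: list[dict], predictions: dict[str, str]) -> dict[str, float]:
    """Average both metrics over all records, reported as percentages."""
    em_total = sum(exact_match(predictions[r["id"]], r["answers"]["text"]) for r in records)
    f1_total = sum(best_f1(predictions[r["id"]], r["answers"]["text"]) for r in records)
    n = len(records)
    return {"exact_match": 100 * em_total / n, "f1": 100 * f1_total / n}


# Hypothetical records in an assumed SQuAD2.0-style layout (not real dataset rows).
records = [
    {"id": "a", "answers": {"text": ["the city of Shiraz"], "answer_start": [10]}},
    {"id": "b", "answers": {"text": [], "answer_start": []}},  # impossible question
    {"id": "c", "answers": {"text": ["1925"], "answer_start": [42]}},
]

# Hypothetical model outputs; "" means the model abstained.
predictions = {"a": "Shiraz", "b": "", "c": "1925"}

if __name__ == "__main__":
    print(evaluate(records, predictions))
```

A partially correct span such as "Shiraz" earns partial F1 credit but no exact-match credit, which is why the F1 score of a model is usually higher than its exact-match score. Abstaining on an impossible question counts as fully correct under both metrics.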