diff --git a/README.md b/README.md
index 61b9c59..c310d9e 100644
--- a/README.md
+++ b/README.md
@@ -73,19 +73,21 @@ dataset_info:
   dataset_size: 105611404
 ---
 
-# Dataset Card for Dataset Name
+# Dataset Card for OASST1
 
 ## Dataset Description
 
-- **Homepage:** 
-- **Repository:** 
-- **Paper:** 
-- **Leaderboard:** 
-- **Point of Contact:** 
+- **Homepage:** https://www.open-assistant.io/
+- **Repository:** https://github.com/LAION-AI/Open-Assistant
+- **Paper:** TBA
 
 ### Dataset Summary
 
-This dataset card aims to be a base template for new datasets. It has been generated using [this raw template](https://github.com/huggingface/huggingface_hub/blob/main/src/huggingface_hub/templates/datasetcard_template.md?plain=1).
+In an effort to democratize research on large-scale alignment, we release OpenAssistant
+Conversations (OASST1), a human-generated, human-annotated assistant-style conversation
+corpus consisting of 161,443 messages distributed across 66,497 conversation trees, in
+35 different languages, annotated with 461,292 quality ratings. The corpus is a product
+of a worldwide crowd-sourcing effort involving over 13,500 volunteers.
 
 ### Supported Tasks and Leaderboards
 