Update README.md
parent 1217df5932
commit bf7d7cc428
26
README.md
@@ -18,8 +18,8 @@ size_categories:
 - 10M<n<100M
 ---
 ## Table of Contents
-- [Dataset Attribution](#dataset-attribution)
 - [Dataset Summary](#dataset-summary)
+- [Dataset Attribution](#dataset-attribution)
 - [Supported Tasks and Leaderboards](#supported-tasks-and-leaderboards)
 - [Languages](#languages)
 - [Dataset Structure](#dataset-structure)
@@ -37,12 +37,25 @@ size_categories:
 
 <p><h1>🐋 The Open Orca Dataset! 🐋</h1></p>
 
-<a name="dataset-attribution"></a>
+<a name="dataset-announcement"></a>
 
 We are thrilled to announce the release of the Open Orca dataset!
 This rich collection of augmented FLAN data aligns, as best as possible, with the distributions outlined in the [Orca paper](https://arxiv.org/abs/2306.02707).
 It has been instrumental in generating high-performing model checkpoints and serves as a valuable resource for all NLP researchers and developers!
 
+<a name="dataset-summary"></a>
+
+Dataset Summary
+
+The Open Orca dataset is a collection of unaugmented and augmented FLAN data.
+Currently ~1M GPT-4 completions, and ~3.5M GPT-3.5 completions.
+It is tabularized in alignment with the distributions presented in the ORCA paper and currently represents a partial completion of the full intended dataset, with ongoing generation to expand its scope.
+The data is primarily used for training and evaluation in the field of natural language processing.
+
+<a name="dataset-attribution"></a>
+
+Dataset Attribution
+
 We would like to give special recognition to the following contributors for their significant efforts and dedication:
 
 
@@ -70,15 +83,6 @@ Many thanks to NanoBit and Caseus, makers of [Axolotl](https://github.com/OpenAc
 We are welcoming sponsors or collaborators to help us build these models to the scale they deserve. Please reach out via our socials:
 http://Alignmentlab.ai https://discord.gg/n9hXaBPWxx
 
-<a name="dataset-summary"></a>
-
-Dataset Summary
-
-The Open Orca dataset is a collection of unaugmented and augmented FLAN data.
-Currently ~1M GPT-4 completions, and ~3.5M GPT-3.5 completions.
-It is tabularized in alignment with the distributions presented in the ORCA paper and currently represents a partial completion of the full intended dataset, with ongoing generation to expand its scope.
-The data is primarily used for training and evaluation in the field of natural language processing.
-
 <a name="supported-tasks-and-leaderboards"></a>
 
 Supported Tasks and Leaderboards