From c2edde545cc196c90993d4870c5e70397438aa8c Mon Sep 17 00:00:00 2001
From: Bleys
Date: Sat, 1 Jul 2023 05:13:46 +0000
Subject: [PATCH] Update README.md

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 9a5e487..32fddf7 100644
--- a/README.md
+++ b/README.md
@@ -48,7 +48,7 @@ It has been instrumental in generating high-performing model checkpoints and ser
 Dataset Summary
 The Open Orca dataset is a collection of unaugmented and augmented FLAN data.
-Currently ~1M GPT-4 completions, and ~3.5M GPT-3.5 completions.
+Currently ~1M GPT-4 completions, and ~3.0M GPT-3.5 completions.
 It is tabularized in alignment with the distributions presented in the ORCA paper and currently represents a partial completion of the full intended dataset, with ongoing generation to expand its scope.
 The data is primarily used for training and evaluation in the field of natural language processing.