From 55215ac09aea7cec05ca81d6921e1481769335a3 Mon Sep 17 00:00:00 2001
From: Mike Conover
Date: Wed, 12 Apr 2023 03:24:21 +0000
Subject: [PATCH] Updating README.

---
 README.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index 4d49675..d9c6788 100644
--- a/README.md
+++ b/README.md
@@ -54,8 +54,8 @@ maximize the potential of all individuals and organizations.
 
 ### Benchmark Metrics
 
-Below you'll find various models benchmark performance on the [EleutherAI LLM Evaluation Harness](https://github.com/EleutherAI/lm-evaluation-harness)
-model results are sorted by geometric mean to produce an intelligible ordering. These results demonstrate that `dolly-v2-12b` is not state of the art,
+Below you'll find various models benchmark performance on the [EleutherAI LLM Evaluation Harness](https://github.com/EleutherAI/lm-evaluation-harness);
+model results are sorted by geometric mean to produce an intelligible ordering. As outlined above, these results demonstrate that `dolly-v2-12b` is not state of the art,
 and in fact underperforms `dolly-v1-6b` in some evaluation benchmarks. We believe this owes to the composition and size of the underlying fine tuning datasets,
 but a robust statement as to the sources of these variations requires further study.
 