From d427ac9d61f0fdd6327ad9ce8ed69ef320fa5aa7 Mon Sep 17 00:00:00 2001
From: Yoach Lacombe
Date: Tue, 1 Oct 2024 08:22:24 +0000
Subject: [PATCH] Update README.md

---
 README.md | 7 +------
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/README.md b/README.md
index 93ff03b..daca737 100644
--- a/README.md
+++ b/README.md
@@ -493,12 +493,7 @@ In particular, we caution against using Whisper models to transcribe recordings
 
 ## Training Data
 
-TODO
-
-The large-v3 checkpoint is trained on 1 million hours of weakly labeled audio and 4 million hours of pseudo-labeled audio collected using Whisper large-v2.
-
-As discussed in [the accompanying paper](https://cdn.openai.com/papers/whisper.pdf), we see that performance on transcription in a given language is directly correlated with the amount of training data we employ in that language.
-
+No information provided.
 ## Performance and Limitations