diff --git a/README.md b/README.md
index 93ff03b..daca737 100644
--- a/README.md
+++ b/README.md
@@ -493,12 +493,7 @@ In particular, we caution against using Whisper models to transcribe recordings
 
 ## Training Data
 
-TODO
-
-The large-v3 checkpoint is trained on 1 million hours of weakly labeled audio and 4 million hours of pseudo-labeled audio collected using Whisper large-v2.
-
-As discussed in [the accompanying paper](https://cdn.openai.com/papers/whisper.pdf), we see that performance on transcription in a given language is directly correlated with the amount of training data we employ in that language.
-
+No information provided.
 
 ## Performance and Limitations
 