add tokenizer

commit: d71cdce5b4
parent: d39f4f8a18
merges.txt (new file, 50001 lines)
File diff suppressed because it is too large
special_tokens_map.json (new file, 1 line)
@@ -0,0 +1 @@
{"bos_token": "<|endoftext|>", "eos_token": "<|endoftext|>", "unk_token": "<|endoftext|>", "pad_token": "<|endoftext|>"}
tokenizer.json (new file, 1 line)
File diff suppressed because one or more lines are too long
tokenizer_config.json (new file, 1 line)
@@ -0,0 +1 @@
{"unk_token": "<|endoftext|>", "bos_token": "<|endoftext|>", "eos_token": "<|endoftext|>", "add_prefix_space": false, "model_max_length": 1024, "special_tokens_map_file": null, "name_or_path": "./models/", "tokenizer_class": "GPT2Tokenizer"}
vocab.json (new file, 1 line)
File diff suppressed because one or more lines are too long
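For reference, the two one-line JSON files added in this commit parse as shown below. This is only a quick sanity check with Python's stdlib `json` module (no external libraries assumed); the string literals are copied verbatim from the diff hunks above.

```python
import json

# special_tokens_map.json, verbatim from the diff.
special_tokens_map = json.loads(
    '{"bos_token": "<|endoftext|>", "eos_token": "<|endoftext|>", '
    '"unk_token": "<|endoftext|>", "pad_token": "<|endoftext|>"}'
)

# tokenizer_config.json, verbatim from the diff.
tokenizer_config = json.loads(
    '{"unk_token": "<|endoftext|>", "bos_token": "<|endoftext|>", '
    '"eos_token": "<|endoftext|>", "add_prefix_space": false, '
    '"model_max_length": 1024, "special_tokens_map_file": null, '
    '"name_or_path": "./models/", "tokenizer_class": "GPT2Tokenizer"}'
)

# GPT-2 convention: the single <|endoftext|> token fills every
# special-token role (BOS, EOS, UNK, and here also PAD).
assert set(special_tokens_map.values()) == {"<|endoftext|>"}
assert tokenizer_config["tokenizer_class"] == "GPT2Tokenizer"
print(tokenizer_config["model_max_length"])  # 1024
```

With `tokenizer_class` set to `GPT2Tokenizer` and these four files (`vocab.json`, `merges.txt`, plus the two configs) in one directory, the directory is in the layout that Hugging Face `transformers` expects for loading a GPT-2 byte-pair-encoding tokenizer from disk.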