Update README.md

Gustavo de Rosa 2023-12-13 23:02:10 +00:00 committed by huggingface-web
parent ca30dd4560
commit 25a6b01ef7

@@ -17,7 +17,6 @@ Phi-2 is a Transformer with **2.7 billion** parameters. It was trained using the
Our model hasn't been fine-tuned through reinforcement learning from human feedback. The intention behind crafting this open-source model is to provide the research community with a non-restricted small model to explore vital safety challenges, such as reducing toxicity, understanding societal biases, enhancing controllability, and more.
## Intended Uses
Phi-2 is intended for research purposes only. Given the nature of the training data, the Phi-2 model is best suited for prompts using the QA format, the chat format, and the code format.
@@ -69,13 +68,34 @@ def print_prime(n):
```
where the model generates the text after the comments.
**Notes:**
* Phi-2 is intended for research purposes. The model-generated text/code should be treated as a starting point rather than a definitive solution for potential use cases. Users should be cautious when employing these models in their applications.
* Direct adoption for production tasks is out of the scope of this research project. As a result, the Phi-2 model has not been tested to ensure that it performs adequately for any production-level application. Please refer to the limitation sections of this document for more details.
* If you are using `transformers>=4.36.0`, always load the model with `trust_remote_code=True` to prevent side-effects. * If you are using `transformers>=4.36.0`, always load the model with `trust_remote_code=True` to prevent side-effects.
## Sample Code
There are four execution modes (the imports they assume are sketched after the list):
1. FP16 / Flash-Attention / CUDA:
```python
model = AutoModelForCausalLM.from_pretrained("microsoft/phi-2", torch_dtype="auto", flash_attn=True, flash_rotary=True, fused_dense=True, device_map="cuda", trust_remote_code=True)
```
2. FP16 / CUDA:
```python
model = AutoModelForCausalLM.from_pretrained("microsoft/phi-2", torch_dtype="auto", device_map="cuda", trust_remote_code=True)
```
3. FP32 / CUDA:
```python
model = AutoModelForCausalLM.from_pretrained("microsoft/phi-2", torch_dtype=torch.float32, device_map="cuda", trust_remote_code=True)
```
4. FP32 / CPU:
```python
model = AutoModelForCausalLM.from_pretrained("microsoft/phi-2", torch_dtype=torch.float32, device_map="cpu", trust_remote_code=True)
```
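The one-liners above omit their imports. As a minimal sketch (the import lines and the packaging note are our assumptions, not part of the original snippets), mode 4 would run as:
```python
import torch  # needed by the FP32 modes for torch.float32
from transformers import AutoModelForCausalLM

# Mode 1 additionally assumes the flash-attn package is installed; its
# flash_attn/flash_rotary/fused_dense flags are handled by the model's
# custom remote code (hence trust_remote_code=True).
model = AutoModelForCausalLM.from_pretrained(
    "microsoft/phi-2",
    torch_dtype=torch.float32,
    device_map="cpu",
    trust_remote_code=True,
)
```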
To ensure maximum compatibility, we recommend using the second execution mode (FP16 / CUDA), as follows:
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
@@ -85,8 +105,7 @@ torch.set_default_device("cuda")
model = AutoModelForCausalLM.from_pretrained("microsoft/phi-2", torch_dtype="auto", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("microsoft/phi-2", trust_remote_code=True)
inputs = tokenizer('''def print_prime(n):
   """
   Print all primes between 1 and n
   """''', return_tensors="pt", return_attention_mask=False)
@@ -96,7 +115,7 @@ text = tokenizer.batch_decode(outputs)[0]
print(text)
```
**Remark:** In the generation function, our model currently does not support beam search (`num_beams > 1`).
Furthermore, in the forward pass of the model, we currently do not support outputting hidden states or attention values, or using custom input embeddings.
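Since beam search is unsupported, generation relies on greedy decoding or sampling. A hedged sketch (the sampling parameters below are illustrative assumptions, not recommendations from this card):
```python
# Greedy decoding: the default when num_beams is left at 1.
outputs = model.generate(**inputs, max_length=200)

# Sampling also works; temperature/top_p values here are illustrative.
outputs = model.generate(**inputs, max_length=200, do_sample=True, temperature=0.7, top_p=0.9)
```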
## Limitations of Phi-2