169 lines
2.8 KiB
Markdown
169 lines
2.8 KiB
Markdown
---
|
|
license: apache-2.0
|
|
dataset_info:
|
|
features:
|
|
- name: message_id
|
|
dtype: string
|
|
- name: parent_id
|
|
dtype: string
|
|
- name: user_id
|
|
dtype: string
|
|
- name: created_date
|
|
dtype: string
|
|
- name: text
|
|
dtype: string
|
|
- name: role
|
|
dtype: string
|
|
- name: lang
|
|
dtype: string
|
|
- name: review_count
|
|
dtype: int32
|
|
- name: review_result
|
|
dtype: bool
|
|
- name: deleted
|
|
dtype: bool
|
|
- name: rank
|
|
dtype: int32
|
|
- name: synthetic
|
|
dtype: bool
|
|
- name: model_name
|
|
dtype: string
|
|
- name: detoxify
|
|
struct:
|
|
- name: toxicity
|
|
dtype: float64
|
|
- name: severe_toxicity
|
|
dtype: float64
|
|
- name: obscene
|
|
dtype: float64
|
|
- name: identity_attack
|
|
dtype: float64
|
|
- name: insult
|
|
dtype: float64
|
|
- name: threat
|
|
dtype: float64
|
|
- name: sexual_explicit
|
|
dtype: float64
|
|
- name: message_tree_id
|
|
dtype: string
|
|
- name: tree_state
|
|
dtype: string
|
|
- name: emojis
|
|
sequence:
|
|
- name: name
|
|
dtype: string
|
|
- name: count
|
|
dtype: int32
|
|
- name: labels
|
|
sequence:
|
|
- name: name
|
|
dtype: string
|
|
- name: value
|
|
dtype: float64
|
|
- name: count
|
|
dtype: int32
|
|
splits:
|
|
- name: train
|
|
num_bytes: 39465495
|
|
num_examples: 55507
|
|
download_size: 14788268
|
|
dataset_size: 39465495
|
|
---
|
|
|
|
# Dataset Card for Dataset Name
|
|
|
|
## Dataset Description
|
|
|
|
- **Homepage:**
|
|
- **Repository:**
|
|
- **Paper:**
|
|
- **Leaderboard:**
|
|
- **Point of Contact:**
|
|
|
|
### Dataset Summary
|
|
|
|
This dataset card aims to be a base template for new datasets. It has been generated using [this raw template](https://github.com/huggingface/huggingface_hub/blob/main/src/huggingface_hub/templates/datasetcard_template.md?plain=1).
|
|
|
|
### Supported Tasks and Leaderboards
|
|
|
|
[More Information Needed]
|
|
|
|
### Languages
|
|
|
|
[More Information Needed]
|
|
|
|
## Dataset Structure
|
|
|
|
### Data Instances
|
|
|
|
[More Information Needed]
|
|
|
|
### Data Fields
|
|
|
|
[More Information Needed]
|
|
|
|
### Data Splits
|
|
|
|
[More Information Needed]
|
|
|
|
## Dataset Creation
|
|
|
|
### Curation Rationale
|
|
|
|
[More Information Needed]
|
|
|
|
### Source Data
|
|
|
|
#### Initial Data Collection and Normalization
|
|
|
|
[More Information Needed]
|
|
|
|
#### Who are the source language producers?
|
|
|
|
[More Information Needed]
|
|
|
|
### Annotations
|
|
|
|
#### Annotation process
|
|
|
|
[More Information Needed]
|
|
|
|
#### Who are the annotators?
|
|
|
|
[More Information Needed]
|
|
|
|
### Personal and Sensitive Information
|
|
|
|
[More Information Needed]
|
|
|
|
## Considerations for Using the Data
|
|
|
|
### Social Impact of Dataset
|
|
|
|
[More Information Needed]
|
|
|
|
### Discussion of Biases
|
|
|
|
[More Information Needed]
|
|
|
|
### Other Known Limitations
|
|
|
|
[More Information Needed]
|
|
|
|
## Additional Information
|
|
|
|
### Dataset Curators
|
|
|
|
[More Information Needed]
|
|
|
|
### Licensing Information
|
|
|
|
[More Information Needed]
|
|
|
|
### Citation Information
|
|
|
|
[More Information Needed]
|
|
|
|
### Contributions
|
|
|
|
[More Information Needed] |