Create a dataset by uploading a file. See ‘Dataset Creation’ for more information.
The name of the project that is making the request.
The name of the uploaded dataset.
The dataset type, which is used to validate the data. Valid types are embed-input, reranker-finetune-input, single-label-classification-finetune-input, chat-finetune-input, and multi-label-classification-finetune-input.
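As a rough illustration of the type check described above, the following sketch rejects any value that is not one of the listed types before an upload is attempted. The names VALID_DATASET_TYPES and check_dataset_type are hypothetical helpers, not part of any SDK.

```python
# Dataset types listed in the parameter description above.
VALID_DATASET_TYPES = frozenset({
    "embed-input",
    "reranker-finetune-input",
    "single-label-classification-finetune-input",
    "chat-finetune-input",
    "multi-label-classification-finetune-input",
})


def check_dataset_type(dataset_type: str) -> str:
    """Return dataset_type unchanged if it is a listed type, else raise ValueError.

    Hypothetical client-side helper; the service performs its own validation.
    """
    if dataset_type not in VALID_DATASET_TYPES:
        raise ValueError(
            f"unknown dataset type {dataset_type!r}; "
            f"expected one of {sorted(VALID_DATASET_TYPES)}"
        )
    return dataset_type
```

Checking the type locally only catches typos early; the server-side validation of the uploaded data still applies.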
Indicates if the original file should be stored.
Indicates whether rows with malformed input should be dropped (instead of failing the validation check). Dropped rows will be returned in the warnings field.
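The difference between failing validation and dropping malformed rows can be sketched locally. This is only a model of the behavior described above: the function name validate_rows and the flag name skip_malformed_input are hypothetical, and "malformed" is assumed here to mean a missing or empty required field, which may not match the service's actual rules.

```python
def validate_rows(rows, required_fields, skip_malformed_input=False):
    """Return (kept_rows, warnings).

    Assumption: a row is malformed when a required field is missing or empty.
    With skip_malformed_input=False the first malformed row raises ValueError;
    with True the row is dropped and reported in the warnings list.
    """
    kept, warnings = [], []
    for index, row in enumerate(rows):
        missing = [field for field in required_fields if not row.get(field)]
        if not missing:
            kept.append(row)
        elif skip_malformed_input:
            warnings.append(f"row {index} dropped: missing {missing}")
        else:
            raise ValueError(f"row {index} is malformed: missing {missing}")
    return kept, warnings
```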
List of names of fields that will be persisted in the Dataset. By default, the Dataset will retain only the required fields indicated in the schema for the corresponding Dataset type. For example, Datasets of type embed-input will drop all fields other than the required text field. If any of the fields in keep_fields are missing from the uploaded file, Dataset validation will fail.
List of names of fields that will be persisted in the Dataset. By default, the Dataset will retain only the required fields indicated in the schema for the corresponding Dataset type. For example, Datasets of type embed-input will drop all fields other than the required text field. If any of the fields in optional_fields are missing from the uploaded file, Dataset validation will still pass.
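The two field lists differ only in what happens when a named field is absent. The sketch below models that rule for a single row; select_fields is a hypothetical helper written for illustration and is not the service's implementation.

```python
def select_fields(row, required_fields, keep_fields=(), optional_fields=()):
    """Return the subset of row that would be persisted.

    Required fields and keep_fields must be present, otherwise validation
    fails (ValueError). optional_fields are kept when present and silently
    skipped when absent. All other fields are dropped.
    """
    missing = [f for f in (*required_fields, *keep_fields) if f not in row]
    if missing:
        raise ValueError(f"validation failed, missing fields: {missing}")
    wanted = [*required_fields, *keep_fields]
    wanted += [f for f in optional_fields if f in row]
    return {field: row[field] for field in wanted}
```

For example, with a row containing text, source, and junk, passing keep_fields=["source"] and optional_fields=["lang"] keeps text and source, drops junk, and does not fail on the absent lang field. Passing keep_fields=["lang"] instead raises an error.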
Raw .txt uploads will be split into entries using the text_separator value.
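To preview how a raw .txt file would be divided, the split can be approximated locally. This is a sketch under assumptions: the default separator of a newline, the whitespace stripping, and the removal of empty entries are choices made here for illustration and are not stated in the description above.

```python
def split_entries(raw_text: str, text_separator: str = "\n") -> list[str]:
    """Split raw text into entries on text_separator.

    Assumptions (not from the API description): surrounding whitespace is
    stripped and empty entries are discarded.
    """
    entries = (part.strip() for part in raw_text.split(text_separator))
    return [entry for entry in entries if entry]
```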
The delimiter used for .csv uploads.
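A quick local check that a .csv file parses as expected with a given delimiter can save a failed upload. The sketch below uses Python's standard csv module; the parameter name csv_delimiter and the comma default are assumptions, and the service's own CSV parsing may differ in quoting or header handling.

```python
import csv
import io


def read_csv_rows(csv_text: str, csv_delimiter: str = ",") -> list[dict]:
    """Parse CSV text with a header row into a list of dicts, one per data row."""
    reader = csv.DictReader(io.StringIO(csv_text), delimiter=csv_delimiter)
    return [dict(row) for row in reader]
```

If the wrong delimiter is supplied, each row typically collapses into a single column, which is an easy symptom to spot before uploading.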
The ID of the created dataset.
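The parameters above can be combined into a single request. The sketch below only builds a URL and sends nothing. It assumes the parameters travel in the query string with list values as repeated keys, which the descriptions above do not state. The names name, type, and csv_delimiter are inferred from the descriptions, and the base URL in the usage note is a placeholder. Only keep_fields, optional_fields, and text_separator appear verbatim in the text.

```python
from urllib.parse import urlencode


def build_upload_url(base_url, name, dataset_type, keep_fields=(),
                     optional_fields=(), text_separator=None,
                     csv_delimiter=None):
    """Build a request URL for a dataset upload (hypothetical helper).

    Assumption: parameters are sent as query parameters, and list-valued
    parameters are sent as repeated keys.
    """
    params = [("name", name), ("type", dataset_type)]
    params += [("keep_fields", field) for field in keep_fields]
    params += [("optional_fields", field) for field in optional_fields]
    if text_separator is not None:
        params.append(("text_separator", text_separator))
    if csv_delimiter is not None:
        params.append(("csv_delimiter", csv_delimiter))
    return f"{base_url}?{urlencode(params)}"
```

A call such as build_upload_url("https://api.example.com/datasets", "my set", "embed-input", keep_fields=["source"]) yields a URL with percent-encoded values; the file itself would be attached to the request body separately.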