generate
Deprecated

Headers
Bearer authentication of the form Bearer <token>, where token is your auth token.
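As a minimal sketch, the header described above can be built like this in Python; the token value is a placeholder, and only the Bearer scheme itself comes from this reference:

```python
# Placeholder token -- substitute your real auth token.
CO_API_KEY = "your-token-here"

# Authorization header in the documented "Bearer <token>" form.
headers = {
    "Authorization": f"Bearer {CO_API_KEY}",
    "Content-Type": "application/json",
}
```

These headers would then be attached to each request with whatever HTTP client you use.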
Request
The input text that serves as the starting point for generating the response. Note: The prompt will be pre-processed and modified before reaching the model.
The maximum number of generations that will be returned. Defaults to 1, min value of 1, max value of 5.
A non-negative float that tunes the degree of randomness in generation. Lower temperatures mean less random generations. See Temperature for more details.
Defaults to 0.75, min value of 0.0, max value of 5.0.
If specified, the backend will make a best effort to sample tokens deterministically, such that repeated requests with the same seed and parameters should return the same result. However, determinism cannot be totally guaranteed. Compatible Deployments: Cohere Platform, Azure, AWS Sagemaker/Bedrock, Private Deployments
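A sketch of a request body using the seed parameter; the prompt text is invented, and the field names are taken from this reference:

```python
# Fixing the seed (with all other parameters unchanged) asks the backend
# for best-effort deterministic sampling across repeated requests.
payload = {
    "prompt": "Write a tagline for a coffee shop.",
    "temperature": 0.8,
    "seed": 42,  # repeat the same seed + parameters to reproduce the result
}
```

Note that determinism is best-effort: repeated identical requests should, not must, return the same result.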
Ensures only the top k most likely tokens are considered for generation at each step. Defaults to 0, min value of 0, max value of 500.
Ensures that only the most likely tokens, with total probability mass of p, are considered for generation at each step. If both k and p are enabled, p acts after k. Defaults to 0.75, min value of 0.01, max value of 0.99.
Used to reduce repetitiveness of generated tokens. The higher the value, the stronger a penalty is applied to previously present tokens, proportional to how many times they have already appeared in the prompt or prior generation.
Using frequency_penalty in combination with presence_penalty is not supported on newer models.
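One common way such a penalty is applied (a sketch of the general technique, not necessarily this backend's exact formula) is to subtract the penalty, scaled by each token's occurrence count, from that token's logit:

```python
def apply_frequency_penalty(logits, counts, penalty):
    """Lower each token's logit in proportion to how often it has appeared."""
    return {
        tok: logit - penalty * counts.get(tok, 0)
        for tok, logit in logits.items()
    }
```

Tokens that have already appeared many times are thus progressively less likely to be sampled again.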
One of GENERATION|NONE to specify how and if the token likelihoods are returned with the response. Defaults to NONE.
If GENERATION is selected, the token likelihoods will only be provided for generated text.
WARNING: ALL is deprecated, and will be removed in a future release.
When enabled, the user’s prompt will be sent to the model without any pre-processing.
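Putting the parameters above together, a request body might look like the following sketch; the field names and ranges come from this reference, while the prompt text and chosen values are illustrative:

```python
# Example request body using the documented parameters and their ranges.
payload = {
    "prompt": "Once upon a time",
    "num_generations": 2,                 # 1 to 5
    "temperature": 0.75,                  # 0.0 to 5.0
    "seed": 42,                           # best-effort determinism
    "k": 50,                              # 0 disables; max 500
    "p": 0.75,                            # 0.01 to 0.99; applied after k
    "frequency_penalty": 0.2,             # penalize repeated tokens
    "return_likelihoods": "GENERATION",   # or "NONE" (the default)
    "raw_prompting": False,               # True skips prompt pre-processing
}
```

This dict would be sent as the JSON body of the request, alongside the Bearer authentication header described earlier.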