Fine-tuning with the Python SDK

In addition to using the Web UI for fine-tuning models, customers can also kick off fine-tuning jobs programmatically using the Cohere Python SDK. This is useful for fine-tunes that run on a regular cadence, such as fine-tuning nightly on newly acquired data.

Datasets

Before a fine-tuning job can be started, users must upload a Dataset with training and (optionally) evaluation data. The contents and structure of the dataset vary depending on the type of fine-tuning. Read more about preparing the training data for Generate, Chat, Classify, and Rerank fine-tuning.
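As an illustration of what preparing such a file might involve, the sketch below serializes conversations into JSON Lines, one conversation per line. The `messages`/`role`/`content` layout and the role names `System`, `User`, and `Chatbot` are assumptions about the chat fine-tuning format, and `chat_records_to_jsonl` is a hypothetical helper, not part of the SDK. Check the data preparation guide for the authoritative schema.

```python
import json

def chat_records_to_jsonl(conversations):
    """Serialize conversations to JSON Lines text (one JSON object per line).

    Each conversation is a list of (role, text) pairs. The record layout
    ("messages" -> [{"role", "content"}]) is an assumed schema.
    """
    lines = []
    for turns in conversations:
        record = {
            "messages": [{"role": role, "content": text} for role, text in turns]
        }
        lines.append(json.dumps(record, ensure_ascii=False))
    return "\n".join(lines) + "\n"

# Hypothetical sample data for illustration only.
example_conversations = [
    [
        ("System", "You are a helpful customer service agent."),
        ("User", "Where is my order?"),
        ("Chatbot", "Could you share your order number so I can check?"),
    ],
]

jsonl_text = chat_records_to_jsonl(example_conversations)
```

The resulting text could then be written to a file such as `./customer-chat.jsonl` before uploading it as a dataset.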

The snippet below creates a dataset for fine-tuning a chat model on customer service logs, then waits for the uploaded data to be validated.

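import cohere

# initialize the client ("<YOUR_API_KEY>" is a placeholder for your own key)
co = cohere.Client("<YOUR_API_KEY>")

# create a dataset from training and evaluation files, then wait for validation
my_dataset = co.create_dataset(
  name="customer service logs",
  dataset_type="chat-finetune-input",
  data=open("./customer-chat.jsonl", "rb"),
  eval_data=open("./customer-chat-eval.jsonl", "rb")
).await_validation()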

Starting a Fine-tuning Job

Below is an example of starting a fine-tuning job for a generative Chat model, using the dataset of conversational data created above.

# start training a custom model using the dataset
custom_model = co.create_custom_model(
  name="customer-service-chat-model",
  model_type="GENERATIVE",
  dataset=my_dataset
)
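For fine-tunes that run on a regular cadence, the two calls above could be wrapped in a single function that a scheduler (for example, cron) invokes each night. This is a minimal sketch, not an official pattern: `run_nightly_finetune` and the date-suffixed names are hypothetical, and it assumes `co` exposes `create_dataset` and `create_custom_model` as used in the snippets above.

```python
from datetime import date

def run_nightly_finetune(co, train_path, eval_path, today=None):
    """Upload the latest data and start a fine-tuning job (hypothetical helper).

    `co` is assumed to be a Cohere client with the create_dataset and
    create_custom_model methods shown earlier on this page.
    """
    today = today or date.today()
    suffix = today.strftime("%Y-%m-%d")  # date-stamped names are an assumed convention

    # Open both files in a context manager so they are closed after validation.
    with open(train_path, "rb") as train_file, open(eval_path, "rb") as eval_file:
        dataset = co.create_dataset(
            name=f"customer service logs {suffix}",
            dataset_type="chat-finetune-input",
            data=train_file,
            eval_data=eval_file,
        ).await_validation()

    # Start training on the validated dataset and return the job handle.
    return co.create_custom_model(
        name=f"customer-service-chat-model-{suffix}",
        model_type="GENERATIVE",
        dataset=dataset,
    )
```

Stamping the date into each name is one way to keep successive runs distinguishable; whether names must be unique is not stated here, so treat that choice as a precaution.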

Fine-tuning results

When the fine-tuned model is ready, you will receive an email notification. You can explore the evaluation metrics in the Dashboard Web UI and try out your model with one of our APIs in the interactive Playground. You can also poll for the status of the fine-tuning job and explore the results programmatically, as shown in the example below:

# wait for the fine-tuning job to finish: timeout is the maximum wait time (None means no limit), interval is the time between status polls
custom_model.wait(timeout=None, interval=60)
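To make the roles of `timeout` and `interval` concrete, here is a minimal sketch of a generic polling loop. It is not the SDK's implementation: `poll_until_done`, the `get_status` callable, and the status strings are all hypothetical stand-ins chosen for illustration.

```python
import time

# Hypothetical status values; the real job statuses may differ.
TERMINAL_STATUSES = {"READY", "FAILED"}

def poll_until_done(get_status, timeout=None, interval=60,
                    sleep=time.sleep, clock=time.monotonic):
    """Call get_status() every `interval` seconds until a terminal status.

    timeout=None waits indefinitely; otherwise TimeoutError is raised once
    `timeout` seconds have elapsed without reaching a terminal status.
    """
    deadline = None if timeout is None else clock() + timeout
    while True:
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        if deadline is not None and clock() >= deadline:
            raise TimeoutError(f"job still {status!r} after {timeout} seconds")
        sleep(interval)
```

The `sleep` and `clock` parameters exist only so the loop can be exercised without real delays. A longer `interval` means fewer status requests at the cost of noticing completion later.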