This sample notebook shows you how to finetune and deploy a custom Command-R model using Amazon SageMaker.
Note: This is a reference notebook; it will not run until you make the changes indicated in the notebook.
You can run this notebook one cell at a time by pressing Shift+Enter.
To subscribe to the model algorithm:
The algorithm is available in the list of AWS regions specified below.
Select a path on S3 to store the training and evaluation datasets and update the s3_data_dir below:
Upload sample training data to S3:
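As a minimal sketch of the upload step, the helper below splits s3_data_dir into a bucket and prefix and uploads one local file under it. The function names split_s3_uri and upload_dataset are hypothetical, and the client is assumed to be a boto3 S3 client (its upload_file method takes a local filename, a bucket, and a key); the notebook's own upload cell may use a different helper.

```python
import os
from urllib.parse import urlparse


def split_s3_uri(s3_uri):
    """Split 's3://bucket/prefix/' into (bucket, prefix) without slashes at the ends."""
    parsed = urlparse(s3_uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {s3_uri!r}")
    return parsed.netloc, parsed.path.strip("/")


def upload_dataset(s3_client, local_path, s3_data_dir):
    """Upload one local file under s3_data_dir and return the resulting S3 URI.

    s3_client is assumed to expose upload_file(filename, bucket, key),
    as a boto3 S3 client does.
    """
    bucket, prefix = split_s3_uri(s3_data_dir)
    filename = os.path.basename(local_path)
    key = f"{prefix}/{filename}" if prefix else filename
    s3_client.upload_file(local_path, bucket, key)
    return f"s3://{bucket}/{key}"


# Example usage (requires boto3 and AWS credentials; the paths are placeholders):
#   import boto3
#   train_uri = upload_dataset(boto3.client("s3"), "train.jsonl", s3_data_dir)
```

Passing the client in as an argument keeps the helper easy to dry-run with a stand-in object before touching a real bucket.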
You’ll need your data in a .jsonl file that contains chat-formatted data.
JSONL:
Note: If no evaluation dataset is provided, the training dataset is automatically split into training and evaluation datasets at a ratio of 80:20.
Each dataset must contain at least 1 example. If no evaluation dataset is provided, the training dataset must contain at least 2 examples.
We recommend using a dataset that contains at least 100 examples; a larger dataset is likely to yield a higher-quality finetune, but it also takes longer to finetune.
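Before uploading, it can help to check a dataset locally against the size rules above. The sketch below is illustrative only: the helper names are hypothetical, the "messages" key is an assumption about the chat format, and the rounding used for the 80:20 split is a guess rather than the service's documented behaviour.

```python
import json


def load_chat_examples(lines):
    """Parse JSONL lines into a list of dicts, skipping blank lines."""
    examples = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue
        record = json.loads(line)  # raises ValueError on malformed JSON
        # Assumption: each example stores its chat turns under a "messages" key.
        if not isinstance(record, dict) or not record.get("messages"):
            raise ValueError(f"line {number}: expected a non-empty 'messages' list")
        examples.append(record)
    return examples


def check_dataset_sizes(n_train, n_eval=None):
    """Apply the minimum-size rules and return (train_count, eval_count)."""
    if n_eval is None:
        if n_train < 2:
            raise ValueError("without an evaluation set, training needs at least 2 examples")
        # Illustrative 80:20 split; the exact rounding used by the service is assumed.
        n_eval = max(1, round(n_train * 0.2))
        n_train = n_train - n_eval
    elif n_train < 1 or n_eval < 1:
        raise ValueError("each dataset needs at least 1 example")
    return n_train, n_eval


# Example usage with a local file (the path is a placeholder):
#   with open("train.jsonl", encoding="utf-8") as handle:
#       examples = load_chat_examples(handle)
#   print(check_dataset_sizes(len(examples)))
```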
Specify a directory on S3 where finetuned models should be stored. Make sure you do not reuse the same directory across multiple runs.
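One way to avoid reusing a directory across runs is to append a timestamp to the path. This is a small sketch under that assumption; unique_models_dir and its arguments are hypothetical names, not part of the Cohere AWS SDK.

```python
from datetime import datetime, timezone


def unique_models_dir(base_s3_dir, run_name, now=None):
    """Return a per-run S3 directory such as 's3://bucket/models/<run>-<UTC timestamp>/'."""
    now = now or datetime.now(timezone.utc)
    stamp = now.strftime("%Y%m%dT%H%M%SZ")
    return f"{base_s3_dir.rstrip('/')}/{run_name}-{stamp}/"


# Example usage (the bucket and run name are placeholders):
#   s3_models_dir = unique_models_dir("s3://my-bucket/models", "test-finetune")
```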
Create Cohere client:
train_epochs: Integer. The maximum number of training epochs to run. Defaults to 1.
learning_rate: Float. The initial learning rate used during training. Defaults to 0.0001.
train_batch_size: Integer. The batch size used during training. Defaults to 16 for Command.
early_stopping_enabled: Boolean. Enables early stopping. When set to true, the final model is the best model found based on the validation set. When set to false, the final model is the last model of training. Defaults to true.
early_stopping_patience: Integer. Stop training if the loss metric does not improve by more than early_stopping_threshold for this many evaluations. Defaults to 10.
early_stopping_threshold: Float. How much the loss must improve to prevent early stopping. Defaults to 0.001.
If the algorithm is command-r-0824-ft, you have the option to define:
lora_rank: Integer. LoRA adapter rank. Defaults to 32.
Create fine-tuning jobs for the uploaded datasets. Add a field for eval_data if you have pre-split your dataset and uploaded both training and evaluation datasets to S3. Remember to use a p4de instance for Command-R finetuning.
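The hyperparameters listed above can be gathered into a plain dictionary before the fine-tuning job is created. The sketch below uses the default values stated in this notebook; build_hyperparameters is a hypothetical helper, and how the resulting dictionary is passed to the fine-tuning call depends on the Cohere AWS SDK version, which is not shown here.

```python
# Defaults as listed in this notebook.
DEFAULT_HYPERPARAMETERS = {
    "train_epochs": 1,
    "learning_rate": 0.0001,
    "train_batch_size": 16,
    "early_stopping_enabled": True,
    "early_stopping_patience": 10,
    "early_stopping_threshold": 0.001,
}


def build_hyperparameters(algorithm_name, **overrides):
    """Merge user overrides into the defaults, rejecting unknown keys.

    lora_rank is only accepted (and defaulted to 32) for command-r-0824-ft.
    Matching on the bare algorithm name is an assumption made for this sketch.
    """
    params = dict(DEFAULT_HYPERPARAMETERS)
    if algorithm_name == "command-r-0824-ft":
        params["lora_rank"] = 32
    unknown = sorted(set(overrides) - set(params))
    if unknown:
        raise ValueError(f"unsupported hyperparameters: {unknown}")
    params.update(overrides)
    return params


# Example usage:
#   hyperparameters = build_hyperparameters("command-r-0824-ft", train_epochs=3)
```

Rejecting unknown keys up front surfaces typos such as train_epoch before a training instance is launched.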
The finetuned weights for the above will be stored in a tar file at {s3_models_dir}/test-finetune.tar.gz, where the file name matches the name used when creating the finetune.
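The naming rule above can be written as a one-line helper, which is handy for checking that the archive exists after training. finetuned_weights_uri is a hypothetical name used only for illustration.

```python
def finetuned_weights_uri(s3_models_dir, finetune_name):
    """Return the expected S3 location of the weights archive for a finetune."""
    # A trailing slash on the directory is tolerated so the URI has no '//'.
    return f"{s3_models_dir.rstrip('/')}/{finetune_name}.tar.gz"


# Example usage (the bucket is a placeholder):
#   finetuned_weights_uri("s3://my-bucket/models", "test-finetune")
```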
The Cohere AWS SDK provides a built-in method for creating an endpoint for inference. This will automatically deploy the model you finetuned earlier.
Note: This is equivalent to creating and deploying a ModelPackage in SageMaker’s SDK.
Now, you can access all models deployed on the endpoint for inference:
After you’ve successfully performed inference, you can delete the deployed endpoint to avoid being charged continuously. This can also be done via the Cohere AWS SDK:
If you would like to unsubscribe from the model package, follow these steps. Before you cancel the subscription, ensure that you do not have any deployable models created from the model package or using the algorithm. Note: You can find this information by looking at the container name associated with the model.
Steps to unsubscribe from the product on AWS Marketplace: