Train task-specific small language models (SLMs) using Distil Labs CLI. Ideal for operations teams needing compact, specialized models. Connects to Distil Labs CLI for model training and deployment.
```bash
git clone https://github.com/distil-labs/distil-cli-skill.git
```
1. **Install Distil Labs CLI:** `pip install distil-labs` and authenticate with your API key.
2. **Prepare your dataset:** Organize text data in JSONL format with clear labels (e.g., `{"text": "...", "label": "category"}`). Use the CLI's built-in preprocessing tools if needed.
3. **Train the model:** Run the `distil train` command with your base model (e.g., `distilbert-base-uncased`), training/validation files, and hyperparameters. Monitor logs for loss/accuracy trends.
4. **Evaluate and export:** Use `distil evaluate` on a test set to generate metrics, then export the model with `distil export` for deployment.
5. **Deploy:** Integrate the exported model into your application via the Distil Labs inference API, or export it to ONNX/TensorRT for edge deployment. Set up monitoring for model drift.
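The dataset-preparation step expects one JSON object per line with `text` and `label` fields. A minimal sketch of building such a file (the example tickets, labels, and `train.jsonl` file name are hypothetical placeholders for your own data):

```python
import json

# Hypothetical labeled examples; replace with your own data.
examples = [
    {"text": "I was charged twice this month", "label": "billing"},
    {"text": "The app crashes on startup", "label": "technical_issue"},
    {"text": "I can't log into my account", "label": "account_access"},
]

# Write one JSON object per line, the JSONL format the training step expects.
with open("train.jsonl", "w") as f:
    for row in examples:
        f.write(json.dumps(row) + "\n")

# Sanity-check that the file round-trips cleanly.
with open("train.jsonl") as f:
    loaded = [json.loads(line) for line in f]
print(len(loaded))  # 3
```

Keeping labels as short, consistent strings (rather than free text) makes downstream evaluation metrics like per-class F1 straightforward to compute.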
Copy the install command above and run it in your terminal.
Launch Claude Code, Cursor, or your preferred AI coding agent.
Use the prompt template or examples below to test the skill.
Adapt the skill to your specific use case and workflow.
Use the Distil Labs CLI to fine-tune a small language model (SLM) for [TASK_DOMAIN, e.g., customer support ticket classification]. Follow these steps: 1) Preprocess the dataset in [FOLDER_PATH] into Distil Labs-compatible format. 2) Run the CLI command to train the model with [MODEL_NAME, e.g., 'distilbert-base-uncased'] as the base. 3) Evaluate the model on [TEST_SET_PATH] and report metrics like accuracy, F1-score, and inference speed. 4) Export the model to [OUTPUT_DIR] for deployment. Share the full CLI command sequence and a summary of results.
### Distil Labs CLI Fine-Tuning Report: Customer Support Ticket Classifier

**Dataset Overview:**
- Source: 12,450 labeled support tickets from Q1 2024 (CSV format in `/data/support_tickets/`).
- Classes: 8 (e.g., Billing, Technical Issue, Account Access, Product Feedback).
- Class distribution: Billing (32%), Technical Issue (28%), Account Access (18%), Product Feedback (12%), Others (10%).

**Preprocessing:**
Converted raw tickets into Distil Labs format using:

```bash
python preprocess.py \
  --input /data/support_tickets/raw.csv \
  --output /data/support_tickets/processed.jsonl \
  --text-field 'description' \
  --label-field 'category'
```

Result: 12,450 JSONL entries with cleaned text and labels.

**Training Command:**

```bash
distil train \
  --model_name distilbert-base-uncased \
  --train_file /data/support_tickets/processed.jsonl \
  --validation_file /data/support_tickets/validation.jsonl \
  --output_dir /models/support_ticket_classifier_v1 \
  --per_device_train_batch_size 16 \
  --num_train_epochs 3 \
  --learning_rate 2e-5
```

**Training Metrics:**
- Training loss: 0.21 (epoch 3)
- Validation accuracy: 92.3%
- F1-score (macro avg): 0.91
- Inference speed: 450 tokens/sec (on AWS g4dn.xlarge)
- Model size: 240MB (compressed)

**Evaluation on Test Set:**

| Class            | Precision | Recall | F1-Score | Support |
|------------------|-----------|--------|----------|---------|
| Billing          | 0.94      | 0.93   | 0.93     | 1,200   |
| Technical Issue  | 0.91      | 0.90   | 0.90     | 1,100   |
| Account Access   | 0.89      | 0.92   | 0.90     | 800     |
| Product Feedback | 0.87      | 0.85   | 0.86     | 600     |

**Deployment:**
Exported model to `/models/support_ticket_classifier_v1` with:

```bash
distil export \
  --model_dir /models/support_ticket_classifier_v1 \
  --output_dir /deploy/support_ticket_classifier
```

The model is now ready for integration with our ticketing system via REST API.

**Recommendations:**
1. Monitor drift monthly using a holdout validation set.
2. Retrain quarterly with new ticket data to maintain accuracy.
3. Consider quantizing the model to 8-bit for edge deployment (expected size: 60MB).
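The drift-monitoring recommendation can be sketched as a periodic accuracy check on a holdout set against the validation accuracy recorded at training time. The `drift_check` helper, the tolerance threshold, and the holdout examples below are hypothetical illustrations, not part of the Distil Labs CLI:

```python
def drift_check(predictions, labels, baseline_accuracy=0.923, tolerance=0.05):
    """Flag drift when holdout accuracy drops more than `tolerance`
    below the validation accuracy recorded at training time."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    accuracy = correct / len(labels)
    return accuracy, accuracy < baseline_accuracy - tolerance

# Hypothetical holdout labels and model predictions; in practice the
# predictions would come from the deployed classifier's REST endpoint.
labels      = ["billing", "billing", "technical_issue", "account_access"]
predictions = ["billing", "billing", "technical_issue", "billing"]

acc, drifted = drift_check(predictions, labels)
print(acc, drifted)  # 0.75 True
```

A single aggregate accuracy can mask per-class degradation (e.g., only Product Feedback drifting), so running the same check per label is a cheap extension.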