Model merging techniques for LLMs, MLLMs, and other AI models. Operations teams use this to combine models for improved performance, robustness, and multi-task learning. Integrates with Claude and other AI agents.
git clone https://github.com/EnnengYang/Awesome-Model-Merging-Methods-Theories-Applications.git

Companion survey paper: https://arxiv.org/pdf/2408.07666
Quick-start workflow:

1. **Prepare your models**: ensure [MODEL_A] and [MODEL_B] are downloaded and compatible (e.g., same architecture, same tokenizer). Use the `transformers` or `peft` libraries for LoRA/adapter support.
2. **Select a merge method**: choose from TIES, Task Arithmetic, SLERP, or DARE based on your goals (e.g., TIES for robustness, Task Arithmetic for task-specific improvements).
3. **Configure parameters**: set weighting schemes (e.g., equal weights, task-specific scaling), tuning parameters (e.g., LoRA rank, dropout), and evaluation datasets.
4. **Merge and validate**: run the merge operation (e.g., using `mergekit` or custom scripts) and evaluate the merged model on your target task. Use tools like `lm-eval-harness` for standardized benchmarks.
5. **Deploy and monitor**: export the merged model (e.g., to ONNX, GGUF, or the HuggingFace Hub) and monitor performance in production. Track trade-offs like latency, memory, and accuracy over time.
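The merge-and-validate step can be sketched with plain task arithmetic. The snippet below is a minimal, framework-free illustration: plain floats stand in for the tensors of a real `state_dict`, and a production merge would instead operate on `torch` tensors or use `mergekit`.

```python
# Minimal task-arithmetic sketch. Each "model" is a dict of named
# parameters; floats stand in for real weight tensors.
def task_vector(finetuned, base):
    """Per-parameter delta between a fine-tuned model and its base."""
    return {k: finetuned[k] - base[k] for k in base}

def task_arithmetic_merge(base, finetuned_models, scaling=0.5):
    """Add the scaled sum of all task vectors back onto the base weights."""
    merged = dict(base)
    for ft in finetuned_models:
        delta = task_vector(ft, base)
        for k, d in delta.items():
            merged[k] += scaling * d
    return merged

base    = {"w1": 1.0, "w2": -0.5}
model_a = {"w1": 1.4, "w2": -0.5}   # fine-tuned for task A
model_b = {"w1": 1.0, "w2": -0.1}   # fine-tuned for task B

merged = task_arithmetic_merge(base, [model_a, model_b], scaling=0.5)
print(merged)  # {'w1': 1.2, 'w2': -0.3}
```

The `scaling` parameter here plays the same role as the task vector scaling factor in the report below: it controls how strongly each task's delta pulls the merged weights away from the base.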
Example use cases:

- Automate the merging of multiple LLMs to create a more robust and versatile model.
- Apply model merging to improve the performance of generative models in creative applications.
- Use model merging to transfer knowledge across tasks in multi-task learning scenarios.
- Make federated learning systems more efficient by aggregating local models through merging.
To install, clone the repository and run the command in your terminal:

git clone https://github.com/EnnengYang/Awesome-Model-Merging-Methods-Theories-Applications

Check the GitHub repository for any additional setup instructions.
1. Launch Claude Code, Cursor, or your preferred AI coding agent.
2. Use the prompt template or examples below to test the skill.
3. Adapt the skill to your specific use case and workflow.
Merge [MODEL_A] and [MODEL_B] using [MERGE_METHOD] to optimize for [TARGET_TASK]. Apply [WEIGHTING_SCHEME] with [TUNING_PARAMETERS] and validate performance on [EVALUATION_DATASET]. Generate a merged model file and provide a summary of trade-offs (e.g., latency vs. accuracy, robustness improvements). Include code snippets for [FRAMEWORK] if applicable.
### Merged Model Report: Llama-2-7B + Mistral-7B (Task: Code Generation)
**Merge Method**: TIES-Merging (0.75 weight for Llama, 0.25 for Mistral)
**Weighting Scheme**: Task-Arithmetic with task vector scaling factor = 0.5
**Tuning Parameters**: LoRA rank=8, alpha=16, dropout=0.1
**Evaluation Dataset**: HumanEval (164 problems)
**Performance Metrics**:
- Pass@1: 68.3% (vs. 62.1% Llama-2-7B, 65.8% Mistral-7B)
- Latency: 1.2s per 100 tokens (vs. 1.1s Llama-2, 1.3s Mistral)
- Robustness: 15% fewer failures on adversarial prompts
**Trade-offs**:
- Slight increase in memory usage (7.8GB vs. 7.2GB base models)
- Reduced hallucination rate by 22% on factual QA tasks
**Merged Model Artifacts**:
- HuggingFace Hub: `merged-models/llama-mistral-codegen-v1`
- ONNX export: `merged-model.onnx` (4.2GB)
- Quantized INT8 version: `merged-model-int8.gguf` (1.8GB)
**Code Snippet (HuggingFace Transformers)**:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
model = AutoModelForCausalLM.from_pretrained("merged-models/llama-mistral-codegen-v1")
tokenizer = AutoTokenizer.from_pretrained("merged-models/llama-mistral-codegen-v1")
inputs = tokenizer("Write a Python function to sort a list of dictionaries by key", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
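For context on the TIES-Merging method used in the report above: before combining task vectors, TIES elects a dominant sign per parameter and discards deltas that conflict with it, which reduces interference between tasks. The snippet below is a simplified sketch of that sign-election step, with scalar parameters standing in for tensors (real implementations such as `mergekit` also trim low-magnitude deltas first; that trimming is omitted here).

```python
def ties_sign_elect(deltas):
    """Elect the dominant sign for one parameter: the sign of the summed deltas."""
    return 1.0 if sum(deltas) >= 0 else -1.0

def ties_merge_param(base_value, deltas):
    """Average only the deltas that agree with the elected sign, then
    add that average back onto the base weight."""
    sign = ties_sign_elect(deltas)
    agreeing = [d for d in deltas if d * sign > 0]
    if not agreeing:
        return base_value  # no surviving deltas: keep the base weight
    return base_value + sum(agreeing) / len(agreeing)

# Three fine-tuned models pull one parameter in conflicting directions.
deltas = [0.6, 0.5, -0.4]             # two positive, one negative
print(ties_merge_param(1.0, deltas))  # 1.55: the conflicting negative delta is dropped
```

A plain weighted average of the same deltas would let the negative delta partially cancel the positive ones; sign election is what gives TIES its robustness when merged models disagree.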