# SPO | Self-Supervised Prompt Optimization <img src="../../docs/resources/spo/SPO-logo.png" width="60" height="60" style="vertical-align: middle; margin-left: 10px; position: relative; top: -5px;">

[![Paper](https://img.shields.io/badge/Paper-arXiv-red)](https://arxiv.org/pdf/2502.06855)
[![Demo](https://img.shields.io/badge/Demo-Hugging%20Face-yellow)](https://huggingface.co/spaces/XiangJinYu/SPO)
[![ModelScope](https://img.shields.io/badge/Demo-ModelScope-blue)](https://modelscope.cn/studios/AI-ModelScope/SPO)

An automated prompt engineering tool for Large Language Models (LLMs), designed for universal domain adaptation.

A next-generation prompt engineering system implementing **Self-Supervised Prompt Optimization (SPO)**. It achieves state-of-the-art performance with 17.8-90.9× higher cost efficiency than conventional methods. 🚀

<p align="center">
<a href=""><img src="../../docs/resources/spo/SPO-method.png" alt="Framework of SPO" title="Framework of SPO <sub>1</sub>" width="80%"></a>
</p>

## ✨ Core Advantages

- 💸 **Ultra-Low Cost** - _$0.15 per task optimization_
- 🏷️ **Zero Supervision** - _No ground truth/human feedback required_
- ⚡ **Universal Adaptation** - _Closed & open-ended tasks supported_
- 🔄 **Self-Evolving** - _Auto-optimization via LLM-as-judge mechanism_

## 🔗 Quick Links

- [📝 Read our paper](https://arxiv.org/pdf/2502.06855)
- [🤗 Try our Hugging Face demo](https://huggingface.co/spaces/XiangJinYu/SPO)
- [🔮 Try our ModelScope demo](https://modelscope.cn/studios/AI-ModelScope/SPO)

## 📊 Experiment

### Closed Tasks

<p align="center">
<a href=""><img src="../../docs/resources/spo/SPO-closed_task_table.png" alt="SPO closed task table" title="SPO closed task table <sub>1</sub>" width="80%"></a>
<a href=""><img src="../../docs/resources/spo/SPO-closed_task_figure.png" alt="SPO closed task figure" title="SPO closed task figure <sub>1</sub>" width="80%"></a>
</p>

*SPO demonstrates superior cost efficiency, requiring only 1.1% to 5.6% of the cost of state-of-the-art methods while maintaining competitive performance.*

### Open-ended Tasks

<p align="center">
<a href=""><img src="../../docs/resources/spo/SPO-open_ended_task_figure.png" alt="Open-ended task figure" title="Open-ended task figure <sub>1</sub>" width="80%"></a>
</p>

*SPO significantly improves model performance across all model configurations in open-ended tasks.*
## 🚀 Quick Start

### 1. Configure Your API Key ⚙️

Configure LLM parameters in `config/config2.yaml` (see `examples/spo/config2.example.yaml` for reference).
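
A minimal configuration might look like the following sketch. The field names follow MetaGPT's standard `config2.yaml` layout, but the model name, endpoint, and provider here are purely illustrative; treat `examples/spo/config2.example.yaml` as the authoritative reference:

```yaml
llm:
  api_type: "openai"                    # provider; adjust to your setup
  model: "gpt-4o-mini"                  # illustrative model name
  base_url: "https://api.openai.com/v1" # illustrative endpoint
  api_key: "YOUR_API_KEY"
```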
### 2. Define Your Iteration Template 📝

Create an iteration template file `metagpt/ext/spo/settings/task_name.yaml`:
```yaml
prompt: |
  Please solve the following problem.

requirements: |
  ...

count: None

qa:
  - question: |
      ...
    answer: |
      ...

  - question: |
      ...
    answer: |
      ...
```

Notes:
- `prompt`: Initial prompt for iteration
- `requirements`: Desired effects/outcomes (e.g., generate more thinking, use more humorous language)
- `count`: Target word count for the generated prompt (e.g., 50). Set to None for no limit
- `qa`: QA pairs used for iteration; include an appropriate number of pairs (typically 3)
  - `question`: Questions from the dataset used for iteration
  - `answer`: Corresponding answers. These can contain desired thinking patterns or responses instead of actual answers, or can be left empty. See `metagpt/ext/spo/settings/Navigate.yaml` for reference
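
For instance, a filled-in template for a poem-writing task might look like this (all field contents below are illustrative, not taken from the shipped templates):

```yaml
prompt: |
  Write a poem on the given topic.

requirements: |
  Use vivid imagery and keep a consistent rhyme scheme.

count: 50

qa:
  - question: |
      Write a poem about the sea.
    answer: |
      The poem should evoke the rhythm of waves and end on a reflective note.
```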

### 3. Implement the PromptOptimizer 🔧

You have three ways to run the PromptOptimizer:

#### Option 1: Python Script

```python
from metagpt.ext.spo.components.optimizer import PromptOptimizer
from metagpt.ext.spo.utils.llm_client import SPO_LLM

if __name__ == "__main__":
    # Initialize LLM settings
    SPO_LLM.initialize(
        optimize_kwargs={"model": "claude-3-5-sonnet-20240620", "temperature": 0.7},
        evaluate_kwargs={"model": "gpt-4o-mini", "temperature": 0.3},
        execute_kwargs={"model": "gpt-4o-mini", "temperature": 0},
    )

    # Create and run the optimizer
    optimizer = PromptOptimizer(
        optimized_path="workspace",  # Output directory
        initial_round=1,  # Starting round
        max_rounds=10,  # Maximum optimization rounds
        template="Poem.yaml",  # Template file
        name="Poem",  # Project name
    )

    optimizer.optimize()
```

#### Option 2: Command Line Interface

```bash
python -m examples.spo.optimize
```

Available command line options:
```
--opt-model       Model for optimization (default: claude-3-5-sonnet-20240620)
--opt-temp        Temperature for optimization (default: 0.7)
--eval-model      Model for evaluation (default: gpt-4o-mini)
--eval-temp       Temperature for evaluation (default: 0.3)
--exec-model      Model for execution (default: gpt-4o-mini)
--exec-temp       Temperature for execution (default: 0)
--workspace       Output directory path (default: workspace)
--initial-round   Initial round number (default: 1)
--max-rounds      Maximum number of rounds (default: 10)
--template        Template file name (default: Poem.yaml)
--name            Project name (default: Poem)
```

For help:
```bash
python -m examples.spo.optimize --help
```

#### Option 3: Streamlit Web Interface

For a more user-friendly experience, you can use the Streamlit web interface to configure and run the optimizer.

First, install Streamlit:
```bash
pip install "streamlit~=1.42.0"
```

Then run the web interface:
```bash
python -m streamlit run metagpt/ext/spo/app.py
```

### 4. View Results

```
workspace
  └── Project_name
      └── prompts
          ├── results.json
          ├── round_1
          │   ├── answers.txt
          │   └── prompt.txt
          ├── round_2
          │   ├── answers.txt
          │   └── prompt.txt
          ├── round_3
          │   ├── answers.txt
          │   └── prompt.txt
          ├── ...
          └── round_n
              ├── answers.txt
              └── prompt.txt
```

- `results.json`: Stores whether each iteration round was judged successful and other related information
- `prompt.txt`: The optimized prompt for the corresponding round
- `answers.txt`: The output generated using the prompt for the corresponding round

## Citation

If you use SPO in your research, please cite our paper:

```
@misc{xiang2025spo,
      title={Self-Supervised Prompt Optimization},
      author={Jinyu Xiang and Jiayi Zhang and Zhaoyang Yu and Fengwei Teng and Jinhao Tu and Xinbing Liang and Sirui Hong and Chenglin Wu and Yuyu Luo},
      year={2025},
      eprint={2502.06855},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2502.06855},
}
```