
What Are Soft Prompts and How to Train Them

So, we all know hard prompts. You ask an LLM like Llama or ChatGPT: "what is a hard prompt?" Or you tell a diffusion model like Flux: "generate a horse with big eyes and a huge frog riding it, in anime style."

With a hard prompt it's easy to see what you want. But it's harder for the LLM to interpret, so people looked for a better way.

It was found that you can start from random tokens like "fhytghhffbg" or "jgygfcbcvhh" and train them into soft prompts: not readable by humans, but perfectly understandable to an LLM or to a diffusion model for images.

So, how do you transform a hard prompt into trained soft prompt tokens?

I will show the process using CLIP.

CLIP is a huge transformer neural network used to associate text with images for diffusion image generation models. CLIP knows how things look and can pass this knowledge on to help generate complex images. It knows what a horse looks like, what an eye looks like, and so on.

What does the training process look like?

  1. Starting point:
  • Hard prompt: "image of serene landscape at sunrise with trees mountains and lake"
  • Random soft tokens: [v1][v2][v3]…[v10], initialized as random noise; the v's look like "tfhgdth", as shown earlier.
  2. Training process:

Original prompt → CLIP embeddings
                        ↑
[v1][v2][v3] → train these to match

  • Get embeddings of your hard prompt using CLIP
  • Gradually update the soft tokens to capture the same meaning
  • Use a loss function to make the soft tokens match the semantic meaning. This means the program tries to minimize the distance between the hard-prompt embedding and the soft tokens.
  3. Result:
  • Trained tokens like [v1][v2][v3]… that encode the “landscape” concept
  • These might look like gibberish (e.g., "jX#@ kL$%"), but they encode the meaning
  4. Usage:

    # Original way
    "image of serene landscape at sunrise with trees mountains and lake"

    # With trained soft prompt
    "[v1][v2][v3] mountain view"  # shorter but carries the same meaning
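The steps above can be sketched in a few lines. This is a minimal toy illustration, not real CLIP code: the hard prompt's CLIP embedding is replaced by a random stand-in vector, and plain gradient descent on a squared-distance loss stands in for a full optimizer. All names and sizes here are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
embed_dim = 8        # real CLIP text embeddings are 512 or 768 dimensions
num_soft_tokens = 3  # [v1][v2][v3]

# Stand-in for the hard prompt's CLIP embedding (one vector per soft token).
target = rng.normal(size=(num_soft_tokens, embed_dim))

# Soft tokens start as random noise, exactly as described above.
soft = rng.normal(size=(num_soft_tokens, embed_dim))

lr = 0.1
for step in range(500):
    diff = soft - target
    loss = (diff ** 2).sum()   # squared distance between soft and target
    grad = 2 * diff            # gradient of the loss w.r.t. the soft tokens
    soft -= lr * grad          # gradient-descent update

print(loss < 1e-6)  # after training, the distance is essentially zero
```

In a real setup the loss would be computed on the CLIP encoder's output rather than directly on the token vectors, and you would use an optimizer like Adam, but the idea is the same: nudge the random vectors until they land where the hard prompt's meaning lives.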

The key idea is:

  • Hard prompt: Human-readable, long, descriptive
  • Soft prompt: Machine-optimized tokens that encode the same meaning more efficiently
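At inference time the trained vectors skip the tokenizer entirely and are prepended to the embedded prompt before it enters the model. A toy sketch of that step, with made-up shapes and illustrative names like `soft_tokens`:

```python
import numpy as np

embed_dim = 8
soft_tokens = np.zeros((3, embed_dim))    # trained [v1][v2][v3] vectors
prompt_embeds = np.ones((4, embed_dim))   # embeddings of the short hard prompt

# The model sees one sequence of embeddings: soft tokens first, then words.
full_input = np.concatenate([soft_tokens, prompt_embeds], axis=0)
print(full_input.shape)  # (7, 8): 3 soft tokens + 4 hard tokens
```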

Further reading:

Here is a curated list of sources that delve into the concept of soft prompts and their applications in natural language processing:

1. “Soft Prompts” – Learn Prompting

Description: An in-depth exploration of soft prompts, detailing their development and applications.

Link: https://learnprompting.org/docs/trainable/soft_prompting

2. “Soft Prompts” – Hugging Face

Description: A comprehensive guide on soft prompts, discussing their implementation and benefits in model training.

Link: https://huggingface.co/docs/peft/conceptual_guides/prompting

3. “Understanding Prompt Tuning: Enhance Your Language Models with Precision” – DataCamp

Description: An article explaining prompt tuning and how soft prompts can be utilized to improve language model performance.

Link: https://www.datacamp.com/tutorial/understanding-prompt-tuning

4. “Soft Prompts” – KoboldAI/KoboldAI-Client GitHub Wiki

Description: A technical overview of soft prompts, including their creation and integration within the KoboldAI framework.

Link: https://github-wiki-see.page/m/KoboldAI/KoboldAI-Client/wiki/Soft-Prompts

5. “The Power of Scale for Parameter-Efficient Prompt Tuning” – arXiv

Description: A research paper discussing the scalability and efficiency of prompt tuning using soft prompts.

Link: https://arxiv.org/abs/2104.08691

6. “Guiding Frozen Language Models with Learned Soft Prompts” – Google Research

Description: An article exploring how soft prompts can be used to guide pre-trained language models without altering their core parameters.

Link: https://research.google/blog/guiding-frozen-language-models-with-learned-soft-prompts/

These resources provide comprehensive insights into soft prompts and their applications in enhancing language model performance.

Warning: deep rabbit hole ahead
