SakuraAI

An Earl McGowen Company

Prompt Library Search

Semantic search across a curated prompt library — find system prompts, instruction templates, and reusable snippets by meaning.


Built by Earl McGowen


Technical Architecture

Search: Semantic embeddings via nomic-embed-text
Inference: Ollama on Home GPU (RTX 5070)
Backend: Flask on local network, tunnelled to external host
Frontend: SvelteKit proxy on external host
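The backend flow above can be sketched as a small Flask service that forwards the query to Ollama for embedding. This is a minimal illustration, not the actual implementation: the `/search` route name is an assumption, and `OLLAMA_URL` assumes a default local Ollama install on port 11434.

```python
import requests
from flask import Flask, jsonify, request

# Assumed default Ollama endpoint on the home GPU machine.
OLLAMA_URL = "http://localhost:11434/api/embeddings"

app = Flask(__name__)

@app.route("/search", methods=["POST"])
def search():
    # Embed the incoming natural-language query with nomic-embed-text.
    query = request.get_json()["query"]
    resp = requests.post(
        OLLAMA_URL,
        json={"model": "nomic-embed-text", "prompt": query},
    )
    embedding = resp.json()["embedding"]
    # Ranking against the stored prompt-library embeddings would happen here.
    return jsonify({"embedding_dim": len(embedding)})

if __name__ == "__main__":
    app.run(port=5000)  # served on the local network, tunnelled to the external host
```

The SvelteKit frontend would proxy its requests to this service through the tunnel.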

Search

1 Type a natural-language query — describe what you need by meaning, not keywords.
2 Pick how many results to return with the Top K selector.
3 Click Search. Results are ranked by cosine similarity to your query.
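The ranking step can be sketched in a few lines: score each library embedding by cosine similarity to the query embedding and keep the top K. The vectors below are toy values for illustration, not real nomic-embed-text output.

```python
import math

def cosine(a, b):
    # Cosine similarity: dot product over the product of magnitudes.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, library, k):
    # library: list of (prompt_id, embedding) pairs.
    scored = [(pid, cosine(query_vec, vec)) for pid, vec in library]
    return sorted(scored, key=lambda s: s[1], reverse=True)[:k]

library = [
    ("summarize", [1.0, 0.0]),
    ("translate", [0.0, 1.0]),
    ("rewrite",   [0.7, 0.7]),
]
print(top_k([1.0, 0.1], library, 2))
```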

Generate Prompt

1 Run a search first — the top-k results become context for generation.
2 Optionally add a refinement instruction to steer the output.
3 Choose a model and click Generate Prompt.
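The generation steps above can be sketched as: fold the top-k hits into a context block, append the optional refinement instruction, and send the result to the chosen model. The prompt wording and function names here are illustrative assumptions; only the Ollama `/api/generate` endpoint shape is standard.

```python
import requests

def build_prompt(query, hits, refinement=""):
    # Top-k search results become the context for generation.
    context = "\n\n".join(f"Source: {h}" for h in hits)
    prompt = (
        f"Using these reference prompts:\n{context}\n\n"
        f"Write a new prompt for: {query}"
    )
    if refinement:
        # Optional steering instruction from the refinement field.
        prompt += f"\nRefinement: {refinement}"
    return prompt

def generate(model, prompt):
    # Non-streaming call to the local Ollama instance (assumed default port).
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": model, "prompt": prompt, "stream": False},
    )
    return resp.json()["response"]

if __name__ == "__main__":
    p = build_prompt(
        "a debugging assistant",
        ["You are a careful code reviewer."],
        "keep it under 100 words",
    )
    print(generate("qwen2.5-coder:7b", p))
```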

Prompt Builder

Generate a new prompt based on your search query and the top-k matching sources.

Model guide
  • mistral-small:24b — Best quality, highest latency/VRAM. May time out on complex prompts; if it does, refresh the page and retry.
  • deepseek-r1:14b — Strong reasoning, but can be verbose unless tightly constrained via the refinement field.
  • qwen2.5-coder:7b — Very solid instruction-following for prompt-writing and code-related text tasks.
  • llama3:latest — Good general-purpose baseline.
  • llama3.2:1b — Too small for reliable prompt generation; results will be inconsistent.

Note: Embedding-only models (e.g. nomic-embed-text) are not suitable for generation and will fail if selected. If a request times out, refresh the page before retrying.