Upload files to "/"

2026-04-05 07:17:02 +00:00
parent 39070e07d8
commit f641dfa2ba
5 changed files with 254 additions and 57 deletions
@@ -9,6 +9,7 @@ An intelligent prompt classification and routing pipeline for [Open WebUI](https
 - **Brave web search** with full page content fetching (top 3 results scraped)
 - **Heuristic search overrides** — safety net that forces search for time-sensitive or factual questions
 - **Image generation** via AUTOMATIC1111/Forge (Stable Diffusion XL) with LLM-refined prompts
+- **Uncensored image generation** — prefix any prompt with `uncen` to bypass all classification/search and generate directly with Juggernaut XL v9
 - **VRAM management** — automatically juggles GPU memory between Ollama and Stable Diffusion
 - **Bilingual** — detects Finnish and forces responses in the correct language
 - **Thinking/reasoning display** — streams model thinking tokens in collapsible blocks
@@ -23,6 +24,7 @@ An intelligent prompt classification and routing pipeline for [Open WebUI](https
 | reasoning (FI) | gpt-oss:120b | gpt-oss:20b | Analysis, comparison, strategy (Finnish) |
 | reasoning (EN) | gpt-oss:120b | gpt-oss:20b | Analysis, comparison, strategy (English) |
 | image generation | gpt-oss:120b + SDXL | gpt-oss:20b + SDXL | "generate an image", "luo kuva" |
+| uncensored image | Juggernaut XL v9 | Juggernaut XL v9 | Prompt starts with `uncen` |
 | vision | llama3.2-vision:11b | llama3.2-vision:11b | User uploads an image |
 | general | gpt-oss:120b | gpt-oss:20b | Everything else |

@@ -99,14 +101,19 @@ cd ~/stable-diffusion-webui
 mkdir -p models/Stable-diffusion
 wget -O models/Stable-diffusion/sd_xl_base_1.0.safetensors \
    "https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/resolve/main/sd_xl_base_1.0.safetensors"
+
+# Download Juggernaut XL v9 for uncensored image generation (~6.6GB)
+wget -O models/Stable-diffusion/juggernautXL_v9.safetensors \
+    "https://huggingface.co/RunDiffusion/Juggernaut-XL-v9/resolve/main/Juggernaut-XL_v9_RunDiffusionPhoto_v2.safetensors"
 ```

 #### Fix Python 3.10 build issues (Ubuntu 22.04)

-Before the first launch, pre-install CLIP dependencies to avoid build failures:
+The first launch creates a Python venv and installs dependencies. On Python 3.10, CLIP fails to build because of a `pkg_resources` issue. Let the first launch fail, then fix the venv:

 ```bash
 cd ~/stable-diffusion-webui
+
 # First launch creates the venv — run it once, let it fail, then fix:
 ./webui.sh --api --listen --xformers --no-half-vae || true

@@ -119,7 +126,7 @@ venv/bin/pip install --no-build-isolation \
 ./webui.sh --api --listen --xformers --no-half-vae
 ```

-#### Select SDXL model
+#### Select the default SDXL model

 Once the UI is running, open it in a browser and select `sd_xl_base_1.0` from the checkpoint dropdown. Or via API:

@@ -129,14 +136,18 @@ curl -X POST http://localhost:7860/sdapi/v1/options \
    -d '{"sd_model_checkpoint": "sd_xl_base_1.0.safetensors"}'
 ```

+The pipeline automatically switches between models at runtime — `sd_xl_base_1.0` for normal generation, `juggernautXL_v9` when the `uncen` prefix is used.
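
The runtime switch described above can be sketched as the same `/sdapi/v1/options` call shown in the curl example, issued per request. This is a minimal illustration, not the pipeline's actual code: the helper names are hypothetical, and `SD_URL` and the checkpoint filenames are assumed to match the download and curl steps earlier in this section.

```python
import json
import urllib.request

# Assumed values: base URL from the curl example, filenames from the wget step.
SD_URL = "http://localhost:7860"
DEFAULT_CHECKPOINT = "sd_xl_base_1.0.safetensors"
UNCENSORED_CHECKPOINT = "juggernautXL_v9.safetensors"


def checkpoint_for(uncensored: bool) -> str:
    """Return the checkpoint filename for the requested mode (hypothetical helper)."""
    return UNCENSORED_CHECKPOINT if uncensored else DEFAULT_CHECKPOINT


def build_switch_request(checkpoint: str) -> urllib.request.Request:
    """Build, but do not send, the POST that selects a checkpoint."""
    body = json.dumps({"sd_model_checkpoint": checkpoint}).encode("utf-8")
    return urllib.request.Request(
        f"{SD_URL}/sdapi/v1/options",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` would perform the switch. Building the request separately keeps the sketch testable without a running Stable Diffusion server.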
+
 #### Create a systemd service

+Using the provided script:
+
 ```bash
 chmod +x setup-sd-service.sh
 sudo ./setup-sd-service.sh
 ```

-Or manually:
+Or manually (replace `$USER` and `$HOME` with your username and home directory):

 ```bash
 sudo tee /etc/systemd/system/stable-diffusion.service > /dev/null <<EOF
@@ -164,12 +175,16 @@ sudo systemctl enable --now stable-diffusion
 #### Verify

 ```bash
+# Check the service is running
+sudo systemctl status stable-diffusion
+
+# Check available models (should list both sd_xl_base_1.0 and juggernautXL_v9)
 curl -s http://localhost:7860/sdapi/v1/sd-models | python3 -m json.tool
 ```

 ### 4. Network Configuration

-The pipeline runs inside Open WebUI's Docker container and needs to reach:
+The pipeline runs inside Open WebUI's Docker container and needs to reach services on the host:

 | Service | URL from container | Notes |
 |---|---|---|
@@ -182,21 +197,57 @@ To find your bridge gateway IP:
 docker network inspect <your_network> --format '{{range .IPAM.Config}}{{.Gateway}}{{end}}'
 ```

+Update `SD_URL` in the pipeline file if your gateway IP differs from `172.18.0.1`.
+
 Verify connectivity from inside the container:

 ```bash
 docker exec open-webui curl -s http://172.18.0.1:7860/sdapi/v1/sd-models
+docker exec open-webui curl -s http://ollama:11434/api/tags | head -c 100
 ```

+## Image Generation
+
+### Default mode
+
+Any prompt classified as `image_generation` (e.g. "generate an image of a cat in space") uses **SDXL Base 1.0**. The LLM refines the user's request into an optimized Stable Diffusion prompt with quality boosters, then calls the A1111 API.
+
+### Uncensored mode
+
+Prefix any prompt with `uncen` to bypass all classification, web search, and routing — the pipeline goes straight to image generation using **Juggernaut XL v9**:
+
+```
+uncen a beautiful sunset over the ocean
+uncen portrait of a warrior in golden armor
+```
+
+The `uncen` prefix is stripped and the user's text is sent directly to Stable Diffusion with quality tags appended — **no LLM refinement** (to avoid model refusal). The pipeline switches the SD checkpoint via the API automatically.
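
As a rough sketch of the prefix handling described above, the function below detects `uncen`, strips it, and appends quality tags. The function name, the case-insensitive match, and the specific tag string are all assumptions made for illustration; the real pipeline may differ on each point.

```python
from typing import Tuple

# Hypothetical quality tags; the README does not list the actual ones.
QUALITY_TAGS = "masterpiece, best quality, highly detailed"


def parse_uncen(prompt: str) -> Tuple[bool, str]:
    """Return (is_uncensored, sd_prompt) for a user prompt.

    A prompt counts as uncensored only when its first whitespace-separated
    word is exactly "uncen" (case-insensitive here, which is an assumption).
    """
    stripped = prompt.strip()
    parts = stripped.split(maxsplit=1)
    if not parts or parts[0].lower() != "uncen":
        # Not an uncensored request: hand the text back unchanged.
        return False, stripped
    text = parts[1].strip() if len(parts) > 1 else ""
    # Append tags directly, with no LLM call in between.
    sd_prompt = f"{text}, {QUALITY_TAGS}" if text else QUALITY_TAGS
    return True, sd_prompt
```

Matching on the whole first word means a prompt such as "uncensored history of Rome" is not treated as an image request.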
+
+### How it works
+
+**Default mode:**
+1. LLM (gpt-oss) converts the user request into an optimized SD prompt
+2. Ollama models are unloaded from VRAM
+3. SD checkpoint is loaded (SDXL Base)
+4. Image is generated, compressed PNG→JPEG, and streamed in 4KB chunks
+5. SD checkpoint is unloaded from VRAM and page cache is dropped
+
+**Uncensored mode:**
+1. `uncen` prefix is stripped, quality tags appended directly (no LLM call)
+2. Ollama models are unloaded from VRAM
+3. SD checkpoint is switched to Juggernaut XL v9
+4. Image is generated, compressed PNG→JPEG, and streamed in 4KB chunks
+5. SD checkpoint is unloaded from VRAM and page cache is dropped
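
The chunked streaming in step 4 of both lists might look roughly like the sketch below. It assumes the JPEG is embedded as a base64 data URI inside a Markdown image tag and that "4KB" means 4096 characters; neither detail is confirmed by the README, and both helper names are hypothetical. The PNG→JPEG conversion itself is left out because it needs an imaging library.

```python
import base64
from typing import Iterator

CHUNK_SIZE = 4096  # "4KB" interpreted as 4096 characters (assumption)


def image_markdown(jpeg_bytes: bytes) -> str:
    """Wrap JPEG bytes in a Markdown image tag using a base64 data URI."""
    encoded = base64.b64encode(jpeg_bytes).decode("ascii")
    return f"![generated image](data:image/jpeg;base64,{encoded})"


def iter_chunks(text: str, size: int = CHUNK_SIZE) -> Iterator[str]:
    """Yield consecutive slices of `text`, each at most `size` characters."""
    for start in range(0, len(text), size):
        yield text[start:start + size]
```

A streaming pipeline would yield each chunk to the client in turn; joining the chunks reproduces the original string exactly.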
+
 ## VRAM Management

-On a single 16GB GPU, gpt-oss:120b and SDXL cannot be loaded simultaneously. The pipeline handles this automatically:
+On a single 16GB GPU, large Ollama models and SDXL cannot be loaded simultaneously. The pipeline handles this automatically:

-1. **Before image generation**: unloads all Ollama models from VRAM
-2. **After image generation**: unloads SD checkpoint from VRAM and drops Linux page cache
+1. **Before image generation**: unloads all Ollama models from VRAM by sending requests with `keep_alive: 0`
+2. **After image generation**: unloads the SD checkpoint by calling `/sdapi/v1/unload-checkpoint`, then drops the Linux page cache
 3. Ollama reloads the model on the next chat request (~10-15s warm-up)
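
The two unload calls in the list above can be sketched as plain HTTP requests. This is an illustration under stated assumptions, not the pipeline's code: the base URLs are taken from the connectivity check in the Network Configuration section, the helper names are hypothetical, and the list of loaded models is assumed to come from Ollama's `/api/ps` endpoint.

```python
import json
import urllib.request
from typing import List

# Assumed URLs, copied from the container connectivity check.
OLLAMA_URL = "http://ollama:11434"
SD_URL = "http://172.18.0.1:7860"


def _post(url: str, payload: dict) -> urllib.request.Request:
    """Build, but do not send, a JSON POST request."""
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def loaded_model_names(ps_response: dict) -> List[str]:
    """Extract model names from a parsed /api/ps response."""
    return [m["name"] for m in ps_response.get("models", [])]


def ollama_unload_request(model: str) -> urllib.request.Request:
    """A generate call with keep_alive 0 asks Ollama to unload the model."""
    return _post(f"{OLLAMA_URL}/api/generate", {"model": model, "keep_alive": 0})


def sd_unload_request() -> urllib.request.Request:
    """Ask the Stable Diffusion web UI to release its checkpoint from VRAM."""
    return _post(f"{SD_URL}/sdapi/v1/unload-checkpoint", {})
```

Sending one `ollama_unload_request` per name returned by `loaded_model_names` covers the "unloads all Ollama models" step. Dropping the page cache requires root on the host and is not shown here.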

-If Ollama fails to load after image generation with a memory error, clear the page cache:
+If Ollama fails to load a model after image generation and reports a memory error, clear the page cache manually:

 ```bash
 sudo sh -c 'sync; echo 3 > /proc/sys/vm/drop_caches'
@@ -206,6 +257,8 @@ sudo sh -c 'sync; echo 3 > /proc/sys/vm/drop_caches'

 ```
 User Message
+    │
+    ├─ "uncen" prefix? ─────────────── → Juggernaut XL v9 (direct, no search)
    │
    ├─ Image uploaded? ──────────────── → llama3.2-vision:11b
    │
@@ -214,7 +267,7 @@ User Message
    │       ├─ coding ──────────────── → qwen2.5-coder:14b
    │       ├─ diagram ─────────────── → qwen2.5-coder:14b (Mermaid)
    │       ├─ reasoning ───────────── → gpt-oss:120b (FI/EN system prompt)
-    │       ├─ image_generation ────── → gpt-oss:120b (refine) → SDXL (generate)
+    │       ├─ image_generation ────── → gpt-oss:120b (refine) → SDXL Base
    │       └─ general ─────────────── → gpt-oss:120b
    │
    ├─ Heuristic Search Override
@@ -230,7 +283,7 @@ User Message
 |---|---|
 | `llm_router_v3.py` | Main pipeline (gpt-oss:120b) |
 | `llm_router-20b.py` | Lighter pipeline variant (gpt-oss:20b) |
-| `setup-sd.sh` | Stable Diffusion Forge install script |
+| `setup-sd.sh` | Stable Diffusion Forge install script (Ubuntu 22.04) |
 | `setup-sd-service.sh` | systemd service creation script |

 ## License