This guide is structured to work as a plug-and-play resource for LLMs. Upload the HTML file to Claude, ChatGPT, Gemini or any AI assistant and ask it to walk you through setup step by step — one command at a time, verifying each output before moving on. No prior Linux or mining experience needed.
Step 01
Download the guide
Click "Download HTML" to save the guide file locally
Step 02
Open your AI assistant
Claude, ChatGPT, Gemini — any LLM that accepts file uploads
Step 03
Upload the guide + prompt
Say: "Follow this guide to help me set up a Pearl miner. One command at a time, verify each output."
Step 04
Follow along
Paste each command output back to the AI — it verifies and moves on
⚡ Tip — Expose Port 44108 Before Deploying
Pearl's P2P port is 44108. Adding it in RunPod's TCP port exposures field before deploying gives you more peers and faster block propagation. Takes 5 seconds. If you forgot — don't restart for it, just add it next time. 16 peers works fine for mining.
Pearl Miner Setup Guide
Complete installation, configuration, health monitoring & troubleshooting
RunPod / Vast.ai / Lambda · H100 / H200 Required (tested on RunPod 2x H200) · CUDA 12.4+ (tested on 12.8) · DP=2 Mode · Llama 70B · Ubuntu 22.04

Contents —
Hardware Requirements & Pod Verification (READ FIRST)
00a Cloud Provider Paths & Compatibility
00 Quick Reference (Key Settings)
01 System Dependencies
02 Clone & Build
02b Set Environment Variables (Persistent)
03 Wallet & Node Setup
04 Start Mining
05 Health Check & Diagnostics
06 Troubleshooting
07 Block Verification
08 Quick Restart Reference
08b Additional Critical Notes
09 Key Lessons Learned
10 Important Gotchas & Edge Cases

⚡ Hardware Requirements & Pod Verification
🚨 This guide was built and tested exclusively on RunPod 2x H200 SXM. Every command, expected output, VRAM value, power draw figure, and health check threshold in this guide is calibrated for that specific setup. If you use different hardware, commands will still work but expected output values will differ.

What This Guide Is Designed For
Component | This Guide's Setup | Notes
GPU | 2x NVIDIA H200 SXM | Confirmed working. All expected values in this guide are for H200.
VRAM per GPU | 143,771 MiB (~141GB) | After model load: ~132,964 MiB GPU0 / ~131,399 MiB GPU1
CUDA Version | 12.8 | Minimum: 12.4. Tested on 12.8.
Driver | 570.211.01 | Any 520+ should work
GPU Count | Exactly 2 | Guide uses --data-parallel-size 2
System RAM | 64GB+ | Needed for build + model loading
Disk | 300GB+ persistent | Model ~140GB + builds ~50GB + chain ~5GB
OS | Ubuntu 22.04 | RunPod default image
Provider | RunPod | See Section 00a for other providers
GPU power at full mining | ~690W each (near 700W TDP) | This is the health indicator — if power is 120W, mining is not happening

Other Hardware — Community Reports (Not Tested by This Guide)
⚠️ The following is based on Pearl Discord community reports — NOT verified by this guide. If you use different hardware, expected output values will differ from what this guide shows. Proceed with caution and adapt health check thresholds accordingly.

GPU | Community Status | Notes
H200 SXM ×2 | ✅ This guide — confirmed | Reference setup for this guide
H100 SXM ×2 | ✅ Community confirmed | Works. 80GB VRAM each. Adjust VRAM expectations in health checks.
H100 NVL ×1 + H200 ×1 | ⚠️ Community reported | Mixed setup. Some users got blocks.
Single H200 | ⚠️ Possible | Use --data-parallel-size 1, 64 requests. Lower hashrate.
A100 ×2 | ❌ Not recommended | Ampere architecture — Pearl kernel targets Hopper. May not compile.
RTX 4090 ×2 | ❌ Insufficient VRAM | 24GB each = 48GB total. Not enough for 70B model.

Step 0 — Verify Your Pod Before Starting
Run these immediately after SSH-ing in. If any check fails, reprovision before continuing.
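The GPU portion of these checks can also be reduced to a single pass/fail answer by parsing nvidia-smi's CSV output. A minimal sketch (the `check_gpus` helper is illustrative, not part of the guide; the demo uses an inline sample so it runs without a GPU — on the pod, pipe in real nvidia-smi output instead):

```shell
# check_gpus — hypothetical helper. Expects lines in the format of:
#   nvidia-smi --query-gpu=name,memory.total,driver_version --format=csv,noheader
# Prints PASS only if exactly 2 GPUs each report >= 140000 MiB total VRAM
# (the H200-class value this guide is calibrated for).
check_gpus() {
  awk -F', ' '
    { n++; gsub(/ MiB/, "", $2); if ($2 + 0 < 140000) bad = 1 }
    END { if (n == 2 && !bad) print "PASS"; else print "FAIL" }
  '
}

# Offline demo with a sample mimicking the expected H200 output:
printf '%s\n' 'NVIDIA H200, 143771 MiB, 570.211.01' \
              'NVIDIA H200, 143771 MiB, 570.211.01' | check_gpus   # PASS
```

On the pod: `nvidia-smi --query-gpu=name,memory.total,driver_version --format=csv,noheader | check_gpus`. Lower the 140000 threshold if you adapt the guide for H100-class pods.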
Check 1 — GPU model, CUDA, VRAM
nvidia-smi
✅ Good (H200): 2x H200, CUDA 12.8, Driver 570+, 143771 MiB each, 0 MiB used
❌ Bad: Wrong GPU, CUDA <12.4, only 1 GPU, or VRAM already used → reprovision

Check 2 — Disk space (need 300GB+ free)
df -h | sort -rh | head -8
✅ Good: 300GB+ available on at least one partition
❌ Bad: Less than 300GB → expand disk before proceeding

Check 3 — RAM
free -h
✅ Good: 64GB+ total RAM
❌ Bad: Under 64GB → may OOM during build or model load

Check 4 — OS
lsb_release -a 2>/dev/null || cat /etc/os-release | head -5
✅ Good: Ubuntu 22.04 LTS (Jammy)
⚠️ Untested: Ubuntu 20.04 — may work but not verified by this guide

Check 5 — Internet
curl -s --max-time 5 https://github.com > /dev/null && echo "GitHub OK" && curl -s --max-time 5 https://huggingface.co > /dev/null && echo "HuggingFace OK"
✅ Good: GitHub OK / HuggingFace OK
❌ Bad: Blocked → check provider firewall / outbound rules

Full pre-flight one-liner
echo "=== GPU ===" && nvidia-smi --query-gpu=name,memory.total,driver_version --format=csv,noheader && echo "=== DISK ===" && df -h | sort -rh | head -5 && echo "=== RAM ===" && free -h | grep Mem && echo "=== OS ===" && lsb_release -d 2>/dev/null && echo "=== NETWORK ===" && curl -s --max-time 5 https://github.com > /dev/null && echo "GitHub OK" || echo "GitHub BLOCKED"

✅ Only proceed to Step 01 if: 2x H200 (or compatible GPU), CUDA 12.4+, 300GB+ disk, 64GB+ RAM, Ubuntu 22.04, GitHub reachable.

00a Cloud Provider Paths & Compatibility
This guide was built and tested on RunPod. The core setup is identical across providers — only storage paths and a few installation details differ.
Use this table to adapt the guide for your provider. Provider HF_HOME path UV Cache path Persistent storage Notes RunPod ✅ Tested /workspace/.hf /workspace/.uv-cache /workspace Deadsnakes PPA blocked — use UV for Python 3.12. Ubuntu 22.04. Vast.ai /root/.cache/huggingface or /workspace/.hf /root/.cache/uv /workspace (if attached) Use Custom Template. Ubuntu 22.04 works. apt python3.12 may work via deadsnakes. Lambda Labs /home/ubuntu/.cache/huggingface /home/ubuntu/.cache/uv /home/ubuntu Ubuntu 22.04. Python 3.12 via deadsnakes should work. Run as ubuntu not root. CoreWeave /mnt/data/.hf /mnt/data/.uv-cache /mnt/data Kubernetes-based. Persistent volume must be mounted manually. Paperspace /notebooks/.hf /notebooks/.uv-cache /notebooks Ubuntu 20.04/22.04. Python 3.12 via deadsnakes. Any provider Any path with 200GB+ free space Any writable path Check df -h for largest partition Find largest partition: df -h | sort -rh | head -5 How to Adapt This Guide for Any Provider Replace every occurrence of /workspace/.hf with your provider's persistent storage path, and /workspace/.uv-cache with the UV cache path. The two places these appear are: 1. In ~/.bashrc export HF_HOME=/YOUR_PROVIDER_PATH/.hf 2. In the build:miner command (Step 2) cd /root/pearl && export UV_CACHE_DIR=/YOUR_PROVIDER_PATH/.uv-cache && export HF_HOME=/YOUR_PROVIDER_PATH/.hf && task build:miner Python 3.12 Installation by Provider Provider Python 3.12 Method Command RunPod apt blocked — use UV uv python install 3.12 Vast.ai Try apt first, fallback to UV apt-get install -y python3.12 || uv python install 3.12 Lambda / Paperspace apt via deadsnakes PPA add-apt-repository ppa:deadsnakes/ppa && apt-get install -y python3.12 Any provider (universal) UV always works uv python install 3.12 ℹ️ UV-based Python install (Step 1) always works regardless of provider — it downloads a standalone CPython binary. Use it as the universal fallback if apt fails. 
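The path swap described above can be scripted instead of hand-edited. A sketch (`adapt_paths` and the `/mnt/data` example target are illustrative names, not guide commands; assumes GNU sed, as on Ubuntu):

```shell
# adapt_paths — hypothetical helper: rewrite the guide's RunPod /workspace
# paths to another provider's persistent storage root, in place.
adapt_paths() {
  file=$1       # file containing commands or env exports (e.g. ~/.bashrc)
  new_root=$2   # e.g. /mnt/data on CoreWeave, /home/ubuntu on Lambda
  sed -i "s#/workspace#${new_root}#g" "$file"
}

# Usage sketch:
#   adapt_paths ~/.bashrc /mnt/data
#   grep HF_HOME ~/.bashrc    # should now show /mnt/data/.hf
```

This rewrites both occurrences at once (HF_HOME and UV_CACHE_DIR), which keeps the two paths consistent as Section 00a requires.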
Pre-flight Check (run on any fresh pod) Verify GPU + CUDA before starting nvidia-smi && echo "CUDA OK" || echo "NO GPU DETECTED" ✅ Good Shows H100/H200, CUDA 12.x, Driver 520+ ❌ Bad No GPU detected → wrong instance type, reprovision Find largest storage partition (for HF_HOME) df -h | sort -rh | head -5 Pick the partition with 300GB+ free space for HF_HOME. The 70B model needs ~140GB. 00 Quick Reference Setting Value Why Parallelism --data-parallel-size 2 NOT tensor parallel — TP reduces m dimension Prefix Caching --no-enable-prefix-caching MUST disable — caching = no GEMM = no mining Chunked Prefill --no-enable-chunked-prefill Must disable for correct mining behavior GPU Memory --gpu-memory-utilization 0.9 Leave 10% headroom Model Length --max-model-len 8192 Fits in 80GB VRAM Execution --enforce-eager Required for Pearl kernel ZK Speed export RAYON_NUM_THREADS=96 Faster proof generation Deep GEMM export VLLM_USE_DEEP_GEMM=0 Disable — conflicts with Pearl GEMM Requests 128 concurrent long-prompt requests Long prompts (~150+ tokens) needed for m≥5000 Loop pattern sleep 1 (NOT wait) wait causes GPU to idle between batches → 0% util Request port port 8000 ONLY DP=2 exposes single port — port 8001 drops silently Socket Count 4 ESTAB connections 2 per DP engine = 4 total when healthy n value in NOISY_GEMM 57344 Confirms DP mode (TP gives 28672) Node RPC port 44107 (pearld) pearl daemon Wallet RPC port 44207 (oyster) wallet daemon 01 System Dependencies ℹ️ Run each block separately. Verify output before moving to next step. 
Go Language
Run
wget -q https://go.dev/dl/go1.24.2.linux-amd64.tar.gz && tar -C /usr/local -xzf go1.24.2.linux-amd64.tar.gz && export PATH=$PATH:/usr/local/go/bin && echo 'export PATH=$PATH:/usr/local/go/bin' >> ~/.bashrc
Verify
go version
✅ Good: go version go1.24.2 linux/amd64
❌ Bad: command not found → re-run wget/tar command

Rust Toolchain
Run
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y && source ~/.cargo/env
Verify
rustc --version
✅ Good: rustc 1.xx.x (xxxxxxx YYYY-MM-DD)
❌ Bad: command not found → source ~/.cargo/env

UV Package Manager
Run
curl -LsSf https://astral.sh/uv/install.sh | sh && source $HOME/.local/bin/env
✅ Good: uv 0.x.x
❌ Bad: command not found → run: source $HOME/.local/bin/env

Taskfile
Run
sh -c "$(curl --location https://taskfile.dev/install.sh)" -- -d -b /usr/local/bin
✅ Good: Task version: vx.x.x
❌ Bad: permission denied → check /usr/local/bin permissions

tmux
Run
apt-get update -qq && apt-get install -y tmux

Python 3.12
⚠️ On RunPod, the deadsnakes PPA is blocked — apt-get install python3.12 will fail with "Unable to locate package". Use UV to install Python 3.12 instead (UV is already installed above).
Install Python 3.12 via UV
uv python install 3.12
Make it the system default
ln -sf $(uv python find 3.12) /usr/local/bin/python3.12 && update-alternatives --install /usr/bin/python3 python3 /usr/local/bin/python3.12 1 && python3 --version
✅ Good: Python 3.12.x
❌ Bad: command not found → re-run uv python install 3.12

02 Clone & Build
Clone Repository
Run
cd /root && git clone https://github.com/pearl-research-labs/pearl.git && cd pearl
✅ Good: Cloning into 'pearl'... done. You are now in /root/pearl
❌ Bad: fatal: repository not found → check internet
ℹ️ All build commands must run from /root/pearl directory.
Verify with: pwd → should show /root/pearl

Build Blockchain
Run (from /root/pearl)
cd /root/pearl && task build:blockchain
✅ Good: Build completes without errors
❌ Bad: go: command not found → export PATH=$PATH:/usr/local/go/bin
Verify binaries exist
ls -la /root/pearl/bin/pearld /root/pearl/bin/oyster /root/pearl/bin/prlctl
✅ Good: All 3 files listed with size >0
❌ Bad: No such file → build failed, check task output for errors

Build Miner (~20-25 minutes)
⚠️ This takes 20-25 minutes. Do NOT interrupt it! First run compiles CUDA kernels.
Run (from /root/pearl)
cd /root/pearl && export UV_CACHE_DIR=/workspace/.uv-cache && export HF_HOME=/workspace/.hf && task build:miner
✅ Good: Installed 265 packages — vllm==0.20.0+cu129 in list
❌ Bad: CUDA build failed → check nvidia-smi shows H100/H200
Verify venv exists
ls /root/pearl/.venv/bin/vllm && ls /root/pearl/.venv/bin/pearl-gateway
✅ Good: Both files listed — build successful
❌ Bad: No such file → miner build failed, re-run task build:miner

02b Set Environment Variables (Persistent)
🚨 CRITICAL: Do this BEFORE starting the miner. Env vars must be in ~/.bashrc so they survive across shell sessions and tmux windows. If you only export them inline, the gateway will fail with "mining_address: Field required" because the vars don't reach the tmux session.
Add all required env vars to ~/.bashrc now (you will update PEARLD_MINING_ADDRESS after Step 3):
Add to ~/.bashrc
cat >> ~/.bashrc << 'EOF'
export RAYON_NUM_THREADS=96
export VLLM_USE_DEEP_GEMM=0
export HF_HOME=/workspace/.hf
export PEARLD_RPC_URL=http://localhost:44107
export PEARLD_RPC_USER=rpcuser
export PEARLD_RPC_PASSWORD=rpcpass
export PEARLD_MINING_ADDRESS=PLACEHOLDER
EOF
Verify
source ~/.bashrc && echo $VLLM_USE_DEEP_GEMM
✅ Good: Prints: 0
❌ Bad: Empty output → vars not set, re-run the cat command
⚠️ After generating your mining address in Step 3, update ~/.bashrc: replace PLACEHOLDER with your real address, then run source ~/.bashrc before starting the miner in Step 4.

03 Wallet & Node Setup
Create Wallet
Run
cd /root/pearl && ./bin/oyster --create
🚨 CRITICAL: Write down the 12-word seed phrase! This is your ONLY backup. If you lose it you lose all mined PRL forever.
When prompted, answer as follows:
Do you want to add a passphrase?
No (just press Enter) — or set one you'll remember
Do you have an existing seed phrase? → No
Seed phrase shown → ⚠️ WRITE IT DOWN NOW — all 12 words in order
Type OK to confirm → OK

Start tmux Sessions
Run
tmux new-session -d -s node && tmux new-session -d -s miner && tmux new-session -d -s loop
✅ Good: tmux ls shows: node, miner, loop sessions
❌ Bad: session exists → tmux kill-session -t node first

Start Blockchain Node
Run
tmux send-keys -t node "cd /root/pearl && ./bin/pearld --rpcuser=rpcuser --rpcpass=rpcpass --rpclisten=0.0.0.0:44107 --txindex --notls" Enter
Wait 30 seconds then verify:
Verify
cd /root/pearl && ./bin/prlctl -u rpcuser -P rpcpass -s localhost:44107 --notls getblockcount
✅ Good: Returns a block number (e.g., 36000+)
❌ Bad: connection refused → node not started, check tmux node session

Get Mining Address
Run
/root/pearl/bin/oyster -u rpcuser -P pearl123 --noclienttls --noservertls --pearldusername=rpcuser --pearldpassword=rpcpass > /tmp/oyster.log 2>&1 & sleep 15 && /root/pearl/bin/prlctl -u rpcuser -P pearl123 -s localhost:44207 --wallet --notls getnewaddress
✅ Good: Returns address starting with prl1p...
❌ Bad: connection refused → oyster not ready, wait longer and retry
⚠️ SAVE THIS ADDRESS! You'll need it in every restart command. Also verify it with validateaddress below.
🚨 Now update your ~/.bashrc with the real address:
sed -i 's/PEARLD_MINING_ADDRESS=PLACEHOLDER/PEARLD_MINING_ADDRESS=YOUR_ACTUAL_ADDRESS/' ~/.bashrc && source ~/.bashrc && echo $PEARLD_MINING_ADDRESS
— confirm it prints your address before proceeding.

Verify Address is Yours
Run (replace YOUR_ADDRESS)
/root/pearl/bin/prlctl -u rpcuser -P pearl123 -s localhost:44207 --wallet --notls validateaddress YOUR_ADDRESS
✅ Good: "ismine": true
❌ Bad: "ismine": false → wrong address, generate a new one

04 Start Mining
Start Gateway + vLLM (replace YOUR_MINING_ADDRESS)
🚨 Gateway and vLLM MUST start in the same tmux session! If separate, they won't connect.
Also — delete any stale socket first: rm -f /tmp/pearlgw.sock 🚨 Use FULL PATHS to vllm and pearl-gateway — do NOT rely on venv activate inside tmux. The activate command often fails silently in tmux send-keys, causing "vllm: command not found". Run rm -f /tmp/pearlgw.sock && tmux kill-session -t miner 2>/dev/null; tmux new-session -d -s miner && tmux send-keys -t miner "cd /root/pearl && source ~/.bashrc && /root/pearl/.venv/bin/pearl-gateway start > /tmp/gateway.log 2>&1 & sleep 10 && /root/pearl/.venv/bin/vllm serve pearl-ai/Llama-3.3-70B-Instruct-pearl --host 0.0.0.0 --port 8000 --max-model-len 8192 --gpu-memory-utilization 0.9 --enforce-eager --data-parallel-size 2 --no-enable-prefix-caching --no-enable-chunked-prefill" Enter ℹ️ Gateway logs go to /tmp/gateway.log — this keeps the miner tmux session clean so vLLM output is visible. Check gateway: tail -5 /tmp/gateway.log ⚠️ vLLM takes 10-15 minutes to load the 70B model on first run (~140GB download). Subsequent runs use cached model from /workspace/.hf and load in ~2-3 minutes. Wait for Node to Sync Before vLLM Starts 🚨 If the node is still syncing when vLLM starts, it will crash with "mining_paused: no block template available". The node must be fully synced first. Check sync status: Check sync status cd /root/pearl && ./bin/prlctl -u rpcuser -P rpcpass -s localhost:44107 --notls getblockchaininfo 2>/dev/null | grep -E "blocks|headers" ✅ Synced blocks == headers (same number) ❌ Syncing headers > blocks → wait, re-check every 30 seconds Verify vLLM Loaded Check GPU Memory nvidia-smi --query-gpu=index,memory.used --format=csv,noheader ✅ Good 0, 132964 MiB / 1, 131397 MiB ❌ Bad 0, 4 MiB / 1, 4 MiB → still loading, wait Check Health curl -s http://localhost:8000/health && echo "READY" || echo "NOT READY" ✅ Good READY ❌ Bad NOT READY → still loading, wait and retry Start Request Loop 🚨 Prompts MUST be randomized! Same prompts = KV caching = ZERO MINING! 🚨 Use sleep 1 NOT wait ! 
Using wait causes GPU to drop to 0% between batches (burst/idle pattern). sleep 1 keeps requests continuously overlapping for 90%+ GPU utilization!
🚨 Use LONG prompts (~150+ tokens)! Short prompts produce small m values (m<1024) which fail the should_use_noisy_gemm() threshold check = NO MINING. Long prefill-heavy prompts achieve m=5000-8000+ for maximum hash rate.
⚠️ Send ALL requests to port 8000 ONLY. With DP=2, vLLM exposes a single port (8000). Port 8001 does NOT exist — requests there are dropped silently.
Run
tmux send-keys -t loop "COUNT=0; while true; do COUNT=\$((COUNT+1)); for i in \$(seq 1 128); do curl -s http://localhost:8000/v1/chat/completions -H 'Content-Type: application/json' -d '{\"model\": \"pearl-ai/Llama-3.3-70B-Instruct-pearl\", \"messages\": [{\"role\": \"user\", \"content\": \"Write a detailed comprehensive academic essay about topic \$COUNT variant \$i covering the following aspects in depth: historical background and origins dating back centuries, mathematical foundations and theoretical frameworks, scientific principles and empirical evidence, technological applications and modern implementations, economic implications and market dynamics, social and cultural impacts on society, philosophical interpretations and ethical considerations, future prospects and emerging research directions, comparative analysis with related fields, and practical case studies with real world examples.\"}], \"max_tokens\": 1}' > /dev/null & done; sleep 1; done" Enter
Verify Loop + Mining is Working (wait 2 minutes then run)
Check GPU utilization went up
nvidia-smi --query-gpu=index,utilization.gpu --format=csv,noheader
✅ Good: Both GPUs at 90%+
❌ Bad: 0% → loop not sending requests, check tmux loop session

Confirm NOISY_GEMM is firing
tmux capture-pane -t miner -p -S -5000 | grep "NOISY_GEMM" | tail -3
✅ Mining! NOISY_GEMM_CALLED: m=5000+ n=57344 k=8192 on BOTH workers
❌ Not Mining: No output → use -S -5000 (larger buffer), or apply NOISY_GEMM debug patch from Section 08b
⚠️ NOISY_GEMM output goes to the tmux buffer. Always use -S -5000 (not -S -50) to look back far enough — the buffer fills with other logs quickly.

Verify with Metrics Endpoint
The vLLM metrics endpoint is the most reliable way to confirm everything is working correctly:
Check requests running + cache hits
curl -s http://localhost:8000/metrics | grep -E "num_requests_running|cache_hit" | grep -v "^#\|reason\|external\|mm_cache"
✅ Healthy: num_requests_running engine=0: 30-50, engine=1: 30-50 | cache_hit: 0.0
❌ Problem: num_requests_running: 0.0 → loop not sending | cache_hit > 0 → caching active, prompts not random enough
ℹ️ All vLLM logs including NOISY_GEMM are also written to /tmp/vllm_live.log — useful for debugging when tmux buffer fills up.
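The two numbers that matter from /metrics — total running requests and cache hits — can be extracted with a short parser. A sketch (`summarize_metrics` is an illustrative name; the metric names match what this setup printed and are an assumption on other vLLM versions). The inline sample lets it run anywhere; on the pod, pipe in `curl -s http://localhost:8000/metrics` instead:

```shell
# summarize_metrics — hypothetical helper: sums num_requests_running across
# DP engines and totals any cache-hit counters from Prometheus-style text.
summarize_metrics() {
  awk '
    /^vllm:num_requests_running/ { running += $NF }   # one line per DP engine
    /cache_hit/ && !/^#/         { hits += $NF }      # must stay at 0 for mining
    END { printf "running=%d cache_hits=%d\n", running, hits }
  '
}

# Offline demo mimicking healthy DP=2 output (metric names are an assumption):
printf '%s\n' \
  'vllm:num_requests_running{engine="0"} 42.0' \
  'vllm:num_requests_running{engine="1"} 38.0' \
  'vllm:prefix_cache_hit_total{engine="0"} 0.0' | summarize_metrics
# → running=80 cache_hits=0
```

Healthy per Step 4: running well above 0 (30-50 per engine) and cache_hits exactly 0.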
05 Health Check & Diagnostics
Master Health Check (paste this every time you reconnect)
Full Diagnostic
echo "=== TMUX ===" && tmux ls && echo "=== SOCKETS ===" && ss -x | grep pearlgw | wc -l && echo "=== VLLM ===" && pgrep -f "vllm serve" | wc -l && echo "=== GPU ===" && nvidia-smi --query-gpu=index,utilization.gpu,memory.used --format=csv,noheader && echo "=== MINING ADDRESS ===" && cat /proc/$(pgrep -f "pearl-gateway" | head -1)/environ | tr '\0' '\n' | grep "MINING_ADDRESS" && echo "=== NOISY_GEMM ===" && tmux capture-pane -t miner -p -S -50 | grep "NOISY_GEMM" | tail -3 && echo "=== LOOP ===" && tmux capture-pane -t loop -p -S -3 | tail -2 && echo "=== BLOCKS ===" && tmux capture-pane -t miner -p -S -5000 | grep -i "block accepted\|Block found\|proof"

Expected Healthy Values
Check | Healthy Value | Action if Wrong
TMUX SESSIONS | miner, loop, node | Recreate missing sessions
SOCKETS | 4 | Restart miner — gateway/vLLM disconnected
VLLM | 1 | Restart miner tmux session
GPU utilization | 90-98% both GPUs | Fix loop: use sleep 1 + long prompts
GPU power draw | 600-690W each (near 700W TDP) | Low power = GPU idle = loop not working
GPU memory | ~132GB each | vLLM crashed — restart miner
NOISY_GEMM m value | 5000-8000+ | Use longer prompts in loop
NOISY_GEMM n value | 57344 | Must be 57344 — confirms DP mode working
NOISY_GEMM workers | Both Worker PIDs firing | Only one firing = one GPU idle
LOOP | curl commands visible, many PIDs | Restart loop tmux session
MINING ADDRESS | Your prl1p... address | Kill gateway and restart with correct address
CACHE HITS | 0.0 | Prompts too similar — randomize more

06 Troubleshooting
🔴 Problem: GPU shows 0% utilization persistently (confirmed on RunPod dashboard)
This is NOT a sampling artifact if the RunPod dashboard also shows 0%. Root cause is almost always the request loop — either using wait instead of sleep 1, or short prompts that produce m values below the 1024 threshold.
Diagnose — check requests actually running
curl -s http://localhost:8000/metrics | grep "num_requests_running" | grep -v "^#\|reason" | awk '{print $2}' | tr '\n' ' '
✅ Good: 30-50 requests running per engine
❌ Bad: 0.0 0.0 → loop not running or requests completing too fast
Fix: Kill loop, restart with sleep 1 (not wait) and long prompts (~150+ tokens). See Step 4 loop command.

🔴 Problem: vLLM crashes with "NVCC compilation failed"
DeepGEMM is trying to JIT-compile CUDA kernels and failing. Root cause: VLLM_USE_DEEP_GEMM env var is not set or not reaching the vLLM process.
Verify env var is set
echo $VLLM_USE_DEEP_GEMM
✅ Good: 0
❌ Bad: Empty → add to ~/.bashrc and source it, then kill miner session and recreate

🔴 Problem: vLLM crashes with "mining_paused: no block template available"
The blockchain node is still syncing. vLLM starts but immediately crashes because there is no block to mine.
Check sync status
cd /root/pearl && ./bin/prlctl -u rpcuser -P rpcpass -s localhost:44107 --notls getblockchaininfo 2>/dev/null | grep -E "blocks|headers"
Wait until blocks == headers before starting vLLM. Can take 5-15 minutes on first launch.

🔴 Problem: Gateway crashes with "mining_address: Field required"
The PEARLD_MINING_ADDRESS env var is not reaching the gateway process. This happens when env vars are only exported inline rather than in ~/.bashrc, or when the miner tmux session was created before the vars were set.
Fix
echo $PEARLD_MINING_ADDRESS
If empty: add to ~/.bashrc, source it, then kill and recreate the miner tmux session before restarting.

🔴 Problem: "vllm: command not found" in miner tmux session
source .venv/bin/activate often fails silently inside tmux send-keys, so vllm is not in PATH.
Fix: Always use FULL PATHS: /root/pearl/.venv/bin/vllm and /root/pearl/.venv/bin/pearl-gateway instead of relying on venv activation.

🔴 Problem: Socket count is 0 after restart
Stale socket file from previous run. Gateway creates /tmp/pearlgw.sock and won't overwrite it.
Fix
rm -f /tmp/pearlgw.sock && echo "cleared"
Always delete the socket before restarting. Add to all restart procedures. Then kill the stale processes:
pkill -9 -f "pearl-gateway" && pkill -9 -f "vllm" && pkill -9 -f "EngineCore" && sleep 5
Then restart miner with full command from Step 4.

🔴 Problem: Socket count is 0 (gateway and vLLM started separately)
Gateway and vLLM are not connected. Happens when they start in separate sessions.
Fix
pkill -9 -f "pearl-gateway" && pkill -9 -f "vllm" && pkill -9 -f "EngineCore" && sleep 5
Restart BOTH gateway and vLLM together in the SAME miner session.

🔴 Problem: NOISY_GEMM n is 28672 (not 57344)
TP mode is active instead of DP. Restart with the --data-parallel-size 2 flag.
Verify
pgrep -f "vllm serve" | xargs -I{} cat /proc/{}/cmdline | tr '\0' ' ' | grep "data-parallel"
✅ Good: --data-parallel-size 2 visible
❌ Bad: Not visible → restart with correct flag

🔴 Problem: No blocks found after hours
Check difficulty — if >100,000 expect blocks every 4-8 hours with 2x H200
Verify prompts are randomized — same prompts = KV caching = no mining
Check "Block accepted by node!" in miner logs
Check explorer for your address

🔴 Problem: Wrong mining address in gateway
Check actual address
cat /proc/$(pgrep -f "pearl-gateway" | head -1)/environ | tr '\0' '\n' | grep "MINING_ADDRESS"
If wrong: pkill -f "pearl-gateway" then restart miner with correct PEARLD_MINING_ADDRESS.
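The /proc environ trick used above generalizes to any daemon here (gateway, pearld, vLLM). A small helper sketch (`proc_env` is an illustrative name, Linux-only since it reads /proc):

```shell
# proc_env — hypothetical helper: print one VAR=value line from a running
# process's environment. /proc/<pid>/environ is NUL-separated, hence the tr.
proc_env() {
  pid=$1; var=$2
  tr '\0' '\n' < "/proc/$pid/environ" | grep "^${var}="
}

# Usage sketch against the gateway:
#   proc_env "$(pgrep -f pearl-gateway | head -1)" PEARLD_MINING_ADDRESS
```

This shows the environment the process was actually started with, which is exactly what matters when a tmux session was created before ~/.bashrc was updated.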
🟡 Problem: vLLM keeps dying — add watchdog
Create watchdog (replace YOUR_ADDRESS)
cat > /root/pearl/watchdog.sh << 'EOF'
#!/bin/bash
while true; do
  if ! curl -s --max-time 10 http://localhost:8000/health > /dev/null; then
    echo "$(date) vLLM unhealthy — restarting" >> /tmp/watchdog.log
    pkill -9 -f "pearl-gateway"; pkill -9 -f "vllm"; pkill -9 -f "EngineCore"
    sleep 10
    cd /root/pearl && source .venv/bin/activate && \
    export RAYON_NUM_THREADS=96 PEARLD_RPC_URL=http://localhost:44107 \
      PEARLD_RPC_USER=rpcuser PEARLD_RPC_PASSWORD=rpcpass \
      PEARLD_MINING_ADDRESS=YOUR_ADDRESS HF_HOME=/workspace/.hf \
      VLLM_USE_DEEP_GEMM=0 && \
    pearl-gateway start & sleep 10 && \
    vllm serve pearl-ai/Llama-3.3-70B-Instruct-pearl \
      --host 0.0.0.0 --port 8000 --max-model-len 8192 \
      --gpu-memory-utilization 0.9 --enforce-eager \
      --data-parallel-size 2 --no-enable-prefix-caching \
      --no-enable-chunked-prefill &
    sleep 900
  fi
  sleep 60
done
EOF
chmod +x /root/pearl/watchdog.sh
tmux new-session -d -s watchdog
tmux send-keys -t watchdog "bash /root/pearl/watchdog.sh" Enter

07 Block Verification
Check Logs for Block Activity
Run
tmux capture-pane -t miner -p -S -50000 | grep -i "block accepted\|Block found\|proof\|submit"
✅ Block Found! Block accepted by node!
— submission_service.py Block submission result: {'status': 'accepted'}
❌ No Output: No blocks found yet — check difficulty and wait

Check Explorer
Open in browser
https://explorer.pearlresearch.ai/address/YOUR_MINING_ADDRESS
✅ Good: Shows balance and transaction history with PRL received
❌ Bad: Address Not Found — no confirmed blocks yet (normal if new)

08 Quick Restart Reference
Quick Status Check (paste after reconnecting)
Run
pgrep -f "vllm serve" | wc -l && ss -x | grep pearlgw | wc -l && nvidia-smi --query-gpu=utilization.gpu --format=csv,noheader
✅ Healthy: 1 / 4 / 97% / 97%
❌ Dead: 0 / 0 / 0% / 0% → do full restart below

Full Clean Restart (address already in ~/.bashrc)
Run
pkill -9 -f "pearl-gateway"; pkill -9 -f "vllm"; pkill -9 -f "EngineCore"; pkill -9 -f "Worker"; sleep 3 && rm -f /tmp/pearlgw.sock && tmux kill-session -t miner 2>/dev/null; tmux new-session -d -s miner && tmux send-keys -t miner "cd /root/pearl && source ~/.bashrc && /root/pearl/.venv/bin/pearl-gateway start > /tmp/gateway.log 2>&1 & sleep 10 && /root/pearl/.venv/bin/vllm serve pearl-ai/Llama-3.3-70B-Instruct-pearl --host 0.0.0.0 --port 8000 --max-model-len 8192 --gpu-memory-utilization 0.9 --enforce-eager --data-parallel-size 2 --no-enable-prefix-caching --no-enable-chunked-prefill" Enter

Restart Loop Only
Run
tmux send-keys -t loop C-c
Then send the full loop command from Step 4.

08b Additional Critical Notes
Oyster Wallet Keeps Dying — This is Normal!
ℹ️ Oyster dies frequently. Mining does NOT need oyster running. Oyster is only needed to check balance or generate new addresses. You can ignore oyster dying.
Only run oyster when needed for balance check
/root/pearl/bin/oyster -u rpcuser -P pearl123 --noclienttls --noservertls --pearldusername=rpcuser --pearldpassword=rpcpass > /tmp/oyster.log 2>&1 & sleep 15 && /root/pearl/bin/prlctl -u rpcuser -P pearl123 -s localhost:44207 --wallet --notls getbalance

Model Download (First Run Only)
⚠️ First time vLLM runs it downloads the 70B model (~140GB). This takes 15-30 extra minutes. Subsequent runs use cached model from /workspace/.hf
Watch download progress
tmux capture-pane -t miner -p -S -20 | grep -i "download\|Downloading\|fetching"

Verify Mining is Actually Happening (Debug Patch)
Add a print statement to confirm NOISY_GEMM is being called:
Apply patch
python3 -c "
with open('/root/pearl/miner/vllm-miner/src/vllm_miner/config.py', 'r') as f:
    content = f.read()
old = '    return (m >= min_m) and (n >= min_n) and (k >= min_k)'
new = '''    result = (m >= min_m) and (n >= min_n) and (k >= min_k)
    if result:
        print(f\"NOISY_GEMM_CALLED: m={m} n={n} k={k}\", flush=True)
    return result'''
content = content.replace(old, new)
with open('/root/pearl/miner/vllm-miner/src/vllm_miner/config.py', 'w') as f:
    f.write(content)
print('Patched!')
"
ℹ️ After applying patch, restart the miner. Then check:
tmux capture-pane -t miner -p -S -50 | grep "NOISY_GEMM" | tail -3
✅ Good: NOISY_GEMM_CALLED: m=5000+ n=57344 k=8192
❌ Bad: No output → mining not happening

OCR/Screenshot Address Warning
🚨 If you copy your mining address from a screenshot using OCR (Gemini, Google Lens, etc.) — NEVER trust it! Characters like 5/s, 0/O, m/n, l/1 are commonly confused. Always verify the address manually character by character or use the validateaddress command.
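The character-by-character check can be automated by diffing the address you are about to use against the one the wallet printed. A pure-shell sketch (`addr_diff` is an illustrative helper, not a guide command; the addresses in the usage line are placeholders):

```shell
# addr_diff — hypothetical helper: report the first position where two
# addresses differ, or OK if identical. Catches typical OCR swaps
# (0/O, l/1, m/n) that a visual scan can miss.
addr_diff() {
  a=$1; b=$2
  if [ "$a" = "$b" ]; then echo "OK: addresses identical"; return 0; fi
  i=1
  while [ "$i" -le "${#a}" ]; do
    ca=$(printf '%s' "$a" | cut -c"$i")
    cb=$(printf '%s' "$b" | cut -c"$i")
    if [ "$ca" != "$cb" ]; then
      echo "MISMATCH at position $i: '$ca' vs '$cb'"
      return 1
    fi
    i=$((i + 1))
  done
  echo "MISMATCH: lengths differ"
  return 1
}

# Usage sketch: addr_diff <address-from-wallet> <address-you-typed>
```

Run it with the getnewaddress output as the first argument and whatever you pasted into ~/.bashrc as the second; anything other than "OK" means stop and re-copy.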
Difficulty Context
Difficulty | Expected Block Time (2x H200) | Status
~29,000 | ~1 block/hour | Early network (April 27, 2026)
~68,000 | ~2 hours/block | Day 3
~115,000 | ~4 hours/block | Day 4
>150,000 | 6-8+ hours/block | Highly competitive
Check current difficulty
cd /root/pearl && ./bin/prlctl -u rpcuser -P rpcpass -s localhost:44107 --notls getblockchaininfo 2>/dev/null | grep -E "blocks|difficulty"

Separate vLLM tmux Session Warning
🚨 If you have a tmux session called "vllm" from a previous setup — it can cause confusion. Old Worker processes may still show NOISY_GEMM but be disconnected from the gateway. Always check sockets (must be 4) to confirm connection, not just NOISY_GEMM output.
Kill stale vllm session if exists
tmux kill-session -t vllm 2>/dev/null; echo "done"

Gateway Debug Mode
Add the --debug flag to the gateway for more verbose logs including block submissions:
In the miner startup command, replace
pearl-gateway start
With
pearl-gateway --debug start

Full Loop Command (for Step 8 restarts)
Restart loop
tmux send-keys -t loop C-c && sleep 2 && pkill -f "curl.*localhost:8000" && sleep 2 && tmux send-keys -t loop "COUNT=0; while true; do COUNT=\$((COUNT+1)); for i in \$(seq 1 128); do curl -s http://localhost:8000/v1/chat/completions -H 'Content-Type: application/json' -d '{\"model\": \"pearl-ai/Llama-3.3-70B-Instruct-pearl\", \"messages\": [{\"role\": \"user\", \"content\": \"Write a detailed comprehensive academic essay about topic \$COUNT variant \$i covering the following aspects in depth: historical background and origins dating back centuries, mathematical foundations and theoretical frameworks, scientific principles and empirical evidence, technological applications and modern implementations, economic implications and market dynamics, social and cultural impacts on society, philosophical interpretations and ethical considerations, future prospects and emerging research directions, comparative analysis with related fields, and practical case studies with real world examples.\"}], \"max_tokens\": 1}' > /dev/null & done; sleep 1; done" Enter

Node Peer Count Check
Check peers (need 8+)
cd /root/pearl && ./bin/prlctl -u rpcuser -P rpcpass -s localhost:44107 --notls getpeerinfo 2>/dev/null | grep "addr" | wc -l
✅ Good: 8+ peers
❌ Bad: 0-2 peers → node not synced yet, wait longer

09 Key Lessons Learned
Critical Mistake | Consequence | Fix
Using `wait` in request loop | GPU goes to 0% between batches — burst/idle pattern, very inefficient | Use `sleep 1` instead — keeps requests continuously overlapping
Sending requests to port 8001 | DP=2 only exposes port 8000 — port 8001 requests are dropped | Always send all requests to port 8000 only
Using --tensor-parallel-size 2 | Reduces n to 28672, less mining efficiency | Use --data-parallel-size 2
Prefix caching enabled | Same prompts cached — NO GEMM = NO MINING | Always use --no-enable-prefix-caching
Gateway in separate session from vLLM | Socket not connected, env vars not inherited | Start both in same tmux miner session
Sending same prompt repeatedly | KV cache kicks in, GEMM skipped entirely | Randomize with COUNT and i variables
config.yaml thresholds at 1 | Overhead without benefit for our matrix sizes | Keep at 1024 (default)
Not verifying mining address | Blocks could go to wrong wallet | Always validateaddress + check /proc environ
MINER_DEBUG env vars | Don't reach EngineCore subprocess | Use PEARL_LOG_LEVEL=DEBUG instead

✅ Proof of Working Setup: Confirmed mining with NOISY_GEMM_CALLED: m=8174, n=57344 on both workers. GPU 0: 96%, 690W. GPU 1: 97%, 689W.
One confirmed block on explorer with 2884 PRL (May 1, 2026). Second miner confirmed operational May 2, 2026. Setup: RunPod 2x H200 SXM, CUDA 12.8, DP=2, 128 concurrent long-prompt requests, sleep 1 loop. 10 Important Gotchas & Edge Cases Password Confusion — Two Different Passwords! Service Username Password Port pearld node (prlctl) rpcuser rpcpass 44107 oyster wallet (prlctl --wallet) rpcuser pearl123 44207 🚨 Using wrong password is a common mistake! Node uses "rpcpass", wallet uses "pearl123" Normal Warning Messages (Not Errors — Ignore These) These are NORMAL — do not worry about them Error creating a default config file: open /root/.oyster/oyster.conf: no such file or directory Error creating a default config file: open /root/.pearld/pearld.conf: no such file or directory Warning: Running on mainnet with --noclienttls is not recommended Warning: Running on mainnet with --noservertls is not recommended Block Accepted ≠ Block Confirmed ⚠️ Seeing "Block accepted by node!" in logs does NOT guarantee the block makes it to the main chain. It can be orphaned if another miner found a block at the same height faster. The explorer is the ONLY ground truth for confirmed blocks and PRL balance. validateaddress ismine: true is NOT 100% Reliable ⚠️ We discovered that validateaddress can return ismine: true even with a slightly different address (possible OCR corruption). Always verify the address character by character manually — don't rely solely on ismine: true. sleep 10 Between Gateway and vLLM The startup command uses pearl-gateway start & sleep 10 && vllm serve ... The & runs gateway in background, sleep 10 gives it time to create the socket, then vLLM starts and connects to it. If vLLM starts before the socket exists, they won't connect. HuggingFace Token The pearl-ai model downloaded fine without an HF token in our setup. 
If you get auth errors: Set HF token if needed export HF_TOKEN=your_token_here vLLM Process Name vs api_server ℹ️ Some old diagnostic scripts use pgrep -f "api_server" to detect vLLM. This returns 0 even when vLLM IS running! Always use pgrep -f "vllm serve" instead. tmux Buffer Limitation By default tmux only stores a limited scroll buffer. Block activity messages from hours ago may not appear in tmux capture-pane . The explo
Pearl Miner Setup Guide
Complete installation, configuration, health monitoring & troubleshooting
RunPod 2x H200 · CUDA 12.8+ · DP=2 Mode · Llama 70B
🚨 This guide was built and tested exclusively on RunPod 2x H200 SXM. Every command, expected output, VRAM value, power draw figure, and health check threshold in this guide is calibrated for that specific setup. If you use different hardware, commands will still work but expected output values will differ.
What This Guide Is Designed For
Component
This Guide's Setup
Notes
GPU
2x NVIDIA H200 SXM
Confirmed working. All expected values in this guide are for H200.
VRAM per GPU
143,771 MiB (~141GB)
After model load: ~132,964 MiB GPU0 / ~131,399 MiB GPU1
CUDA Version
12.8
Minimum: 12.4. Tested on 12.8.
Driver
570.211.01
Any 520+ should work
GPU Count
Exactly 2
Guide uses --data-parallel-size 2
System RAM
64GB+
Needed for build + model loading
Disk
300GB+ persistent
Model ~140GB + builds ~50GB + chain ~5GB
OS
Ubuntu 22.04
RunPod default image
Provider
RunPod
See Section 00a for other providers
GPU power at full mining
~690W each (near 700W TDP)
This is the health indicator — if power is 120W, mining is not happening
Other Hardware — Community Reports (Not Tested by This Guide)
⚠️ The following is based on Pearl Discord community reports — NOT verified by this guide. If you use different hardware, expected output values will differ from what this guide shows. Proceed with caution and adapt health check thresholds accordingly.
GPU
Community Status
Notes
H200 SXM ×2
✅ This guide — confirmed
Reference setup for this guide
H100 SXM ×2
✅ Community confirmed
Works. 80GB VRAM each. Adjust VRAM expectations in health checks.
H100 NVL ×1 + H200 ×1
⚠️ Community reported
Mixed setup. Some users got blocks.
Single H200
⚠️ Possible
Use --data-parallel-size 1, 64 requests. Lower hashrate.
A100 ×2
❌ Not recommended
Ampere architecture — Pearl kernel targets Hopper. May not compile.
RTX 4090 ×2
❌ Insufficient VRAM
24GB each = 48GB total. Not enough for 70B model.
Step 0 — Verify Your Pod Before Starting
Run these immediately after SSH-ing in. If any check fails, reprovision before continuing.
Check 1 — GPU model, CUDA, VRAM
nvidia-smi
✅ Good (H200)
2x H200, CUDA 12.8, Driver 570+, 143771 MiB each, 0 MiB used
❌ Bad
Wrong GPU, CUDA <12.4, only 1 GPU, or VRAM already used → reprovision
Check 2 — Disk space (need 300GB+ free)
df -h | sort -rh | head -8
✅ Good
300GB+ available on at least one partition
❌ Bad
Less than 300GB → expand disk before proceeding
Check 3 — RAM
free -h
✅ Good
64GB+ total RAM
❌ Bad
Under 64GB → may OOM during build or model load
Check 4 — OS
lsb_release -a 2>/dev/null || cat /etc/os-release | head -5
✅ Good
Ubuntu 22.04 LTS (Jammy)
⚠️ Untested
Ubuntu 20.04 — may work but not verified by this guide
Step 0b — Expose Port 44108 Before Deploying (Do This First!)
⚠️ Pearl's P2P port is 44108. If you expose it before deploying your pod, other nodes on the network can connect TO you (inbound connections), giving you more peers and faster block propagation. If you don't expose it, you'll be limited to ~16 outbound-only peers — which still works fine for mining but is not optimal.
ℹ️ On RunPod: before clicking Deploy, find the TCP Port Exposures field and add port 44108. This takes 5 seconds and costs nothing. If your pod is already running, it requires a full restart to add — only worth it at natural restart time.
Scenario
Peers
Impact
Port 44108 NOT exposed (RunPod default)
~16 outbound only
Works fine. Block propagation slightly slower.
Port 44108 exposed
Up to 200 inbound+outbound
Better connectivity, faster block propagation.
Discord reports of 200+ peers
200+
These users have inbound port exposed AND are on providers with open firewall.
✅ If you already deployed without exposing port 44108 — don't restart just for this. Wait until next natural restart and add it then. 16 peers does not meaningfully affect your mining rewards.
This guide was built and tested on RunPod. The core setup is identical across providers — only storage paths and a few installation details differ. Use this table to adapt the guide for your provider.
Provider
HF_HOME path
UV Cache path
Persistent storage
Notes
RunPod ✅ Tested
/workspace/.hf
/workspace/.uv-cache
/workspace
Deadsnakes PPA blocked — use UV for Python 3.12. Ubuntu 22.04.
Vast.ai
/root/.cache/huggingface or /workspace/.hf
/root/.cache/uv
/workspace (if attached)
Use Custom Template. Ubuntu 22.04 works. apt python3.12 may work via deadsnakes.
Lambda Labs
/home/ubuntu/.cache/huggingface
/home/ubuntu/.cache/uv
/home/ubuntu
Ubuntu 22.04. Python 3.12 via deadsnakes should work. Run as ubuntu not root.
CoreWeave
/mnt/data/.hf
/mnt/data/.uv-cache
/mnt/data
Kubernetes-based. Persistent volume must be mounted manually.
Paperspace
/notebooks/.hf
/notebooks/.uv-cache
/notebooks
Ubuntu 20.04/22.04. Python 3.12 via deadsnakes.
Any provider
Any path with 200GB+ free space
Any writable path
Check df -h for largest partition
Find largest partition: df -h | sort -rh | head -5
How to Adapt This Guide for Any Provider
Replace every occurrence of /workspace/.hf with your provider's persistent storage path, and /workspace/.uv-cache with the UV cache path. The two places these appear are:
ℹ️ UV-based Python install (Step 1) always works regardless of provider — it downloads a standalone CPython binary. Use it as the universal fallback if apt fails.
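The path substitution described above can be scripted. A hedged sketch, using sed to adapt a file of commands from RunPod paths to Lambda Labs paths (taken from the provider table); GUIDE_FILE is a made-up demo file, and you would point this at wherever you keep your adapted commands:

```shell
# Demo file with the guide's RunPod defaults (GUIDE_FILE is a placeholder)
GUIDE_FILE=$(mktemp)
printf 'export HF_HOME=/workspace/.hf\nexport UV_CACHE_DIR=/workspace/.uv-cache\n' > "$GUIDE_FILE"

# Swap in the Lambda Labs paths from the provider table above
sed -i 's|/workspace/.hf|/home/ubuntu/.cache/huggingface|g; s|/workspace/.uv-cache|/home/ubuntu/.cache/uv|g' "$GUIDE_FILE"
cat "$GUIDE_FILE"
```

Substitute your own provider's two paths in the sed expression; the `|` delimiter avoids escaping the slashes.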
Pre-flight Check (run on any fresh pod)
Verify GPU + CUDA before starting
nvidia-smi && echo "CUDA OK" || echo "NO GPU DETECTED"
✅ Good
Shows H100/H200, CUDA 12.x, Driver 520+
❌ Bad
No GPU detected → wrong instance type, reprovision
Find largest storage partition (for HF_HOME)
df -h | sort -rh | head -5
Pick the partition with 300GB+ free space for HF_HOME. The 70B model needs ~140GB.
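The 300GB check can be automated. A hedged sketch: TARGET is an assumption — point it at whichever partition df showed as largest before setting HF_HOME there.

```shell
# Check that the candidate HF_HOME partition has 300GB+ free
# (TARGET defaults to / for demonstration -- set it to your partition)
TARGET="${TARGET:-/}"
AVAIL_GB=$(df -BG --output=avail "$TARGET" | tail -1 | tr -dc '0-9')
if [ "${AVAIL_GB:-0}" -ge 300 ]; then
  echo "OK: ${AVAIL_GB}G free on $TARGET - enough for the ~140GB model plus builds"
else
  echo "WARN: only ${AVAIL_GB}G free on $TARGET (need 300G+)"
fi
```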
00 Quick Reference
Setting
Value
Why
Parallelism
--data-parallel-size 2
NOT tensor parallel — TP reduces m dimension
Prefix Caching
--no-enable-prefix-caching
MUST disable — caching = no GEMM = no mining
Chunked Prefill
--no-enable-chunked-prefill
Must disable for correct mining behavior
GPU Memory
--gpu-memory-utilization 0.9
Leave 10% headroom
Model Length
--max-model-len 8192
Fits in 80GB VRAM
Execution
--enforce-eager
Required for Pearl kernel
ZK Speed
export RAYON_NUM_THREADS=96
Faster proof generation
Deep GEMM
export VLLM_USE_DEEP_GEMM=0
Disable — conflicts with Pearl GEMM
Requests
128 concurrent long-prompt requests
Long prompts (~150+ tokens) needed for m≥5000
Loop pattern
sleep 1 (NOT wait)
wait causes GPU to idle between batches → 0% util
Request port
port 8000 ONLY
DP=2 exposes single port — port 8001 drops silently
Socket Count
4 ESTAB connections
2 per DP engine = 4 total when healthy
n value in NOISY_GEMM
57344
Confirms DP mode (TP gives 28672)
Node RPC
port 44107 (pearld)
pearl daemon
Wallet RPC
port 44207 (oyster)
wallet daemon
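The settings above can be assembled into a single launch command. A hedged sketch that prints the command rather than executing it, so you can inspect it first; the venv path and model name follow the conventions used elsewhere in this guide, and any flags your setup needs beyond this table are not included:

```shell
# Assemble the vLLM flags from the Quick Reference table (sketch only --
# printed, not executed; verify against your actual startup command)
VLLM=/root/pearl/.venv/bin/vllm
MODEL=pearl-ai/Llama-3.3-70B-Instruct-pearl
CMD="$VLLM serve $MODEL --data-parallel-size 2 --no-enable-prefix-caching --no-enable-chunked-prefill --gpu-memory-utilization 0.9 --max-model-len 8192 --enforce-eager"
echo "$CMD"
```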
01 System Dependencies
ℹ️ Run each block separately. Verify output before moving to next step.
⚠️ On RunPod, the deadsnakes PPA is blocked — apt-get install python3.12 will fail with "Unable to locate package". Use UV to install Python 3.12 instead (UV is already installed above).
Installed 265 packages — vllm==0.20.0+cu129 in list
❌ Bad
CUDA build failed → check nvidia-smi shows H100/H200
Verify venv exists
ls /root/pearl/.venv/bin/vllm && ls /root/pearl/.venv/bin/pearl-gateway
✅ Good
Both files listed — build successful
❌ Bad
No such file → miner build failed, re-run task build:miner
02b Set Environment Variables (Persistent)
🚨 CRITICAL: Do this BEFORE starting the miner. Env vars must be in ~/.bashrc so they survive across shell sessions and tmux windows. If you only export them inline, the gateway will fail with "mining_address: Field required" because the vars don't reach the tmux session.
Add all required env vars to ~/.bashrc now (you will update PEARLD_MINING_ADDRESS after Step 3):
Empty output → vars not set, re-run the cat command
⚠️ After generating your mining address in Step 3, update ~/.bashrc: replace PLACEHOLDER with your real address, then run source ~/.bashrc before starting the miner in Step 4.
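A hedged sketch of the env block, covering the variables this guide names elsewhere (RAYON_NUM_THREADS, VLLM_USE_DEEP_GEMM, HF_HOME, PEARLD_MINING_ADDRESS); the guide's original cat command may include more, so treat this as a minimum, not a definitive list:

```shell
# Append the guide's env vars to ~/.bashrc; PLACEHOLDER is replaced
# with the real mining address after Step 3 (see the sed command there)
cat >> "$HOME/.bashrc" << 'EOF'
export PEARLD_MINING_ADDRESS=PLACEHOLDER
export RAYON_NUM_THREADS=96
export VLLM_USE_DEEP_GEMM=0
export HF_HOME=/workspace/.hf
EOF
grep PEARLD_MINING_ADDRESS "$HOME/.bashrc"
```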
03 Wallet & Node Setup
Create Wallet
Run
cd /root/pearl && ./bin/oyster --create
🚨 CRITICAL: Write down the 12-word seed phrase! This is your ONLY backup. If you lose it you lose all mined PRL forever.
When prompted, answer as follows:
Prompt
Answer
Do you want to add a passphrase?
No (just press Enter) — or set one you'll remember
connection refused → oyster not ready, wait longer and retry
⚠️ SAVE THIS ADDRESS! You'll need it in every restart command. Also verify it with validateaddress below.
🚨 Now update your ~/.bashrc with the real address: sed -i 's/PEARLD_MINING_ADDRESS=PLACEHOLDER/PEARLD_MINING_ADDRESS=YOUR_ACTUAL_ADDRESS/' ~/.bashrc && source ~/.bashrc && echo $PEARLD_MINING_ADDRESS — confirm it prints your address before proceeding.
🚨 Gateway and vLLM MUST start in the same tmux session! If separate, they won't connect. Also — delete any stale socket first: rm -f /tmp/pearlgw.sock
🚨 Use FULL PATHS to vllm and pearl-gateway — do NOT rely on venv activate inside tmux. The activate command often fails silently in tmux send-keys, causing "vllm: command not found".
ℹ️ Gateway logs go to /tmp/gateway.log — this keeps the miner tmux session clean so vLLM output is visible. Check gateway: tail -5 /tmp/gateway.log
⚠️ vLLM takes 10-15 minutes to load the 70B model on first run (~140GB download). Subsequent runs use cached model from /workspace/.hf and load in ~2-3 minutes.
Wait for Node to Sync Before vLLM Starts
🚨 If the node is still syncing when vLLM starts, it will crash with "mining_paused: no block template available". The node must be fully synced first. Check sync status:
🚨 Prompts MUST be randomized! Same prompts = KV caching = ZERO MINING!
🚨 Use sleep 1 NOT wait! Using wait causes GPU to drop to 0% between batches (burst/idle pattern). sleep 1 keeps requests continuously overlapping for 90%+ GPU utilization!
🚨 Use LONG prompts (~150+ tokens)! Short prompts produce small m values (m<1024) which fail the should_use_noisy_gemm() threshold check = NO MINING. Long prefill-heavy prompts achieve m=5000-8000+ for maximum hash rate.
⚠️ Send ALL requests to port 8000 ONLY. With DP=2, vLLM exposes a single port (8000). Port 8001 does NOT exist — requests there are dropped silently.
Run
tmux send-keys -t loop "COUNT=0; while true; do COUNT=\$((COUNT+1)); for i in 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128; do curl -s http://localhost:8000/v1/chat/completions -H 'Content-Type: application/json' -d '{\"model\": \"pearl-ai/Llama-3.3-70B-Instruct-pearl\", \"messages\": [{\"role\": \"user\", \"content\": \"Write a detailed comprehensive academic essay about topic \$COUNT variant \$i covering the following aspects in depth: historical background and origins dating back centuries, mathematical foundations and theoretical frameworks, scientific principles and empirical evidence, technological applications and modern implementations, economic implications and market dynamics, social and cultural impacts on society, philosophical interpretations and ethical considerations, future prospects and emerging research directions, comparative analysis with related fields, and practical case studies with real world examples.\"}], \"max_tokens\": 1}' > /dev/null & done; sleep 1; done" Enter
Verify Loop + Mining is Working (wait 2 minutes then run)
This is NOT a sampling artifact if RunPod dashboard also shows 0%. Root cause is almost always the request loop — either using wait instead of sleep 1, or short prompts that produce m values below the 1024 threshold.
Wait until blocks == headers before starting vLLM. Can take 5-15 minutes on first launch.
🔴 Problem: Gateway crashes with "mining_address: Field required"
The PEARLD_MINING_ADDRESS env var is not reaching the gateway process. This happens when env vars are only exported inline rather than in ~/.bashrc, or when the miner tmux session was created before the vars were set.
Fix
echo $PEARLD_MINING_ADDRESS
If empty: add to ~/.bashrc, source it, then kill and recreate the miner tmux session before restarting.
🔴 Problem: "vllm: command not found" in miner tmux session
source .venv/bin/activate often fails silently inside tmux send-keys, so vllm is not in PATH.
Fix: Always use FULL PATHS: /root/pearl/.venv/bin/vllm and /root/pearl/.venv/bin/pearl-gateway instead of relying on venv activation.
🔴 Problem: Socket count is 0 after restart
Stale socket file from previous run. Gateway creates /tmp/pearlgw.sock and won't overwrite it.
Fix
rm -f /tmp/pearlgw.sock && echo "cleared"
Always delete the socket before restarting. Add to all restart procedures.
The request loop can stall silently — curl jobs drop to 0, GPU goes to 0%, but vLLM stays running. This watchdog checks every 60 seconds and auto-restarts the loop if fewer than 10 curl jobs are running. Set this up on every miner.
Create loop watchdog
cat > /root/loop_watchdog.sh << 'EOF'
#!/bin/bash
while true; do
CURL_COUNT=$(pgrep -f "curl.*localhost:8000" | wc -l)
if [ "$CURL_COUNT" -lt 10 ]; then
echo "$(date) - Loop stalled (${CURL_COUNT} curl jobs), restarting..." >> /tmp/loop_watchdog.log
tmux send-keys -t loop C-c 2>/dev/null
sleep 2
pkill -f "curl.*localhost:8000" 2>/dev/null
sleep 2
tmux send-keys -t loop "COUNT=0; while true; do COUNT=\$((COUNT+1)); for i in 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128; do curl -s http://localhost:8000/v1/chat/completions -H 'Content-Type: application/json' -d '{\"model\": \"pearl-ai/Llama-3.3-70B-Instruct-pearl\", \"messages\": [{\"role\": \"user\", \"content\": \"Write a detailed comprehensive academic essay about topic \$COUNT variant \$i covering the following aspects in depth: historical background and origins dating back centuries, mathematical foundations and theoretical frameworks, scientific principles and empirical evidence, technological applications and modern implementations, economic implications and market dynamics, social and cultural impacts on society, philosophical interpretations and ethical considerations, future prospects and emerging research directions, comparative analysis with related fields, and practical case studies with real world examples.\"}], \"max_tokens\": 1}' > /dev/null & done; sleep 1; done" Enter
echo "$(date) - Loop restarted" >> /tmp/loop_watchdog.log
fi
sleep 60
done
EOF
chmod +x /root/loop_watchdog.sh && tmux new-session -d -s watchdog && tmux send-keys -t watchdog "/root/loop_watchdog.sh" Enter && echo "✓ Loop watchdog running" && tmux ls | grep watchdog
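A hedged one-shot status summary that combines the checks this guide uses (tmux sessions, request-loop curl job count, vLLM process presence); run it any time to sanity-check a miner at a glance:

```shell
# Miner status at a glance: sessions, loop jobs, vLLM process
echo "tmux sessions:"; tmux ls 2>/dev/null || echo "  (none running)"
CURL_JOBS=$(pgrep -f 'curl.*localhost:8000' | wc -l)
echo "curl jobs: ${CURL_JOBS} (healthy miner shows dozens; 0 means the loop stalled)"
pgrep -f 'vllm serve' > /dev/null && echo "vllm: running" || echo "vllm: NOT running"
```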
ℹ️ Oyster dies frequently. Mining does NOT need oyster running. Oyster is only needed to check balance or generate new addresses. You can ignore oyster dying.
Add a print statement to confirm NOISY_GEMM is being called:
Apply patch
python3 -c "
# Note: the indentation of 'old' must match config.py exactly
# (a 4-space-indented function body is assumed here -- adjust if
# the script reports the pattern was not found)
path = '/root/pearl/miner/vllm-miner/src/vllm_miner/config.py'
with open(path) as f:
    content = f.read()
old = '    return (m >= min_m) and (n >= min_n) and (k >= min_k)'
new = '''    result = (m >= min_m) and (n >= min_n) and (k >= min_k)
    if result:
        print(f\"NOISY_GEMM_CALLED: m={m} n={n} k={k}\", flush=True)
    return result'''
if old in content:
    with open(path, 'w') as f:
        f.write(content.replace(old, new))
    print('Patched!')
else:
    print('Pattern not found - check indentation in config.py')
"
ℹ️ After applying patch, restart the miner. Then check: tmux capture-pane -t miner -p -S -50 | grep "NOISY_GEMM" | tail -3
✅ Good: NOISY_GEMM_CALLED: m=5000+ n=57344 k=8192
❌ Bad: No output → mining not happening
OCR/Screenshot Address Warning
🚨 If you copy your mining address from a screenshot using OCR (Gemini, Google Lens, etc.) — NEVER trust it! Characters like 5/s, 0/O, m/n, l/1 are commonly confused. Always verify the address manually character by character or use the validateaddress command.
09 Key Lessons Learned
Critical Mistake
Consequence
Fix
Using `wait` in request loop
GPU goes to 0% between batches — burst/idle pattern, very inefficient
Use `sleep 1` instead — keeps requests continuously overlapping
Sending requests to port 8001
DP=2 only exposes port 8000 — port 8001 requests are dropped
Always send all requests to port 8000 only
Using --tensor-parallel-size 2
Reduces n to 28672, less mining efficiency
Use --data-parallel-size 2
Prefix caching enabled
Same prompts cached — NO GEMM = NO MINING
Always use --no-enable-prefix-caching
Gateway in separate session from vLLM
Socket not connected, env vars not inherited
Start both in same tmux miner session
Sending same prompt repeatedly
KV cache kicks in, GEMM skipped entirely
Randomize with COUNT and i variables
config.yaml thresholds at 1
Overhead without benefit for our matrix sizes
Keep at 1024 (default)
Not verifying mining address
Blocks could go to wrong wallet
Always validateaddress + check /proc environ
MINER_DEBUG env vars
Don't reach EngineCore subprocess
Use PEARL_LOG_LEVEL=DEBUG instead
✅ Proof of Working Setup: Confirmed mining with NOISY_GEMM_CALLED: m=8174, n=57344 on both workers. GPU 0: 96%, 690W. GPU 1: 97%, 689W. One confirmed block on explorer with 2884 PRL (May 1, 2026). Second miner confirmed operational May 2, 2026. Setup: RunPod 2x H200 SXM, CUDA 12.8, DP=2, 128 concurrent long-prompt requests, sleep 1 loop.
10 Important Gotchas & Edge Cases
Password Confusion — Two Different Passwords!
Service
Username
Password
Port
pearld node (prlctl)
rpcuser
rpcpass
44107
oyster wallet (prlctl --wallet)
rpcuser
pearl123
44207
🚨 Using wrong password is a common mistake! Node uses "rpcpass", wallet uses "pearl123"
Normal Warning Messages (Not Errors — Ignore These)
These are NORMAL — do not worry about them
Error creating a default config file: open /root/.oyster/oyster.conf: no such file or directory
Error creating a default config file: open /root/.pearld/pearld.conf: no such file or directory
Warning: Running on mainnet with --noclienttls is not recommended
Warning: Running on mainnet with --noservertls is not recommended
Block Accepted ≠ Block Confirmed
⚠️ Seeing "Block accepted by node!" in logs does NOT guarantee the block makes it to the main chain. It can be orphaned if another miner found a block at the same height faster. The explorer is the ONLY ground truth for confirmed blocks and PRL balance.
validateaddress ismine: true is NOT 100% Reliable
⚠️ We discovered that validateaddress can return ismine: true even with a slightly different address (possible OCR corruption). Always verify the address character by character manually — don't rely solely on ismine: true.
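The character-by-character check recommended above can be scripted. A hedged helper that compares the address you intend to use against the one the wallet printed and reports the first differing position; the addresses below are made-up placeholders, not real Pearl addresses:

```shell
# Compare two address strings character by character; report first mismatch
addr_diff() {
  a="$1"; b="$2"
  [ "$a" = "$b" ] && { echo "MATCH"; return; }
  i=1
  while [ "$i" -le "${#a}" ]; do
    ca=$(printf '%s' "$a" | cut -c"$i")
    cb=$(printf '%s' "$b" | cut -c"$i")
    if [ "$ca" != "$cb" ]; then
      echo "MISMATCH at position $i: '$ca' vs '$cb'"
      return
    fi
    i=$((i+1))
  done
  echo "MISMATCH: lengths differ"
}
addr_diff "prl1abc0" "prl1abcO"   # catches the classic 0/O OCR confusion
```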
sleep 10 Between Gateway and vLLM
The startup command uses pearl-gateway start & sleep 10 && vllm serve ...
The & runs the gateway in the background, sleep 10 gives it time to create the socket, then vLLM starts and connects to it. If vLLM starts before the socket exists, they won't connect.
HuggingFace Token
The pearl-ai model downloaded fine without an HF token in our setup. If you get auth errors:
Set HF token if needed
export HF_TOKEN=your_token_here
vLLM Process Name vs api_server
ℹ️ Some old diagnostic scripts use pgrep -f "api_server" to detect vLLM. This matches nothing (count 0) even when vLLM IS running! Always use pgrep -f "vllm serve" instead.
tmux Buffer Limitation
By default tmux only stores a limited scroll buffer. Block activity messages from hours ago may not appear in tmux capture-pane. The explorer is more reliable for historical block confirmation.
Wallet Address from Same Seed
Running getnewaddress multiple times generates different addresses — all from the same seed phrase, all recoverable. But only one address is set as the mining address at a time. The second address generated (prl1p8jt0...) is a valid backup address from the same wallet.
Loop Stalls Silently — GPU Goes to 0%
The request loop can stall without any error message. Curl jobs drop to 0, GPU goes to 0%, but vLLM stays running and appears healthy. This happens because bash accumulates too many background jobs over time.
Signs: GPU 0% on RunPod dashboard, power draw drops to ~120W, NOISY_GEMM stops firing in tmux buffer, curl job count is 0.
Fix: Kill loop, restart it. Always set up the loop watchdog (Section 08b) to handle this automatically — it checks every 60 seconds and restarts if curl jobs drop below 10.
Unbalanced DP Engines — One Worker Firing Less
Sometimes requests distribute unevenly between the two DP engines — one engine gets 35 requests, the other gets 0-7. This shows as low m values on one Worker and lower GPU utilization. Root cause: loop stalled and restarted unevenly.
Fix: Restart the loop cleanly. Kill all curl jobs first, verify 0 remaining, then restart. The engines rebalance within the next batch.
ℹ️ Check balance with: curl -s http://localhost:8000/metrics | grep "num_requests_running" | grep -v "^#\|reason" | awk '{print $2}' | tr '\n' ' ' — both engines should show similar numbers.
Only 16 Peers — Discord Reports 200+
On RunPod (and most cloud providers), inbound connections are blocked by default. Your node can connect OUT to other peers but other nodes cannot connect IN to you. This limits you to ~8-16 outbound peers regardless of your --maxpeers setting.
The fix is exposing port 44108 before deploying your pod (see Step 0b). If already deployed, wait until next natural restart.
ℹ️ 16 peers is sufficient for mining. Block propagation works fine outbound-only. The difference between 16 and 200 peers is milliseconds of propagation time — negligible compared to the time between blocks.
⚠️ If you try to get a mining address before the node syncs, the address may be invalid. Wait at least 30-60 seconds after starting pearld and verify getblockcount returns a number before running getnewaddress.
Node Must Be Synced Before Starting vLLM
🚨 vLLM will crash immediately on startup with "mining_paused: no block template available" if the blockchain node is still syncing. Always verify blocks == headers before starting the miner. The node typically takes 5-15 minutes to sync on first launch.
Env Vars Must Be in ~/.bashrc — Not Just Exported Inline
Exporting vars inline in the tmux send-keys command is unreliable — the vars often don't reach subprocesses. Always add them to ~/.bashrc and use source ~/.bashrc in the miner startup. The gateway will fail with "mining_address: Field required" if PEARLD_MINING_ADDRESS is not in the environment.
Always Use Full Paths for vllm and pearl-gateway
Using source .venv/bin/activate inside tmux send-keys frequently fails silently, leaving vllm not in PATH and producing "vllm: command not found". Always use /root/pearl/.venv/bin/vllm and /root/pearl/.venv/bin/pearl-gateway explicitly.
Delete Stale Socket Before Every Restart
The gateway socket at /tmp/pearlgw.sock persists after the gateway dies. On restart, if the old socket file exists, the new gateway may fail or vLLM may connect to a dead socket. Always run rm -f /tmp/pearlgw.sock before restarting.
GPU 0% — Real Issue vs Sampling Artifact
⚠️ If nvidia-smi shows 0% GPU but you catch it in occasional bursts (35→0→35→0), that MAY be a sampling artifact from the batch/wait pattern. But if RunPod dashboard also shows 0% persistently AND power draw is ~120W (vs 690W when healthy), it is a REAL problem. The fix is always the loop: use sleep 1 + long prompts.
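Power draw is the most reliable discriminator here. A hedged sketch that classifies mining health from it, using this guide's H200 numbers (~690W mining, ~120W idle); the 400W cutoff is an assumption, and the fallback sample value lets you try the logic on a machine without nvidia-smi:

```shell
# Classify mining health from GPU power draw (H200 numbers from this guide)
POWER=$(nvidia-smi --query-gpu=power.draw --format=csv,noheader,nounits 2>/dev/null | head -1)
POWER=${POWER:-120.0}   # fallback sample when no GPU is present
VERDICT=$(awk -v p="$POWER" 'BEGIN { print (p+0 > 400) ? "MINING" : "NOT MINING" }')
echo "$VERDICT at ${POWER} W"
```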
Short Prompts Kill Mining (m Value Too Low)
The should_use_noisy_gemm() function requires m ≥ 1024 (default threshold in config.yaml). Short prompts produce small batch sizes (m < 1024) and mining is skipped entirely. Always use long prefill-heavy prompts (~150+ tokens input, max_tokens=1). Target m=5000-8000+. Power draw is the quickest sanity check: 690W = mining, 120W = not mining.
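A hedged sketch mirroring the threshold logic this guide patches in config.py (return (m >= min_m) and (n >= min_n) and (k >= min_k)), with the 1024 defaults mentioned above, to show concretely why short prompts skip mining:

```shell
# Mirror of the should_use_noisy_gemm threshold check (1024 defaults assumed)
gemm_check() {
  awk -v m="$1" -v n="$2" -v k="$3" \
    'BEGIN { print (m >= 1024 && n >= 1024 && k >= 1024) ? "GEMM fires" : "skipped (no mining)" }'
}
gemm_check 512  57344 8192   # short prompts, m<1024 -> skipped (no mining)
gemm_check 8174 57344 8192   # long prompts, healthy m -> GEMM fires
```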
Community Resources
☄️ Community Resources
Independent tools built by the Pearl mining community.