Pearl Miner Setup Guide Complete installation, configuration, health monitoring & troubleshooting RunPod 2x H200 CUDA 12.8+ DP=2 Mode Llama 70B Pearl Miner Setup Guide Complete installation, configuration, health monitoring & troubleshooting RunPod / Vast.ai / Lambda CUDA 12.4+ H100 / H200 Required DP=2 Mode Llama 70B Ubuntu 22.04 Contents — Hardware Requirements & Pod Verification (READ FIRST) 00a Cloud Provider Paths & Compatibility 00 Quick Reference (Key Settings) 01 System Dependencies 02 Clone & Build 02b Set Environment Variables (Persistent) 03 Wallet & Node Setup 04 Start Mining 05 Health Check & Diagnostics 06 Troubleshooting 07 Block Verification 08 Quick Restart Reference 08b Additional Critical Notes 09 Key Lessons Learned 10 Important Gotchas & Edge Cases ⚡ Hardware Requirements & Pod Verification 🚨 This guide was built and tested exclusively on RunPod 2x H200 SXM. Every command, expected output, VRAM value, power draw figure, and health check threshold in this guide is calibrated for that specific setup. If you use different hardware, commands will still work but expected output values will differ. What This Guide Is Designed For Component This Guide's Setup Notes GPU 2x NVIDIA H200 SXM Confirmed working. All expected values in this guide are for H200. VRAM per GPU 143,771 MiB (~141GB) After model load: ~132,964 MiB GPU0 / ~131,399 MiB GPU1 CUDA Version 12.8 Minimum: 12.4. Tested on 12.8. Driver 570.211.01 Any 520+ should work GPU Count Exactly 2 Guide uses --data-parallel-size 2 System RAM 64GB+ Needed for build + model loading Disk 300GB+ persistent Model ~140GB + builds ~50GB + chain ~5GB OS Ubuntu 22.04 RunPod default image Provider RunPod See Section 00a for other providers GPU power at full mining ~690W each (near 700W TDP) This is the health indicator — if power is 120W, mining is not happening Other Hardware — Community Reports (Not Tested by This Guide) ⚠️ The following is based on Pearl Discord community reports — NOT verified by this guide. If you use different hardware, expected output values will differ from what this guide shows. Proceed with caution and adapt health check thresholds accordingly. GPU Community Status Notes H200 SXM ×2 ✅ This guide — confirmed Reference setup for this guide H100 SXM ×2 ✅ Community confirmed Works. 80GB VRAM each. Adjust VRAM expectations in health checks. H100 NVL ×1 + H200 ×1 ⚠️ Community reported Mixed setup. Some users got blocks. Single H200 ⚠️ Possible Use --data-parallel-size 1, 64 requests. Lower hashrate. A100 ×2 ❌ Not recommended Ampere architecture — Pearl kernel targets Hopper. May not compile. RTX 4090 ×2 ❌ Insufficient VRAM 24GB each = 48GB total. Not enough for 70B model. Step 0 — Verify Your Pod Before Starting Run these immediately after SSH-ing in. If any check fails, reprovision before continuing. Check 1 — GPU model, CUDA, VRAM nvidia-smi ✅ Good (H200) 2x H200, CUDA 12.8, Driver 570+, 143771 MiB each, 0 MiB used ❌ Bad Wrong GPU, CUDA <12.4, only 1 GPU, or VRAM already used → reprovision Check 2 — Disk space (need 300GB+ free) df -h | sort -rh | head -8 ✅ Good 300GB+ available on at least one partition ❌ Bad Less than 300GB → expand disk before proceeding Check 3 — RAM free -h ✅ Good 64GB+ total RAM ❌ Bad Under 64GB → may OOM during build or model load Check 4 — OS lsb_release -a 2>/dev/null || cat /etc/os-release | head -5 ✅ Good Ubuntu 22.04 LTS (Jammy) ⚠️ Untested Ubuntu 20.04 — may work but not verified by this guide Check 5 — Internet curl -s --max-time 5 https://github.com > /dev/null && echo "GitHub OK" && curl -s --max-time 5 https://huggingface.co > /dev/null && echo "HuggingFace OK" ✅ Good GitHub OK / HuggingFace OK ❌ Bad Blocked → check provider firewall / outbound rules Full pre-flight one-liner echo "=== GPU ===" && nvidia-smi --query-gpu=name,memory.total,driver_version --format=csv,noheader && echo "=== DISK ===" && df -h | sort -rh | head -5 && echo "=== RAM ===" && free -h | grep Mem && echo "=== OS ===" && lsb_release -d 2>/dev/null && echo "=== NETWORK ===" && curl -s --max-time 5 https://github.com > /dev/null && echo "GitHub OK" || echo "GitHub BLOCKED" ✅ Only proceed to Step 01 if: 2x H200 (or compatible GPU), CUDA 12.4+, 300GB+ disk, 64GB+ RAM, Ubuntu 22.04, GitHub reachable. 01 System Dependencies 02 Clone & Build 02b Set Environment Variables (Persistent) 03 Wallet & Node Setup 04 Start Mining 05 Health Check & Diagnostics 06 Troubleshooting 07 Block Verification 08 Quick Restart Reference 08b Additional Critical Notes (Watchdog, Debug, Gotchas) 09 Key Lessons Learned 10 Important Gotchas & Edge Cases 00a Cloud Provider Paths & Compatibility This guide was built and tested on RunPod. The core setup is identical across providers — only storage paths and a few installation details differ. Use this table to adapt the guide for your provider. Provider HF_HOME path UV Cache path Persistent storage Notes RunPod ✅ Tested /workspace/.hf /workspace/.uv-cache /workspace Deadsnakes PPA blocked — use UV for Python 3.12. Ubuntu 22.04. Vast.ai /root/.cache/huggingface or /workspace/.hf /root/.cache/uv /workspace (if attached) Use Custom Template. Ubuntu 22.04 works. apt python3.12 may work via deadsnakes. Lambda Labs /home/ubuntu/.cache/huggingface /home/ubuntu/.cache/uv /home/ubuntu Ubuntu 22.04. Python 3.12 via deadsnakes should work. Run as ubuntu not root. CoreWeave /mnt/data/.hf /mnt/data/.uv-cache /mnt/data Kubernetes-based. Persistent volume must be mounted manually. Paperspace /notebooks/.hf /notebooks/.uv-cache /notebooks Ubuntu 20.04/22.04. Python 3.12 via deadsnakes. Any provider Any path with 200GB+ free space Any writable path Check df -h for largest partition Find largest partition: df -h | sort -rh | head -5 How to Adapt This Guide for Any Provider Replace every occurrence of /workspace/.hf with your provider's persistent storage path, and /workspace/.uv-cache with the UV cache path. The two places these appear are: 1. In ~/.bashrc export HF_HOME=/YOUR_PROVIDER_PATH/.hf 2. In the build:miner command (Step 2) cd /root/pearl && export UV_CACHE_DIR=/YOUR_PROVIDER_PATH/.uv-cache && export HF_HOME=/YOUR_PROVIDER_PATH/.hf && task build:miner Python 3.12 Installation by Provider Provider Python 3.12 Method Command RunPod apt blocked — use UV uv python install 3.12 Vast.ai Try apt first, fallback to UV apt-get install -y python3.12 || uv python install 3.12 Lambda / Paperspace apt via deadsnakes PPA add-apt-repository ppa:deadsnakes/ppa && apt-get install -y python3.12 Any provider (universal) UV always works uv python install 3.12 ℹ️ UV-based Python install (Step 1) always works regardless of provider — it downloads a standalone CPython binary. Use it as the universal fallback if apt fails. Pre-flight Check (run on any fresh pod) Verify GPU + CUDA before starting nvidia-smi && echo "CUDA OK" || echo "NO GPU DETECTED" ✅ Good Shows H100/H200, CUDA 12.x, Driver 520+ ❌ Bad No GPU detected → wrong instance type, reprovision Find largest storage partition (for HF_HOME) df -h | sort -rh | head -5 Pick the partition with 300GB+ free space for HF_HOME. The 70B model needs ~140GB. 00 Quick Reference Setting Value Why Parallelism --data-parallel-size 2 NOT tensor parallel — TP reduces m dimension Prefix Caching --no-enable-prefix-caching MUST disable — caching = no GEMM = no mining Chunked Prefill --no-enable-chunked-prefill Must disable for correct mining behavior GPU Memory --gpu-memory-utilization 0.9 Leave 10% headroom Model Length --max-model-len 8192 Fits in 80GB VRAM Execution --enforce-eager Required for Pearl kernel ZK Speed export RAYON_NUM_THREADS=96 Faster proof generation Deep GEMM export VLLM_USE_DEEP_GEMM=0 Disable — conflicts with Pearl GEMM Requests 128 concurrent long-prompt requests Long prompts (~150+ tokens) needed for m≥5000 Loop pattern sleep 1 (NOT wait) wait causes GPU to idle between batches → 0% util Request port port 8000 ONLY DP=2 exposes single port — port 8001 drops silently Socket Count 4 ESTAB connections 2 per DP engine = 4 total when healthy n value in NOISY_GEMM 57344 Confirms DP mode (TP gives 28672) Node RPC port 44107 (pearld) pearl daemon Wallet RPC port 44207 (oyster) wallet daemon 01 System Dependencies ℹ️ Run each block separately. Verify output before moving to next step. Go Language Run wget -q https://go.dev/dl/go1.24.2.linux-amd64.tar.gz && tar -C /usr/local -xzf go1.24.2.linux-amd64.tar.gz && export PATH=\$PATH:/usr/local/go/bin && echo 'export PATH=\$PATH:/usr/local/go/bin' >> ~/.bashrc Verify go version ✅ Good go version go1.24.2 linux/amd64 ❌ Bad command not found → re-run wget/tar command Rust Toolchain Run curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y && source ~/.cargo/env Verify rustc --version ✅ Good rustc 1.xx.x (xxxxxxx YYYY-MM-DD) ❌ Bad command not found → source ~/.cargo/env UV Package Manager Run curl -LsSf https://astral.sh/uv/install.sh | sh && source \$HOME/.local/bin/env ✅ Good uv 0.x.x ❌ Bad command not found → run: source \$HOME/.local/bin/env Taskfile Run sh -c "\$(curl --location https://taskfile.dev/install.sh)" -- -d -b /usr/local/bin ✅ Good Task version: vx.x.x ❌ Bad permission denied → check /usr/local/bin permissions tmux Run apt-get update -qq && apt-get install -y tmux Python 3.12 ⚠️ On RunPod, the deadsnakes PPA is blocked — apt-get install python3.12 will fail with "Unable to locate package". Use UV to install Python 3.12 instead (UV is already installed above). Install Python 3.12 via UV uv python install 3.12 Make it the system default ln -sf \$(uv python find 3.12) /usr/local/bin/python3.12 && update-alternatives --install /usr/bin/python3 python3 /usr/local/bin/python3.12 1 && python3 --version ✅ Good Python 3.12.x ❌ Bad command not found → re-run uv python install 3.12 02 Clone & Build Clone Repository Run cd /root && git clone https://github.com/pearl-research-labs/pearl.git && cd pearl ✅ Good Cloning into 'pearl'... done. You are now in /root/pearl ❌ Bad fatal: repository not found → check internet ℹ️ All build commands must run from /root/pearl directory. Verify with: pwd → should show /root/pearl Build Blockchain Run (from /root/pearl) cd /root/pearl && task build:blockchain ✅ Good Build completes without errors ❌ Bad go: command not found → export PATH=\$PATH:/usr/local/go/bin Verify binaries exist ls -la /root/pearl/bin/pearld /root/pearl/bin/oyster /root/pearl/bin/prlctl ✅ Good All 3 files listed with size >0 ❌ Bad No such file → build failed, check task output for errors Build Miner (~20-25 minutes) ⚠️ This takes 20-25 minutes. Do NOT interrupt it! First run compiles CUDA kernels. Run (from /root/pearl) cd /root/pearl && export UV_CACHE_DIR=/workspace/.uv-cache && export HF_HOME=/workspace/.hf && task build:miner ✅ Good Installed 265 packages — vllm==0.20.0+cu129 in list ❌ Bad CUDA build failed → check nvidia-smi shows H100/H200 Verify venv exists ls /root/pearl/.venv/bin/vllm && ls /root/pearl/.venv/bin/pearl-gateway ✅ Good Both files listed — build successful ❌ Bad No such file → miner build failed, re-run task build:miner 02b Set Environment Variables (Persistent) 🚨 CRITICAL: Do this BEFORE starting the miner. Env vars must be in ~/.bashrc so they survive across shell sessions and tmux windows. If you only export them inline, the gateway will fail with "mining_address: Field required" because the vars don't reach the tmux session. Add all required env vars to ~/.bashrc now (you will update PEARLD_MINING_ADDRESS after Step 3): Add to ~/.bashrc cat >> ~/.bashrc ✅ Good Prints: 0 ❌ Bad Empty output → vars not set, re-run the cat command ⚠️ After generating your mining address in Step 3, update ~/.bashrc: replace PLACEHOLDER with your real address, then run source ~/.bashrc before starting the miner in Step 4. 03 Wallet & Node Setup Create Wallet Run cd /root/pearl && ./bin/oyster --create 🚨 CRITICAL: Write down the 12-word seed phrase! This is your ONLY backup. If you lose it you lose all mined PRL forever. When prompted, answer as follows: Prompt Answer Do you want to add a passphrase? No (just press Enter) — or set one you'll remember Do you have an existing seed phrase? No Seed phrase shown ⚠️ WRITE IT DOWN NOW — all 12 words in order Type OK to confirm OK Start tmux Sessions Run tmux new-session -d -s node && tmux new-session -d -s miner && tmux new-session -d -s loop ✅ Good tmux ls shows: node, miner, loop sessions ❌ Bad session exists → tmux kill-session -t node first Start Blockchain Node Run tmux send-keys -t node "cd /root/pearl && ./bin/pearld --rpcuser=rpcuser --rpcpass=rpcpass --rpclisten=0.0.0.0:44107 --txindex --notls" Enter Wait 30 seconds then verify: Verify cd /root/pearl && ./bin/prlctl -u rpcuser -P rpcpass -s localhost:44107 --notls getblockcount ✅ Good Returns a block number (e.g., 36000+) ❌ Bad connection refused → node not started, check tmux node session Get Mining Address Run /root/pearl/bin/oyster -u rpcuser -P pearl123 --noclienttls --noservertls --pearldusername=rpcuser --pearldpassword=rpcpass > /tmp/oyster.log 2>&1 & sleep 15 && /root/pearl/bin/prlctl -u rpcuser -P pearl123 -s localhost:44207 --wallet --notls getnewaddress ✅ Good Returns address starting with prl1p... ❌ Bad connection refused → oyster not ready, wait longer and retry ⚠️ SAVE THIS ADDRESS! You'll need it in every restart command. Also verify it with validateaddress below. 🚨 Now update your ~/.bashrc with the real address: sed -i 's/PEARLD_MINING_ADDRESS=PLACEHOLDER/PEARLD_MINING_ADDRESS=YOUR_ACTUAL_ADDRESS/' ~/.bashrc && source ~/.bashrc && echo \$PEARLD_MINING_ADDRESS — confirm it prints your address before proceeding. Verify Address is Yours Run (replace YOUR_ADDRESS) /root/pearl/bin/prlctl -u rpcuser -P pearl123 -s localhost:44207 --wallet --notls validateaddress YOUR_ADDRESS ✅ Good "ismine": true ❌ Bad "ismine": false → wrong address, generate a new one 04 Start Mining Start Gateway + vLLM (replace YOUR_MINING_ADDRESS) 🚨 Gateway and vLLM MUST start in the same tmux session! If separate, they won't connect. Also — delete any stale socket first: rm -f /tmp/pearlgw.sock 🚨 Use FULL PATHS to vllm and pearl-gateway — do NOT rely on venv activate inside tmux. The activate command often fails silently in tmux send-keys, causing "vllm: command not found". Run rm -f /tmp/pearlgw.sock && tmux kill-session -t miner 2>/dev/null; tmux new-session -d -s miner && tmux send-keys -t miner "cd /root/pearl && source ~/.bashrc && /root/pearl/.venv/bin/pearl-gateway start > /tmp/gateway.log 2>&1 & sleep 10 && /root/pearl/.venv/bin/vllm serve pearl-ai/Llama-3.3-70B-Instruct-pearl --host 0.0.0.0 --port 8000 --max-model-len 8192 --gpu-memory-utilization 0.9 --enforce-eager --data-parallel-size 2 --no-enable-prefix-caching --no-enable-chunked-prefill" Enter ℹ️ Gateway logs go to /tmp/gateway.log — this keeps the miner tmux session clean so vLLM output is visible. Check gateway: tail -5 /tmp/gateway.log ⚠️ vLLM takes 10-15 minutes to load the 70B model on first run (~140GB download). Subsequent runs use cached model from /workspace/.hf and load in ~2-3 minutes. Wait for Node to Sync Before vLLM Starts 🚨 If the node is still syncing when vLLM starts, it will crash with "mining_paused: no block template available". The node must be fully synced first. Check sync status: Check sync status cd /root/pearl && ./bin/prlctl -u rpcuser -P rpcpass -s localhost:44107 --notls getblockchaininfo 2>/dev/null | grep -E "blocks|headers" ✅ Synced blocks == headers (same number) ❌ Syncing headers > blocks → wait, re-check every 30 seconds Verify vLLM Loaded Check GPU Memory nvidia-smi --query-gpu=index,memory.used --format=csv,noheader ✅ Good 0, 132964 MiB / 1, 131397 MiB ❌ Bad 0, 4 MiB / 1, 4 MiB → still loading, wait Check Health curl -s http://localhost:8000/health && echo "READY" || echo "NOT READY" ✅ Good READY ❌ Bad NOT READY → still loading, wait and retry Start Request Loop 🚨 Prompts MUST be randomized! Same prompts = KV caching = ZERO MINING! 🚨 Use sleep 1 NOT wait ! Using wait causes GPU to drop to 0% between batches (burst/idle pattern). sleep 1 keeps requests continuously overlapping for 90%+ GPU utilization! 🚨 Use LONG prompts (~150+ tokens)! Short prompts produce small m values (m<1024) which fail the should_use_noisy_gemm() threshold check = NO MINING. Long prefill-heavy prompts achieve m=5000-8000+ for maximum hash rate. ⚠️ Send ALL requests to port 8000 ONLY. With DP=2, vLLM exposes a single port (8000). Port 8001 does NOT exist — requests there are dropped silently. Run tmux send-keys -t loop "COUNT=0; while true; do COUNT=\\\$((COUNT+1)); for i in 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128; do curl -s http://localhost:8000/v1/chat/completions -H 'Content-Type: application/json' -d '{\\"model\\": \\"pearl-ai/Llama-3.3-70B-Instruct-pearl\\", \\"messages\\": [{\\"role\\": \\"user\\", \\"content\\": \\"Write a detailed comprehensive academic essay about topic \\\$COUNT variant \\\$i covering the following aspects in depth: historical background and origins dating back centuries, mathematical foundations and theoretical frameworks, scientific principles and empirical evidence, technological applications and modern implementations, economic implications and market dynamics, social and cultural impacts on society, philosophical interpretations and ethical considerations, future prospects and emerging research directions, comparative analysis with related fields, and practical case studies with real world examples.\\"}], \\"max_tokens\\": 1}' > /dev/null & done; sleep 1; done" Enter Verify Loop + Mining is Working (wait 2 minutes then run) Check GPU utilization went up nvidia-smi --query-gpu=index,utilization.gpu --format=csv,noheader ✅ Good Both GPUs at 90%+ ❌ Bad 0% → loop not sending requests, check tmux loop session Confirm NOISY_GEMM is firing tmux capture-pane -t miner -p -S -5000 | grep "NOISY_GEMM" | tail -3 ✅ Mining! NOISY_GEMM_CALLED: m=5000+ n=57344 k=8192 on BOTH workers ❌ Not Mining No output → use -S -5000 (larger buffer), or apply NOISY_GEMM debug patch from Section 08b ⚠️ NOISY_GEMM output goes to the tmux buffer. Always use -S -5000 (not -S -50) to look back far enough — the buffer fills with other logs quickly. Verify with Metrics Endpoint The vLLM metrics endpoint is the most reliable way to confirm everything is working correctly: Check requests running + cache hits curl -s http://localhost:8000/metrics | grep -E "num_requests_running|cache_hit" | grep -v "^#\\|reason\\|external\\|mm_cache" ✅ Healthy num_requests_running engine=0: 30-50, engine=1: 30-50 | cache_hit: 0.0 ❌ Problem num_requests_running: 0.0 → loop not sending | cache_hit > 0 → caching active, prompts not random enough ℹ️ All vLLM logs including NOISY_GEMM are also written to /tmp/vllm_live.log — useful for debugging when tmux buffer fills up. 05 Health Check & Diagnostics Master Health Check (paste this every time you reconnect) Full Diagnostic echo "=== TMUX ===" && tmux ls && echo "=== SOCKETS ===" && ss -x | grep pearlgw | wc -l && echo "=== VLLM ===" && pgrep -f "vllm serve" | wc -l && echo "=== GPU ===" && nvidia-smi --query-gpu=index,utilization.gpu,memory.used --format=csv,noheader && echo "=== MINING ADDRESS ===" && cat /proc/\$(pgrep -f "pearl-gateway" | head -1)/environ | tr '\\0' '\\n' | grep "MINING_ADDRESS" && echo "=== NOISY_GEMM ===" && tmux capture-pane -t miner -p -S -50 | grep "NOISY_GEMM" | tail -3 && echo "=== LOOP ===" && tmux capture-pane -t loop -p -S -3 | tail -2 && echo "=== BLOCKS ===" && tmux capture-pane -t miner -p -S -5000 | grep -i "block accepted\\|Block found\\|proof" Expected Healthy Values Check Healthy Value Action if Wrong TMUX SESSIONS miner, loop, node Recreate missing sessions SOCKETS 4 Restart miner — gateway/vLLM disconnected VLLM 1 Restart miner tmux session GPU utilization 90-98% both GPUs Fix loop: use sleep 1 + long prompts GPU power draw 600-690W each (near 700W TDP) Low power = GPU idle = loop not working GPU memory ~132GB each vLLM crashed — restart miner NOISY_GEMM m value 5000-8000+ Use longer prompts in loop NOISY_GEMM n value 57344 Must be 57344 — confirms DP mode working NOISY_GEMM workers Both Worker PIDs firing Only one firing = one GPU idle LOOP curl commands visible, many PIDs Restart loop tmux session MINING ADDRESS Your prl1p... address Kill gateway and restart with correct address CACHE HITS 0.0 Prompts too similar — randomize more 06 Troubleshooting 🔴 Problem: GPU shows 0% utilization persistently (confirmed on RunPod dashboard) This is NOT a sampling artifact if RunPod dashboard also shows 0%. Root cause is almost always the request loop — either using wait instead of sleep 1 , or short prompts that produce m values below the 1024 threshold. Diagnose — check requests actually running curl -s http://localhost:8000/metrics | grep "num_requests_running" | grep -v "^#\\|reason" | awk '{print \$2}' | tr '\\n' ' ' ✅ Good 30-50 requests running per engine ❌ Bad 0.0 0.0 → loop not running or requests completing too fast Fix: Kill loop, restart with sleep 1 (not wait ) and long prompts (~150+ tokens). See Step 4 loop command. 🔴 Problem: vLLM crashes with "NVCC compilation failed" DeepGEMM is trying to JIT-compile CUDA kernels and failing. Root cause: VLLM_USE_DEEP_GEMM env var is not set or not reaching the vLLM process. Verify env var is set echo \$VLLM_USE_DEEP_GEMM ✅ Good 0 ❌ Bad Empty → add to ~/.bashrc and source it, then kill miner session and recreate 🔴 Problem: vLLM crashes with "mining_paused: no block template available" The blockchain node is still syncing. vLLM starts but immediately crashes because there is no block to mine. Check sync status cd /root/pearl && ./bin/prlctl -u rpcuser -P rpcpass -s localhost:44107 --notls getblockchaininfo 2>/dev/null | grep -E "blocks|headers" Wait until blocks == headers before starting vLLM. Can take 5-15 minutes on first launch. 🔴 Problem: Gateway crashes with "mining_address: Field required" The PEARLD_MINING_ADDRESS env var is not reaching the gateway process. This happens when env vars are only exported inline rather than in ~/.bashrc, or when the miner tmux session was created before the vars were set. Fix echo \$PEARLD_MINING_ADDRESS If empty: add to ~/.bashrc, source it, then kill and recreate the miner tmux session before restarting. 🔴 Problem: "vllm: command not found" in miner tmux session source .venv/bin/activate often fails silently inside tmux send-keys, so vllm is not in PATH. Fix: Always use FULL PATHS: /root/pearl/.venv/bin/vllm and /root/pearl/.venv/bin/pearl-gateway instead of relying on venv activation. 🔴 Problem: Socket count is 0 after restart Stale socket file from previous run. Gateway creates /tmp/pearlgw.sock and won't overwrite it. Fix rm -f /tmp/pearlgw.sock && echo "cleared" Always delete the socket before restarting. Add to all restart procedures. Fix pkill -9 -f "pearl-gateway" && pkill -9 -f "vllm" && pkill -9 -f "EngineCore" && sleep 5 Then restart miner with full command from Step 4. 🔴 Problem: Socket count is 0 Gateway and vLLM are not connected. Happens when they start in separate sessions. Fix pkill -9 -f "pearl-gateway" && pkill -9 -f "vllm" && pkill -9 -f "EngineCore" && sleep 5 Restart BOTH gateway and vLLM together in the SAME miner session. 🔴 Problem: NOISY_GEMM n is 28672 (not 57344) TP mode is active instead of DP. Restart with --data-parallel-size 2 flag. Verify pgrep -f "vllm serve" | xargs -I{} cat /proc/{}/cmdline | tr '\\0' ' ' | grep "data-parallel" ✅ Good --data-parallel-size 2 visible ❌ Bad Not visible → restart with correct flag 🔴 Problem: No blocks found after hours Check difficulty — if >100,000 expect blocks every 4-8 hours with 2x H200 Verify prompts are randomized — same prompts = KV caching = no mining Check "Block accepted by node!" in miner logs Check explorer for your address 🔴 Problem: Wrong mining address in gateway Check actual address cat /proc/\$(pgrep -f "pearl-gateway" | head -1)/environ | tr '\\0' '\\n' | grep "MINING_ADDRESS" If wrong: pkill -f "pearl-gateway" then restart miner with correct PEARLD_MINING_ADDRESS. 🟡 Problem: vLLM keeps dying — add watchdog Create watchdog (replace YOUR_ADDRESS) cat > /root/pearl/watchdog.sh > /tmp/watchdog.log pkill -9 -f "pearl-gateway"; pkill -9 -f "vllm"; pkill -9 -f "EngineCore" sleep 10 cd /root/pearl && source .venv/bin/activate && \\ export RAYON_NUM_THREADS=96 PEARLD_RPC_URL=http://localhost:44107 \\ PEARLD_RPC_USER=rpcuser PEARLD_RPC_PASSWORD=rpcpass \\ PEARLD_MINING_ADDRESS=\$MINING_ADDRESS HF_HOME=/workspace/.hf \\ VLLM_USE_DEEP_GEMM=0 && \\ pearl-gateway start & sleep 10 && \\ vllm serve pearl-ai/Llama-3.3-70B-Instruct-pearl \\ --host 0.0.0.0 --port 8000 --max-model-len 8192 \\ --gpu-memory-utilization 0.9 --enforce-eager \\ --data-parallel-size 2 --no-enable-prefix-caching \\ --no-enable-chunked-prefill & sleep 900 fi sleep 60 done EOF chmod +x /root/pearl/watchdog.sh tmux new-session -d -s watchdog tmux send-keys -t watchdog "bash /root/pearl/watchdog.sh" Enter 07 Block Verification Check Logs for Block Activity Run tmux capture-pane -t miner -p -S -50000 | grep -i "block accepted\\|Block found\\|proof\\|submit" ✅ Block Found! Block accepted by node! — submission_service.py Block submission result: {'status': 'accepted'} ❌ No Output No blocks found yet — check difficulty and wait Check Explorer Open in browser https://explorer.pearlresearch.ai/address/YOUR_MINING_ADDRESS ✅ Good Shows balance and transaction history with PRL received ❌ Bad Address Not Found — no confirmed blocks yet (normal if new) 08 Quick Restart Reference Quick Status Check (paste after reconnecting) Run pgrep -f "vllm serve" | wc -l && ss -x | grep pearlgw | wc -l && nvidia-smi --query-gpu=utilization.gpu --format=csv,noheader ✅ Healthy 1 / 4 / 97% / 97% ❌ Dead 0 / 0 / 0% / 0% → do full restart below Full Clean Restart (address already in ~/.bashrc) Run pkill -9 -f "pearl-gateway"; pkill -9 -f "vllm"; pkill -9 -f "EngineCore"; pkill -9 -f "Worker"; sleep 3 && rm -f /tmp/pearlgw.sock && tmux kill-session -t miner 2>/dev/null; tmux new-session -d -s miner && tmux send-keys -t miner "cd /root/pearl && source ~/.bashrc && /root/pearl/.venv/bin/pearl-gateway start > /tmp/gateway.log 2>&1 & sleep 10 && /root/pearl/.venv/bin/vllm serve pearl-ai/Llama-3.3-70B-Instruct-pearl --host 0.0.0.0 --port 8000 --max-model-len 8192 --gpu-memory-utilization 0.9 --enforce-eager --data-parallel-size 2 --no-enable-prefix-caching --no-enable-chunked-prefill" Enter Restart Loop Only Run tmux send-keys -t loop C-c Then send the full loop command from Step 4. 08b Additional Critical Notes Oyster Wallet Keeps Dying — This is Normal! ℹ️ Oyster dies frequently. Mining does NOT need oyster running. Oyster is only needed to check balance or generate new addresses. You can ignore oyster dying. Only run oyster when needed for balance check /root/pearl/bin/oyster -u rpcuser -P pearl123 --noclienttls --noservertls --pearldusername=rpcuser --pearldpassword=rpcpass > /tmp/oyster.log 2>&1 & sleep 15 && /root/pearl/bin/prlctl -u rpcuser -P pearl123 -s localhost:44207 --wallet --notls getbalance Model Download (First Run Only) ⚠️ First time vLLM runs it downloads the 70B model (~140GB). This takes 15-30 extra minutes. Subsequent runs use cached model from /workspace/.hf Watch download progress tmux capture-pane -t miner -p -S -20 | grep -i "download\\|Downloading\\|fetching" Verify Mining is Actually Happening (Debug Patch) Add a print statement to confirm NOISY_GEMM is being called: Apply patch python3 -c " with open('/root/pearl/miner/vllm-miner/src/vllm_miner/config.py', 'r') as f: content = f.read() old = ' return (m >= min_m) and (n >= min_n) and (k >= min_k)' new = ''' result = (m >= min_m) and (n >= min_n) and (k >= min_k) if result: print(f\\"NOISY_GEMM_CALLED: m={m} n={n} k={k}\\", flush=True) return result''' content = content.replace(old, new) with open('/root/pearl/miner/vllm-miner/src/vllm_miner/config.py', 'w') as f: f.write(content) print('Patched!') " ℹ️ After applying patch, restart the miner. Then check: tmux capture-pane -t miner -p -S -50 | grep "NOISY_GEMM" | tail -3 ✅ Good: NOISY_GEMM_CALLED: m=5000+ n=57344 k=8192 ❌ Bad: No output → mining not happening OCR/Screenshot Address Warning 🚨 If you copy your mining address from a screenshot using OCR (Gemini, Google Lens, etc.) — NEVER trust it! Characters like 5/s, 0/O, m/n, l/1 are commonly confused. Always verify the address manually character by character or use the validateaddress command. Difficulty Context Difficulty Expected Block Time (2x H200) Status ~29,000 ~1 block/hour Early network (April 27, 2026) ~68,000 ~2 hours/block Day 3 ~115,000 ~4 hours/block Day 4 >150,000 6-8+ hours/block Highly competitive Check current difficulty cd /root/pearl && ./bin/prlctl -u rpcuser -P rpcpass -s localhost:44107 --notls getblockchaininfo 2>/dev/null | grep -E "blocks|difficulty" Separate vLLM tmux Session Warning 🚨 If you have a tmux session called "vllm" from a previous setup — it can cause confusion. Old Worker processes may still show NOISY_GEMM but be disconnected from the gateway. Always check sockets (must be 4) to confirm connection, not just NOISY_GEMM output. Kill stale vllm session if exists tmux kill-session -t vllm 2>/dev/null; echo "done" Gateway Debug Mode Add --debug flag to gateway for more verbose logs including block submissions: In the miner startup command, replace pearl-gateway start With pearl-gateway --debug start Full Loop Command (for Step 8 restarts) Restart loop tmux send-keys -t loop C-c && sleep 2 && pkill -f "curl.*localhost:8000" && sleep 2 && tmux send-keys -t loop "COUNT=0; while true; do COUNT=\\\$((COUNT+1)); for i in 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128; do curl -s http://localhost:8000/v1/chat/completions -H 'Content-Type: application/json' -d '{\\"model\\": \\"pearl-ai/Llama-3.3-70B-Instruct-pearl\\", \\"messages\\": [{\\"role\\": \\"user\\", \\"content\\": \\"Write a detailed comprehensive academic essay about topic \\\$COUNT variant \\\$i covering the following aspects in depth: historical background and origins dating back centuries, mathematical foundations and theoretical frameworks, scientific principles and empirical evidence, technological applications and modern implementations, economic implications and market dynamics, social and cultural impacts on society, philosophical interpretations and ethical considerations, future prospects and emerging research directions, comparative analysis with related fields, and practical case studies with real world examples.\\"}], \\"max_tokens\\": 1}' > /dev/null & done; sleep 1; done" Enter Node Peer Count Check Check peers (need 8+) cd /root/pearl && ./bin/prlctl -u rpcuser -P rpcpass -s localhost:44107 --notls getpeerinfo 2>/dev/null | grep "addr" | wc -l ✅ Good 8+ peers ❌ Bad 0-2 peers → node not synced yet, wait longer 09 Key Lessons Learned Critical Mistake Consequence Fix Using \`wait\` in request loop GPU goes to 0% between batches — burst/idle pattern, very inefficient Use \`sleep 1\` instead — keeps requests continuously overlapping Sending requests to port 8001 DP=2 only exposes port 8000 — port 8001 requests are dropped Always send all requests to port 8000 only Using --tensor-parallel-size 2 Reduces n to 28672, less mining efficiency Use --data-parallel-size 2 Prefix caching enabled Same prompts cached — NO GEMM = NO MINING Always use --no-enable-prefix-caching Gateway in separate session from vLLM Socket not connected, env vars not inherited Start both in same tmux miner session Sending same prompt repeatedly KV cache kicks in, GEMM skipped entirely Randomize with COUNT and i variables config.yaml thresholds at 1 Overhead without benefit for our matrix sizes Keep at 1024 (default) Not verifying mining address Blocks could go to wrong wallet Always validateaddress + check /proc environ MINER_DEBUG env vars Don't reach EngineCore subprocess Use PEARL_LOG_LEVEL=DEBUG instead ✅ Proof of Working Setup: Confirmed mining with NOISY_GEMM_CALLED: m=8174, n=57344 on both workers. GPU 0: 96%, 690W. GPU 1: 97%, 689W. One confirmed block on explorer with 2884 PRL (May 1, 2026). Second miner confirmed operational May 2, 2026. Setup: RunPod 2x H200 SXM, CUDA 12.8, DP=2, 128 concurrent long-prompt requests, sleep 1 loop. 10 Important Gotchas & Edge Cases Password Confusion — Two Different Passwords! Service Username Password Port pearld node (prlctl) rpcuser rpcpass 44107 oyster wallet (prlctl --wallet) rpcuser pearl123 44207 🚨 Using wrong password is a common mistake! Node uses "rpcpass", wallet uses "pearl123" Normal Warning Messages (Not Errors — Ignore These) These are NORMAL — do not worry about them Error creating a default config file: open /root/.oyster/oyster.conf: no such file or directory Error creating a default config file: open /root/.pearld/pearld.conf: no such file or directory Warning: Running on mainnet with --noclienttls is not recommended Warning: Running on mainnet with --noservertls is not recommended Block Accepted ≠ Block Confirmed ⚠️ Seeing "Block accepted by node!" in logs does NOT guarantee the block makes it to the main chain. It can be orphaned if another miner found a block at the same height faster. The explorer is the ONLY ground truth for confirmed blocks and PRL balance. validateaddress ismine: true is NOT 100% Reliable ⚠️ We discovered that validateaddress can return ismine: true even with a slightly different address (possible OCR corruption). Always verify the address character by character manually — don't rely solely on ismine: true. sleep 10 Between Gateway and vLLM The startup command uses pearl-gateway start & sleep 10 && vllm serve ... The & runs gateway in background, sleep 10 gives it time to create the socket, then vLLM starts and connects to it. If vLLM starts before the socket exists, they won't connect. HuggingFace Token The pearl-ai model downloaded fine without an HF token in our setup. If you get auth errors: Set HF token if needed export HF_TOKEN=your_token_here vLLM Process Name vs api_server ℹ️ Some old diagnostic scripts use pgrep -f "api_server" to detect vLLM. This returns 0 even when vLLM IS running! Always use pgrep -f "vllm serve" instead. tmux Buffer Limitation By default tmux only stores a limited scroll buffer. Block activity messages from hours ago may not appear in tmux capture-pane . The explo

Mining Guide

Pearl Miner Setup Guide

Complete installation, configuration, health monitoring & troubleshooting

RunPod 2x H200 CUDA 12.8+ DP=2 Mode Llama 70B

Pearl Miner Setup Guide

Complete installation, configuration, health monitoring & troubleshooting

RunPod / Vast.ai / Lambda CUDA 12.4+ H100 / H200 Required DP=2 Mode Llama 70B Ubuntu 22.04

— Hardware Requirements & Pod Verification (READ FIRST) 00a Cloud Provider Paths & Compatibility 00 Quick Reference (Key Settings) 01 System Dependencies 02 Clone & Build 02b Set Environment Variables (Persistent) 03 Wallet & Node Setup 04 Start Mining 05 Health Check & Diagnostics 06 Troubleshooting 07 Block Verification 08 Quick Restart Reference 08b Additional Critical Notes 09 Key Lessons Learned 10 Important Gotchas & Edge Cases

⚡ Hardware Requirements & Pod Verification

🚨 This guide was built and tested exclusively on RunPod 2x H200 SXM. Every command, expected output, VRAM value, power draw figure, and health check threshold in this guide is calibrated for that specific setup. If you use different hardware, commands will still work but expected output values will differ.

What This Guide Is Designed For

Component	This Guide's Setup	Notes
GPU	2x NVIDIA H200 SXM	Confirmed working. All expected values in this guide are for H200.
VRAM per GPU	143,771 MiB (~141GB)	After model load: ~132,964 MiB GPU0 / ~131,399 MiB GPU1
CUDA Version	12.8	Minimum: 12.4. Tested on 12.8.
Driver	570.211.01	Any 520+ should work
GPU Count	Exactly 2	Guide uses --data-parallel-size 2
System RAM	64GB+	Needed for build + model loading
Disk	300GB+ persistent	Model ~140GB + builds ~50GB + chain ~5GB
OS	Ubuntu 22.04	RunPod default image
Provider	RunPod	See Section 00a for other providers
GPU power at full mining	~690W each (near 700W TDP)	This is the health indicator — if power is 120W, mining is not happening

Other Hardware — Community Reports (Not Tested by This Guide)

⚠️ The following is based on Pearl Discord community reports — NOT verified by this guide. If you use different hardware, expected output values will differ from what this guide shows. Proceed with caution and adapt health check thresholds accordingly.

GPU	Community Status	Notes
H200 SXM ×2	✅ This guide — confirmed	Reference setup for this guide
H100 SXM ×2	✅ Community confirmed	Works. 80GB VRAM each. Adjust VRAM expectations in health checks.
H100 NVL ×1 + H200 ×1	⚠️ Community reported	Mixed setup. Some users got blocks.
Single H200	⚠️ Possible	Use --data-parallel-size 1, 64 requests. Lower hashrate.
A100 ×2	❌ Not recommended	Ampere architecture — Pearl kernel targets Hopper. May not compile.
RTX 4090 ×2	❌ Insufficient VRAM	24GB each = 48GB total. Not enough for 70B model.

Step 0 — Verify Your Pod Before Starting

Run these immediately after SSH-ing in. If any check fails, reprovision before continuing.

Check 1 — GPU model, CUDA, VRAM

nvidia-smi

✅ Good (H200)

2x H200, CUDA 12.8, Driver 570+, 143771 MiB each, 0 MiB used

❌ Bad

Wrong GPU, CUDA <12.4, only 1 GPU, or VRAM already used → reprovision

Check 2 — Disk space (need 300GB+ free)

df -h | sort -rh | head -8

✅ Good

300GB+ available on at least one partition

❌ Bad

Less than 300GB → expand disk before proceeding

Check 3 — RAM

free -h

✅ Good

64GB+ total RAM

❌ Bad

Under 64GB → may OOM during build or model load

Check 4 — OS

lsb_release -a 2>/dev/null || cat /etc/os-release | head -5

✅ Good

Ubuntu 22.04 LTS (Jammy)

⚠️ Untested

Ubuntu 20.04 — may work but not verified by this guide

Check 5 — Internet

curl -s --max-time 5 https://github.com > /dev/null && echo "GitHub OK" && curl -s --max-time 5 https://huggingface.co > /dev/null && echo "HuggingFace OK"

✅ Good

GitHub OK / HuggingFace OK

❌ Bad

Blocked → check provider firewall / outbound rules

Step 0b — Expose Port 44108 Before Deploying (Do This First!)

⚠️ Pearl's P2P port is 44108. If you expose it before deploying your pod, other nodes on the network can connect TO you (inbound connections), giving you more peers and faster block propagation. If you don't expose it, you'll be limited to ~16 outbound-only peers — which still works fine for mining but is not optimal.

ℹ️ On RunPod: before clicking Deploy, find the TCP Port Exposures field and add port 44108. This takes 5 seconds and costs nothing. If your pod is already running, it requires a full restart to add — only worth it at natural restart time.

Scenario	Peers	Impact
Port 44108 NOT exposed (RunPod default)	~16 outbound only	Works fine. Block propagation slightly slower.
Port 44108 exposed	Up to 200 inbound+outbound	Better connectivity, faster block propagation.
Discord reports of 200+ peers	200+	These users have inbound port exposed AND are on providers with open firewall.

✅ If you already deployed without exposing port 44108 — don't restart just for this. Wait until next natural restart and add it then. 16 peers does not meaningfully affect your mining rewards.

Full pre-flight one-liner

echo "=== GPU ===" && nvidia-smi --query-gpu=name,memory.total,driver_version --format=csv,noheader && echo "=== DISK ===" && df -h | sort -rh | head -5 && echo "=== RAM ===" && free -h | grep Mem && echo "=== OS ===" && lsb_release -d 2>/dev/null && echo "=== NETWORK ===" && curl -s --max-time 5 https://github.com > /dev/null && echo "GitHub OK" || echo "GitHub BLOCKED"

✅ Only proceed to Step 01 if: 2x H200 (or compatible GPU), CUDA 12.4+, 300GB+ disk, 64GB+ RAM, Ubuntu 22.04, GitHub reachable.

01 System Dependencies 02 Clone & Build 02b Set Environment Variables (Persistent) 03 Wallet & Node Setup 04 Start Mining 05 Health Check & Diagnostics 06 Troubleshooting 07 Block Verification 08 Quick Restart Reference 08b Additional Critical Notes (Watchdog, Debug, Gotchas) 09 Key Lessons Learned 10 Important Gotchas & Edge Cases

00a Cloud Provider Paths & Compatibility

This guide was built and tested on RunPod. The core setup is identical across providers — only storage paths and a few installation details differ. Use this table to adapt the guide for your provider.

Provider	HF_HOME path	UV Cache path	Persistent storage	Notes
RunPod ✅ Tested	`/workspace/.hf`	`/workspace/.uv-cache`	`/workspace`	Deadsnakes PPA blocked — use UV for Python 3.12. Ubuntu 22.04.
Vast.ai	`/root/.cache/huggingface` or `/workspace/.hf`	`/root/.cache/uv`	`/workspace` (if attached)	Use Custom Template. Ubuntu 22.04 works. apt python3.12 may work via deadsnakes.
Lambda Labs	`/home/ubuntu/.cache/huggingface`	`/home/ubuntu/.cache/uv`	`/home/ubuntu`	Ubuntu 22.04. Python 3.12 via deadsnakes should work. Run as ubuntu not root.
CoreWeave	`/mnt/data/.hf`	`/mnt/data/.uv-cache`	`/mnt/data`	Kubernetes-based. Persistent volume must be mounted manually.
Paperspace	`/notebooks/.hf`	`/notebooks/.uv-cache`	`/notebooks`	Ubuntu 20.04/22.04. Python 3.12 via deadsnakes.
Any provider	Any path with 200GB+ free space	Any writable path	Check df -h for largest partition	Find largest partition: `df -h \| sort -rh \| head -5`

How to Adapt This Guide for Any Provider

Replace every occurrence of /workspace/.hf with your provider's persistent storage path, and /workspace/.uv-cache with the UV cache path. The two places these appear are:

1. In ~/.bashrc

export HF_HOME=/YOUR_PROVIDER_PATH/.hf

2. In the build:miner command (Step 2)

cd /root/pearl && export UV_CACHE_DIR=/YOUR_PROVIDER_PATH/.uv-cache && export HF_HOME=/YOUR_PROVIDER_PATH/.hf && task build:miner

Python 3.12 Installation by Provider

Provider	Python 3.12 Method	Command
RunPod	apt blocked — use UV	`uv python install 3.12`
Vast.ai	Try apt first, fallback to UV	`apt-get install -y python3.12 \|\| uv python install 3.12`
Lambda / Paperspace	apt via deadsnakes PPA	`add-apt-repository ppa:deadsnakes/ppa && apt-get install -y python3.12`
Any provider (universal)	UV always works	`uv python install 3.12`

ℹ️ UV-based Python install (Step 1) always works regardless of provider — it downloads a standalone CPython binary. Use it as the universal fallback if apt fails.

Pre-flight Check (run on any fresh pod)

Verify GPU + CUDA before starting

nvidia-smi && echo "CUDA OK" || echo "NO GPU DETECTED"

✅ Good

Shows H100/H200, CUDA 12.x, Driver 520+

❌ Bad

No GPU detected → wrong instance type, reprovision

Find largest storage partition (for HF_HOME)

df -h | sort -rh | head -5

Pick the partition with 300GB+ free space for HF_HOME. The 70B model needs ~140GB.

00 Quick Reference

Setting	Value	Why
Parallelism	`--data-parallel-size 2`	NOT tensor parallel — TP reduces m dimension
Prefix Caching	`--no-enable-prefix-caching`	MUST disable — caching = no GEMM = no mining
Chunked Prefill	`--no-enable-chunked-prefill`	Must disable for correct mining behavior
GPU Memory	`--gpu-memory-utilization 0.9`	Leave 10% headroom
Model Length	`--max-model-len 8192`	Fits in 80GB VRAM
Execution	`--enforce-eager`	Required for Pearl kernel
ZK Speed	`export RAYON_NUM_THREADS=96`	Faster proof generation
Deep GEMM	`export VLLM_USE_DEEP_GEMM=0`	Disable — conflicts with Pearl GEMM
Requests	`128 concurrent long-prompt requests`	Long prompts (~150+ tokens) needed for m≥5000
Loop pattern	`sleep 1` (NOT wait)	wait causes GPU to idle between batches → 0% util
Request port	`port 8000 ONLY`	DP=2 exposes single port — port 8001 drops silently
Socket Count	4 ESTAB connections	2 per DP engine = 4 total when healthy
n value in NOISY_GEMM	57344	Confirms DP mode (TP gives 28672)
Node RPC	port 44107 (pearld)	pearl daemon
Wallet RPC	port 44207 (oyster)	wallet daemon

01 System Dependencies

ℹ️ Run each block separately. Verify output before moving to next step.

Go Language

Run

wget -q https://go.dev/dl/go1.24.2.linux-amd64.tar.gz && tar -C /usr/local -xzf go1.24.2.linux-amd64.tar.gz && export PATH=$PATH:/usr/local/go/bin && echo 'export PATH=$PATH:/usr/local/go/bin' >> ~/.bashrc

Verify

go version

✅ Good

go version go1.24.2 linux/amd64

❌ Bad

command not found → re-run wget/tar command

Rust Toolchain

Run

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y && source ~/.cargo/env

Verify

rustc --version

✅ Good

rustc 1.xx.x (xxxxxxx YYYY-MM-DD)

❌ Bad

command not found → source ~/.cargo/env

UV Package Manager

Run

curl -LsSf https://astral.sh/uv/install.sh | sh && source $HOME/.local/bin/env

✅ Good

uv 0.x.x

❌ Bad

command not found → run: source $HOME/.local/bin/env

Taskfile

Run

sh -c "$(curl --location https://taskfile.dev/install.sh)" -- -d -b /usr/local/bin

✅ Good

Task version: vx.x.x

❌ Bad

permission denied → check /usr/local/bin permissions

tmux

Run

apt-get update -qq && apt-get install -y tmux

Python 3.12

⚠️ On RunPod, the deadsnakes PPA is blocked — apt-get install python3.12 will fail with "Unable to locate package". Use UV to install Python 3.12 instead (UV is already installed above).

Install Python 3.12 via UV

uv python install 3.12

Make it the system default

ln -sf $(uv python find 3.12) /usr/local/bin/python3.12 && update-alternatives --install /usr/bin/python3 python3 /usr/local/bin/python3.12 1 && python3 --version

✅ Good

Python 3.12.x

❌ Bad

command not found → re-run uv python install 3.12

02 Clone & Build

Clone Repository

Run

cd /root && git clone https://github.com/pearl-research-labs/pearl.git && cd pearl

✅ Good

Cloning into 'pearl'... done. You are now in /root/pearl

❌ Bad

fatal: repository not found → check internet

ℹ️ All build commands must run from /root/pearl directory. Verify with: pwd → should show /root/pearl

Build Blockchain

Run (from /root/pearl)

cd /root/pearl && task build:blockchain

✅ Good

Build completes without errors

❌ Bad

go: command not found → export PATH=$PATH:/usr/local/go/bin

Verify binaries exist

ls -la /root/pearl/bin/pearld /root/pearl/bin/oyster /root/pearl/bin/prlctl

✅ Good

All 3 files listed with size >0

❌ Bad

No such file → build failed, check task output for errors

Build Miner (~20-25 minutes)

⚠️ This takes 20-25 minutes. Do NOT interrupt it! First run compiles CUDA kernels.

Run (from /root/pearl)

cd /root/pearl && export UV_CACHE_DIR=/workspace/.uv-cache && export HF_HOME=/workspace/.hf && task build:miner

✅ Good

Installed 265 packages — vllm==0.20.0+cu129 in list

❌ Bad

CUDA build failed → check nvidia-smi shows H100/H200

Verify venv exists

ls /root/pearl/.venv/bin/vllm && ls /root/pearl/.venv/bin/pearl-gateway

✅ Good

Both files listed — build successful

❌ Bad

No such file → miner build failed, re-run task build:miner

02b Set Environment Variables (Persistent)

🚨 CRITICAL: Do this BEFORE starting the miner. Env vars must be in ~/.bashrc so they survive across shell sessions and tmux windows. If you only export them inline, the gateway will fail with "mining_address: Field required" because the vars don't reach the tmux session.

Add all required env vars to ~/.bashrc now (you will update PEARLD_MINING_ADDRESS after Step 3):

Add to ~/.bashrc

cat >> ~/.bashrc << 'EOF'
export PEARLD_RPC_URL=http://localhost:44107
export PEARLD_RPC_USER=rpcuser
export PEARLD_RPC_PASSWORD=rpcpass
export PEARLD_MINING_ADDRESS=PLACEHOLDER
export HF_HOME=/workspace/.hf
export VLLM_USE_DEEP_GEMM=0
export RAYON_NUM_THREADS=96
EOF
source ~/.bashrc && echo $VLLM_USE_DEEP_GEMM

✅ Good

Prints: 0

❌ Bad

Empty output → vars not set, re-run the cat command

⚠️ After generating your mining address in Step 3, update ~/.bashrc: replace PLACEHOLDER with your real address, then run source ~/.bashrc before starting the miner in Step 4.

03 Wallet & Node Setup

Create Wallet

Run

cd /root/pearl && ./bin/oyster --create

🚨 CRITICAL: Write down the 12-word seed phrase! This is your ONLY backup. If you lose it you lose all mined PRL forever.

When prompted, answer as follows:

Prompt	Answer
Do you want to add a passphrase?	No (just press Enter) — or set one you'll remember
Do you have an existing seed phrase?	No
Seed phrase shown	⚠️ WRITE IT DOWN NOW — all 12 words in order
Type OK to confirm	OK

Start tmux Sessions

Run

tmux new-session -d -s node && tmux new-session -d -s miner && tmux new-session -d -s loop

✅ Good

tmux ls shows: node, miner, loop sessions

❌ Bad

session exists → tmux kill-session -t node first

Start Blockchain Node

Run

tmux send-keys -t node "cd /root/pearl && ./bin/pearld --rpcuser=rpcuser --rpcpass=rpcpass --rpclisten=0.0.0.0:44107 --txindex --notls --maxpeers=200" Enter

Wait 30 seconds then verify:

Verify

cd /root/pearl && ./bin/prlctl -u rpcuser -P rpcpass -s localhost:44107 --notls getblockcount

✅ Good

Returns a block number (e.g., 36000+)

❌ Bad

connection refused → node not started, check tmux node session

Get Mining Address

Run

/root/pearl/bin/oyster -u rpcuser -P pearl123 --noclienttls --noservertls --pearldusername=rpcuser --pearldpassword=rpcpass > /tmp/oyster.log 2>&1 & sleep 15 && /root/pearl/bin/prlctl -u rpcuser -P pearl123 -s localhost:44207 --wallet --notls getnewaddress

✅ Good

Returns address starting with prl1p...

❌ Bad

connection refused → oyster not ready, wait longer and retry

⚠️ SAVE THIS ADDRESS! You'll need it in every restart command. Also verify it with validateaddress below.

🚨 Now update your ~/.bashrc with the real address:

sed -i 's/PEARLD_MINING_ADDRESS=PLACEHOLDER/PEARLD_MINING_ADDRESS=YOUR_ACTUAL_ADDRESS/' ~/.bashrc && source ~/.bashrc && echo $PEARLD_MINING_ADDRESS

— confirm it prints your address before proceeding.

Verify Address is Yours

Run (replace YOUR_ADDRESS)

/root/pearl/bin/prlctl -u rpcuser -P pearl123 -s localhost:44207 --wallet --notls validateaddress YOUR_ADDRESS

✅ Good

"ismine": true

❌ Bad

"ismine": false → wrong address, generate a new one

04 Start Mining

Start Gateway + vLLM (replace YOUR_MINING_ADDRESS)

🚨 Gateway and vLLM MUST start in the same tmux session! If separate, they won't connect. Also — delete any stale socket first: rm -f /tmp/pearlgw.sock

🚨 Use FULL PATHS to vllm and pearl-gateway — do NOT rely on venv activate inside tmux. The activate command often fails silently in tmux send-keys, causing "vllm: command not found".

Run

rm -f /tmp/pearlgw.sock && tmux kill-session -t miner 2>/dev/null; tmux new-session -d -s miner && tmux send-keys -t miner "cd /root/pearl && source ~/.bashrc && /root/pearl/.venv/bin/pearl-gateway start > /tmp/gateway.log 2>&1 & sleep 10 && /root/pearl/.venv/bin/vllm serve pearl-ai/Llama-3.3-70B-Instruct-pearl --host 0.0.0.0 --port 8000 --max-model-len 8192 --gpu-memory-utilization 0.9 --enforce-eager --data-parallel-size 2 --no-enable-prefix-caching --no-enable-chunked-prefill" Enter

ℹ️ Gateway logs go to /tmp/gateway.log — this keeps the miner tmux session clean so vLLM output is visible. Check gateway: tail -5 /tmp/gateway.log

⚠️ vLLM takes 10-15 minutes to load the 70B model on first run (~140GB download). Subsequent runs use cached model from /workspace/.hf and load in ~2-3 minutes.

Wait for Node to Sync Before vLLM Starts

🚨 If the node is still syncing when vLLM starts, it will crash with "mining_paused: no block template available". The node must be fully synced first. Check sync status:

Check sync status

cd /root/pearl && ./bin/prlctl -u rpcuser -P rpcpass -s localhost:44107 --notls getblockchaininfo 2>/dev/null | grep -E "blocks|headers"

✅ Synced

blocks == headers (same number)

❌ Syncing

headers > blocks → wait, re-check every 30 seconds

Verify vLLM Loaded

Check GPU Memory

nvidia-smi --query-gpu=index,memory.used --format=csv,noheader

✅ Good

0, 132964 MiB / 1, 131397 MiB

❌ Bad

0, 4 MiB / 1, 4 MiB → still loading, wait

Check Health

curl -s http://localhost:8000/health && echo "READY" || echo "NOT READY"

✅ Good

READY

❌ Bad

NOT READY → still loading, wait and retry

Start Request Loop

🚨 Prompts MUST be randomized! Same prompts = KV caching = ZERO MINING!

🚨 Use sleep 1 NOT wait! Using wait causes GPU to drop to 0% between batches (burst/idle pattern). sleep 1 keeps requests continuously overlapping for 90%+ GPU utilization!

🚨 Use LONG prompts (~150+ tokens)! Short prompts produce small m values (m<1024) which fail the should_use_noisy_gemm() threshold check = NO MINING. Long prefill-heavy prompts achieve m=5000-8000+ for maximum hash rate.

⚠️ Send ALL requests to port 8000 ONLY. With DP=2, vLLM exposes a single port (8000). Port 8001 does NOT exist — requests there are dropped silently.

Run

tmux send-keys -t loop "COUNT=0; while true; do COUNT=\$((COUNT+1)); for i in 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128; do curl -s http://localhost:8000/v1/chat/completions -H 'Content-Type: application/json' -d '{\"model\": \"pearl-ai/Llama-3.3-70B-Instruct-pearl\", \"messages\": [{\"role\": \"user\", \"content\": \"Write a detailed comprehensive academic essay about topic \$COUNT variant \$i covering the following aspects in depth: historical background and origins dating back centuries, mathematical foundations and theoretical frameworks, scientific principles and empirical evidence, technological applications and modern implementations, economic implications and market dynamics, social and cultural impacts on society, philosophical interpretations and ethical considerations, future prospects and emerging research directions, comparative analysis with related fields, and practical case studies with real world examples.\"}], \"max_tokens\": 1}' > /dev/null & done; sleep 1; done" Enter

Verify Loop + Mining is Working (wait 2 minutes then run)

Check GPU utilization went up

nvidia-smi --query-gpu=index,utilization.gpu --format=csv,noheader

✅ Good

Both GPUs at 90%+

❌ Bad

0% → loop not sending requests, check tmux loop session

Confirm NOISY_GEMM is firing

tmux capture-pane -t miner -p -S -5000 | grep "NOISY_GEMM" | tail -3

✅ Mining!

NOISY_GEMM_CALLED: m=5000+ n=57344 k=8192 on BOTH workers

❌ Not Mining

No output → use -S -5000 (larger buffer), or apply NOISY_GEMM debug patch from Section 08b

⚠️ NOISY_GEMM output goes to the tmux buffer. Always use -S -5000 (not -S -50) to look back far enough — the buffer fills with other logs quickly.

Verify with Metrics Endpoint

The vLLM metrics endpoint is the most reliable way to confirm everything is working correctly:

Check requests running + cache hits

curl -s http://localhost:8000/metrics | grep -E "num_requests_running|cache_hit" | grep -v "^#\|reason\|external\|mm_cache"

✅ Healthy

num_requests_running engine=0: 30-50, engine=1: 30-50 | cache_hit: 0.0

❌ Problem

num_requests_running: 0.0 → loop not sending | cache_hit > 0 → caching active, prompts not random enough

ℹ️ All vLLM logs including NOISY_GEMM are also written to /tmp/vllm_live.log — useful for debugging when tmux buffer fills up.

05 Health Check & Diagnostics

Master Health Check (paste this every time you reconnect)

Full Diagnostic

echo "=== TMUX ===" && tmux ls && echo "=== SOCKETS ===" && ss -x | grep pearlgw | wc -l && echo "=== VLLM ===" && pgrep -f "vllm serve" | wc -l && echo "=== GPU ===" && nvidia-smi --query-gpu=index,utilization.gpu,memory.used,power.draw --format=csv,noheader && echo "=== MINING ADDRESS ===" && cat /proc/$(pgrep -f "pearl-gateway" | head -1)/environ | tr '\0' '\n' | grep "MINING_ADDRESS" && echo "=== NOISY_GEMM ===" && tmux capture-pane -t miner -p -S -5000 | grep "NOISY_GEMM" | grep "n=57344" | tail -3 && echo "=== LOOP ===" && tmux capture-pane -t loop -p -S -3 | tail -2 && echo "=== CURL JOBS ===" && pgrep -f "curl.*localhost:8000" | wc -l && echo "=== REQUESTS RUNNING ===" && curl -s http://localhost:8000/metrics | grep "num_requests_running" | grep -v "^#\|reason" | awk '{print $2}' | tr '\n' ' ' && echo "" && echo "=== CACHE HITS ===" && curl -s http://localhost:8000/metrics | grep "cache_hit" | grep -v "^#\|external\|mm_cache" | awk '{print $2}' | tr '\n' ' ' && echo "" && echo "=== PEERS ===" && cd /root/pearl && ./bin/prlctl -u rpcuser -P rpcpass -s localhost:44107 --notls getpeerinfo 2>/dev/null | grep "addr" | wc -l && echo "=== BLOCK COUNT ===" && ./bin/prlctl -u rpcuser -P rpcpass -s localhost:44107 --notls getblockcount 2>/dev/null && echo "=== WATCHDOG ===" && cat /tmp/loop_watchdog.log 2>/dev/null || echo "No restarts yet" && echo "=== BLOCKS ===" && tmux capture-pane -t miner -p -S -5000 | grep -i "block accepted\|Block found\|proof"

Expected Healthy Values

Check	Healthy Value	Action if Wrong
TMUX SESSIONS	miner, loop, node, watchdog	Recreate missing sessions
SOCKETS	4	Restart miner — gateway/vLLM disconnected
PEERS	8-16 on RunPod (normal)	16 = normal without exposed port 44108. Not a problem. See Step 0b.
VLLM	1	Restart miner tmux session
GPU utilization	90-98% both GPUs	Restart loop — use sleep 1 + long prompts
GPU power draw	600-690W each (near 700W TDP)	Low power = GPU idle = loop stalled
GPU memory	~132GB each	vLLM crashed — restart miner
NOISY_GEMM m value	5000-8000+	Use longer prompts in loop. Low m = less mining throughput.
NOISY_GEMM n value	57344	Must be 57344 — confirms DP mode working
NOISY_GEMM workers	Both Worker PIDs firing	Only one firing = one GPU idle — restart loop
CURL JOBS	500+ jobs in flight	Under 10 = loop stalled — watchdog should auto-restart
REQUESTS RUNNING	30-50 per engine (balanced)	0 on one engine = unbalanced — restart loop
CACHE HITS	0.0	Prompts too similar — randomize more
LOOP	Many PIDs visible, large count number	Restart loop or check watchdog log
MINING ADDRESS	Your prl1p... address	Kill gateway and restart with correct address in ~/.bashrc
WATCHDOG	No restarts yet / shows timestamps	Not running → set up loop watchdog (Section 08b)

06 Troubleshooting

🔴 Problem: GPU shows 0% utilization persistently (confirmed on RunPod dashboard)

This is NOT a sampling artifact if RunPod dashboard also shows 0%. Root cause is almost always the request loop — either using wait instead of sleep 1, or short prompts that produce m values below the 1024 threshold.

Diagnose — check requests actually running

curl -s http://localhost:8000/metrics | grep "num_requests_running" | grep -v "^#\|reason" | awk '{print $2}' | tr '\n' ' '

✅ Good

30-50 requests running per engine

❌ Bad

0.0 0.0 → loop not running or requests completing too fast

Fix: Kill loop, restart with sleep 1 (not wait) and long prompts (~150+ tokens). See Step 4 loop command.

🔴 Problem: vLLM crashes with "NVCC compilation failed"

DeepGEMM is trying to JIT-compile CUDA kernels and failing. Root cause: VLLM_USE_DEEP_GEMM env var is not set or not reaching the vLLM process.

Verify env var is set

echo $VLLM_USE_DEEP_GEMM

✅ Good

❌ Bad

Empty → add to ~/.bashrc and source it, then kill miner session and recreate

🔴 Problem: vLLM crashes with "mining_paused: no block template available"

The blockchain node is still syncing. vLLM starts but immediately crashes because there is no block to mine.

Check sync status

cd /root/pearl && ./bin/prlctl -u rpcuser -P rpcpass -s localhost:44107 --notls getblockchaininfo 2>/dev/null | grep -E "blocks|headers"

Wait until blocks == headers before starting vLLM. Can take 5-15 minutes on first launch.

🔴 Problem: Gateway crashes with "mining_address: Field required"

The PEARLD_MINING_ADDRESS env var is not reaching the gateway process. This happens when env vars are only exported inline rather than in ~/.bashrc, or when the miner tmux session was created before the vars were set.

Fix

echo $PEARLD_MINING_ADDRESS

If empty: add to ~/.bashrc, source it, then kill and recreate the miner tmux session before restarting.

🔴 Problem: "vllm: command not found" in miner tmux session

source .venv/bin/activate often fails silently inside tmux send-keys, so vllm is not in PATH.

Fix: Always use FULL PATHS: /root/pearl/.venv/bin/vllm and /root/pearl/.venv/bin/pearl-gateway instead of relying on venv activation.

🔴 Problem: Socket count is 0 after restart

Stale socket file from previous run. Gateway creates /tmp/pearlgw.sock and won't overwrite it.

Fix

rm -f /tmp/pearlgw.sock && echo "cleared"

Always delete the socket before restarting. Add to all restart procedures.

Fix

pkill -9 -f "pearl-gateway" && pkill -9 -f "vllm" && pkill -9 -f "EngineCore" && sleep 5

Then restart miner with full command from Step 4.

🔄 Loop Watchdog — Recommended for All Setups

The request loop can stall silently — curl jobs drop to 0, GPU goes to 0%, but vLLM stays running. This watchdog checks every 60 seconds and auto-restarts the loop if fewer than 10 curl jobs are running. Set this up on every miner.

Create loop watchdog

cat > /root/loop_watchdog.sh << 'EOF'
#!/bin/bash
while true; do
  CURL_COUNT=$(pgrep -f "curl.*localhost:8000" | wc -l)
  if [ "$CURL_COUNT" -lt 10 ]; then
    echo "$(date) - Loop stalled (${CURL_COUNT} curl jobs), restarting..." >> /tmp/loop_watchdog.log
    tmux send-keys -t loop C-c 2>/dev/null
    sleep 2
    pkill -f "curl.*localhost:8000" 2>/dev/null
    sleep 2
    tmux send-keys -t loop "COUNT=0; while true; do COUNT=\$((COUNT+1)); for i in 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128; do curl -s http://localhost:8000/v1/chat/completions -H 'Content-Type: application/json' -d '{\\\"model\\\": \\\"pearl-ai/Llama-3.3-70B-Instruct-pearl\\\", \\\"messages\\\": [{\\\"role\\\": \\\"user\\\", \\\"content\\\": \\\"Write a detailed comprehensive academic essay about topic \$COUNT variant \$i covering the following aspects in depth: historical background and origins dating back centuries, mathematical foundations and theoretical frameworks, scientific principles and empirical evidence, technological applications and modern implementations, economic implications and market dynamics, social and cultural impacts on society, philosophical interpretations and ethical considerations, future prospects and emerging research directions, comparative analysis with related fields, and practical case studies with real world examples.\\\"}], \\\"max_tokens\\\": 1}' > /dev/null & done; sleep 1; done" Enter
    echo "$(date) - Loop restarted" >> /tmp/loop_watchdog.log
  fi
  sleep 60
done
EOF
chmod +x /root/loop_watchdog.sh && tmux new-session -d -s watchdog && tmux send-keys -t watchdog "/root/loop_watchdog.sh" Enter && echo "✓ Loop watchdog running" && tmux ls | grep watchdog

✅ Good

watchdog: 1 windows (created ...)

❌ Bad

session already exists → kill existing: tmux kill-session -t watchdog, then retry

Check watchdog log anytime

cat /tmp/loop_watchdog.log 2>/dev/null || echo "No restarts yet"

07 Block Verification

Check Logs for Block Activity

Run

tmux capture-pane -t miner -p -S -50000 | grep -i "block accepted\|Block found\|proof\|submit"

✅ Block Found!

Block accepted by node! — submission_service.py
Block submission result: {'status': 'accepted'}

❌ No Output

No blocks found yet — check difficulty and wait

Check Explorer

Open in browser

https://explorer.pearlresearch.ai/address/YOUR_MINING_ADDRESS

✅ Good

Shows balance and transaction history with PRL received

❌ Bad

Address Not Found — no confirmed blocks yet (normal if new)

08 Quick Restart Reference

Quick Status Check (paste after reconnecting)

Run

pgrep -f "vllm serve" | wc -l && ss -x | grep pearlgw | wc -l && nvidia-smi --query-gpu=utilization.gpu --format=csv,noheader

✅ Healthy

1 / 4 / 97% / 97%

❌ Dead

0 / 0 / 0% / 0% → do full restart below

Full Clean Restart (address already in ~/.bashrc)

Run

pkill -9 -f "pearl-gateway"; pkill -9 -f "vllm"; pkill -9 -f "EngineCore"; pkill -9 -f "Worker"; sleep 3 && rm -f /tmp/pearlgw.sock && tmux kill-session -t miner 2>/dev/null; tmux new-session -d -s miner && tmux send-keys -t miner "cd /root/pearl && source ~/.bashrc && /root/pearl/.venv/bin/pearl-gateway start > /tmp/gateway.log 2>&1 & sleep 10 && /root/pearl/.venv/bin/vllm serve pearl-ai/Llama-3.3-70B-Instruct-pearl --host 0.0.0.0 --port 8000 --max-model-len 8192 --gpu-memory-utilization 0.9 --enforce-eager --data-parallel-size 2 --no-enable-prefix-caching --no-enable-chunked-prefill" Enter

Restart Loop Only

Run

tmux send-keys -t loop C-c

Then send the full loop command from Step 4.

08b Additional Critical Notes

Oyster Wallet Keeps Dying — This is Normal!

ℹ️ Oyster dies frequently. Mining does NOT need oyster running. Oyster is only needed to check balance or generate new addresses. You can ignore oyster dying.

Only run oyster when needed for balance check

/root/pearl/bin/oyster -u rpcuser -P pearl123 --noclienttls --noservertls --pearldusername=rpcuser --pearldpassword=rpcpass > /tmp/oyster.log 2>&1 & sleep 15 && /root/pearl/bin/prlctl -u rpcuser -P pearl123 -s localhost:44207 --wallet --notls getbalance

Model Download (First Run Only)

⚠️ First time vLLM runs it downloads the 70B model (~140GB). This takes 15-30 extra minutes. Subsequent runs use cached model from /workspace/.hf

Watch download progress

tmux capture-pane -t miner -p -S -20 | grep -i "download\|Downloading\|fetching"

Verify Mining is Actually Happening (Debug Patch)

Add a print statement to confirm NOISY_GEMM is being called:

Apply patch

python3 -c "
with open('/root/pearl/miner/vllm-miner/src/vllm_miner/config.py', 'r') as f:
    content = f.read()
old = '        return (m >= min_m) and (n >= min_n) and (k >= min_k)'
new = '''        result = (m >= min_m) and (n >= min_n) and (k >= min_k)
        if result:
            print(f\"NOISY_GEMM_CALLED: m={m} n={n} k={k}\", flush=True)
        return result'''
content = content.replace(old, new)
with open('/root/pearl/miner/vllm-miner/src/vllm_miner/config.py', 'w') as f:
    f.write(content)
print('Patched!')
"

ℹ️ After applying patch, restart the miner. Then check: tmux capture-pane -t miner -p -S -50 | grep "NOISY_GEMM" | tail -3
✅ Good: NOISY_GEMM_CALLED: m=5000+ n=57344 k=8192
❌ Bad: No output → mining not happening

OCR/Screenshot Address Warning

🚨 If you copy your mining address from a screenshot using OCR (Gemini, Google Lens, etc.) — NEVER trust it! Characters like 5/s, 0/O, m/n, l/1 are commonly confused. Always verify the address manually character by character or use the validateaddress command.

Difficulty Context

Difficulty	Expected Block Time (2x H200)	Status
~29,000	~1 block/hour	Early network (April 27, 2026)
~68,000	~2 hours/block	Day 3
~115,000	~4 hours/block	Day 4
>150,000	6-8+ hours/block	Highly competitive

Check current difficulty

cd /root/pearl && ./bin/prlctl -u rpcuser -P rpcpass -s localhost:44107 --notls getblockchaininfo 2>/dev/null | grep -E "blocks|difficulty"

Separate vLLM tmux Session Warning

🚨 If you have a tmux session called "vllm" from a previous setup — it can cause confusion. Old Worker processes may still show NOISY_GEMM but be disconnected from the gateway. Always check sockets (must be 4) to confirm connection, not just NOISY_GEMM output.

Kill stale vllm session if exists

tmux kill-session -t vllm 2>/dev/null; echo "done"

Gateway Debug Mode

Add --debug flag to gateway for more verbose logs including block submissions:

In the miner startup command, replace

pearl-gateway start

With

pearl-gateway --debug start

Full Loop Command (for Step 8 restarts)

Restart loop

tmux send-keys -t loop C-c && sleep 2 && pkill -f "curl.*localhost:8000" && sleep 2 && tmux send-keys -t loop "COUNT=0; while true; do COUNT=\$((COUNT+1)); for i in 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128; do curl -s http://localhost:8000/v1/chat/completions -H 'Content-Type: application/json' -d '{\"model\": \"pearl-ai/Llama-3.3-70B-Instruct-pearl\", \"messages\": [{\"role\": \"user\", \"content\": \"Write a detailed comprehensive academic essay about topic \$COUNT variant \$i covering the following aspects in depth: historical background and origins dating back centuries, mathematical foundations and theoretical frameworks, scientific principles and empirical evidence, technological applications and modern implementations, economic implications and market dynamics, social and cultural impacts on society, philosophical interpretations and ethical considerations, future prospects and emerging research directions, comparative analysis with related fields, and practical case studies with real world examples.\"}], \"max_tokens\": 1}' > /dev/null & done; sleep 1; done" Enter

Node Peer Count Check

Check peers (need 8+)

cd /root/pearl && ./bin/prlctl -u rpcuser -P rpcpass -s localhost:44107 --notls getpeerinfo 2>/dev/null | grep "addr" | wc -l

✅ Good

8+ peers

❌ Bad

0-2 peers → node not synced yet, wait longer

09 Key Lessons Learned

Critical Mistake	Consequence	Fix
Using `wait` in request loop	GPU goes to 0% between batches — burst/idle pattern, very inefficient	Use `sleep 1` instead — keeps requests continuously overlapping
Sending requests to port 8001	DP=2 only exposes port 8000 — port 8001 requests are dropped	Always send all requests to port 8000 only
Using --tensor-parallel-size 2	Reduces n to 28672, less mining efficiency	Use --data-parallel-size 2
Prefix caching enabled	Same prompts cached — NO GEMM = NO MINING	Always use --no-enable-prefix-caching
Gateway in separate session from vLLM	Socket not connected, env vars not inherited	Start both in same tmux miner session
Sending same prompt repeatedly	KV cache kicks in, GEMM skipped entirely	Randomize with COUNT and i variables
config.yaml thresholds at 1	Overhead without benefit for our matrix sizes	Keep at 1024 (default)
Not verifying mining address	Blocks could go to wrong wallet	Always validateaddress + check /proc environ
MINER_DEBUG env vars	Don't reach EngineCore subprocess	Use PEARL_LOG_LEVEL=DEBUG instead

✅ Proof of Working Setup: Confirmed mining with NOISY_GEMM_CALLED: m=8174, n=57344 on both workers. GPU 0: 96%, 690W. GPU 1: 97%, 689W. One confirmed block on explorer with 2884 PRL (May 1, 2026). Second miner confirmed operational May 2, 2026. Setup: RunPod 2x H200 SXM, CUDA 12.8, DP=2, 128 concurrent long-prompt requests, sleep 1 loop.

10 Important Gotchas & Edge Cases

Password Confusion — Two Different Passwords!

Service	Username	Password	Port
pearld node (prlctl)	rpcuser	rpcpass	44107
oyster wallet (prlctl --wallet)	rpcuser	pearl123	44207

🚨 Using wrong password is a common mistake! Node uses "rpcpass", wallet uses "pearl123"

Normal Warning Messages (Not Errors — Ignore These)

These are NORMAL — do not worry about them

Error creating a default config file: open /root/.oyster/oyster.conf: no such file or directory
Error creating a default config file: open /root/.pearld/pearld.conf: no such file or directory
Warning: Running on mainnet with --noclienttls is not recommended
Warning: Running on mainnet with --noservertls is not recommended

Block Accepted ≠ Block Confirmed

⚠️ Seeing "Block accepted by node!" in logs does NOT guarantee the block makes it to the main chain. It can be orphaned if another miner found a block at the same height faster. The explorer is the ONLY ground truth for confirmed blocks and PRL balance.

validateaddress ismine: true is NOT 100% Reliable

⚠️ We discovered that validateaddress can return ismine: true even with a slightly different address (possible OCR corruption). Always verify the address character by character manually — don't rely solely on ismine: true.

sleep 10 Between Gateway and vLLM

The startup command uses pearl-gateway start & sleep 10 && vllm serve ...

The & runs gateway in background, sleep 10 gives it time to create the socket, then vLLM starts and connects to it. If vLLM starts before the socket exists, they won't connect.

HuggingFace Token

The pearl-ai model downloaded fine without an HF token in our setup. If you get auth errors:

Set HF token if needed

export HF_TOKEN=your_token_here

vLLM Process Name vs api_server

ℹ️ Some old diagnostic scripts use pgrep -f "api_server" to detect vLLM. This returns 0 even when vLLM IS running! Always use pgrep -f "vllm serve" instead.

tmux Buffer Limitation

By default tmux only stores a limited scroll buffer. Block activity messages from hours ago may not appear in tmux capture-pane. The explorer is more reliable for historical block confirmation.

Wallet Address from Same Seed

Running getnewaddress multiple times generates different addresses — all from the same seed phrase, all recoverable. But only one address is set as the mining address at a time. The second address generated (prl1p8jt0...) is a valid backup address from the same wallet.

Loop Stalls Silently — GPU Goes to 0%

The request loop can stall without any error message. Curl jobs drop to 0, GPU goes to 0%, but vLLM stays running and appears healthy. This happens because bash accumulates too many background jobs over time.

Signs: GPU 0% on RunPod dashboard, power draw drops to ~120W, NOISY_GEMM stops firing in tmux buffer, curl job count is 0.

Fix: Kill loop, restart it. Always set up the loop watchdog (Section 08b) to handle this automatically — it checks every 60 seconds and restarts if curl jobs drop below 10.

Unbalanced DP Engines — One Worker Firing Less

Sometimes requests distribute unevenly between the two DP engines — one engine gets 35 requests, the other gets 0-7. This shows as low m values on one Worker and lower GPU utilization. Root cause: loop stalled and restarted unevenly.

Fix: Restart the loop cleanly. Kill all curl jobs first, verify 0 remaining, then restart. The engines rebalance within the next batch.

ℹ️ Check balance with:

curl -s http://localhost:8000/metrics | grep "num_requests_running" | grep -v "^#\|reason" | awk '{print $2}' | tr '\n' ' '

— both engines should show similar numbers.

Only 16 Peers — Discord Reports 200+

On RunPod (and most cloud providers), inbound connections are blocked by default. Your node can connect OUT to other peers but other nodes cannot connect IN to you. This limits you to ~8-16 outbound peers regardless of your --maxpeers setting.

The fix is exposing port 44108 before deploying your pod (see Step 0b). If already deployed, wait until next natural restart.

ℹ️ 16 peers is sufficient for mining. Block propagation works fine outbound-only. The difference between 16 and 200 peers is milliseconds of propagation time — negligible compared to the time between blocks.

⚠️ If you try to get a mining address before the node syncs, the address may be invalid. Wait at least 30-60 seconds after starting pearld and verify getblockcount returns a number before running getnewaddress.

Node Must Be Synced Before Starting vLLM

🚨 vLLM will crash immediately on startup with "mining_paused: no block template available" if the blockchain node is still syncing. Always verify blocks == headers before starting the miner. The node typically takes 5-15 minutes to sync on first launch.

Env Vars Must Be in ~/.bashrc — Not Just Exported Inline

Exporting vars inline in the tmux send-keys command is unreliable — the vars often don't reach subprocesses. Always add them to ~/.bashrc and use source ~/.bashrc in the miner startup. The gateway will fail with "mining_address: Field required" if PEARLD_MINING_ADDRESS is not in the environment.

Always Use Full Paths for vllm and pearl-gateway

Using source .venv/bin/activate inside tmux send-keys frequently fails silently, leaving vllm not in PATH and producing "vllm: command not found". Always use /root/pearl/.venv/bin/vllm and /root/pearl/.venv/bin/pearl-gateway explicitly.

Delete Stale Socket Before Every Restart

The gateway socket at /tmp/pearlgw.sock persists after the gateway dies. On restart, if the old socket file exists, the new gateway may fail or vLLM may connect to a dead socket. Always run rm -f /tmp/pearlgw.sock before restarting.

GPU 0% — Real Issue vs Sampling Artifact

⚠️ If nvidia-smi shows 0% GPU but you catch it in occasional bursts (35→0→35→0), that MAY be a sampling artifact from the batch/wait pattern. But if RunPod dashboard also shows 0% persistently AND power draw is ~120W (vs 690W when healthy), it is a REAL problem. The fix is always the loop: use sleep 1 + long prompts.

Short Prompts Kill Mining (m Value Too Low)

The should_use_noisy_gemm() function requires m ≥ 1024 (default threshold in config.yaml). Short prompts produce small batch sizes (m < 1024) and mining is skipped entirely. Always use long prefill-heavy prompts (~150+ tokens input, max_tokens=1). Target m=5000-8000+. Power draw is the quickest sanity check: 690W = mining, 120W = not mining.

GPU Mining Pearl Network

🤖 Use this guide with any AI assistant

Pearl Miner Setup Guide

Pearl Miner Setup Guide

Contents

⚡ Hardware Requirements & Pod Verification

What This Guide Is Designed For

Other Hardware — Community Reports (Not Tested by This Guide)

Step 0 — Verify Your Pod Before Starting

Step 0b — Expose Port 44108 Before Deploying (Do This First!)

00a Cloud Provider Paths & Compatibility

How to Adapt This Guide for Any Provider

Python 3.12 Installation by Provider

Pre-flight Check (run on any fresh pod)

00 Quick Reference

01 System Dependencies

Go Language

Rust Toolchain

UV Package Manager

Taskfile

tmux

Python 3.12

02 Clone & Build

Clone Repository

Build Blockchain

Build Miner (~20-25 minutes)

02b Set Environment Variables (Persistent)

03 Wallet & Node Setup

Create Wallet

Start tmux Sessions

Start Blockchain Node

Get Mining Address

Verify Address is Yours

04 Start Mining

Start Gateway + vLLM (replace YOUR_MINING_ADDRESS)

Wait for Node to Sync Before vLLM Starts

Verify vLLM Loaded

Start Request Loop

Verify Loop + Mining is Working (wait 2 minutes then run)

Verify with Metrics Endpoint

05 Health Check & Diagnostics

Master Health Check (paste this every time you reconnect)

Expected Healthy Values

06 Troubleshooting

🔴 Problem: GPU shows 0% utilization persistently (confirmed on RunPod dashboard)

🔴 Problem: vLLM crashes with "NVCC compilation failed"

🔴 Problem: vLLM crashes with "mining_paused: no block template available"

🔴 Problem: Gateway crashes with "mining_address: Field required"

🔴 Problem: "vllm: command not found" in miner tmux session

🔴 Problem: Socket count is 0 after restart

🔴 Problem: Socket count is 0

🔴 Problem: NOISY_GEMM n is 28672 (not 57344)

🔴 Problem: No blocks found after hours

🔴 Problem: Wrong mining address in gateway

🟡 Problem: vLLM keeps dying — add miner watchdog

🔄 Loop Watchdog — Recommended for All Setups

07 Block Verification

Check Logs for Block Activity

Check Explorer

08 Quick Restart Reference

Quick Status Check (paste after reconnecting)

Full Clean Restart (address already in ~/.bashrc)

Restart Loop Only

08b Additional Critical Notes

Oyster Wallet Keeps Dying — This is Normal!

Model Download (First Run Only)

Verify Mining is Actually Happening (Debug Patch)

OCR/Screenshot Address Warning

Difficulty Context

Separate vLLM tmux Session Warning

Gateway Debug Mode

Full Loop Command (for Step 8 restarts)

Node Peer Count Check

09 Key Lessons Learned

10 Important Gotchas & Edge Cases

Password Confusion — Two Different Passwords!

Normal Warning Messages (Not Errors — Ignore These)

Block Accepted ≠ Block Confirmed

validateaddress ismine: true is NOT 100% Reliable

sleep 10 Between Gateway and vLLM