fix: bulletproof competitor scraper — 4-tier fallback chain

Tier 1-3: HTTP with Chrome/Firefox/Safari UAs + full browser headers
Tier 4: Gemini + Google Search grounding (bypasses everything)

- Dead URLs (404): skips straight to Gemini, finds product via Google
- Cloudflare/CAPTCHA: detected and routed to Gemini
- JS-rendered pages: Gemini reads them via Google's infrastructure
- Updated default competitor URL to Vitabiotics (works direct)

Tested against:
- H&B dead URL (404) → Gemini found full product data
- Boots (Cloudflare) → Gemini returned £4.00, 4.6★, 8 bullets
- Vitabiotics → direct Chrome scrape, 9 bullets
- Amazon (CAPTCHA) → Gemini grounding fallback
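
The fallback order described in this message can be sketched roughly as follows. This is a minimal illustration, not the code in this commit: `FetchResult`, `looks_blocked`, `scrape`, the `BLOCK_MARKERS` strings, and the two injected callables (`http_fetch`, `grounded_lookup`) are all hypothetical names, and the blocked-page heuristic is an assumption. Detecting JS-rendered pages is left out because the message does not say how it is done.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Tiers 1-3 try one browser profile each, in this order (per the commit message).
BROWSERS = ("chrome", "firefox", "safari")

# Assumed markers of a challenge page; the real detection logic is not shown in the diff.
BLOCK_MARKERS = ("just a moment", "captcha", "cf-chl")


@dataclass
class FetchResult:
    status: int
    body: str


def looks_blocked(result: FetchResult) -> bool:
    """Heuristic: treat typical challenge statuses or marker text as a block."""
    if result.status in (403, 429, 503):
        return True
    lowered = result.body.lower()
    return any(marker in lowered for marker in BLOCK_MARKERS)


def scrape(
    url: str,
    http_fetch: Callable[[str, str], FetchResult],
    grounded_lookup: Callable[[str], str],
) -> Dict[str, object]:
    """Try HTTP tiers 1-3, then fall back to the grounded lookup (tier 4)."""
    for tier, browser in enumerate(BROWSERS, start=1):
        result = http_fetch(url, browser)
        if result.status == 404:
            # Dead URL: another User-Agent will not help, so skip to tier 4.
            break
        if result.status == 200 and not looks_blocked(result):
            return {"tier": tier, "source": browser, "body": result.body}
    # Tier 4: hypothetical callable standing in for the Gemini + Search call.
    return {"tier": 4, "source": "gemini", "body": grounded_lookup(url)}
```

Passing the fetchers in as callables keeps the tier logic testable without network access; a real `http_fetch` would send the matching User-Agent and browser headers for each profile.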
@@ -80,7 +80,7 @@
 <div class="input-row">
 <div class="input-group">
 <label>COMPETITOR PRODUCT URL</label>
-<input type="url" id="demoB-url" placeholder="https://www.competitor.com/product..." value="https://www.hollandandbarrett.com/shop/product/holland-barrett-vitamin-d3-tablets-25ug-1000-i-u--60001496">
+<input type="url" id="demoB-url" placeholder="https://www.competitor.com/product..." value="https://www.vitabiotics.com/products/ultra-vitamin-d-1000iu">
 </div>
 <button class="btn-gen blue" id="demoB-btn" onclick="runDemoB()">🔍 X-Ray This Competitor</button>
 </div>