Apify vs Xooriq 2026: why data engineers are migrating to shared-cache
Apify books $21M ARR scraping 50k+ websites daily. But if you're a data engineer building B2B lead pipelines, the unit economics are catastrophic. This post shows the math — compute units, cache architecture, MCP, maintenance — that justifies migrating to Xooriq.
1. Apify pricing model: the compute-unit trap
Apify bills by compute unit (CU): 1 CU = 1 GB of memory used for one hour. The Starter plan ($49/mo) includes 19 CU, and each additional CU costs $0.40. Source: Apify Pricing.
To scale a global B2B pipeline, the numbers get bad fast. An average actor consumes:
- Receita Federal CNPJ: ~1.5 CU per 1,000 records (HTML parsing + rate-limit retries)
- LinkedIn (community actor): ~3 CU per 1,000 profiles + residential proxy ($50/mo extra)
- Corporate website + decision-maker email: ~2 CU per 1,000 pages
Crawling the full database of 13.3M Brazilian CNPJs (exactly what Xooriq offers) would take ~26,000 CU. Without bulk discounts, that is $10,400 (26,000 × $0.40) in compute alone. Add $50 for proxies, $100 for storage, and 40 hours/month of engineer time fixing broken actors.
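The compute math above can be reproduced in a few lines. This is a minimal sketch, not Apify's billing logic: the function names are hypothetical, plan allowances and bulk discounts are ignored, and the post does not say which per-1,000 rate produced the ~26,000 CU figure (the ~2 CU rate gives 26,600, which is the closest match).

```python
CU_PRICE_USD = 0.40          # price per additional CU, as quoted above
TOTAL_RECORDS = 13_300_000   # full CNPJ database size, as quoted above


def cu_needed(records: int, cu_per_1000: float) -> float:
    """Compute units consumed to process `records` at a CU-per-1,000 rate."""
    return records / 1000 * cu_per_1000


def compute_cost_usd(cu: float, price_per_cu: float = CU_PRICE_USD) -> float:
    """Compute bill in USD, ignoring plan allowances and bulk discounts."""
    return round(cu * price_per_cu, 2)


if __name__ == "__main__":
    # Rates taken from the per-source list above.
    for rate in (1.5, 2.0, 3.0):
        cu = cu_needed(TOTAL_RECORDS, rate)
        print(f"{rate} CU/1k -> {cu:,.0f} CU -> ${compute_cost_usd(cu):,.2f}")
    # The article's own rounded figure.
    print(f"26,000 CU -> ${compute_cost_usd(26_000):,.2f}")
```

Swapping in your own measured CU-per-1,000 rate is the quickest way to sanity-check a quote before committing to a full crawl.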
2. Why community actors break so often
Apify has ~50,000 actors in the Apify Store. ~80% are community-built (not official). In tests we ran in April 2026:
Audit of 12 CNPJ actors (Apr 2026)
- 7 broken — Brazilian Receita Federal changed HTML in March 2026, actors not updated
- 3 partial — return only company name + CNPJ, missing capital or partners
- 2 working — but with 30%+ error rate on MEI (micro-entrepreneur) CNPJs
Apify doesn't guarantee maintenance. You buy the actor, and when it breaks, you open a GitHub issue with the author (who may never respond). For serious businesses, this is purchased technical debt.
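An audit like the one above can be automated with a small completeness check on each actor's sample output. The sketch below is illustrative only: the field names (`company_name`, `cnpj`, `capital`, `partners`) and the three status labels are assumptions modeled on the audit categories, not an Apify schema or the exact method used in the April 2026 tests.

```python
from typing import List, Mapping, Tuple

# Hypothetical field names; adjust to whatever the actor actually returns.
REQUIRED_FIELDS = ("company_name", "cnpj", "capital", "partners")


def is_complete(record: Mapping) -> bool:
    """True when every required field is present and non-empty."""
    return all(record.get(field) not in (None, "", []) for field in REQUIRED_FIELDS)


def classify_actor(records: List[Mapping]) -> Tuple[str, float]:
    """Return (status, incomplete_rate) for one actor's sample output.

    broken  -> the actor returned nothing
    partial -> no record carries all required fields
    working -> at least one complete record; the rate shows how many are not
    """
    if not records:
        return ("broken", 1.0)
    complete = sum(1 for record in records if is_complete(record))
    incomplete_rate = 1 - complete / len(records)
    if complete == 0:
        return ("partial", incomplete_rate)
    return ("working", incomplete_rate)
```

Running this against a fixed sample of CNPJs on a schedule would surface a silent breakage (for example after a source-site HTML change) before it reaches a downstream pipeline.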
3. Xooriq FoxScraper: shared-cache as unfair advantage
Xooriq uses FoxScraper — Central Fox Tecnologia's proprietary engine with shared PostgreSQL cache across customers. The fundamental design:
# Apify: each customer pays for the same crawl
Customer A runs actor → burns CU → gets data
Customer B runs actor → burns CU → gets data (SAME data)
Customer C runs actor → burns CU → gets data (SAME data)
Apify total cost = 3× compute
# Xooriq: 1 crawl, N queries
FoxScraper runs 1× → PostgreSQL cache (Fernet AES-128)
Customer A query → cache hit → $0.0006 per query
Customer B query → cache hit → $0.0006 per query
Customer C query → cache hit → $0.0006 per query
Marginal cost = O(1) per query
That's what lets Xooriq charge $89/month flat with an 87% operating margin, while a typical Apify reseller (someone offering Brazilian B2B data using Apify actors) operates at ~40% margin and must charge $400+/month.
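The crawl-once, query-many pattern in the diagram can be sketched in a few lines. This is not FoxScraper's implementation: an in-memory dict stands in for the PostgreSQL cache, encryption is omitted, and `SharedCache` and `fake_crawl` are hypothetical names. It also fills the cache lazily, so the first customer's query triggers the crawl, whereas the diagram shows the cache populated before any customer asks.

```python
from typing import Callable, Dict


class SharedCache:
    """Crawl each key once, then serve every customer from the cached result."""

    def __init__(self, crawl: Callable[[str], dict]) -> None:
        self._crawl = crawl
        self._store: Dict[str, dict] = {}  # stand-in for the PostgreSQL cache
        self.crawls = 0  # how many times the expensive crawl ran
        self.hits = 0    # how many queries were served from the cache

    def get(self, key: str) -> dict:
        if key in self._store:
            self.hits += 1
        else:
            self._store[key] = self._crawl(key)
            self.crawls += 1
        return self._store[key]


def fake_crawl(cnpj: str) -> dict:
    """Hypothetical crawler; a real one would fetch and parse the source site."""
    return {"cnpj": cnpj, "company_name": f"Company {cnpj}"}


cache = SharedCache(fake_crawl)
for customer in ("A", "B", "C"):
    cache.get("11222333000181")  # all three customers ask for the same record

print(cache.crawls, cache.hits)  # 1 crawl, 2 cache hits
```

With per-customer actors the same three requests would run the crawl three times; here the crawl count stays at one no matter how many customers query the same record.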
4. Side-by-side comparison
| Dimension | Apify | Xooriq |
|---|---|---|
| Billing model | Compute units ($/CU) | Flat $89/month |
| Cost at 10k leads/month | ~$1,799 (with dev-time) | $89 (no dev) |
| Pre-curated company database | ❌ You crawl | ✅ 13.3M companies |
| Native MCP server | ❌ Third-party wrapper | ✅ mcp.xooriq.com |
| Maintenance when site changes | ❌ You fix it | ✅ Central Fox Tecnologia (24h SLA) |
| LGPD/GDPR compliance | ❌ Customer is controller | ✅ Xooriq is processor + DPA |
| Fiscal formats | ⚠️ Fragmented actors | ✅ CNPJ/GSTIN/EIN/VAT |
| Shared cache | ❌ Each customer pays alone | ✅ $0.0006/query |
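
The cost row of the table is easier to compare per lead. A minimal sketch using only the two monthly figures from the table (the helper name is hypothetical, and the Apify figure includes dev-time as stated there):

```python
def cost_per_lead(monthly_cost_usd: float, leads_per_month: int) -> float:
    """Average cost of one lead at a given monthly spend and volume."""
    return monthly_cost_usd / leads_per_month


LEADS_PER_MONTH = 10_000

apify_per_lead = cost_per_lead(1_799, LEADS_PER_MONTH)  # table figure, with dev-time
xooriq_per_lead = cost_per_lead(89, LEADS_PER_MONTH)    # flat plan from the table

print(f"Apify:  ${apify_per_lead:.4f} per lead")
print(f"Xooriq: ${xooriq_per_lead:.4f} per lead")
print(f"Ratio:  {apify_per_lead / xooriq_per_lead:.1f}x")
```

Because the Xooriq price is flat, its per-lead cost keeps falling as volume grows, while a compute-metered bill scales with every extra record.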
5. When Apify still makes sense
We're not absolutists. Apify is the better choice when:
- You need to scrape a single exotic site Xooriq doesn't cover
- You're a growth agency with dedicated data engineers maintaining actors
- You run low volume (< 1,000 leads/month), where the Starter plan's included CU are enough
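To see where the low-volume threshold sits, the Starter plan's included CU can be turned into a lead budget. This is a rough sketch using the per-source rates from section 1; it assumes each lead touches every listed source exactly once, which is a simplification, and the function names are hypothetical.

```python
STARTER_INCLUDED_CU = 19  # included in the Starter plan, as quoted in section 1

# CU per 1,000 records for each source, as quoted in section 1.
CU_PER_1000 = {"cnpj": 1.5, "linkedin": 3.0, "website": 2.0}


def cu_used(leads: int, sources: list) -> float:
    """CU consumed when every lead hits each of the given sources once."""
    return leads / 1000 * sum(CU_PER_1000[source] for source in sources)


def leads_covered(included_cu: float, sources: list) -> int:
    """How many leads the included CU cover before overage billing starts."""
    per_1000 = sum(CU_PER_1000[source] for source in sources)
    return int(included_cu / per_1000 * 1000)


all_sources = ["cnpj", "linkedin", "website"]
print(cu_used(1_000, all_sources))                       # CU for 1,000 fully enriched leads
print(leads_covered(STARTER_INCLUDED_CU, all_sources))   # leads covered by the plan
print(leads_covered(STARTER_INCLUDED_CU, ["cnpj"]))      # CNPJ-only pipeline
```

Under these assumptions, 1,000 fully enriched leads use 6.5 of the 19 included CU, so a pipeline under that volume stays inside the plan with room for retries.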
6. Verified sources
- GetLatka — Apify $21M ARR
- Apify Pricing official
- Apify Compute Units docs
- Bright Data vs Apify analysis
Stop paying compute. Xooriq is $89/month flat.
13.3M companies · Native MCP · Shared cache · LGPD-by-design · No dev-time.