camofox-firecrawl-shim
A small Firecrawl-compatible shim that lets Hermes route web_extract, web_search, and web_crawl through a local Camofox server without modifying Hermes source.
This repository contains only the shim code, compatibility tests, helper scripts, and setup instructions. It does not include a Camofox installation, browser binaries, vendored dependencies, or host-specific credentials.
What this solves
Hermes browser automation can already target Camofox through CAMOFOX_URL, but Hermes web tools still speak to a Firecrawl backend. This shim bridges that gap by exposing the subset of the Firecrawl v2 API that Hermes actually uses and fulfilling it with a local Camofox instance.
In practice this gives you a setup like:
- Hermes browser tools -> Camofox
- Hermes web tools -> this shim -> Camofox
No Hermes source patching required.
Implemented endpoints
The shim currently implements:
- GET /health
- GET /v2/health
- POST /v2/scrape
- POST /v2/search
- POST /v2/map
- POST /v2/crawl
- GET /v2/crawl/:id
- DELETE /v2/crawl/:id
That is enough for the validated Hermes paths:
- web_extract
- web_search
- web_map
- web_crawl
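As a rough illustration of what a client call against one of these endpoints might look like, the sketch below builds a POST /v2/scrape request with the standard library. The payload shape (a `url` plus a `formats` list) is an assumption based on the public Firecrawl v2 API, not something taken from shim.py, and the helper names `build_scrape_request` and `scrape` are hypothetical.

```python
import json
import urllib.request

# Default shim address documented in this README.
SHIM_URL = "http://127.0.0.1:33879"


def build_scrape_request(url: str, base: str = SHIM_URL) -> urllib.request.Request:
    """Build (but do not send) a POST /v2/scrape request."""
    # Assumed payload shape: a target URL and the output formats wanted.
    payload = {"url": url, "formats": ["markdown"]}
    return urllib.request.Request(
        f"{base}/v2/scrape",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def scrape(url: str, base: str = SHIM_URL, timeout: float = 60.0) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(build_scrape_request(url, base), timeout=timeout) as resp:
        return json.loads(resp.read())
```

With the shim running, `scrape("https://example.com")` should return the decoded JSON response. Hermes itself goes through the Firecrawl SDK rather than raw HTTP, so this is only a debugging aid.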
Repository layout
- shim.py: Firecrawl-compatible shim server
- ensure_shim.py: local launcher/health-check wrapper
- check_updates.py: lightweight version drift report for the shim stack
- start_shim.sh: canonical manual startup wrapper
- tests/test_shim_compat.py: SDK-level compatibility tests against a fake Camofox server and local fixture site
- cron/ensure_camofox_firecrawl_shim.py: copy of the launcher script suitable for Hermes cron script execution
Requirements
You need:
- Python 3.11+
- a working local Camofox server
- Hermes configured to use a Firecrawl backend
- pandoc available for HTML -> Markdown conversion
- optional: Hermes cron support for self-healing restart behavior
For tests, you also need:
- pytest
- firecrawl-py
Install Camofox separately
This repository does not install Camofox for you.
A typical source checkout flow looks like:
git clone https://github.com/jo-inc/camofox-browser /opt/hermes-runtime/camofox-browser
cd /opt/hermes-runtime/camofox-browser
npm install
If you are in a rootless Linux runtime, you may also need to stage shared libraries and expose them through LD_LIBRARY_PATH before starting the Camofox server.
Example runtime defaults used by this shim:
- Camofox server dir: /opt/hermes-runtime/camofox-browser
- Camofox URL: http://127.0.0.1:9377
- shim URL: http://127.0.0.1:33879
- rootless browser libs: /opt/hermes-runtime/camofox-deps/root/usr/lib/x86_64-linux-gnu
Treat those as examples, not hard requirements. All important paths are configurable by environment variables.
Manual shim startup
Prefer the wrapper script, which starts the shim on the canonical default endpoint:
./start_shim.sh
Equivalent direct launch:
CAMOFOX_FIRECRAWL_SHIM_HOST=127.0.0.1 \
CAMOFOX_FIRECRAWL_SHIM_PORT=33879 \
python3 shim.py
Default listen address:
127.0.0.1:33879
Shim configuration
Core settings:
- CAMOFOX_URL: upstream Camofox URL, default http://127.0.0.1:9377
- CAMOFOX_FIRECRAWL_SHIM_HOST: bind host, default 127.0.0.1
- CAMOFOX_FIRECRAWL_SHIM_PORT: bind port, default 33879
- CAMOFOX_FIRECRAWL_SHIM_TIMEOUT: upstream request timeout, default 60
- CAMOFOX_FIRECRAWL_SHIM_WAIT_TIMEOUT_MS: scrape/search wait timeout, default 12000
- CAMOFOX_FIRECRAWL_CRAWL_WAIT_TIMEOUT_MS: crawl page wait timeout, default 4000
- CAMOFOX_FIRECRAWL_CRAWL_TIME_BUDGET_SECONDS: overall crawl time budget, default 75
- CAMOFOX_FIRECRAWL_SEARCH_URL: search page template, default DuckDuckGo HTML search
- CAMOFOX_FIRECRAWL_PANDOC_BIN: pandoc launcher, default /opt/hermes-runtime/tools/mise/use-mise.sh
- CAMOFOX_FIRECRAWL_PANDOC_TIMEOUT: pandoc timeout in seconds, default 30
Lazy-start settings for Camofox:
- CAMOFOX_SERVER_DIR
- CAMOFOX_SERVER_COMMAND
- CAMOFOX_SERVER_START_TIMEOUT
- CAMOFOX_SERVER_LOG
- CAMOFOX_SERVER_ERROR_LOG
- CAMOFOX_SERVER_LD_LIBRARY_PATH
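For orientation, here is one way a few of the core settings could be read from the environment with the documented defaults. This is a minimal sketch, not the code in shim.py: the `ShimConfig` dataclass and `load_config` helper are hypothetical names, and only five of the variables are covered.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class ShimConfig:
    """Hypothetical container for a subset of the core settings."""
    camofox_url: str
    host: str
    port: int
    timeout: float
    wait_timeout_ms: int


def load_config(env: Mapping[str, str] = os.environ) -> ShimConfig:
    """Read settings from env, falling back to the defaults listed in this README."""
    return ShimConfig(
        # Strip a trailing slash so paths can be appended consistently.
        camofox_url=env.get("CAMOFOX_URL", "http://127.0.0.1:9377").rstrip("/"),
        host=env.get("CAMOFOX_FIRECRAWL_SHIM_HOST", "127.0.0.1"),
        port=int(env.get("CAMOFOX_FIRECRAWL_SHIM_PORT", "33879")),
        timeout=float(env.get("CAMOFOX_FIRECRAWL_SHIM_TIMEOUT", "60")),
        wait_timeout_ms=int(env.get("CAMOFOX_FIRECRAWL_SHIM_WAIT_TIMEOUT_MS", "12000")),
    )
```

Passing a plain dict instead of `os.environ` makes the defaults easy to check in a test, for example `load_config({"CAMOFOX_FIRECRAWL_SHIM_PORT": "4000"}).port`.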
Point Hermes at the shim
Configure Hermes so that its Firecrawl client talks to the shim instead of a separate Firecrawl instance.
Typical environment values:
CAMOFOX_URL=http://127.0.0.1:9377
FIRECRAWL_API_URL=http://127.0.0.1:33879
If you are changing Hermes config programmatically, use Hermes' supported config writer rather than directly editing protected environment files.
Self-healing restart behavior
This setup uses two layers:
- the shim lazy-starts Camofox if the upstream browser server is missing
- a small launcher script can restart the shim itself if the shim is down
That is what ensure_shim.py is for.
Run it locally:
python3 ensure_shim.py
Behavior:
- if the shim is already healthy, it exits cleanly
- if the shim is missing, it starts python3 shim.py
- it waits until /health reports success
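The behavior above can be sketched roughly as follows. This is an illustration under stated assumptions rather than the contents of ensure_shim.py: the function names, the retry count, and the one-second delay are all hypothetical, and the probe and start steps are injectable so the loop can be exercised without a real shim.

```python
import subprocess
import sys
import time
import urllib.request
from typing import Callable

# Default shim health endpoint documented in this README.
HEALTH_URL = "http://127.0.0.1:33879/health"


def is_healthy(url: str = HEALTH_URL, timeout: float = 2.0) -> bool:
    """Return True if the health endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Covers connection refused, timeouts, and urllib's URLError/HTTPError.
        return False


def start_shim() -> None:
    """Launch shim.py detached from this process (assumed launch command)."""
    subprocess.Popen(
        [sys.executable, "shim.py"],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        start_new_session=True,
    )


def ensure(
    probe: Callable[[], bool] = is_healthy,
    start: Callable[[], None] = start_shim,
    attempts: int = 30,
    delay: float = 1.0,
) -> bool:
    """Start the shim only if it is down, then wait for it to become healthy."""
    if probe():
        return True  # already healthy: nothing to do
    start()
    for _ in range(attempts):
        if probe():
            return True
        time.sleep(delay)
    return False  # never became healthy within the allotted attempts
```

A cron-style caller could map the return value to an exit code, for example `sys.exit(0 if ensure() else 1)`.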
Hermes cron setup
If you want the shim to recover automatically after environment restarts, place the launcher script in your Hermes scripts directory and schedule it with Hermes cron.
This repository includes a cron-suitable copy at:
cron/ensure_camofox_firecrawl_shim.py
A practical Hermes cron job is:
- name: ensure camofox firecrawl shim
- schedule: every 1m
- script: ensure_camofox_firecrawl_shim.py
The logic is simple:
- shim up -> do nothing
- shim down -> start it
- shim up but Camofox down -> the shim can lazy-start Camofox on demand
Update checks
check_updates.py reports:
- the current Camofox checkout tag and commit
- the remote default-branch head commit
- the latest remote tag
- npm outdated --json from the Camofox checkout
- installed Firecrawl Python SDK version vs latest PyPI version
Run it with:
python3 check_updates.py
It does not auto-upgrade anything. It is just an inspection/reporting tool.
Tests
Run the compatibility suite in an environment that has pytest and firecrawl-py installed:
pytest -q tests/test_shim_compat.py
What the tests verify:
- Firecrawl SDK compatibility for scrape(), search(), map(), and crawl()
- redirect unwrapping and ad filtering for search results
- crawl traversal semantics
- shim behavior against a deterministic fake Camofox server and local fixture pages
Design notes
Markdown conversion
The shim uses pandoc for HTML -> Markdown conversion and falls back to plain text extraction if pandoc fails or times out.
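A minimal sketch of that convert-then-fall-back pattern is shown below. It assumes a plain `pandoc` executable on PATH (the shim's actual default launcher is a wrapper script, which may take different arguments), and the helper names and the very simple text extractor are hypothetical, not taken from shim.py.

```python
import subprocess
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> contents."""
    SKIP = {"script", "style"}

    def __init__(self) -> None:
        super().__init__()
        self.parts: list[str] = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())


def html_to_text(html: str) -> str:
    """Plain-text fallback: one line per text fragment."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.parts)


def html_to_markdown(html: str, pandoc_bin: str = "pandoc", timeout: float = 30) -> str:
    """Convert with pandoc; fall back to plain text on any failure or timeout."""
    try:
        result = subprocess.run(
            [pandoc_bin, "--from=html", "--to=markdown"],
            input=html,
            capture_output=True,
            text=True,
            timeout=timeout,
            check=True,
        )
        return result.stdout
    except (OSError, subprocess.SubprocessError):
        # OSError: binary missing; SubprocessError: non-zero exit or timeout.
        return html_to_text(html)
```

Catching the two base exception classes keeps the fallback path identical for a missing binary, a non-zero exit, and a timeout.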
Search backend
The default search implementation uses a browser-rendered DuckDuckGo HTML search page through Camofox and normalizes results into a Firecrawl-like response.
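Part of that normalization is unwrapping redirect links. The sketch below assumes that DuckDuckGo HTML results wrap outbound links as `duckduckgo.com/l/?uddg=<percent-encoded target>`; the function name is hypothetical, and the shim's own parsing and ad filtering may differ.

```python
from urllib.parse import parse_qs, urlsplit


def unwrap_ddg_redirect(href: str) -> str:
    """Return the real target of a DuckDuckGo redirect link, or href unchanged."""
    # Result links are often protocol-relative ("//duckduckgo.com/l/?...").
    if href.startswith("//"):
        href = "https:" + href
    parts = urlsplit(href)
    host = parts.hostname or ""
    is_ddg = host == "duckduckgo.com" or host.endswith(".duckduckgo.com")
    if is_ddg and parts.path.rstrip("/") == "/l":
        # parse_qs percent-decodes values, so the target comes back as a plain URL.
        target = parse_qs(parts.query).get("uddg")
        if target:
            return target[0]
    return href
```

Links that are not redirects pass through untouched, apart from gaining an explicit `https:` scheme when they were protocol-relative.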
Crawl model
Crawls are asynchronous jobs:
- POST /v2/crawl returns a job id quickly
- a worker thread performs traversal and scraping
- GET /v2/crawl/:id returns status and results
- DELETE /v2/crawl/:id removes job state
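The job lifecycle above might be modeled along these lines. This is a simplified sketch, not the shim's implementation: `CrawlJobs` is a hypothetical class, the status strings `scraping` and `completed` are assumed, and the worker walks a fixed URL list instead of discovering links.

```python
import threading
import uuid
from typing import Callable, Iterable, Optional


class CrawlJobs:
    """In-memory job table guarded by a lock; one worker thread per crawl."""

    def __init__(self) -> None:
        self._jobs: dict[str, dict] = {}
        self._lock = threading.Lock()

    def start(self, urls: Iterable[str], scrape: Callable[[str], dict]) -> str:
        """Register a job and return its id immediately (POST /v2/crawl)."""
        job_id = uuid.uuid4().hex
        with self._lock:
            self._jobs[job_id] = {"status": "scraping", "data": []}
        threading.Thread(
            target=self._run, args=(job_id, list(urls), scrape), daemon=True
        ).start()
        return job_id

    def _run(self, job_id: str, urls: list[str], scrape: Callable[[str], dict]) -> None:
        for url in urls:
            page = scrape(url)  # slow work happens outside the lock
            with self._lock:
                job = self._jobs.get(job_id)
                if job is None:
                    return  # job was deleted mid-crawl: stop quietly
                job["data"].append(page)
        with self._lock:
            if job_id in self._jobs:
                self._jobs[job_id]["status"] = "completed"

    def get(self, job_id: str) -> Optional[dict]:
        """Return a snapshot of status and results (GET /v2/crawl/:id)."""
        with self._lock:
            job = self._jobs.get(job_id)
            if job is None:
                return None
            return {"status": job["status"], "data": list(job["data"])}

    def delete(self, job_id: str) -> bool:
        """Drop job state (DELETE /v2/crawl/:id); True if the job existed."""
        with self._lock:
            return self._jobs.pop(job_id, None) is not None
```

Returning a copy from `get` means a caller polling for status never sees the results list change underneath it while the worker is still appending.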
Known limitations
- Hermes still blocks private/internal URLs before the request reaches the shim
- Google SERP access is still affected by your egress IP / proxy quality
- This shim targets the Firecrawl API surface Hermes uses today, not full Firecrawl parity
- If Hermes or the Firecrawl SDK changes its required API shape, the shim may need updates
Publishing and safety notes
This repository is intended to be publishable without credentials.
It should contain:
- shim source code
- compatibility tests
- launcher and maintenance scripts
- setup documentation
It should not contain:
- browser binaries
- staged system libraries
- local logs
- cache directories
- tokens, passwords, or environment dumps
Suggested bootstrap sequence
- Install and verify Camofox separately
- Start the Camofox server and confirm http://127.0.0.1:9377/health
- Start this shim and confirm http://127.0.0.1:33879/health
- Point Hermes FIRECRAWL_API_URL at the shim
- Run at least one Hermes-side extract/search validation
- Install the cron launcher if you want restart resilience
License
MIT