A practical guide to running unbroker locally or on an isolated server, with consent, local dossiers, browser/email modes, DROP, human fallback, and re-scans.
How to Self-Host unbroker for Data Broker Removal
unbroker is a Hermes Agent skill for finding exposed personal information on broker and people-search sites, filing removals where possible, recording outcomes, and checking later. The useful self-host pattern is simple: keep the dossier local, record consent, disclose only required fields, and treat every removal as unconfirmed until a re-scan verifies it.
Primary sources
- unbroker README on GitHub
- unbroker SKILL.md
- California DROP official page
- California Data Broker Registry
- BADBOOL broker opt-out list
What you are hosting
The upstream skill splits the work in two. A Python CLI owns deterministic state: setup, dossiers, broker data, planning, ledger, drafts, reports, verification polling, and the next action queue. The agent handles scanning, browser steps, email work, and scheduled re-checks.
That split is the reason it is interesting. You are not asking a model to invent a privacy cleanup process. You are asking it to follow a queue and write down what happened.
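The "follow a queue and write down what happened" idea can be sketched in a few lines. This is an illustration only, not the upstream implementation: the `Task` shape, the `run_queue` function, and the JSON-lines ledger format are all assumptions made for the example.

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
from typing import Callable, Iterable, List


@dataclass
class Task:
    # Hypothetical task shape; the real queue format is defined upstream.
    task_id: str
    broker: str
    kind: str  # e.g. "scan", "opt_out", "verify"


def run_queue(tasks: Iterable[Task],
              execute: Callable[[Task], str],
              ledger_path: str) -> List[str]:
    """Work through tasks in order and append one ledger line per outcome."""
    outcomes = []
    with open(ledger_path, "a", encoding="utf-8") as ledger:
        for task in tasks:
            try:
                outcome = execute(task)
            except Exception as exc:
                # A failed step is still recorded, flagged for a human.
                outcome = f"needs_human: {exc}"
            entry = {
                **asdict(task),
                "outcome": outcome,
                "recorded_at": datetime.now(timezone.utc).isoformat(),
            }
            ledger.write(json.dumps(entry) + "\n")
            outcomes.append(outcome)
    return outcomes
```

The point of the shape is that the executor (the agent) never decides what happens next or what counts as done. It only returns an outcome, and every outcome, including a failure, lands in the ledger.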
Local Mac or VPS
| Host | Good for | Watch |
|---|---|---|
| Local Mac Studio | Personal runs, browser/email sessions, local dossiers, lower hosting surface | Machine must stay awake for long runs and scheduled re-checks |
| Temporary VPS | Clean isolated run, separate IP, no local machine dependency | Encrypt storage, avoid backups, restrict SSH, delete workspace after report |
| Shared hosted app | Convenience | Highest PII custody burden. Do not start here. |
Basic run shape
Install the skill through Hermes, then start a fresh session and drive the workflow from the skill directory.
hermes skills install official/security/unbroker
In the unbroker skill directory, the upstream docs use the Python CLI as the driver:
PDD="python3 scripts/pdd.py"
The exact commands can change upstream. Check the official README and SKILL.md before a real run. The stable operating pattern is:
- Run setup and doctor.
- Create a subject only after consent.
- Refresh broker data.
- Generate the next action queue.
- Execute scan, opt-out, email, browser, or verification tasks.
- Record each result in the ledger.
- Repeat until there is nothing useful to do now.
- Present the human-task digest and schedule re-checks.
$PDD setup --auto
$PDD doctor
$PDD intake --full-name "Example Person" --email "example@example.com" --city "City" --state "CA" --consent
$PDD refresh-brokers
$PDD next <subject_id>
$PDD tasks <subject_id>
$PDD status <subject_id>
Consent and intake
The intake data is sensitive by design. It can include names, aliases, emails, phones, current location, and prior locations. Keep it narrow. If a broker cannot search by a field, do not collect it just because it feels complete.
- Record consent before any scan.
- Use a separate workspace per subject.
- Use opaque subject IDs, not names in folder paths.
- Prefer local storage and at-rest encryption where available.
- Do not store government ID unless a specific human-only task requires it.
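The intake rules above can be made mechanical. The sketch below is an assumption-heavy illustration, not the skill's own data model: `Subject`, `create_subject`, and `workspace_for` are hypothetical names, and the field set is deliberately narrow.

```python
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone
from pathlib import Path


@dataclass
class Subject:
    # Only the fields a broker search actually needs; no government ID.
    full_name: str
    email: str
    state: str
    consent_recorded_at: str
    # Opaque identifier so names never appear in folder paths.
    subject_id: str = field(default_factory=lambda: uuid.uuid4().hex)


def create_subject(full_name: str, email: str, state: str,
                   consent_given: bool) -> Subject:
    """Refuse to create a subject unless consent has been recorded."""
    if not consent_given:
        raise PermissionError("consent must be recorded before intake")
    return Subject(
        full_name=full_name,
        email=email,
        state=state,
        consent_recorded_at=datetime.now(timezone.utc).isoformat(),
    )


def workspace_for(root: Path, subject: Subject) -> Path:
    """One workspace per subject, keyed by the opaque ID only."""
    return root / subject.subject_id
```

Putting the consent check in the constructor path means no later step can run against a subject that was never authorized.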
Email and browser mode
Removal flows often need email verification. The upstream skill supports browser-mode email and programmatic SMTP/IMAP. Browser mode avoids storing a mail password, but it requires a logged-in browser session. Programmatic email is more automated, but the credential surface is larger.
| Mode | Use when | Risk |
|---|---|---|
| Browser webmail | You can drive a logged-in browser and do not want stored mail credentials | Needs local browser control and careful session separation |
| SMTP and IMAP | You want hands-off sending and verification polling | Stores or exposes mail credentials to the run environment |
| Draft only | You want no mail automation | More manual work and slower verification |
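One way to read the table is as a small decision rule. The function below is a sketch of that reading, not part of the upstream skill: the name `choose_email_mode` and the returned mode strings are assumptions.

```python
def choose_email_mode(can_drive_browser: bool,
                      allow_stored_credentials: bool) -> str:
    """Pick the least credential-heavy mode that is still available."""
    if can_drive_browser:
        # A logged-in browser session avoids storing a mail password.
        return "browser_webmail"
    if allow_stored_credentials:
        # More automated, but credentials are exposed to the run environment.
        return "smtp_imap"
    # No automation: the operator sends and verifies mail by hand.
    return "draft_only"
```

The ordering encodes a preference for the smaller credential surface. An operator who values hands-off polling over that could reasonably swap the first two branches.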
DROP first for California
California DROP (the Delete Request and Opt-out Platform) changes the route for California residents. The state tool lets a resident submit a single deletion request that applies to all registered data brokers. Starting August 1, 2026, data brokers must retrieve and process DROP deletion requests at least once every 45 days.
That means a good California run should not only crawl people-search sites. It should file or track DROP, then work people-search parent clusters, standalone opt-outs, email rights requests, blocked tasks, and re-scans.
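The ordering described above can be expressed as a simple sort. This is a minimal sketch under stated assumptions: the route names, `ROUTE_ORDER`, and `plan_routes` are hypothetical and do not come from the upstream planner.

```python
from typing import List, Tuple

# Hypothetical route labels, in the order a California run would work them.
ROUTE_ORDER = [
    "drop",
    "parent_cluster",
    "standalone_opt_out",
    "email_rights_request",
    "blocked",
    "rescan",
]


def plan_routes(state: str,
                pending: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Order (broker, route) pairs, adding a DROP step for California."""
    routes = list(pending)
    is_california = state.strip().upper() == "CA"
    if is_california and not any(route == "drop" for _, route in routes):
        routes.append(("california_drop", "drop"))
    rank = {route: i for i, route in enumerate(ROUTE_ORDER)}
    # Unknown routes sort last; sorted() is stable, so ties keep input order.
    return sorted(routes, key=lambda item: rank.get(item[1], len(ROUTE_ORDER)))
```

For a non-California subject the function only sorts, so the same planner can serve both cases.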
What to measure
The value of a run is not only "submitted forms." The useful data is the removal graph.
| Output | Why it matters |
|---|---|
| Broker exposure found | Shows which sites actually expose the person. |
| Removal route used | Builds a reusable playbook for future runs. |
| Fields disclosed | Measures the privacy cost of each opt-out. |
| Human fallback reason | Shows what cannot be safely automated. |
| Verification state | Stops cosmetic removal claims from becoming false reports. |
| Next re-check date | Turns one-time cleanup into monitoring. |
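The outputs in the table map naturally onto one record per broker. The sketch below assumes a hypothetical `RemovalRecord` shape and a `disclosure_cost` helper. Neither is the upstream ledger schema.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Iterable, List, Optional


@dataclass
class RemovalRecord:
    broker: str
    exposure_found: bool
    route: Optional[str] = None                  # e.g. "web_form", "email"
    fields_disclosed: List[str] = field(default_factory=list)
    human_fallback_reason: Optional[str] = None
    verification_state: str = "unconfirmed"      # until a re-scan says otherwise
    next_recheck: Optional[str] = None           # ISO date string


def disclosure_cost(records: Iterable[RemovalRecord]) -> Counter:
    """Count how many brokers each personal field was disclosed to."""
    counts: Counter = Counter()
    for record in records:
        # set() so a field listed twice for one broker counts once.
        counts.update(set(record.fields_disclosed))
    return counts
```

A tally like this makes the privacy cost of a run visible: if opting out of ten sites meant handing an email address to all ten, that belongs in the report next to the removals.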
What not to do
- Do not run scans on a person who did not authorize it.
- Do not treat "hidden from free search" as "deleted."
- Do not defeat hard anti-bot gates.
- Do not keep dossiers longer than the run or monitoring plan requires.
- Do not report public-record removal unless the source record actually changed.
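Two of these rules, not treating "hidden from free search" as "deleted" and not reporting a removal before it is verified, can be enforced in code. The sketch below is illustrative only: the `Verification` states and the two boolean re-scan signals are assumptions, not the skill's actual verification model.

```python
from enum import Enum


class Verification(Enum):
    UNCONFIRMED = "unconfirmed"
    STILL_LISTED = "still_listed"
    HIDDEN_ONLY = "hidden_from_free_search"
    VERIFIED_REMOVED = "verified_removed"


def classify_rescan(rescanned: bool,
                    in_free_search: bool,
                    in_full_record: bool) -> Verification:
    """Map hypothetical re-scan signals to a verification state."""
    if not rescanned:
        # A submitted form proves nothing until someone looks again.
        return Verification.UNCONFIRMED
    if in_free_search:
        return Verification.STILL_LISTED
    if in_full_record:
        # Gone from the free listing but still reachable elsewhere.
        return Verification.HIDDEN_ONLY
    return Verification.VERIFIED_REMOVED


def can_report_removed(state: Verification) -> bool:
    """Only a verified re-scan result may be reported as a removal."""
    return state is Verification.VERIFIED_REMOVED
```

Keeping `HIDDEN_ONLY` as its own state stops a cosmetic change on the broker's side from being rounded up to a success.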
For the broader product angle, read Data Broker Exposure Is the Privacy Work Queue. For the measurement model, read What Data Broker Removal Runs Should Teach Cunicula. For the structured Cunicula record, see unbroker.
Frequently Asked Questions
Can unbroker be self-hosted?
Yes. unbroker is an optional Hermes Agent skill with a deterministic Python CLI. It can run from a local machine or isolated server, with local dossiers, a ledger, email/browser steps, and scheduled re-scans.
Should unbroker run on a local Mac or a VPS?
A local Mac is the cleaner default because personal identifiers stay closer to the operator and browser/email sessions are easier to control. A VPS can work for isolated temporary runs, but it needs tighter storage, access, logging, and deletion discipline.