SearXNG: Search Without Tracking or Profiling
This is not a step-by-step guide. It explains why a self-hosted metasearch engine improves privacy structurally — not as a promise, but through technical architecture. Topics: how it works, privacy advantages, hosting stack, global configuration, and deployment as a company-wide search engine.
What Is a Metasearch Engine?
A metasearch engine is not a search engine in the classical sense — it does not crawl the web or build its own index. Instead, it sends a query simultaneously to multiple existing search engines, aggregates their results, and returns them in a unified view. The user talks to SearXNG, not to Google or Bing.
SearXNG supports over 70 actively configured sources in its default setup — the full list includes up to 242 search services. These range from general web search engines to specialised sources for news, code, maps, images, and academic literature.
An advantage that is often overlooked: not every search engine indexes the same content. Google ranks pages by its own signals, Brave searches an independent index it crawls itself, DuckDuckGo draws from different sources. Querying several simultaneously gives broader result coverage — results missing from one provider appear through another.
The key point: SearXNG has no direct relationship with any single provider. Queries are forwarded without user identifiers: the underlying search provider sees only the SearXNG server's request, not the user behind it.
Using only Google gives one company full visibility into every search query. Using SearXNG with six configured sources distributes requests simultaneously — no single provider sees the complete search behaviour.
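The fan-out idea can be sketched in a few lines of Python. This is an illustration only, not SearXNG's implementation: the two engine functions and their URLs are hypothetical stand-ins for real upstream requests. The point to notice is that each engine receives nothing but the query string.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for upstream engines. Each one receives only the
# query string and returns a list of result URLs. A real metasearch engine
# would issue HTTP requests here.
def engine_a(query):
    return [f"https://a.example/{query}", "https://shared.example/page"]

def engine_b(query):
    return ["https://shared.example/page", f"https://b.example/{query}"]

ENGINES = {"engine_a": engine_a, "engine_b": engine_b}

def fan_out(query, engines=ENGINES):
    """Send the same query to every engine in parallel and collect the answers."""
    with ThreadPoolExecutor(max_workers=len(engines)) as pool:
        futures = {name: pool.submit(fn, query) for name, fn in engines.items()}
        # result() blocks until the corresponding engine has answered.
        return {name: future.result() for name, future in futures.items()}

results = fan_out("privacy")
```

Each provider sees one request from the aggregating server; none of them sees what the other providers were asked or who asked.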
The Privacy Problem with Commercial Search Engines
Search engines like Google and Bing offer their services for free — the business model is based on personalised advertising profiles. Data collection goes well beyond the obvious search history.
- Complete search history and the temporal context of all queries
- Click patterns and dwell-time-based interest profiles
- Digital fingerprinting derived from screen resolution, installed fonts, browser configuration, and device properties
- Tracking pixels in linked pages that transmit data even without a click
Digital fingerprinting is the most persistent problem: unlike cookies, it cannot be cleared by deleting browser storage. The resulting profile survives across sessions for as long as the device's characteristics stay stable. In December 2024, Google announced it would further loosen its fingerprinting restrictions.
For organisations, there is an additional dimension: employees using commercial search engines hand their research activity to external third parties. This conflicts with the data minimisation principle under Art. 5 GDPR — even when the information being searched is not sensitive in itself.
SearXNG breaks this model structurally: no logging, no tracking pixels, no fingerprinting, no user profiles, no forwarding of cookies to underlying search engines. Queries are proxied through the SearXNG server — search engines see its IP address, not the user's.
SearXNG: Open Source, No Tracking, No Profiling
SearXNG is a fork of the original Searx project, actively maintained by the community. The licence is AGPL-3.0, so the source code is fully auditable: hidden data collection in the upstream code would be visible to anyone who reads it.
Technical Privacy Features
- POST instead of GET: Search queries are sent via HTTP POST, not GET. Queries do not appear in the address bar, browser history, or server access logs.
- Image proxy: Images in search results are proxied through the SearXNG server. The user's browser makes no direct connection to third-party image CDNs — the user's IP remains hidden.
- tracker_url_remover: A plugin that strips UTM parameters and tracking tokens from all result URLs before they are shown to the user.
- No ads, no sponsored results: Pure search results, no monetisation layer.
- Tor instances: Public .onion instances exist for use cases requiring maximum anonymity.
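To make the URL-cleaning step concrete, here is a minimal Python sketch of the idea behind tracker_url_remover. It is not the plugin's code: the parameter list below (the `utm_` prefix plus `fbclid` and `gclid`) is an assumption chosen for illustration, and the real plugin maintains its own patterns.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed, illustrative patterns. The real plugin uses its own list.
TRACKING_PREFIXES = ("utm_",)
TRACKING_KEYS = {"fbclid", "gclid"}

def strip_tracking(url):
    """Return the URL with known tracking parameters removed from its query."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.lower().startswith(TRACKING_PREFIXES)
        and key.lower() not in TRACKING_KEYS
    ]
    # Rebuild the URL with only the surviving parameters.
    return urlunsplit(parts._replace(query=urlencode(kept)))

cleaned = strip_tracking("https://example.com/a?id=7&utm_source=news&fbclid=abc")
```

Parameters that carry actual page state, such as `id` in the example, are preserved; only the tracking keys are dropped.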
Public instances are listed at searx.space. For maximum control — especially in an enterprise context — a self-hosted, internally accessible instance is the more secure choice.
Hosting Architecture: Proxmox, Debian, Docker
One concrete way to run SearXNG: a Debian VM on a Proxmox hypervisor, with a Docker Compose stack comprising three services.
Proxmox Host
└── Debian VM
    └── Docker Compose Stack
        ├── Caddy (Reverse Proxy)             ← TLS, security headers, compression
        ├── SearXNG (locally accessible)      ← metasearch engine, Prometheus metrics
        └── Valkey (Redis-compatible cache)   ← query cache, snapshots every 30s
Why This Combination
- Proxmox as hypervisor: OS-level isolation between services; snapshots and backups in seconds; foundation for further homelab growth.
- Debian VM: Minimal, stable, well-documented — a proven base for Docker workloads.
- Docker Compose: Reproducible and portable; all three services are defined in a single docker-compose.yml and managed together.
- Caddy as reverse proxy: Automatic TLS (when publicly accessible), security headers (X-Frame-Options DENY, X-Content-Type-Options nosniff, Referrer-Policy no-referrer), zstd/gzip compression.
- Valkey (Redis fork): Caches search queries so repeated searches are answered without an upstream call; persistence snapshots every 30 seconds.
Data Flow
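How a search query travels through the stack: from the browser to Caddy, from Caddy to SearXNG, and from there via Valkey to the configured search engines.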
Valkey sits between SearXNG and the search engines. On a cache hit, the return path ends at Valkey — no search engines are contacted at all. On a cache miss, Valkey stores the aggregated results before they travel back through SearXNG and Caddy to the browser.
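The hit/miss behaviour described here is the classic cache-aside pattern. The following Python sketch shows that pattern under stated assumptions: an in-memory dictionary stands in for Valkey, and `query_upstream` is a hypothetical placeholder for the real engine requests. It is meant to illustrate the flow, not to reproduce how SearXNG talks to Valkey.

```python
class QueryCache:
    """In-memory stand-in for the Valkey cache (hypothetical, for illustration)."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        return self._store.get(key)

    def set(self, key, value):
        self._store[key] = value

upstream_calls = []  # records every query that actually reached an engine

def query_upstream(query):
    """Hypothetical placeholder for contacting the configured search engines."""
    upstream_calls.append(query)
    return [f"https://example.org/{query}"]

def search(query, cache):
    cached = cache.get(query)
    if cached is not None:
        return cached              # cache hit: no engine is contacted
    results = query_upstream(query)
    cache.set(query, results)      # cache miss: store before returning
    return results

cache = QueryCache()
first = search("wireguard", cache)
second = search("wireguard", cache)
```

After both calls, `upstream_calls` contains the query only once: the second search was answered from the cache.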
Configuration: Global Settings and Branding
The central configuration file for SearXNG is settings.yml. It is the single source of truth for all users of the instance: no per-device configuration, no browser extensions required.
Unlike a browser plugin that each user must individually install and configure, settings.yml applies to every device accessing the instance as soon as the instance has reloaded its configuration. Privacy settings are therefore not dependent on individual user discipline.
Key Settings at a Glance
- instance_name: The name of the instance — appears in the browser tab and as the page title. Can be set to the company name to create a branded internal search engine.
- method: POST: Search queries are transmitted via HTTP POST. Queries do not appear in server logs or browser history.
- image_proxy: true: All images in search results are proxied through SearXNG — the browser makes no direct connection to image hosts.
- infinite_scroll (plugin): The next results page loads automatically as the user scrolls — no pagination click required.
- tracker_url_remover (plugin): UTM parameters and tracking tokens are stripped from all result URLs.
- calculator, unit_converter, hash_plugin, hostnames: Additional active plugins — arithmetic directly in search, unit conversion, MD5/SHA256 on demand, prominent domain display in results.
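Why the POST setting matters can be seen by building both request types side by side. The sketch below uses only the Python standard library and sends nothing over the network; the host name `search.example.internal` is a hypothetical placeholder for an internal instance.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical internal instance URL, used only to construct the requests.
BASE = "https://search.example.internal/search"

def build_get(query):
    """GET: the query becomes part of the URL."""
    return Request(f"{BASE}?{urlencode({'q': query})}", method="GET")

def build_post(query):
    """POST: the query travels in the request body, the URL stays generic."""
    body = urlencode({"q": query}).encode("utf-8")
    return Request(BASE, data=body, method="POST")

get_req = build_get("salary benchmark")
post_req = build_post("salary benchmark")
```

Browser history and typical access logs record the URL, not the body. With GET, the search terms are therefore part of what gets recorded; with POST, only the generic `/search` path is.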
Configured Search Sources
This setup has six sources active simultaneously: Bing, Brave, DuckDuckGo, Google, Startpage, and Yahoo. Their results are merged and deduplicated — no single provider determines the ranking alone.
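Merging and deduplication can be illustrated with a small Python sketch. The scoring rule used here (each engine contributes 1/position for a result) is an assumption picked for simplicity, not SearXNG's actual ranking formula, and the normalisation only strips a trailing slash.

```python
from collections import defaultdict

def merge_results(per_engine):
    """Merge per-engine result lists into one deduplicated, ranked list.

    per_engine maps an engine name to its ordered list of result URLs.
    Returns a list of (url, [engines that returned it]) tuples.
    """
    scores = defaultdict(float)
    sources = defaultdict(list)
    for engine, urls in per_engine.items():
        for position, url in enumerate(urls, start=1):
            key = url.rstrip("/")            # crude normalisation for the sketch
            scores[key] += 1.0 / position    # assumed scoring rule
            sources[key].append(engine)
    # Highest score first; ties are broken alphabetically for determinism.
    ranked = sorted(scores, key=lambda k: (-scores[k], k))
    return [(url, sources[url]) for url in ranked]

merged = merge_results({
    "google": ["https://example.com/", "https://docs.example.org/a"],
    "brave": ["https://docs.example.org/a", "https://example.com"],
    "bing": ["https://example.com"],
})
```

A result returned by several engines accumulates score from each of them, so agreement between providers pushes it up and no single provider decides the order.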
SearXNG as a Company-Wide Search Engine
The most straightforward application for SMEs: a single SearXNG instance deployed as the default search engine for all workstations — via browser policy or MDM. Every employee searches privately without any individual setup.
- Company branding: Set instance_name to the company name, and employees see a company-branded search engine.
- GDPR-compliant by design: No employee-identifying search data leaves company infrastructure (when restricted to VPN/intranet); upstream providers see only the server's requests, which supports data minimisation under Art. 5 GDPR.
- No licence costs: AGPL-3.0, self-hosted, runs on a single small VM.
- Centrally managed: Settings in settings.yml apply to all users as soon as the instance reloads them: no rollouts, no device configuration.
Monitoring Without Profiling
SearXNG supports Prometheus metrics: query volume, response times per search source, error rates — aggregated and anonymous. Not a single search query's content is stored. Operators see whether the instance is running stably, not what was searched.
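What "aggregated and anonymous" means in practice can be shown with a small Python sketch. The class below is hypothetical and not the metrics code SearXNG ships; it only demonstrates that per-engine counters and timings can be kept without the query text ever being passed in.

```python
from collections import Counter

class EngineMetrics:
    """Hypothetical per-engine aggregate counters (illustration only)."""

    def __init__(self):
        self.requests = Counter()
        self.errors = Counter()
        self.total_seconds = Counter()

    def record(self, engine, seconds, ok=True):
        # The query text is deliberately not a parameter, so it cannot be stored.
        self.requests[engine] += 1
        self.total_seconds[engine] += seconds
        if not ok:
            self.errors[engine] += 1

    def summary(self, engine):
        count = self.requests[engine]
        average = self.total_seconds[engine] / count if count else 0.0
        return {
            "requests": count,
            "errors": self.errors[engine],
            "avg_seconds": average,
        }

metrics = EngineMetrics()
metrics.record("brave", 0.25)
metrics.record("brave", 0.75, ok=False)
brave_summary = metrics.summary("brave")
```

The operator can read request volume, error count, and average latency per engine from `brave_summary`, but nothing in the structure could reveal what was searched.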
For comparison: a SearXNG instance on a EUR 3–5/month VPS or an internal VM replaces paid enterprise search subscriptions while eliminating employee search profiling by commercial providers.
Network Security: WireGuard and Tailscale
In this setup, SearXNG is not publicly accessible — access is restricted to members of the WireGuard or Tailscale network. This is a deliberate choice.
Even though SearXNG logs nothing, a passive network observer could detect that a client regularly contacts a search service. WireGuard and Tailscale eliminate this visibility: all traffic between client and server is encrypted — ISPs and network monitors see only encrypted data packets.
WireGuard and Tailscale Overview
- WireGuard: A lean, modern VPN protocol; integrated directly into the Linux kernel; minimal attack surface; excellent performance.
- Tailscale: Built on WireGuard; zero-config deployment per device; mesh network without a central VPN server; end-to-end encrypted — not visible even to Tailscale itself.
- Headscale: A self-hosted alternative to Tailscale's control server — full control, no dependency on third-party infrastructure. Ideal for privacy-conscious organisations.
The result is a layered privacy architecture: SearXNG (no logging) at the application layer, WireGuard/Tailscale (encrypted transport) at the network layer, and Proxmox isolation at the infrastructure layer.
Concrete setup guides for WireGuard and Headscale are planned as standalone howtos. This article describes how the components work together — not the installation process.
Summary
- SearXNG aggregates over 70 sources simultaneously — with no data collection of its own, no tracking pixels, no fingerprinting.
- Commercial search engines build user profiles. Digital fingerprinting is persistent and cannot be cleared — SearXNG breaks this model structurally.
- settings.yml sets global defaults. Infinite scroll, image proxy, POST method, branding: all centrally configured and effective for every user once the instance reloads its configuration.
- Ideal for SMEs. A company-wide search engine with no licence costs, GDPR-compliant by design, with optional monitoring.
- WireGuard and Tailscale strengthen privacy at the network layer — search traffic remains encrypted between client and server.
This article is an overview — a production setup has more moving parts. For specific questions about configuration, deployment, or enterprise use, reach me via the contact page.