Freorit

Open Source & Security Practitioner
Privacy · Docker · Selfhosting · GDPR · Overview · approx. 12 min read · Beginner to Intermediate · Updated: 2025

SearXNG: Search Without Tracking or Profiling

This is not a step-by-step guide. It explains why a self-hosted metasearch engine improves privacy structurally — not as a promise, but through technical architecture. Topics: how it works, privacy advantages, hosting stack, global configuration, and deployment as a company-wide search engine.

What Is a Metasearch Engine?

A metasearch engine is not a search engine in the classical sense — it does not crawl the web or build its own index. Instead, it sends a query simultaneously to multiple existing search engines, aggregates their results, and returns them in a unified view. The user talks to SearXNG, not to Google or Bing.
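The fan-out step can be sketched in a few lines of Python. This is a minimal illustration, not SearXNG's implementation: the two engine functions are hypothetical stand-ins that return canned data where a real instance would issue HTTP requests.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for upstream engines; a real metasearch engine
# would issue HTTP requests here instead of returning canned data.
def engine_a(query):
    return [f"https://a.example/{query}", "https://shared.example/page"]

def engine_b(query):
    return ["https://shared.example/page", f"https://b.example/{query}"]

ENGINES = {"engine_a": engine_a, "engine_b": engine_b}

def fan_out(query):
    """Send one query to every engine in parallel; return results per engine."""
    with ThreadPoolExecutor(max_workers=len(ENGINES)) as pool:
        futures = {name: pool.submit(fn, query) for name, fn in ENGINES.items()}
        return {name: future.result() for name, future in futures.items()}

results = fan_out("privacy")
for name, urls in results.items():
    print(name, urls)
```

The user makes one request; the server makes one request per engine. Merging the per-engine lists into a single ranking is a separate step.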

SearXNG supports over 70 actively configured sources in its default setup — the full list includes up to 242 search services. These range from general web search engines to specialised sources for news, code, maps, images, and academic literature.

An advantage that is often overlooked: not every search engine indexes the same content. Google ranks pages by its own signals, Brave searches an independent index it crawls itself, DuckDuckGo draws from different sources. Querying several simultaneously gives broader result coverage — results missing from one provider appear through another.

The key point: the user has no direct relationship with any single provider. The SearXNG server forwards each query without the user's IP address or cookies, so the underlying search provider sees only the SearXNG server's request, not the user behind it.

Why This Matters

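Using only Google gives one company full visibility into every search query, together with the identity behind it. With SearXNG and six configured sources, each provider still receives the query text, but it arrives from the SearXNG server rather than from the user's browser. No single provider can tie the complete search behaviour to an individual.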

The Privacy Problem with Commercial Search Engines

Search engines like Google and Bing offer their services for free — the business model is based on personalised advertising profiles. Data collection goes well beyond the obvious search history.

What Commercial Search Engines Know About You

  • Complete search history and temporal context of all queries
  • Click patterns and dwell-time-based interest profiles
  • Digital fingerprinting derived from screen resolution, installed fonts, browser configuration, and device properties
  • Tracking pixels in linked pages that transmit data even without a click

Digital fingerprinting is the most persistent problem: unlike cookies, it cannot be cleared by deleting browser storage. It creates a cross-device, cross-session profile that persists indefinitely. In December 2024, Google announced it would further loosen its fingerprinting restrictions.
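Why deleting cookies does not help can be shown with a small sketch. It assumes a simplified model in which a fingerprint is just a hash over device attributes; real fingerprinting scripts use many more signals, and the attribute values below are illustrative.

```python
import hashlib
import json

def fingerprint(attributes):
    """Derive a stable identifier from device/browser attributes alone."""
    canonical = json.dumps(attributes, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]

# Illustrative attribute values, not taken from any real device.
device = {
    "screen": "2560x1440",
    "fonts": ["DejaVu Sans", "Liberation Serif"],
    "user_agent": "ExampleBrowser/1.0",
    "timezone": "Europe/Berlin",
}

cookies = {"session": "abc123"}
before = fingerprint(device)
cookies.clear()              # the user deletes all browser storage
after = fingerprint(device)  # the device attributes are unchanged
print(before == after)       # prints True
```

The identifier is recomputed from the device on every visit, so there is nothing stored on the client that could be deleted.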

For organisations, there is an additional dimension: employees using commercial search engines hand their research activity to external third parties. This conflicts with the data minimisation principle under Art. 5 GDPR — even when the information being searched is not sensitive in itself.

SearXNG breaks this model structurally: no logging, no tracking pixels, no fingerprinting, no user profiles, no forwarding of cookies to underlying search engines. Queries are proxied through the SearXNG server — search engines see its IP address, not the user's.
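What "proxied through the server" means for the upstream request can be sketched as follows. This is a hypothetical illustration of the principle, not SearXNG's code: the allowlist and the function name are assumptions made for the example.

```python
# Headers a proxying metasearch server might pass upstream (assumed list).
FORWARDED_HEADERS = {"accept-language"}

def build_upstream_headers(client_headers, server_user_agent):
    """Keep only generic headers; drop cookies, referrer, and client identity."""
    upstream = {
        name: value
        for name, value in client_headers.items()
        if name.lower() in FORWARDED_HEADERS
    }
    upstream["User-Agent"] = server_user_agent  # chosen by the server, not the user
    return upstream

client_request = {
    "Cookie": "tracking_id=42",
    "Referer": "https://intranet.example/wiki",
    "User-Agent": "EmployeeBrowser/7.1 (Workstation 12)",
    "X-Forwarded-For": "10.8.0.23",
    "Accept-Language": "en-GB",
}

upstream = build_upstream_headers(client_request, "GenericAgent/1.0")
print(upstream)
```

Because the outgoing request is built fresh on the server, the provider's view is limited to the server's IP address and whatever the server chooses to send.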

SearXNG: Open Source, No Tracking, No Profiling

SearXNG is a fork of the original Searx project, actively maintained by the community. The licence is AGPL-3.0, so the source code is fully auditable. Hidden data collection in the software itself would be visible to anyone reading the code.

Technical Privacy Features

Public instances are listed at searx.space. For maximum control — especially in an enterprise context — a self-hosted, internally accessible instance is the more secure choice.

Hosting Architecture: Proxmox, Debian, Docker

One concrete way to run SearXNG: a Debian VM on a Proxmox hypervisor, with a Docker Compose stack comprising three services.

Proxmox Host
  └── Debian VM
        └── Docker Compose Stack
              ├── Caddy (Reverse Proxy)             ← TLS, security headers, compression
              ├── SearXNG (locally accessible)      ← metasearch engine, Prometheus metrics
              └── Valkey (Redis-compatible cache)   ← query cache, snapshots every 30s

Why This Combination

Data Flow

How a search query travels through the stack:

SearXNG data flow: request and return route via the Valkey cache

Request route:
  Browser (client on the internal network)
    → Caddy (reverse proxy: TLS, security headers)   HTTP request
    → SearXNG (POST, image proxy, Prometheus)        proxied locally
    → Valkey (cache, Redis fork)                     cache lookup
    → Engines (Bing, DuckDuckGo, Startpage, Brave, Google, Yahoo)   POST, only on a cache miss

Return route:
  Engines → Valkey (results stored) → SearXNG (hit or stored result) → Caddy (response) → Browser (search results)

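In the diagram, Valkey sits between SearXNG and the search engines. Valkey is a passive store and never contacts a search engine itself: SearXNG looks the query up in Valkey first. On a cache hit, the return path ends at Valkey and no search engines are contacted at all. On a cache miss, SearXNG queries the engines, and the aggregated results are stored in Valkey before they travel back through Caddy to the browser.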
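The hit/miss behaviour follows the common cache-aside pattern. The sketch below uses an in-memory dictionary as a stand-in for Valkey and a hypothetical query_engines function; the 300-second lifetime is an assumed value, not one taken from this setup.

```python
import time

class TTLCache:
    """Tiny in-memory stand-in for a key-value store with expiry."""
    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self.store = {}

    def get(self, key):
        entry = self.store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self.store[key]  # expired entries count as a miss
            return None
        return value

    def set(self, key, value):
        self.store[key] = (value, time.monotonic() + self.ttl)

upstream_calls = []

def query_engines(query):
    """Hypothetical upstream call; records that the engines were contacted."""
    upstream_calls.append(query)
    return [f"result for {query}"]

def search(cache, query):
    key = query.strip().lower()
    cached = cache.get(key)
    if cached is not None:
        return cached                # cache hit: no engine is contacted
    results = query_engines(query)   # cache miss: ask the engines
    cache.set(key, results)
    return results

cache = TTLCache(ttl_seconds=300)
search(cache, "SearXNG")
search(cache, "searxng ")
print(len(upstream_calls))  # prints 1: the second search was a cache hit
```

Every hit is one request the search providers never see, which reduces both latency and the amount of query data leaving the server.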

Configuration: Global Settings and Branding

The central configuration file for SearXNG is settings.yml. It is the single source of truth for all users of the instance — no per-device configuration, no browser extensions required.

settings.yml: One Setting Applies to Everyone

Unlike a browser plugin that each user must individually install and configure, settings.yml takes effect immediately for all devices accessing the instance. Privacy settings are therefore not dependent on individual user discipline.

Key Settings at a Glance
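The settings named in this article (branding, POST method, image proxy, infinite scroll) could look roughly like the fragment generated below. The key names follow SearXNG's documented settings.yml layout as I understand it, and the values are illustrative; both should be checked against the installed version. The small emitter only exists to keep the example free of third-party dependencies.

```python
# Key names follow SearXNG's documented settings.yml layout as I understand
# it; the values are illustrative. Verify both against your installed version.
settings = {
    "use_default_settings": True,
    "general": {"instance_name": "Example Corp Search"},
    "server": {"method": "POST", "image_proxy": True},
    "ui": {"infinite_scroll": True},
}

def to_yaml(mapping, indent=0):
    """Render a nested dict of scalars as a list of YAML lines (minimal emitter)."""
    lines = []
    pad = " " * indent
    for key, value in mapping.items():
        if isinstance(value, dict):
            lines.append(f"{pad}{key}:")
            lines.extend(to_yaml(value, indent + 2))
        elif isinstance(value, bool):
            lines.append(f"{pad}{key}: {'true' if value else 'false'}")
        else:
            lines.append(f'{pad}{key}: "{value}"')
    return lines

print("\n".join(to_yaml(settings)))
```

With POST as the request method, search terms travel in the request body rather than in the URL, so they do not end up in browser history or proxy logs that record URLs.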

Configured Search Sources

This setup has six sources active simultaneously: Bing, Brave, DuckDuckGo, Google, Startpage, and Yahoo. Their results are merged and deduplicated — no single provider determines the ranking alone.
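Merging and deduplication can be illustrated with a simplified scoring scheme. This is not SearXNG's actual ranking algorithm: the sketch assumes that URLs are compared by host and path only, and that each engine contributes the reciprocal of the position at which it listed a result.

```python
from urllib.parse import urlsplit

def normalise(url):
    """Reduce a URL to host + path so trivial variants compare equal."""
    parts = urlsplit(url)
    host = parts.netloc.lower().removeprefix("www.")
    return host + parts.path.rstrip("/")

def merge(results_by_engine):
    """Merge ranked lists; score each URL by summed reciprocal rank."""
    merged = {}
    for engine, urls in results_by_engine.items():
        for position, url in enumerate(urls, start=1):
            entry = merged.setdefault(
                normalise(url), {"url": url, "engines": [], "score": 0.0}
            )
            entry["engines"].append(engine)
            entry["score"] += 1.0 / position
    return sorted(merged.values(), key=lambda e: e["score"], reverse=True)

ranked = merge({
    "bing": ["https://example.org/a", "https://example.org/b"],
    "brave": ["https://www.example.org/b/", "https://example.org/c"],
})
for entry in ranked:
    print(entry["score"], entry["url"], entry["engines"])
```

In this example the page listed by both engines moves to the top (score 1.5), ahead of the page that only one engine ranked first (score 1.0). A result confirmed by several independent sources outranks a single provider's favourite.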

SearXNG as a Company-Wide Search Engine

The most straightforward application for SMEs: a single SearXNG instance deployed as the default search engine for all workstations — via browser policy or MDM. Every employee searches privately without any individual setup.
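For Firefox, a browser policy of this kind is a policies.json file. The sketch below generates one; the key names follow Mozilla's SearchEngines policy as I understand it, while the hostname and engine name are hypothetical. Chromium-based browsers use different policy names, so treat this as one example rather than a universal template.

```python
import json

INSTANCE_URL = "https://search.example.internal"  # hypothetical hostname

policy = {
    "policies": {
        "SearchEngines": {
            "Add": [
                {
                    "Name": "Example Corp Search",
                    "URLTemplate": f"{INSTANCE_URL}/search",
                    "Method": "POST",
                    "PostData": "q={searchTerms}",
                }
            ],
            "Default": "Example Corp Search",
        }
    }
}

policies_json = json.dumps(policy, indent=2)
print(policies_json)
```

Distributed through MDM or a software deployment tool, the file makes the internal instance the default on every managed workstation.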

A Company-Wide Search Engine Without Tracking
  • Company branding: Set instance_name to the company name — employees see a company-branded search engine.
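  • GDPR-friendly by design: Search providers receive queries only from the SearXNG server, never from an employee's device. With access restricted to VPN/intranet, the leg between employee and server stays inside company infrastructure. This supports data minimisation under Art. 5 GDPR.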
  • No licence costs: AGPL-3.0, self-hosted, runs on a single small VM.
  • Centrally managed: Settings in settings.yml take effect immediately for all users — no rollouts, no device configuration.

Monitoring Without Profiling

SearXNG supports Prometheus metrics: query volume, response times per search source, error rates — aggregated and anonymous. Not a single search query's content is stored. Operators see whether the instance is running stably, not what was searched.
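The difference between monitoring and profiling is visible in the data model. The sketch below is a hypothetical collector, not SearXNG's metrics code, and the metric names are invented for illustration. The point is that the recording function has no parameter through which query text could enter.

```python
from collections import defaultdict

class EngineMetrics:
    """Aggregate per-engine counters; the query text is never an argument."""
    def __init__(self):
        self.requests = defaultdict(int)
        self.errors = defaultdict(int)
        self.total_seconds = defaultdict(float)

    def record(self, engine, seconds, ok=True):
        self.requests[engine] += 1
        self.total_seconds[engine] += seconds
        if not ok:
            self.errors[engine] += 1

    def render(self):
        """Emit lines in the style of the Prometheus text exposition format."""
        lines = []
        for engine in sorted(self.requests):
            label = f'{{engine="{engine}"}}'
            lines.append(f"search_requests_total{label} {self.requests[engine]}")
            lines.append(f"search_errors_total{label} {self.errors[engine]}")
            lines.append(
                f"search_seconds_sum{label} {self.total_seconds[engine]:.3f}"
            )
        return "\n".join(lines)

metrics = EngineMetrics()
metrics.record("brave", 0.25)
metrics.record("brave", 0.75, ok=False)
print(metrics.render())
```

An operator can derive error rates and average response times per source from these counters, but cannot reconstruct a single search.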

For comparison: a SearXNG instance on a EUR 3–5/month VPS or an internal VM replaces paid enterprise search subscriptions while eliminating employee search profiling by commercial providers.

Network Security: WireGuard and Tailscale

In this setup, SearXNG is not publicly accessible — access is restricted to members of the WireGuard or Tailscale network. This is a deliberate choice.

Even though SearXNG logs nothing, a passive network observer could detect that a client regularly contacts a search service. WireGuard and Tailscale eliminate this visibility: all traffic between client and server is encrypted — ISPs and network monitors see only encrypted data packets.
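Restricting access to VPN members comes down to an address check. In practice this is enforced by the firewall or the reverse proxy rather than in application code; the Python sketch only illustrates the rule. The WireGuard subnet is a hypothetical example, while 100.64.0.0/10 is the carrier-grade NAT range from which Tailscale assigns addresses.

```python
import ipaddress

ALLOWED_NETWORKS = [
    ipaddress.ip_network("10.8.0.0/24"),    # hypothetical WireGuard subnet
    ipaddress.ip_network("100.64.0.0/10"),  # CGNAT range used by Tailscale
]

def is_vpn_client(address):
    """Return True if the address belongs to one of the VPN networks."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in ALLOWED_NETWORKS)

for candidate in ("10.8.0.5", "100.101.102.103", "203.0.113.7"):
    print(candidate, is_vpn_client(candidate))
```

The first two addresses are accepted and the third, a public address, is rejected. The instance is therefore unreachable for anyone who has not first authenticated to the VPN.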

WireGuard and Tailscale Overview

The result is a layered privacy architecture: SearXNG (no logging) at the application layer, WireGuard/Tailscale (encrypted transport) at the network layer, and Proxmox isolation at the infrastructure layer.

WireGuard and Tailscale in Separate Guides

Concrete setup guides for WireGuard and Headscale are planned as standalone howtos. This article describes how the components work together — not the installation process.

Summary

  • SearXNG aggregates over 70 sources simultaneously — with no data collection of its own, no tracking pixels, no fingerprinting.
  • Commercial search engines build user profiles. Digital fingerprinting is persistent and cannot be cleared — SearXNG breaks this model structurally.
  • settings.yml sets global defaults. Infinite scroll, image proxy, POST method, branding — all centrally configured, effective immediately for every user.
  • Ideal for SMEs. A company-wide search engine with no licence costs, GDPR-compliant by design, with optional monitoring.
  • WireGuard and Tailscale strengthen privacy at the network layer — search traffic remains encrypted between client and server.

Questions About a Specific Setup?

This article is an overview — a production setup has more moving parts. For specific questions about configuration, deployment, or enterprise use, reach me via the contact page.

Open-Source Projects Used