
Homelab Network Architecture: VLANs and DNS-Based Policy Routing
How I segment a homelab with VLANs and use DNS queries to route traffic through country-specific VPN tunnels — all driven by a YAML intent file and Jinja2 templates.
Most homelabs start as a flat network. A NAS, a few Docker containers, some IoT devices, maybe a Proxmox host — all sharing the same subnet, the same default gateway, the same path to the internet. It works fine until you start asking questions like "why can my smart plug talk to my backup server?" or "can I route BBC iPlayer through a UK exit without affecting everything else on this VLAN?"
These are the questions that turned my flat network into a VLAN-segmented, policy-routed setup running on an OpenWRT instance inside a Proxmox LXC. The result is a system where adding a new domain to a specific VPN tunnel is a one-line YAML change, and the entire network configuration is rendered from templates — no manual edits on the router, ever.
This post walks through the architecture: why VLANs, how DNS-based policy routing works at a packet level, and the intent-driven system that ties it all together.
The VLAN Layout
The network is segmented into seven VLANs, each with a dedicated purpose and a default egress path. The OpenWRT instance acts as the internal gateway, routing between VLANs and making egress decisions.
| VLAN | Subnet | Purpose | Default Egress |
|---|---|---|---|
| 12 — INFRA | 172.16.12.0/24 | Core services (DNS, proxy, NAS, monitoring) | WAN |
| 20 — LAN | 172.16.20.0/24 | Workstations and general clients | VPN (UK) |
| 30 — VPN UK | 172.16.30.0/24 | Dedicated UK VPN clients | VPN (UK) |
| 31 — VPN US | 172.16.31.0/24 | Dedicated US VPN clients | VPN (US) |
| 32 — VPN CH | 172.16.32.0/24 | Dedicated CH VPN clients | VPN (CH) |
| 40 — IOT | 172.16.40.0/24 | IoT devices (isolated, no inter-VLAN) | WAN |
| 90 — LAB | 172.16.90.0/24 | Testing and experiments | WAN |
The physical topology is simple: a single trunk link from Proxmox to the OpenWRT LXC carries all VLANs as tagged traffic, with INFRA as the native untagged VLAN. OpenWRT sees each VLAN as an eth0.X interface and handles all inter-VLAN routing, DHCP, and DNS.
Each VPN VLAN has a corresponding WireGuard tunnel. VLANs 30, 31, and 32 have dedicated country-specific exits — any device placed on VLAN 30 will have all its internet traffic routed through the UK WireGuard tunnel by default. This is source-based routing, and it works well for devices that need all traffic through a specific country.
But source-based routing is a blunt instrument. What if a device on the main LAN (VLAN 20) needs BBC iPlayer through the UK tunnel but everything else direct? What if an infrastructure service on VLAN 12 needs to reach a specific external API through Switzerland? You can't solve this with source-based rules alone — you need per-destination routing. And that's where DNS-based policy routing comes in.
The Problem: Per-Domain Routing
Source-based VPN routing works like this: traffic from 172.16.30.0/24 gets an ip rule that sends it to routing table 100, which has a default route via the UK WireGuard tunnel. Clean, predictable, and entirely inflexible.
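On the router these rules live in generated config rather than being typed by hand, but stripped down to raw iproute2 commands the mechanism amounts to something like this (the vpn_uk interface name and the priority number are borrowed from listings later in this post; treat it as a sketch, not the exact generated output):
# send all traffic sourced from the UK VPN VLAN to routing table 100
ip rule add from 172.16.30.0/24 lookup 100 priority 9985
# table 100's default route points at the UK WireGuard interface
ip route add default dev vpn_uk table 100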
The moment you need a single device to route some traffic through a VPN and the rest directly, source-based rules fall apart. You'd need to either move that device to a VPN VLAN (losing direct access to everything else) or start maintaining static routes for every destination IP (which change constantly for CDN-heavy services).
DNS-based policy routing sidesteps this entirely. Instead of routing based on who is sending traffic, it routes based on where the traffic is going — and it learns those destinations from DNS queries in real time. When a client resolves bbc.co.uk, the router learns the resulting IP addresses and marks subsequent packets to those IPs for VPN routing. No static route maintenance, no per-device configuration.
The key insight is that DNS resolution happens before the first packet is sent. By the time a client makes an HTTP request to a resolved IP, the router already knows which routing table that IP belongs to.
How DNS-Based Policy Routing Works
The system operates across three layers, each handling a different part of the routing decision. Understanding each layer separately makes the whole system approachable.
Layer 1: DNS Resolution and nftset Tagging
dnsmasq on the OpenWRT router is the DNS resolver for all VLANs. When it resolves a domain that appears in the intent configuration, it uses the nftset directive to dynamically add the resolved IP to an nftables set.
# dnsmasq nftset directives (auto-generated from intent YAML)
nftset=/bbc.co.uk/4#inet#fw4#uk_dst4
nftset=/ssl.bbc.co.uk/4#inet#fw4#uk_dst4
nftset=/hulu.com/4#inet#fw4#us_dst4
nftset=/srf.ch/4#inet#fw4#ch_dst4
When a client queries bbc.co.uk and dnsmasq resolves it to, say, 151.101.0.81, that IP is immediately added to the uk_dst4 nftables set. The set has a configurable timeout (6 hours by default), after which entries age out — so stale CDN IPs don't persist indefinitely.
This is the entire mechanism by which the router "learns" which IPs should be VPN-routed. No static IP lists, no manual updates — just DNS resolution triggering set membership.
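It's also trivially observable: resolve a tagged domain through the router, then list the set. The address 172.16.12.1 below is a placeholder for wherever dnsmasq listens:
# resolve through the router's dnsmasq (172.16.12.1 is a placeholder address)
nslookup bbc.co.uk 172.16.12.1
# the resolved IPs should now sit in the UK set, each with an expiry timer
nft list set inet fw4 uk_dst4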
Layer 2: Packet Marking in nftables
With the nftables sets populated by dnsmasq, the next layer marks packets based on destination set membership. This happens in a prerouting mangle chain:
# Internal destinations are never policy-routed
set rfc1918_dst4 {
    type ipv4_addr
    flags interval
    elements = { 10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16 }
}
# Intent destination sets (populated by dnsmasq)
set uk_dst4 {
    type ipv4_addr
    flags interval, timeout
    timeout 6h
}
chain pbr_intent_prerouting {
    type filter hook prerouting priority mangle; policy accept;
    # Hard exemption: RFC1918 destinations stay local
    iifname "eth0.20" ip daddr @rfc1918_dst4 return
    # Default: no mark (0x0)
    iifname "eth0.20" meta mark set 0x00000000
    # WAN-exempt override (captive portals, connectivity checks)
    iifname "eth0.20" ip daddr @wan_dst4 meta mark set 0x000000FE
    # Geo overrides (last writer wins)
    iifname "eth0.20" ip daddr @uk_dst4 meta mark set 0x00000064
    iifname "eth0.20" ip daddr @us_dst4 meta mark set 0x00000065
    iifname "eth0.20" ip daddr @ch_dst4 meta mark set 0x00000066
}
The ordering matters. RFC1918 destinations are exempted first (inter-VLAN traffic must never enter a VPN tunnel). Then traffic gets a default zero mark. WAN-exempt domains get a special mark (0xFE) that ensures they bypass VPN even for VLANs whose default egress is a VPN tunnel. Finally, geo-specific marks override everything else.
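When a mark doesn't land where expected, nftables' own tracing shows which of these rules a packet actually hit. A throwaway trace rule scoped to a single destination keeps the output readable; the IP below is just the example address from earlier, and the rule should be deleted once you're done:
# temporary trace rule at the top of the chain, scoped to one destination
nft insert rule inet fw4 pbr_intent_prerouting ip daddr 151.101.0.81 meta nftrace set 1
# stream trace events while generating traffic from a client
nft monitor trace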
Layer 3: Policy Routing via ip rule
The final layer maps fwmarks to routing tables:
# ip rule list (abbreviated)
9983: fwmark 0xfe lookup main # WAN-exempt → direct
9984: from 172.16.20.0/24 lookup 100 # VLAN 20 default → UK
9985: from 172.16.30.0/24 lookup 100 # VLAN 30 → UK
9986: from 172.16.31.0/24 lookup 101 # VLAN 31 → US
9987: from 172.16.32.0/24 lookup 102 # VLAN 32 → CH
9990: fwmark 0x64 lookup 100 # UK intent mark → UK table
9991: fwmark 0x65 lookup 101 # US intent mark → US table
9992: fwmark 0x66 lookup 102 # CH intent mark → CH table
Priority ordering is critical. The WAN-exempt fwmark rule at priority 9983 sits above the source-based VPN rules (9984–9987). This means that even for VLAN 30 — which defaults all traffic to the UK VPN — domains marked as WAN-exempt (like captive portal detection endpoints) will still go direct. Without this ordering, devices on VPN VLANs would break on captive portal networks.
The intent fwmark rules (9990–9992) sit below the source-based rules, acting as overrides for traffic that dnsmasq has specifically tagged. A packet from VLAN 12 (INFRA, which defaults to WAN) going to an IP in uk_dst4 will be marked 0x64 by nftables, skip the source-based rules (no match for VLAN 12 → VPN), and hit the fwmark rule at priority 9990, routing it through the UK tunnel.
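The kernel can be asked directly which table a marked packet would use, which makes this layer easy to sanity-check (using the example address from earlier):
# confirm the rule ordering the kernel actually has
ip rule show
# which route would a packet carrying the UK intent mark (0x64) take?
ip route get 151.101.0.81 mark 0x64
# and inspect the UK table's default route directly
ip route show table 100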
Each routing table has a simple default route:
# Table 100 (UK)
default via WireGuard vpn_uk interface
172.16.0.0/12 via main table # keep internal traffic local
192.168.3.0/24 via upstream GW # upstream LAN access
Intent-Based Domain Routing
The entire per-domain routing system is driven by a single YAML file. Adding a domain to a VPN tunnel is a one-line change:
intent:
  nft_family: inet
  nft_table: fw4
  apply_iifnames:
    - eth0.12   # INFRA
    - eth0.20   # LAN
    - eth0.30   # VPN UK
    - eth0.40   # IOT
    - eth0.90   # LAB
  set_timeout: 6h
  regions:
    wan:
      set_name: wan_dst4
      fwmark_hex: "0x000000FE"
      domains:
        - msftconnecttest.com
        - connectivity-check.ubuntu.com
        - captive.apple.com
    uk:
      set_name: uk_dst4
      fwmark_hex: "0x00000064"
      domains:
        - streaming-service.co.uk
        - cdn.streaming-service.co.uk
    us:
      set_name: us_dst4
      fwmark_hex: "0x00000065"
      domains:
        - us-streaming.com
        - api.us-streaming.com
    ch:
      set_name: ch_dst4
      fwmark_hex: "0x00000066"
      domains:
        - news-site.ch
        - indexer-api.example.com
Each region defines an nftables set name, a fwmark value, and a list of domains. The apply_iifnames field controls which VLAN interfaces are subject to intent routing — you wouldn't typically include the VPN-dedicated VLANs here since they already route everything through VPN by default, but you can if you want WAN-exempt overrides on those VLANs.
The wan region is special. It uses a non-zero fwmark (0xFE) specifically so the WAN-exempt ip rule can sit above the source-based VPN rules. If it used 0x0 (unmarked), there'd be no way to distinguish "intentionally WAN-exempt" from "just unmarked" — and the source-based rules would catch it.
Want to route a new streaming service through the US tunnel? Add one line under us.domains, push to git, and CI/CD renders and deploys the updated dnsmasq and nftables config. No SSH, no manual edits, no restart scripts.
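In practice that workflow is nothing more than an edit and a push (the domain below is made up for illustration):
# add one line under us.domains in the intent file
$EDITOR intent-domains.yml
git add intent-domains.yml
git commit -m "intent: route new-streaming.example.com via US tunnel"
git push    # CI/CD renders, deploys, and applies the change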
Template-Driven Configuration
Every config file on the router is generated by Jinja2 templates consuming two YAML sources: the network plan (VLANs, services, WireGuard tunnels) and the intent domains file.
The render pipeline is a Python script that loads both YAML files, enriches the data (calculating DHCP ranges, prefix lengths, and deriving cross-references), and renders templates:
templates/
├── network.j2 → /etc/config/network
├── firewall.j2 → /etc/config/firewall
├── dhcp.j2 → /etc/config/dhcp
├── dnsmasq.d/
│ ├── 02-local-dns.conf.j2 → /etc/dnsmasq.d/02-local-dns.conf
│ └── 90-intent-domains.conf.j2 → /etc/dnsmasq.d/90-intent-domains.conf
└── nftables.d/
├── 90-pbr-intent.nft.j2 → /etc/nftables.d/90-pbr-intent.nft
└── 91-wan-egress-guard.nft.j2 → /etc/nftables.d/91-wan-egress-guard.nft
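Rendering is a single script invocation; the script name and flags below are placeholders rather than the real ones:
# render both YAML sources into a local output directory for review or deploy
python3 render.py --plan network-plan.yml --intent intent-domains.yml --out rendered/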
The templates themselves are straightforward Jinja2 loops. The dnsmasq intent template, for instance, iterates over all regions and domains to produce the nftset directives:
{% for region_name, region in regions | dictsort %}
{% for d in region.domains | default([]) | sort %}
nftset=/{{ d }}/4#{{ fam }}#{{ table }}#{{ region.set_name }}
{% endfor %}
{% endfor %}
On deploy, rendered configs are SCP'd to the router into a staging directory, validated with UCI syntax checks, and then atomically installed. Services restart in a specific order — network first, then dnsmasq, then firewall — because DNS needs working interfaces before it can bind, and the firewall needs both.
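A sketch of that deploy step, with placeholder paths and addresses, and a plain uci show parse standing in for the syntax check:
# stage the rendered configs on the router (paths and address are placeholders)
scp -r rendered/* root@172.16.12.1:/tmp/staged/
# parse the staged UCI files; a syntax error aborts before anything is installed
ssh root@172.16.12.1 'uci -c /tmp/staged show network >/dev/null && uci -c /tmp/staged show firewall >/dev/null'
# install, then restart in dependency order: network, dnsmasq, firewall
ssh root@172.16.12.1 'cp /tmp/staged/network /tmp/staged/firewall /tmp/staged/dhcp /etc/config/ \
  && /etc/init.d/network restart && /etc/init.d/dnsmasq restart && /etc/init.d/firewall restart'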
The CI/CD pipeline triggers on any push to main that touches the YAML sources, templates, or render scripts. A commit to intent-domains.yml adding a single domain will render all configs, SCP them to the router, and apply them within the same pipeline run.
Safety Rails
Running DNS-based routing on a real network means you need guardrails. Three mechanisms prevent traffic from ending up somewhere it shouldn't.
WAN Egress Guard
The most important safety rail: an nftables forward chain that prevents VPN-marked traffic from leaking to WAN if the VPN tunnel goes down.
chain pbr_wan_egress_guard {
    type filter hook forward priority filter; policy accept;
    # Allow WAN-exempt traffic
    iifname "eth0.20" oifname "eth0.3" meta mark 0x000000FE accept
    # Drop anything else with a non-zero mark heading to WAN
    iifname "eth0.20" oifname "eth0.3" meta mark != 0x00000000 counter drop
}
If the UK WireGuard tunnel drops and the kernel falls through to the default route (WAN), this chain catches VPN-marked packets and drops them rather than letting them exit unencrypted. The counter is essential for debugging — a rising drop count tells you a tunnel is down.
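In practice the guard doubles as a tunnel-health indicator. Two quick checks, using the vpn_uk interface name from earlier:
# a climbing drop counter means marked traffic is being stopped at the WAN boundary
nft list chain inet fw4 pbr_wan_egress_guard
# cross-check the tunnel itself: a stale handshake timestamp means the peer is unreachable
wg show vpn_uk latest-handshakes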
DNS Enforcement
All VLANs are forced to use the OpenWRT router as their DNS resolver. Direct DNS to the WAN (public resolvers like 8.8.8.8) is blocked by firewall rules, and direct DNS to the upstream DNS servers is also blocked — clients must go through dnsmasq so the nftset tagging works.
# Firewall rules (per VLAN)
Block-lan_main-DNS-to-WAN → REJECT
Block-lan_main-DNS-to-Upstream → REJECT
Without DNS enforcement, a client using 1.1.1.1 as its DNS resolver would bypass dnsmasq entirely, the nftsets would never be populated for that client's traffic, and intent routing would silently fail. Forcing every query through dnsmasq closes that gap.
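Confirming the enforcement from any client is a ten-second test (172.16.12.1 again stands in for the router's address):
# a direct query to a public resolver should be rejected or time out
nslookup example.com 8.8.8.8
# the same query through the router's dnsmasq should succeed
nslookup example.com 172.16.12.1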
RFC1918 Exemption
The nftables prerouting chain has a hard return rule for RFC1918 destinations before any marking happens. Inter-VLAN traffic (e.g., a client on VLAN 20 accessing a NAS on VLAN 12) must never enter a VPN tunnel, even if the packet's source VLAN has a default VPN egress.
This is backed up by dedicated ip rule entries at priority 49–56 that send RFC1918-destined traffic to the main routing table before any VPN rules are evaluated:
# Priority 49–56: internal bypass (before intent marks at 9983+)
49: from 172.16.12.0/24 to 172.16.0.0/12 lookup main
50: from 172.16.20.0/24 to 172.16.0.0/12 lookup main
Two layers of protection for the same thing: belt and braces.
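Both layers are easy to verify from the router with a simulated lookup, since ip route get accepts a source address and an incoming interface (the client and NAS addresses below are illustrative):
# a VLAN 20 client reaching a NAS on VLAN 12 should resolve via the main table,
# i.e. out eth0.12 directly, never through a WireGuard interface
ip route get 172.16.12.10 from 172.16.20.50 iif eth0.20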
Lessons Learned
This system has been running in production for several months, and a few non-obvious issues came up during development.
nftset timeout tuning. The default 6-hour timeout works for most services, but CDN-heavy domains (streaming services) rotate IPs frequently. Too short a timeout and entries expire mid-session: a client that cached the DNS answer won't send fresh lookups to refresh the set, so routing can break partway through a stream. Too long and stale IPs accumulate. Six hours is a reasonable middle ground, but it's worth monitoring set sizes with nft list set inet fw4 uk_dst4 to catch bloat.
Firewall forwarding rules for intent traffic. This was the most confusing issue to debug. Intent-routed traffic from VLAN 20 destined for the UK VPN needs a config forwarding rule from lan_main to vpn_uk in the firewall config. Without it, the OpenWRT firewall (fw4) rejects the forwarded packet at the zone level, and the symptom is bizarre: curl fails in 0ms with a TCP RST that the router itself generates. The clue is that nothing appears on a tcpdump of the VPN interface — because the packet never makes it past the firewall's forward chain.
Priority ordering in ip rules. Getting the priority numbers right took iteration. The internal bypass rules must come before everything (priority 49–56). WAN-exempt must come before source-based VPN rules (9983 vs 9984–9987). Intent fwmark rules come last (9990–9992). Getting this wrong doesn't cause obvious errors — it causes wrong routing, which is much harder to spot.
Counters for debugging. Every nftables rule in the intent chain has a named counter. When something isn't routing correctly, the first thing I check is nft list chain inet fw4 pbr_intent_prerouting — the counter names show exactly which rule matched for a given interface. If intent_uk_eth0_20 is incrementing but the traffic isn't reaching the VPN, the problem is downstream (ip rule or routing table). If it's not incrementing, the problem is upstream (dnsmasq didn't populate the set, or the DNS query didn't match). Named counters turn a black-box routing problem into a traceable pipeline.
WAN-exempt needs a non-zero mark. This is subtle. Initially I used mark 0x0 for WAN-exempt traffic, reasoning that "unmarked = direct" made sense. But unmarked traffic from a VPN VLAN still matches the source-based ip rule and gets routed through VPN. The fix was giving WAN-exempt its own mark (0xFE) and a dedicated ip rule at a higher priority. Non-zero marks are the only way to pre-empt source-based rules.
The full system — VLANs, WireGuard tunnels, DNS-based intent routing, and template-driven config — runs on a single OpenWRT LXC consuming around 128MB of RAM. Every config file is generated from two YAML files and a set of Jinja2 templates. The CI/CD pipeline renders, deploys, and applies changes on every push. Adding a domain to a VPN tunnel is genuinely a one-line YAML change followed by git push.
If you're running a homelab and haven't segmented your network yet, VLANs are the single highest-value change you can make. And if you've already got VLANs but want per-domain routing control without client-side configuration, DNS-based policy routing is worth the setup time.