
Optimization of DNS Resolvers in the HTTP/3 Era: Mitigating the Impact of HTTPS (SVCB) Queries on Cache Performance


The widespread adoption of HTTP/3 and QUIC has reshaped the profile of DNS traffic. Modern browsers and operating systems generate a growing volume of HTTPS queries (Type 65) to discover service parameters before establishing new connections. In local recursive resolvers such as Unbound, this behavior exposes a structural inefficiency: authoritative servers operated by large providers (Google, Amazon, Cloudflare) commonly publish very low negative TTLs, around 60 seconds.

This document presents a forensic diagnosis of a degraded Cache Hit Rate and details an optimization strategy based on adjusting the negative TTL. The change raised the Cache Hit Rate from about 30% to about 60%, with observable improvements in latency and fewer unnecessary recursions.


1. Introduction: The New Browsing Standard

The evolution of the Internet ecosystem toward more efficient transport protocols led the IETF to standardize SVCB and HTTPS records through RFC 9460. These records allow clients to negotiate critical parameters—such as ALPN—during DNS resolution, reducing latencies associated with additional TCP/TLS negotiation rounds.

However, this edge-oriented, intelligent approach introduces an operational cost for intermediate resolvers. As Bert Hubert (PowerDNS) warns:

“DNS has become the control plane for content delivery optimization, but this frequently conflicts with the efficiency of local caching.”


2. The Issue Identified

In a production environment running Unbound 1.19.x with a strict privacy policy, monitoring revealed anomalous behavior:

  • Symptom: The total.num.cachemiss metric remained persistently high under moderate load.
  • Empirical data: Cache Hit Rate remained steady around 30.8%, unusually low for an environment with consistent browsing patterns.
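The hit rate quoted above can be derived from the counters that unbound-control prints as key=value lines. The following is a minimal sketch, assuming the rate is defined as hits divided by hits plus misses; the sample counter values are made up for illustration and are not the measurements from this environment.

```python
def parse_stats(text):
    """Parse 'key=value' lines (the format printed by unbound-control stats)."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition("=")
        if not sep:
            continue  # skip lines that are not key=value pairs
        try:
            stats[key] = float(value)
        except ValueError:
            continue  # skip non-numeric values
    return stats


def cache_hit_rate(stats):
    """Return the cache hit rate as a percentage of hits + misses."""
    hits = stats.get("total.num.cachehits", 0.0)
    misses = stats.get("total.num.cachemiss", 0.0)
    total = hits + misses
    return 100.0 * hits / total if total else 0.0


# Illustrative counters only, chosen to give a rate close to 30.8%.
SAMPLE_OUTPUT = """\
total.num.queries=1000
total.num.cachehits=308
total.num.cachemiss=692
"""

sample_rate = cache_hit_rate(parse_stats(SAMPLE_OUTPUT))
print(f"Cache hit rate: {sample_rate:.1f}%")
```

In practice the text would come from the output of unbound-control stats_noreset rather than from a hard-coded string.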

2.1 Forensic Traffic Analysis

Using tcpdump with selective filtering, the traffic responsible for excessive recursion was isolated:

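sudo tcpdump -l -n -vvv -i eno1 port 53 | grep -Ei "HTTPS|Type65"

The -l flag makes tcpdump line-buffer its output so that grep sees packets as they arrive. The pattern also matches Type65, which is how tcpdump builds that do not know the HTTPS type name print it.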

Key findings:

  • Volume: Between 15% and 20% of queries were HTTPS (TYPE65).
  • Typical response: Most queried domains do not have this record → NOERROR/NODATA responses.
  • Trigger: The SOA record returned in the authority section carried a MINIMUM field of about 60 seconds.
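The volume figure above can be estimated from a saved capture by counting query types. This is a rough sketch that assumes each input line is a DNS query printed by tcpdump in its usual "id+ QTYPE? name." form; the sample lines are invented for illustration, and response lines should be filtered out beforehand because verbose output may echo the question.

```python
import re
from collections import Counter

# Matches the query type token that tcpdump prints before the name, e.g. " A? ".
QTYPE_PATTERN = re.compile(r"\s([A-Za-z0-9-]+)\?\s")


def qtype_shares(lines):
    """Return the percentage of queries per query type."""
    counts = Counter()
    for line in lines:
        match = QTYPE_PATTERN.search(line)
        if not match:
            continue
        name = match.group(1).upper()
        # Builds without the HTTPS type name print it as Type65.
        counts["HTTPS" if name == "TYPE65" else name] += 1
    total = sum(counts.values())
    if not total:
        return {}
    return {qtype: 100.0 * n / total for qtype, n in counts.items()}


# Invented sample lines, not taken from the real capture.
SAMPLE_LINES = [
    "IP 192.0.2.10.40001 > 192.0.2.1.53: 1001+ A? example.com. (29)",
    "IP 192.0.2.10.40002 > 192.0.2.1.53: 1002+ AAAA? example.com. (29)",
    "IP 192.0.2.10.40003 > 192.0.2.1.53: 1003+ Type65? example.com. (29)",
    "IP 192.0.2.10.40004 > 192.0.2.1.53: 1004+ A? example.org. (29)",
    "IP 192.0.2.10.40005 > 192.0.2.1.53: 1005+ A? example.net. (29)",
]

shares = qtype_shares(SAMPLE_LINES)
print(shares)
```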

According to RFC 2308, the TTL of a negative response is the smaller of the SOA record's own TTL and its MINIMUM field, and a resolver may not cache the answer for longer than that. Consequently, Unbound discarded its knowledge that the record was absent every 60 seconds, and the next client query for the same name triggered a new recursion.
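The RFC 2308 rule can be written in a few lines. This is a minimal sketch; the SoaRecord type and its field names are hypothetical and only model the two values that matter here.

```python
from dataclasses import dataclass


@dataclass
class SoaRecord:
    """The two SOA values relevant to negative caching (hypothetical type)."""
    ttl: int      # TTL of the SOA record as returned in the authority section
    minimum: int  # value of the SOA MINIMUM field


def negative_ttl(soa):
    """RFC 2308: negative TTL is the smaller of the SOA TTL and SOA MINIMUM."""
    return min(soa.ttl, soa.minimum)


# A zone whose SOA has a long TTL but a 60-second MINIMUM field.
print(negative_ttl(SoaRecord(ttl=900, minimum=60)))
```

A short SOA TTL therefore limits negative caching just as effectively as a short MINIMUM field does.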


3. Solution Strategy

The superficial solution would be to block TYPE65, but this would compromise modern standards and degrade QUIC-based experiences.

The correct approach is to proactively manage the negative TTL.

3.1 Theoretical Basis

The goal is to decouple provider-driven TTL policies—geared toward rapid changes—from the reality of a local network, where the absence of an HTTPS record is a stable and predictable state.

3.2 Implementation in Unbound

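The decision was made to override the negative TTL published by the authoritative server with a value more consistent with actual traffic patterns. In Unbound, cache-max-negative-ttl is only an upper bound (its documented default is already 3600 seconds), so on its own it cannot lengthen a 60-second negative TTL. The floor is what extends it: cache-min-ttl applies to cached negative answers as well as positive ones. Recent Unbound releases also document a dedicated cache-min-negative-ttl option that raises the floor for negative answers only; check the unbound.conf manual of the installed version before relying on it.

server:
    # Floor for cached TTLs. This is the setting that lifts the 60 s
    # negative TTL to one hour. It also holds positive answers for at
    # least an hour, which can serve stale data for fast-changing names.
    cache-min-ttl: 3600

    # Upper bound for negative answers (3600 is also the documented default).
    cache-max-negative-ttl: 3600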

The effect is clear: for a name that clients query continuously, a recursion that previously occurred up to 60 times per hour now happens only once.
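The arithmetic behind the "60 times versus once" figure can be sketched as follows. This is a simplified model, not Unbound's implementation: the floor and cap parameters are illustrative names for a minimum and maximum TTL applied by the resolver, and the worst case assumes the name is requested again as soon as each cache entry expires.

```python
import math


def effective_negative_ttl(authoritative_ttl, floor=0, cap=3600):
    """Clamp the upstream negative TTL into [floor, cap] (simplified model)."""
    return min(max(authoritative_ttl, floor), cap)


def upstream_lookups_per_hour(ttl_seconds):
    """Worst-case recursions per hour for one continuously queried name."""
    if ttl_seconds <= 0:
        raise ValueError("TTL must be positive")
    return math.ceil(3600 / ttl_seconds)


# A 60-second negative TTL from upstream, without and with a one-hour floor.
before = upstream_lookups_per_hour(effective_negative_ttl(60))
after = upstream_lookups_per_hour(effective_negative_ttl(60, floor=3600))
reduction = 1 - after / before

print(before, after, f"{reduction:.1%}")
```

Note that raising only the cap leaves the 60-second value unchanged in this model; the floor is what lengthens it.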


4. Results and Validation

After roughly 25 hours of continuous operation (time.up = 91085 seconds), the metrics reported by unbound-control stats_noreset showed:

Metric           | Pre-Optimization      | Post-Optimization        | Delta
HTTPS behavior   | Recursion every 60 s  | Negative caching 3600 s  | About 98% fewer recursions per name
Average latency  | >300 ms               | ≈276 ms (stable)         | More than 24 ms lower
Cache Hit Rate   | 30.82%                | 59.48%                   | +28.66 points
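The Cache Hit Rate row can be read two ways: as an absolute change in percentage points or as a relative factor. A small sketch of both, using only the two figures from the table:

```python
def hit_rate_change(before_pct, after_pct):
    """Return (change in percentage points, relative factor)."""
    points = after_pct - before_pct
    factor = after_pct / before_pct
    return points, factor


points, factor = hit_rate_change(30.82, 59.48)
print(f"{points:+.2f} points, x{factor:.2f}")
```

The factor comes out just under 2, so the hit rate nearly doubled.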

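The system processed 2,780 HTTPS queries and cached 3,823 NODATA responses. Without the optimization, each of those cache entries would have expired after 60 seconds and required a new recursion on the next request, which over a full day could have generated thousands of additional upstream queries.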


5. Conclusion

A permanent tension exists between the TTL policies of large providers and the needs of local networks. For performance-oriented administrators, accepting default values is no longer a viable option.

Optimizing through an extended negative TTL:

  • reduces network load,
  • stabilizes latency,
  • nearly doubles the Cache Hit Rate (30.82% to 59.48%),
  • and preserves compatibility with modern standards.

In the era of HTTP/3, DNS resolvers require proactive tuning to sustain operational efficiency.


References

  • RFC 9460: Service Binding and Parameter Specification via the DNS (SVCB and HTTPS Resource Records), 2023.
  • RFC 2308: Negative Caching of DNS Queries, 1998.
  • NLnet Labs: Unbound Configuration Documentation – Cache Tuning.
  • APNIC / G. Huston: DNS OARC Analysis on Query Volume & HTTPS RRs (2020).
  • Local analysis based on unbound-control metrics and tcpdump captures on Linux/Debian infrastructure.
