
How to Stop Outbound DDoS in ISP Networks

By Plamen Petkov
Fri Apr 10 2026
14 min read

(Without Breaking Users)

From automated tracing to per-subscriber firewall rules to a Quarantine BNG: what we deployed across ISP customer networks to contain botnet traffic from compromised CPEs

Published: April 2026 | Part 2 of 2


Fig. 1 — An ISP customer's upstream transit link returning to normal after containment.

Previously, in Part 1: In late March 2026, ISPs across India started calling us. Thousands of subscriber CPEs had been compromised by botnets (Katana, Kimwolf) and were launching massive outbound UDP floods. It looked like congestion. It was a security event. Flow analysis via NetSense Flow revealed the pattern — outbound UDP on ports 80, 443, and 0, concentrated toward Chinese ASNs. Part 1 covered how to detect it. This post covers how to stop it. Read Part 1: The DDoS Was Coming From Inside the Network

The Gap Between Seeing It and Stopping It

Once our ISP customers could see the attack in their flow data, the next question was always the same: "Great, we can see it. Now how do we trace it to specific subscribers and stop it without taking down the whole network?"

This is where most ISPs get stuck. Detection is the easy part. The hard part is the tracing chain: from a destination IP at the border, through NAT pools, to a specific BNG, to an individual subscriber session. It requires correlating data across multiple systems, and during a live incident, speed matters.
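To make the NAT hop of that chain concrete, here is a sketch of recovering a subscriber's pre-NAT address from a conntrack entry. The sample line and all addresses are hypothetical; on a real BNG the input would come from conntrack -L, and output formats vary:

```shell
# Sketch: map a post-NAT (public) ip:port seen at the border back to the
# subscriber's pre-NAT address. Sample conntrack-style line (addresses made up)
# so the parsing logic can run offline instead of on a live BNG.
sample='udp 17 28 src=100.64.12.7 dst=47.94.172.112 sport=40112 dport=80 src=47.94.172.112 dst=203.0.113.9 sport=80 dport=53021'

# The first src= is the subscriber's pre-NAT address; the reply-direction
# dst=/dport= pair carries the public NAT ip:port visible in border flow data.
lookup_subscriber() {
  local pub_ip="$1" pub_port="$2"
  echo "$sample" | awk -v ip="dst=$pub_ip" -v port="dport=$pub_port" \
    '$0 ~ ip && $0 ~ port { sub(/^src=/, "", $4); print $4 }'
}

lookup_subscriber 203.0.113.9 53021   # prints 100.64.12.7
```

The same one-liner logic scales to the full table when piped from conntrack -L, which is exactly the kind of step worth automating.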

Automated Tracing: From Border IP to Subscriber

We built SensyAI as a playbook automation layer — not a magic AI box, but a system that can chain together the steps a skilled operator would take, faster and more consistently. Think of it as a senior NOC engineer who never forgets a step, never misreads a NAT table, and can SSH into 20 BNGs simultaneously. We deploy it at ISP customer sites as part of the NetSense stack.

What SensyAI actually does


Fig. 2 — SensyAI scanning a border router IP at an ISP customer site. It classified five flows: #1 is CRITICAL (197 MB UDP to Alibaba, 569 sources, ramping). #2-4 are benign (WhatsApp calls, VPN). It offers to block the attack destination at the border.

The tracing chain: border IP -> NAT pool -> BNG -> individual subscriber session.


Fig. 3 — End of the chain: user 'iqbal.38' on ppp4178. tcpdump shows 500 packets in under 1 second, ALL UDP 1400-byte to 47.94.172.112:80 (Alibaba Cloud). Full line rate on a 100 Mbps plan. SensyAI offers to disconnect or rate-limit.

Why Netbox integration matters: Without a queryable source of truth mapping NAT pools to BNGs, this tracing step becomes manual spreadsheet archaeology. Every ISP has this mapping somewhere — the question is whether automation tools can access it programmatically. We integrate SensyAI with Netbox at every customer deployment for exactly this reason.
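As a sketch of what "queryable" buys you: the lookup below does in memory what a NetBox API query (filtering IPAM prefixes that contain a given IP) would do in production. The pool ranges and BNG names are invented for illustration:

```shell
# Hypothetical NAT-pool -> BNG mapping. In production this would come from
# NetBox (prefixes with a field naming the owning BNG), not a hardcoded list.
pools='203.0.113.0/26,bng-north-01
203.0.113.64/26,bng-south-02'

# Convert dotted-quad IPv4 to a 32-bit integer
ip2int() { local IFS=.; set -- $1; echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 )); }

# Find which BNG owns the pool containing a given public IP
bng_for_ip() {
  local ip_int cidr bng net bits base mask
  ip_int=$(ip2int "$1")
  while IFS=, read -r cidr bng; do
    net=${cidr%/*}; bits=${cidr#*/}
    base=$(ip2int "$net")
    mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
    if (( (ip_int & mask) == (base & mask) )); then echo "$bng"; return; fi
  done <<< "$pools"
}

bng_for_ip 203.0.113.70   # prints bng-south-02
```

The point is not the ten lines of bash — it is that the mapping is machine-readable at all, instead of living in a spreadsheet.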

Why the BNG Platform Matters More Than ISPs Think

Everything described above (tcpdump on subscriber interfaces, NAT table queries, per-subscriber nftables rules, SSH access for automation) depends on one thing: a BNG that lets you do all of it. During a live DDoS event with millions of PPS from thousands of subscribers, the BNG needs to survive the load and remain inspectable.

Several of our ISP customers run software-based BNGs on Linux. This gives them capabilities that proprietary hardware platforms simply do not expose — and it was a decisive advantage during the March 2026 incident:

| What you need during DDoS | Software BNG on Linux | Typical hardware BNG |
|---|---|---|
| Packet capture on subscriber interface | tcpdump on any ppp interface | Not available or limited |
| NAT table query | conntrack -L, programmable | Vendor CLI, slow, incomplete |
| Per-subscriber firewall | nftables with meters, raw table | Static ACLs, limited granularity |
| Automation access | SSH + any Linux tool | Vendor API, if it exists |
| Survive extreme PPS | XDP/eBPF, kernel bypass | Fixed ASIC, may drop sessions |
| Scale during attack | Add instances horizontally | Buy bigger box |

What compromised CPEs look like (via NetSense Pulse)

NetSense Pulse collects real-time CPE telemetry: traffic, PPS, CPU, memory. At ISP sites where Pulse is deployed, compromised devices showed unmistakable signatures during the incident:


Fig. 4 — Bursty pattern: outbound traffic spikes to 400+ Mbps, drops, spikes again. Packet rate mirrors. Classic botnet attack-idle-attack cycle.


Fig. 5 — CPU spikes to 100% during each attack burst. The device is not forwarding traffic — it is generating it. Malware running on the device itself.
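The bursty signature in Fig. 4 lends itself to a mechanical check. The toy heuristic below uses synthetic samples and an invented threshold — it is not NetSense Pulse's actual detection logic, just the shape of it:

```shell
# Synthetic per-minute outbound Mbps for one CPE, mimicking the
# attack-idle-attack cycle described above (all values made up)
samples="2 410 3 395 1 420 2"

# Count how often traffic swings above an attack-level threshold
bursts=0
for s in $samples; do
  if (( s > 300 )); then bursts=$((bursts + 1)); fi
done

# Repeated high-rate bursts from a residential CPE -> flag for investigation
if (( bursts >= 3 )); then echo "bursty: candidate compromised CPE"; fi
```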

The Firewall Rules: What We Helped Deploy and What We Learned

Working with our ISP customers, we deployed a layered nftables ruleset on their BNG's FORWARD_GUARD chain in the raw table — processing packets before conntrack. This is important: during a flood, you want to drop bad packets before they consume CPU on connection tracking or NAT.

What follows are the actual rules running in production at customer sites, along with the reasoning and the mistakes we made getting here.
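For the rules below to attach anywhere, the table and base chain have to exist first. The customer configs do not show this setup, so the hook and priority here are our assumptions — though named priority "raw" (-300, before conntrack) matches the stated design:

```shell
# Assumed one-time setup (not taken from the customer configs): a base chain
# that sees forwarded subscriber packets before conntrack runs.
nft 'add table ip raw'
nft 'add chain ip raw FORWARD_GUARD { type filter hook prerouting priority raw; policy accept; }'
```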

Layer 1: Hard drops (the obvious stuff)

Invalid conntrack states, NULL scans, impossible flag combos (SYN+FIN, SYN+RST), and all port 0 traffic. These should never exist. No tuning needed — just drop them.

bash
# --- hard drop: invalid / weird packets ---
nft 'add rule ip raw FORWARD_GUARD iifname "ppp*" ct state invalid counter drop comment "invalid ct"'
nft 'add rule ip raw FORWARD_GUARD iifname "ppp*" tcp flags & (fin|syn|rst|psh|ack|urg) == 0x0 counter drop comment "NULL scan"'
nft 'add rule ip raw FORWARD_GUARD iifname "ppp*" tcp flags & (fin|syn) == (fin|syn) counter drop comment "SYN+FIN"'
nft 'add rule ip raw FORWARD_GUARD iifname "ppp*" tcp flags & (syn|rst) == (syn|rst) counter drop comment "SYN+RST"'

# --- hard drop: port 0 ---
nft 'add rule ip raw FORWARD_GUARD iifname "ppp*" tcp sport 0 counter drop comment "tcp sport 0"'
nft 'add rule ip raw FORWARD_GUARD iifname "ppp*" tcp dport 0 counter drop comment "tcp dport 0"'
nft 'add rule ip raw FORWARD_GUARD iifname "ppp*" udp sport 0 counter drop comment "udp sport 0"'
nft 'add rule ip raw FORWARD_GUARD iifname "ppp*" udp dport 0 counter drop comment "udp dport 0"'

Layer 2: ICMP + amplification ports

ICMP: 100 pps/subscriber (burst 200). Amplification ports (chargen, QOTD, DNS, NTP, SSDP, memcached): 500 pps. Legitimate use fits easily within these limits.

bash
# --- ICMP limiter ---
nft 'add rule ip raw FORWARD_GUARD iifname "ppp*" ip protocol icmp meter cust_icmp { ip saddr limit rate over 100/second burst 200 packets } counter drop comment "ICMP flood"'

# --- amplification ports ---
nft 'add rule ip raw FORWARD_GUARD iifname "ppp*" udp dport { 17, 19, 53, 123, 1900, 11211 } meter cust_amp { ip saddr limit rate over 500/second burst 300 packets } counter drop comment "amp"'

Layer 3: New flow control (this is the important one)

Limiting new TCP connections (SYN) to 200/s and new UDP flows to 500/s per subscriber. This directly counters the bursty attack pattern — each burst needs new flows, and capping flow creation rate chokes the burst before it builds.

bash
# --- new flow control: important for burst attacks ---
nft 'add rule ip raw FORWARD_GUARD iifname "ppp*" ct state new tcp flags & (syn|ack) == syn meter cust_tcp_new { ip saddr limit rate over 200/second burst 300 packets } counter drop comment "new TCP conn limit"'
nft 'add rule ip raw FORWARD_GUARD iifname "ppp*" ip protocol udp ct state new meter cust_udp_new { ip saddr limit rate over 500/second burst 300 packets } counter drop comment "new UDP flow limit"'

Layer 4: Port-aware rate limiting

This is where we spent the most time tuning — and where we got it wrong initially.

What we got wrong first: Our initial deploy at one ISP customer set UDP/443 to 1,000 pps, same as TCP/80. This broke QUIC for heavy users — video calls, Google services, YouTube uploads. We had to back off to 4,000 pps for port 443 within hours. The lesson: QUIC is real traffic now, and you cannot treat it the same as port 80 UDP (which has almost zero legitimate use). We also initially forgot to exempt speed test traffic on port 8080, which generated a wave of "my speed test is broken" tickets before we added the relaxed rule.

Final thresholds after tuning:
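The full per-port threshold table is customer-specific, but the shape of the rules follows from the numbers above. This sketch uses only the values mentioned in this post (4,000 pps for UDP/443 after the QUIC incident, 1,000 pps for UDP/80, an exemption for the port 8080 speed test traffic); the burst sizes and everything else are illustrative assumptions, not the production config:

```shell
# --- port-aware limits (illustrative sketch; only the 443/80/8080
# --- values come from the tuning story above) ---
nft 'add rule ip raw FORWARD_GUARD iifname "ppp*" tcp dport 8080 accept comment "speedtest relaxed"'
nft 'add rule ip raw FORWARD_GUARD iifname "ppp*" udp dport 443 meter cust_quic { ip saddr limit rate over 4000/second burst 500 packets } counter drop comment "QUIC limit"'
nft 'add rule ip raw FORWARD_GUARD iifname "ppp*" udp dport 80 meter cust_udp80 { ip saddr limit rate over 1000/second burst 200 packets } counter drop comment "UDP/80, near-zero legit use"'
```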

Layer 5: Total PPS safety net

Hard cap: 8,000 pps per subscriber. No matter what port or protocol. For a 100 Mbps plan, this is more than sufficient for any legitimate activity. It is the last line of defense.

bash
# --- total PPS safety net ---
nft 'add rule ip raw FORWARD_GUARD iifname "ppp*" meter cust_total { ip saddr limit rate over 8000/second burst 1500 packets } counter drop comment "total PPS"'

nft 'add rule ip raw FORWARD_GUARD return'
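A quick sanity check on that cap, assuming the 1,400-byte packets seen in the attack: 8,000 pps at full packet size is roughly the line rate of a 100 Mbps plan, so bulk uploads can still fill the pipe while flood traffic hits the ceiling:

```shell
# 8000 pps * 1400 bytes * 8 bits, expressed in Mbps (integer arithmetic)
echo "$(( 8000 * 1400 * 8 / 1000000 )) Mbps"   # prints 89 Mbps
```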

The Quarantine BNG: Don't Punish Everyone for the Botnet's Sins

Here is the problem with per-subscriber rate limits on production BNGs: they work too well. They stop the floods, but they also clip legitimate power users. YouTube creators uploading 4K video. Twitch streamers. Engineers pushing large files. Households running NAS cloud sync. On a high-speed fiber network where plans go up to 400 Mbps, these users exist in significant numbers — and they chose fiber broadband specifically because they use the upload.

Permanently rate-limiting all subscribers to contain a botnet defeats the ISP's value proposition. The architecture we recommend — and have deployed at several customer sites — is a Quarantine BNG.

How it works: subscribers confirmed as compromised are moved onto a dedicated, strictly rate-limited BNG, while the production BNGs keep relaxed rules for everyone else.

The business case: Rate-limiting all 200K subscribers on a BNG to stop 47 compromised ones is like canceling every flight because one passenger is disruptive. The Quarantine BNG isolates the problem. Production users never notice. And quarantined subscribers are incentivized to clean their devices, which improves network hygiene over time.
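The steering decision itself can be as simple as a list membership check at session setup. The usernames (iqbal.38 aside, which appears in Fig. 3) and BNG names below are illustrative; in a real deployment this decision would live in RADIUS policy, not a flat file:

```shell
# Hypothetical quarantine list; in production, RADIUS policy would hold this
quarantined='iqbal.38
demo.user.2'

# Decide which BNG a subscriber session should land on
target_bng() {
  if grep -qxF -- "$1" <<< "$quarantined"; then
    echo "bng-quarantine"
  else
    echo "bng-production"
  fi
}

target_bng iqbal.38    # prints bng-quarantine
target_bng clean.user  # prints bng-production
```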

What still is not solved

We do not have everything figured out — and neither do our customers. Pre-attack detection, CPE firmware hygiene, and plan-specific dynamic thresholds all remain open problems.

What Every ISP Operator Should Take From This

If you operate a broadband network of any size in 2026, outbound DDoS from compromised subscriber devices is not a hypothetical. It is happening. Based on what we saw across multiple ISP networks in March 2026, here is what matters:

When upload exceeds download, look at the flows. Stop troubleshooting infrastructure. In consumer broadband, upload inversion is almost always a security event.

You need flow visibility, not just SNMP. Interface utilization tells you that something is wrong. Flow data tells you what. Without it, you are guessing.

Automate the tracing chain. Border IP -> NAT pool -> BNG -> subscriber. If this takes your team 30+ minutes manually, you need to invest in tooling and a queryable IPAM.

Your BNG must survive the storm AND be inspectable. If you cannot tcpdump a subscriber interface, query NAT tables, or apply per-subscriber firewall rules during a live attack, your BNG is a liability.

Layered firewall rules beat blanket blocking. Port-aware, per-subscriber rate limits in nftables raw table. Tune them carefully — what breaks QUIC users is different from what breaks port 80 floods.

Quarantine compromised subscribers, do not penalize everyone. Move confirmed infected users to a dedicated BNG. Relax production rules. Protect the experience of paying customers who chose your network for speed.

Be honest about what you have not solved. Pre-attack detection, CPE firmware hygiene, dynamic plan-specific thresholds. This is a moving target, not a checkbox.

Closing Thought

The biggest shift this incident forced — for us and for our ISP customers — was a mindset change. For years, ISP NOCs were trained to think about DDoS as something that happens to them — inbound attacks on their infrastructure. March 2026 taught everyone that the ISP network is the attack infrastructure. Subscribers' compromised devices are the weapons. And the first symptom does not look like a security event — it looks like congestion.

The ISPs that will handle this well are not necessarily the ones with the biggest scrubbing contracts or the most bandwidth. They are the ones who can see the pattern fastest, trace it to the source fastest, and contain it without punishing their paying customers. Speed of diagnosis is the competitive advantage. Everything else follows from that.

That is what we build NetSense for. And March 2026 was the hardest test it has faced.

"A link graph tells you that you have a problem. Flow data tells you what the problem is. And a good tracing chain tells you exactly where it is coming from. The rest is execution."

Missed Part 1? Read The DDoS Was Coming From Inside the Network — how we helped ISPs detect outbound DDoS that looked like congestion.

NetSense NMS by Extreme Labs provides the monitoring, flow analytics, CPE telemetry, and automation capabilities described in this post.

Ready to see NetSense in action?

Book a live demo and see how NetSense NMS gives your ISP full-stack visibility across OLTs, switches, and customer CPE.

Book a Live Demo