PowerMTA Experts

PowerMTA service

When mail stops landing, we find out why — fast.

PowerMTA troubleshooting is incident work: a sudden delivery drop, a provider block, a blacklisting or broken authentication. We triage the blast radius, read the logs and bounce codes until the real cause surfaces, stabilize the flow so mail moves again, then fix the source so the same problem does not return next week.

PowerMTA troubleshooting is incident work on a live delivery problem: a sudden drop, a provider block or deferral, a blacklisting or broken authentication. The method is to read the bounce codes the receiving servers return — 4xx is temporary, 5xx is permanent, and the text after the code names the cause — then stabilize the flow and fix the source so the same fault does not return. An incident has a clock on it that an audit does not: a blacklisting caught in an hour is a delisting that takes an afternoon, while the same one caught after a week needs weeks of careful sending to rebuild.

In short

  • A delivery incident has a clock on it: every hour the cause runs unfixed is reputation that takes weeks longer to rebuild, so speed of diagnosis is the whole game.
  • The bounce code is the evidence that cannot lie: 4xx is temporary and usually self-recovers, 5xx is permanent, and the text after the code names the actual cause.
  • Treating the symptom makes it worse — raising the send rate on a backing-up queue, or rewriting copy when an IP is blocklisted, accelerates the damage instead of fixing it.
  • Microsoft 550 5.7.515 is bulk-sender authentication enforcement; 550 5.7.606 is an IP blocklist that blocks the whole Microsoft ecosystem at once, and is best confirmed by cross-referencing SNDS.
  • Delisting is the easy last step and often free; the real work is fixing the root cause first, because a removal granted while the cause is live gets reversed within days.

A delivery incident has a clock on it that an audit does not. Every hour an IP sits on a blocklist, or a provider keeps deferring your queue, is reputation draining away and revenue not arriving — and the longer it runs, the longer the recovery. A blacklisting caught in an hour is a delisting that takes an afternoon. The same blacklisting caught after a week is a reputation that needs weeks of careful sending to rebuild. So troubleshooting is not leisurely diagnosis. It is the discipline of finding the real cause quickly, stopping the bleeding, and then making sure the wound actually closes.

The trap, under pressure, is to treat the symptom. Mail is going to spam, so someone rewrites the subject line. The queue is backing up, so someone raises the send rate. Both make things worse, because the symptom is rarely where the cause lives. The first job in any incident is to read what the receiving servers are actually telling you, in the one place they cannot lie about it: the bounce codes.

Where do you start when mail stops landing?

Every rejection carries a code, and the code is the fastest route to the cause — if you read all of it. A bare 550 tells you almost nothing; the enhanced status code that follows it, the 5.x.y string and the human-readable text after that, tell you almost everything. The first digit sorts the world into two: a 4xx is temporary and your server should retry, a 5xx is permanent and retrying only digs the hole deeper. Here are the codes that account for most PowerMTA incidents, and what each one is really pointing at.

Code | Type | What it usually means
421 | 4xx temporary | Rate-limited, server busy, greylisting, or a sudden complaint spike. Normally retries on its own, unless it keeps recurring from one provider, which points at reputation, not the address.
450 | 4xx temporary | Mailbox unavailable, full, or a transient block. Often greylisting on first contact with a recipient server.
550 5.1.1 | 5xx permanent | The recipient does not exist. Remove it from the list at once: repeated hits on dead addresses are one of the fastest ways onto a blocklist.
550 5.7.1 | 5xx permanent | A policy block: DMARC, SPF or DKIM failure, the recipient blocking you, or your IP on a blocklist. This is Gmail’s catch-all rejection.
550 5.7.26 | 5xx permanent | Unauthenticated mail: authentication is missing or not aligned with the From domain.
550 5.7.23 | 5xx permanent | A DMARC policy failure where your own record told the receiver to reject unauthenticated mail, and it did.
550 5.7.515 | 5xx permanent | Microsoft: the required authentication level is not met. Its bulk-sender enforcement code since May 2025.
550 5.7.606 | 5xx permanent | Microsoft’s IP blocklist code: it blocks the entire Microsoft ecosystem at once. Cross-reference Microsoft SNDS to confirm.
554 | 5xx permanent | Transaction failed, usually a spam-policy violation or a blocklist entry. Read the text after the code to narrow it.

Always read the full status string and the text after it; the same 550 carries very different causes.
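The first-digit rule is simple enough to write down as a few lines of shell. This is a minimal sketch, not part of PowerMTA: `classify_reply` is a hypothetical helper name, and the sample replies are invented for illustration.

```shell
#!/bin/sh
# classify_reply: sort one SMTP reply line by its first digit.
# Prints "delivered", "temporary", "permanent" or "unknown".
classify_reply() {
    case "$1" in
        2*) echo "delivered" ;;
        4*) echo "temporary" ;;   # the sending server is expected to retry
        5*) echo "permanent" ;;   # retrying will not change the answer
        *)  echo "unknown" ;;
    esac
}

# Sample replies, invented for illustration.
classify_reply "421 4.7.0 Try again later"
classify_reply "550 5.1.1 User unknown"
```

The first digit only sorts the reply into a bucket. The enhanced status code and the text after it still have to be read to find the cause.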

One pattern is worth committing to memory, because it redirects half of all incidents correctly on the first read. A recurring 421 or 450 from a single major provider is almost never about the individual address. It is that provider throttling or deferring your mail because of reputation, and the right next move is not to keep retrying harder but to check how that provider sees you — Google Postmaster Tools, Microsoft SNDS, Yahoo’s Sender Hub — before sending another message into the same wall. Gmail makes this explicit now: the same compliance problem appears first as a 421 warning and, since November 2025, escalates to a permanent 550 if it is not fixed.
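A rough way to spot that pattern is to count 4xx replies per recipient domain and flag any domain that crosses a threshold. The sketch below assumes the input has already been reduced to `domain,code` lines. The function name `recurring_deferrals`, the threshold of 3 and the sample data are all hypothetical.

```shell
#!/bin/sh
# recurring_deferrals: read "domain,code" lines on stdin and print every
# domain with at least $1 temporary (4xx) replies, with its count.
recurring_deferrals() {
    awk -F, -v min="$1" '
        $2 ~ /^4/ { count[$1]++ }
        END { for (d in count) if (count[d] >= min) print d, count[d] }
    ' | sort
}

# Sample data, invented for illustration.
printf '%s\n' \
    'gmail.com,421' 'gmail.com,421' 'gmail.com,450' \
    'example.org,450' 'outlook.com,250' |
    recurring_deferrals 3
```

A domain that shows up here is a reason to open that provider's reputation tooling before retrying, not a reason to clean the list.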

Triaging a bounce, from code to cause
Read the bounce code: the evidence that cannot lie.
  • 4xx, temporary (421 · 450 · rate-limit · greylist): usually retries on its own, but recurring from one provider means reputation, not the address.
  • 5xx, permanent: the cause decides the action.
  • 550 5.1.1: dead address → remove now
  • 550 5.7.1: policy block (DMARC/SPF/DKIM)
  • 550 5.7.515: Microsoft auth enforcement
  • 550 5.7.606: Microsoft IP blocklist → SNDS
A 4xx that keeps coming back is a 5xx in waiting: read the pattern, not the single line. The text after the code names the cause; the cause decides the fix.
Every triage starts at the code. A 4xx is temporary and usually self-recovers, but the same 4xx recurring from one provider is a reputation signal, not an address problem. A 5xx is permanent, and which 5xx it is decides the action: a dead address comes off the list immediately, a policy block sends you to authentication, and a Microsoft 5.7.606 sends you to SNDS to confirm the IP listing. The single most expensive mistake is reading one line instead of the pattern across many.
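The same decision path can be written as a lookup from enhanced status code to first action. This is a sketch of the mapping described on this page, not an exhaustive table: `next_action` is a hypothetical name, and any code not listed falls through to "read the text".

```shell
#!/bin/sh
# next_action: map an enhanced status code to the first thing to check.
next_action() {
    case "$1" in
        5.1.1)          echo "remove the address from the list" ;;
        5.7.1)          echo "check authentication first, then blocklists" ;;
        5.7.23|5.7.26)  echo "check SPF, DKIM and DMARC alignment" ;;
        5.7.515)        echo "check authentication against Microsoft's bulk-sender rules" ;;
        5.7.606)        echo "confirm the IP listing in Microsoft SNDS" ;;
        4.*)            echo "let it retry; if it recurs from one provider, check reputation" ;;
        *)              echo "read the text after the code" ;;
    esac
}

next_action 5.7.606
next_action 4.7.0
```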

How do you read a bounce code in practice?

When a message is rejected, your system receives a Delivery Status Notification — the bounce. It has three parts worth separating. There is the subject, usually something like “Mail delivery failed” or “Undelivered Mail Returned to Sender”, which tells you nothing. There is a human-readable explanation written by the receiving server, which is sometimes honest and sometimes deliberately vague. And there is the SMTP line itself — the numeric code, the enhanced status code, and the provider’s own message appended to it — which is the part that actually matters.

The mistake is to read the friendly sentence and stop. The friendly sentence might say “message blocked”; the code after it is the difference between a recipient who no longer exists, a DMARC alignment failure, and an IP on a blocklist — three problems with three completely different fixes. So the first thing we do on an incident is get to the raw responses, not the summarized ones: the full status string from the PowerMTA accounting and bounce logs, a handful of actual rejected-message headers, and the per-message timeline of attempts, deferrals and retries. A sanitized dashboard summary will tell you that delivery to a provider fell. The raw log tells you why, and the why is the whole job.
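Pulling the three parts of the SMTP line apart is mechanical. The sketch below assumes a single-line reply whose first three characters are the numeric code, followed by a space or a hyphen. `split_reply` is a hypothetical helper, and the sample reply text is invented.

```shell
#!/bin/sh
# split_reply: break one SMTP reply line into numeric code, enhanced
# status code (or "-" if absent) and the provider's free text.
split_reply() {
    reply=$1
    code=$(printf '%s' "$reply" | cut -c1-3)   # e.g. 550
    rest=$(printf '%s' "$reply" | cut -c5-)    # skip the space or hyphen
    case "$rest" in
        [245].[0-9]*.[0-9]*" "*)
            enhanced=${rest%% *}               # e.g. 5.7.606
            text=${rest#* } ;;
        *)
            enhanced="-"
            text=$rest ;;
    esac
    printf 'code=%s\nenhanced=%s\ntext=%s\n' "$code" "$enhanced" "$text"
}

# Sample reply, invented for illustration.
split_reply "550 5.7.606 Access denied, banned sending IP"
```

Multi-line replies would need to be joined first; this sketch does not attempt that.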

Reading the bounce pattern from PowerMTA's own logs
incident triage — acct-*.csv
# What are the Microsoft queues doing right now?
$ pmta show queues | grep -E "outlook|hotmail" | head
outlook.com   recipients 84120   status 550 5.7.606   blocked

# Confirm it is an IP block, not a content or auth issue: count the code
# (-F matches the dots literally, -h drops the file-name prefix)
$ grep -Fh "5.7.606" /var/log/pmta/acct-*.csv | wc -l
84120        # the whole Microsoft ecosystem at once; cross-check SNDS

# Is it isolated to Microsoft, or is Gmail deferring too? (read the pattern)
# Field 7 is assumed to hold the status here; the position depends on the
# record fields configured for the accounting file, so check the header row.
$ awk -F, '{print $7}' /var/log/pmta/acct-*.csv | sort | uniq -c | sort -rn | head
  84120 550 5.7.606     # Microsoft only
  41880 250 2.0.0       # Gmail still delivering normally
The accounting logs hold the truth a dashboard summarizes away. Here the pattern is unambiguous: every Microsoft recipient is hitting 550 5.7.606 — an IP blocklist that takes the whole ecosystem at once — while Gmail keeps delivering. That single fact rules out content and authentication as the cause and sends the next step straight to Microsoft SNDS, instead of wasting an afternoon rewriting copy that was never the problem.

What symptoms do senders call us for?

Incidents arrive described in plain language, not status codes, and these are the ones that fill the inbox. If one of them is happening to you right now, this is the page to start from.

  • A delivery rate that dropped and will not recover
  • A provider deferring or blocking with repeated 4xx or 5xx codes
  • An IP or domain that turned up on Spamhaus or another blocklist
  • Microsoft rejecting everything with 550 5.7.515 or 5.7.606
  • Queues backing up and the spool growing without an obvious cause
  • DKIM or SPF that broke after a DNS change or a domain move
  • Mail that used to inbox now landing in spam for the same recipients
  • A migration or config change that went sideways and needs unwinding

They look like different problems and often share a cause. A blacklisting, a wave of 550s and mail suddenly going to spam can all trace back to the same recycled spam trap that slipped into a list last week. Part of the work is refusing to treat the three as three separate fires.

How do you work a PowerMTA incident?

The method is the same whether the cause turns out to be a broken DNS record or a compromised account. First, triage: we establish the blast radius — which streams, which providers, since when — so the most damaging part gets attention first and we are not chasing a minor deferral while the main queue burns. Second, diagnosis: we read the bounce codes, the message headers, the PowerMTA logs and the reputation signals together, because any one of them alone can mislead, and the cause usually shows up where two of them agree.

Third, stabilize: we stop the bleeding. That might mean rerouting a stream to a clean source, rate-limiting into a provider that is deferring, pausing a send that is making reputation worse, or fixing the broken record that is failing authentication. The aim is to get legitimate mail moving again without deepening the damage. Only then comes the fourth step, root cause: with the immediate crisis contained, we resolve the underlying issue and close the gap so the same incident does not recur the following week. Skipping straight to step four while the queue burns, or stopping at step three and declaring victory, are the two most common ways an incident becomes a recurring one.
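The triage questions (which providers, how many, since when) can be answered from log data in one pass. The sketch below assumes input lines of the form `timestamp,domain,code` with ISO 8601 timestamps, which sort correctly as plain strings. The function name `blast_radius` and the sample rows are hypothetical; real accounting files would need reducing to this shape first.

```shell
#!/bin/sh
# blast_radius: read "timestamp,domain,code" lines on stdin and print, for
# each domain with permanent (5xx) failures, the failure count and the
# earliest failure time, worst-hit domain first.
blast_radius() {
    awk -F, '
        $3 ~ /^5/ {
            n[$2]++
            if (!($2 in first) || $1 < first[$2]) first[$2] = $1
        }
        END { for (d in n) print d, n[d], first[d] }
    ' | sort -k2,2nr
}

# Sample rows, invented for illustration.
printf '%s\n' \
    '2025-06-01T09:05:00Z,outlook.com,550' \
    '2025-06-01T09:00:00Z,outlook.com,550' \
    '2025-06-01T09:10:00Z,outlook.com,550' \
    '2025-06-01T10:00:00Z,gmail.com,250' \
    '2025-06-01T11:00:00Z,yahoo.com,554' |
    blast_radius
```

The earliest timestamp per provider is the "since when", and it is usually the fastest way to line an incident up against a deploy, a DNS change or a list import.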

A concrete shape makes the method less abstract. A B2B sender watches Gmail open rates halve overnight. The bounce logs show a climbing count of 550 5.7.1 from Gmail specifically, while Outlook and Yahoo look normal. That pattern — one provider, a policy code, a sudden onset — points away from the recipient list and toward authentication or reputation at Gmail alone. Postmaster Tools confirms it: domain reputation slid from high to low over three days. The trail leads to a new tool a marketing team wired up the week before, sending unaligned mail under the main domain and dragging its DMARC reputation down with it. The fix is not in PowerMTA at all — it is bringing that tool into alignment or off the primary domain. Read the codes and the per-provider pattern and the cause is obvious within an hour; skip them and the same incident gets misdiagnosed as a content or volume problem and “fixed” for a week to no effect.
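The alignment question in that example comes down to comparing the From domain with the domain that signed or sent the message. The sketch below is a deliberately naive version of a relaxed alignment check: it treats the last two labels as the organizational domain, which is wrong for suffixes such as co.uk, where a real check needs the Public Suffix List. `org_domain` and `is_aligned` are hypothetical names and the domains are placeholders.

```shell
#!/bin/sh
# org_domain: naive organizational domain, i.e. the last two labels,
# lower-cased. An assumption for illustration; not PSL-aware.
org_domain() {
    printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]' |
        awk -F. '{ if (NF >= 2) print $(NF-1) "." $NF; else print $0 }'
}

# is_aligned: succeed when the From domain ($1) and the DKIM d= or
# envelope sender domain ($2) share an organizational domain.
is_aligned() {
    [ "$(org_domain "$1")" = "$(org_domain "$2")" ]
}

for signer in mail.example.com vendor-mail.net; do
    if is_aligned "example.com" "$signer"; then
        echo "$signer: aligned"
    else
        echo "$signer: not aligned"
    fi
done
```

A third-party tool that signs with its own domain while using your From address lands in the second case, which is the situation described above.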

Which common fixes make an incident worse?

Under pressure, teams reach for the same handful of moves, and most of them deepen the problem. Sending slower feels prudent, but if the cause is a blacklisting or an authentication failure, a slower trickle of rejected mail is still rejected mail — you have changed the volume of the damage, not the damage. Rewriting the subject line treats a reputation block as if it were a content filter; the message never reached a content filter, so the rewrite changes nothing. Starting a fresh warmup on a damaged IP, before the cause is fixed, just teaches the providers the bad behavior again at a polite pace.

The most expensive wrong move is the one that feels most decisive: delisting and resuming. Submitting the removal form while the spam trap is still in the list, or the open relay is still open, gets you delisted for a few hours and relisted by the weekend — and now with a repeat-offender history that makes the next removal slower. Every one of these is an attempt to skip diagnosis. The reason we triage and read the logs before touching anything is precisely to avoid spending the incident’s most valuable hours making it worse.

Can you get us off a blacklist, and how does delisting really work?

Spamhaus is the listing that does the most damage, because it is queried billions of times a day and a single entry can silently reject your mail to Gmail, Outlook and half a prospect list at once. But “Spamhaus” is not one list. It is several: four IP lists combined under ZEN, plus a separate domain list, and each has a completely different cause and removal path. Under ZEN sit the SBL, a manually reviewed list of direct spam sources; CSS, an automated list targeting snowshoe and high-volume patterns; XBL, for compromised machines, open relays and proxies; and PBL, a policy list for IPs that should not be sending directly at all. Outside ZEN sits the DBL, for domain reputation. Treating them as one generic blacklist is how teams waste hours filing the wrong removal request.
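Working out which list an IP is on starts with a DNS lookup: the IPv4 octets are reversed, the ZEN zone is appended, and the 127.0.0.x answer identifies the list. The sketch below covers only the two offline steps. The return-code mapping reflects Spamhaus's published codes as we understand them and should be checked against current Spamhaus documentation. `reverse_ipv4` and `zen_list_for` are hypothetical names, and 192.0.2.10 is a documentation address.

```shell
#!/bin/sh
# reverse_ipv4: 192.0.2.10 -> 10.2.0.192, the form a DNSBL query uses.
reverse_ipv4() {
    printf '%s\n' "$1" | awk -F. '{ print $4 "." $3 "." $2 "." $1 }'
}

# zen_list_for: name the Spamhaus list behind a ZEN return code.
# Mapping is an assumption to verify against Spamhaus's own reference.
zen_list_for() {
    case "$1" in
        127.0.0.2)              echo "SBL" ;;
        127.0.0.3)              echo "CSS" ;;
        127.0.0.[4-7])          echo "XBL" ;;
        127.0.0.10|127.0.0.11)  echo "PBL" ;;
        127.255.255.*)          echo "query error, not a listing" ;;
        *)                      echo "unrecognized" ;;
    esac
}

reverse_ipv4 192.0.2.10
zen_list_for 127.0.0.3
```

The live lookup would be something like `dig +short "$(reverse_ipv4 192.0.2.10).zen.spamhaus.org" A`, run from your own resolver. Queries sent through large public resolvers may come back with an error code rather than a real answer.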

Here is the part most vendors will not put in writing: the delisting request itself is free, and for several of those lists it is self-service. Anyone charging a fee simply to submit a removal form is selling you something you could do yourself in minutes. The value is not the click. It is knowing which list you are on, fixing the root cause — because Spamhaus will refuse or reverse a removal while the underlying problem is still active, and can disable self-service delisting until it is satisfied — and rebuilding the reputation afterward. A modern Spamhaus listing is also frequently a signal of a deeper problem, a compromised account or an open relay, rather than a mere deliverability hiccup, so we treat the listing as a symptom and go looking for what caused it.

Timelines follow the list. A policy or automated listing can clear within minutes to a day once the cause is genuinely resolved; a manually reviewed SBL listing usually takes one to three business days; and a sending reputation that has actually been harmed takes longer still to rebuild, commonly several weeks of low-volume, high-engagement sending. The fastest delisting in the world does nothing if the behavior that caused it is still running.

What do the Microsoft 550 5.7.515 and 5.7.606 codes mean?

Microsoft incidents deserve their own note, because they fail differently from Gmail and they fail wide. Where Gmail’s rejections tend to be precise and well-documented, Microsoft’s are terser and often need cross-referencing against its Smart Network Data Services feed to understand at all. The one that empties a room is 550 5.7.606: Microsoft’s IP blocklist code, which does not block a single mailbox or domain but the entire Microsoft ecosystem at once — Outlook.com, Hotmail, Live and the business tenants behind them, all rejecting in the same moment.

That breadth makes a Microsoft listing both more alarming and, in one respect, more tractable: there is a defined path back through Microsoft’s own mitigation and sender support process, but it only works once SNDS shows the underlying reputation has actually recovered. We read SNDS first, fix what it is reacting to — complaint spikes, trap hits, a compromised stream — and then work the mitigation request, rather than firing off support tickets into a reputation that is still red. Microsoft also tends to punish the sending pattern that was shaped purely for Gmail, so part of a Microsoft rescue is often re-segmenting how you send into Outlook addresses specifically.

The backoff that makes it worse

There is one PowerMTA-specific failure that turns a small problem into an outage, and it deserves naming because it is so common in incident work. When a provider returns a soft 4xx, PowerMTA is supposed to ease off through its backoff rules and recover gently. If those rules are misconfigured (a pattern list that does not match the deferral string the provider is actually returning, or a retry window that loops instead of recovering), the engine keeps hammering a provider that already asked it to slow down. A brief rate limit becomes a sustained block, the spool grows, and what should have been a self-healing wobble becomes a queue explosion.

This is why a troubleshooting pass almost always includes a look at the smtp-pattern-list and the backoff and retry directives, even when the presenting symptom is something else. Connection limits matter here too: because providers defend their connection ceilings harder than their message rates, an estate pushing too many simultaneous connections earns 421s faster and recovers from them slower. Fixing the incident sometimes means fixing the reflex that the configuration has to a problem, not just the problem itself.
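A quick way to find a pattern list that no longer matches what providers send is to replay real deferral strings against it. The sketch below is a stand-in that uses plain extended regular expressions and `grep`. It is not PowerMTA's own matcher or its smtp-pattern-list syntax. `matches_backoff_pattern`, the two patterns and the sample replies are all hypothetical.

```shell
#!/bin/sh
# matches_backoff_pattern: succeed if the deferral text ($2) matches any
# extended regular expression in the pattern file ($1), ignoring case.
# Note: an empty line in the pattern file would match everything.
matches_backoff_pattern() {
    printf '%s\n' "$2" | grep -E -i -q -f "$1"
}

# Hypothetical patterns and replies, invented for illustration.
patterns=$(mktemp)
printf '%s\n' 'rate limit' 'too many connections' > "$patterns"

for reply in \
    '421 4.7.0 Too many connections from your IP' \
    '421 4.7.28 Unusual rate of unsolicited mail'
do
    if matches_backoff_pattern "$patterns" "$reply"; then
        echo "covered: $reply"
    else
        echo "NOT covered: $reply"
    fi
done
rm -f "$patterns"
```

Any reply reported as not covered is a deferral the engine would treat as an ordinary retry, which is how a brief rate limit turns into a sustained block.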

One fire, or the pattern behind it

Some incidents are genuinely one-off — a DNS change fat-fingered, a single list imported without cleaning. We put those out and step away, no retainer, no upsell. But a fair number turn out to be the visible part of something systemic: a backoff configuration that will keep converting deferrals into blocks, a list-hygiene process that will keep feeding traps to the engine, a reputation that has been sliding for a month. When that is what we find, we say so and let you decide whether ongoing monitoring is worth it. The honest version of incident response includes telling you when the incident is a symptom, and when it is the whole disease.

Staying fixed is its own deliverable. When an incident turns out to be systemic, we hand back a working estate together with the specific change that keeps it working — the corrected backoff list, the list-hygiene step that stops the trap hits, the monitoring threshold that would have caught this one a day earlier. The point of resolving an incident properly is that you should not have to meet the same one twice.

One last thing that matters in incidents specifically: hours. A delivery problem does not wait for business hours in your timezone, and an IP listed overnight is reputation lost by morning. Our team spans European, North American and Latin American hours, so an urgent incident reaches someone who can read a log and act, rather than a ticket that sits until the working day begins where we happen to be. In this work, speed is not a flourish — it is the difference between an afternoon’s delisting and a month’s rebuild.

FAQ

Incident questions

How fast can you start on an active incident?

Quickly. Flag it as urgent through the contact form or by email and we move on it, because with a live delivery problem every hour the cause runs unfixed is reputation you will spend weeks rebuilding. The single biggest accelerator is access to the evidence — the bounce logs, a few rejected message headers, and which providers are affected. With those in hand we can usually name the cause fast.

Do you take one-off emergencies, or only retainers?

Both. A large share of engagements start as a single fire we put out, and some stay exactly that. There is no requirement to sign a managed contract to get help with an incident. That said, plenty of incidents turn out to be the visible symptom of a systemic issue — a backoff rule, a list-hygiene gap, a reputation slide — and when that is the case we will tell you, so you can decide whether ongoing monitoring is worth it.

What if the problem turns out not to be PowerMTA?

It often is not, and that is fine — we diagnose the whole path, not just the engine. A delivery collapse is just as likely to be a DNS change that broke SPF, a list full of dead addresses, a content trigger, or a reputation slide that started weeks earlier. We follow the evidence to wherever the fault actually sits and tell you plainly, even when the answer is something outside the MTA entirely.

Can you get us off a Spamhaus blacklist?

Yes, but the honest part matters: the delisting request itself is free and, for some Spamhaus lists, self-service — anyone charging a fee just to click a removal form is selling you something you can do yourself. The real work, and the reason listings keep coming back for teams who skip it, is diagnosing which list you are on, fixing the root cause Spamhaus will check for, and rebuilding the reputation afterward. That is what we do; the form is the easy last step.

How long does recovery take once the cause is fixed?

It depends on the damage. A policy-list or automated listing can clear in minutes to a day once the underlying issue is resolved. A manually reviewed Spamhaus SBL listing typically takes one to three business days after you have genuinely fixed the cause. Rebuilding a sending reputation that has actually been harmed is slower — commonly several weeks of careful, low-volume, high-engagement sending, which is why catching an incident early is worth so much.

Will you just delist us and let us resume sending?

No, because that is how senders end up relisted within days. Spamhaus and the major providers will reverse or refuse a removal where the underlying problem is still active, and they can disable self-service delisting until they are satisfied it is fixed. We stabilize first, find and resolve the actual cause, then handle the removal — in that order — so the fix holds instead of buying you a quiet weekend before it returns.

We are mid-migration and delivery broke. Can you help?

Yes, and migration cutovers are a common source of incident calls — a VirtualMTA mapped to the wrong IP, DKIM selectors that did not move across, a reputation that was not carried over. We can stabilize the broken send, unwind whatever went in wrong, and either get the migration back on track or roll it back cleanly while we work out what happened.

Start with the audit.

Twenty-five points across authentication, reputation, infrastructure and compliance — a written assessment, no charge and no obligation. It tells both of us exactly what we are working with.