[Mimedefang] right value for MX_MAXIMUM

Dianne Skoll dfs at roaringpenguin.com
Wed Jun 24 09:52:22 EDT 2015


On Wed, 24 Jun 2015 05:43:20 -0600
"Muthu N.C" <ncmuthu at gmail.com> wrote:

> In a day I am getting the below error multiple times. By looking at
> the md-mx-ctrl rawstats, all the 80 slaves are busy at that time. We
> are receiving around 500 e-mails in a minute.

500/minute is about 720K per day, which is reasonably high for one machine.

> How many slaves should be set for this workload?

Well, there's no fixed answer.  There are basically four possible
bottlenecks to MIMEDefang performance.  You need to find the bottleneck
and remove it.  Making changes to non-bottleneck parameters won't help.

The bottlenecks are:

1: CPU power.  You need a fast CPU for this sort of workload.  On our
Hosted CanIt cluster, the typical scanner uses 2.3GHz Intel Xeon
E5-2630 CPUs containing 24 cores total (that's what Linux reports,
counting hyperthreading).  If CPU power is the bottleneck, "top" will
show high %sy and %us percentages.  If those add up to close to 100,
then the CPU is underpowered.

2: Memory.  We figure on 50MB/slave, but of course it depends on the
complexity of your filter.  80 scanners should fit in 4GB of RAM.  On
Hosted CanIt, our typical scanning node has between 24 and 48GB of
RAM.  You can never have too much RAM.  You can tell if you're short
of RAM by checking swap statistics (si and so) using vmstat.
Actually, if you are swapping, your machine will be impossibly slow
and you'll definitely notice that.

3: Disk.  If your disk bandwidth is too low, you're sunk.  You need to
increase it or run multiple scanning machines in parallel.  Our typical
Hosted CanIt scanner contains 8 disks using Linux software RAID-10, which
gives about 4x the write and 8x the read performance compared to a
single disk.  You can tell that disk is the bottleneck if "top" reports
a high "%wa" (wait-state) and you see a bunch of processes in
disk-wait ("D") state.

4: Network.  If your filter does DNS lookups and answers take a while
to come in, scanning processes will back up.  Make sure you use a fast
local caching name server.  You can tell the network is the bottleneck
because the computer will have lots of idle time, very few
wait-state processes, and lots of sleeping MIMEDefang scanners.  If
slow DNS is the problem, tracing (using "strace") on one of the
scanners will show a bunch of "sendto" and "recvfrom" system calls on
port 53 with the recvfrom calls stalling for a while.  (That's
assuming you run Linux, of course.)
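
The four checks above can be sketched as a small decision function.
This is only an illustration: the function name and every threshold
in it are assumptions chosen for the example, not values from
MIMEDefang or from this post.  The inputs are the percentages and swap
rates you would read off "top" or "vmstat" while the scanners are all
busy.

```python
def classify_bottleneck(us, sy, wa, si, so, idle):
    """Guess the likely bottleneck from vmstat/top-style readings.

    us, sy, wa, idle are CPU percentages; si and so are swap-in and
    swap-out rates.  Thresholds are illustrative assumptions only.
    Take the readings while all scanners are busy.
    """
    # Any sustained swapping points at memory; check it first because
    # swapping also inflates wait-state time.
    if si > 0 or so > 0:
        return "memory"
    # %us + %sy close to 100 means the CPU is underpowered.
    if us + sy >= 90:
        return "cpu"
    # High wait-state suggests processes stuck in disk-wait ("D").
    if wa >= 20:
        return "disk"
    # Lots of idle time with busy scanners suggests they are sleeping
    # on something external, such as slow DNS answers.
    if idle >= 50:
        return "network"
    return "unclear"


if __name__ == "__main__":
    print(classify_bottleneck(us=70, sy=25, wa=1, si=0, so=0, idle=4))
```

A "network" result is only a hint; confirming it still means tracing
a scanner with "strace" as described above.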

So I need to know your hardware details and then I can recommend an
MX_MAXIMUM value.
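
As a rough starting point, the numbers in this thread can be turned
into two back-of-the-envelope estimates.  This is a sketch under
stated assumptions, not a recommended MX_MAXIMUM: the function names,
the RAM reserve, and the average scan time are hypothetical inputs you
would have to measure yourself.  The only figures taken from above are
50MB per slave and 500 messages per minute.

```python
import math


def memory_ceiling(ram_mb, reserve_mb, per_slave_mb=50):
    """Most slaves that fit in RAM after reserving some for the OS.

    per_slave_mb=50 is the rule of thumb quoted above; reserve_mb is
    an assumption you choose for everything else on the machine.
    """
    return max(0, (ram_mb - reserve_mb) // per_slave_mb)


def slaves_needed(msgs_per_minute, avg_scan_seconds):
    """Average number of busy slaves at a steady arrival rate.

    Uses arrival rate x time per message (Little's law).  It ignores
    bursts, so treat the result as a floor, not a target.
    """
    return math.ceil(msgs_per_minute * avg_scan_seconds / 60)


if __name__ == "__main__":
    # 500 messages/minute, as in the question.
    print(500 * 60 * 24)               # messages per day
    # Hypothetical 6-second average scan time.
    print(slaves_needed(500, 6))
    # Hypothetical 4096MB machine with 1024MB kept for the OS.
    print(memory_ceiling(4096, 1024))
```

If the demand estimate is well above the memory ceiling, adding
slaves will not help; the bottleneck has to be removed first.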

Regards,

Dianne.


