[Mimedefang] graphdefang 0.9, --trim, memory usage, other comments

Kevin A. McGrail kmcgrail at pccc.com
Tue Oct 28 18:24:35 EST 2003


> I'm running graphdefang 0.9 to graph a data-set of roughly 800,000 to
> 1,200,000 mails per day, depending on the day.  For the most part, I'm
> pretty happy with it.  I do have some comments and questions.
>
> First the comments.
>
> I've made some modifications to the source and have added an "average"
> keyword that functions similarly to summary, but tracks an average over
> time.  If there is interest in integrating upstream, I'd be more than
> happy to take a diff and send it on.
>
> Running with --trim doesn't draw the graphs.  This means that I'm having
> to run graphdefang twice every time I need the graphs built.  After I
> think more about how I want to do it, I'm going to modify it so that
> graphdefang continues graphing after it trims, rather than exiting.  If
> there's interest, I'll forward a patch for that, too.
>
> graphdefang 0.9 can run with one data file or multiple data files.  I've
> added the ability to run without any data files and draw graphs based only
> on the .db file.  I'm also going to add the ability to import data from
> multiple .db files.  Right now, I am pre-processing .db files from several
> machines into a unified file and then running graphdefang against the
> unified .db to create a single graph.  I need to cleanly integrate the
> pre-processor into the graphdefang framework.  Again, if there is
> interest, I will forward that patch as well.
>
> I've made several other changes (configurable summary db name, summary db
> path, and config file name).  If there's interest...

These sound awesome.  I for one would be very interested and I know that the
author, John Kirkland, has been very interested and responsive to feedback.

> As others have mentioned, memory usage is a problem.  I'm regularly seeing
> in excess of 1 GB of RAM in use when graphdefang runs.  I'm only 12 days
> into my data-set and have no idea how much more will be eaten when I get a
> full month's worth of data to graph.  Any ideas for reducing memory usage?
> One that I've had is to stop loading all of the data from the tie() and
> work off disk instead, which would have the unfortunate side effect of
> slowing down processing.  Anything else?

This is where my theory comes in: an RDBMS is the only way to scale
graphdefang to systems this large.  I have to believe that the query
engine, memory handling, etc. of an RDBMS are tailor-made for these
issues.
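
To make the idea concrete, here is a rough sketch of what I mean, not
anything that exists in graphdefang today.  It uses Python's sqlite3 module
as a stand-in for whatever RDBMS you prefer, and the table name (events),
the columns (ts, event), and the function names are all made up for
illustration.  The point is only that the database does the per-hour
counting, so the grapher holds the summary rows in memory rather than every
logged event.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) the hypothetical event store."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events ("
        " ts INTEGER NOT NULL,"   # Unix timestamp of the logged event
        " event TEXT NOT NULL)"   # e.g. 'spam', 'virus' (illustrative values)
    )
    # An index on ts keeps time-range queries from scanning the whole table.
    conn.execute("CREATE INDEX IF NOT EXISTS events_ts ON events (ts)")
    return conn


def record(conn, rows):
    """Insert an iterable of (ts, event) tuples in one transaction."""
    with conn:
        conn.executemany("INSERT INTO events (ts, event) VALUES (?, ?)", rows)


def hourly_counts(conn, start, end):
    """Yield (hour_start, event, count) for start <= ts < end.

    The grouping happens inside the database; rows are streamed from the
    cursor one at a time instead of being loaded into a big in-memory hash.
    """
    cur = conn.execute(
        "SELECT (ts / 3600) * 3600 AS hour, event, COUNT(*)"
        " FROM events WHERE ts >= ? AND ts < ?"
        " GROUP BY hour, event ORDER BY hour, event",
        (start, end),
    )
    for row in cur:
        yield row
```

With something along these lines, a month of data costs disk space rather
than RAM, and the graphing code only ever sees one row per event type per
hour for the window being drawn.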

Regards,
KAM
