Ticket #102 (closed defect: fixed)

Opened 12 months ago

Last modified 11 months ago

disk traffic huge with ~200,000 records in MUA DB

Reported by: chris.franks@…
Owned by: Phil Smart
Priority: major
Milestone: v1.1.0
Component: Raptor MUA
Version: v1.0.1
Keywords: disk io caching
Cc: richard.james@…

Description

MUA setup using MySQL on localhost.

Disk traffic is ~50,000KB/s (our mailing list server is doing under 250 KB/s - 200x less).

The system still has free RAM, but cache usage fluctuates very quickly (up to 2.5 GB in cache, down to roughly 500 MB in 5 seconds, and back up again).

This is making everything slow (e.g. user input at the console can take 5-6 seconds to appear, and raptorwebd, when it's running, is very slow). We think this is due to disk latency (high IO wait value in the CPU monitoring graph).

I followed http://iam.cf.ac.uk/trac/RAPTOR/ticket/85 initially, thinking this was a JVM memory issue, but it's still happening with a limit of 512M (and only the MUA and database server running).

I've attached some munin monitoring graphs for the machine. Is this a common problem with MySQL? It seems to run OK when we clear the tables, but we get back up to ~160,000 records in about a week.

Any pointers for speeding things up would be appreciated.

Thanks.

Chris

Attachments

cpu-day.png (28.4 KB) - added by chris.franks@… 12 months ago.
cpu graph (white space is either pre-graphing or "too busy to graph")
memory-day.png (39.8 KB) - added by chris.franks@… 12 months ago.
memory usage (white space is either pre-graphing or "too busy to graph")
hourdebug.log (97.7 MB) - added by chris.franks@… 12 months ago.
an hour's log from the MUA on debug

Change History

Changed 12 months ago by chris.franks@…

cpu graph (white space is either pre-graphing or "too busy to graph")

Changed 12 months ago by chris.franks@…

memory usage (white space is either pre-graphing or "too busy to graph")

comment:1 Changed 12 months ago by smartp@…

Unfortunately our testing system uses Postgres rather than MySQL, and it does not suffer problems on this scale. I would need to do some testing with MySQL. The new version is more optimised in its DB access, but I still need to add some other things to it before it's released. If you're willing to attach the MUA log (on debug) for about a one-hour period to this ticket, I can see what queries are being performed and at what frequency.

Things have been a bit slow at our end lately, sorry.

Phil

Changed 12 months ago by chris.franks@…

an hour's log from the MUA on debug

comment:2 Changed 12 months ago by https://sid.kent.ac.uk/shibboleth!https://iam.cf.ac.uk/sp/shibboleth!zvh7tlm4lp9+gcgbtxeqnb6ldtw=

I see similar load problems with both MySQL and Postgres (I gave up on MySQL fairly quickly, but Postgres seems to run into the same performance issues eventually).

raptor=# select count(*) from shibbolethidpauthenticationevent ;
 count  
--------
 436057
(1 row)

raptor=# select count(*) from ezproxyauthenticationevent ;
  count  
---------
 2974634
(1 row)

And with an "idle" system:

$ vmstat 5
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 1  0 695204 113092    936 697764   94   30   710    89    3    4 42  2 39 16  0	
 1  0 695028 111852    944 697848  224    0   224    18 1110  248 49  1 47  3  0	
 1  0 695028 111472    952 697972    0    0    36    38 1125  251 48  2 50  0  0	
 1  0 695156 112288    960 698176  100   57   161    95 1415  838 47  2 49  2  0	
 1  0 694960 110676    968 698812  225    0   336     9 1109  269 48  1 46  4  0	
^C
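
For anyone comparing machines, the vmstat figures above can be boiled down to a few averages. The following is only a rough sketch in Python, not part of Raptor: the function name `summarize_vmstat` and the choice of columns (`si`, `so`, `bi`, `bo`, `wa`) are assumptions made for illustration. It skips the first sample by default, because the first line vmstat prints is an average since boot rather than a 5-second reading.

```python
# Sample copied from the vmstat output in this comment.
SAMPLE = """\
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 1  0 695204 113092    936 697764   94   30   710    89    3    4 42  2 39 16  0
 1  0 695028 111852    944 697848  224    0   224    18 1110  248 49  1 47  3  0
 1  0 695028 111472    952 697972    0    0    36    38 1125  251 48  2 50  0  0
 1  0 695156 112288    960 698176  100   57   161    95 1415  838 47  2 49  2  0
 1  0 694960 110676    968 698812  225    0   336     9 1109  269 48  1 46  4  0
"""


def summarize_vmstat(text, columns=("si", "so", "bi", "bo", "wa"), skip_first=True):
    """Average the chosen vmstat columns over all samples found in `text`."""
    header = None
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "r":
            # The column-name line starts with the "r" (runnable) column.
            header = fields
        elif header and len(fields) == len(header) and all(f.isdigit() for f in fields):
            rows.append(dict(zip(header, (int(f) for f in fields))))
    if skip_first:
        # The first sample is the since-boot average, not an interval reading.
        rows = rows[1:]
    if not rows:
        raise ValueError("no vmstat samples found")
    return {col: sum(r[col] for r in rows) / len(rows) for col in columns}


if __name__ == "__main__":
    for name, value in summarize_vmstat(SAMPLE).items():
        print(f"{name}: {value:.2f}")
```

Running the same summary on a capture taken while the MUA is busy would give figures that can be set side by side with the "idle" ones.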

comment:3 Changed 11 months ago by smith@…

  • Status changed from new to closed
  • Resolution set to fixed

Improved for v1.1.0 - will reopen if necessary.
