Wednesday, 11 April 2012

Exchange 2010 Management Pack Woes...

We've had quite severe performance issues on our Exchange 2010 implementation since rollout.  Basically we ran out of CPU.  Not nice.

The replication service on both mailbox servers slowly ramped up its CPU usage until each server hit 100%, which then caused all manner of nasty failures.

Throughout our troubled times we have:
  • Looked at the Exchange 2010 configuration with a fine-toothed comb
  • Had 3 "Exchange Experts" come in and check things out
  • Raised a PSS call with Microsoft
  • Raised a call with VMware (the servers are all virtual)
  • Increased RAM and CPU to ESX limits (almost - we only have 96GB in our hosts)
  • Tweaked registry settings
  • Moved from vSAS to pvSCSI and back again (pvSCSI has issues in high-IO environments! We need new pvSCSI drivers)
  • Rebuilt a second set of Mailbox servers
None of these things showed us the cause, or fixed the problem without our knowing the cause.  Until we made a configuration mistake.

The guys who rebuilt the environment failed to install the SCOM agent (at first).  These boxes showed no signs of CPU creep.  Once they realised the SCOM agent was missing they quickly installed it.  Suddenly we saw CPU ramping up, albeit slowly, over a period of 2-3 weeks.

So we turned off the agent and the CPU settled back down.
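For anyone wanting to run the same with-agent versus without-agent comparison, here is a rough sketch of how a slow creep could be told apart from normal noise: fit a straight line to CPU samples collected over a few weeks and look at the slope. This is only an illustration, not what we actually ran. The function names, the 1% per day threshold and the sample numbers are all made up for the example.

```python
from statistics import mean


def cpu_trend_per_day(samples):
    """Least-squares slope of CPU % against time in days.

    samples: list of (day, cpu_percent) pairs covering at least
    two distinct points in time.
    """
    days = [d for d, _ in samples]
    cpu = [c for _, c in samples]
    d_bar, c_bar = mean(days), mean(cpu)
    denom = sum((d - d_bar) ** 2 for d in days)
    if denom == 0:
        raise ValueError("need samples from at least two distinct times")
    return sum((d - d_bar) * (c - c_bar) for d, c in zip(days, cpu)) / denom


def is_creeping(samples, threshold_per_day=1.0):
    """True if CPU rises faster than threshold_per_day (an assumed cut-off)."""
    return cpu_trend_per_day(samples) > threshold_per_day


# Hypothetical weekly samples, invented purely for illustration.
with_agent = [(0, 20.0), (7, 35.0), (14, 52.0), (21, 66.0)]
without_agent = [(0, 20.0), (7, 22.0), (14, 19.0), (21, 21.0)]

print("with agent creeping:", is_creeping(with_agent))
print("without agent creeping:", is_creeping(without_agent))
```

A slope is more useful than a single reading here because the ramp takes weeks, so any one sample on its own looks unremarkable.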

We are now looking at removing the Exchange 2010 management pack from our environment to see whether the pack, rather than the SCOM agent itself, is the cause.

More news when I know.

20/07/2012

OK, so now we have an Exchange environment with NO live mailboxes, only a handful of test mailboxes.  With the SCOM agent installed we see the rising CPU.  Uninstall the SCOM agent and there is no rise.

Current situation is that we are not using SCOM to monitor our Exchange production servers.  Not good.  Microsoft don't seem to know what is going on either.




2 comments:

  1. We are getting this on 2 of our production Exchange 2010 boxes. Both are configured identically to servers in another site (VMware, same configs). It's definitely the SCOM agent. Did you ever get to the bottom of this?

  2. No. Sorry. I wish I had. I have since changed jobs, so I do not know if this is now fixed, but when I left (2013) they were still running without SCOM monitoring Exchange 2010.
