#IAMMEC Session Report: Exchange 2013 High Availability & Business Continuity

Simply put, Scott Schnoll is and always will be a legend among the Exchange crowd.  He always gives some great talks.

Let’s hop right into it:

  • Storage Enhancements
    • Multiple Databases per volume
      • The number of copies of each database must equal the number of database copies on each volume
        • Example of valid design:  Databases 1 through 4 are copied onto 4 volumes in an ‘Active/Standby/Standby/Lagged’ configuration
        • This also ties into Automatic Reseeding
      • Reseeds are much faster in this scenario – potential days shaved off
      • 20% reduction in passive copy IOPS
      • Allows for an increase in disk utilization (MSIT was able to go from 360 to 600 mailboxes on the same spindles)
      • Failover is ~10 seconds in the production environment!
      • Requirements:
        • 1 Partition
        • The number of database copies per volume must equal the copies of each database
          • This also ties into Automatic Reseeding
      • Best Practices
        • Same neighbors on all servers
        • Balance your Activation Preferences
    • Automatic Reseeding
      • Uses spare drives mapped to mount points to handle failover (think of it like storage level sparing)
      • Much easier to implement this from the start – while you can transition to this, it is NOT easy
      • Basic ‘How it works’
        • Configure Storage
        • Create DAG
        • Create Directory and mount points
        • Configure DAG Properties
          • AutoDagVolumesRootFolderPath
          • AutoDagDatabasesRootFolderPath
          • AutoDagDatabaseCopiesPerVolume
        • Create Databases
        • Put some mailboxes on them
    • Automatic Recovery from Storage Failures
      • Used for hung databases, locked controllers, etc.
      • Works like Managed Availability – it restarts services and attempts to mount the database; if that fails, it restarts the server, provided a good, mountable copy exists elsewhere in the DAG
    • Improved Lagged Copies
      • Activation is simplified!
      • Automatic Playback of log files in critical situations
        • Yes, Exchange 2013 is nearly self-aware.
      • Integration with Safety Net (FKA Transport Dumpster)
      • Takes care of itself
        • Handles low log disk space (plays in log files to free up space)
        • Page Patching
        • Activates itself if fewer than 3 good copies are available!
          • Also configurable via AD and registry (have to modify both)
      • No need for log surgery or hunting for corruption – it handles it for you!
  • Managed Availability
    • Gives the Best Server & Copy Selection process increased functionality and intelligence in its decision making processes
    • It notifies through the event logs so 3rd party folks can snag events
    • SCOM is highly recommended
    • It really will damage test environments if you don’t change the defaults!
  • Single Copy Alerting
    • In 2010 it was a scheduled task
    • Now it’s an MA rule
    • Will throw Event ID 4138 for Red, 4139 for Green
  • Best Copy Selection Changes
    • Looks at the entire server
    • Active Manager runs it
    • Includes the protocol stack to ensure that any move will actually function.
    • Will prefer servers in this order:
      • All Healthy Status
      • Health Sets at Medium or higher
      • Better than source
      • Same as source
  • Maintenance Mode
    • Now there’s a maintenance mode!
  • DAG Networks
    • Automatically configured
      • This is based on the idea that your MAPI network interface has a registration in DNS while your Replication interface should NOT.
    • Can always be manually configured.
  • Site Resiliency
    • 2010 – Operationally complex, you have to shift everything (MBX, HUB, CAS) and the namespace is a single point of failure
    • 2013 – Simplified, can recover either CAS/MBX or both, and the namespace is now redundant
    • Recovery is automatic now
    • DNS resolves to multiple IP addresses
    • Clients (Outlook 2007+, OWA, etc.) will skip failed IP addresses after 20 second timeouts
    • Preferred Configuration
      • 3 Sites
        • 2 sites with Exchange components
        • 1 site with file share witness (minimal requirements)
          • This host is highly recommended to be isolated from the networks of the 2 Exchange sites (i.e. if your 2 sites are on ATT, run the 3rd on Verizon)
    • Expect ~20 second site failovers for CAS, similar time for MBX
    • Can manually fail over sites through 3 PowerShell commands!
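
To picture the client-side behavior from the Site Resiliency notes (DNS hands back several IP addresses and the client skips any that time out), here is a minimal Python sketch. It is only an illustration and not how Outlook or OWA is actually implemented: `pick_address`, `demo_connect`, and the sample IP addresses are made-up names and values, and the 20 seconds is simply the timeout quoted in the session.

```python
from typing import Callable, Iterable, Optional

# Per-address timeout quoted in the session notes (assumed here as a default).
CONNECT_TIMEOUT_S = 20.0


def pick_address(
    addresses: Iterable[str],
    try_connect: Callable[[str, float], bool],
    timeout_s: float = CONNECT_TIMEOUT_S,
) -> Optional[str]:
    """Return the first address that accepts a connection, or None if all fail."""
    for ip in addresses:
        # A failed or timed-out attempt just moves on to the next address.
        if try_connect(ip, timeout_s):
            return ip
    return None


def demo_connect(ip: str, timeout_s: float) -> bool:
    """Hypothetical stand-in for a real connection attempt: the first site is down."""
    return ip != "192.0.2.10"


# The namespace resolves to one address per site; the first site is unreachable.
print(pick_address(["192.0.2.10", "198.51.100.10"], demo_connect))
# prints 198.51.100.10
```

The connection attempt is passed in as a function so the sketch runs without touching a network; a real client would plug in its own connect logic.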

So, I’ll give you the brief CliffsNotes version of this rather lengthy synopsis:  It will be very difficult to kill off a properly architected Exchange 2013 installation.
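
One postscript for anyone sketching out a multiple-databases-per-volume design: the requirement from the notes (the number of database copies per volume must equal the number of copies of each database) is easy to sanity-check on paper or in a few lines of Python. This is just my reading of that rule turned into code; `layout_is_valid` and the server/volume names are hypothetical, and nothing here talks to Exchange.

```python
from collections import Counter
from typing import Dict, List


def layout_is_valid(volumes: Dict[str, List[str]]) -> bool:
    """Check that every volume holds as many copies as each of its databases has in total."""
    # Count how many copies of each database exist across all volumes.
    copies = Counter(db for dbs in volumes.values() for db in dbs)
    return all(
        len(dbs) == copies[db]
        for dbs in volumes.values()
        for db in dbs
    )


# Databases 1 through 4, each copied onto the same 4 volumes (one per server).
good = {
    "SRV1:Vol1": ["DB1", "DB2", "DB3", "DB4"],
    "SRV2:Vol1": ["DB1", "DB2", "DB3", "DB4"],
    "SRV3:Vol1": ["DB1", "DB2", "DB3", "DB4"],
    "SRV4:Vol1": ["DB1", "DB2", "DB3", "DB4"],
}

# Same design, but one volume is missing its DB4 copy.
bad = dict(good, **{"SRV4:Vol1": ["DB1", "DB2", "DB3"]})

print(layout_is_valid(good))  # prints True
print(layout_is_valid(bad))   # prints False
```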
