#IAMMEC Session Report: Virtualization in the New Exchange

  • Supported
    • Hypervisors
      • Hyper-V
      • 3rd Party Hypervisors
    • Host-based Clustering
      • All roles
    • Exchange Roles
      • All roles
    • Storage
      • Block level
        • iSCSI, FC, pass-through, etc.
      • Same requirements as Ex2010
        • Support for SMB3 file shares for specific limited scenario
          • Guest on 2012 with guest VHDs on SMB3
          • CANNOT point guest to SMB3 share (will fail)
          • Still will not support guests stored on an NFS-based VMware location
    • Migration
      • All Ex2013 roles
      • Never want to quick-migrate a machine on a DAG
        • Hyper-V specific
        • Live Migration/vMotion completely supported
    • Jetstress testing in guests
      • Yes, on supported Windows hypervisors or ESX 4.1 (or newer)
      • Used to be an issue due to perf counters not being accurate
  • Not Supported
    • Dynamic memory & Memory Overcommit
      • Not supported for any 2013 role
    • Hypervisor Snapshots
      • Not Supported for any 2013 role
    • Differencing/Delta Disks
      • Not Supported for any 2013 role
    • Apps on the root
      • Only deploy management, monitoring, AV, etc.
    • Significant processor oversubscription
      • Limited to 2:1, best practice is 1:1
        • Start hitting CPU constraints
          • Delivery throughput reduction = queue growth
          • Content indexing throughput reduction = increased IOPS
          • Store ROP processing throughput reduction = RPC latency & end-user pain
  • Proper Exchange sizing ensures that resources are available on-demand, so don’t allow hypervisors to yank those resources away.
  • Server 2012
    • Many deployment-blocking limits removed
    • Customers who virtualize 2013 will have a great experience on Server 2012 Hyper-V
    • Important to be aware of what does and doesn’t work (and supportability limits)
  • Multirole vs. Single Role on Virtual
    • Deploy it the same way as you would the physicals
  • Hyper-V Gotchas
    • 10% Overhead for Hyper-V Only
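
To make the 2:1 oversubscription limit from the notes concrete, here is a minimal sketch that compares the virtual processors handed to guests against a host's physical cores. The function names and the example hosts are hypothetical; this is an illustration, not a Microsoft sizing tool.

```python
def vcpu_ratio(guest_vcpus, physical_cores):
    """Return total virtual CPUs per physical core on one host."""
    if physical_cores <= 0:
        raise ValueError("physical_cores must be positive")
    return sum(guest_vcpus) / physical_cores


def oversubscription_status(guest_vcpus, physical_cores):
    """Classify a host against the 1:1 best practice and the 2:1 limit."""
    ratio = vcpu_ratio(guest_vcpus, physical_cores)
    if ratio <= 1.0:
        return "best practice (1:1 or better)"
    if ratio <= 2.0:
        return "supported (up to 2:1)"
    return "not supported (over 2:1)"


if __name__ == "__main__":
    # Hypothetical host: 16 physical cores, three guests with 8 vCPUs each.
    print(oversubscription_status([8, 8, 8], 16))
```

Anything past 1:1 is where the throughput reductions listed above start to show up, so the middle result is best read as "supported, but watch your queues and RPC latency".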

There was a tweet that summed up an earlier run of this session as: “Don’t virtualize Exchange.”  This is the wrong answer.  Virtualize, but do so correctly.  Don’t half-ass it, and you’ll pull it off splendidly.

Posted in Deployment, Exchange 2013, Virtualization

#IAMMEC Session Report: Exchange 2013 Sizing Guide

  • Brief History of Exchange Sizing
    • Sizing comes from production deployments
      • Dogfood, MSIT, customer and field feedback
    • Final IOPS comes from internal deployments with isolated user profiles
    • Guidance is conservative due to deployment scenarios that are different from Microsoft
    • Best practice always includes IOPS validation with Jetstress; some customers also use Loadgen for better system validation
    • MBX Role Requirements Calculator is the BEST way to size Exchange 2010
  • Impact of 2013 Architecture
    • Single MBX role provides:
      • De-facto partition unit
      • Unit of High Availability
      • Cache Efficiencies
      • Hardware Efficiencies
      • Similar to 2010
        • Must factor in CPU, Memory, Storage, and Network
    • Mailbox Servers
      • Memory requirements have increased in Exchange 2013
        • Mailbox or ‘all in one’ must have at least 8GB
      • Minimum CPU requirements follow published OS guidelines
      • Disk space requirements on install drive increased dramatically
        • ~30GB
        • Don’t forget to factor in your page file!
          • Memory Size + 10MB
      • Storage Sizing
        • Storage is sized for capacity as well as IOPS
        • Transport has storage requirements; placing the transport queue on the OS volume may be sufficient, but consider the space needed at maximum queue growth
        • Safety Net results in additional storage needs for transport queue
          • Will dynamically increase this based on mail flow profiles and message sizing
          • Can move this off to dedicated drives/elsewhere if need be
          • Store IOPS requirements reduced compared to prior versions
          • High availability design is a key factor in both capacity and IOPS sizing
    • CAS Role
      • Memory sizing
        • Significant memory requirements
        • Memory scales based on connection counts
        • MS guidance will likely scale memory with allocated CPU cores
        • Final guidance for 2013 likely equal to or a bit above current CAS 2010 guidance
      • CPU Sizing
        • A CPU core ratio will be used to determine required CAS CPU resources once RTM guidance is available
        • Exchange 2010 guideline was 3:4; the architecture change in Exchange 2013 reduces this
        • High availability likely a major factor in the number of required CAS servers
        • Disable hyperthreading (SMT) for the same reasons discussed for the Mailbox role
          • Reason:  There are a lot of managed processes, and memory is allocated based on the number of cores, including HT cores, so more memory is allocated than is realistically needed, which negatively impacts the system
          • Significant impact to some Exchange service memory footprints
          • SMT provides gain in processor throughput, but overall the gain is not worth the ‘cost’ based on our lab measurements
          • Virtualized environment:
            • VM will see the number of CPUs that are exposed
    • Aim for balanced hardware
      • Lowest TCO usually involves getting the most out of your hardware
      • Consider a 2U server with 12 LFF disks as a well-balanced hardware platform for the mailbox role
        • 2 drives in RAID 1 for OS and transport
        • ~9 drives for mailbox databases, no RAID
        • ~1 drive as a spare
    • Sizing without guidance and tools
      • Use the balanced hardware as a starting point
      • Be conservative
      • Size for high availability requirements (failure domains!), then migrate slowly while monitoring
      • Add more hardware as necessary
      • Or, optionally, wait for guidance!
  • Gist
    • Mailbox
      • Increasing CPU slightly over 2010
      • Increasing RAM a good bit over 2010
      • Storage is a 50% reduction over 2010
    • CAS
  • Tools
    • Mailbox Role Requirements Calculator
      • Released at/after RTM
    • Jetstress
      • Available at RTM + 3 Months
      • ESRP
        • For storage partners, same deal, +3 months
    • Loadgen
      • RTM + 3 months
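
Until the calculator ships, the page file rule in the notes (memory size + 10MB) and the ~30GB install footprint are easy to work out by hand. A minimal sketch follows; the function names are hypothetical, and keeping the page file on the install volume is an assumption made for the illustration, not something the session stated.

```python
MB_PER_GB = 1024


def page_file_mb(ram_gb):
    """Page file size from the session notes: memory size + 10MB."""
    return ram_gb * MB_PER_GB + 10


def system_volume_mb(ram_gb, install_gb=30):
    """Rough install-volume budget: ~30GB install plus the page file.

    Assumes the page file lives on the install volume (an assumption
    for this sketch, not session guidance).
    """
    return install_gb * MB_PER_GB + page_file_mb(ram_gb)


if __name__ == "__main__":
    for ram in (8, 96):  # hypothetical server sizes
        print(ram, "GB RAM ->", page_file_mb(ram), "MB page file,",
              system_volume_mb(ram), "MB on the install volume")
```

On a box with a lot of memory the page file quickly outweighs the install footprint itself, which is why the notes call it out.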

The biggest deal to come out of the sizing session was the 3 month number.  So much so that I verified with the speakers that those numbers were accurate.

If you are not willing to guess at your hardware (or to spend a ton so that you don’t have to guess), you will basically have to wait until those tools are released.


#IAMMEC Session Report: Exchange 2007/2010 to 2013 Migration

  • Preparation
    • Prepare
      • “Exchange Deployment Assistant” – available on TechNet
      • Verify Prerequisites
      • Install Exchange SP and/or updates across the ORG
        • The Exchange 2007 update will be an RU, so there is no way to verify in AD that all servers are at the same level
        • UPDATES WILL NOT BE AVAILABLE UNTIL AFTER THE FIRST OF THE YEAR
          • Meaning you can deploy a fresh environment on RTM, no coexistence until 1Q13
      • Prepare AD with E2013 Schema
      • Validate Client Access
        • Outlook 2003 not supported
        • Entourage 2008 EWS supported
    • Deploy Exchange 2013 Servers
      • Install both E2013 MBX and CAS servers
    • Create Legacy Namespace
      • Create DNS record to point to legacy E2007 CAS
    • Obtain and Deploy Certificates
      • Obtain and deploy certificates on E2013 CAS
      • Deploy certificates on Exchange 2007 CAS
        • For the legacy.contoso.com namespace
    • Switch Primary Namespace to Exchange 2013 CAS
      • Validate client access
    • Move Mailboxes
      • Build out DAG
      • Move users to E2013 MBX
    • Repeat for additional sites
  • Hosting Multiple Roles
    • Recommend NOT consolidating roles to 1 server so that you can take full advantage of the stateless CAS architecture
  • Client Protocol Connectivity Flow
    • Autodiscover
      • 2007 requires the legacy namespace because it doesn’t have the logic to handle the requests
      • 2010 – set the internal Autodiscover URI to 2013
        • Outlook clients will look up the SCP records in AD (oldest first)
      • 2007 – set the internal Autodiscover URI to 2013
        • Outlook clients will look up the SCP records in AD (oldest first)
    • Clients
      • Site scope is still controlling the SCP lookup
        • Prevents cross-site lookups
      • 2007 – Outlook clients go directly to mbx on RPC/TCP
      • 2010 – Outlook clients go to CAS array on RPC/TCP
    • Outlook Anywhere
      • 2007 – IIS Auth: NTLM is required and must be manually configured
      • 2010 – IIS Auth: NTLM is required and configured by SP3
      • 2013 – Connects to legacy servers and authenticates via NTLM
      • If you have an intranet site, you would likely not have OA on the internal side, but you need to do so in the coexistence scenario
      • Client settings need to match what is on the Ex2013 CAS
      • Remember
        • Enable Outlook Anywhere on intranet 2007/2010 servers
        • Make 2007/2010 client settings the same as 2013 server
        • IIS Authentication methods must include NTLM
    • OWA
      • 2010 – single sign on, Ex2013 CAS proxies for Ex2010
      • 2007 – 2013 authenticates the user on mail.contoso.com, looks up the CAS, and issues a redirect to legacy.mail.contoso.com, where the user logs into 2007
        • Yes, that means dual logons
        • No really, that means your users will have to logon twice.
        • Seriously, get ready to piss off a lot of people.
        • If you happen to have a TMG deployment, you don’t have to piss people off.
        • It will HTTP proxy to the internal sites for you.
        • If you have a 2nd site with a unique external namespace, then you’re better off as you will get automatically routed there without the double authentication.
    • EAS/EWS
      • 2010 – You’re good, SSO via SP3, even proxies 2nd sites with external namespaces
      • 2007 – From client to LB to mail.contoso.com to EX2013 CAS to EX2013 MBX to EX2007 CAS to EX2007 MBX
        • No, really.
        • Has to be this way because of code handling of phone devices that don’t go through multiple autodiscovers
        • Same for intranet sites
          • Though if you have a separate namespace you route directly there
    • POP/IMAP
      • 2010 – Proxies to appropriate CAS
      • 2007 – Same, proxies to appropriate CAS
    • SMTP
      • Handled on the MBX
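
The OWA coexistence rules above boil down to a small decision table: how many times does a user have to log on? The sketch below encodes only what the notes say. The function and parameter names are hypothetical.

```python
def owa_logons(mailbox_version, tmg_publishing=False, site_has_own_namespace=False):
    """Return how many OWA logons a user sees during coexistence.

    mailbox_version: "2007", "2010" or "2013" (where the mailbox lives).
    tmg_publishing: True if TMG proxies HTTP to the internal sites.
    site_has_own_namespace: True if the user's site has a unique
        external namespace the user is routed to directly.
    """
    if mailbox_version in ("2010", "2013"):
        # 2010 is proxied by the 2013 CAS with single sign-on.
        return 1
    if mailbox_version == "2007":
        if tmg_publishing or site_has_own_namespace:
            return 1
        # 2013 authenticates, then redirects to the legacy namespace,
        # where the user has to log on again.
        return 2
    raise ValueError(f"unknown mailbox version: {mailbox_version!r}")


if __name__ == "__main__":
    for version in ("2007", "2010", "2013"):
        print(version, owa_logons(version))
```

The only path that returns 2 is the plain Exchange 2007 redirect, which is exactly the case worth warning your users about ahead of time.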

Let me make the Cliff’s Notes simple:

If you are on Exchange 2010, it will be a cakewalk.

If you are on Exchange 2007, you are going to have a lot of fun and it will be more painful for everyone – not only yourself, but your users as well.

Posted in Exchange, Exchange 2013, Migration

#IAMMEC Session Report: Ex2013 Planning and Deploying High Availability

  • Lagged Copies
    • Dealt with automatically by the system; the tradeoff is that a lagged copy can be automatically played forward when it’s needed
    • Safety Net is what used to be Transport Dumpster; now it’s Transport Dumpster on steroids
    • Lagged Copy is 7 days, Safety Net is 7 days
    • Lag will play back without activating until ~400 logs are left in the playback queue.
      • If another copy becomes available, that copy will mount; otherwise the lagged copy comes into play.
  • NLB – Supported for 8 or fewer CAS array targets, not recommended though.
  • Load Balancers – layer 7 no longer required; layer 4 load balancers are sufficient to handle the CAS traffic
  • Hub Transport runs on all members of a DAG.
  • File Share Witness for Site Resiliency purposes should be located in a 3rd location
    • Make sure the connection from both Exchange sites -> FSW is NOT routed through one of the exchange sites to hit the other.
  • In 2013 a client request results in 1 request from the CAS to the MBX in a multisite environment, vs. 2010 making multiple requests
  • “If you put your cas load to your dr site without putting mailboxes there, it’s like dipping your toe in the lake and calling that your swim for the day.”  Put it all in the DR site for testing it regularly.  Run it for a few days, a couple of weeks, etc.
  • Work the hardware out
    • Don’t let the hardware go stagnant.  Run production from your DR site and make certain you’re doing good on your failovers.
  • Exercise your recovery on a regular basis.

All in all, this was a great session.  The only downside was having to deal with people lapsing into the Site Resiliency side of things, which was a totally different discussion topic.

Posted in Disaster Recovery, Exchange, Exchange 2013

#IAMMEC Tip: Exchange 2013 Admin Center Access via Downlevel Account

When you first install Exchange 2013, the Exchange Admin Center is a bit quirky if you want to use it with an administrative account that is sitting on an earlier version.

You will run into an issue where it will attempt to route your request to your legacy site, which doesn’t help you manage your entire infrastructure from the Exchange 2013 Admin Center.

To get past this, you have to modify the URL that you use to access it like so:

https://mobile.contoso.com/ecp?ExchClientVer=15

Note the ExchClientVer value on the ecp URL: in theory (they aren’t sure if it’ll be available in release) you can change it to 14 (for Exchange 2010).
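
If you script your admin bookmarks, the URL can be built rather than typed. A minimal sketch using Python's standard library; the helper name is hypothetical, and mobile.contoso.com is just the example host from above.

```python
from urllib.parse import urlencode, urlunsplit


def ecp_url(host, client_version=15):
    """Build an ECP URL that pins the Exchange client version."""
    query = urlencode({"ExchClientVer": client_version})
    return urlunsplit(("https", host, "/ecp", query, ""))


if __name__ == "__main__":
    print(ecp_url("mobile.contoso.com"))
    # Per the session, 14 may or may not work in the release build.
    print(ecp_url("mobile.contoso.com", client_version=14))
```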

Posted in Exchange, Exchange 2013, Management

#IAMMEC Session Report: Exchange 2013 High Availability & Business Continuity

Simply put, Scott Schnoll is and always will be a legend among the Exchange crowd.  He always gives some great talks.

Let’s hop right into it:

  • Storage Enhancements
    • Multiple Databases per volume
      • Number of copies must equal the number of volumes
        • Example of valid design:  Databases 1 through 4 are copied onto 4 volumes in an ‘Active/Standby/Standby/Lagged’ configuration
        • This also ties into Automatic Reseeding
      • Reseeds are much faster in this scenario – potential days shaved off
      • 20% reduction in passive copy IOPS
      • Allows for an increase in disk utilization (MSIT was able to go from 360 to 600 mailboxes on the same spindles)
      • Failover is ~10 seconds in the production environment!
      • Requirements:
        • 1 Partition
        • The number of database copies per volume must equal the number of copies of each database
          • This also ties into Automatic Reseeding
      • Best Practices
        • Same neighbors on all servers
        • Balance your Activation Preferences
    • Automatic Reseeding
      • Uses spare drives mapped to mount points to handle failover (think of it like storage level sparing)
      • Much easier to implement this from the start – while you can transition to this, it is NOT easy
      • Basic ‘How it works’
        • Configure Storage
        • Create DAG
        • Create Directory and mount points
        • Configure DAG Properties
          • AutoDagVolumesRootFolderPath
          • AutoDagDatabasesRootFolderPath
          • AutoDagDatabaseCopiesPerVolume
        • Create Databases
        • Put some mailboxes on them
    • Automatic Recovery from Storage Failures
      • Used for hung databases, locked controllers, etc.
      • Works like Managed Availability – restarts services or attempts to mount; failing that, restarts the server if a good, mountable copy exists elsewhere in the DAG
    • Improved Lagged Copies
      • Activation is simplified!
      • Automatic Playback of log files in critical situations
        • Yes, Exchange 2013 is nearly self-aware.
      • Integration with Safety Net (FKA Transport Dumpster)
      • Takes care of itself
        • Handles low log disk space (plays in log files to free up space)
        • Page Patching
        • Activates itself if there are fewer than 3 good copies!
          • Also configurable via AD and registry (have to modify both)
      • No need for log surgery or hunting for corruption – it handles it for you!
  • Managed Availability
    • Gives the Best Server & Copy Selection process increased functionality and intelligence in its decision making processes
    • It notifies through the event logs so 3rd party folks can snag events
    • SCOM is highly recommended
    • It really will damage test environments if you don’t change the defaults!
  • Single Copy Alerting
    • In 2010 it was a scheduled task
    • Now it’s an MA rule
    • Will throw eventID 4138 for Red, 4139 for Green
  • Best Copy Selection Changes
    • Looks at the entire server
    • Active Manager runs it
    • Includes the protocol stack to ensure that any move will actually function.
    • Will prefer servers in this order:
      • All Healthy Status
      • Health Sets at Medium or higher
      • Better than source
      • Same as source
  • Maintenance Mode
    • Now there’s a maintenance mode!
  • DAG Networks
    • Automatically configured
      • This is based on the idea that your MAPI network interface has a registration in DNS while your Replication interface should NOT.
    • Can always be manually configured.
  • Site Resiliency
    • 2010 – Operationally complex, you have to shift everything (MBX, HUB, CAS) and the namespace is a single point of failure
    • 2013 – Simplified, can recover either CAS/MBX or both, and the namespace is now redundant
    • Recovery is automatic now
    • DNS resolves to multiple IP addresses
    • Clients (Outlook 2007+, OWA, etc.) will skip failed IP addresses after 20 second timeouts
    • Preferred Configuration
      • 3 Sites
        • 2 sites with Exchange components
        • 1 site with file share witness (minimal requirements)
          • This host is highly recommended to be isolated from the networks of the 2 Exchange sites (i.e. if your 2 sites are on ATT, run the 3rd on Verizon)
    • Expect ~20 second site failovers for CAS, similar time for MBX
    • Can manually fail over sites through 3 PowerShell commands!
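
The multiple-databases-per-volume design above (4 databases, 4 copies each, Active/Standby/Standby/Lagged) is easier to see laid out. The sketch below rotates the roles so each server ends up with one active copy, which is one way to read "balance your Activation Preferences". The server names, the one-volume-per-server simplification, and the function itself are assumptions made for illustration.

```python
def copy_layout(servers, databases, roles):
    """Rotate copy roles across servers.

    Every server holds one copy of every database (so copies per volume
    equals copies per database), and each role appears exactly once per
    database and once per server.
    """
    if not (len(servers) == len(databases) == len(roles)):
        raise ValueError("need equal numbers of servers, databases and roles")
    n = len(servers)
    return {
        server: {db: roles[(s - d) % n] for d, db in enumerate(databases)}
        for s, server in enumerate(servers)
    }


if __name__ == "__main__":
    layout = copy_layout(
        ["MBX1", "MBX2", "MBX3", "MBX4"],            # hypothetical servers
        ["DB1", "DB2", "DB3", "DB4"],
        ["Active", "Standby", "Standby", "Lagged"],
    )
    for server, copies in layout.items():
        print(server, copies)
```

Because every volume carries the same four databases, the "same neighbors on all servers" best practice falls out of the layout for free.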

So, I’ll give you some brief Cliff’s on this rather lengthy synopsis:  It will be very difficult to kill off a properly architected Exchange 2013 installation.

Posted in Disaster Recovery, Exchange, Exchange 2013

#IAMMEC Session Recap: Manageability and Monitoring Talk

In the afternoon I attended the Manageability and Monitoring talk by Charlie Chung and Greg Thiel and took some rough notes.

  • EX2013 – Protocols are always served from the protocol instance that is local to the active database copy
  • Best way to monitor is at the user level
    • Availability – can I access the service?
    • Latency – how is my experience?
    • Errors – am I able to accomplish what I want?
  • Scaling applications magnifies errors
  • Main goal is to reduce the alerts
  • ‘Stuff breaks and the experience does not’
  • Managed Availability Overview
    • Probes will determine if something is broken
    • Check will validate thresholds and make certain they’re within certain parameters
    • Notify will take bugs and other odd failures
    • Monitor will take data from probes/checks/notifies and apply business logic to either recover if it’s possible (i.e. restart services, etc.) or escalate to a human
  • This runs on every single box and then reports up to SCOM (if you are so inclined to utilize SCOM)
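
The probe/monitor flow described above can be sketched in a few lines: probe, try to recover, and only then escalate to a human. This is a toy model under my own assumptions (class name, fields, and a single recovery attempt by default); it is not how Managed Availability is actually implemented.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Monitor:
    name: str
    probe: Callable[[], bool]      # returns True when the component is healthy
    recover: Callable[[], None]    # recovery action, e.g. restart a service
    max_recoveries: int = 1

    def run(self) -> str:
        """Probe, attempt recovery if unhealthy, escalate if that fails."""
        if self.probe():
            return "healthy"
        for _ in range(self.max_recoveries):
            self.recover()
            if self.probe():
                return "recovered"
        return "escalate"


if __name__ == "__main__":
    state = {"up": False}  # hypothetical service state

    monitor = Monitor(
        name="owa",
        probe=lambda: state["up"],
        recover=lambda: state.update(up=True),  # pretend the restart worked
    )
    print(monitor.run())
```

The point of the design is in the last line of run(): a human only hears about it after the automated recovery has been tried and has failed, which is how the alert count gets reduced.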

Basically, the built-in, automated manageability tools are similar to SCOM: the same building blocks and concepts apply, so if you’ve used SCOM, you’ll already be used to how these tools operate.

And like the SCOM tools, you can also modify the defaults if needed.  However, this should rarely be needed in your production environment.  In a TEST environment, however, you really need to change the defaults, or you’ll start seeing odd issues creep up (since test labs are generally thin on resource availability).

Posted in Exchange, Exchange 2013