Wednesday, October 28, 2009

Exchange 2010 - Recovery Scenario #2 - Recover from a DAG member loss

In this scenario, I have a three-server DAG, and I use Windows Server Backup to back up my Exchange 2010 active database. On the server with the active copy, I hit the virtual power button. The Exchange services fail over to another server in the DAG right away.


The Microsoft documentation:

Recover a DAG member Exchange Server


Remove the copy of the Database in a DAG:



This will warn that it cannot communicate with the server. That is expected.


Then, you can remove the server from the DAG:





Reinstall Windows 2008 R2 from DVD (remember, DAG requires Enterprise!)

Reset computer account in Domain (Right click Reset in AD Users and Computers)

Name and IP the server, confirm the date/time is correct (since in a DAG, I also needed to IP my DAG network)

Install Exchange Pre-Requisites

Install Exchange 2010 using:

setup /m:recoverserver




If you skipped the DAG removal steps above, setup will fail with:



Once setup succeeds, you need to reboot the server. At this point, I would also patch as needed (at the time of this writing, 2008 R2 and Exchange 2010 RC have no additional patches).


Since I have a DAG, I am able to re-add the server to the database availability group and allow the database to reseed. If you were in a single server environment, this is where your backup would come into play. This might be scenario #3.


Add the recovered server back to the DAG


Add a mailbox database copy to the recovered server.


Assuming you have a real DAG and you might have 300GB of data to reseed by adding a copy, this is where Windows Server Backup may play a part. You may be able to restore the Exchange data to an alternate location and, before you add the database copy, move the restored EDB to the folder path for the database. This would allow you to skip time-consuming reseeding, as long as your restored EDB came from the most recent backup taken of the database.


Optionally, you can activate the database on the recovered server as well.
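The Exchange Management Shell side of the steps above can be sketched roughly as follows. This is a sketch, not a transcript of what I ran: DAG1, DB1 and EX1 (the failed server) are hypothetical names, and the -ConfigurationOnly switch is an assumption for a member that is still offline when you remove it. As I understand it, when that switch is used the node also has to be evicted from the cluster manually. Run the commands from a surviving DAG member.

```powershell
# Hypothetical names: DAG1 (the DAG), DB1 (the database), EX1 (the failed member).

# 1. Remove the failed server's copy of the database.
#    A warning that the server cannot be contacted is expected here.
Remove-MailboxDatabaseCopy -Identity 'DB1\EX1'

# 2. Remove the failed server from the DAG.
#    -ConfigurationOnly is an assumption for a member that is offline;
#    drop it if the server is reachable.
Remove-DatabaseAvailabilityGroupServer -Identity DAG1 -MailboxServer EX1 -ConfigurationOnly

# 3. Rebuild Windows, reset the computer account, install the prerequisites,
#    run "setup /m:recoverserver" on the rebuilt server, then reboot it.

# 4. Add the recovered server back to the DAG and reseed a database copy.
Add-DatabaseAvailabilityGroupServer -Identity DAG1 -MailboxServer EX1
Add-MailboxDatabaseCopy -Identity DB1 -MailboxServer EX1

# 5. Check the copy, then optionally activate it on the recovered server.
Get-MailboxDatabaseCopyStatus -Identity 'DB1\EX1'
Move-ActiveMailboxDatabase DB1 -ActivateOnServer EX1
```

Wait for the copy status to show Healthy before activating the database on the recovered server.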

Tuesday, October 27, 2009

Exchange 2010 - Recovery Scenario #1 - Mailbox or items

In this post I wrote about how you can now back up Exchange 2007 SP2 and Exchange 2010 mailbox databases with Windows Server Backup.

Since then, word of Exchange 2010's release has come, and with that, the questions "when will my backup vendor provide updates that are compatible with Exchange 2010?" and "what can I do in the meantime?"

One of the easiest solutions is to use Windows Server Backup, and then let your existing backup product do file-level backups of that data. Once this is in place, the question, of course, is how do I recover from that?


So I intend to cover three scenarios:

  1. Single Item or Mailbox recovery - accidental delete, assuming you passed or misconfigured deleted item and deleted mailbox retention.
  2. Loss of a DAG member - How to recover from losing a single member of a DAG.
  3. Entire Server recovery - Building/site failure, need to return to service and restore data from backups. (this will be single server from backup)

Additionally, I am using a database that is in a DAG for this, but am writing it as if it was standalone, as the #2 and #3 scenarios would be addressed by the Database Availability Group.

So, on to scenario #1 - I have disabled my mailbox in the EMC, and run the below powershell to force the database clean:

Clean-MailboxDatabase geodb1

Now I see my mailbox under "Disconnected Mailbox". In a normal scenario, this is what Exchange 2003 and up has offered: I could right-click my mailbox and choose to reconnect it to my user account:



Of course, I want to go to backups, so I reset my database to 0-day deleted item and mailbox retention, refreshed this screen, and my mailbox was no more!



Do note, the settings in the above screenshot are NOT recommended. Default is 30 days, and I recommend leaving it there or higher!

Next, we must recover the data "to an alternate location" using Windows Server Backup
Choose your Backup Date, then your recovery type should be "Applications"

Choose Exchange:

(I included the show details, which is the store GUID)

I chose here to recover to another location (Note: c:\RDB1 is NOT where my RDB's EDB/logs/anything are)

Do note that "this option will copy just the application data" - there are additional steps after this!

Finally, launch the recovery.

Once completed, you will have the file structure of the database in the path specified:

Now that we have our data files, the recovery is similar to Exchange 2007 SCR's database portability.

Run ESEUTIL /R from the log file directory

Then we can run:
Eseutil /mh geodb1.edb

And determine the DB is healthy:
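As a rough sketch, the replay and health check above could be run like this. The folder below is a placeholder for wherever Windows Server Backup put the restored files under c:\RDB1, and E00 is an assumed log prefix, so check the restored log file names for the real one.

```powershell
# Placeholder path: point this at the restored folder that holds the
# EDB and log files. It is not the path from my lab.
$restoreDir = 'C:\RDB1\restored'

Push-Location $restoreDir

# Soft recovery: replay the restored logs into the database.
# E00 is an assumed log prefix; /l and /d point eseutil at the
# restored logs and database instead of the original paths.
eseutil /r E00 /l $restoreDir /d $restoreDir

# Dump the database header and show only the state line.
# "Clean Shutdown" is what you want to see before mounting.
eseutil /mh .\geodb1.edb | Select-String 'State:'

Pop-Location
```

If the state still reads Dirty Shutdown, the logs did not replay and the database will not mount in the recovery database.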

Next few steps edited on 11/3/2009 for missing content:

Now we can create our new Recovery mailbox database using:
new-MailboxDatabase -Recovery -Name rdb1 -Server exch2010 -EDBFilePath "c:\rdb1\rdb1.edb" -LogFolderPath "c:\rdb1\"

Then we need to allow Restores:
Set-MailboxDatabase rdb1 -AllowFileRestore:$true

Copy the EDB file to the EdbFilePath of the RDB1 database, rename it appropriately, and then it should mount successfully. (NOTE: Logs didn't need to be copied since ESEUTIL /R replayed them into the EDB. However, if you do copy them into place, Exchange will see they are replayed and move on.)

Once mounted, we can use

get-MailboxStatistics -database rdb1

to see that the data is there:

Now, in the Exchange documentation, it states that:

Restore-Mailbox -Identity chris -RecoveryDatabase rdb1

would recover the data into the mailbox. The problem is, we don't have a mailbox with that GUID any more. If I re-enable a new mailbox for chris, he will get a new mailbox GUID.

Enable-Mailbox chris
Restore-Mailbox chris -RecoveryDatabase rdb1

I get:

This makes sense: it cannot match GUIDs and stops. More on this in a second.

However, you are able to run a recovery operation (similar to Export-Mailbox in Exchange 2007)

Restore-Mailbox -RecoveryMailbox chris -Identity chris -RecoveryDatabase rdb1 -TargetFolder "Recovery"

And the results, all of the content in a subfolder named "Recovery"



I attempted a few other things to see if I could restore directly into the mailbox, but had no luck.

Important to note - if I were recovering for a user who missed their deleted item retention time, I could use Restore-Mailbox to filter by subject, dates, folders and more. Because I mail-disabled the user, I am not able to restore directly.
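For that deleted item case, where the user still has their original mailbox, a filtered restore might look like the sketch below. The keyword, dates and folder are made-up values for illustration, and I have not run this exact command in this lab.

```powershell
# Hypothetical filter values; only chris and rdb1 come from this post.
# Restores items from the recovery database copy of the mailbox into the
# live mailbox, limited by subject keyword, date range and folder.
Restore-Mailbox -Identity chris -RecoveryDatabase rdb1 `
    -SubjectKeywords 'Budget' `
    -StartDate '10/01/2009' -EndDate '10/27/2009' `
    -IncludeFolders '\Inbox'
```

Narrowing the restore like this avoids pulling the whole mailbox back when the user only lost a handful of messages.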

Friday, October 16, 2009

OCS Voice Ignite Training - Registered!

Yay!

We have been trying for most of the year to get me a seat at one of these, and instead, I think we got two seats so I will get to tag along with one of our Cisco Voice guys as well (this should be helpful in backfilling voice knowledge for me!)

Pretty stoked to get there. Irving, TX in February!



Wednesday, October 14, 2009

Installing Exchange 2010 quickly using PowerShell

In Exchange 2007, I typically used ServerManagerCmd.exe to quickly deploy the required Exchange 2007 parts. In Exchange 2010, when I ran ServerManagerCmd, I got this warning:

Servermanagercmd.exe is deprecated, and is not guaranteed to be supported in future releases of Windows. We recommend that you use the Windows PowerShell cmdlets that are available for Server Manager.

This is replaced by Powershell commands:
Get-WindowsFeature
Add-WindowsFeature
Remove-WindowsFeature

If you run those right away, you will get an error. You need to run this first:
Import-Module ServerManager

So let's see how fast we can make this go! I am installing for ALL roles. If you need to split out roles, you should read the MS documentation at:
http://technet.microsoft.com/en-us/library/bb691354(EXCHG.140).aspx

You can of course use the ServerManagerCmd -i that they give you, but knowing it's deprecated and that I will be doing this 40 times next year, I wanted to know the new way. So here it is!

Install 2008 R2 off a CD/ISO
Set computer time, networking, machine name and domain
Install the AD tools using Powershell
Add-WindowsFeature RSAT-ADDS
Re-Boot
Upon reboot, launch PowerShell as Administrator and copy paste the below (again, this is for MBX, HT, CAS on a single server, check the link above for more detailed pre-requisite planning)
Add-WindowsFeature Web-Metabase, Web-Lgcy-Mgmt-Console, Web-Server, Web-ISAPI-Ext, Web-Basic-Auth, Web-ASP, Web-Digest-Auth, Web-Windows-Auth, Web-Dyn-Compression, Web-Net-Ext, RPC-over-HTTP-proxy, AS-NET-Framework, NET-HTTP-Activation

Set the TCP .net sharing service to automatic startup
Set-service NetTcpPortSharing -startuptype automatic

Optional - download and install the x64 version of the Microsoft Office Filter Pack (this allows the content of Office attachments to be searched and indexed).

And from the Exchange 2010 install directory: .\setup.com /mode:install /roles:mb,ht,ca

Now, if you take this and extend it to using PowerShell's remote capabilities, you can prep a BUNCH of 2010 servers quickly!
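A minimal sketch of that idea is below. It assumes the target servers are already domain joined, have RSAT-ADDS installed and have been rebooted, and have PowerShell remoting enabled (Enable-PSRemoting). The server names are hypothetical.

```powershell
# Hypothetical server names; replace with the machines you are prepping.
$servers = 'EX2010-01', 'EX2010-02', 'EX2010-03'

# Run the same prerequisite steps on every server in one pass.
Invoke-Command -ComputerName $servers -ScriptBlock {
    # The Server Manager cmdlets are not loaded by default.
    Import-Module ServerManager

    # Same feature list as above (MBX, HT and CAS on a single server).
    Add-WindowsFeature Web-Metabase, Web-Lgcy-Mgmt-Console, Web-Server,
        Web-ISAPI-Ext, Web-Basic-Auth, Web-ASP, Web-Digest-Auth,
        Web-Windows-Auth, Web-Dyn-Compression, Web-Net-Ext,
        RPC-over-HTTP-proxy, AS-NET-Framework, NET-HTTP-Activation

    # Exchange setup expects the Net.Tcp Port Sharing service to start automatically.
    Set-Service NetTcpPortSharing -StartupType Automatic
}
```

Invoke-Command fans out to the servers in parallel, so the wall clock time is close to that of a single server. Exchange setup itself is left as a separate step on each box.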

Tuesday, October 13, 2009

Exchange 2010 - What is an arbitration mailbox?

If you found this by searching, you most likely found out about arbitration mailboxes the way most people will: either by accidentally deleting them, or by finding out that you need to move, disable or remove them in order to delete a database, uninstall Exchange 2010, or remove the Mailbox role.

From TechNet:
"Arbitration mailboxes are used for managing approval workflow. For example, an arbitration mailbox is used for handling moderated recipients and distribution group membership approval."

This is part of the Moderated Transport features that are new in Exchange 2010.

A lot more information about using arbitration mailboxes can be found here: Understanding Moderated Transport

In short, arbitration mailboxes are where messages awaiting moderation are stored, and where information about moderator decisions is kept.

Now back to getting over the two most common immediate needs for arbitration mailboxes.

I deleted my arbitration accounts from my AD
This isn't really all that bad. I did it the first time I installed Ex2010 and had a panic moment before I found the fixes. Pretty simply, you need to rerun the AD preparation steps from the 2010 media.

Setup.com /PrepareAD
Setup.com /PrepareSchema
Setup.com /PrepareDomain

Only /PrepareAD is required to recreate these accounts, but I left the other steps in here as well just for documentation's sake.

I am trying to remove Exchange 2010, or a database, or the Mailbox role and am being told there are arbitration mailboxes preventing me from continuing
This is also not too bad. When you try to remove the first DB in Exchange 2010, there are a few arbitration mailboxes that will prevent database deletion. You have the choice of moving, removing, or mail-disabling these mailboxes. Since you cannot see these in the Exchange Management Console, you need to launch the Exchange Management Shell (EMS):

Get-Mailbox -Arbitration

This will list the arbitration mailboxes. To narrow it down to a specific database, you can edit this to:

Get-Mailbox -Arbitration -Database DB1

If you are used to PowerShell cmdlets in Exchange 2007, one big change to recall here is that specifying servername\databasename won't work anymore. This is one of the reasons why the database names need to be unique to the organization - so you don't have to specify servers anymore!

Once you have your "get" command returning the correct list of mailboxes, it's time to move, disable or remove them. Disabling the last arbitration mailbox is not allowed, so I recommend moving them as the first preference here.

Move:
Get-Mailbox -Arbitration -Database db1 | New-MoveRequest -TargetDatabase db2

Disable:
Get-Mailbox -Arbitration -Database db1 | Disable-Mailbox -Arbitration

Remove:
Get-Mailbox -Arbitration -Database db1 | Remove-Mailbox -Arbitration -RemoveLastArbitrationMailboxAllowed
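For the move option, a sketch of the full sequence might look like this. db1 and db2 are the example names from above, and the move request check is my own addition rather than something the removal error asks for.

```powershell
# Queue a move for every arbitration mailbox on the database being removed.
Get-Mailbox -Arbitration -Database db1 | New-MoveRequest -TargetDatabase db2

# Watch the move requests until they report Completed.
# This lists every move request in the organization, not just these.
Get-MoveRequest | Get-MoveRequestStatistics |
    Format-Table DisplayName, Status, PercentComplete -AutoSize

# Confirm nothing is left behind. No output means the database no longer
# holds any arbitration mailboxes.
Get-Mailbox -Arbitration -Database db1
```

Once the last command returns nothing, the database removal should no longer complain about arbitration mailboxes.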

If there is enough interest a little later, I may do a write-up on using the arbitration mailboxes, but at this point there are still a lot of other Exchange 2010 things to learn and figure out!

Monday, October 12, 2009

Virtualized DC issue with time synchronization

This was a pretty simple mistake, but it took me a while to figure out. We were noticing that everything on the domain was 10 minutes behind other devices we used. In troubleshooting, I configured my PDC emulator to sync with pool.ntp.org using NTP. It did, and seconds later the time would revert back.

Of course, the issue here was that the DC was virtualized and Hyper-V time synchronization was taking precedence, syncing to the Hyper-V server's local time, which had fallen out of sync. The fix here was either disabling time synchronization on my DC, or enabling NTP pool synchronization for my Hyper-V server. I chose the latter, and moments later all machines had the correct time.
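Pointing the Hyper-V host at the NTP pool can be sketched with w32tm as below, run from an elevated prompt on the host. Treat it as a starting point: the exact flags I used at the time are not recorded here, and a domain-joined host may need different sync flags.

```powershell
# Run on the Hyper-V host from an elevated prompt.

# Use pool.ntp.org as a manually specified time source and apply the change.
w32tm /config /manualpeerlist:"pool.ntp.org" /syncfromflags:manual /update

# Force an immediate resync instead of waiting for the next poll.
w32tm /resync

# Confirm which source the host is using and how far off it is.
w32tm /query /source
w32tm /query /status
```

If the source does not change, restarting the Windows Time service (w32time) and resyncing again is worth a try.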

Credit where credit is due:
http://www.aperture.ro/index.php/2009/01/windows-time-sync-hyper-v-enabled-domain-controller-dilemma/



Thursday, October 08, 2009

Exchange 2007 SCR How to - Part 2

This is a continuation from Part 1 where we configured the SCR replication.


Failing over to the target SCR server


In this example, I am not having an ACTUAL failure. I am choosing to dismount the DB on the source server. I typically also check the "do not mount the database store on startup" option so that if I stop or start any services later, I don't accidentally remount the DB that I am attempting to fail over. Once I successfully fail over, I delete the now "old" SG and database from the EMC, as well as the EDB file and transaction logs that were associated with them. I like to keep out-of-date data tidy like this. If you disagree, at a minimum you should move these files into a well-labeled folder so you know exactly what the files are and when they are from, so that 6 months later (or wherever your comfort level is) you can do housecleaning in an educated manner.

In order to ensure we have live data, I sent myself an email just before dismounting the database. You can see behind it the Get-StorageGroupCopyStatus output showing the status as Healthy, with the hard-coded 50-log replay lag.



First, we dismount the active database (in a real failure, something did this for you)


Dismount-Database "EXCHANGESOURCE\ExecDB"


Once the source database is dismounted, we can begin the SCR activation process. The first step here is to create a new SG and a new DB. DO NOT USE the folders/files/paths of your SCR data! For example, create a new SG named "RecoverySG" and a new DB named "RecoveryDB" and have their paths be new/unused folders and paths. (This is the part I mentioned above that is not clear enough in the TechNet article.) Run PowerShell as admin so it can do file operations!

New-StorageGroup -Name RecoverySG -LogFolderPath 'D:\Exchange Logs\RecoverySG' -SystemFolderPath 'D:\Exchange Logs\RecoverySG'


New-MailboxDatabase -Name RecoveryDB -StorageGroup RecoverySG -EdbFilePath 'D:\Exchange Databases\RecoveryDB\RecoveryDB.edb'


Notice again - those are NEW and empty paths and files. If you attempt to use your SCR data at this point, you will have undesirable results! Now we run the restore command. This checks the status of the log shipping, and will attempt to copy missing log files if needed. It also disables the original SCR and makes the database viable for mounting.


Restore-StorageGroupCopy "EXCHANGESOURCE\ExecSG" -standbymachine EXCHANGETARGET



Now is a good time for a quick reminder on what SCR is and how it does log shipping. SCR essentially just copies log files to the target, and when a backup occurs, the target will also replay those log files. So if we skip the next step, we risk bringing up a database that is only as up to date as the last backup. If that was just before this test, it may not be a big deal, or may not be noticed (especially in a lab where you don't have live mail flow, etc.). So now we need to run ESEUTIL /R to replay the log files. This is done by running eseutil from the location of the log files like so:


eseutil /r E00


The Exx after /r is the prefix for that database's logs (you can check by looking in the log folder directory for that storage group).


This should replay the logs and bring the database to a clean shut down state. You can confirm this by running eseutil /mh and specify the EDB file. The database state should be Clean Shutdown.


The below command updates the new RecoverySG's paths to match the paths of our SCR Database. The "configurationOnly" flag here tells it NOT to move the existing file, but to just change the configuration.

Move-StorageGroupPath "EXCHANGETARGET\RecoverySG" -SystemFolderPath "D:\Exchange Logs\ExecSG" -LogFolderPath "D:\Exchange Logs\ExecSG" -ConfigurationOnly


Now we need to do the same thing for the Database - point the "new" DB at our "recovered" data.

Move-DatabasePath "EXCHANGETARGET\RecoveryDB" -EdbFilePath "D:\Exchange Databases\ExecDB.edb" -ConfigurationOnly


Now we need to set the database to allow file restore. This is what will allow this database to be mounted.

Set-MailboxDatabase "EXCHANGETARGET\RecoveryDB" -AllowFileRestore:$true


If you skip the above step, when you attempt to mount the database, you will receive an error that appears to be permissions related.

Mount-Database "EXCHANGETARGET\RecoveryDB"


This is where most admins breathe a sigh of relief, but we aren't done - we need to move users to this DB. Well, not really. What this really does is update these users' AD objects to have their Exchange server and homeMDB at their new location in the RecoveryDB.

Get-Mailbox -Database "EXCHANGESOURCE\ExecDB" | where {$_.ObjectClass -NotMatch '(SystemAttendantMailbox|ExOleDbSystemMailbox)'} | Move-Mailbox -ConfigurationOnly -TargetDatabase "EXCHANGETARGET\RecoveryDB"


The "where" clause in the middle of this is to prevent you from moving the System mailboxes that are unique to each DB. When you run this command - assuming your SCR target is in a different AD site - keep in mind you will need to sync AD to have your users in the main site start coming back online. You can trigger this in AD Sites and Services or various other ways, or just wait for replication to occur. At this point the users and database should all be online and client access to their data should be restored. If any HT servers in your organization had mail delivered for these users during the outage, it would deliver now. I recommend using OWA to test data as you may have client connectivity issues to troubleshoot as well with Outlook or Outlook Anywhere.


We can see the test email that I sent just prior to dismounting the database on the SCR source, so we know the logs replayed correctly. If your data is "out of date" then the likely thing that did not work is your ESEUTIL /R to replay the logs. You may be able to dismount the database and replay these again, but if you leave the DB mounted for a while and new messages flow in, your log sequence will likely be corrupted or broken. If you do this, you likely will need to go to restoring your DB from the previous night and then attempting to replay the SCR log files again.


Reseeding back to the original source server


Reseeding back is the exact same process, but your source and target are swapped. So you reseed your "live" data that is on your DR server back to your main server. First, clean up the old DB/SG and the files/folders under it on your target server. Then, you can choose to rename/modify any of the DB or SG names or paths to your liking. (This can be skipped if wanted, but needs to be done before you configure SCR.) Then, you can repeat the creation of an SCR replica and reseed the data back. Once data is seeded and healthy on the target, you repeat the failover process to "fail back". Once you fail back, clean up all the SG/DB paths/names once again on the DR server. Don't forget to recreate the SCR seed to your DR location after!

Tuesday, October 06, 2009

Exchange 2007 SCR How to - Part 1


All of this document is based on database portability offered in Exchange 2007 SP1 known as Standby Continuous Replication, or SCR. Microsoft's article is here: http://technet.microsoft.com/en-us/library/bb738132.aspx and one of the most overlooked items I felt in this document is this bit:
I will get into more details on this below.
Before getting started
Storage group and database paths must match on the source and target. So D:\Exchange Databases\ for the EdbFilePath must be valid on both servers. Due to this, I recommend creating the folder/path structure on the source and target as you go, and naming everything really smartly so you know what logs you are looking at when you are in Explorer. I typically use the two cmdlets below to make sure I have the info I need:
Get-StorageGroup -Server EXCHANGESOURCE | ft name,logfolderpath,systemfolderpath
Get-MailboxDatabase -Server EXCHANGESOURCE | ft name,edbfilepath
I recommend keeping the system folder path in the same folder as the log file path, both for uniformity and so that you don't have to type (and risk missing) the default location of C:\Program Files\Microsoft\Exchange Server\Mailbox\Storage Group in this step.
Also, only one database per storage group is supported for SCR log shipping to work and replay logs on the SCR target.
I recommend putting each .edb file into a separate sub folder as well because you later (in eseutil) need to specify the database directory (not path to EDB) to replay logs, and it’s a little intimidating if you have 4-5 edb files in the same directory.
Seeding the databases
On EXCHANGESOURCE, enable SCR:
Enable-StorageGroupCopy -StandbyMachine EXCHANGETARGET -Identity "EXCHANGESOURCE\ExecSG" -ReplayLagTime 0.1:0:00 -TruncationLagTime 0.2:0:0

Those values are in day.hour:minutes:seconds format - whether you write one or two zeroes in a field does not matter.

ReplayLagTime is the time that the target server will wait to replay a log file into the EDB. Above, it is set to 1 hour. If not specified, the default is 24 hours. While this may work, it can mean replaying a lot of logs, so a lower setting is preferred. There is a hard-coded lag of 50 log files here. This means you will always see the ReplayQueueLength as at least 50 when running Get-StorageGroupCopyStatus.
TruncationLagTime is the time that the SCR target will delay deleting a replayed log. This is helpful if there was ever an incident where restoring from backup had to be performed, and the replayed log files from an SCR source could be used to shorten the gap between backup and moment of failure. The Microsoft default for this is 0, however.
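If you want to double check how a lag value will be read before using it, the day.hour:minutes:seconds strings can be parsed with the .NET TimeSpan type. This is just a sanity check sketch using the two values from the Enable-StorageGroupCopy example above; the variable names are my own.

```powershell
# The two lag values from the Enable-StorageGroupCopy example above.
$replayLag     = [TimeSpan]::Parse('0.1:0:00')   # 0 days, 1 hour
$truncationLag = [TimeSpan]::Parse('0.2:0:0')    # 0 days, 2 hours

# TotalHours makes the intent easy to confirm before the value is applied.
"ReplayLagTime     = {0} hour(s)" -f $replayLag.TotalHours
"TruncationLagTime = {0} hour(s)" -f $truncationLag.TotalHours
```

This is an easy way to catch a value like 1.0:0:00, which is a full day rather than an hour.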
You may receive this warning:
WARNING: ExecSG copy is enabled but seeding is needed for the copy. The storage group copy is temporarily suspended. Please resume the copy after seeding the database.
Get-StorageGroupCopyStatus -StandbyMachine EXCHANGETARGET
Will show the SCR replication status, including copy queue length and a suspended status because the DB is not there yet. If not suspended, suspend with:
Suspend-StorageGroupCopy -Identity "EXCHANGESOURCE\ExecSG" -StandbyMachine EXCHANGETARGET
Now, on EXCHANGETARGET, we can seed the database.
Run EMS as administrator, or these will error with "Access to the path (edbfilelocation)\temp-seeding is denied"
Update-StorageGroupCopy -Identity "EXCHANGESOURCE\ExecSG" -StandbyMachine EXCHANGETARGET
This will then seed the data:

If you receive the following error:
Database Seeding Error: Error returned from an ESE function call (0xc7ff1004), error code (0x0).
You need to enable and allow Windows Powershell as a program in Windows firewall.
Once the seeding is completed, the suspended copy should resume automatically. If it does not, you can manually do this with:
Resume-StorageGroupCopy -Identity "EXCHANGESOURCE\ExecSG" -StandbyMachine EXCHANGETARGET
Confirming that the Database seed is healthy
From the SCR source:
Get-StorageGroupCopyStatus -StandbyMachine EXCHANGETARGET
From the SCR target:
Get-StorageGroupCopyStatus -server EXCHANGESOURCE -StandbyMachine EXCHANGETARGET
This outputs something like:
Name    SummaryCopyStatus  CopyQueueLength  ReplayQueueLength  LastInspectedLogTime
ExecSG  Healthy            0                1187               10/6/2009
Obviously, "Healthy" is what you want to see here. If there are NotConfigured, they are either not configured, OR you left the -standbymachine off! If you have errors, check your application event logs, ensure the folder structure is correct and read the next step below.
CopyQueueLength is the number of transaction logs waiting to be shipped. If this number is commonly growing, your WAN connection may not have sufficient bandwidth.
ReplayQueueLength is the number of logs in the SCR target's log directory waiting to be replayed. This number will increase continually until a full backup is taken on the SCR source, at which point the SCR target "replays" these logs and commits them to the EDB on the target server. It is important to know there is a hard coded lag of 50 log files that cannot be changed.
LastInspectedLogTime shows the date and time of the last log inspected on the SCR target. The time is usually cut off with … in PowerShell, so run something like:
Get-StorageGroupCopyStatus -StandbyMachine EXCHANGETARGET | ft name, LastInspectedLogTime
Additionally, from the SCR target, you can run test-ReplicationHealth to troubleshoot any issues with SCR. This cmdlet does not work from the source server and errors that LCR (local continuous replication) is not configured. It also accepts a -verbose argument which displays a lot more detail.
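If you have more than a couple of storage groups, a quick filter saves scanning the output by eye. The sketch below uses only the properties described above and shows anything that is not Healthy. EXCHANGETARGET is the example server name used throughout this post.

```powershell
# Run from the SCR source. Lists only the storage group copies that
# need attention, with the columns discussed above.
Get-StorageGroupCopyStatus -StandbyMachine EXCHANGETARGET |
    Where-Object { $_.SummaryCopyStatus -ne 'Healthy' } |
    Format-Table Name, SummaryCopyStatus, CopyQueueLength,
        ReplayQueueLength, LastInspectedLogTime -AutoSize
```

No output means every copy on that standby machine is reporting Healthy.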
Continue reading Part 2 which includes failover and failback as well.