
weave's Journal: The Nightmare of Active Directory Replication (Win 2003)

Had a day from hell yesterday. Had a power failure Friday night which affected two of our four Active Directory controllers. As luck would have it, the emergency generator which is supposed to power the room at times like this failed to start too. Bottom line, two ADs went dark for an hour.

When they came up, for whatever reason, the BIOS on one of them lost its time and came up as year 2003. It was quickly noticed, fixed, and rebooted with correct time. All was well, or at least I thought.

Our four ADs are across two sites with a replication path resembling a box...

a1 <----> b1
/\        /\
|          |
\/        \/
a2 <----> b2

Sites a and b are connected via a WAN link and are in different AD "sites."

A few days later, it's noticed that site b isn't getting replicated data from a. Some playing around reveals that b can replicate to a but not vice versa.

"repadmin /showreps" reveals numerous auth errors saying "logon failure target account name is incorrect."

Doing a "repadmin /syncall" throws a similar error if run on the site b machines.

Google searches on this indicate problems with the machine account password, a duplicate machine name, or even dynamic DNS problems. I note that if a DC at site b just accesses \\a1.domain.name\c$ it gives the same error, but not if done via \\a1\c$ or \\ipaddr\c$. That makes me believe it's not an auth issue but a name resolution issue.
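To repeat that comparison for each DC, it helps to write out the three name forms side by side. Below is a minimal sketch in Python. The helper name `unc_forms` is hypothetical, and the IP address is a placeholder from the documentation range, not one of our real addresses. It only builds the paths and never touches the network.

```python
def unc_forms(short_name, dns_domain, ip_address, share="c$"):
    """Return the three UNC paths for one server, keyed by name form."""
    template = r"\\{}\{}"  # raw string: two backslashes, host, one backslash, share
    return {
        "fqdn": template.format(short_name + "." + dns_domain, share),
        "short": template.format(short_name, share),
        "ip": template.format(ip_address, share),
    }


# 192.0.2.10 is a placeholder address (assumption), standing in for a1's real IP.
for form, path in unc_forms("a1", "domain.name", "192.0.2.10").items():
    print(form, path)
```

Trying each printed path by hand from a site b DC shows which name form triggers the error.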

So much time is spent checking DNS, the GUID version of each machine's DNS name, comparing the GUID on each box to see if it's identical, etc, etc...

OK, all works there, so time to think about the machine account password. Find references and KB articles saying how to use "netdom resetpwd" where each article details the steps, like purging the Kerberos ticket list, at different places. Arrggh...

Since b can't talk to a, try resetting the password on the b boxes. No go. Then try an a box, still no go. Was missing a vital step that took digging through Usenet posts to find and which isn't clear from the Microsoft tech docs.

Syntax for the netdom resetpwd command is:

netdom resetpwd /s:servername /userd:domainadmin /passwordd:*

... where domainadmin is a domain admin account. Well, the "servername" specified is critical to making it work. The netdom command will reset the machine account password on that box (and in the case of an AD box, its own AD records) PLUS record it into AD on the box specified by /s. We were setting that to the local AD server.

So the key to making it work was doing this on each box on site a and specifying the server as its replication partner in site b to inject a good password record there.

After that was done, credentials worked again and replication started happening again.

The steps required to do this include purging the ticket list and stopping and starting the KDC service.

Example:

net stop kdc
klist purge
netdom resetpwd /s:b1.domain.name /userd:domain\admin /passwordd:*
net start kdc
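With more than one DC to fix, it is easy to point /s at the wrong partner. Here is a small Python sketch that prints the same four commands for each site a box. The pairing a1 -> b1 and a2 -> b2 is an assumption read off the horizontal links in the diagram, `reset_plan` is a hypothetical helper, and nothing is executed: the output is just a checklist to run by hand on each DC.

```python
# Site a DC -> its site b replication partner (assumed from the box diagram).
PARTNERS = {"a1": "b1", "a2": "b2"}


def reset_plan(partner, dns_domain="domain.name", admin="domain\\admin"):
    """Return the commands to run locally on one site a DC, in order."""
    return [
        "net stop kdc",
        "klist purge",
        # /s must name the partner DC in the other site, not the local DC.
        "netdom resetpwd /s:{}.{} /userd:{} /passwordd:*".format(
            partner, dns_domain, admin
        ),
        "net start kdc",
    ]


for dc, partner in PARTNERS.items():
    print("On {}:".format(dc))
    for command in reset_plan(partner):
        print("  " + command)
```

The domain name and admin account are the same placeholders used in the example above and need to be swapped for real values.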

I'm hoping this gets indexed by Google and helps someone else out with this problem someday.
