Linux Active Directory join across distributions with SSSD and DNS discipline

A multi-distribution runbook for joining Linux to Active Directory with realmd, adcli, SSSD, Kerberos, validation steps, and the checks that catch fragile integrations before they reach production.

20 Apr 2026. Tags: linux, active-directory, sssd, kerberos, dns, identity

A successful realm join proves much less than most teams think. The join itself is usually the easy part. The fragile part appears later when DNS is inconsistent, Kerberos tickets age out, SSSD resolves users differently than expected, or PAM accepts authentication but never creates a usable session.

This article is deliberately multi-distribution. The technical chain is the same on every serious deployment: DNS, Kerberos, realmd, adcli, SSSD, PAM and authorization controls. What changes between distributions is mostly packaging, a few defaults, service habits and how people troubleshoot them.

What this article is trying to achieve

The target state is simple.

A Linux server is joined to Active Directory.

Directory users and groups resolve correctly.

Interactive logon works.

Home directory creation is predictable.

Access is limited to the right identities.

The machine can be validated with a short post-join checklist instead of a vague assumption that the join must still be healthy because it worked last week.

Scope and assumptions

The examples below assume these conditions.

Item                          Example
AD domain                     example.local
Linux hostname                srv-app-01.example.local
Domain controller discovery   DNS SRV records available
Join account                  administrator or delegated service account
Target OU                     OU=Linux,OU=Servers,DC=example,DC=local
Time source                   synchronized with the same trusted time domain

Network reachability must exist toward the AD-integrated DNS servers, Kerberos and LDAP endpoints used by your design. If DNS discovery is already unstable, do not start with the join command. Fix name resolution first.

The chain that matters on every distribution

The stack is always the same even when package names differ.

realmd orchestrates discovery and join flow.

adcli performs the machine account operations against AD.

SSSD handles identity, authentication cache, NSS and PAM integration.

Kerberos provides the trust and ticket layer.

PAM and oddjob-mkhomedir or an equivalent mechanism decide whether a successful authentication becomes a usable session.

That is why multi-distribution guidance works best when the article keeps a common diagnostic backbone and only branches when package or platform behavior genuinely differs.

Validate DNS before touching the domain

If DNS is wrong, the rest of the workflow becomes misleading. The server may discover part of the directory, join with partial success, or resolve one controller and fail on another.

bash precheck-dns.sh
hostname -f
cat /etc/resolv.conf
resolvectl status

dig _ldap._tcp.dc._msdcs.example.local SRV
dig _kerberos._tcp.example.local SRV

dig dc01.example.local
dig dc02.example.local
getent hosts dc01.example.local
getent hosts dc02.example.local

What you want to see is consistent SRV discovery, stable forward resolution for controllers and no accidental dependency on a public resolver that knows nothing about your AD namespace.

A join can appear to work even with broken DNS if one controller is reachable and cached. The failure then shows up later as random logon issues, delayed group resolution or intermittent kinit failures.
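The SRV and forward-resolution checks can be tied together so that every controller advertised in DNS is also tested through the host's own resolver path. The following is a minimal sketch, not a definitive implementation: the function names srv_targets and check_srv are hypothetical, and it assumes dig and getent are installed.

```shell
# Extract target hostnames from `dig +short <name> SRV` output on stdin.
# Each answer line looks like: "0 100 389 dc01.example.local."
srv_targets() {
    awk 'NF == 4 { sub(/\.$/, "", $4); print $4 }' | sort -u
}

# Query one SRV name and check that each target resolves through NSS.
check_srv() {
    name=$1
    targets=$(dig +short "$name" SRV | srv_targets)
    if [ -z "$targets" ]; then
        echo "FAIL no SRV answers for $name"
        return 1
    fi
    rc=0
    for host in $targets; do
        if getent hosts "$host" >/dev/null; then
            echo "ok   $host"
        else
            echo "FAIL $host does not resolve"
            rc=1
        fi
    done
    return $rc
}
```

A typical call would be check_srv _ldap._tcp.dc._msdcs.example.local followed by check_srv _kerberos._tcp.example.local. A controller that appears in SRV answers but fails getent is exactly the partial-success case described above.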

Validate time and Kerberos expectations

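Clock skew remains one of the simplest ways to create a join that looks healthy but behaves badly during real authentication. Kerberos rejects requests when the client and the domain controller disagree on the time by more than the allowed tolerance, which is five minutes by default in Active Directory.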

bash precheck-time-kerberos.sh
timedatectl

# Use chronyc or ntpq depending on your stack
chronyc sources -v || true
ntpq -pn || true

kinit administrator@EXAMPLE.LOCAL
klist

If kinit fails before the join, the problem is usually easier to isolate than after SSSD and PAM are in the middle of the path.
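Where chrony is the time client, the offset can be checked against a threshold instead of being read by eye. This sketch assumes the usual "System time" line printed by chronyc tracking and a 300-second limit. The function names are hypothetical, and the limit should match whatever tolerance your domain actually enforces.

```shell
# Read `chronyc tracking` output on stdin and print the clock offset in seconds.
# Relevant line: "System time     : 0.000012345 seconds fast of NTP time"
chrony_offset() {
    awk -F': *' '
        /^System time/ { split($2, f, " "); print f[1]; found = 1 }
        END { exit !found }
    '
}

# Succeed when the offset is within the allowed skew (default 300 seconds).
skew_ok() {
    offset=$1
    max=${2:-300}
    awk -v o="$offset" -v m="$max" 'BEGIN { exit !(o + 0 <= m + 0) }'
}
```

For example: offset=$(chronyc tracking | chrony_offset) && skew_ok "$offset" || echo "clock offset too large or unknown". The value reflects the offset chrony itself reports, so a host that has no usable time source still needs the chronyc sources check.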

Distribution-specific package installation

The common mistake here is to copy a single package list from one blog post and assume it applies everywhere.

Ubuntu and Debian family

bash ubuntu-debian-packages.sh
sudo apt update
sudo apt install -y \
  realmd sssd sssd-tools adcli krb5-user \
  samba-common-bin packagekit \
  oddjob oddjob-mkhomedir \
  libnss-sss libpam-sss

RHEL, Rocky and AlmaLinux family

bash rhel-family-packages.sh
sudo dnf install -y \
  realmd sssd sssd-tools adcli krb5-workstation \
  samba-common-tools oddjob oddjob-mkhomedir \
  authselect authselect-compat

On current RHEL-like systems, authselect matters because PAM and NSS profile management should not be edited blindly by hand.
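One way to avoid applying the wrong package list is to select it from /etc/os-release rather than from memory. The sketch below is an illustration under assumptions: os_family and packages_for are hypothetical helper names, the family detection relies only on the standard ID and ID_LIKE fields, and the package lists are the two from this article.

```shell
# Print "debian", "rhel" or "unknown" based on an os-release file.
os_family() {
    file=${1:-/etc/os-release}
    # Source the file in a subshell so ID and ID_LIKE do not leak out.
    (
        [ -r "$file" ] || exit 1
        ID=
        ID_LIKE=
        . "$file"
        case " $ID $ID_LIKE " in
            *" debian "*|*" ubuntu "*) echo debian ;;
            *" rhel "*|*" fedora "*|*" centos "*) echo rhel ;;
            *) echo unknown ;;
        esac
    )
}

# Print the package list for a family, or fail for an unknown one.
packages_for() {
    case $1 in
        debian) echo "realmd sssd sssd-tools adcli krb5-user samba-common-bin packagekit oddjob oddjob-mkhomedir libnss-sss libpam-sss" ;;
        rhel)   echo "realmd sssd sssd-tools adcli krb5-workstation samba-common-tools oddjob oddjob-mkhomedir authselect authselect-compat" ;;
        *)      return 1 ;;
    esac
}
```

Failing on an unknown family is deliberate. A host that does not match either branch should stop the workflow rather than receive a guessed package set.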

Discover the domain before joining

Discovery tells you whether the host can see the directory in a sane way before it writes machine identity state.

bash realm-discover.sh
realm discover example.local

The output should confirm the directory type, the detected domain name, the configured client software and the server software. If discovery is slow, inconsistent or empty, that is already a signal that DNS or network assumptions are wrong.
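Discovery output can also be checked mechanically, which is useful as a gate before an automated join. The sketch below assumes the "key: value" layout that realm discover normally prints, with fields such as server-software and client-software. The helper names realm_field and discovery_ok are hypothetical.

```shell
# Print the value of one "key: value" field from realm output on stdin.
realm_field() {
    awk -v key="$1" '
        { line = $0; sub(/^[ \t]+/, "", line) }
        index(line, key ": ") == 1 { print substr(line, length(key) + 3); exit }
    '
}

# Succeed when discovery reports an AD domain with SSSD as the client.
discovery_ok() {
    out=$(realm discover "$1") || return 1
    [ "$(printf '%s\n' "$out" | realm_field server-software)" = "active-directory" ] &&
    [ "$(printf '%s\n' "$out" | realm_field client-software)" = "sssd" ]
}
```

A call such as discovery_ok example.local || exit 1 stops the workflow before any machine identity state is written.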

Join the machine in a controlled way

The join command is short. The decisions around it are not.

bash realm-join.sh
sudo realm join example.local \
  -U administrator \
  --computer-ou="OU=Linux,OU=Servers,DC=example,DC=local"

Joining directly into the right OU is worth doing. It avoids post-join cleanup, keeps GPO and delegation intent readable and makes later automation easier.

Validate the join immediately instead of trusting the command exit code alone.

bash post-join-quick-check.sh
realm list
sudo adcli info example.local
sudo adcli testjoin

Inspect and normalize SSSD

Even if the join populates configuration automatically, you still need to read it. A working default is not the same as an intentional configuration.

A practical baseline looks like this.

ini /etc/sssd/sssd.conf
[sssd]
services = nss, pam, sudo
config_file_version = 2
domains = example.local

[domain/example.local]
id_provider = ad
access_provider = ad
cache_credentials = true
default_shell = /bin/bash
use_fully_qualified_names = false
fallback_homedir = /home/%u
ldap_id_mapping = true

Then enforce the expected permissions and restart SSSD.

bash sssd-permissions-and-restart.sh
sudo chmod 600 /etc/sssd/sssd.conf
sudo systemctl restart sssd
sudo systemctl status sssd --no-pager

The important decisions here are not decorative.

use_fully_qualified_names = false changes how identities are typed and how sudo rules or scripts must refer to groups.

fallback_homedir decides where a first interactive session lands.

cache_credentials = true improves survivability during temporary directory reachability issues, but it also means troubleshooting must account for cache behavior.

ldap_id_mapping = true can simplify environments that do not manage POSIX attributes centrally. In environments with explicit UID and GID governance, you may choose differently.
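Because these settings drive later behavior, it can help to assert them rather than assume them. The following sketch reads a single key from one section of an INI-style file. ini_get is a hypothetical helper that only handles simple "key = value" lines, which is an assumption about how your sssd.conf is written.

```shell
# Print the value of KEY inside [SECTION] of an INI-style file.
# Usage: ini_get FILE SECTION KEY
ini_get() {
    awk -v section="[$2]" -v key="$3" '
        /^[ \t]*\[/ {
            gsub(/^[ \t]+|[ \t]+$/, "")
            in_section = ($0 == section)
            next
        }
        in_section {
            split($0, kv, "=")
            k = kv[1]
            gsub(/^[ \t]+|[ \t]+$/, "", k)
            if (k == key) {
                v = substr($0, index($0, "=") + 1)
                gsub(/^[ \t]+|[ \t]+$/, "", v)
                print v
                exit
            }
        }
    ' "$1"
}
```

Run as root, a check such as [ "$(ini_get /etc/sssd/sssd.conf domain/example.local use_fully_qualified_names)" = "false" ] turns the naming decision into something a pipeline can verify. The sssctl config-check command from sssd-tools complements this by validating the file's syntax and permissions.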

Distribution-specific authentication integration

Ubuntu and Debian family

Most of the time, realmd and package scripts handle PAM wiring correctly, but you still want to confirm that home directory creation is active.

bash ubuntu-debian-pam-check.sh
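# Check whether pam_mkhomedir is already wired into the session stacks
grep mkhomedir /etc/pam.d/common-session /etc/pam.d/common-session-noninteractive || true
# If it is missing, enable the profile without the interactive dialog
sudo pam-auth-update --enable mkhomedir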

RHEL, Rocky and AlmaLinux family

Use authselect instead of editing PAM stacks casually.

bash rhel-family-authselect.sh
sudo authselect select sssd with-mkhomedir --force
sudo systemctl enable --now oddjobd
sudo authselect current

That step is easy to skip when a lab test works through SSH with an existing account. It becomes visible only when new users cannot create a session cleanly on first login.

Validate user and group resolution

Joining the host is not the same as proving the identity path is healthy.

bash identity-resolution-checks.sh
id user1
getent passwd user1
getent group linux-admins
getent group "domain admins" || true

What matters here is consistency. id, getent, SSH and sudo should all agree on how the account and group names are represented.

If they do not agree, you usually have one of three issues. DNS discovery is inconsistent. SSSD configuration does not match your naming assumptions. Or you are testing against stale cache state.
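The naming assumption in particular is easy to test directly. This sketch compares the form in which NSS returns a user name with the form you expect. The helper names are hypothetical, and "fqn" here simply means a name containing an @ sign, which is an assumption about the format your SSSD domain uses for qualified names.

```shell
# Print the login name (first field) of a passwd-format line on stdin.
passwd_name() {
    cut -d: -f1
}

# Classify a name as "fqn" (user@domain) or "short".
name_style() {
    case $1 in
        *@*) echo fqn ;;
        *)   echo short ;;
    esac
}

# Compare how NSS represents USER with the EXPECTED style ("short" or "fqn").
check_name_style() {
    user=$1
    expected=$2
    entry=$(getent passwd "$user") || { echo "FAIL $user does not resolve"; return 1; }
    actual=$(name_style "$(printf '%s\n' "$entry" | passwd_name)")
    if [ "$actual" = "$expected" ]; then
        echo "ok   $user is returned as a $actual name"
    else
        echo "FAIL $user is returned as a $actual name, expected $expected"
        return 1
    fi
}
```

With the baseline configuration in this article, check_name_style user1 short should pass. A failure points at a mismatch between sssd.conf and the naming assumption before sudo rules or scripts depend on it.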

Restrict who is allowed to log on

The default access posture is often broader than teams realize: after a plain realm join, any valid domain account is typically allowed to log on. Do not assume the join created the logon policy you wanted.

bash realm-permit.sh
# Option 1: broad access, rarely the right long-term choice
sudo realm permit --all

# Option 2: explicit access control, deny everyone and then allow one group
sudo realm deny --all
sudo realm permit -g "linux-admins"

Not every environment uses realm permit as the long-term authorization layer. Some rely on SSSD access rules, some on directory groups with stricter policy design. The important point is to make the access model explicit and testable.

Add sudo through an AD-backed group

Once naming rules are settled, sudo can remain very simple.

text /etc/sudoers.d/linux-admins
%linux-admins ALL=(ALL:ALL) ALL

If your environment keeps fully qualified names enabled, the group reference changes accordingly. That is one reason to decide early whether short names or fully qualified names are the better fit for your estate.
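To make the difference concrete, the sketch below prints the sudoers rule for either naming style. sudoers_group_rule is a hypothetical helper. It assumes the qualified form is group@domain and escapes spaces in group names with a backslash, as sudoers syntax requires.

```shell
# Print a sudoers rule for an AD group.
# Pass the domain as the second argument only when fully qualified names are enabled.
sudoers_group_rule() {
    group=$1
    if [ -n "${2:-}" ]; then
        group="$group@$2"
    fi
    # Spaces inside a sudoers group name must be escaped with a backslash.
    escaped=$(printf '%s' "$group" | sed 's/ /\\ /g')
    printf '%%%s ALL=(ALL:ALL) ALL\n' "$escaped"
}
```

sudoers_group_rule linux-admins prints the short-name rule shown above, while sudoers_group_rule linux-admins example.local prints the qualified variant. Whichever form is written to /etc/sudoers.d, checking the file with visudo -cf before relying on it avoids locking out sudo with a syntax error.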

Validate the result with a real user instead of trusting the file alone.

bash sudo-validation.sh
sudo -l -U user1

Clear SSSD cache when troubleshooting naming or membership drift

Cache is useful until it hides the state you are trying to diagnose.

bash sssd-cache-troubleshooting.sh
sudo sss_cache -E
sudo systemctl restart sssd
journalctl -u sssd --no-pager -n 100

Use that sequence carefully. It is a troubleshooting tool, not a daily operational habit.

A post-join validation runbook worth keeping

This is the short list that catches most fragile integrations before users do.

bash post-join-validation.sh
hostname -f
realm list
sudo adcli testjoin
klist
id user1
getent passwd user1
getent group linux-admins
sudo -l -U user1
journalctl -u sssd --no-pager -n 100

If one of these checks fails, the join is not finished from an operational standpoint.
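The same list can be wrapped so that it reports every failure instead of stopping at the first one. This is a minimal sketch: run_check and the failures counter are hypothetical names, and the checks themselves are the commands from the runbook above.

```shell
# Count of failed checks; run_check updates it in the current shell.
failures=0

# Run one command quietly and report whether it succeeded.
# Usage: run_check "label" command [args...]
run_check() {
    label=$1
    shift
    if "$@" >/dev/null 2>&1; then
        echo "ok   $label"
    else
        echo "FAIL $label"
        failures=$((failures + 1))
    fi
}

# Example wiring, left commented so nothing runs until you adapt it:
# run_check "machine account"  sudo adcli testjoin
# run_check "user resolves"    getent passwd user1
# run_check "group resolves"   getent group linux-admins
# run_check "sudo policy"      sudo -l -U user1
# [ "$failures" -eq 0 ] || exit 1
```

Collecting all results before exiting gives a clearer picture than a script that aborts on the first error, because several of these failures tend to share one root cause such as DNS.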

The failures that show up most often

DNS remains the main source of false confidence. A server that points at the wrong resolvers can still pass a partial test and then fail unpredictably when controller selection changes.

Time drift is the second recurring cause. Kerberos does not tolerate sloppy synchronization just because the host is virtual and mostly idle.

The third common problem is naming inconsistency. Teams join with short names in mind, then configure sudo, scripts or monitoring with fully qualified names, or the reverse.

The fourth is cache confusion. People fix a group, retry immediately, then conclude the directory still disagrees when the host is serving stale SSSD data.

The fifth is session usability. Authentication succeeds, but home directory creation, shell assignment or PAM behavior makes the first login effectively broken.

Design decisions to settle before automating this with Ansible or AWX

Before you industrialize this workflow, decide these points clearly.

Will you use centrally managed POSIX attributes or SSSD ID mapping?

Will identities be referenced as short names or fully qualified names?

Will every domain user be allowed to log on, or only designated groups?

Will sudo remain local through sudoers.d files or move toward directory-backed policy?

Which prechecks must pass before the automation is allowed to attempt a join?

A join playbook without these decisions is usually just a faster way to deploy an inconsistent identity model.
