Overview
From 2013 to 2017, I worked for Avalon Consulting, LLC as a Hadoop consultant. During this time I worked with a lot of clients and secured (TLS/SSL, LDAP, Kerberos, etc.) quite a few Hadoop clusters for both Hortonworks and Cloudera. There have been a few posts out there about debugging Kerberos problems, like @steveloughran's "Hadoop and Kerberos: The Madness beyond the Gate". This post covers a few of the tips I've collected over the years that apply to Kerberos in general as well as to Apache Hadoop.
Increase kinit verbosity
By default, kinit doesn't display any debug information and will typically come back with an obscure error on failure. The following command enables verbose logging to standard out, which can help with debugging.
KRB5_TRACE=/dev/stdout kinit -V
Java Kerberos/KRB5 and SPNEGO Debug System Properties
Java internal classes that deal with Kerberos have system properties that turn on debug logging. These properties enable a lot of debugging, so they should only be turned on when trying to diagnose a problem and then turned off. They can also be combined if necessary.
The first property handles Kerberos errors and can help with misconfigured KDC servers, krb5.conf issues, and other problems.
-Dsun.security.krb5.debug=true
The second property is specifically for SPNEGO debugging for a Kerberos-secured web endpoint. SPNEGO can be hard to debug, but this flag enables additional debug logging.
-Dsun.security.spnego.debug=true
These properties can be set with the *_OPTS variables for Apache Hadoop and related components, as in the example below:
HADOOP_OPTS="-Dsun.security.krb5.debug=true" # optionally add -Dsun.security.spnego.debug=true
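As a sketch, both flags can also be exported together for a whole shell session (the hdfs command in the comment is just an illustrative target, not from this post):

```shell
# Enable both Kerberos and SPNEGO debug output for any Hadoop client
# command launched from this shell session.
export HADOOP_OPTS="-Dsun.security.krb5.debug=true -Dsun.security.spnego.debug=true"
# A subsequent client command will then print the extra debug output, e.g.:
# hdfs dfs -ls /
```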
Hadoop Command Line Debug Logging
Most of the Apache Hadoop command line tools (i.e. hdfs, hadoop, yarn, etc.) use the same underlying logging mechanism, Log4j. Log4j doesn't allow dynamically adjusting log levels, but it does allow the logger to be adjusted before using the commands. Hadoop exposes the root logger as the environment variable HADOOP_ROOT_LOGGER. This can be used to change the logging of a specific command without changing log4j.properties.
HADOOP_ROOT_LOGGER=DEBUG,console hdfs ...
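The variable can also be exported once for a whole session rather than prefixed onto each command (a sketch; the hdfs invocation in the comment is illustrative):

```shell
# Turn on DEBUG console logging for every Hadoop CLI command run from
# this shell, without editing log4j.properties.
export HADOOP_ROOT_LOGGER=DEBUG,console
# e.g.:
# hdfs dfs -ls /
```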
Debugging Hadoop Users and Groups
Users with Apache Hadoop are typically authenticated through Kerberos, as explained here. Once a user is authenticated, the username is used to determine groups. Groups with Apache Hadoop can be configured in a variety of ways with Hadoop Groups Mappings. Debugging what Apache Hadoop thinks your user and groups are is critical for setting up security correctly.
The first command takes a user principal and returns the username based on the configured hadoop.security.auth_to_local rules.
hadoop org.apache.hadoop.security.HadoopKerberosName USER_PRINCIPAL
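For context, the auth_to_local rules live in core-site.xml; a minimal sketch might look like the following (the EXAMPLE.COM realm and the nn-to-hdfs mapping are assumptions for illustration, not values from this post):

```xml
<property>
  <name>hadoop.security.auth_to_local</name>
  <value>
    RULE:[2:$1@$0](nn@EXAMPLE.COM)s/.*/hdfs/
    DEFAULT
  </value>
</property>
```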
The second command takes the username and determines its associated groups, using the configured Hadoop Groups Mappings.
hdfs groups USERNAME
The third command uses the currently authenticated user and prints out the current user's UGI. It can also take a principal and keytab to print information about that UGI.
hadoop org.apache.hadoop.security.UserGroupInformation
hadoop org.apache.hadoop.security.UserGroupInformation "PRINCIPAL" "KEYTAB"
The fourth command, KDiag, is relatively new: it was introduced with HADOOP-12426 and first released in Apache Hadoop 2.8.0. This command wraps several additional debugging tools into one and checks for common Kerberos-related misconfigurations.
# Might also set HADOOP_JAAS_DEBUG=true and set the log level 'org.apache.hadoop.security=DEBUG'
hadoop org.apache.hadoop.security.KDiag
Conclusion
More than half the battle of dealing with Kerberos and distributed systems is knowing where to look and what logs to generate. With the right logs, it becomes possible to debug the problem and resolve it quickly.