Example Java code that uses the Cloudera Hive JDBC driver to run SQL against a Kerberos-enabled Hive database.
Step-1: First make sure you can get a Kerberos ticket with kinit on a Linux terminal, using a Kerberos principal that can access the Hive tables.
$ kinit userid@REALM
$ klist
It should show something like:
Ticket cache: FILE:/tmp/krb5cc_0
Default principal: userid@REALM
Valid starting Expires Service principal
08/27/2021 10:25:05 08/27/2021 20:25:05 krbtgt/xyz@REALM
renew until 09/03/2021 10:25:00
Important: make sure it shows "Ticket cache: FILE:" as above. If it shows something else, such as KCM: or KEYRING:, the Java JDBC driver will fail to read the ticket from the cache and the code won't work.
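If you want to sanity-check the cache type from Java as well, a small helper like the one below can inspect the KRB5CCNAME environment variable. This is just a sketch; the class and method names are my own, not part of any Cloudera or Hadoop API.

```java
// Sketch: verify the Kerberos ticket cache is a FILE: cache before trying JDBC.
// Class/method names are illustrative, not part of any driver API.
public class CheckTicketCache {

    // A file-based cache is either "FILE:/path" or a bare path; KCM: and KEYRING: are not.
    public static boolean isFileCache(String ccache) {
        if (ccache == null) {
            return true; // unset KRB5CCNAME usually defaults to FILE:/tmp/krb5cc_<uid>
        }
        return ccache.startsWith("FILE:") || ccache.startsWith("/");
    }

    public static void main(String[] args) {
        String ccache = System.getenv("KRB5CCNAME");
        if (isFileCache(ccache)) {
            System.out.println("OK: file-based ticket cache: " + ccache);
        } else {
            System.out.println("WARNING: " + ccache
                    + " is not a FILE: cache; the JDBC driver may not find your ticket");
        }
    }
}
```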
Step-2: Download the Cloudera Hive JDBC driver from the Cloudera website, cloudera.com. It is a jar named something like HiveJDBC41.jar; put it in the same directory as the Java code, for example /home/userid/myscripts. Note that the Apache Hive JDBC driver will not work with this code, because it expects a different JDBC URL syntax than the Cloudera driver.
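Before wiring up the full program, you can confirm the jar is actually visible on your classpath with a quick reflective check. A sketch, with an illustrative helper name of my own; run it with HiveJDBC41.jar on the classpath.

```java
// Sketch: confirm a JDBC driver class is loadable from the current classpath.
public class DriverCheck {

    public static boolean hasClass(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // prints false unless HiveJDBC41.jar is on the classpath
        System.out.println("Cloudera driver present: "
                + hasClass("com.cloudera.hive.jdbc41.HS2Driver"));
    }
}
```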
Step-3: Write your Java code as below:
Put your code in a directory such as /home/userid/myscripts :
$ vi /home/userid/myscripts/HiveJDBCTest.java

import java.sql.SQLException;
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.Statement;
import java.sql.DriverManager;
import org.apache.hadoop.security.UserGroupInformation;

public class HiveJDBCTest {
public static void main(String[] args) throws SQLException{
try {
Class.forName("com.cloudera.hive.jdbc41.HS2Driver");
// the apache jdbc driver below is for non-cloudera hive and does not work with this code
// Class.forName("org.apache.hive.jdbc.HiveDriver");
} catch (ClassNotFoundException e) {
e.printStackTrace();
System.exit(1);
}
try {
// below properties are not needed for cloudera jdbc driver so commented out
// System.setProperty("javax.security.auth.useSubjectCredsOnly","false");
// org.apache.hadoop.conf.Configuration conf = new org.apache.hadoop.conf.Configuration();
// conf.set("hadoop.security.authentication", "Kerberos");
// UserGroupInformation.setConfiguration(conf);
// UserGroupInformation.loginUserFromKeytab("hive/hostfqdn@REALM", "./hive.keytab");
// loginUserFromKeytab is not needed here: getConnection() below uses the credential cache
// from the kinit done manually outside this program. Also the cache must be FILE:,
// not KCM: or KEYRING:, as seen in the klist output.
System.out.println("getting connection....");
// Connection cnct = DriverManager.getConnection("jdbc:hive2://hive2host:10000/default;principal=hive/hostfqdn@REALM");
// the jdbc url above is not valid for the cloudera hive driver and fails with:
// java.sql.SQLException: [Cloudera][HiveJDBCDriver](500164) Error initialized or created transport for authentication: Peer indicated failure: Unsupported mechanism type PLAIN.
// the cloudera driver needs the url format below instead
Connection cnct = DriverManager.getConnection("jdbc:hive2://hive2host:10000/default;AuthMech=1;KrbRealm=REALM;KrbHostFQDN=hostfqdn;KrbServiceName=hive");
System.out.println("got connection");
Statement stmt = cnct.createStatement();
String sql = "show tables";
System.out.println("Running: " + sql);
ResultSet res = stmt.executeQuery(sql);
while (res.next()) {
System.out.println(res.getString(1));
}
res.close();
stmt.close();
cnct.close();
} catch (Exception e) {
e.printStackTrace();
}
}
}
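The Cloudera-specific URL format used above can be assembled from its pieces. The sketch below just documents what each parameter means; the helper name is mine, while the parameter names (AuthMech, KrbRealm, KrbHostFQDN, KrbServiceName) come from the connection string above.

```java
// Sketch: assemble the Cloudera Hive JDBC URL used in HiveJDBCTest.
// AuthMech=1 selects Kerberos; KrbHostFQDN must match the HiveServer2 host principal.
public class HiveUrl {

    public static String kerberosUrl(String host, int port, String db,
                                     String realm, String hostFqdn, String serviceName) {
        return "jdbc:hive2://" + host + ":" + port + "/" + db
                + ";AuthMech=1"                     // 1 = Kerberos
                + ";KrbRealm=" + realm
                + ";KrbHostFQDN=" + hostFqdn
                + ";KrbServiceName=" + serviceName; // usually "hive"
    }

    public static void main(String[] args) {
        System.out.println(kerberosUrl("hive2host", 10000, "default",
                "REALM", "hostfqdn", "hive"));
    }
}
```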
Step-4: Compile the Java code with the correct classpath.
Use command below to find the needed classpath:
$ hadoop classpath
/etc/hadoop/conf:/opt/cloudera/parcels/CDH-7.1.6-1.cdh7.1.6.p0.10506313/lib/hadoop/libexec/../../hadoop/lib/*:/opt/cloudera/parcels/CDH-7.1.6-1.cdh7.1.6.p0.10506313/lib/hadoop/libexec/../../hadoop/.//*:/opt/cloudera/parcels/CDH-7.1.6-1.cdh7.1.6.p0.10506313/lib/hadoop/libexec/../../hadoop-hdfs/./:/opt/cloudera/parcels/CDH-7.1.6-1.cdh7.1.6.p0.10506313/lib/hadoop/libexec/../../hadoop-hdfs/lib/*:/opt/cloudera/parcels/CDH-7.1.6-1.cdh7.1.6.p0.10506313/lib/hadoop/libexec/../../hadoop-hdfs/.//*:/opt/cloudera/parcels/CDH-7.1.6-1.cdh7.1.6.p0.10506313/lib/hadoop/libexec/../../hadoop-mapreduce/.//*:/opt/cloudera/parcels/CDH-7.1.6-1.cdh7.1.6.p0.10506313/lib/hadoop/libexec/../../hadoop-yarn/./:/opt/cloudera/parcels/CDH-7.1.6-1.cdh7.1.6.p0.10506313/lib/hadoop/libexec/../../hadoop-yarn/lib/*:/opt/cloudera/parcels/CDH-7.1.6-1.cdh7.1.6.p0.10506313/lib/hadoop/libexec/../../hadoop-yarn/.//*
Compile the code as:
$ javac -classpath /etc/hadoop/conf:/opt/cloudera/parcels/CDH-7.1.6-1.cdh7.1.6.p0.10506313/lib/hadoop/libexec/../../hadoop/lib/*:/opt/cloudera/parcels/CDH-7.1.6-1.cdh7.1.6.p0.10506313/lib/hadoop/libexec/../../hadoop/.//*:/opt/cloudera/parcels/CDH-7.1.6-1.cdh7.1.6.p0.10506313/lib/hadoop/libexec/../../hadoop-hdfs/./:/opt/cloudera/parcels/CDH-7.1.6-1.cdh7.1.6.p0.10506313/lib/hadoop/libexec/../../hadoop-hdfs/lib/*:/opt/cloudera/parcels/CDH-7.1.6-1.cdh7.1.6.p0.10506313/lib/hadoop/libexec/../../hadoop-hdfs/.//*:/opt/cloudera/parcels/CDH-7.1.6-1.cdh7.1.6.p0.10506313/lib/hadoop/libexec/../../hadoop-mapreduce/.//*:/opt/cloudera/parcels/CDH-7.1.6-1.cdh7.1.6.p0.10506313/lib/hadoop/libexec/../../hadoop-yarn/./:/opt/cloudera/parcels/CDH-7.1.6-1.cdh7.1.6.p0.10506313/lib/hadoop/libexec/../../hadoop-yarn/lib/*:/opt/cloudera/parcels/CDH-7.1.6-1.cdh7.1.6.p0.10506313/lib/hadoop/libexec/../../hadoop-yarn/.//*:/home/userid/myscripts/*:. HiveJDBCTest.java
warning: No SupportedSourceVersion annotation found on org.apache.hadoop.hdds.conf.ConfigFileGenerator, returning RELEASE_6.
warning: Supported source version 'RELEASE_6' from annotation processor 'org.apache.hadoop.hdds.conf.ConfigFileGenerator' less than -source '1.8'
2 warnings
Run the Java code as:
$ java -classpath /etc/hadoop/conf:/opt/cloudera/parcels/CDH-7.1.6-1.cdh7.1.6.p0.10506313/lib/hadoop/libexec/../../hadoop/lib/*:/opt/cloudera/parcels/CDH-7.1.6-1.cdh7.1.6.p0.10506313/lib/hadoop/libexec/../../hadoop/.//*:/opt/cloudera/parcels/CDH-7.1.6-1.cdh7.1.6.p0.10506313/lib/hadoop/libexec/../../hadoop-hdfs/./:/opt/cloudera/parcels/CDH-7.1.6-1.cdh7.1.6.p0.10506313/lib/hadoop/libexec/../../hadoop-hdfs/lib/*:/opt/cloudera/parcels/CDH-7.1.6-1.cdh7.1.6.p0.10506313/lib/hadoop/libexec/../../hadoop-hdfs/.//*:/opt/cloudera/parcels/CDH-7.1.6-1.cdh7.1.6.p0.10506313/lib/hadoop/libexec/../../hadoop-mapreduce/.//*:/opt/cloudera/parcels/CDH-7.1.6-1.cdh7.1.6.p0.10506313/lib/hadoop/libexec/../../hadoop-yarn/./:/opt/cloudera/parcels/CDH-7.1.6-1.cdh7.1.6.p0.10506313/lib/hadoop/libexec/../../hadoop-yarn/lib/*:/opt/cloudera/parcels/CDH-7.1.6-1.cdh7.1.6.p0.10506313/lib/hadoop/libexec/../../hadoop-yarn/.//*:/home/userid/myscripts/*:. HiveJDBCTest
getting connection....
got connection
Running: show tables
hue__tmp_xyz
hivetable1
hivetable2
Step-5: If you run into any trouble, try debugging by adding the flag:
$ java -Dsun.security.krb5.debug=true -classpath ……etc
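The same debug switch can also be turned on from inside the program, as long as it runs before the first JDBC/Kerberos call. A minimal sketch; sun.security.krb5.debug is the standard JVM system property behind the -D flag above, and the class name here is my own.

```java
// Sketch: enable Kerberos debug output programmatically instead of via -D on
// the command line. Must execute before the first Kerberos/JDBC operation.
public class KrbDebug {
    public static void main(String[] args) {
        System.setProperty("sun.security.krb5.debug", "true");
        System.out.println("krb5 debug = " + System.getProperty("sun.security.krb5.debug"));
    }
}
```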
That's all, folks! Happy debugging your Java Hive JDBC connection, which can be a real pain to get working given the many issues Kerberos can throw up. Add your comments below if you got it working!