In this post I would like to show you how to integrate Apache Ranger with LDAP. I'll be using a minimal 6-node Hortonworks development cluster, with FreeIPA as the LDAP provider. The same steps should work similarly in an HDP 2.5 sandbox.
I won't go into much detail about Apache Ranger or FreeIPA, because I assume that if you are reading this you already know these products and what you are trying to accomplish. That said, taken from Ranger's website:
Ranger is a framework to enable, monitor and manage comprehensive data security across the Hadoop platform. The vision with Ranger is to provide comprehensive security across the Apache Hadoop ecosystem. Apache Ranger has the following goals:
And FreeIPA has the following main features (again taken from their website):
I personally like FreeIPA because it takes two things that are difficult to set up and makes them clean and easy, with a wonderful web GUI. Also, open source is wonderful (and free!).
Environment Setup
Operating System for HDP and FreeIPA: centos-release-6-8.el6.centos.12.3.x86_64
HDP Version: 2.5.3.0-37
Ambari Version: 2.4.2.0
Ranger Version: 0.6.0
FreeIPA Version: 3.0.0
OpenLDAP Version: 2.4.40
Configuration Changes
Before you can enable LDAP, a few things need to be in place. First, install the FreeIPA client on your client nodes. On CentOS 6 the package manager provides FreeIPA v3.0.0 (to get the latest version you'll have to use the tarball):
yum -y install ipa-client
TIP:
Once installed, you'll need to keep track of the basic connection info: your bind DN, bind password (for simple authentication), the LDAP URL, port, and base DN for searches. I found this command helpful for debugging and for finding exactly what you are looking for:
ldapsearch -x -H ldaps://<FREEIPA_SERVER_FQDN>:<PORT_NUMBER> -D "<BIND DN>" -w <PASSWORD> -b "<BASE DN>" uid=<USERNAME>
For example:
ldapsearch -x -H ldaps://freeipa.novalocal:636 -D "cn=Directory Manager" -w SuperSecretPassword -b "dc=novalocal" uid=admin
Step One: Log into Ambari, open the Ranger service, and go to the Configuration menu.
Step Two: Go to Ranger User Info. You will need to enable User Sync. Once it is enabled, all of the sync information will be shown. You'll need to select:
Step Three: On the next tab, User Configs, change the:
Group Configs will stay the default, not synced.
Step Four: From Ranger User Info, go to the Advanced tab at the very top of the screen. There are two sections to modify: Ranger Settings and LDAP Settings.
In Ranger Settings:
In LDAP Settings:
At this point you can hit save and restart the necessary services for it to work.
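Before restarting, it can help to re-run the ldapsearch check from the tip above with the exact values you entered in Ambari. Here is a minimal sketch of a helper that prints the command for review rather than running it. The function name build_ldapsearch is my own invention, the values are the example ones from this post, and I am assuming you would rather be prompted for the bind password (-W) than type it on the command line. It passes the server as an LDAP URI with -H.

```shell
#!/bin/sh
# Hypothetical helper (not part of FreeIPA or Ranger): print the ldapsearch
# command for a single user lookup so it can be reviewed before running.
build_ldapsearch() {
    server="$1"    # FreeIPA server FQDN
    port="$2"      # LDAPS port (636 in the example from this post)
    bind_dn="$3"   # bind DN used for simple authentication
    base_dn="$4"   # base DN for the search
    user="$5"      # uid to look up
    # -x: simple authentication, -H: LDAP URI, -W: prompt for the bind password
    printf 'ldapsearch -x -H ldaps://%s:%s -D "%s" -W -b "%s" uid=%s\n' \
        "$server" "$port" "$bind_dn" "$base_dn" "$user"
}

# Example values taken from this post; replace them with your own.
build_ldapsearch freeipa.novalocal 636 "cn=Directory Manager" "dc=novalocal" admin
```

If the printed command looks right, copy it into your terminal; the same bind DN and base DN are the values Ranger needs.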
Ranger updates users and groups on a regular schedule; to force an update, you can manually restart the Ranger usersync process. One thing I noticed right away was that groups were not syncing in Ranger. You can verify this by running kinit as a user who is a member of a specific group, for example group1, and then listing that user's groups:
kinit user01
> kinit user102
> groups user102
user102 : user102
To correct this, add the following line to the domain section of /etc/sssd/sssd.conf:
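The group check above can also be scripted, which is handy if you have several users to verify. This is only a sketch: the function name in_group is hypothetical, and it assumes the "user : group group ..." line format that the groups command prints in this post. The second sample line is what the output would look like once group1 shows up.

```shell
#!/bin/sh
# in_group LINE GROUP: succeed if GROUP appears in a `groups USER` output
# line such as "user102 : user102 group1". (Hypothetical helper.)
in_group() {
    # Strip the leading "user : " prefix, then compare each remaining word.
    for g in ${1#*: }; do
        [ "$g" = "$2" ] && return 0
    done
    return 1
}

# Sample lines in the format shown in this post.
in_group "user102 : user102" group1 || echo "group1 missing"
in_group "user102 : user102 group1" group1 && echo "group1 present"
```

On a real node you would pass the live output instead of a sample string, for example in_group "$(groups user102)" group1.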
ldap_group_object_class = ipaUserGroup
Now, when you do your group check, it'll report back correctly. If it still doesn't, you might need to clear your SSSD cache. To clear the cache and update all records:
sss_cache -E
> kinit user102
> groups user102
user102 : user102 group1
I hope this tutorial was helpful for you. If you have any questions, please let me know in the comments below!

Big data analytics is becoming an important topic for companies, and R is a popular research tool. I had trouble finding an up-to-date resource on how to install RHadoop on the most recent platforms: many of the tutorials out there are several software versions behind or years old. So I felt it necessary to provide an updated guide to the installation. Currently, I am using the most up-to-date software possible, consisting of:
Hortonworks HDP-2.2.6.3-1
Ubuntu 12.04.5 LTS (GNU/Linux 3.13.0-55-generic x86_64) (the latest stable Ubuntu version supported)
R version 3.2.1 (2015-06-18) -- "World-Famous Astronaut"
RStudio Version 0.99.463
We will begin by verifying that some prerequisites are installed:
$ sudo apt-get install libboost-dev libboost-test-dev libboost-program-options-dev libboost-system-dev libboost-filesystem-dev libevent-dev automake libtool flex bison pkg-config g++ libssl-dev
Then we can proceed to install R. You may need to add the repository to your sources list, which can be found in /etc/apt/sources.list. An example of the line you will need to add is:
deb http://cran.revolutionanalytics.com/bin/linux/ubuntu precise/
The universe repository does contain R, however, so you may want to see what is available there.
$ sudo apt-get install r-base r-base-dev
Next we will need to download a few files from Revolution Analytics for the RHadoop packages themselves.
In your terminal execute:
$ wget https://github.com/RevolutionAnalytics/plyrmr/releases/download/0.6.0/plyrmr_0.6.0.tar.gz
$ wget https://github.com/RevolutionAnalytics/rmr2/releases/download/3.3.1/rmr2_3.3.1.tar.gz
$ wget https://github.com/RevolutionAnalytics/rhdfs/releases/download/1.0.8/rhdfs_1.0.8.tar.gz
$ wget https://github.com/RevolutionAnalytics/rhbase/releases/download/1.2.1/rhbase_1.2.1.tar.gz
For some reason or another, I had issues downloading those files, so for convenience (and for my own sanity later on) I've attached them; they can be found below:
Once R is installed, we can start R by typing "R" and install a few packages. I've included rmr, rhdfs, rhbase, and plyrmr in case those packages eventually become available from CRAN; at the present moment, they will error out.
> install.packages(c("rJava", "RJSONIO", "rmr", "rhdfs", "rhbase", "plyrmr"), dependencies=TRUE, repos='http://cran.us.r-project.org')
Once it is complete, we can quit R with q(); there is no need to save the workspace. At this point we can begin to install some of the RHadoop packages. It is important that sudo is used here, because these packages need to be installed in the system library, not the user library. This can be done by:
$ sudo R CMD INSTALL plyrmr_0.6.0.tar.gz
$ sudo R CMD INSTALL rmr2_3.3.1.tar.gz
Java 7 JDK should already be installed, however, if it isn't, install it by running the following command and set all of
$ sudo apt-get install openjdk-7-jdk
$ sudo R CMD javareconf
$ curl https://archive.apache.org/dist/thrift/0.8.0/thrift-0.8.0.tar.gz | tar zx
$ cd thrift-0.8.0/
$ ./configure
$ make
$ sudo make install
$ export PKG_CONFIG_PATH=$PKG_CONFIG_PATH:/usr/local/lib/pkgconfig/
Verify the pkg-config path is correct:
$ pkg-config --cflags thrift
returns: -I/usr/local/include/thrift
$ sudo cp /usr/local/lib/libthrift-0.8.0.so /usr/lib/
I had issues using Thrift 0.9.2, so I went back to 0.8.0; however, I have heard that 0.9.0 should work as well. Once these items are installed, we can finish the installation with rhbase.
$ R CMD INSTALL rhbase_1.2.1.tar.gz
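As a small aside, the four RHadoop downloads used earlier in this guide all follow the same GitHub release URL pattern, so the list can be generated from package and version pairs. This is just a sketch under that assumption: the function name rhadoop_url is hypothetical, and the pattern is only known to match the four URLs shown in this post.

```shell
#!/bin/sh
# rhadoop_url PKG VERSION: print the release URL for one RHadoop tarball,
# following the pattern of the wget commands in this post. (Hypothetical helper.)
rhadoop_url() {
    printf 'https://github.com/RevolutionAnalytics/%s/releases/download/%s/%s_%s.tar.gz\n' \
        "$1" "$2" "$1" "$2"
}

# Package:version pairs as used in this post.
for pair in plyrmr:0.6.0 rmr2:3.3.1 rhdfs:1.0.8 rhbase:1.2.1; do
    # ${pair%%:*} is the package name, ${pair##*:} is the version.
    rhadoop_url "${pair%%:*}" "${pair##*:}"
done
```

Piping the output into wget -i - would fetch all four files in one go, and bumping a version only means editing one pair.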
Author: James Benson is an IT professional.