One Click Script to Deploy CDP (CDP PvC & HDP & CDH)
frischHWC/one-script-deploy
SUCCESS | Type | Versions | OS | DB | KDC | TLS | Comment |
---|---|---|---|---|---|---|---|
✓ | basic | CDP 7.3.1.0 & CM 7.13.1.0 | RHEL 8.10 | Postgres 14 | MIT KDC | True | |
✓ | full | CDP 7.3.1.0 & CM 7.13.1.0 | RHEL 8.10 | Postgres 14 | MIT KDC | True | |
✓ | all-services | CDP 7.3.1.0 & CM 7.13.1.0 & CFM 2.2.9.0 & CSA 1.14.0.0 | RHEL 8.10 | Postgres 14 | MIT KDC | True | |
✓ | streaming | CDP 7.3.1.0 & CM 7.13.1.0 & CFM 2.2.9.0 & CSA 1.14.0.0 | RHEL 8.10 | Postgres 14 | MIT KDC | True | |
✓ | streaming-with-efm | CDP 7.3.1.0 & CM 7.13.1.0 & CFM 2.2.9.0 & EFM 2.2.99.0 & CSA 1.14.0.0 | RHEL 8.10 | Postgres 14 | MIT KDC | True | |
✓ | observability | CDP 7.3.1.0 & CM 7.13.1.0 & Observability 3.5.3 | RHEL 8.10 | Postgres 14 | MIT KDC | True | |
| | pvc | CDP 7.3.1.0 & CM 7.13.1.0 & PvC 1.5.4 | RHEL 8.8 | Postgres 14 | Free IPA | True | |
| | all-services-pvc | CDP 7.3.1.0 & CM 7.13.1.0 & PvC 1.5.4 | RHEL 8.8 | Postgres 14 | Free IPA | True | |
| | pvc-oc | CDP 7.3.1.0 & CM 7.13.1.0 & PvC 1.5.4 | CENTOS 7.9 | Postgres 14 | Free IPA | True | |
| | all-services-pvc-oc | CDP 7.3.1.0 & CM 7.13.1.0 & PvC 1.5.4 | CENTOS 7.9 | Postgres 14 | Free IPA | True | |
✓ | basic | CDP 7.1.9.14 & CM 7.11.3.9 | CENTOS 8.8 | Postgres 14 | MIT KDC | True | On AWS with setup-cluster-on-cloud |
✓ | full | CDP 7.1.9.14 & CM 7.11.3.9 | CENTOS 8.8 | Postgres 14 | MIT KDC | True | On AWS with setup-cluster-on-cloud |
✓ | streaming | CDP 7.1.9.14 & CM 7.11.3.9 & CFM 2.1.7 | CENTOS 8.8 | Postgres 14 | MIT KDC | True | On AWS with setup-cluster-on-cloud |
| | pvc | CDP 7.1.9.14 & CM 7.11.3.9 & PvC 1.5.4 | CENTOS 8.8 | Postgres 14 | Free IPA | True | On AWS with setup-cluster-on-cloud |
✓ | cdp-719 | CDP 7.1.9.1015 & CM 7.11.3.26 | RHEL 8.10 | Postgres 14 | MIT KDC | True | |
✓ | cdp-719-basic | CDP 7.1.9.1015 & CM 7.11.3.26 | RHEL 8.10 | Postgres 14 | MIT KDC | True | |
✓ | cdp-719-all-services | CDP 7.1.9.1015 & CM 7.11.3.26 | RHEL 8.10 | Postgres 14 | MIT KDC | True | |
✓ | cdp-719-streaming | CDP 7.1.9.1015 & CM 7.11.3.26 & CFM 2.1.7.1000 | RHEL 8.10 | Postgres 14 | MIT KDC | True | |
✓ | cdp-719-pvc | CDP 7.1.9.1015 & CM 7.11.3.26 & PvC 1.5.4 | RHEL 8.8 | Postgres 14 | Free IPA | True | Use --os-version=8.8 |
✓ | cdp-717 | CDH 7.1.7.2026 & CM 7.6.7 | CENTOS 7.9 | Postgres 14 | MIT KDC | True | |
✓ | cdh-5 | CDH 5.16 & CM 5.16.2 | CENTOS 7.6 | Postgres 12 | MIT KDC | False | |
✓ | cdh-6 | CDH 6.3.4 & CM 6.3.4 | CENTOS 7.6 | Postgres 12 | MIT KDC | False | |
✓ | hdp-3 | HDP 3.1.5.6091 & Ambari 2.7.5.0 | CENTOS 7.6 | MySQL | MIT KDC | False | |
✓ | hdp-2 | HDP 2.6.5.0 & Ambari 2.6.2.2 | Postgres 12 | MySQL | MIT KDC | False |
Successful installations are marked with a ✓ in the SUCCESS column.
It requires Ansible, with a minimum version of 2.10 and a maximum version of 2.16.
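The version window above can be checked before anything else is launched. The following is a minimal sketch, not part of the repository: the function name `ansible_version_ok` is hypothetical, and it assumes that the first line of `ansible --version` contains the core version number.

```shell
# Return 0 when the given Ansible version is within 2.10 .. 2.16 (inclusive).
# Hypothetical helper, written for POSIX sh.
ansible_version_ok() (
  version=$1
  major=${version%%.*}      # text before the first dot
  rest=${version#*.}        # text after the first dot
  minor=${rest%%.*}         # second component
  case "$major$minor" in
    ''|*[!0-9]*) exit 2 ;;  # empty or not numeric
  esac
  [ "$major" -eq 2 ] && [ "$minor" -ge 10 ] && [ "$minor" -le 16 ]
)

# Only inspect the local installation when Ansible is actually present.
if command -v ansible >/dev/null 2>&1; then
  # Assumption: the first line of the output carries the version number.
  found=$(ansible --version 2>/dev/null | head -n 1 | grep -oE '[0-9]+\.[0-9]+(\.[0-9]+)?' | head -n 1)
  if ansible_version_ok "$found"; then
    echo "Ansible $found is in the supported range"
  else
    echo "Ansible '$found' is outside the supported range 2.10 - 2.16" >&2
  fi
fi
```

The check compares only the major and minor components, so any patch release inside the window is accepted.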
Launch the script requirements.sh to install all requirements before launching the full script.
You need to pass one parameter: the type of machine you are running it on, which can be rhel, debian, suse or mac. For example, on macOS, run:
./requirements.sh mac
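If you are unsure which of the four values applies, the machine type can usually be guessed from the operating system. This is a hedged sketch only: `machine_type` and `current_ids` are hypothetical helpers, and the mapping from distribution identifiers to rhel, debian or suse is an assumption that is not taken from requirements.sh.

```shell
# Map a kernel name (uname -s) and os-release identifiers to a requirements.sh argument.
# The distribution lists below are assumptions, not taken from the repository.
machine_type() (
  kernel=$1
  ids=$2   # values of ID and ID_LIKE from /etc/os-release, space separated
  if [ "$kernel" = "Darwin" ]; then
    echo mac
    exit 0
  fi
  for id in $ids; do
    case "$id" in
      rhel|centos|fedora|rocky|almalinux) echo rhel; exit 0 ;;
      debian|ubuntu)                      echo debian; exit 0 ;;
      suse|sles|opensuse*)                echo suse; exit 0 ;;
    esac
  done
  exit 1
)

# Read the identifiers of the current machine (prints nothing without /etc/os-release).
current_ids() (
  if [ -r /etc/os-release ]; then
    . /etc/os-release
    echo "${ID:-} ${ID_LIKE:-}"
  fi
)

# Print the suggested command instead of running it.
if type_arg=$(machine_type "$(uname -s)" "$(current_ids)"); then
  echo "./requirements.sh $type_arg"
else
  echo "Could not guess the machine type; pick one of: rhel, debian, suse, mac" >&2
fi
```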
To install a cluster (the default is a 10-node CDP 7 cluster with Kerberos and TLS enabled):
export PAYWALL_USER= # Your Paywall User from Cloudera to access archive.cloudera.com
export PAYWALL_PASSWORD= # Your Paywall password from Cloudera to access archive.cloudera.com
export LICENSE_FILE= # Your License file from Cloudera
export CLUSTER_NAME= # A name of your choice (ex: cloudera-test)
export NODES= # *Space* separated list of nodes (ex: "node1 node2 node3") (You must provide as many nodes as are needed for the type of installation you are launching, default being 10.)

./setup-cluster.sh \
  --cluster-name=${CLUSTER_NAME} \
  --license-file=${LICENSE_FILE} \
  --paywall-username=${PAYWALL_USER} \
  --paywall-password=${PAYWALL_PASSWORD} \
  --nodes="${NODES}"
N.B.: This assumes that a passwordless connection exists from this machine to all your cluster nodes; otherwise, provide a password with --node-password or a private key file with --node-key.
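Before launching, it can help to verify the assumptions stated above: the variables are set, enough nodes are listed, and SSH works without a password. The sketch below is illustrative only; `missing_vars`, `count_nodes`, `can_ssh` and `preflight` are hypothetical helpers that are not part of the repository. The `ssh` options used (`BatchMode`, `ConnectTimeout`) are standard OpenSSH client options.

```shell
# Print the names of the given variables that are unset or empty.
missing_vars() (
  for name in "$@"; do
    eval "value=\${$name:-}"
    [ -n "$value" ] || echo "$name"
  done
)

# Count the entries of a space separated node list.
count_nodes() (
  set -f        # disable globbing while the list is word-split
  set -- $1
  echo $#
)

# Return 0 when key-based (passwordless) SSH to the node works.
can_ssh() {
  ssh -o BatchMode=yes -o ConnectTimeout=5 "$1" true >/dev/null 2>&1
}

# Run all checks; the argument is the number of nodes the cluster type needs.
preflight() (
  required_nodes=${1:-10}   # 10 is the default cluster size stated above
  status=0
  for name in $(missing_vars PAYWALL_USER PAYWALL_PASSWORD LICENSE_FILE CLUSTER_NAME NODES); do
    echo "Missing variable: $name" >&2
    status=1
  done
  if [ "$(count_nodes "${NODES:-}")" -lt "$required_nodes" ]; then
    echo "Expected at least $required_nodes nodes" >&2
    status=1
  fi
  for node in ${NODES:-}; do
    if ! can_ssh "$node"; then
      echo "No passwordless SSH to $node (use --node-password or --node-key)" >&2
      status=1
    fi
  done
  exit "$status"
)

# Usage (not executed here): preflight 10 && ./setup-cluster.sh ...
```

Nothing is contacted until `preflight` is called, so the helpers can be sourced safely into an existing shell session.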
Deployment on cloud machines currently works only with AWS.
Note: This requires you to install Terraform and AWS CLI on the machine where you will launch this script.
It also requires that you create a key pair in AWS in advance, have its private key available locally, and get an AWS access key and secret access key:
export ACCESS_KEY="" # Your AWS Access Key
export SECRET_ACCESS_KEY="" # Your AWS Secret Access Key
export KEY_PAIR_NAME="" # Name of a key pair in AWS that will be set to access your machines
export PRIVATE_KEY_PATH="" # Local private key path to use to access your machines
export WHITELIST_IP="" # Your IP, so only this IP will be able to access your machines
export PAYWALL_USER= # Your Paywall User from Cloudera to access archive.cloudera.com
export PAYWALL_PASSWORD= # Your Paywall password from Cloudera to access archive.cloudera.com
export LICENSE_FILE= # Your License file from Cloudera
export CLUSTER_NAME= # A name of your choice (ex: cloudera-test)

./setup-cluster-on-cloud.sh \
  --cloud-provider="AWS" \
  --aws-access-key=${ACCESS_KEY} \
  --aws-secret-access-key=${SECRET_ACCESS_KEY} \
  --aws-key-pair-name=${KEY_PAIR_NAME} \
  --private-key-path=${PRIVATE_KEY_PATH} \
  --whitelist-ip=${WHITELIST_IP} \
  --os-version=8.7 \
  --setup-etc-hosts=false \
  --cluster-name=${CLUSTER_NAME} \
  --license-file=${LICENSE_FILE} \
  --paywall-username=${PAYWALL_USER} \
  --paywall-password=${PAYWALL_PASSWORD}
All parameters above must be left as they are, since they are appropriate for AWS machines. After these parameters, you can add any other parameter that works with the setup-cluster.sh script.
The script uses Terraform to provision your machines, sets up connectivity, and then launches setup-cluster.sh with pre-configured parameters to create the requested cluster.
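Since the cloud script depends on Terraform, the AWS CLI and a local private key, a quick check of those prerequisites can save a failed run. This is a minimal sketch under assumptions: `missing_tools` and `private_key_ok` are hypothetical helpers, and the only facts relied on are that the Terraform and AWS CLI executables are named `terraform` and `aws`.

```shell
# Print every given command name that is not available on the PATH.
missing_tools() (
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
)

# Return 0 when the given private key path is non-empty and readable.
private_key_ok() {
  [ -n "${1:-}" ] && [ -r "$1" ]
}

# Report what is missing; nothing is installed or modified.
for tool in $(missing_tools terraform aws); do
  echo "Please install '$tool' before running setup-cluster-on-cloud.sh" >&2
done
if ! private_key_ok "${PRIVATE_KEY_PATH:-}"; then
  echo "PRIVATE_KEY_PATH does not point to a readable file" >&2
fi
```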
./setup-cluster.sh \
  --cluster-name=${CLUSTER_NAME} \
  --cluster-type=basic \
  --nodes-base="${NODES}"

./setup-cluster.sh \
  --cluster-name=${CLUSTER_NAME} \
  --license-file=${LICENSE_FILE} \
  --paywall-username=${PAYWALL_USER} \
  --paywall-password=${PAYWALL_PASSWORD} \
  --nodes-base="${NODES}"

./setup-cluster.sh \
  --cluster-name=${CLUSTER_NAME} \
  --license-file=${LICENSE_FILE} \
  --paywall-username=${PAYWALL_USER} \
  --paywall-password=${PAYWALL_PASSWORD} \
  --cluster-type=basic \
  --nodes-base="${NODES}"

./setup-cluster.sh \
  --cluster-name=${CLUSTER_NAME} \
  --license-file=${LICENSE_FILE} \
  --paywall-username=${PAYWALL_USER} \
  --paywall-password=${PAYWALL_PASSWORD} \
  --cluster-type=basic-enc \
  --nodes-kts=<Dedicated Node(s) for KTS> \
  --nodes-base="${NODES}"
CDP 7 - Basic 6 nodes with Free IPA on a dedicated node (All CDP clusters can have Free IPA just by adding --free-ipa=true and providing a node with --node-ipa=) (Kerberos / TLS)
./setup-cluster.sh \
  --cluster-name=${CLUSTER_NAME} \
  --license-file=${LICENSE_FILE} \
  --paywall-username=${PAYWALL_USER} \
  --paywall-password=${PAYWALL_PASSWORD} \
  --cluster-type=basic \
  --free-ipa=true \
  --node-ipa=<One node dedicated to IPA> \
  --nodes-base="${NODES}"
CDP 7 - Streaming cluster (6 nodes basic with Spark 3 and Flink + a VPC of 3 nodes of Kafka/Nifi) (Kerberos / TLS)
./setup-cluster.sh \
  --cluster-name=${CLUSTER_NAME} \
  --license-file=${LICENSE_FILE} \
  --paywall-username=${PAYWALL_USER} \
  --paywall-password=${PAYWALL_PASSWORD} \
  --cluster-type=streaming \
  --nodes-base="${NODES}"
CDP 7 - All Services (6 nodes basic with Spark 3 and Flink + 3 Nifi/Kafka nodes + 1 node for KTS ) (Kerberos / TLS)
./setup-cluster.sh \
  --cluster-name=${CLUSTER_NAME} \
  --license-file=${LICENSE_FILE} \
  --paywall-username=${PAYWALL_USER} \
  --paywall-password=${PAYWALL_PASSWORD} \
  --cluster-type=all-services \
  --nodes-kts=<Dedicated Node for KTS> \
  --nodes-base="${NODES}"

./setup-cluster.sh \
  --cluster-name=${CLUSTER_NAME} \
  --license-file=${LICENSE_FILE} \
  --paywall-username=${PAYWALL_USER} \
  --paywall-password=${PAYWALL_PASSWORD} \
  --cluster-type=pvc \
  --nodes-ecs=<Space separated list of 3 nodes> \
  --node-ipa=<One node dedicated to IPA> \
  --nodes-base="${NODES}"
CDP 7 - 6 nodes basic for PVC with Openshift (Experiences installed on a provided OCP cluster) (Kerberos / TLS / FreeIPA)
./setup-cluster.sh \
  --cluster-name=${CLUSTER_NAME} \
  --license-file=${LICENSE_FILE} \
  --paywall-username=${PAYWALL_USER} \
  --paywall-password=${PAYWALL_PASSWORD} \
  --cluster-type=pvc-oc \
  --kubeconfig-path=<Path to your kubeconfig file> \
  --oc-tar-file-path=<Path to your oc.tar file downloaded from RedHat> \
  --node-ipa=<One node dedicated to IPA> \
  --nodes-base="${NODES}"
CDP 7 - All Services (6 nodes basic with Spark 3 and Flink + 3 Nifi/Kafka nodes + 1 node for KTS + Associated with a PvC ) (Kerberos / TLS / FreeIPA)
./setup-cluster.sh \
  --cluster-name=${CLUSTER_NAME} \
  --license-file=${LICENSE_FILE} \
  --paywall-username=${PAYWALL_USER} \
  --paywall-password=${PAYWALL_PASSWORD} \
  --cluster-type=all-services-pvc \
  --nodes-kts=<Dedicated Node for KTS> \
  --node-ipa=<Dedicated Node for IPA> \
  --nodes-ecs=<Space separated list of 3 nodes> \
  --nodes-base="${NODES}"
CDP 7 - Observability cluster (Requires a cluster to be plugged into; it creates a cluster of 6 nodes) (Kerberos / TLS)
./setup-cluster.sh \
  --cluster-name=${CLUSTER_NAME} \
  --license-file=${LICENSE_FILE} \
  --paywall-username=${PAYWALL_USER} \
  --paywall-password=${PAYWALL_PASSWORD} \
  --cluster-type=observability \
  --altus-key-id=<ALTUS key ID provided by Cloudera> \
  --altus-private-key=<path to ALTUS private key provided by Cloudera> \
  --cm-base-url=<http://<CM host to connect to OBSERVABILITY>:<Port>> \
  --tp-host=<Host in base cluster that will have Telemetry Publisher installed> \
  --nodes-base="${NODES}"
./setup-cluster.sh \
  --cluster-name=${CLUSTER_NAME} \
  --license-file=${LICENSE_FILE} \
  --paywall-username=${PAYWALL_USER} \
  --paywall-password=${PAYWALL_PASSWORD} \
  --cdh-version='7.1.8.1' \
  --cm-version='7.7.3-33365545' \
  --nodes-base="${NODES}"

./setup-cluster.sh \
  --cluster-name=${CLUSTER_NAME} \
  --license-file=${LICENSE_FILE} \
  --paywall-username=${PAYWALL_USER} \
  --paywall-password=${PAYWALL_PASSWORD} \
  --kerberos=false \
  --tls=false \
  --nodes-base="${NODES}"

./setup-cluster.sh \
  --cluster-name=${CLUSTER_NAME} \
  --license-file=${LICENSE_FILE} \
  --paywall-username=${PAYWALL_USER} \
  --paywall-password=${PAYWALL_PASSWORD} \
  --cluster-type=cdh6 \
  --nodes-base="${NODES}"

./setup-cluster.sh \
  --cluster-name=${CLUSTER_NAME} \
  --license-file=${LICENSE_FILE} \
  --paywall-username=${PAYWALL_USER} \
  --paywall-password=${PAYWALL_PASSWORD} \
  --cluster-type=cdh5 \
  --nodes-base="${NODES}"

./setup-cluster.sh \
  --cluster-name=${CLUSTER_NAME} \
  --license-file=${LICENSE_FILE} \
  --paywall-username=${PAYWALL_USER} \
  --paywall-password=${PAYWALL_PASSWORD} \
  --cluster-type=hdp3 \
  --nodes-base="${NODES}"
At the end, CM or Ambari (depending on your installation) should be available at the first node's URL, using http or https and the corresponding port (this depends on the tls parameter, which is false by default for HDP and true by default for CDP).
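As an illustration of how the scheme and port follow from the manager and the TLS setting, here is a small sketch. The helper `manager_url` is hypothetical, and the ports are the stock defaults of the products (7180/7183 for Cloudera Manager, 8080/8443 for Ambari); it is an assumption that these scripts do not override them.

```shell
# Print the expected web UI URL for a manager ("cm" or "ambari"),
# a TLS setting ("true" or "false") and the first node's hostname.
# Ports are product defaults, assumed not to be changed by the installation.
manager_url() (
  manager=$1
  tls=$2
  host=$3
  case "$manager:$tls" in
    cm:true)      echo "https://$host:7183" ;;
    cm:false)     echo "http://$host:7180" ;;
    ambari:true)  echo "https://$host:8443" ;;
    ambari:false) echo "http://$host:8080" ;;
    *)            exit 1 ;;
  esac
)

# Example: a default CDP installation (TLS enabled) whose first node is "node1".
manager_url cm true node1
```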
You can also follow the progress of the installation from CM or Ambari by connecting to it.
N.B.: It is recommended not to interfere with the cluster during the Ansible installation until it is done.
At the end of the installation, if it completed successfully, users are created on the machines along with their keytabs, which are retrieved to your local computer under /tmp/; krb5.conf is also retrieved.
Moreover, it is also possible to launch some random data generation into various systems.
All default passwords are Cloudera1234
This section describes in detail the steps performed during the installation, in order; each one can be skipped and hence launched separately.
Once you have gathered all previous requirements, you can launch the script; it mainly consists of 5 steps:
Prepare your machines
Launch the installation from the first node of your cluster using appropriate ansible playbook and files
Do post-install configuration (mainly for CDP)
Create users on your cluster
Load some data into your cluster
Each step can be skipped (see the command-line help).
This group of scripts, coordinated by the main script setup-cluster.sh, aims to configure the provided machines and launch a CDP (or HDP, CDH) installation with Ansible. Finally, some extra configuration steps can be run and random data can be generated into different services.
All of this is driven from your machine only.
This script relies on Ansible scripts that must be accessible from your machine (if they are not, please set up an internal web server and provide its URL through the command line).
The Ansible scripts also rely on the Cloudera repository to access CDP, CM, HDP, Ambari etc. (if it is not accessible, please set up an internal web server and provide its URL through the command line).
This script also relies on a GitHub repository to load data (if it is not accessible, please set up an internal web server and provide its URL through the command line).
This step uses Playbook hosts_setup.
If you did not set the parameter --setup to false, it prepares all machines by setting up passwordless SSH and pushing the required files to them.
N.B.: This step only needs to be done once and can then be bypassed if you reuse the same machines.
This step uses Playbook ansible_install_preparation and then launches commands directly on the host to start the Ansible installation there.
The first playbook can be skipped by setting the parameter --install to false (it is true by default).
It cleans up the first node, creates a directory ~/deployment/ansible-repo/, downloads the Ansible repository into it as a zip, and adds the files for your installation to it.
Then, the proper Ansible command corresponding to the installation is launched directly on the first node.
This step uses Playbook post_install.
If you install a CDP cluster and leave the parameter --post-install set to true, it performs some extra steps, such as setting no unlogin on CM and fixing various potential bugs.
This step uses Playbook user_creation.
If you did not explicitly set the parameter --user-creation to false and the installation completed successfully, the users defined in extra_vars of user_creation are created.
They are present on all nodes, with their /home directory containing their keytabs.
Their keytabs are also fetched into your /tmp directory along with the krb5.conf, allowing you to kinit directly from your computer.
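Obtaining a ticket from one of the fetched keytabs could look like the sketch below. It assumes MIT Kerberos client tools, which honor the `KRB5_CONFIG` environment variable and accept `kinit -kt <keytab> <principal>`. The helper `kinit_from_keytab` is hypothetical, and the exact keytab file names and principals are not specified here, so they are left as placeholders.

```shell
# Obtain a Kerberos ticket from a fetched keytab, using the fetched krb5.conf.
# Hypothetical helper; assumes MIT Kerberos client tools are installed.
kinit_from_keytab() (
  krb5_conf=$1
  keytab=$2
  principal=$3
  [ -r "$krb5_conf" ] || { echo "Cannot read $krb5_conf" >&2; exit 1; }
  [ -r "$keytab" ]    || { echo "Cannot read $keytab" >&2; exit 1; }
  # KRB5_CONFIG points the Kerberos tools at a non-default configuration file.
  KRB5_CONFIG=$krb5_conf kinit -kt "$keytab" "$principal"
)

# Usage (placeholders, not executed here):
#   klist -kt /tmp/<keytab file>          # lists the principals stored in the keytab
#   kinit_from_keytab /tmp/krb5.conf /tmp/<keytab file> <principal>
```

Setting `KRB5_CONFIG` only for the single command avoids changing the Kerberos configuration of the rest of your session.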
This step uses Playbook data_load.
If you leave the parameter --data-load set to true, a data loading step starts (only on CDP, HDP 2 and CDH 5 currently) to generate data into the existing services of the platform: HDFS, HBase, Hive etc.
It is based on the random-datagen project.
Note that this step is completely extensible, as you can add new files that specify how data should be generated in the folder playbooks/data_load/generate_data/models
N.B.: This step also creates the required Ranger policies, and these are also extensible by adding policies in playbooks/data_load/ranger_policies/push_policies/policies
Once you are familiar with these scripts, you can easily tune them using command-line parameters to provide your own cluster files and repositories.
To provide a quick new definition of a cluster:
Copy the directory ansible-cdp and give the copy a new name, for example ansible-cdp-configured
Make all your modifications in the files of your copied directory
Launch the script with the argument:
--cluster-type=ansible-cdp-configured
(It will automatically take the files under the ansible-cdp-configured/ directory)
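The copy step above can be sketched as a small helper that refuses to overwrite an existing definition and prints the matching argument. `new_cluster_definition` is a hypothetical name; only the directory names ansible-cdp and ansible-cdp-configured and the --cluster-type argument come from the steps above.

```shell
# Copy an existing cluster definition directory under a new name,
# then print the --cluster-type argument that selects the copy.
new_cluster_definition() (
  source_dir=$1
  target_dir=$2
  [ -d "$source_dir" ]   || { echo "No such directory: $source_dir" >&2; exit 1; }
  [ ! -e "$target_dir" ] || { echo "Already exists: $target_dir" >&2; exit 1; }
  cp -R "$source_dir" "$target_dir"
  echo "--cluster-type=$(basename "$target_dir")"
)

# Usage (run from the repository root, not executed here):
#   new_cluster_definition ansible-cdp ansible-cdp-configured
#   ... edit the files under ansible-cdp-configured/ ...
#   ./setup-cluster.sh ... --cluster-type=ansible-cdp-configured
```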
Those steps can be launched independently, and you can configure them to create more users or load more and different data.
Look at the extra_vars.yml files inside the playbooks folder to learn more about the possibilities.
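To launch only some steps, the five step parameters documented earlier (--setup, --install, --post-install, --user-creation, --data-load) can be set explicitly. The sketch below only builds the argument string; `step_flags` is a hypothetical helper, and it is an assumption that the script accepts every combination of these parameters, so check the command-line help first.

```shell
# Print the five step parameters, enabling only the steps named as arguments.
# Step names must match the parameter names without the leading dashes.
step_flags() (
  wanted=" $* "   # padded with spaces so that whole names are matched
  flags=""
  for step in setup install post-install user-creation data-load; do
    case "$wanted" in
      *" $step "*) value=true ;;
      *)           value=false ;;
    esac
    flags="$flags --$step=$value"
  done
  echo "${flags# }"
)

# Example: only create users and load data on an already installed cluster.
step_flags user-creation data-load
# Usage (not executed here):
#   ./setup-cluster.sh ... $(step_flags user-creation data-load)
```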
Private Cloud setup (on ECS or OC) can also be launched independently on a running cluster.
Configuration of a private cloud cluster can also be launched independently. (Use --install-pvc=false but --pvc=true to configure but not re-install your PvC.)
In extra_vars.yml you can list the CDWs, CDEs and CMLs that will be provisioned for you, as well as the rights that you expect your users to have.
TLS is not set for HDP & CDH clusters
Data loading is not made for HDP 3 & CDH 6 clusters
Free IPA is only available for CDP clusters
Please feel free to contribute and help solve and implement the TODOs listed in TODOs.adoc