Platform Extension Framework (PXF) for Apache Cloudberry (Incubating)
PXF is an extensible framework that allows a distributed database like Greenplum and Apache Cloudberry to query external data files, whose metadata is not managed by the database. PXF includes built-in connectors for accessing data that exists inside HDFS files, Hive tables, HBase tables, JDBC-accessible databases and more. Users can also create their own connectors to other data storage or processing engines.
This project is derived from greenplum/pxf and customized for Apache Cloudberry.
- `external-table/`: Contains the Cloudberry extension implementing an External Table protocol handler
- `fdw/`: Contains the Cloudberry extension implementing a Foreign Data Wrapper (FDW) for PXF
- `server/`: Contains the server side code of PXF along with the PXF Service and all the Plugins
- `cli/`: Contains command line interface code for PXF
- `automation/`: Contains the automation and integration tests for PXF against the various datasources
- `ci/`: Contains CI/CD environment and scripts (including singlecluster Hadoop environment)
- `regression/`: Contains the end-to-end (integration) tests for PXF against the various datasources, utilizing the PostgreSQL testing framework `pg_regress`
Below are the steps to build and install PXF along with its dependencies including Cloudberry and Hadoop.

    git clone https://github.com/apache/cloudberry-pxf.git
To build PXF, you must have:

- GCC compiler, `make` system, `unzip` package, `maven` for running integration tests
- Installed Cloudberry

  Either download and install the Cloudberry RPM or build Cloudberry from source by following the Cloudberry instructions.

  Assuming you have installed Cloudberry into the `/usr/local/cloudberry-db` directory, run its environment script:

      source /usr/local/cloudberry-db/greenplum_path.sh  # For Cloudberry 2.0
      source /usr/local/cloudberry-db/cloudberry-env.sh  # For Cloudberry 2.1+

- JDK 1.8 or JDK 11 to compile/run

  Export your `JAVA_HOME`:

      export JAVA_HOME=/usr/lib/jvm/java-11-openjdk

- Go (1.9 or later)

  You can download and install Go via the Go downloads page.

  Make sure to export your `GOPATH` and add go to your `PATH`. For example:

      export GOPATH=$HOME/go
      export PATH=$PATH:/usr/local/go/bin:$GOPATH/bin

  Once you have installed Go, you will need the `ginkgo` tool, which runs Go tests. Assuming `go` is on your `PATH`, you can run:

      go install github.com/onsi/ginkgo/ginkgo@latest
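Before building, it can help to confirm that the tools listed above are actually on your `PATH`. The following is a minimal sketch, not part of PXF: the `missing_tools` function is a hypothetical helper, and the executable names (`gcc`, `mvn`, `java`, and so on) are assumptions about how the prerequisites are installed on your system.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with PXF): print each named tool that
# is NOT found on PATH, one per line. Always returns 0.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
    return 0
}

# Executable names assumed for the prerequisites listed above;
# mvn is only needed for running integration tests.
missing=$(missing_tools gcc make unzip mvn java go ginkgo)
if [ -n "$missing" ]; then
    echo "Missing build prerequisites:"
    echo "$missing"
else
    echo "All build prerequisites found on PATH."
fi
```

The check only looks for executables; it does not verify versions, and it does not confirm that the Cloudberry environment script has been sourced.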
PXF uses Makefiles to build its components. The PXF server component uses Gradle, which is wrapped in the Makefile for convenience.

    cd cloudberry-pxf/

    # Compile PXF
    make
To install PXF, first make sure that the user has sufficient permissions in the `$GPHOME` and `$PXF_HOME` directories to perform the installation. It's recommended to change ownership to match the installing user. For example, when installing PXF as user `gpadmin` under `/usr/local/cloudberry-db`:

    mkdir -p /usr/local/cloudberry-pxf
    export PXF_HOME=/usr/local/cloudberry-pxf
    export PXF_BASE=${HOME}/pxf-base
    chown -R gpadmin:gpadmin "${PXF_HOME}"
    make install
NOTE: if `PXF_BASE` is not set, it will default to `PXF_HOME`, and server configurations, libraries or other configurations might get deleted after a PXF re-install.
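To make the risk in this note visible before running an install, you could add a small guard to your shell setup. This is a sketch only: `check_pxf_base` is a hypothetical helper, not a PXF command, and it mirrors the documented default by treating an unset `PXF_BASE` as equal to `PXF_HOME`.

```shell
#!/bin/sh
# Hypothetical helper (not part of PXF): report whether PXF_BASE is kept
# separate from PXF_HOME.
#   $1 = PXF_HOME, $2 = PXF_BASE (may be empty or omitted)
check_pxf_base() {
    home_dir=$1
    base_dir=${2:-$1}   # unset/empty PXF_BASE falls back to PXF_HOME, as documented
    if [ "$base_dir" = "$home_dir" ]; then
        echo "warning: PXF_BASE resolves to PXF_HOME ($home_dir); a re-install may delete your configuration"
    else
        echo "ok: PXF_BASE ($base_dir) is separate from PXF_HOME ($home_dir)"
    fi
}

# The fallback path is the example install location used in this README.
check_pxf_base "${PXF_HOME:-/usr/local/cloudberry-pxf}" "${PXF_BASE:-}"
```

The helper only prints a message; it does not change either variable.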
Ensure that PXF is in your path. This command can be added to your `.bashrc`:

    export PATH=/usr/local/cloudberry-pxf/bin:$PATH
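If your `.bashrc` is sourced more than once (for example in nested shells), the plain `export` adds a duplicate entry each time. One possible variant, shown here as a sketch, prepends the directory only when it is not already present. The `prepend_path` function name is an assumption made for illustration.

```shell
#!/bin/sh
# Hypothetical helper: prepend a directory to PATH only if it is not
# already one of its entries.
prepend_path() {
    case ":$PATH:" in
        *":$1:"*) ;;              # already present, leave PATH unchanged
        *) PATH="$1:$PATH" ;;
    esac
}

prepend_path /usr/local/cloudberry-pxf/bin
export PATH
```

Calling `prepend_path` again with the same directory leaves `PATH` unchanged.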
Then you can prepare and start up PXF by doing the following.

    pxf prepare
    pxf start
If `${HOME}/pxf-base` does not exist, `pxf prepare` will create the directory for you. This command should only need to be run once.
Note: Local development with PXF requires a running Cloudberry cluster.
Once the desired changes have been made, there are 2 options to re-install PXF:
- Run `make -sj4 install` to re-install and run tests
- Run `make -sj4 install-server` to only re-install the PXF server without running unit tests.
After PXF has been re-installed, you can restart the PXF instance using:

    pxf restart
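During iterative development, the re-install and restart steps are usually run together. The wrapper below is one way to combine them. It is a sketch, not part of the repository: `pxf_reinstall` and its `--dry-run` argument are hypothetical names, while the underlying `make` targets and `pxf restart` are the commands described above.

```shell
#!/bin/sh
# Hypothetical wrapper (not part of PXF) around the two re-install options.
#   $1 = full | server
#   $2 = --dry-run (optional): print the commands instead of running them
pxf_reinstall() {
    case "$1" in
        full)   target=install ;;          # re-install and run tests
        server) target=install-server ;;   # re-install the server only, no unit tests
        *) echo "usage: pxf_reinstall full|server [--dry-run]" >&2; return 2 ;;
    esac
    if [ "${2:-}" = "--dry-run" ]; then
        echo "make -sj4 $target"
        echo "pxf restart"
    else
        # Restart only if the re-install succeeded.
        make -sj4 "$target" && pxf restart
    fi
}

# Preview the server-only workflow without touching the installation.
pxf_reinstall server --dry-run
```

Without `--dry-run`, the function must be called from the root of the cloned repository, since that is where the Makefile lives.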
Note: Since the Docker container will house single-cluster Hadoop, Cloudberry and PXF, we recommend that you have at least 4 CPUs and 6 GB of memory allocated to Docker. These settings are available under Docker preferences.
We provide a Docker-based development environment that includes Cloudberry, Hadoop, and PXF. See `automation/README.Docker.md` for detailed instructions.
- Start IntelliJ. Click "Open" and select the directory to which you cloned the `pxf` repo.
- Select `File > Project Structure`.
- Make sure you have a JDK (version 1.8) selected.
- In the `Project Settings > Modules` section, select `Import Module`, pick the `pxf/server` directory and import as a Gradle module. You may see an error saying that there's `no JDK set for Gradle`. Just cancel and retry. It goes away the second time.
- Import a second module, giving the `pxf/automation` directory, select "Import module from external model", pick `Maven` then click Finish.
- Restart IntelliJ
- Check that it worked by running a unit test (cannot currently run automation tests from IntelliJ) and making sure that imports, variables, and auto-completion function in the two modules.
- Optionally you can replace `${PXF_TMP_DIR}` with `${GPHOME}/pxf/tmp` in `automation/pom.xml`
- Select `Tools > Create Command-line Launcher...` to enable starting IntelliJ with the `idea` command, e.g. `cd ~/workspace/pxf && idea .`.
- In IntelliJ, click `Edit Configuration` and add a new one of type `Remote`
- Change the name to `PXF Service Boot`
- Change the port number to `2020`
- Save the configuration
- Restart PXF in DEBUG Mode: `PXF_DEBUG=true pxf restart`
- Debug the new configuration in IntelliJ
- Run a query in CloudberryDB that uses PXF to debug with IntelliJ
See the CONTRIBUTING file for how to contribute to PXF for Cloudberry Database.
Licensed under the Apache License V2.0. See the LICENSE file for details.