jeff303/spark-development-tips

Tips for developing Apache Spark, especially in IntelliJ IDEA
First, follow the documented setup instructions for IntelliJ IDEA.
Next, ensure you have a valid Python version installed on your machine, and set up an SDK pointing to it. One that is managed by pyenv will work fine.
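For the pyenv route, a minimal command sketch might look like the following (the version number is only an example; the last command prints the interpreter path you would point the IntelliJ SDK at):

```shell
# Install and select a Python version for this project (version is an example).
pyenv install 3.9.18
pyenv local 3.9.18

# Print the full path to the selected interpreter; use this path when
# creating the IntelliJ Python SDK (and later for PYSPARK_PYTHON).
pyenv which python
```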
Next, add the python directory as a module, using File/New/Module from Existing Sources.... Then, in the settings for the new module, associate it with the Python SDK just created above.
Now, do a full project build to ensure there are no errors.
With all of the above done, you should be able to debug the tests under python/pyspark/tests, after a few environment variables are set up. The easiest way to do this is to create a new debug configuration for one of the Python tests (which will probably fail initially), then edit it. You will need to set the following:
- Working directory: /path/to/source/spark
- Environment variables:
  - PYSPARK_PYTHON=/path/to/your/python (the same one the SDK points to)
  - PYSPARK_DRIVER_PYTHON=$PYSPARK_PYTHON
At this point, you should be able to re-run the debug configuration and hit breakpoints.
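The same setup can be sketched as shell exports, which is handy for running a test outside the IDE. This assumes the driver-side variable is PYSPARK_DRIVER_PYTHON (Spark's documented name for it), and the paths are placeholders for your own checkout and interpreter:

```shell
# Placeholder paths -- substitute your own checkout and interpreter.
export PYSPARK_PYTHON=/path/to/your/python      # same interpreter the SDK points to
export PYSPARK_DRIVER_PYTHON="$PYSPARK_PYTHON"  # driver uses the same Python

# Run from the root of the Spark checkout, mirroring the Working directory
# setting in the debug configuration.
cd /path/to/source/spark
```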