
Commit 7f77c3e

jprakash-db authored and jackyhu-db committed
PySQL Connector split into connector and sqlalchemy (#444)
* Modified the gitignore file to not have .idea file
* [PECO-1803] Splitting the PySql connector into the core and the non-core part (#417)
* Implemented ColumnQueue to test the fetchall without pyarrow
* Removed token
* removed token
* order of fields in row corrected
* Changed the folder structure and tested the basic setup to work
* Refactored the code to make the connector work
* Basic setup of connector, core and sqlalchemy is working
* Basic integration of core, connect and sqlalchemy is working
* Setup working dynamic change from ColumnQueue to ArrowQueue
* Refactored the test code and moved to respective folders
* Added the unit test for column_queue
* Fixed __version__
* Fix
* venv_main added to git ignore
* Added code for merging columnar table
* Merging code for columnar
* Fixed the retry_close session test issue with logging
* Fixed the databricks_sqlalchemy tests and introduced pytest.ini for the sqla_testing
* Added pyarrow_test mark on pytest
* Fixed databricks.sqlalchemy to databricks_sqlalchemy imports
* Added poetry.lock
* Added dist folder
* Changed the pyproject.toml
* Minor fix
* Added the pyarrow skip tag on unit tests and tested their working
* Fixed the Decimal and timestamp conversion issue in the non-arrow pipeline
* Removed files that were not required and reformatted
* Fixed test_retry error
* Changed the folder structure to src/databricks
* Moved the columnar non-arrow flow to another PR
* Moved the README to the root
* removed columnQueue instance
* Removed databricks_sqlalchemy dependency in core
* Changed the pysql_supports_arrow predicate, introduced changes in the pyproject.toml
* Ran the black formatter with the original version
* Extra .py removed from all the __init__.py file names
* Undo formatting check
* Check (repeated ×14)
* BIG UPDATE
* Refactor code
* Refactor
* Fixed versioning
* Minor refactoring (×2)
* Changed the folder structure such that sqlalchemy has no reference here
* Fixed README.md and CONTRIBUTING.md
* Added manual publish
* On push trigger added
* Manually setting the publish step
* Changed versioning in pyproject.toml
* Bumped up the version to 4.0.0.b3 and also changed the structure to have pyarrow as optional
* Removed the sqlalchemy tests from the integration.yml file
* [PECO-1803] Print warning message if pyarrow is not installed (#468)
  Signed-off-by: Jacky Hu <jacky.hu@databricks.com>
* [PECO-1803] Remove sqlalchemy and update README.md (#469)
  Signed-off-by: Jacky Hu <jacky.hu@databricks.com>
* Removed all sqlalchemy related stuff
* generated the lock file
* Fixed failing tests
* removed poetry.lock
* Updated the lock file
* Fixed poetry numpy 2.2.2 issue
* Workflow fixes

---------

Signed-off-by: Jacky Hu <jacky.hu@databricks.com>
Co-authored-by: Jacky Hu <jacky.hu@databricks.com>
Signed-off-by: varun-edachali-dbx <varun.edachali@databricks.com>
1 parent f9d6ef1 · commit 7f77c3e


41 files changed · +467 −4911 lines changed

.github/workflows/code-quality-checks.yml

Lines changed: 51 additions & 0 deletions
```diff
@@ -58,6 +58,57 @@ jobs:
       #----------------------------------------------
       - name: Run tests
         run: poetry run python -m pytest tests/unit
+  run-unit-tests-with-arrow:
+    runs-on: ubuntu-latest
+    strategy:
+      matrix:
+        python-version: [ 3.8, 3.9, "3.10", "3.11" ]
+    steps:
+      #----------------------------------------------
+      # check-out repo and set-up python
+      #----------------------------------------------
+      - name: Check out repository
+        uses: actions/checkout@v2
+      - name: Set up python ${{ matrix.python-version }}
+        id: setup-python
+        uses: actions/setup-python@v2
+        with:
+          python-version: ${{ matrix.python-version }}
+      #----------------------------------------------
+      # ----- install & configure poetry -----
+      #----------------------------------------------
+      - name: Install Poetry
+        uses: snok/install-poetry@v1
+        with:
+          virtualenvs-create: true
+          virtualenvs-in-project: true
+          installer-parallel: true
+
+      #----------------------------------------------
+      # load cached venv if cache exists
+      #----------------------------------------------
+      - name: Load cached venv
+        id: cached-poetry-dependencies
+        uses: actions/cache@v2
+        with:
+          path: .venv-pyarrow
+          key: venv-pyarrow-${{ runner.os }}-${{ steps.setup-python.outputs.python-version }}-${{ github.event.repository.name }}-${{ hashFiles('**/poetry.lock') }}
+      #----------------------------------------------
+      # install dependencies if cache does not exist
+      #----------------------------------------------
+      - name: Install dependencies
+        if: steps.cached-poetry-dependencies.outputs.cache-hit != 'true'
+        run: poetry install --no-interaction --no-root
+      #----------------------------------------------
+      # install your root project, if required
+      #----------------------------------------------
+      - name: Install library
+        run: poetry install --no-interaction --all-extras
+      #----------------------------------------------
+      # run test suite
+      #----------------------------------------------
+      - name: Run tests
+        run: poetry run python -m pytest tests/unit
   check-linting:
     runs-on: ubuntu-latest
     strategy:
```

.github/workflows/integration.yml

Lines changed: 0 additions & 2 deletions
```diff
@@ -55,5 +55,3 @@ jobs:
       #----------------------------------------------
       - name: Run e2e tests
         run: poetry run python -m pytest tests/e2e
-      - name: Run SQL Alchemy tests
-        run: poetry run python -m pytest src/databricks/sqlalchemy/test_local
```
Lines changed: 78 additions & 0 deletions
```diff
@@ -0,0 +1,78 @@
+name: Publish to PyPI Manual [Production]
+
+# Allow manual triggering of the workflow
+on:
+  workflow_dispatch: {}
+
+jobs:
+  publish:
+    name: Publish
+    runs-on: ubuntu-latest
+
+    steps:
+      #----------------------------------------------
+      # Step 1: Check out the repository code
+      #----------------------------------------------
+      - name: Check out repository
+        uses: actions/checkout@v2  # Check out the repository to access the code
+
+      #----------------------------------------------
+      # Step 2: Set up Python environment
+      #----------------------------------------------
+      - name: Set up python
+        id: setup-python
+        uses: actions/setup-python@v2
+        with:
+          python-version: 3.9  # Specify the Python version to be used
+
+      #----------------------------------------------
+      # Step 3: Install and configure Poetry
+      #----------------------------------------------
+      - name: Install Poetry
+        uses: snok/install-poetry@v1  # Install Poetry, the Python package manager
+        with:
+          virtualenvs-create: true
+          virtualenvs-in-project: true
+          installer-parallel: true
+
+      # #----------------------------------------------
+      # # Step 4: Load cached virtual environment (if available)
+      # #----------------------------------------------
+      # - name: Load cached venv
+      #   id: cached-poetry-dependencies
+      #   uses: actions/cache@v2
+      #   with:
+      #     path: .venv  # Path to the virtual environment
+      #     key: venv-${{ runner.os }}-${{ steps.setup-python.outputs.python-version }}-${{ github.event.repository.name }}-${{ hashFiles('**/poetry.lock') }}
+      #     # Cache key is generated based on OS, Python version, repo name, and the `poetry.lock` file hash
+
+      # #----------------------------------------------
+      # # Step 5: Install dependencies if the cache is not found
+      # #----------------------------------------------
+      # - name: Install dependencies
+      #   if: steps.cached-poetry-dependencies.outputs.cache-hit != 'true'  # Only run if the cache was not hit
+      #   run: poetry install --no-interaction --no-root  # Install dependencies without interaction
+
+      # #----------------------------------------------
+      # # Step 6: Update the version to the manually provided version
+      # #----------------------------------------------
+      # - name: Update pyproject.toml with the specified version
+      #   run: poetry version ${{ github.event.inputs.version }}  # Use the version provided by the user input
+
+      #----------------------------------------------
+      # Step 7: Build and publish the first package to PyPI
+      #----------------------------------------------
+      - name: Build and publish databricks sql connector to PyPI
+        working-directory: ./databricks_sql_connector
+        run: |
+          poetry build
+          poetry publish -u __token__ -p ${{ secrets.PROD_PYPI_TOKEN }}  # Publish with PyPI token
+      #----------------------------------------------
+      # Step 7: Build and publish the second package to PyPI
+      #----------------------------------------------
+
+      - name: Build and publish databricks sql connector core to PyPI
+        working-directory: ./databricks_sql_connector_core
+        run: |
+          poetry build
+          poetry publish -u __token__ -p ${{ secrets.PROD_PYPI_TOKEN }}  # Publish with PyPI token
```

.gitignore

Lines changed: 1 addition & 1 deletion
```diff
@@ -195,7 +195,7 @@ cython_debug/
 #  be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
 #  and can be added to the global gitignore or merged into this file. For a more nuclear
 #  option (not recommended) you can uncomment the following to ignore the entire idea folder.
-#.idea/
+.idea/

 # End of https://www.toptal.com/developers/gitignore/api/python,macos
```

CHANGELOG.md

Lines changed: 5 additions & 0 deletions
```diff
@@ -1,5 +1,10 @@
 # Release History

+# 4.0.0 (TBD)
+
+- Split the connector into two separate packages: `databricks-sql-connector` and `databricks-sqlalchemy`. The `databricks-sql-connector` package contains the core functionality of the connector, while the `databricks-sqlalchemy` package contains the SQLAlchemy dialect for the connector.
+- The pyarrow dependency is now optional in `databricks-sql-connector`. Users who need Arrow must install pyarrow explicitly.
+
 # 3.7.0 (2024-12-23)

 - Fix: Incorrect number of rows fetched in inline results when fetching results with FETCH_NEXT orientation (databricks/databricks-sql-python#479 by @jprakash-db)
```
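Since pyarrow becomes optional in 4.0.0 and the commit list includes "[PECO-1803] Print warning message if pyarrow is not installed (#468)", a minimal sketch of such an import guard is shown below. The function name and warning text are illustrative, not the connector's actual code:

```python
import importlib
import warnings


def try_import_pyarrow():
    """Return the pyarrow module if installed, else warn and return None.

    Hypothetical sketch; the connector's real guard may differ.
    """
    try:
        return importlib.import_module("pyarrow")
    except ImportError:
        warnings.warn(
            "pyarrow is not installed; Arrow-based fetch APIs are unavailable. "
            "Install it with: pip install databricks-sql-connector[pyarrow]"
        )
        return None


pa = try_import_pyarrow()
```

Callers can then branch on `pa is None` instead of importing pyarrow unconditionally at module load.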

CONTRIBUTING.md

Lines changed: 0 additions & 3 deletions
```diff
@@ -144,9 +144,6 @@ The `PySQLStagingIngestionTestSuite` namespace requires a cluster running DBR ve
 The suites marked `[not documented]` require additional configuration which will be documented at a later time.

-#### SQLAlchemy dialect tests
-
-See README.tests.md for details.

 ### Code formatting
```

README.md

Lines changed: 20 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -3,9 +3,9 @@
33
[![PyPI](https://img.shields.io/pypi/v/databricks-sql-connector?style=flat-square)](https://pypi.org/project/databricks-sql-connector/)
44
[![Downloads](https://pepy.tech/badge/databricks-sql-connector)](https://pepy.tech/project/databricks-sql-connector)
55

6-
The Databricks SQL Connector for Python allows you to develop Python applications that connect to Databricks clusters and SQL warehouses. It is a Thrift-based client with no dependencies on ODBC or JDBC. It conforms to the[Python DB API 2.0 specification](https://www.python.org/dev/peps/pep-0249/) and exposes a[SQLAlchemy](https://www.sqlalchemy.org/) dialect for use with tools like`pandas` and`alembic` which use SQLAlchemy to execute DDL. Use`pip install databricks-sql-connector[sqlalchemy]` to install with SQLAlchemy's dependencies.`pip install databricks-sql-connector[alembic]` will install alembic's dependencies.
6+
The Databricks SQL Connector for Python allows you to develop Python applications that connect to Databricks clusters and SQL warehouses. It is a Thrift-based client with no dependencies on ODBC or JDBC. It conforms to the[Python DB API 2.0 specification](https://www.python.org/dev/peps/pep-0249/).
77

8-
This connector uses Arrow as the data-exchange format, and supports APIs to directly fetch Arrow tables. Arrow tables are wrapped in the`ArrowQueue` class to provide a natural API to get several rows at a time.
8+
This connector uses Arrow as the data-exchange format, and supports APIs(e.g.`fetchmany_arrow`)to directly fetch Arrow tables. Arrow tables are wrapped in the`ArrowQueue` class to provide a natural API to get several rows at a time.[PyArrow](https://arrow.apache.org/docs/python/index.html) is required to enable this and use these APIs, you can install it via`pip install pyarrow` or`pip install databricks-sql-connector[pyarrow]`.
99

1010
You are welcome to file an issue here for general use cases. You can also contact Databricks Support[here](help.databricks.com).
1111

@@ -22,7 +22,12 @@ For the latest documentation, see
2222

2323
##Quickstart
2424

25-
Install the library with`pip install databricks-sql-connector`
25+
###Installing the core library
26+
Install using`pip install databricks-sql-connector`
27+
28+
###Installing the core library with PyArrow
29+
Install using`pip install databricks-sql-connector[pyarrow]`
30+
2631

2732
```bash
2833
export DATABRICKS_HOST=********.databricks.com
@@ -60,6 +65,18 @@ or to a Databricks Runtime interactive cluster (e.g. /sql/protocolv1/o/123456789
6065
>to authenticate the target Databricks user account and needs to open the browser for authentication. So it
6166
>can only run on the user's machine.
6267
68+
##SQLAlchemy
69+
Starting from`databricks-sql-connector` version 4.0.0 SQLAlchemy support has been extracted to a new library`databricks-sqlalchemy`.
70+
71+
- Github repository[databricks-sqlalchemy github](https://github.com/databricks/databricks-sqlalchemy)
72+
- PyPI[databricks-sqlalchemy pypi](https://pypi.org/project/databricks-sqlalchemy/)
73+
74+
###Quick SQLAlchemy guide
75+
Users can now choose between using the SQLAlchemy v1 or SQLAlchemy v2 dialects with the connector core
76+
77+
- Install the latest SQLAlchemy v1 using`pip install databricks-sqlalchemy~=1.0`
78+
- Install SQLAlchemy v2 using`pip install databricks-sqlalchemy`
79+
6380

6481
##Contributing
6582

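The README changes above stress that the connector conforms to the DB API 2.0 specification (PEP 249), so the call pattern matches any compliant driver. The sketch below uses the standard-library `sqlite3` module purely to illustrate that shared shape; with the real connector you would obtain `conn` from `databricks.sql.connect(...)` instead:

```python
import sqlite3

# sqlite3 stands in for any PEP 249-compliant driver, including the
# Databricks connector; only the connect() call would differ.
conn = sqlite3.connect(":memory:")
cursor = conn.cursor()
cursor.execute("SELECT 1 + 1 AS two")
rows = cursor.fetchall()
cursor.close()
conn.close()
print(rows)  # [(2,)]
```

The same `cursor.execute(...)` / `fetchall()` / `fetchmany()` surface is what the Arrow-specific helpers such as `fetchmany_arrow` extend when pyarrow is installed.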
examples/sqlalchemy.py

Lines changed: 0 additions & 174 deletions
This file was deleted.
