diff --git a/.github/workflows/code-quality-checks.yml b/.github/workflows/code-quality-checks.yml
      #----------------------------------------------
      - name: Run tests
        run: poetry run python -m pytest tests/unit
  run-unit-tests-with-arrow:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: [ 3.8, 3.9, "3.10", "3.11" ]
    steps:
      #----------------------------------------------
      #       check-out repo and set-up python
      #----------------------------------------------
      -   name: Check out repository
          uses: actions/checkout@v2
      -   name: Set up python ${{ matrix.python-version }}
          id: setup-python
          uses: actions/setup-python@v2
          with:
            python-version: ${{ matrix.python-version }}
      #----------------------------------------------
      #  -----  install & configure poetry  -----
      #----------------------------------------------
      -   name: Install Poetry
          uses: snok/install-poetry@v1
          with:
            virtualenvs-create: true
            virtualenvs-in-project: true
            installer-parallel: true

      #----------------------------------------------
      #       load cached venv if cache exists
      #----------------------------------------------
      -   name: Load cached venv
          id: cached-poetry-dependencies
          uses: actions/cache@v2
          with:
            path: .venv-pyarrow
            key: venv-pyarrow-${{ runner.os }}-${{ steps.setup-python.outputs.python-version }}-${{ github.event.repository.name }}-${{ hashFiles('**/poetry.lock') }}
      #----------------------------------------------
      # install dependencies if cache does not exist
      #----------------------------------------------
      -   name: Install dependencies
          if: steps.cached-poetry-dependencies.outputs.cache-hit != 'true'
          run: poetry install --no-interaction --no-root
      #----------------------------------------------
      # install your root project, if required
      #----------------------------------------------
      -   name: Install library
          run: poetry install --no-interaction --all-extras
      #----------------------------------------------
      #              run test suite
      #----------------------------------------------
      -   name: Run tests
          run: poetry run python -m pytest tests/unit
  check-linting:
    runs-on: ubuntu-latest
    strategy:
diff --git a/.github/workflows/integration.yml b/.github/workflows/integration.yml
      #----------------------------------------------
      - name: Run e2e tests
        run: poetry run python -m pytest tests/e2e
      - name: Run SQL Alchemy tests
        run: poetry run python -m pytest src/databricks/sqlalchemy/test_local
diff --git a/.github/workflows/publish-manual.yml b/.github/workflows/publish-manual.yml
 name: Publish to PyPI Manual [Production]

 # Allow manual triggering of the workflow
 on:
  workflow_dispatch: {}

 jobs:
  publish:
    name: Publish
    runs-on: ubuntu-latest

    steps:
      #----------------------------------------------
      # Step 1: Check out the repository code
      #----------------------------------------------
      - name: Check out repository
        uses: actions/checkout@v2  # Check out the repository to access the code

      #----------------------------------------------
      # Step 2: Set up Python environment
      #----------------------------------------------
      - name: Set up python
        id: setup-python
        uses: actions/setup-python@v2
        with:
          python-version: 3.9  # Specify the Python version to be used

      #----------------------------------------------
      # Step 3: Install and configure Poetry
      #----------------------------------------------
      - name: Install Poetry
        uses: snok/install-poetry@v1  # Install Poetry, the Python package manager
        with:
          virtualenvs-create: true
          virtualenvs-in-project: true
          installer-parallel: true

 #      #----------------------------------------------
 #      # Step 4: Load cached virtual environment (if available)
 #      #----------------------------------------------
 #      - name: Load cached venv
 #        id: cached-poetry-dependencies
 #        uses: actions/cache@v2
 #        with:
 #          path: .venv  # Path to the virtual environment
 #          key: venv-${{ runner.os }}-${{ steps.setup-python.outputs.python-version }}-${{ github.event.repository.name }}-${{ hashFiles('**/poetry.lock') }}
 #          # Cache key is generated based on OS, Python version, repo name, and the `poetry.lock` file hash

 #      #----------------------------------------------
 #      # Step 5: Install dependencies if the cache is not found
 #      #----------------------------------------------
 #      - name: Install dependencies
 #        if: steps.cached-poetry-dependencies.outputs.cache-hit != 'true'  # Only run if the cache was not hit
 #        run: poetry install --no-interaction --no-root  # Install dependencies without interaction

 #      #----------------------------------------------
 #      # Step 6: Update the version to the manually provided version
 #      #----------------------------------------------
 #      - name: Update pyproject.toml with the specified version
 #        run: poetry version ${{ github.event.inputs.version }}  # Use the version provided by the user input

      #----------------------------------------------
      # Step 7: Build and publish the first package to PyPI
      #----------------------------------------------
      - name: Build and publish databricks sql connector to PyPI
        working-directory: ./databricks_sql_connector
        run: |
          poetry build
          poetry publish -u __token__ -p ${{ secrets.PROD_PYPI_TOKEN }}  # Publish with PyPI token
      #----------------------------------------------
      # Step 8: Build and publish the second package to PyPI
      #----------------------------------------------

      - name: Build and publish databricks sql connector core to PyPI
        working-directory: ./databricks_sql_connector_core
        run: |
          poetry build
          poetry publish -u __token__ -p ${{ secrets.PROD_PYPI_TOKEN }}  # Publish with PyPI token
diff --git a/.gitignore b/.gitignore
 #  be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
 #  and can be added to the global gitignore or merged into this file.  For a more nuclear
 #  option (not recommended) you can uncomment the following to ignore the entire idea folder.
 #.idea/
 .idea/

 # End of https://www.toptal.com/developers/gitignore/api/python,macos

diff --git a/CHANGELOG.md b/CHANGELOG.md
 # Release History

 # 4.0.0 (TBD)

 - Split the connector into two separate packages: `databricks-sql-connector` and `databricks-sqlalchemy`. The `databricks-sql-connector` package contains the core functionality of the connector, while the `databricks-sqlalchemy` package contains the SQLAlchemy dialect for the connector.
 - PyArrow is now an optional dependency of `databricks-sql-connector`. Users who need Arrow support must install `pyarrow` explicitly.

 # 3.7.0 (2024-12-23)

 - Fix: Incorrect number of rows fetched in inline results when fetching results with FETCH_NEXT orientation (databricks/databricks-sql-python#479 by @jprakash-db)
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md

 The suites marked `[not documented]` require additional configuration which will be documented at a later time.

 #### SQLAlchemy dialect tests

 See README.tests.md for details.

 ### Code formatting

diff --git a/README.md b/README.md
 [![PyPI](https://img.shields.io/pypi/v/databricks-sql-connector?style=flat-square)](https://pypi.org/project/databricks-sql-connector/)
 [![Downloads](https://pepy.tech/badge/databricks-sql-connector)](https://pepy.tech/project/databricks-sql-connector)

 The Databricks SQL Connector for Python allows you to develop Python applications that connect to Databricks clusters and SQL warehouses. It is a Thrift-based client with no dependencies on ODBC or JDBC. It conforms to the [Python DB API 2.0 specification](https://www.python.org/dev/peps/pep-0249/) and exposes a [SQLAlchemy](https://www.sqlalchemy.org/) dialect for use with tools like `pandas` and `alembic` which use SQLAlchemy to execute DDL. Use `pip install databricks-sql-connector[sqlalchemy]` to install with SQLAlchemy's dependencies. `pip install databricks-sql-connector[alembic]` will install alembic's dependencies.
 The Databricks SQL Connector for Python allows you to develop Python applications that connect to Databricks clusters and SQL warehouses. It is a Thrift-based client with no dependencies on ODBC or JDBC. It conforms to the [Python DB API 2.0 specification](https://www.python.org/dev/peps/pep-0249/).

 This connector uses Arrow as the data-exchange format, and supports APIs to directly fetch Arrow tables. Arrow tables are wrapped in the `ArrowQueue` class to provide a natural API to get several rows at a time.
 This connector uses Arrow as the data-exchange format and supports APIs (e.g. `fetchmany_arrow`) to directly fetch Arrow tables. Arrow tables are wrapped in the `ArrowQueue` class to provide a natural API to get several rows at a time. [PyArrow](https://arrow.apache.org/docs/python/index.html) is required to use these APIs; you can install it via `pip install pyarrow` or `pip install databricks-sql-connector[pyarrow]`.
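
Because PyArrow becomes optional with this change, calling code may want to check for it before using the Arrow fetch APIs. The following is a minimal sketch, not part of the connector: `pyarrow_installed` and `fetch_batch` are hypothetical helper names, and `cursor` is assumed to be a DB API cursor that also offers `fetchmany_arrow` as described above.

```python
import importlib.util


def pyarrow_installed() -> bool:
    """Return True when the optional pyarrow package can be found."""
    return importlib.util.find_spec("pyarrow") is not None


def fetch_batch(cursor, size, use_arrow=None):
    """Fetch up to `size` rows, preferring the Arrow API when available.

    Hypothetical helper: `cursor` is assumed to expose both `fetchmany`
    (DB API 2.0) and `fetchmany_arrow` (named in the README text).
    """
    if use_arrow is None:
        # Default to Arrow only if pyarrow is actually installed.
        use_arrow = pyarrow_installed()
    if use_arrow:
        return cursor.fetchmany_arrow(size)
    return cursor.fetchmany(size)
```

Passing `use_arrow=False` forces the plain DB API path, which can be useful in environments where PyArrow is deliberately left out.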

 You are welcome to file an issue here for general use cases. You can also contact Databricks Support [here](https://help.databricks.com).


 ## Quickstart

 Install the library with `pip install databricks-sql-connector`
 ### Installing the core library
 Install using `pip install databricks-sql-connector`

 ### Installing the core library with PyArrow
 Install using `pip install databricks-sql-connector[pyarrow]`


 ```bash
 export DATABRICKS_HOST=********.databricks.com
 > to authenticate the target Databricks user account and needs to open the browser for authentication. So it
 > can only run on the user's machine.

 ## SQLAlchemy
 Starting with `databricks-sql-connector` version 4.0.0, SQLAlchemy support has been extracted into a new library, `databricks-sqlalchemy`.

 - GitHub repository [databricks-sqlalchemy github](https://github.com/databricks/databricks-sqlalchemy)
 - PyPI [databricks-sqlalchemy pypi](https://pypi.org/project/databricks-sqlalchemy/)

 ### Quick SQLAlchemy guide
 Users can now choose between the SQLAlchemy v1 and SQLAlchemy v2 dialects with the connector core:

 - For SQLAlchemy v1, install the dialect using `pip install databricks-sqlalchemy~=1.0`
 - For SQLAlchemy v2, install the dialect using `pip install databricks-sqlalchemy`
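
The choice between the two install commands above can be sketched as a small lookup. This is an illustration only: `dialect_requirement` and `installed_sqlalchemy_major` are hypothetical helper names, and the mapping assumes, as the guide states, that the `~=1.0` pin targets SQLAlchemy v1 while the unpinned package targets SQLAlchemy v2.

```python
from importlib import metadata


def dialect_requirement(sqlalchemy_major: int) -> str:
    """Return the pip requirement for the given SQLAlchemy major version."""
    if sqlalchemy_major == 1:
        return "databricks-sqlalchemy~=1.0"
    if sqlalchemy_major == 2:
        return "databricks-sqlalchemy"
    raise ValueError(f"unsupported SQLAlchemy major version: {sqlalchemy_major}")


def installed_sqlalchemy_major():
    """Return the installed SQLAlchemy major version, or None if absent."""
    try:
        return int(metadata.version("sqlalchemy").split(".")[0])
    except metadata.PackageNotFoundError:
        return None


major = installed_sqlalchemy_major()
if major is not None:
    print("pip install " + dialect_requirement(major))
```

When SQLAlchemy is not installed, the sketch prints nothing and the caller can pick either requirement directly.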


 ## Contributing

diff --git a/examples/sqlalchemy.py b/examples/sqlalchemy.py