Commit 3eaaac9

Author: Jesse

Add SQLAlchemy Dialect (#57)

Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com>

1 parent e3f3a6f · commit 3eaaac9

File tree

13 files changed: +1216 −87 lines

‎README.md‎

Lines changed: 17 additions & 6 deletions

@@ -3,7 +3,7 @@

[![PyPI](https://img.shields.io/pypi/v/databricks-sql-connector?style=flat-square)](https://pypi.org/project/databricks-sql-connector/)
[![Downloads](https://pepy.tech/badge/databricks-sql-connector)](https://pepy.tech/project/databricks-sql-connector)

The introductory paragraph now reads:

The Databricks SQL Connector for Python allows you to develop Python applications that connect to Databricks clusters and SQL warehouses. It is a Thrift-based client with no dependencies on ODBC or JDBC. It conforms to the [Python DB API 2.0 specification](https://www.python.org/dev/peps/pep-0249/) and exposes a [SQLAlchemy](https://www.sqlalchemy.org/) dialect for use with tools like `pandas` and `alembic`, which use SQLAlchemy to execute DDL.

This connector uses Arrow as the data-exchange format, and supports APIs to directly fetch Arrow tables. Arrow tables are wrapped in the `ArrowQueue` class to provide a natural API to get several rows at a time.

@@ -24,16 +24,27 @@

Install the library with `pip install databricks-sql-connector`

The usage example was replaced with:

Note: Don't hard-code authentication secrets into your Python code. Use environment variables:

```bash
export DATABRICKS_HOST=********.databricks.com
export DATABRICKS_HTTP_PATH=/sql/1.0/endpoints/****************
export DATABRICKS_TOKEN=dapi********************************
```

Example usage:

```python
import os
from databricks import sql

host = os.getenv("DATABRICKS_HOST")
http_path = os.getenv("DATABRICKS_HTTP_PATH")
access_token = os.getenv("DATABRICKS_TOKEN")

connection = sql.connect(
    server_hostname=host,
    http_path=http_path,
    access_token=access_token)

cursor = connection.cursor()
```
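The new SQLAlchemy dialect added by this commit consumes the same credentials via an engine URL rather than keyword arguments. A minimal sketch of assembling that URL, using hypothetical placeholder values in place of real credentials (the `main`/`default` catalog and schema names are illustrative assumptions, not values from this commit):

```python
# Hypothetical placeholder values; in practice these would come from
# environment variables, e.g. os.getenv("DATABRICKS_HOST").
host = "********.databricks.com"
http_path = "/sql/1.0/endpoints/****************"
access_token = "dapi********************************"
catalog = "main"      # assumed example catalog
schema = "default"    # assumed example schema

# The dialect's URL format: databricks://token:<token>@<host>?<options>
url = (
    f"databricks://token:{access_token}@{host}"
    f"?http_path={http_path}&catalog={catalog}&schema={schema}"
)

print(url)
```

This string is what `sqlalchemy.create_engine` receives in the `examples/sqlalchemy.py` script included below.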

‎examples/README.md‎

Lines changed: 2 additions & 1 deletion

@@ -36,4 +36,5 @@

- **`persistent_oauth.py`** shows a more advanced example of authenticating by OAuth while Bring Your Own IDP is in public preview. In this case, it shows how to use a subclass of `OAuthPersistence` to reuse an OAuth token across script executions.
- **`set_user_agent.py`** shows how to customize the user agent header used for Thrift commands. In this example the string `ExamplePartnerTag` will be added to the user agent on every request.
- **`staging_ingestion.py`** shows how the connector handles Databricks' experimental staging ingestion commands `GET`, `PUT`, and `REMOVE`.
- **`sqlalchemy.py`** (new) shows a basic example of connecting to Databricks with [SQLAlchemy](https://www.sqlalchemy.org/).

‎examples/sqlalchemy.py‎

Lines changed: 92 additions & 0 deletions

```python
"""
databricks-sql-connector includes a SQLAlchemy dialect compatible with Databricks SQL.
It aims to be a drop-in replacement for the crflynn/sqlalchemy-databricks project that implements
more of the Databricks API, particularly around table reflection, Alembic usage, and data
ingestion with pandas.

Because of the extent of SQLAlchemy's capabilities it isn't feasible to provide examples of every
usage in a single script, so we only provide a basic one here. More examples can be found in our
test suite at tests/e2e/sqlalchemy/test_basic.py and in the PR that implements this change:

https://github.com/databricks/databricks-sql-python/pull/57

# What's already supported

Most of the functionality is demonstrated in the e2e tests mentioned above. The list below is
derived from those test method names:

- Create and drop tables with SQLAlchemy Core
- Create and drop tables with SQLAlchemy ORM
- Read created tables via reflection
- Modify column nullability
- Insert records manually
- Insert records with pandas.to_sql (note that this does not work for DataFrames with indexes)

This connector also aims to support Alembic for programmatic delta table schema maintenance. This
behaviour is not yet backed by integration tests, which will follow in a subsequent PR as we learn
more about customer use cases there. That said, the following behaviours have been tested manually:

- Autogenerate revisions with `alembic revision --autogenerate`
- Upgrade and downgrade between revisions with `alembic upgrade <revision hash>` and
  `alembic downgrade <revision hash>`

# Known Gaps

- MAP, ARRAY, and STRUCT types: this dialect can read these types out as strings, but you cannot
  define a SQLAlchemy model with databricks.sqlalchemy.dialect.types.DatabricksMap (e.g.) because
  we haven't implemented them yet.
- Constraints: with the addition of information_schema to Unity Catalog, Databricks SQL supports
  foreign key and primary key constraints. This dialect can write these constraints, but the
  ability for alembic to reflect and modify them programmatically has not been tested.
"""

import os
from sqlalchemy.orm import declarative_base, Session
from sqlalchemy import Column, String, Integer, BOOLEAN, create_engine, select

host = os.getenv("DATABRICKS_SERVER_HOSTNAME")
http_path = os.getenv("DATABRICKS_HTTP_PATH")
access_token = os.getenv("DATABRICKS_TOKEN")
catalog = os.getenv("DATABRICKS_CATALOG")
schema = os.getenv("DATABRICKS_SCHEMA")


# Extra arguments are passed untouched to the driver.
# See thrift_backend.py for the complete list.
extra_connect_args = {
    "_tls_verify_hostname": True,
    "_user_agent_entry": "PySQL Example Script",
}

engine = create_engine(
    f"databricks://token:{access_token}@{host}?http_path={http_path}&catalog={catalog}&schema={schema}",
    connect_args=extra_connect_args,
)
session = Session(bind=engine)
base = declarative_base(bind=engine)


class SampleObject(base):

    __tablename__ = "mySampleTable"

    name = Column(String(255), primary_key=True)
    episodes = Column(Integer)
    some_bool = Column(BOOLEAN)


base.metadata.create_all()

sample_object_1 = SampleObject(name="Bim Adewunmi", episodes=6, some_bool=True)
sample_object_2 = SampleObject(name="Miki Meek", episodes=12, some_bool=False)

session.add(sample_object_1)
session.add(sample_object_2)

session.commit()

stmt = select(SampleObject).where(SampleObject.name.in_(["Bim Adewunmi", "Miki Meek"]))

output = [i for i in session.scalars(stmt)]
assert len(output) == 2

base.metadata.drop_all()
```
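The docstring above notes that `alembic revision --autogenerate` and `alembic upgrade`/`alembic downgrade` were tested manually against this dialect. A minimal sketch of the `alembic.ini` fragment such a setup might use, with the same URL format as the engine above; all values shown are hypothetical placeholders, and the `script_location` is an illustrative assumption, not part of this commit:

```ini
; alembic.ini (fragment) -- hypothetical values, for illustration only
[alembic]
script_location = alembic

; Same databricks:// URL format passed to create_engine in the example above
sqlalchemy.url = databricks://token:dapi********@********.databricks.com?http_path=/sql/1.0/endpoints/****&catalog=main&schema=default
```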

0 commit comments