Commit af1a49e

Stop skipping StringTest

Author: Jesse Whitehouse

I've found that test_dont_truncate_rightside test is flaky because sometimes
DBR returns the correct data but in the wrong order. It's supposed to check
that a query returns ["AB", "BC"] but sometimes it returns ["BC", "AB"]
which is the right data in the wrong order.

Python list comparison doesn't evaluate that ["AB", "BC"] == ["BC", "AB"].
We could reimplement the test using collections.Counter in stdlib or
check with DBR team about why this order is sometimes different.

Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com>
---
Also, we had to break with SQLAlchemy's advice to never implement the
TypeDecorator.literal_processor method because otherwise our strings
end up double-escaped and raise a syntax error.

test_suite.py::StringTest_databricks+databricks::test_concatenate_binary PASSED
test_suite.py::StringTest_databricks+databricks::test_concatenate_clauselist PASSED
test_suite.py::StringTest_databricks+databricks::test_dont_truncate_rightside[%B%-expected0] PASSED
test_suite.py::StringTest_databricks+databricks::test_dont_truncate_rightside[A%C%Z-expected2] PASSED
test_suite.py::StringTest_databricks+databricks::test_dont_truncate_rightside[A%C-expected1] PASSED
test_suite.py::StringTest_databricks+databricks::test_literal PASSED
test_suite.py::StringTest_databricks+databricks::test_literal_backslashes PASSED
test_suite.py::StringTest_databricks+databricks::test_literal_non_ascii PASSED
test_suite.py::StringTest_databricks+databricks::test_literal_quoting PASSED
test_suite.py::StringTest_databricks+databricks::test_nolength_string PASSED
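
The order-insensitive comparison proposed in the commit message could look roughly like the sketch below. It is not part of this commit: the helper name `same_rows_any_order` and the sample values are illustrative assumptions, and only `collections.Counter` comes from the standard library as the message suggests.

```python
from collections import Counter


def same_rows_any_order(actual, expected):
    """Return True when both sequences hold the same items with the same
    multiplicities, regardless of order.

    Hypothetical helper, not taken from the test suite.
    """
    return Counter(actual) == Counter(expected)


expected = ["AB", "BC"]
returned = ["BC", "AB"]  # right data, wrong order

# Plain list equality is order-sensitive, which is what makes the test flaky.
plain_equal = returned == expected

# Counter equality ignores order but still notices missing or duplicated rows.
unordered_equal = same_rows_any_order(returned, expected)
```

Comparing `Counter` objects rather than `set` objects keeps duplicate rows significant, so a result such as `["AB", "AB"]` would still fail against `["AB", "BC"]`.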
1 parent ab8e28f commit af1a49e

3 files changed: +63 -19 lines changed


‎src/databricks/sqlalchemy/__init__.py‎

Lines changed: 1 addition & 0 deletions
@@ -51,6 +51,7 @@ class DatabricksDialect(default.DefaultDialect):
     colspecs = {
         sqlalchemy.types.DateTime: dialect_type_impl.DatabricksDateTimeNoTimezoneType,
         sqlalchemy.types.Time: dialect_type_impl.DatabricksTimeType,
+        sqlalchemy.types.String: dialect_type_impl.DatabricksStringType
     }
 
     @classmethod

‎src/databricks/sqlalchemy/_types.py‎

Lines changed: 62 additions & 0 deletions
@@ -5,6 +5,9 @@
 
 from datetime import datetime
 
+
+from databricks.sql.utils import ParamEscaper
+
 @compiles(sqlalchemy.types.Enum, "databricks")
 @compiles(sqlalchemy.types.String, "databricks")
 @compiles(sqlalchemy.types.Text, "databricks")
@@ -132,3 +135,62 @@ def process_result_value(self, value: Union[None, str], dialect) -> Union[dateti
         if value is None:
             return None
         return datetime.strptime(value, "%H:%M:%S").time()
+
+class DatabricksStringType(sqlalchemy.types.TypeDecorator):
+    """We have to implement our own String() type because SQLAlchemy's default implementation
+    wants to escape single-quotes with a doubled single-quote. Databricks uses a backslash for
+    escaping of literal strings. And SQLAlchemy's default escaping breaks Databricks SQL.
+    """
+
+    impl = sqlalchemy.types.String
+    cache_ok = True
+    pe = ParamEscaper()
+
+    def process_literal_param(self, value, dialect) -> str:
+        """SQLAlchemy's default string escaping for backslashes doesn't work for databricks. The logic here
+        implements the same logic as our legacy inline escaping logic.
+        """
+
+        return self.pe.escape_string(value)
+
+    def literal_processor(self, dialect):
+        """We manually override this method to prevent further processing of the string literal beyond
+        what happens in the process_literal_param() method.
+
+        The SQLAlchemy docs _specifically_ say to not override this method.
+
+        It appears that any processing that happens from TypeEngine.process_literal_param happens _before_
+        and _in addition to_ whatever the class's impl.literal_processor() method does. The String.literal_processor()
+        method performs a string replacement that doubles any single-quote in the contained string. This raises a syntax
+        error in Databricks. And it's not necessary because ParamEscaper() already implements all the escaping we need.
+
+        We should consider opening an issue on the SQLAlchemy project to see if I'm using it wrong.
+
+        See type_api.py::TypeEngine.literal_processor:
+
+        ```python
+        def process(value: Any) -> str:
+            return fixed_impl_processor(
+                fixed_process_literal_param(value, dialect)
+            )
+        ```
+
+        That call to fixed_impl_processor wraps the result of fixed_process_literal_param (which is the
+        process_literal_param defined in our Databricks dialect)
+
+        https://docs.sqlalchemy.org/en/20/core/custom_types.html#sqlalchemy.types.TypeDecorator.literal_processor
+        """
+        def process(value):
+            """This is a copy of the default String.literal_processor() method but stripping away
+            its double-escaping behaviour for single-quotes.
+            """
+
+            _step1 = self.process_literal_param(value, dialect="databricks")
+            if dialect.identifier_preparer._double_percents:
+                _step2 = _step1.replace("%", "%%")
+            else:
+                _step2 = _step1
+
+            return "%s" % _step2
+
+        return process
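
The double-escaping problem described in the `literal_processor` docstring can be illustrated with a small standalone sketch. Both functions below are simplified stand-ins written for illustration, not the real `ParamEscaper.escape_string` or `String.literal_processor`. It is assumed here that the backslash escaper also wraps its result in single quotes, and that the default processor doubles single quotes and then wraps the value, as the docstring describes.

```python
def backslash_escape(value: str) -> str:
    """Hypothetical stand-in for a backslash-style escaper (assumption):
    escape backslashes and single quotes with a backslash, then wrap the
    result in single quotes."""
    return "'{}'".format(value.replace("\\", "\\\\").replace("'", "\\'"))


def quote_doubling_processor(value: str) -> str:
    """Simplified stand-in for the default behaviour described in the
    docstring: double every single quote, then wrap in single quotes."""
    return "'%s'" % value.replace("'", "''")


def chained(value: str) -> str:
    """Mimics the composition quoted from type_api.py: the impl processor
    runs on top of the result of process_literal_param."""
    return quote_doubling_processor(backslash_escape(value))


def overridden(value: str) -> str:
    """Mimics the override in this commit: only the backslash escaping
    is applied, with no further quote doubling or wrapping."""
    return backslash_escape(value)


sample = "that's"
double_escaped = chained(sample)   # quotes added and doubled a second time
single_escaped = overridden(sample)  # escaped exactly once
```

Under these assumptions, `chained` re-quotes a literal that is already quoted and escaped, which is the kind of output the docstring says raises a syntax error, while `overridden` emits the literal escaped once.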

‎src/databricks/sqlalchemy/test/test_suite.py‎

Lines changed: 0 additions & 19 deletions
@@ -236,25 +236,6 @@ def test_row_w_scalar_select(self):
         """
 
 
-class StringTest(StringTest):
-    @pytest.mark.skip(
-        reason="String implementation needs work. Quote escaping is inconsistent between read/write."
-    )
-    def test_literal_backslashes(self):
-        """
-        Exception:
-            AssertionError: assert 'backslash one backslash two\\ end' in ['backslash one\\ backslash two\\\\ end']
-        """
-
-    @pytest.mark.skip(
-        reason="String implementation needs work. Quote escaping is inconsistent between read/write."
-    )
-    def test_literal_quoting(self):
-        """
-        Exception:
-            assert 'some text hey "hi there" thats text' in ['some\'text\' hey "hi there" that\'s text']
-        """
-
 
 class TextTest(TextTest):
     """Fixing StringTest should fix these failures also."""
