Make sure to open an issue as a bug/issue before writing your code! That way we can discuss the change, evaluate designs, and agree on the general idea
Ensure the tests and linter pass
Code coverage does not decrease (if any source code was changed)
Appropriate docs were updated (if necessary)

Fixes #293
Fixes #314
Fixed #233
Fixes #37
🦕

feat: STRUCT and ARRAY support

52cee8c

product-auto-label (bot) added the `api: bigquery` label (Issues related to the googleapis/python-bigquery-sqlalchemy API.)

Aug 30, 2021

google-cla (bot) added the `cla: yes` label (This human has signed the Contributor License Agreement.)

Aug 30, 2021

Jim Fulton added 6 commits

August 30, 2021 17:24

Merge branch 'main' into struct

a0b02f7

Fixed test that expected JSON rather than STRUCT

6bacc0d

Merge branch 'struct' of github.com:jimfulton/python-bigquery-sqlalch…

1ec0f88

…emy into struct

Added system test I neglected to check in before :(

74aab64

blacken

c5653e2

Merge branch 'main' into struct

a7f0b41

jimfulton (Contributor, Author) commented Aug 31, 2021:

FTR, WRT superset, once I finally got it working :), it behaves the same with and without these changes.

BTW, we have logic that tries to unpack sub-structs, I think so that there would eventually be scalars for superset to work with.

If you have an array of structs, we still create columns for the fields of the struct in the array. This causes superset to error, because it has no way to get at structs in an array. We should probably not unpack structs in arrays.

Jim Fulton added 8 commits

August 31, 2021 16:17

Don't strip <ARRAY > from parameter types

9df1804

Otherwise, the BQ doesn't handle arrays of structs.

Added system tests to verify PR 67 and issue 233

0df1701

Merge branch 'struct' of github.com:jimfulton/python-bigquery-sqlalch…

7aad07f

…emy into struct

blacken

f10a571

Renamed test file to conform to samples test-file naming conventions

ec31040

Require google-cloud-bigquery 2.25.2 to get struct field-name undersc…

accf762

…ore fix

Added STRUCT documentation

ef5f891

fix bigquery version

cce9dbb

snippet-bot (bot) commented Sep 1, 2021 (edited):

Here is the summary of changes.

You are about to add 10 region tags.

docs/struct.rst:14, tag `bigquery_sqlalchemy_create_table_with_struct`
docs/struct.rst:34, tag `bigquery_sqlalchemy_insert_struct`
docs/struct.rst:42, tag `bigquery_sqlalchemy_query_struct`
docs/struct.rst:50, tag `bigquery_sqlalchemy_query_getitem`
docs/struct.rst:58, tag `bigquery_sqlalchemy_query_STRUCT`
samples/snippets/STRUCT.py:23, tag `bigquery_sqlalchemy_create_table_with_struct`
samples/snippets/STRUCT.py:44, tag `bigquery_sqlalchemy_insert_struct`
samples/snippets/STRUCT.py:74, tag `bigquery_sqlalchemy_query_struct`
samples/snippets/STRUCT.py:79, tag `bigquery_sqlalchemy_query_STRUCT`
samples/snippets/STRUCT.py:84, tag `bigquery_sqlalchemy_query_getitem`

This comment is generated by snippet-bot.
If you find problems with this result, please file an issue at:
https://github.com/googleapis/repo-automation-bots/issues.
To update this comment, add the `snippet-bot:force-run` label or use the checkbox below:

- [ ] Refresh this comment

Jim Fulton added 7 commits

September 1, 2021 10:37

Merge branch 'main' into struct

290d955

get blacken to leave sample code alone.

b697df6

I want it narrow to avoid horizontal scrolling

Check in missing file :(

6a278b9

Merge branch 'struct' of github.com:jimfulton/python-bigquery-sqlalch…

bc62a56

…emy into struct

need sqla 1.4 for unnest

84426bd

fixed typo

587a0f7

Merge branch 'main' into struct

e6f4adf

jimfulton marked this pull request as ready for review

September 2, 2021 13:30

jimfulton requested review from a team as code owners

September 2, 2021 13:30

jimfulton requested a review from tmatsuo

September 2, 2021 13:30

jimfulton (Contributor, Author) commented Sep 2, 2021 (edited):

Some notes for reviewers:

The heart of the change is `_struct.py`, which isn't large, but also isn't obvious. :( I cribbed from the built-in JSON and ARRAY types. When reviewing, it's probably helpful to look at those. The "Comparator" framework is confusing, in large part because the name doesn't make sense.
Having said that, the core logic is in `_setup_getitem`, which is a hook used by the base class of `STRUCT`'s comparator.
The `__getattr__` method just delegates to (the inherited) `__getitem__`.
This PR also has 2 other small changes:
- Machinery for mapping BQ types to SQLAlchemy types has been factored into a separate `_types` module, both to avoid cluttering `base` more and to partially avoid circular imports. (There's still a circular import issue that isn't fixable without a bigger refactoring that I deemed unwarranted.)
- Implementation of ARRAY indexing, which was previously missing. A number of my tests (see `test__struct.py` in both unit and system tests) used an example that has nested structs and arrays.
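For readers unfamiliar with the pattern described in these notes, here is a minimal sketch of attribute access delegating to item access, with integer indexing on arrays. It uses no SQLAlchemy at all: `Struct`, `Array`, and `Expr` are hypothetical toy classes invented for illustration, and the SQL strings they build are not claimed to match what the dialect actually emits.

```python
class Struct:
    """Toy STRUCT type: maps lower-cased field names to field types."""

    def __init__(self, **fields):
        self.by_name = {name.lower(): type_ for name, type_ in fields.items()}


class Array:
    """Toy ARRAY type wrapping a single item type."""

    def __init__(self, item_type):
        self.item_type = item_type


class Expr:
    """Toy column expression: a SQL fragment plus its type."""

    def __init__(self, sql, type_):
        self.sql = sql
        self.type = type_

    def __getitem__(self, key):
        if isinstance(self.type, Array):
            if not isinstance(key, int):
                raise TypeError(f"ARRAY indexes must be integers, not {key!r}.")
            return Expr(f"{self.sql}[OFFSET({key})]", self.type.item_type)
        if isinstance(self.type, Struct):
            if not isinstance(key, str):
                raise TypeError(f"STRUCT field names must be strings, not {key!r}.")
            subtype = self.type.by_name.get(key.lower())
            if subtype is None:
                raise KeyError(key)
            return Expr(f"{self.sql}.{key}", subtype)
        raise TypeError(f"{self.type!r} does not support item access.")

    def __getattr__(self, name):
        # Called only when normal lookup fails. Guard our own attributes and
        # dunder names so a half-built object cannot recurse forever.
        if name in ("sql", "type") or name.startswith("__"):
            raise AttributeError(name)
        try:
            return self[name]
        except (KeyError, TypeError):
            raise AttributeError(name) from None


person = Expr(
    "`t`.`person`",
    Struct(name="STRING", children=Array(Struct(name="STRING", bdate="DATE"))),
)
print(person.NAME.sql)
print(person.children[0].bdate.sql)
```

Field lookup is case-insensitive here because the toy `Struct` lower-cases names on both sides, mirroring the `name.lower()` calls visible in the reviewed code.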

Jim Fulton added 3 commits

September 2, 2021 13:31

Merge branch 'main' into struct

ffb5aa9

Merge branch 'main' into struct

47fa14f

Merge branch 'main' into struct

402bbbe

tswast self-requested a review

September 7, 2021 20:03

tswast requested changes

Sep 7, 2021

tswast (Collaborator) left a comment:

I haven't quite digested everything in this PR yet, but I figured I'd share the feedback I have so far.

docs/struct.rst (resolved)

setup.py (resolved)

sqlalchemy_bigquery/_struct.py (outdated, resolved)

sqlalchemy_bigquery/_struct.py (resolved)

sqlalchemy_bigquery/_struct.py (outdated, resolved)

jimfulton and others added 5 commits

September 7, 2021 14:54

Update sqlalchemy_bigquery/_struct.py

5bf07b4

Co-authored-by: Tim Swast <swast@google.com>

added STRUCT docstring

e937167

Add doc link

8661f5b

Merge branch 'struct' of github.com:jimfulton/python-bigquery-sqlalch…

b550aa1

…emy into struct

Added some comments

af68a54

tswast requested changes

Sep 8, 2021

tswast (Collaborator) left a comment:

I like it! Just a few things I think we should clarify before merging.

sqlalchemy_bigquery/_struct.py (outdated, resolved)

sqlalchemy_bigquery/_struct.py Outdated

Comment on lines 73 to 79

		global type_compiler

		try:
		    process = type_compiler.process
		except AttributeError:
		    type_compiler = base.dialect.type_compiler(base.dialect())
		    process = type_compiler.process

tswast (Collaborator), Sep 8, 2021:

Could we put this in a `_get_type_compiler` / `_get_process` function? I don't see anywhere else we initialize `type_compiler`, but I'd be more comfortable having this logic closer to the `# We have to delay getting the type compiler, because of circular imports. :(` comment.

jimfulton (Contributor, Author), Sep 8, 2021:

I refactored so this is combined and isolated in one place using a new, better-named `_get_subtype_col_spec` function.

sqlalchemy_bigquery/_struct.py Outdated

		type_compiler = base.dialect.type_compiler(base.dialect())
		process = type_compiler.process

		fields = ", ".join(f"{name} {process(type_)}" for name, type_ in self.__fields)

tswast (Collaborator), Sep 8, 2021:

I assume `process` is able to handle nested arrays/structs?

jimfulton (Contributor, Author), Sep 8, 2021:

yes
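To illustrate why a recursive renderer handles nesting without special cases, here is a small sketch. It is not the dialect's type compiler: `col_spec` is a hypothetical helper, and the convention of describing types with plain strings, one-element lists (arrays), and dicts (structs) is an assumption made only for this example.

```python
def col_spec(type_):
    """Render a BigQuery-style type spec from a toy nested description."""
    if isinstance(type_, str):
        # Scalar type name such as "STRING" or "DATE".
        return type_
    if isinstance(type_, list):
        # A one-element list stands for ARRAY<item>.
        (item,) = type_
        return f"ARRAY<{col_spec(item)}>"
    if isinstance(type_, dict):
        # A dict stands for STRUCT<name type, ...>; each field recurses.
        fields = ", ".join(f"{name} {col_spec(t)}" for name, t in type_.items())
        return f"STRUCT<{fields}>"
    raise TypeError(f"Unsupported type description: {type_!r}")


person = {"name": "STRING", "children": [{"name": "STRING", "bdate": "DATE"}]}
print(col_spec(person))
```

Because each field's spec is produced by calling the same function again, arrays of structs and structs of arrays fall out of the same three branches.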

sqlalchemy_bigquery/_struct.py (outdated, resolved)

sqlalchemy_bigquery/_struct.py Outdated

		        f"STRUCT fields can only be accessed with strings field names,"
		        f" not {name}."
		    )
		subtype = self.expr.type._STRUCT__byname.get(name.lower())

tswast (Collaborator), Sep 8, 2021:

Where does `_STRUCT__byname` come from? I'm assuming somewhere from SQLAlchemy, but I'm not getting any results when searching for `byname`.

tswast (Collaborator), Sep 8, 2021:

Oh, I think I figured it out: https://docs.python.org/3/tutorial/classes.html#private-variables

> Any identifier of the form `__spam` (at least two leading underscores, at most one trailing underscore) is textually replaced with `_classname__spam`, where `classname` is the current class name with leading underscore(s) stripped.

Can we comment about this? I assume we have to do it because we know `self.expr.type` is a `STRUCT`, but it's not `self`.
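The mangling behaviour quoted above can be reproduced in a few lines. This is a standalone sketch, not the PR's code: the two classes below are simplified stand-ins, and `lookup` / `broken_lookup` are hypothetical method names chosen for the demonstration.

```python
class STRUCT:
    def __init__(self, **fields):
        # Written inside class STRUCT, so the attribute is actually stored
        # under the mangled name `_STRUCT__byname`.
        self.__byname = {name.lower(): type_ for name, type_ in fields.items()}


class Comparator:
    def __init__(self, type_):
        self.type = type_

    def lookup(self, name):
        # Spell out the mangled name, because this code is not in STRUCT.
        return self.type._STRUCT__byname.get(name.lower())

    def broken_lookup(self, name):
        # Inside Comparator this is rewritten to `_Comparator__byname`,
        # which the STRUCT instance does not have.
        return self.type.__byname.get(name.lower())


comparator = Comparator(STRUCT(Name="STRING"))
print(comparator.lookup("NAME"))

try:
    comparator.broken_lookup("NAME")
except AttributeError as error:
    print("AttributeError:", error)
```

Mangling is applied according to the class whose body contains the code, not the class of the object being accessed, which is why the comparator has to name `_STRUCT__byname` explicitly.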

jimfulton (Contributor, Author), Sep 8, 2021:

I refactored this to make name mangling more explicit and consistent, so I don't think comments are needed anymore. See if you agree. :)

I mainly use "private" variables, which aren't :), to avoid namespace conflicts when subclassing across responsibility boundaries. Arguably, explicit naming is better.

sqlalchemy_bigquery/_struct.py Outdated

		    return operator, index, subtype

		def __getattr__(self, name):
		    if name.lower() in self.expr.type._STRUCT__byname:

tswast (Collaborator), Sep 8, 2021:

I'm a bit confused why `self.__byname` doesn't work in this case.

Edit: I see now that it's part of the `Comparator` class. Still probably worth a similar comment to the one I recommend in `_setup_getitem`.

jimfulton (Contributor, Author), Sep 8, 2021:

See my response on name mangling

sqlalchemy_bigquery/_struct.py (resolved)

sqlalchemy_bigquery/_types.py Outdated

		        for f in field.fields
		    ]
		    results += _get_transitive_schema_fields(sub_fields, cur_fields)
		    cur_fields.pop()

tswast (Collaborator), Sep 8, 2021:

Since we pop these off, does that mean we don't get the top-level struct field, just the leaf fields? I suspect this might hide some ARRAY columns if a parent node has mode `REPEATED`, but is not included.

Edit: I see the top field is added to results on line 83. Might be worth a comment as to why we pop here.

jimfulton (Contributor, Author), Sep 8, 2021:

I got rid of cur_fields. It wasn't needed (anymore).

sqlalchemy_bigquery/_types.py Outdated

		for field in fields:
		    results += [field]
		    if field.field_type == "RECORD":
		        cur_fields.append(field)

tswast (Collaborator), Sep 8, 2021:

I don't quite understand what `cur_fields` is doing. Is there a better name we can pick for this? Maybe it's referring to `ancestors`?

jimfulton (Contributor, Author), Sep 8, 2021:

Haha, it's not doing anything.

This is based on

python-bigquery-sqlalchemy/pybigquery/sqlalchemy_bigquery.py

Lines 503 to 523 in db64424

	def _get_columns_helper(self, columns, cur_columns):
	    """
	    Recurse into record type and return all the nested field names.
	    As contributed by @sumedhsakdeo on issue #17
	    """
	    results = []
	    for col in columns:
	        results += [
	            SchemaField(
	                name=".".join(col.name for col in cur_columns + [col]),
	                field_type=col.field_type,
	                mode=col.mode,
	                description=col.description,
	                fields=col.fields,
	            )
	        ]
	        if col.field_type == "RECORD":
	            cur_columns.append(col)
	            results += self._get_columns_helper(col.fields, cur_columns)
	            cur_columns.pop()
	    return results

, which I inherited.

I've refactored it quite a bit and failed to notice that this wasn't needed any more. Fixed.
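As a rough illustration of how the traversal can work without an ancestor stack, here is a sketch that carries the dotted prefix in the renamed sub-fields themselves. `Field` is a hypothetical stand-in for a schema field (it is not `google.cloud.bigquery.SchemaField`), and `transitive_fields` is an illustrative name, not the function that was merged.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Field:
    """Hypothetical stand-in for a BigQuery schema field."""

    name: str
    field_type: str
    mode: str = "NULLABLE"
    fields: Tuple["Field", ...] = ()


def transitive_fields(fields) -> List[Field]:
    """Return every field plus all nested fields, with dotted names."""
    results = []
    for field in fields:
        # The parent is always included, so REPEATED parents stay visible.
        results.append(field)
        if field.field_type in ("RECORD", "STRUCT"):
            # Prefix children with the parent's (already dotted) name, so no
            # separate list of ancestors has to be pushed and popped.
            sub_fields = [
                Field(f"{field.name}.{f.name}", f.field_type, f.mode, f.fields)
                for f in field.fields
            ]
            results += transitive_fields(sub_fields)
    return results


schema = [
    Field(
        "person",
        "RECORD",
        fields=(
            Field("name", "STRING"),
            Field("address", "RECORD", fields=(Field("city", "STRING"),)),
        ),
    )
]
print([f.name for f in transitive_fields(schema)])
```

Since each recursive call receives fields whose names already include the full prefix, the extra `cur_fields` argument discussed above has nothing left to do.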

tests/unit/test__struct.py Outdated

		(_col().NAME, "`t`.`person`.NAME"),
		(_col().children, "`t`.`person`.children"),
		(
		    _col().children[0].label("anon_1"),  # SQLAlchemy doesn't add the label

tswast (Collaborator), Sep 8, 2021:

Should we file an issue for this to investigate later? If so, let's add `TODO` and link to the issue.

jimfulton (Contributor, Author), Sep 8, 2021:

Done

Jim Fulton and others added 7 commits

September 8, 2021 13:01

Localize logic for getting subtype column specifications

da43fd2

explain semi-private name mangling

f04cac2

Make name mangling more explicit

5af05bb

explain why we have different implementations of _field_index for SQL…

09866c6

…Alchemy 1.3 and 1/4

get rid of cur_fields, we're not using it anymore.

054c227

Also, check for both RECORD and STRUCT field types, in case the API ever starts returning STRUCT.

Add a todo to find out why Sqlalchemy doesn't generate an alias when …

1a79305

…accessing array items

use `repr` rather than `str` to show an object in an error message

5e2ae32

Co-authored-by: Tim Swast <swast@google.com>

tswast approved these changes

Sep 9, 2021

sqlalchemy_bigquery/_struct.py

		global _get_subtype_col_spec

		type_compiler = base.dialect.type_compiler(base.dialect())
		_get_subtype_col_spec = type_compiler.process

tswast (Collaborator), Sep 9, 2021:

Fancy! I didn't realize a function could replace itself. I like it.
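The self-replacing trick can be shown in isolation. The sketch below is a generic version of the idea, not the merged code: `_make_processor` is a hypothetical stand-in for the expensive, import-sensitive setup, and `get_col_spec` and `calls` are illustrative names.

```python
calls = {"built": 0}


def _make_processor():
    """Hypothetical expensive setup, standing in for building a type compiler."""
    calls["built"] += 1
    return lambda type_: f"<{type_}>"


def get_col_spec(type_):
    """On first use, build the real processor and rebind this name to it."""
    global get_col_spec
    get_col_spec = _make_processor()
    return get_col_spec(type_)


print(get_col_spec("INT64"))   # first call triggers the setup
print(get_col_spec("STRING"))  # later calls go straight to the processor
print(calls["built"])
```

One caveat of this approach: code that captured the original function object earlier (for example via `from module import get_col_spec`) keeps calling the bootstrap version, so the pattern works best when callers look the name up through the module each time.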

jimfulton merged commit 6624b10 into googleapis:main

Sep 9, 2021

jimfulton deleted the struct branch

September 9, 2021 15:50

jimfulton mentioned this pull request

Sep 9, 2021

Add support for array and struct literals #67

Closed

gcf-merge-on-green (bot) pushed a commit that referenced this pull request

Sep 9, 2021

chore: release 1.2.0 (#338)

f6d2799

🤖 I have created a release \*beep\* \*boop\*

---

## [1.2.0](https://www.github.com/googleapis/python-bigquery-sqlalchemy/compare/v1.1.0...v1.2.0) (2021-09-09)

### Features

* STRUCT and ARRAY support ([#318](https://www.github.com/googleapis/python-bigquery-sqlalchemy/issues/318)) ([6624b10](https://www.github.com/googleapis/python-bigquery-sqlalchemy/commit/6624b10ded73bbca6f40af73aaeaceb95c381b63))

### Bug Fixes

* the unnest function lost needed type information ([#298](https://www.github.com/googleapis/python-bigquery-sqlalchemy/issues/298)) ([1233182](https://www.github.com/googleapis/python-bigquery-sqlalchemy/commit/123318269876e7f76c7f0f2daa5f5b365026cd3f))

---

This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).

AnastasiaTamazlykar reviewed

Nov 15, 2024

sqlalchemy_bigquery/base.py

		def visit_getitem_binary(self, binary, operator_, **kw):
		    left = self.process(binary.left, **kw)
		    right = self.process(binary.right, **kw)
		    return f"{left}[OFFSET({right})]"

AnastasiaTamazlykar, Nov 15, 2024:

SAFE_OFFSET support would be nice too
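For context, BigQuery's `array[OFFSET(i)]` raises an error when `i` is out of range, while `array[SAFE_OFFSET(i)]` returns NULL instead. A purely illustrative sketch of how the rendering could be parameterized follows; `render_getitem` and its `safe` flag are hypothetical and are not part of the dialect, which as shown above always emits `OFFSET`.

```python
def render_getitem(left: str, right: str, safe: bool = False) -> str:
    """Render an array subscript, optionally using the NULL-returning form."""
    accessor = "SAFE_OFFSET" if safe else "OFFSET"
    return f"{left}[{accessor}({right})]"


print(render_getitem("`t`.`person`.children", "0"))
print(render_getitem("`t`.`person`.children", "0", safe=True))
```

How such a flag would be surfaced through the SQLAlchemy expression API is left open here.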

feat: STRUCT and ARRAY support #318