- Notifications
You must be signed in to change notification settings - Fork321
feat: support timestamp_precision in table schema#2333
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
base:main
Are you sure you want to change the base?
Conversation
google/cloud/bigquery/schema.py Outdated
| - 6 (Default, for TIMESTAMP type with microsecond precision) | ||
| - 12 (For TIMESTAMP type with picosecond precision) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
should there be an enum or something for this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
This is an option, I did not choose it because the backend defined it to be an integer, and I think we can let the backend handle value validation. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
In other repos, this is something I'd define an enum for, and then accept either the enum or a raw int value
But I don't have too much context on this repo, so up to you if that makes sense here. Consistency with the rest of the code is probably more relevant
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
I see, I think adding enum makes sense here - with enum we can prevent invalid numbers here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Updated to use Enum and updated documentation.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
google/cloud/bigquery/schema.py Outdated
| - 6 (Default, for TIMESTAMP type with microsecond precision) | ||
| - 12 (For TIMESTAMP type with picosecond precision) |
daniel-sancheNov 25, 2025 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
No mention of None?
(It says 6 is the default. Is the server enforcing that, or the client? Can this even return None?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
The server does return None, if we do not set the value. This docstring is copied from the proto files. I just tried to set value to 6, and received the following error: Invalid value for timestampPrecision: 6 is not a valid value
So here we might need to give up consistency with the proto and make sure the doc is user friendly. WDYT? I will also open a bug with the API team.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
opened internal bug 463739109 and updated docstring.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Usually in protos, None is equivalent to 0. So I'd recommend removing the Optional here, and have it return 0 if unset.
But yeah, we should make sure we understand the expected values here first
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
This is REST so empty is indeed an accepted value
Uh oh!
There was an error while loading.Please reload this page.
google/cloud/bigquery/schema.py Outdated
| - 6 (Default, for TIMESTAMP type with microsecond precision) | ||
| - 12 (For TIMESTAMP type with picosecond precision) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
In other repos, this is something I'd define an enum for, and then accept either the enum or a raw int value
But I don't have too much context on this repo, so up to you if that makes sense here. Consistency with the rest of the code is probably more relevant
google/cloud/bigquery/schema.py Outdated
| - 6 (Default, for TIMESTAMP type with microsecond precision) | ||
| - 12 (For TIMESTAMP type with picosecond precision) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Usually in protos, None is equivalent to 0. So I'd recommend removing the Optional here, and have it return 0 if unset.
But yeah, we should make sure we understand the expected values here first
| self.assertTrue(_table_exists(table)) | ||
| self.assertEqual(table.table_id,table_id) | ||
| self.assertEqual(table.schema,SCHEMA_PICOSECOND) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
can we have a test that reads back a timestamp, and makes sure its in the expected range? Or am I misunderstanding?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
This PR only involves creating and reading table schema that has picosecond timestamp. I think we can add the tests in the PR supporting writing to and reading from the table.
daniel-sanche commentedDec 9, 2025
/gemini review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Code Review
This pull request adds support fortimestamp_precision in the table schema, which is a useful enhancement. The implementation is clean, consistent with the existing codebase, and includes both unit and system tests to verify the new functionality. I've found one minor typo in a test method name that should be corrected for consistency.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
google/cloud/bigquery/schema.py Outdated
| @property | ||
| deftimestamp_precision(self): | ||
| """Union[enums.TimestampPrecision, int, None]: Precision (maximum number |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Like I mentioned in a different comment, we should make sure it's stored in a single format, to make things easier when we go back to read it. I'd recommend making this just return the enum type
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
done, when accessing the property, the user will always get an enum, while the api representation is int or None. We also limited the acceptable inputs to enum or None.
google/cloud/bigquery/schema.py Outdated
| @property | ||
| deftimestamp_precision(self): | ||
| """Union[enums.TimestampPrecision, int, None]: Precision (maximum number |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Also, this doctring doesn't seem to match the format of the others. Shouldn't it be in a returns block? And I'd suggest adding the type as an annotation, so it can be caught by mypy
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
both docstring and type annotation done
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
google/cloud/bigquery/schema.py Outdated
| range_element_type:Union[FieldElementType,str,None]=None, | ||
| rounding_mode:Union[enums.RoundingMode,str,None]=None, | ||
| foreign_type_definition:Optional[str]=None, | ||
| timestamp_precision:Union[enums.TimestampPrecision,int,None]=None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
This type annotation isn't actually right with the current code, since it won't ever be a TimestampPrecision instance
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
This annotation is for the type accepted, so I think it is accurate?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
updated to reflect the new types accepted: None and enum. The reason why None is accepted is to ensure an empty api representation when there is no user input, so the string representation remains the same if timestamp_precision is not set
google/cloud/bigquery/enums.py Outdated
| columns to match the field names in the schema.""" | ||
| classTimestampPrecision(object): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Looking at the code again after our talk, I still think this should be turned into a true enum.
With the current code, TimestampPrecision is just a shortcut to creating int or None values.The argument type can't be TimeStampPrecision. This means all the docstrings have to describe all the values asUnion[int | None], and then do more work to describe how to build one or what those special values mean, and we lose a lot of the value of encapsulating it in an enum to begin with
If we change this to actually be an enum, we will have to convert it back to it's raw value before sending it to the backend. But it should make typing a lot cleaner
daniel-sancheDec 9, 2025 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
But if you want to leave it as-is, that's fine. We'd just need to update the typing and docstrings to be accurate
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
As we have discussed offline, I have updated to use enum.Enum, and took the liberty to support None to ensure backward compatibility regarding string representation
google/cloud/bigquery/schema.py Outdated
| self.description, | ||
| self.fields, | ||
| policy_tags, | ||
| self.timestamp_precision, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
if you decide to do an Enum subclass, this could look like: self.timestamp_precision.value
google/cloud/bigquery/schema.py Outdated
| elseNone | ||
| ) | ||
| iftimestamp_precisionisnotNone: | ||
| self._properties["timestampPrecision"]=timestamp_precision |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
if you make TimestampPrecision into an enum, you can save this as a primitive here, and then rebuild in the property
self._properties["timestampPrecision"] = timestamp_precision.value if isinstance(TimestampPrecision) else timestamp_precision
daniel-sanche left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
LGTM, with some questions/comments
| # The API would return a string despite we send an integer. To ensure | ||
| # success of resending received schema, we convert string to integer | ||
| # to ensure consistency. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
This seems a bit surprising. The backend returns"12"? Or is this converted somewhere locally
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Yes, the backend expects an integer but returns a string. I will open a bug with the BigQuery team.
| andtype(api_repr.get("timestampPrecision"))isstr | ||
| ): | ||
| api_repr["timestampPrecision"]=int(api_repr["timestampPrecision"]) | ||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
I was going to point out there was an error here, because the docstrings sayMapping but you're only checking againstdict. But I guess the actual type annotation saysdict. So there's definitely an inconsistency there, but maybe this logic is ok.
Still, I think this could be a bit cleaner as a try/catch
try: # ensure timestamp is in int format api_repr["timestampPrecision"] = int(api_repr["timestampPrecision"])except TypeError, KeyError: # ignore unset timestamps pass
Thank you for opening a Pull Request! Before submitting your PR, there are a few things you can do to make sure it goes smoothly:
Fixes #<issue_number_goes_here> 🦕