GcsSource

Cloud Storage location for input content.

JSON representation
{
  "inputUris": [
    string
  ],
  "dataSchema": string
}
Fields
inputUris[]

string

Required. Cloud Storage URIs to input files. Each URI can be up to 2000 characters long. URIs can match the full object path (for example, gs://bucket/directory/object.json) or a pattern matching one or more files, such as gs://bucket/directory/*.json.

A request can contain at most 100 files (or 100,000 files if dataSchema is content). Each file can be up to 2 GB (or 100 MB if dataSchema is content).

dataSchema

string

The schema to use when parsing the data from the source.

Supported values for document imports:

  • document (default): One JSON Document per line. Each document must have a valid Document.id.
  • content: Unstructured data (e.g. PDF, HTML). Each file matched by inputUris becomes a document, with the ID set to the first 128 bits of SHA256(URI) encoded as a hex string.
  • custom: One custom data JSON per row in arbitrary format that conforms to the defined Schema of the data store. This can only be used by the GENERIC Data Store vertical.
  • csv: A CSV file with header conforming to the defined Schema of the data store. Each entry after the header is imported as a Document. This can only be used by the GENERIC Data Store vertical.
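
The ID rule for the content schema (first 128 bits of SHA256(URI), hex encoded) can be reproduced client-side, for example to predict which document a given file will map to. The following is a minimal sketch, not the service's implementation: the function name content_document_id is hypothetical, and UTF-8 encoding of the URI before hashing is an assumption, since the byte encoding is not stated here.

```python
import hashlib


def content_document_id(uri: str) -> str:
    """Return the first 128 bits of SHA-256(uri) as a hex string.

    Assumption: the URI is hashed as UTF-8 bytes. 128 bits is 16 bytes,
    which is 32 hexadecimal characters.
    """
    digest = hashlib.sha256(uri.encode("utf-8")).digest()
    return digest[:16].hex()


# Example URI taken from the inputUris description above.
doc_id = content_document_id("gs://bucket/directory/object.json")
print(len(doc_id), doc_id)
```

Because the ID depends only on the URI, re-importing the same object path should yield the same ID under this rule, while moving or renaming the object yields a different one.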

Supported values for user event imports:

  • user_event (default): One JSON UserEvent per line.
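
As an illustration of the two fields described above, the sketch below assembles a GcsSource JSON body and checks the constraints that can be verified locally: inputUris is non-empty, each URI is at most 2000 characters, and dataSchema is one of the listed values. The helper name build_gcs_source and the gs:// prefix check are assumptions made for this example. The per-request file count and file size limits are not checked, because a pattern such as *.json is only expanded by the service.

```python
import json

MAX_URI_LENGTH = 2000  # documented per-URI limit
# Values listed above for document imports and user event imports.
KNOWN_SCHEMAS = {"document", "content", "custom", "csv", "user_event"}


def build_gcs_source(input_uris, data_schema=None):
    """Build a dict shaped like the GcsSource JSON representation.

    Hypothetical helper: validates only what can be checked client-side.
    """
    uris = list(input_uris)
    if not uris:
        raise ValueError("inputUris is required and must not be empty")
    for uri in uris:
        # Assumption: Cloud Storage URIs use the gs:// scheme.
        if not uri.startswith("gs://"):
            raise ValueError(f"not a Cloud Storage URI: {uri!r}")
        if len(uri) > MAX_URI_LENGTH:
            raise ValueError(f"URI exceeds {MAX_URI_LENGTH} characters")

    source = {"inputUris": uris}
    if data_schema is not None:
        if data_schema not in KNOWN_SCHEMAS:
            raise ValueError(f"unsupported dataSchema: {data_schema!r}")
        source["dataSchema"] = data_schema
    # When dataSchema is omitted, the service applies the default
    # (document or user_event, depending on the import type).
    return source


source = build_gcs_source(["gs://bucket/directory/*.json"], "content")
print(json.dumps(source, indent=2))
```

Leaving dataSchema out of the body, rather than hard-coding a default, keeps the same helper usable for both document and user event imports.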

Last updated 2025-06-27 UTC.