
Python Data Source

DataSource.name()

Returns a string representing the format name of this data source.

DataSource.reader(schema)

Returns a DataSourceReader instance for reading data.

DataSource.schema()

Returns the schema of the data source.

DataSource.streamReader(schema)

Returns a DataSourceStreamReader instance for reading streaming data.

DataSource.writer(schema, overwrite)

Returns a DataSourceWriter instance for writing data.
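
The batch-read surface of these methods can be sketched with a minimal DataSource subclass. The names SimpleDataSource, SimpleReader, and the "simple" format name are illustrative, not part of PySpark; the import fallback only lets the sketch run where PySpark is not installed.

```python
try:
    from pyspark.sql.datasource import DataSource, DataSourceReader
except ImportError:  # minimal stand-ins so the sketch runs without PySpark
    class DataSource:
        def __init__(self, options=None):
            self.options = options or {}

    class DataSourceReader:
        pass


class SimpleDataSource(DataSource):
    """Hypothetical batch data source registered under the name "simple"."""

    @classmethod
    def name(cls):
        # The format name used in spark.read.format(...).
        return "simple"

    def schema(self):
        # Either a DDL string or a StructType may be returned.
        return "id INT, value STRING"

    def reader(self, schema):
        return SimpleReader()


class SimpleReader(DataSourceReader):
    def read(self, partition):
        # Yield plain tuples matching the declared schema.
        yield (0, "a")
        yield (1, "b")
```

Once registered with spark.dataSource.register(SimpleDataSource) (see DataSourceRegistration.register below), the source can be loaded with spark.read.format("simple").load().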

DataSourceReader.partitions()

Returns an iterator of partitions for this data source.

DataSourceReader.read(partition)

Generates data for a given partition and returns an iterator of tuples or rows.
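
The partitions()/read() pair can be sketched as below. RangeReader and its fixed two-way split are hypothetical; Spark calls partitions() once on the driver and read() once per partition on the executors, so each InputPartition's value must be picklable.

```python
try:
    from pyspark.sql.datasource import DataSourceReader, InputPartition
except ImportError:  # minimal stand-ins so the sketch runs without PySpark
    class InputPartition:
        def __init__(self, value):
            self.value = value

    class DataSourceReader:
        pass


class RangeReader(DataSourceReader):
    """Hypothetical reader that splits rows 0..5 across two partitions."""

    def partitions(self):
        # One InputPartition per parallel read task.
        return [InputPartition(0), InputPartition(1)]

    def read(self, partition):
        # Each task generates only the rows of its own partition.
        start = partition.value * 3
        for i in range(start, start + 3):
            yield (i,)
```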

DataSourceRegistration.register(dataSource)

Register a Python user-defined data source.

DataSourceStreamReader.commit(end)

Informs the source that Spark has completed processing all data for offsets less than or equal to end and will only request offsets greater than end in the future.

DataSourceStreamReader.initialOffset()

Returns the initial offset of the streaming data source.

DataSourceStreamReader.latestOffset()

Returns the most recent offset available.

DataSourceStreamReader.partitions(start, end)

Returns a list of InputPartition given the start and end offsets.

DataSourceStreamReader.read(partition)

Generates data for a given partition and returns an iterator of tuples or rows.

DataSourceStreamReader.stop()

Stops this source and frees any resources it has allocated.
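
The streaming lifecycle above (initialOffset, latestOffset, partitions, read, commit, stop) can be sketched with a hypothetical CounterStreamReader. The fixed batch size of 5 and the {"offset": ...} shape are assumptions for illustration; offsets are plain dicts so they can be serialized for checkpointing.

```python
try:
    from pyspark.sql.datasource import DataSourceStreamReader, InputPartition
except ImportError:  # minimal stand-ins so the sketch runs without PySpark
    class InputPartition:
        def __init__(self, value):
            self.value = value

    class DataSourceStreamReader:
        pass


class CounterStreamReader(DataSourceStreamReader):
    """Hypothetical stream reader that emits consecutive integers."""

    def initialOffset(self):
        # Where to start when there is no checkpoint yet.
        return {"offset": 0}

    def latestOffset(self):
        # A real source would query the external system; this sketch
        # simply advances by a fixed batch of 5 rows.
        return {"offset": 5}

    def partitions(self, start, end):
        # One partition covering the whole (start, end] range.
        return [InputPartition((start["offset"], end["offset"]))]

    def read(self, partition):
        lo, hi = partition.value
        for i in range(lo, hi):
            yield (i,)

    def commit(self, end):
        # Spark will not request offsets <= end again; release any
        # per-offset bookkeeping here.
        pass

    def stop(self):
        # Free connections or other resources held by the source.
        pass
```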

DataSourceWriter.abort(messages)

Aborts this writing job due to task failures.

DataSourceWriter.commit(messages)

Commits this writing job with a list of commit messages.

DataSourceWriter.write(iterator)

Writes data into the data source.
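
The writer lifecycle can be sketched as follows: write() runs once per task on the executors and returns a commit message, then commit() (on success) or abort() (on failure) runs on the driver with all collected messages. CountingWriter and CountMessage are hypothetical names; subclassing WriterCommitMessage as a dataclass is one convenient way to carry per-task results.

```python
from dataclasses import dataclass

try:
    from pyspark.sql.datasource import DataSourceWriter, WriterCommitMessage
except ImportError:  # minimal stand-ins so the sketch runs without PySpark
    class DataSourceWriter:
        pass

    class WriterCommitMessage:
        pass


@dataclass
class CountMessage(WriterCommitMessage):
    """Hypothetical per-task commit message carrying a row count."""
    count: int


class CountingWriter(DataSourceWriter):
    def write(self, iterator):
        # Runs once per task on the executors; consume the rows and
        # report how many were written.
        return CountMessage(count=sum(1 for _ in iterator))

    def commit(self, messages):
        # Runs on the driver after every task has succeeded.
        total = sum(m.count for m in messages)
        print(f"Committed {total} rows")

    def abort(self, messages):
        # Runs on the driver when tasks fail; clean up partial output here.
        pass
```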
