
pyspark.TaskContext

class pyspark.TaskContext

Contextual information about a task which can be read or mutated during execution. To access the TaskContext for a running task, use: TaskContext.get().

New in version 2.2.0.

Examples

>>> from pyspark import TaskContext

Get a task context instance from an RDD.

>>> spark.sparkContext.setLocalProperty("key1", "value")
>>> taskcontext = spark.sparkContext.parallelize([1]).map(lambda _: TaskContext.get()).first()
>>> isinstance(taskcontext.attemptNumber(), int)
True
>>> isinstance(taskcontext.partitionId(), int)
True
>>> isinstance(taskcontext.stageId(), int)
True
>>> isinstance(taskcontext.taskAttemptId(), int)
True
>>> taskcontext.getLocalProperty("key1")
'value'
>>> isinstance(taskcontext.cpus(), int)
True

Get a task context instance from a dataframe via Python UDF.

>>> from pyspark.sql import Row
>>> from pyspark.sql.functions import udf
>>> @udf("STRUCT<anum: INT, partid: INT, stageid: INT, taskaid: INT, prop: STRING, cpus: INT>")
... def taskcontext_as_row():
...     taskcontext = TaskContext.get()
...     return Row(
...         anum=taskcontext.attemptNumber(),
...         partid=taskcontext.partitionId(),
...         stageid=taskcontext.stageId(),
...         taskaid=taskcontext.taskAttemptId(),
...         prop=taskcontext.getLocalProperty("key2"),
...         cpus=taskcontext.cpus())
...
>>> spark.sparkContext.setLocalProperty("key2", "value")
>>> [(anum, partid, stageid, taskaid, prop, cpus)] = (
...     spark.range(1).select(taskcontext_as_row()).first()
... )
>>> isinstance(anum, int)
True
>>> isinstance(partid, int)
True
>>> isinstance(stageid, int)
True
>>> isinstance(taskaid, int)
True
>>> prop
'value'
>>> isinstance(cpus, int)
True

Get a task context instance from a dataframe via Pandas UDF.

>>> import pandas as pd
>>> from pyspark.sql.functions import pandas_udf
>>> @pandas_udf("STRUCT<"
...     "anum: INT, partid: INT, stageid: INT, taskaid: INT, prop: STRING, cpus: INT>")
... def taskcontext_as_row(_):
...     taskcontext = TaskContext.get()
...     return pd.DataFrame({
...         "anum": [taskcontext.attemptNumber()],
...         "partid": [taskcontext.partitionId()],
...         "stageid": [taskcontext.stageId()],
...         "taskaid": [taskcontext.taskAttemptId()],
...         "prop": [taskcontext.getLocalProperty("key3")],
...         "cpus": [taskcontext.cpus()]
...     })
...
>>> spark.sparkContext.setLocalProperty("key3", "value")
>>> [(anum, partid, stageid, taskaid, prop, cpus)] = (
...     spark.range(1).select(taskcontext_as_row("id")).first()
... )
>>> isinstance(anum, int)
True
>>> isinstance(partid, int)
True
>>> isinstance(stageid, int)
True
>>> isinstance(taskaid, int)
True
>>> prop
'value'
>>> isinstance(cpus, int)
True

Methods

attemptNumber()

How many times this task has been attempted.

cpus()

CPUs allocated to the task.

get()

Return the currently active TaskContext.

getLocalProperty(key)

Get a local property set upstream in the driver, or None if it is missing.

partitionId()

The ID of the RDD partition that is computed by this task.

resources()

Resources allocated to the task.

stageId()

The ID of the stage that this task belongs to.

taskAttemptId()

An ID that is unique to this task attempt (within the same SparkContext, no two task attempts will share the same attempt ID).

