Apache Spark Integration

Version: 1.5.0 (View all)
Subscription level: Basic
Developed by: Elastic
Ingestion method(s): Jolokia
Minimum Kibana version(s): 9.0.0, 8.13.0

Apache Spark is an open-source, distributed computing system that provides a fast and general-purpose cluster-computing framework. It offers in-memory data processing, which significantly enhances the performance of big data analytics applications. Spark provides support for a variety of programming languages including Scala, Python, Java, and R, and comes with built-in modules for SQL, streaming, machine learning, and graph processing. This makes it a versatile tool for a wide range of data processing and analysis tasks.

Use the Apache Spark integration to:

  • Collect metrics related to the application, driver, executor and node.
  • Create visualizations to monitor, measure, and analyze usage trends and key data, deriving business insights.
  • Create alerts to reduce the MTTD and MTTR by referencing relevant logs when troubleshooting an issue.

This integration has been tested against Apache Spark version 3.5.0.

The Apache Spark integration collects metrics data.

Metrics provide insight into the statistics of Apache Spark. The metric data streams collected by the Apache Spark integration include application, driver, executor, and node, allowing users to monitor and troubleshoot the performance of their Apache Spark instance.

Data streams:

  • application: Collects information related to the number of cores used, application name, runtime in milliseconds and current status of the application.
  • driver: Collects information related to the driver details, job durations, task execution, memory usage, executor status and JVM metrics.
  • executor: Collects information related to the operations, memory usage, garbage collection, file handling, and threadpool activity.
  • node: Collects information related to the application count, waiting applications, worker metrics, executor count, core usage and memory usage.

Note:

  • Users can view the metrics in the ingested documents for Apache Spark under the metrics-* index pattern in Discover (see the example request below).
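
For example, a quick way to confirm that documents are arriving is to query the metrics-* indices directly. This is an illustrative request only; the Elasticsearch URL, credentials, and the dataset used in the q filter are assumptions to adapt to your environment:

# Count ingested Apache Spark application metrics documents (illustrative; adjust host, credentials, and dataset).
curl -u elastic:$ELASTIC_PASSWORD \
  "https://localhost:9200/metrics-*/_count?q=data_stream.dataset:apache_spark.application&pretty"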

You need Elasticsearch for storing and searching your data and Kibana for visualizing and managing it. You can use our hosted Elasticsearch Service on Elastic Cloud, which is recommended, or self-manage the Elastic Stack on your own hardware.

To ingest data from Apache Spark, you must know the full hosts for the Main and Worker nodes.

To proceed with the Jolokia setup, Apache Spark should be installed as a standalone deployment. Make sure the Spark folder is installed under the /usr/local path; if not, specify the actual path of the Spark folder in the steps that follow. You can download the standalone distribution from the official Apache Spark download page.
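
As a minimal sketch of that install step, assuming you want the tested 3.5.0 release under /usr/local/spark (the version and download URL are assumptions; use the release and mirror that apply to you):

# Download and unpack a Spark standalone distribution (illustrative version and URL).
wget https://archive.apache.org/dist/spark/spark-3.5.0/spark-3.5.0-bin-hadoop3.tgz
tar -xzf spark-3.5.0-bin-hadoop3.tgz
sudo mv spark-3.5.0-bin-hadoop3 /usr/local/spark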

To gather Spark statistics, we need to download and enable the Jolokia JVM agent:

cd /usr/share/java/
wget -O jolokia-agent.jar http://search.maven.org/remotecontent?filepath=org/jolokia/jolokia-jvm/1.3.6/jolokia-jvm-1.3.6-agent.jar

Once the Jolokia JVM agent is downloaded, configure Apache Spark to load it as a Java agent and expose metrics via HTTP/JSON. Edit spark-env.sh, which should be in /usr/local/spark/conf, and add the following parameters (assuming the Spark install folder is /usr/local/spark; if not, change the path to the one where Spark is installed):

export SPARK_MASTER_OPTS="$SPARK_MASTER_OPTS -javaagent:/usr/share/java/jolokia-agent.jar=config=/usr/local/spark/conf/jolokia-master.properties"

Now, create the /usr/local/spark/conf/jolokia-master.properties file with the following content:

host=0.0.0.0
port=7777
agentContext=/jolokia
backlog=100
policyLocation=file:///usr/local/spark/conf/jolokia.policy
historyMaxEntries=10
debug=false
debugMaxEntries=100
maxDepth=15
maxCollectionSize=1000
maxObjects=0

Now create /usr/local/spark/conf/jolokia.policy with the following content:

<?xml version="1.0" encoding="utf-8"?>
<restrict>
  <http>
    <method>get</method>
    <method>post</method>
  </http>
  <commands>
    <command>read</command>
  </commands>
</restrict>

Configure the agent with the following in the conf/bigdata.ini file:

[Spark-Master]
stats: http://127.0.0.1:7777/jolokia/read

Restart Spark master.
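
To confirm the agent is reachable after the restart, you can optionally query Jolokia's version endpoint. This assumes the master runs on the local host with the port configured above:

# Optional sanity check (illustrative): Jolokia should answer with agent and protocol version details.
curl http://localhost:7777/jolokia/version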

Follow the same set of steps for Spark Worker, Driver and Executor.
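
As a sketch of what those steps look like, assuming the same agent and configuration paths as above and illustrative ports (for example 7778 for the worker, 7779 for the driver, 7780 for the executor), create a matching jolokia-*.properties file per component and point each JVM at it:

# Worker: add to /usr/local/spark/conf/spark-env.sh (illustrative; adjust paths and ports to your setup).
export SPARK_WORKER_OPTS="$SPARK_WORKER_OPTS -javaagent:/usr/share/java/jolokia-agent.jar=config=/usr/local/spark/conf/jolokia-worker.properties"

# Driver and executor JVMs: pass the agent per application (illustrative spark-submit flags; your_app.py is a placeholder).
/usr/local/spark/bin/spark-submit \
  --conf "spark.driver.extraJavaOptions=-javaagent:/usr/share/java/jolokia-agent.jar=config=/usr/local/spark/conf/jolokia-driver.properties" \
  --conf "spark.executor.extraJavaOptions=-javaagent:/usr/share/java/jolokia-agent.jar=config=/usr/local/spark/conf/jolokia-executor.properties" \
  your_app.py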

For step-by-step instructions on how to set up an integration, refer to the Getting Started guide.

After the integration is successfully configured, click the Assets tab of the Apache Spark integration to display the available dashboards. Select the dashboard for your configured data stream, which should be populated with the required data.

If host.ip appears conflicted under the metrics-* data view, this issue can be resolved by reindexing the application, driver, executor, and node data streams.
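
As a minimal illustration of a reindex request (the full reindexing procedure referenced above is authoritative; the index names, host, and credentials below are assumptions based on a default namespace):

# Illustrative only: copy documents from a conflicting backing index into a new index.
curl -u elastic:$ELASTIC_PASSWORD -X POST "https://localhost:9200/_reindex?pretty" \
  -H 'Content-Type: application/json' -d'
{
  "source": { "index": "metrics-apache_spark.application-default" },
  "dest": { "index": "metrics-apache_spark.application-default-restored" }
}'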

The application data stream collects metrics related to the number of cores used, application name, runtime in milliseconds, and current status of the application.

Example
{    "@timestamp": "2023-09-28T09:24:33.812Z",    "agent": {        "ephemeral_id": "20d060ec-da41-4f14-a187-d020b9fbec7d",        "id": "a6bdbb4a-4bac-4243-83cb-dba157f24987",        "name": "docker-fleet-agent",        "type": "metricbeat",        "version": "8.8.0"    },    "apache_spark": {        "application": {            "cores": 8,            "mbean": "metrics:name=application.PythonWordCount.1695893057562.cores,type=gauges",            "name": "PythonWordCount.1695893057562"        }    },    "data_stream": {        "dataset": "apache_spark.application",        "namespace": "ep",        "type": "metrics"    },    "ecs": {        "version": "8.11.0"    },    "elastic_agent": {        "id": "a6bdbb4a-4bac-4243-83cb-dba157f24987",        "snapshot": false,        "version": "8.8.0"    },    "event": {        "agent_id_status": "verified",        "dataset": "apache_spark.application",        "duration": 23828342,        "ingested": "2023-09-28T09:24:37Z",        "kind": "metric",        "module": "apache_spark",        "type": [            "info"        ]    },    "host": {        "architecture": "x86_64",        "containerized": true,        "hostname": "docker-fleet-agent",        "id": "e8978f2086c14e13b7a0af9ed0011d19",        "ip": [            "172.20.0.7"        ],        "mac": [            "02-42-C0-A8-F5-07"        ],        "name": "docker-fleet-agent",        "os": {            "codename": "focal",            "family": "debian",            "kernel": "3.10.0-1160.90.1.el7.x86_64",            "name": "Ubuntu",            "platform": "ubuntu",            "type": "linux",            "version": "20.04.6 LTS (Focal Fossa)"        }    },    "metricset": {        "name": "jmx",        "period": 60000    },    "service": {        "address": "http://apache-spark-main:7777/jolokia/%3FignoreErrors=true&canonicalNaming=false",        "type": "jolokia"    }}

ECS Field Reference

Refer to the following document for detailed information on ECS fields.

Exported fields
Field | Description | Type | Metric Type
@timestamp | Event timestamp. | date
agent.id | Unique identifier of this agent (if one exists). Example: For Beats this would be beat.id. | keyword
apache_spark.application.cores | Number of cores. | long | gauge
apache_spark.application.mbean | The name of the jolokia mbean. | keyword
apache_spark.application.name | Name of the application. | keyword
apache_spark.application.runtime.ms | Time taken to run the application (ms). | long | gauge
apache_spark.application.status | Current status of the application. | keyword
cloud.account.id | The cloud account or organization id used to identify different entities in a multi-tenant environment. Examples: AWS account id, Google Cloud ORG Id, or other unique identifier. | keyword
cloud.availability_zone | Availability zone in which this host, resource, or service is located. | keyword
cloud.instance.id | Instance ID of the host machine. | keyword
cloud.provider | Name of the cloud provider. Example values are aws, azure, gcp, or digitalocean. | keyword
cloud.region | Region in which this host, resource, or service is located. | keyword
container.id | Unique container id. | keyword
data_stream.dataset | Data stream dataset. | constant_keyword
data_stream.namespace | Data stream namespace. | constant_keyword
data_stream.type | Data stream type. | constant_keyword
host.name | Name of the host. It can contain what hostname returns on Unix systems, the fully qualified domain name (FQDN), or a name specified by the user. The recommended value is the lowercase FQDN of the host. | keyword
service.address | Address where data about this service was collected from. This should be a URI, network address (ipv4:port or [ipv6]:port) or a resource path (sockets). | keyword

The driver data stream collects metrics related to the driver details, job durations, task execution, memory usage, executor status, and JVM metrics.

Example
{    "@timestamp": "2023-09-29T12:04:40.050Z",    "agent": {        "ephemeral_id": "e3534e18-b92f-4b1b-bd39-43ff9c8849d4",        "id": "a76f5e50-2a98-4b96-80f6-026ad822e3e8",        "name": "docker-fleet-agent",        "type": "metricbeat",        "version": "8.8.0"    },    "apache_spark": {        "driver": {            "application_name": "app-20230929120427-0000",            "jvm": {                "cpu": {                    "time": 25730000000                }            },            "mbean": "metrics:name=app-20230929120427-0000.driver.JVMCPU.jvmCpuTime,type=gauges"        }    },    "data_stream": {        "dataset": "apache_spark.driver",        "namespace": "ep",        "type": "metrics"    },    "ecs": {        "version": "8.11.0"    },    "elastic_agent": {        "id": "a76f5e50-2a98-4b96-80f6-026ad822e3e8",        "snapshot": false,        "version": "8.8.0"    },    "event": {        "agent_id_status": "verified",        "dataset": "apache_spark.driver",        "duration": 177706950,        "ingested": "2023-09-29T12:04:41Z",        "kind": "metric",        "module": "apache_spark",        "type": [            "info"        ]    },    "host": {        "architecture": "x86_64",        "containerized": true,        "hostname": "docker-fleet-agent",        "id": "e8978f2086c14e13b7a0af9ed0011d19",        "ip": [            "172.26.0.7"        ],        "mac": [            "02-42-AC-1A-00-07"        ],        "name": "docker-fleet-agent",        "os": {            "codename": "focal",            "family": "debian",            "kernel": "3.10.0-1160.90.1.el7.x86_64",            "name": "Ubuntu",            "platform": "ubuntu",            "type": "linux",            "version": "20.04.6 LTS (Focal Fossa)"        }    },    "metricset": {        "name": "jmx",        "period": 60000    },    "service": {        "address": "http://apache-spark-main:7779/jolokia/%3FignoreErrors=true&canonicalNaming=false",        "type": "jolokia"    }}

ECS Field Reference

Refer to the following document for detailed information on ECS fields.

Exported fields
Field | Description | Type | Metric Type
@timestamp | Event timestamp. | date
agent.id | Unique identifier of this agent (if one exists). Example: For Beats this would be beat.id. | keyword
apache_spark.driver.application_name | Name of the application. | keyword
apache_spark.driver.dag_scheduler.job.active | Number of active jobs. | long | gauge
apache_spark.driver.dag_scheduler.job.all | Total number of jobs. | long | gauge
apache_spark.driver.dag_scheduler.stages.failed | Number of failed stages. | long | gauge
apache_spark.driver.dag_scheduler.stages.running | Number of running stages. | long | gauge
apache_spark.driver.dag_scheduler.stages.waiting | Number of waiting stages. | long | gauge
apache_spark.driver.disk.space_used | Amount of the disk space utilized in MB. | long | gauge
apache_spark.driver.executor_metrics.gc.major.count | Total major GC count. For example, the garbage collector is one of MarkSweepCompact, PS MarkSweep, ConcurrentMarkSweep, G1 Old Generation and so on. | long | gauge
apache_spark.driver.executor_metrics.gc.major.time | Elapsed total major GC time. The value is expressed in milliseconds. | long | gauge
apache_spark.driver.executor_metrics.gc.minor.count | Total minor GC count. For example, the garbage collector is one of Copy, PS Scavenge, ParNew, G1 Young Generation and so on. | long | gauge
apache_spark.driver.executor_metrics.gc.minor.time | Elapsed total minor GC time. The value is expressed in milliseconds. | long | gauge
apache_spark.driver.executor_metrics.heap_memory.off.execution | Peak off heap execution memory in use, in bytes. | long | gauge
apache_spark.driver.executor_metrics.heap_memory.off.storage | Peak off heap storage memory in use, in bytes. | long | gauge
apache_spark.driver.executor_metrics.heap_memory.off.unified | Peak off heap memory (execution and storage). | long | gauge
apache_spark.driver.executor_metrics.heap_memory.on.execution | Peak on heap execution memory in use, in bytes. | long | gauge
apache_spark.driver.executor_metrics.heap_memory.on.storage | Peak on heap storage memory in use, in bytes. | long | gauge
apache_spark.driver.executor_metrics.heap_memory.on.unified | Peak on heap memory (execution and storage). | long | gauge
apache_spark.driver.executor_metrics.memory.direct_pool | Peak memory that the JVM is using for direct buffer pool. | long | gauge
apache_spark.driver.executor_metrics.memory.jvm.heap | Peak memory usage of the heap that is used for object allocation. | long | counter
apache_spark.driver.executor_metrics.memory.jvm.off_heap | Peak memory usage of non-heap memory that is used by the Java virtual machine. | long | counter
apache_spark.driver.executor_metrics.memory.mapped_pool | Peak memory that the JVM is using for mapped buffer pool. | long | gauge
apache_spark.driver.executor_metrics.process_tree.jvm.rss_memory | Resident Set Size: number of pages the process has in real memory. This is just the pages which count toward text, data, or stack space. This does not include pages which have not been demand-loaded in, or which are swapped out. | long | gauge
apache_spark.driver.executor_metrics.process_tree.jvm.v_memory | Virtual memory size in bytes. | long | gauge
apache_spark.driver.executor_metrics.process_tree.other.rss_memory | | long | gauge
apache_spark.driver.executor_metrics.process_tree.other.v_memory | | long | gauge
apache_spark.driver.executor_metrics.process_tree.python.rss_memory | | long | gauge
apache_spark.driver.executor_metrics.process_tree.python.v_memory | | long | gauge
apache_spark.driver.executors.all | Total number of executors. | long | gauge
apache_spark.driver.executors.decommission_unfinished | Total number of decommissioned unfinished executors. | long | counter
apache_spark.driver.executors.exited_unexpectedly | Total number of executors exited unexpectedly. | long | counter
apache_spark.driver.executors.gracefully_decommissioned | Total number of executors gracefully decommissioned. | long | counter
apache_spark.driver.executors.killed_by_driver | Total number of executors killed by driver. | long | counter
apache_spark.driver.executors.max_needed | Maximum number of executors needed. | long | gauge
apache_spark.driver.executors.pending_to_remove | Total number of executors pending to be removed. | long | gauge
apache_spark.driver.executors.target | Total number of target executors. | long | gauge
apache_spark.driver.executors.to_add | Total number of executors to be added. | long | gauge
apache_spark.driver.hive_external_catalog.file_cache_hits | Total number of file cache hits. | long | counter
apache_spark.driver.hive_external_catalog.files_discovered | Total number of files discovered. | long | counter
apache_spark.driver.hive_external_catalog.hive_client_calls | Total number of Hive Client calls. | long | counter
apache_spark.driver.hive_external_catalog.parallel_listing_job.count | Number of jobs running in parallel. | long | counter
apache_spark.driver.hive_external_catalog.partitions_fetched | Number of partitions fetched. | long | counter
apache_spark.driver.job_duration | Duration of the job. | long | gauge
apache_spark.driver.jobs.failed | Number of failed jobs. | long | counter
apache_spark.driver.jobs.succeeded | Number of successful jobs. | long | counter
apache_spark.driver.jvm.cpu.time | Elapsed CPU time the JVM spent. | long | gauge
apache_spark.driver.mbean | The name of the jolokia mbean. | keyword
apache_spark.driver.memory.max_mem | Maximum amount of memory available for storage, in MB. | long | gauge
apache_spark.driver.memory.off_heap.max | Maximum amount of off heap memory available, in MB. | long | gauge
apache_spark.driver.memory.off_heap.remaining | Remaining amount of off heap memory, in MB. | long | gauge
apache_spark.driver.memory.off_heap.used | Total amount of off heap memory used, in MB. | long | gauge
apache_spark.driver.memory.on_heap.max | Maximum amount of on heap memory available, in MB. | long | gauge
apache_spark.driver.memory.on_heap.remaining | Remaining amount of on heap memory, in MB. | long | gauge
apache_spark.driver.memory.on_heap.used | Total amount of on heap memory used, in MB. | long | gauge
apache_spark.driver.memory.remaining | Remaining amount of storage memory, in MB. | long | gauge
apache_spark.driver.memory.used | Total amount of memory used for storage, in MB. | long | gauge
apache_spark.driver.spark.streaming.event_time.watermark | | long | gauge
apache_spark.driver.spark.streaming.input_rate.total | Total rate of the input. | double | gauge
apache_spark.driver.spark.streaming.latency | | long | gauge
apache_spark.driver.spark.streaming.processing_rate.total | Total rate of processing. | double | gauge
apache_spark.driver.spark.streaming.states.rows.total | Total number of rows. | long | gauge
apache_spark.driver.spark.streaming.states.used_bytes | Total number of bytes utilized. | long | gauge
apache_spark.driver.stages.completed_count | Total number of completed stages. | long | counter
apache_spark.driver.stages.failed_count | Total number of failed stages. | long | counter
apache_spark.driver.stages.skipped_count | Total number of skipped stages. | long | counter
apache_spark.driver.tasks.completed | Number of completed tasks. | long | counter
apache_spark.driver.tasks.executors.black_listed | Number of blacklisted executors for the tasks. | long | counter
apache_spark.driver.tasks.executors.excluded | Number of excluded executors for the tasks. | long | counter
apache_spark.driver.tasks.executors.unblack_listed | Number of unblacklisted executors for the tasks. | long | counter
apache_spark.driver.tasks.executors.unexcluded | Number of unexcluded executors for the tasks. | long | counter
apache_spark.driver.tasks.failed | Number of failed tasks. | long | counter
apache_spark.driver.tasks.killed | Number of killed tasks. | long | counter
apache_spark.driver.tasks.skipped | Number of skipped tasks. | long | counter
cloud.account.id | The cloud account or organization id used to identify different entities in a multi-tenant environment. Examples: AWS account id, Google Cloud ORG Id, or other unique identifier. | keyword
cloud.availability_zone | Availability zone in which this host, resource, or service is located. | keyword
cloud.instance.id | Instance ID of the host machine. | keyword
cloud.provider | Name of the cloud provider. Example values are aws, azure, gcp, or digitalocean. | keyword
cloud.region | Region in which this host, resource, or service is located. | keyword
container.id | Unique container id. | keyword
data_stream.dataset | Data stream dataset. | constant_keyword
data_stream.namespace | Data stream namespace. | constant_keyword
data_stream.type | Data stream type. | constant_keyword
host.name | Name of the host. It can contain what hostname returns on Unix systems, the fully qualified domain name (FQDN), or a name specified by the user. The recommended value is the lowercase FQDN of the host. | keyword
service.address | Address where data about this service was collected from. This should be a URI, network address (ipv4:port or [ipv6]:port) or a resource path (sockets). | keyword

The executor data stream collects metrics related to the operations, memory usage, garbage collection, file handling, and threadpool activity.

Example
{    "@timestamp": "2023-09-28T09:26:45.771Z",    "agent": {        "ephemeral_id": "3a3db920-eb4b-4045-b351-33526910ae8a",        "id": "a6bdbb4a-4bac-4243-83cb-dba157f24987",        "name": "docker-fleet-agent",        "type": "metricbeat",        "version": "8.8.0"    },    "apache_spark": {        "executor": {            "application_name": "app-20230928092630-0000",            "id": "0",            "jvm": {                "cpu_time": 20010000000            },            "mbean": "metrics:name=app-20230928092630-0000.0.JVMCPU.jvmCpuTime,type=gauges"        }    },    "data_stream": {        "dataset": "apache_spark.executor",        "namespace": "ep",        "type": "metrics"    },    "ecs": {        "version": "8.11.0"    },    "elastic_agent": {        "id": "a6bdbb4a-4bac-4243-83cb-dba157f24987",        "snapshot": false,        "version": "8.8.0"    },    "event": {        "agent_id_status": "verified",        "dataset": "apache_spark.executor",        "duration": 2849184715,        "ingested": "2023-09-28T09:26:49Z",        "kind": "metric",        "module": "apache_spark",        "type": [            "info"        ]    },    "host": {        "architecture": "x86_64",        "containerized": true,        "hostname": "docker-fleet-agent",        "id": "e8978f2086c14e13b7a0af9ed0011d19",        "ip": [            "172.20.0.7"        ],        "mac": [            "02-42-AC-14-00-07"        ],        "name": "docker-fleet-agent",        "os": {            "codename": "focal",            "family": "debian",            "kernel": "3.10.0-1160.90.1.el7.x86_64",            "name": "Ubuntu",            "platform": "ubuntu",            "type": "linux",            "version": "20.04.6 LTS (Focal Fossa)"        }    },    "metricset": {        "name": "jmx",        "period": 60000    },    "service": {        "address": "http://apache-spark-main:7780/jolokia/%3FignoreErrors=true&canonicalNaming=false",        "type": "jolokia"    }}

ECS Field Reference

Refer to the following document for detailed information on ECS fields.

Exported fields
Field | Description | Type | Metric Type
@timestamp | Event timestamp. | date
agent.id | Unique identifier of this agent (if one exists). Example: For Beats this would be beat.id. | keyword
apache_spark.executor.application_name | Name of application. | keyword
apache_spark.executor.bytes.read | Total number of bytes read. | long | counter
apache_spark.executor.bytes.written | Total number of bytes written. | long | counter
apache_spark.executor.disk_bytes_spilled | Total number of disk bytes spilled. | long | counter
apache_spark.executor.file_cache_hits | Total number of file cache hits. | long | counter
apache_spark.executor.files_discovered | Total number of files discovered. | long | counter
apache_spark.executor.filesystem.file.large_read_ops | Total number of large read operations from the files. | long | gauge
apache_spark.executor.filesystem.file.read_bytes | Total number of bytes read from the files. | long | gauge
apache_spark.executor.filesystem.file.read_ops | Total number of read operations from the files. | long | gauge
apache_spark.executor.filesystem.file.write_bytes | Total number of bytes written from the files. | long | gauge
apache_spark.executor.filesystem.file.write_ops | Total number of write operations from the files. | long | gauge
apache_spark.executor.filesystem.hdfs.large_read_ops | Total number of large read operations from HDFS. | long | gauge
apache_spark.executor.filesystem.hdfs.read_bytes | Total number of read bytes from HDFS. | long | gauge
apache_spark.executor.filesystem.hdfs.read_ops | Total number of read operations from HDFS. | long | gauge
apache_spark.executor.filesystem.hdfs.write_bytes | Total number of write bytes from HDFS. | long | gauge
apache_spark.executor.filesystem.hdfs.write_ops | Total number of write operations from HDFS. | long | gauge
apache_spark.executor.gc.major.count | Total major GC count. For example, the garbage collector is one of MarkSweepCompact, PS MarkSweep, ConcurrentMarkSweep, G1 Old Generation and so on. | long | gauge
apache_spark.executor.gc.major.time | Elapsed total major GC time. The value is expressed in milliseconds. | long | gauge
apache_spark.executor.gc.minor.count | Total minor GC count. For example, the garbage collector is one of Copy, PS Scavenge, ParNew, G1 Young Generation and so on. | long | gauge
apache_spark.executor.gc.minor.time | Elapsed total minor GC time. The value is expressed in milliseconds. | long | gauge
apache_spark.executor.heap_memory.off.execution | Peak off heap execution memory in use, in bytes. | long | gauge
apache_spark.executor.heap_memory.off.storage | Peak off heap storage memory in use, in bytes. | long | gauge
apache_spark.executor.heap_memory.off.unified | Peak off heap memory (execution and storage). | long | gauge
apache_spark.executor.heap_memory.on.execution | Peak on heap execution memory in use, in bytes. | long | gauge
apache_spark.executor.heap_memory.on.storage | Peak on heap storage memory in use, in bytes. | long | gauge
apache_spark.executor.heap_memory.on.unified | Peak on heap memory (execution and storage). | long | gauge
apache_spark.executor.hive_client_calls | Total number of Hive Client calls. | long | counter
apache_spark.executor.id | ID of executor. | keyword
apache_spark.executor.jvm.cpu_time | Elapsed CPU time the JVM spent. | long | gauge
apache_spark.executor.jvm.gc_time | Elapsed time the JVM spent in garbage collection while executing this task. | long | counter
apache_spark.executor.mbean | The name of the jolokia mbean. | keyword
apache_spark.executor.memory.direct_pool | Peak memory that the JVM is using for direct buffer pool. | long | gauge
apache_spark.executor.memory.jvm.heap | Peak memory usage of the heap that is used for object allocation. | long | gauge
apache_spark.executor.memory.jvm.off_heap | Peak memory usage of non-heap memory that is used by the Java virtual machine. | long | gauge
apache_spark.executor.memory.mapped_pool | Peak memory that the JVM is using for mapped buffer pool. | long | gauge
apache_spark.executor.memory_bytes_spilled | The number of in-memory bytes spilled by this task. | long | counter
apache_spark.executor.parallel_listing_job_count | Number of jobs running in parallel. | long | counter
apache_spark.executor.partitions_fetched | Number of partitions fetched. | long | counter
apache_spark.executor.process_tree.jvm.rss_memory | Resident Set Size: number of pages the process has in real memory. This is just the pages which count toward text, data, or stack space. This does not include pages which have not been demand-loaded in, or which are swapped out. | long | gauge
apache_spark.executor.process_tree.jvm.v_memory | Virtual memory size in bytes. | long | gauge
apache_spark.executor.process_tree.other.rss_memory | Resident Set Size for other kind of process. | long | gauge
apache_spark.executor.process_tree.other.v_memory | Virtual memory size for other kind of process in bytes. | long | gauge
apache_spark.executor.process_tree.python.rss_memory | Resident Set Size for Python. | long | gauge
apache_spark.executor.process_tree.python.v_memory | Virtual memory size for Python in bytes. | long | gauge
apache_spark.executor.records.read | Total number of records read. | long | counter
apache_spark.executor.records.written | Total number of records written. | long | counter
apache_spark.executor.result.serialization_time | Elapsed time spent serializing the task result. The value is expressed in milliseconds. | long | counter
apache_spark.executor.result.size | The number of bytes this task transmitted back to the driver as the TaskResult. | long | counter
apache_spark.executor.run_time | Elapsed time in running this task. | long | counter
apache_spark.executor.shuffle.bytes_written | Number of bytes written in shuffle operations. | long | counter
apache_spark.executor.shuffle.client.used.direct_memory | Amount of direct memory used by the shuffle client. | long | gauge
apache_spark.executor.shuffle.client.used.heap_memory | Amount of heap memory used by the shuffle client. | long | gauge
apache_spark.executor.shuffle.fetch_wait_time | Time the task spent waiting for remote shuffle blocks. | long | counter
apache_spark.executor.shuffle.local.blocks_fetched | Number of local (as opposed to read from a remote executor) blocks fetched in shuffle operations. | long | counter
apache_spark.executor.shuffle.local.bytes_read | Number of bytes read in shuffle operations from local disk (as opposed to read from a remote executor). | long | counter
apache_spark.executor.shuffle.records.read | Number of records read in shuffle operations. | long | counter
apache_spark.executor.shuffle.records.written | Number of records written in shuffle operations. | long | counter
apache_spark.executor.shuffle.remote.blocks_fetched | Number of remote blocks fetched in shuffle operations. | long | counter
apache_spark.executor.shuffle.remote.bytes_read | Number of remote bytes read in shuffle operations. | long | counter
apache_spark.executor.shuffle.remote.bytes_read_to_disk | Number of remote bytes read to disk in shuffle operations. Large blocks are fetched to disk in shuffle read operations, as opposed to being read into memory, which is the default behavior. | long | counter
apache_spark.executor.shuffle.server.used.direct_memory | Amount of direct memory used by the shuffle server. | long | gauge
apache_spark.executor.shuffle.server.used.heap_memory | Amount of heap memory used by the shuffle server. | long | counter
apache_spark.executor.shuffle.total.bytes_read | Number of bytes read in shuffle operations (both local and remote). | long | counter
apache_spark.executor.shuffle.write.time | Time spent blocking on writes to disk or buffer cache. The value is expressed in nanoseconds. | long | counter
apache_spark.executor.succeeded_tasks | The number of tasks succeeded. | long | counter
apache_spark.executor.threadpool.active_tasks | Number of tasks currently executing. | long | gauge
apache_spark.executor.threadpool.complete_tasks | Number of tasks that have completed in this executor. | long | gauge
apache_spark.executor.threadpool.current_pool_size | The size of the current thread pool of the executor. | long | gauge
apache_spark.executor.threadpool.max_pool_size | The maximum size of the thread pool of the executor. | long | counter
apache_spark.executor.threadpool.started_tasks | The number of tasks started in the thread pool of the executor. | long | counter
cloud.account.id | The cloud account or organization id used to identify different entities in a multi-tenant environment. Examples: AWS account id, Google Cloud ORG Id, or other unique identifier. | keyword
cloud.availability_zone | Availability zone in which this host, resource, or service is located. | keyword
cloud.instance.id | Instance ID of the host machine. | keyword
cloud.provider | Name of the cloud provider. Example values are aws, azure, gcp, or digitalocean. | keyword
cloud.region | Region in which this host, resource, or service is located. | keyword
container.id | Unique container id. | keyword
data_stream.dataset | Data stream dataset. | constant_keyword
data_stream.namespace | Data stream namespace. | constant_keyword
data_stream.type | Data stream type. | constant_keyword
host.name | Name of the host. It can contain what hostname returns on Unix systems, the fully qualified domain name (FQDN), or a name specified by the user. The recommended value is the lowercase FQDN of the host. | keyword
service.address | Address where data about this service was collected from. This should be a URI, network address (ipv4:port or [ipv6]:port) or a resource path (sockets). | keyword

The node data stream collects metrics related to the application count, waiting applications, worker metrics, executor count, core usage, and memory usage.

Example
{    "@timestamp": "2022-04-12T04:42:49.581Z",    "agent": {        "ephemeral_id": "ae57925e-eeca-4bf4-ae20-38f82db1378b",        "id": "f051059f-86be-46d5-896d-ff1b2cdab179",        "name": "docker-fleet-agent",        "type": "metricbeat",        "version": "8.1.0"    },    "apache_spark": {        "node": {            "main": {                "applications": {                    "count": 0,                    "waiting": 0                },                "workers": {                    "alive": 0,                    "count": 0                }            }        }    },    "data_stream": {        "dataset": "apache_spark.node",        "namespace": "ep",        "type": "metrics"    },    "ecs": {        "version": "8.11.0"    },    "elastic_agent": {        "id": "f051059f-86be-46d5-896d-ff1b2cdab179",        "snapshot": false,        "version": "8.1.0"    },    "event": {        "agent_id_status": "verified",        "dataset": "apache_spark.node",        "duration": 8321835,        "ingested": "2022-04-12T04:42:53Z",        "kind": "metric",        "module": "apache_spark",        "type": [            "info"        ]    },    "host": {        "architecture": "x86_64",        "containerized": true,        "hostname": "docker-fleet-agent",        "ip": [            "192.168.32.5"        ],        "mac": [            "02-42-AC-14-00-07"        ],        "name": "docker-fleet-agent",        "os": {            "codename": "focal",            "family": "debian",            "kernel": "5.4.0-107-generic",            "name": "Ubuntu",            "platform": "ubuntu",            "type": "linux",            "version": "20.04.3 LTS (Focal Fossa)"        }    },    "metricset": {        "name": "jmx",        "period": 60000    },    "service": {        "address": "http://apache-spark-main:7777/jolokia/%3FignoreErrors=true&canonicalNaming=false",        "type": "jolokia"    }}

ECS Field Reference

Refer to the following document for detailed information on ECS fields.

Exported fields
Field | Description | Type | Metric Type
@timestamp | Event timestamp. | date
agent.id | Unique identifier of this agent (if one exists). Example: For Beats this would be beat.id. | keyword
apache_spark.node.main.applications.count | Total number of apps. | long | gauge
apache_spark.node.main.applications.waiting | Number of apps waiting. | long | gauge
apache_spark.node.main.workers.alive | Number of alive workers. | long | gauge
apache_spark.node.main.workers.count | Total number of workers. | long | gauge
apache_spark.node.worker.cores.free | Number of cores free. | long | gauge
apache_spark.node.worker.cores.used | Number of cores used. | long | gauge
apache_spark.node.worker.executors | Number of executors. | long | gauge
apache_spark.node.worker.memory.free | Amount of memory free, in MB. | long | gauge
apache_spark.node.worker.memory.used | Amount of memory utilized in MB. | long | gauge
cloud.account.id | The cloud account or organization id used to identify different entities in a multi-tenant environment. Examples: AWS account id, Google Cloud ORG Id, or other unique identifier. | keyword
cloud.availability_zone | Availability zone in which this host, resource, or service is located. | keyword
cloud.instance.id | Instance ID of the host machine. | keyword
cloud.provider | Name of the cloud provider. Example values are aws, azure, gcp, or digitalocean. | keyword
cloud.region | Region in which this host, resource, or service is located. | keyword
container.id | Unique container id. | keyword
data_stream.dataset | Data stream dataset. | constant_keyword
data_stream.namespace | Data stream namespace. | constant_keyword
data_stream.type | Data stream type. | constant_keyword
host.name | Name of the host. It can contain what hostname returns on Unix systems, the fully qualified domain name (FQDN), or a name specified by the user. The recommended value is the lowercase FQDN of the host. | keyword
service.address | Address where data about this service was collected from. This should be a URI, network address (ipv4:port or [ipv6]:port) or a resource path (sockets). | keyword

This integration includes one or more Kibana dashboards that visualize the data collected by the integration. The screenshots below illustrate how the ingested data is displayed.

Apache Spark screenshot
Changelog
Version | Details | Minimum Kibana version
1.5.0 | Enhancement (View pull request): Improve documentation to align with new guidelines. | 9.0.0, 8.13.0
1.4.0 | Enhancement (View pull request): Add support for Kibana 9.0.0. | 9.0.0, 8.13.0
1.3.1 | Bug fix (View pull request): Update links to getting started docs. | 8.13.0
1.3.0 | Enhancement (View pull request): Add processor support for application, driver, executor and node data streams. | 8.13.0
1.2.0 | Enhancement (View pull request): ECS version updated to 8.11.0. Update the kibana constraint to ^8.13.0. Modified the field definitions to remove ECS fields made redundant by the ecs@mappings component template. | 8.13.0
1.1.0 | Enhancement (View pull request): Add global filter on data_stream.dataset to improve performance. | 8.8.0
1.0.3 | Enhancement (View pull request): Update README to follow documentation guidelines. | 8.8.0
1.0.2 | Enhancement (View pull request): Inline "by reference" visualizations. | 8.8.0
1.0.1 | Bug fix (View pull request): Update the link to the correct reindexing procedure. | 8.8.0
1.0.0 | Enhancement (View pull request): Make Apache Spark GA. | 8.8.0
0.8.0 | Enhancement (View pull request): Update the package format_version to 3.0.0. | 8.8.0
0.7.9 | Bug fix (View pull request): Add filters in visualizations. | 8.8.0
0.7.8 | Enhancement (View pull request): Enable time series data streams for the metrics datasets. This dramatically reduces storage for metrics and is expected to progressively improve query performance. For more details, see https://www.elastic.co/guide/en/elasticsearch/reference/current/tsds.html. | 8.8.0
0.7.7 | Enhancement (View pull request): Add metric_type for node data stream. | 8.1.0
0.7.6 | Enhancement (View pull request): Added dimension mapping for Node datastream. | 8.1.0
0.7.5 | Enhancement (View pull request): Add metric_type mappings for executor data stream. | 8.1.0
0.7.4 | Enhancement (View pull request): Added dimension mapping for Executor datastream. | 8.1.0
0.7.3 | Enhancement (View pull request): Add metric_type mapping for driver datastream. | 8.1.0
0.7.2 | Enhancement (View pull request): Added dimension mapping for driver datastream. | 8.1.0
0.7.1 | Enhancement (View pull request): Add metric type for application data stream. | 8.1.0
0.7.0 | Enhancement (View pull request): Added dimension mapping for Application datastream. | 8.1.0
0.6.4 | Bug fix (View pull request): Fix the metric type of input_rate field for driver datastream. | 8.1.0
0.6.3 | Enhancement (View pull request): Update Apache Spark logo. | 8.1.0
0.6.2 | Bug fix (View pull request): Resolve the conflicts in host.ip field. | 8.1.0
0.6.1 | Bug fix (View pull request): Remove incorrect filter from the visualizations. | 8.1.0
0.6.0 | Enhancement (View pull request): Rename ownership from obs-service-integrations to obs-infraobs-integrations. | 8.1.0
0.5.0 | Enhancement (View pull request): Migrate visualizations to lens. | 8.1.0
0.4.1 | Enhancement (View pull request): Added categories and/or subcategories. | 8.1.0
0.4.0 | Enhancement (View pull request): Update ECS version to 8.5.1. | 8.1.0
0.3.0 | Enhancement (View pull request): Update readme. | 8.1.0
0.2.1 | Bug fix (View pull request): Remove unnecessary fields from fields.yml. | 8.1.0
0.2.0 | Enhancement (View pull request): Add dashboards and visualizations. | 8.1.0
0.1.1 | Enhancement (View pull request): Refactor the "nodes" data stream to adjust its name to "node" (singular). |
0.1.0 | Enhancement (View pull request): Implement "executor" data stream; Enhancement (View pull request): Implement "driver" data stream; Enhancement (View pull request): Implement "application" data stream; Enhancement (View pull request): Implement "nodes" data stream. |
