
Spark

Apache Spark is a unified analytics engine for large-scale data processing. It provides high-level APIs in Scala, Java, Python, and R, and an optimized engine that supports general computation graphs for data analysis. It also supports a rich set of higher-level tools including Spark SQL for SQL and DataFrames, pandas API on Spark for pandas workloads, MLlib for machine learning, GraphX for graph processing, and Structured Streaming for stream processing.

Document loaders

PySpark

This loader reads data from a PySpark DataFrame.

See a usage example.

from langchain_community.document_loaders import PySparkDataFrameLoader
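A minimal sketch of how the loader might be used, assuming `pyspark` and `langchain-community` are installed and a local Spark session can start. The sample rows, the column names `Team` and `Wins`, and the helper `load_team_documents` are hypothetical and exist only for illustration.

```python
# Hypothetical sample data: (team name, wins) pairs used only for illustration.
TEAM_ROWS = [("Nationals", 98), ("Reds", 97)]
TEAM_COLUMNS = ["Team", "Wins"]


def load_team_documents(rows=TEAM_ROWS, columns=TEAM_COLUMNS, content_column="Team"):
    """Build a small Spark DataFrame and load it as LangChain documents."""
    # Imports are deferred so this module can be imported without Spark installed.
    from pyspark.sql import SparkSession
    from langchain_community.document_loaders import PySparkDataFrameLoader

    spark = SparkSession.builder.getOrCreate()
    df = spark.createDataFrame(rows, columns)

    # The column named by page_content_column supplies each document's text;
    # the remaining columns are expected to land in the document's metadata.
    loader = PySparkDataFrameLoader(spark, df, page_content_column=content_column)
    return loader.load()
```

Calling `load_team_documents()` should return one document per DataFrame row, with the `Team` value as the page content. In a real project the DataFrame would more likely come from `spark.read` than from in-memory tuples.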

Tools/Toolkits

Spark SQL toolkit

Toolkit for interacting with Spark SQL.

See a usage example.

from langchain_community.agent_toolkits import SparkSQLToolkit, create_spark_sql_agent
from langchain_community.utilities.spark_sql import SparkSQL
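These imports could be wired together roughly as follows. This is a sketch under stated assumptions, not the definitive setup: the helper `build_spark_sql_agent` is a hypothetical name, `llm` is assumed to be any LangChain model you have already configured, and `schema` is assumed to name an existing Spark database.

```python
def build_spark_sql_agent(llm, schema=None, verbose=True):
    """Create an agent that can answer questions over Spark SQL tables."""
    # Deferred imports keep this module importable without Spark installed.
    from langchain_community.agent_toolkits import (
        SparkSQLToolkit,
        create_spark_sql_agent,
    )
    from langchain_community.utilities.spark_sql import SparkSQL

    # SparkSQL wraps a Spark session; when none is passed it is assumed to
    # obtain one itself. `schema` selects the database to query.
    spark_sql = SparkSQL(schema=schema)

    # The toolkit bundles the Spark SQL tools and hands them to the agent.
    toolkit = SparkSQLToolkit(db=spark_sql, llm=llm)
    return create_spark_sql_agent(llm=llm, toolkit=toolkit, verbose=verbose)
```

With a configured model, something like `build_spark_sql_agent(llm, schema="langchain_example").invoke({"input": "Describe the titanic table"})` would be one way to ask a question. The schema and table names here are placeholders.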

Spark SQL individual tools

You can use individual tools from the Spark SQL Toolkit:

  • InfoSparkSQLTool: gets metadata (schema and sample rows) for Spark SQL tables
  • ListSparkSQLTool: lists the available table names
  • QueryCheckerTool: uses an LLM to check whether a query is correct
  • QuerySparkSQLTool: runs a query against Spark SQL

from langchain_community.tools.spark_sql.tool import InfoSparkSQLTool
from langchain_community.tools.spark_sql.tool import ListSparkSQLTool
from langchain_community.tools.spark_sql.tool import QueryCheckerTool
from langchain_community.tools.spark_sql.tool import QuerySparkSQLTool
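The tools can also be called directly, without an agent. The sketch below assumes `spark_sql` is an existing `SparkSQL` wrapper and that the named table already exists. `build_sample_query` and `inspect_table` are hypothetical helpers, and the identifier check is an illustrative precaution rather than part of the toolkit. `QueryCheckerTool` is left out because it additionally needs an LLM.

```python
import re


def build_sample_query(table_name, limit=5):
    """Return a SELECT statement for a few rows, rejecting unsafe identifiers."""
    # Allow only plain or schema-qualified identifiers so that user input
    # cannot smuggle extra SQL into the query string.
    if not re.fullmatch(r"[A-Za-z_]\w*(\.[A-Za-z_]\w*)?", table_name):
        raise ValueError(f"Unsafe table name: {table_name!r}")
    return f"SELECT * FROM {table_name} LIMIT {int(limit)}"


def inspect_table(spark_sql, table_name, limit=5):
    """Run the list, info, and query tools directly against Spark SQL."""
    # Deferred import keeps this module importable without the library installed.
    from langchain_community.tools.spark_sql.tool import (
        InfoSparkSQLTool,
        ListSparkSQLTool,
        QuerySparkSQLTool,
    )

    return {
        # The list tool takes an empty string as its input.
        "tables": ListSparkSQLTool(db=spark_sql).run(""),
        # The info tool takes one table name or a comma-separated list.
        "schema": InfoSparkSQLTool(db=spark_sql).run(table_name),
        # The query tool takes a SQL string.
        "rows": QuerySparkSQLTool(db=spark_sql).run(
            build_sample_query(table_name, limit)
        ),
    }
```

Each tool returns a string, so the resulting dictionary is convenient to print or log while debugging an agent's behavior.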
