Google Bigtable

Bigtable is a key-value and wide-column store, ideal for fast access to structured, semi-structured, or unstructured data. Extend your database application to build AI-powered experiences leveraging Bigtable's LangChain integrations.

This notebook goes over how to use Bigtable to save, load and delete LangChain documents with BigtableLoader and BigtableSaver.

Learn more about the package on GitHub.


Before You Begin

To run this notebook, you will need to do the following:

  • Create a Google Cloud Project
  • Create a Bigtable instance
  • Create a Bigtable table
  • Create Bigtable access credentials

After confirming access to the database in this notebook's runtime environment, fill in the following values and run the cell before running the example scripts.

# @markdown Please specify an instance and a table for demo purpose.
INSTANCE_ID = "my_instance"  # @param {type:"string"}
TABLE_ID = "my_table"  # @param {type:"string"}

🦜🔗 Library Installation

The integration lives in its own langchain-google-bigtable package, so we need to install it.

%pip install --upgrade --quiet langchain-google-bigtable

Colab only: Uncomment the following cell to restart the kernel or use the button to restart the kernel. For Vertex AI Workbench you can restart the terminal using the button on top.

# # Automatically restart kernel after installs so that your environment can access the new packages
# import IPython

# app = IPython.Application.instance()
# app.kernel.do_shutdown(True)

☁ Set Your Google Cloud Project

Set your Google Cloud project so that you can leverage Google Cloud resources within this notebook.

If you don't know your project ID, try the following:

  • Run gcloud config list.
  • Run gcloud projects list.
  • See the support page: Locate the project ID.

# @markdown Please fill in the value below with your Google Cloud project ID and then run the cell.

PROJECT_ID = "my-project-id"  # @param {type:"string"}

# Set the project id
!gcloud config set project {PROJECT_ID}

🔐 Authentication

Authenticate to Google Cloud as the IAM user logged into this notebook in order to access your Google Cloud Project.

  • If you are using Colab to run this notebook, use the cell below and continue.
  • If you are using Vertex AI Workbench, check out the setup instructions here.
from google.colab import auth

auth.authenticate_user()

Basic Usage

Using the saver

Save LangChain documents with BigtableSaver.add_documents(<documents>). To initialize the BigtableSaver class you need to provide 2 things:

  1. instance_id - An instance of Bigtable.
  2. table_id - The name of the table within the Bigtable instance to store LangChain documents.
from langchain_core.documents import Document
from langchain_google_bigtable import BigtableSaver

test_docs = [
    Document(
        page_content="Apple Granny Smith 150 0.99 1",
        metadata={"fruit_id": 1},
    ),
    Document(
        page_content="Banana Cavendish 200 0.59 0",
        metadata={"fruit_id": 2},
    ),
    Document(
        page_content="Orange Navel 80 1.29 1",
        metadata={"fruit_id": 3},
    ),
]

saver = BigtableSaver(
    instance_id=INSTANCE_ID,
    table_id=TABLE_ID,
)

saver.add_documents(test_docs)
API Reference: Document

Querying for Documents from Bigtable

For more details on connecting to a Bigtable table, please check the Python SDK documentation.
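
As an aside, here is a minimal sketch of reading a row directly with the raw google-cloud-bigtable SDK. This bypasses the LangChain integration, and the row key below is hypothetical:

from google.cloud import bigtable

# Connect with the raw SDK, reusing the notebook's project/instance/table values
client = bigtable.Client(project=PROJECT_ID, admin=True)
instance = client.instance(INSTANCE_ID)
table = instance.table(TABLE_ID)

row = table.read_row(b"some-row-key")  # hypothetical key; returns None if absent
if row is not None:
    print(row.row_key)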

Load documents from table

Load LangChain documents with BigtableLoader.load() or BigtableLoader.lazy_load(). lazy_load returns a generator that only queries the database during iteration. To initialize the BigtableLoader class you need to provide:

  1. instance_id - An instance of Bigtable.
  2. table_id - The name of the table within the Bigtable instance to load LangChain documents from.
from langchain_google_bigtable import BigtableLoader

loader = BigtableLoader(
    instance_id=INSTANCE_ID,
    table_id=TABLE_ID,
)

for doc in loader.lazy_load():
    print(doc)
    break

Delete documents

Delete a list of LangChain documents from a Bigtable table with BigtableSaver.delete(<documents>).

from langchain_google_bigtable import BigtableSaver

docs = loader.load()
print("Documents before delete: ", docs)

onedoc = test_docs[0]
saver.delete([onedoc])
print("Documents after delete: ", loader.load())

Advanced Usage

Limiting the returned rows

There are two ways to limit the returned rows:

  1. Using a filter
  2. Using a row_set
import google.cloud.bigtable.row_filters as row_filters

filter_loader = BigtableLoader(
    INSTANCE_ID, TABLE_ID, filter=row_filters.ColumnQualifierRegexFilter(b"os_build")
)


from google.cloud.bigtable.row_set import RowSet

row_set = RowSet()
row_set.add_row_range_from_keys(
    start_key="phone#4c410523#20190501", end_key="phone#4c410523#201906201"
)

row_set_loader = BigtableLoader(
    INSTANCE_ID,
    TABLE_ID,
    row_set=row_set,
)
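
Either loader is then consumed like any other; a minimal usage sketch with the row_set loader defined above:

for doc in row_set_loader.lazy_load():
    print(doc)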

Custom client

By default, the loader creates a standard client with only the admin=True option set. A non-default, custom client can be passed to the constructor instead.

from google.cloud import bigtable

custom_client_loader = BigtableLoader(
    INSTANCE_ID,
    TABLE_ID,
    client=bigtable.Client(...),
)

Custom content

The BigtableLoader assumes there is a column family called langchain that has a column called content, containing values encoded in UTF-8. These defaults can be changed like so:

from langchain_google_bigtable import Encoding

custom_content_loader = BigtableLoader(
    INSTANCE_ID,
    TABLE_ID,
    content_encoding=Encoding.ASCII,
    content_column_family="my_content_family",
    content_column_name="my_content_column_name",
)

Metadata mapping

By default, the metadata map on the Document object will contain a single key, rowkey, whose value is the row's rowkey. To add more items to that map, use metadata_mappings.

import json

from langchain_google_bigtable import MetadataMapping

metadata_mapping_loader = BigtableLoader(
    INSTANCE_ID,
    TABLE_ID,
    metadata_mappings=[
        MetadataMapping(
            column_family="my_int_family",
            column_name="my_int_column",
            metadata_key="key_in_metadata_map",
            encoding=Encoding.INT_BIG_ENDIAN,
        ),
        MetadataMapping(
            column_family="my_custom_family",
            column_name="my_custom_column",
            metadata_key="custom_key",
            encoding=Encoding.CUSTOM,
            custom_decoding_func=lambda input: json.loads(input.decode()),
            custom_encoding_func=lambda input: str.encode(json.dumps(input)),
        ),
    ],
)
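
With the mappings above, each loaded document's metadata carries the mapped keys alongside rowkey. An illustrative check, assuming the table actually has rows with these column families populated:

doc = next(metadata_mapping_loader.lazy_load())
print(doc.metadata["rowkey"])
print(doc.metadata["key_in_metadata_map"])  # decoded as a big-endian int
print(doc.metadata["custom_key"])  # decoded via the custom JSON function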

Metadata as JSON

If there is a column in Bigtable that contains a JSON string that you would like to have added to the output document metadata, it is possible to add the following parameters to BigtableLoader. Note, the default value for metadata_as_json_encoding is UTF-8.

metadata_as_json_loader = BigtableLoader(
    INSTANCE_ID,
    TABLE_ID,
    metadata_as_json_encoding=Encoding.ASCII,
    metadata_as_json_family="my_metadata_as_json_family",
    metadata_as_json_name="my_metadata_as_json_column_name",
)

Customize BigtableSaver

The BigtableSaver is customizable in the same way as the BigtableLoader.

saver = BigtableSaver(
    INSTANCE_ID,
    TABLE_ID,
    client=bigtable.Client(...),
    content_encoding=Encoding.ASCII,
    content_column_family="my_content_family",
    content_column_name="my_content_column_name",
    metadata_mappings=[
        MetadataMapping(
            column_family="my_int_family",
            column_name="my_int_column",
            metadata_key="key_in_metadata_map",
            encoding=Encoding.INT_BIG_ENDIAN,
        ),
        MetadataMapping(
            column_family="my_custom_family",
            column_name="my_custom_column",
            metadata_key="custom_key",
            encoding=Encoding.CUSTOM,
            custom_decoding_func=lambda input: json.loads(input.decode()),
            custom_encoding_func=lambda input: str.encode(json.dumps(input)),
        ),
    ],
    metadata_as_json_encoding=Encoding.ASCII,
    metadata_as_json_family="my_metadata_as_json_family",
    metadata_as_json_name="my_metadata_as_json_column_name",
)
