lakeFS
lakeFS provides scalable version control over the data lake, and uses Git-like semantics to create and access those versions.
This notebook covers how to load document objects from a lakeFS
path (whether it's an object or a prefix).
Initializing the lakeFS loader
Replace the ENDPOINT, LAKEFS_ACCESS_KEY, and LAKEFS_SECRET_KEY values with your own.
from langchain_community.document_loaders import LakeFSLoader
API Reference: LakeFSLoader
ENDPOINT=""
LAKEFS_ACCESS_KEY=""
LAKEFS_SECRET_KEY=""
lakefs_loader = LakeFSLoader(
    lakefs_access_key=LAKEFS_ACCESS_KEY,
    lakefs_secret_key=LAKEFS_SECRET_KEY,
    lakefs_endpoint=ENDPOINT,
)
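If you prefer not to hard-code credentials in the notebook, one option is to read them from environment variables. The sketch below is only an illustration: the helper name lakefs_settings_from_env and the environment variable names are assumptions made for this example and are not part of LangChain or lakeFS. Only the three keyword argument names come from the loader call above.

```python
import os


def lakefs_settings_from_env(env=None):
    """Collect the three loader arguments from environment variables.

    The variable names LAKEFS_ENDPOINT, LAKEFS_ACCESS_KEY and
    LAKEFS_SECRET_KEY are assumptions made for this sketch.
    """
    env = os.environ if env is None else env
    # Map each LakeFSLoader keyword argument to an (assumed) variable name.
    names = {
        "lakefs_endpoint": "LAKEFS_ENDPOINT",
        "lakefs_access_key": "LAKEFS_ACCESS_KEY",
        "lakefs_secret_key": "LAKEFS_SECRET_KEY",
    }
    # Fail early with a readable message if anything is unset or empty.
    missing = [var for var in names.values() if not env.get(var)]
    if missing:
        raise ValueError("Missing lakeFS settings: " + ", ".join(missing))
    return {kwarg: env[var] for kwarg, var in names.items()}


# Usage (needs langchain_community and a reachable lakeFS server):
# from langchain_community.document_loaders import LakeFSLoader
# lakefs_loader = LakeFSLoader(**lakefs_settings_from_env())
```

Failing early on a missing value tends to give a clearer error than passing empty strings to the loader.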
Specifying a path
You can specify a prefix or a complete object path to control which files to load.
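To make the difference between a prefix and a complete object path concrete, here is a small sketch. It assumes the path is matched as a plain string prefix against object paths, which is how listing by prefix generally works in object stores. It is not the loader's actual implementation, and both the function select_objects and the object paths are hypothetical.

```python
def select_objects(object_paths, path):
    """Return the object paths that start with `path`."""
    return [p for p in object_paths if p.startswith(path)]


# Hypothetical object paths in a repository, for illustration only.
objects = ["docs/a.txt", "docs/b.txt", "images/logo.png"]

by_prefix = select_objects(objects, "docs/")       # every object under docs/
by_object = select_objects(objects, "docs/a.txt")  # one specific object
print(by_prefix)
print(by_object)
```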
Specify the repository, reference (branch, commit id, or tag), and path in the corresponding REPO, REF, and PATH variables to load the documents from:
REPO=""
REF=""
PATH=""
lakefs_loader.set_repo(REPO)
lakefs_loader.set_ref(REF)
lakefs_loader.set_path(PATH)
docs = lakefs_loader.load()
docs
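The returned list can be long, so it may help to look at a short preview of each item rather than the full output. The sketch below relies only on the page_content and metadata attributes that LangChain documents carry. FakeDocument is a stand-in so the example runs without a lakeFS server, summarize is a hypothetical helper, and the "path" metadata key is an assumption used for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class FakeDocument:
    """Stand-in exposing the same two attributes as a LangChain Document."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def summarize(docs, preview_chars=40):
    """Return (metadata, content preview) pairs for a list of documents."""
    return [(doc.metadata, doc.page_content[:preview_chars]) for doc in docs]


# Sample data; with a real loader, pass the list returned by load() instead.
sample = [FakeDocument("hello lakeFS", {"path": "docs/a.txt"})]
for metadata, preview in summarize(sample):
    print(metadata, preview)
```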
Related
- Document loader conceptual guide
- Document loader how-to guides