InternLM/agentlego


Enhance LLM agents with rich tool APIs

License

Apache-2.0 license

Introduction

AgentLego is an open-source library of versatile tool APIs for extending and enhancing agents based on large language models (LLMs). Its key features are:

Rich set of tools for multimodal extensions of LLM agents including visual perception, image generation and editing, speech processing and visual-language reasoning, etc.
Flexible tool interface that allows users to easily extend custom tools with arbitrary types of arguments and outputs.
Easy integration with LLM-based agent frameworks such as LangChain, Transformers Agents, and Lagent.
Support for tool serving and remote access, which is especially useful for tools with heavy ML models (e.g. ViT) or special environment requirements (e.g. GPU and CUDA).

Quick Starts

Installation

Install the AgentLego package

pip install agentlego

Install tool-specific dependencies

Some tools require extra packages. Check the readme file of the tool and confirm that all requirements are satisfied.

For example, to use the ImageDescription tool, check the Set up section of its readme and install the requirements.

pip install -U openmim
mim install -U mmpretrain
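One way to confirm the requirements before loading a tool is to check that the needed modules can be imported. The following is a minimal sketch that uses only the standard library. The helper name `missing_modules` is hypothetical, and `mim` and `mmpretrain` are assumed to be the import names of the packages installed above.

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the module names from `names` that cannot be found."""
    return [name for name in names if find_spec(name) is None]


# Assumed import names for the openmim and mmpretrain packages.
missing = missing_modules(["mim", "mmpretrain"])
if missing:
    print("Install these before loading ImageDescription:", ", ".join(missing))
else:
    print("All ImageDescription requirements appear to be installed.")
```

This check only reports whether a module can be located; it does not verify versions, so the tool's readme remains the reference for exact requirements.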

Use tools directly

from agentlego import list_tools, load_tool
print(list_tools())  # list tools in AgentLego
image_caption_tool = load_tool('ImageDescription', device='cuda')
print(image_caption_tool.description)
image = './examples/demo.png'
caption = image_caption_tool(image)
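Because loading a tool with a heavy model can be slow, it may help to load the tool once and reuse it for several inputs. The sketch below builds only on the `load_tool` call and the tool call shown above. The function name `caption_images` and its `load` parameter are hypothetical; `load` exists so the function can be exercised without AgentLego installed. What the tool returns for each image is assumed to be whatever `image_caption_tool(image)` returns.

```python
def caption_images(image_paths, device="cuda", load=None):
    """Caption several images with a single ImageDescription tool instance.

    `load` is a hypothetical injection point for testing; by default it is
    AgentLego's `load_tool`, imported lazily so this module can be imported
    even when AgentLego is not installed.
    """
    if load is None:
        from agentlego import load_tool as load
    # Load the tool once, then reuse it for every image.
    tool = load("ImageDescription", device=device)
    return {path: tool(path) for path in image_paths}
```

For instance, `caption_images(['./examples/demo.png'])` would return a dictionary that maps the image path to its caption.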

Integrated into agent frameworks

Supported Tools

General ability

Calculator: Calculate by Python interpreter.
GoogleSearch: Search on Google.
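
To illustrate what "calculate by Python interpreter" can mean for an agent tool, here is a minimal sketch of an arithmetic evaluator that parses an expression with Python's `ast` module and only permits numeric operations. This is not AgentLego's Calculator implementation; the function name `evaluate` and the set of allowed operators are assumptions made for the example.

```python
import ast
import operator

# Operators this sketch allows; anything else is rejected.
_BINARY = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Mod: operator.mod,
    ast.Pow: operator.pow,
}
_UNARY = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def evaluate(expression):
    """Evaluate an arithmetic expression without calling eval()."""

    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if (isinstance(node, ast.Constant)
                and isinstance(node.value, (int, float))
                and not isinstance(node.value, bool)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
            return _BINARY[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY:
            return _UNARY[type(node.op)](walk(node.operand))
        raise ValueError("unsupported expression")

    return walk(ast.parse(expression, mode="eval"))
```

Walking the syntax tree instead of calling `eval()` means that input such as a function call is rejected with a `ValueError` rather than executed, which matters when the expression comes from an LLM.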

Speech related

TextToSpeech: Convert the input text into spoken audio.
SpeechToText: Transcribe audio into text.

Image-processing related

ImageDescription: Describe the input image.
OCR: Recognize the text from a photo.
VQA: Answer the question according to the image.
HumanBodyPose: Estimate the pose or keypoints of humans in an image.
HumanFaceLandmark: Estimate the landmarks or keypoints of human faces in an image.
ImageToCanny: Extract the edge image from an image.
ImageToDepth: Generate the depth image of an image.
ImageToScribble: Generate a sketch scribble of an image.
ObjectDetection: Detect all objects in the image.
TextToBbox: Detect specific objects described by the given text in the image.
Segment Anything series
- SegmentAnything: Segment all items in the image.
- SegmentObject: Segment specific objects in the image according to the given object name.

AIGC related

TextToImage: Generate an image from the input text.
ImageExpansion: Expand the peripheral area of an image based on its content.
ObjectRemove: Remove specific objects from the image.
ObjectReplace: Replace specific objects in the image.
ImageStylization: Modify an image according to the instructions.
ControlNet series
- CannyTextToImage: Generate an image from a canny edge image and a description.
- DepthTextToImage: Generate an image from a depth image and a description.
- PoseToImage: Generate an image from a human pose image and a description.
- ScribbleTextToImage: Generate an image from a sketch scribble image and a description.
ImageBind series
- AudioToImage: Generate an image according to audio.
- ThermalToImage: Generate an image according to a thermal image.
- AudioImageToImage: Generate an image according to an audio clip and an image.
- AudioTextToImage: Generate an image from an audio clip and a text prompt.

Licence

This project is released under the Apache 2.0 license. Users should also ensure compliance with the licenses governing the models used in this project.
