
Language agents help large language models 'think' better and cheaper

Date:
September 24, 2024
Source:
Washington University in St. Louis
Summary:
Researchers have devised an agent to help large language models 'think.'

The large language models that have increasingly taken over the tech world are anything but "cheap." The most prominent LLMs, GPT-4 for instance, cost some $100 million to build, counting the legal costs of accessing training data, the computational power needed to train billions or trillions of parameters, the energy and water that fuel that computation, and the many coders who develop the training algorithms that must run cycle after cycle so the machine will "learn."

But what options does a researcher have for a specialized task that a machine could do more efficiently, without access to a large institution like Washington University in St. Louis that offers generative AI tools? Say a parent wants to prep their child for a difficult test and needs to show many examples of how to solve complicated math problems.

Building their own LLM is an onerous prospect given the costs mentioned above, and direct use of big models like GPT-4 and Llama 3.1 may not be immediately suited to the complex logical and mathematical reasoning their task requires.

It would help if there were a more cost-effective LLM thinker available to the masses, a generic brand of generative AI.

Researchers at WashU decided to tackle this challenge by building an autonomous agent to instruct the reasoning process of large language models. The agent generates a single set of instructions for each task, and those instructions turn out to be extremely effective at improving the reasoning of different LLMs across all instances of that task, according to research from the lab of Chenguang Wang, assistant professor of computer science and engineering, in collaboration with Dawn Song, a professor at the University of California, Berkeley.

The research team also included WashU PhD students Nicholas Crispino and Kyle Montgomery and research analyst Fankun Zeng, who presented their work at a recent machine learning conference.

This "agent" is itself a large LLM that serves as a tool to think over instructions from the web, Crispino said. Given basic task information, such as the dataset name and a few input-only examples, the agent produces high-quality, step-by-step instructions for the task.

Those instructions then guide the reasoning of smaller LLMs on the task. It's a more affordable way to do generative AI because the large LLM has to be used only once per dataset; the instructions are then handed over to a smaller LLM, which takes over.
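To make that two-stage division of labor concrete, here is a minimal Python sketch of the pattern described above. It is an illustration under stated assumptions, not the team's implementation: the completion callables stand in for whatever chat-completion API serves the large and small models, and the prompt wording is invented for the example.

from typing import Callable

# A completion function: send a prompt to a model, get text back.
# Wire these to real API clients; this type is all the sketch assumes.
Completion = Callable[[str], str]

def build_task_instructions(large_model: Completion,
                            dataset_name: str,
                            example_inputs: list[str]) -> str:
    """Stage 1 (run once per dataset): the expensive model turns basic
    task information -- dataset name plus a few input-only examples,
    no answers -- into general step-by-step instructions."""
    examples = "\n".join(f"- {x}" for x in example_inputs)
    prompt = (
        f"Task/dataset: {dataset_name}\n"
        f"Example inputs (answers not given):\n{examples}\n\n"
        "Write clear step-by-step instructions for reasoning through "
        "and solving any instance of this task."
    )
    return large_model(prompt)

def solve_instance(small_model: Completion,
                   instructions: str,
                   task_input: str) -> str:
    """Stage 2 (run per instance): the cheaper model answers each
    input, guided by the cached instructions from Stage 1."""
    prompt = (
        f"Instructions:\n{instructions}\n\n"
        f"Input:\n{task_input}\n\n"
        "Follow the instructions step by step, then state the final answer."
    )
    return small_model(prompt)

The cost structure follows directly: the large model is called once per dataset, so its price is amortized over every instance the smaller model subsequently handles.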

"We can use the expensive model once and make these nice instructions to guide the reasoning or thinking process of a cheaper model," Crispino said.

"Our method boosts the performance of state-of-the-art large language models by a large margin," Montgomery added.

They tested their cost-effective method, called Zero-Shot AgentInstruct, on language processing tasks and compared its performance to zero-shot prompting methods using the LLMs Vicuna-13b, Llama-2-70b-chat, and GPT-3.5 Turbo.

Compared to "zero-shot chain of thought" prompting, which works by appending the phrase "let's think step by step" to the input, Zero-Shot AgentInstruct showed better performance across a variety of tasks evaluated on 29 datasets (including 53 subsets).
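For intuition, the two prompting styles can be contrasted directly. Only the "let's think step by step" trigger phrase below comes from the zero-shot chain-of-thought baseline the article names; the task-specific instruction text is a made-up illustration of what the agent might generate.

question = "A train travels 180 miles in 2.5 hours. What is its average speed?"

# Baseline: zero-shot chain of thought appends one fixed, task-agnostic
# trigger phrase to every input.
zero_shot_cot_prompt = f"{question}\nLet's think step by step."

# Zero-Shot AgentInstruct style: the same input is instead prefixed with
# task-specific instructions generated once for the whole dataset
# (illustrative text, not taken from the paper).
task_instructions = (
    "1. Identify the given quantities and the quantity asked for.\n"
    "2. Recall the relevant relation (speed = distance / time).\n"
    "3. Substitute the numbers and compute carefully.\n"
    "4. State the final answer with units."
)
agent_instructed_prompt = f"{task_instructions}\n\nQuestion: {question}"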

"Our improvement in thinking and reasoning is striking, particularly in math and logic," Wang said.

Essentially, they are using the powerful LLMs to distill tasks into step-by-step reasoning paths for the other model, like an experienced teacher sharing their knowledge with students.

"We're seeing how far we can push the reasoning capabilities of smaller models using larger models without training," Crispino said.


Story Source:

Materials provided by Washington University in St. Louis. Original written by Leah Shaffer. Note: Content may be edited for style and length.


