PaLM

From Wikipedia, the free encyclopedia
Large language model developed by Google
Developer: Google AI
Predecessor: LaMDA
Successor: Google Gemini
Available in: English
Type: Large language model
Website: ai.google/discover/palm2/

PaLM (Pathways Language Model) is a 540-billion-parameter dense, decoder-only, transformer-based large language model (LLM) developed by Google AI.[1] Researchers also trained smaller versions of PaLM (with 8 and 62 billion parameters) to test the effects of model scale.[2]

Model

PaLM is capable of a wide range of tasks, including commonsense reasoning, arithmetic reasoning, joke explanation, code generation, and translation.[2][3][4][5] When combined with chain-of-thought prompting, PaLM achieved significantly better performance on datasets requiring multi-step reasoning, such as word problems and logic-based questions.[1][2]
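Chain-of-thought prompting can be illustrated with a minimal sketch: the prompt contains worked exemplars whose answers spell out intermediate steps, followed by the new question left open. The `Exemplar` class, the `build_cot_prompt` helper, and the sample questions are hypothetical names and data written for this illustration; they are not taken from the PaLM paper or from any Google API.

```python
from dataclasses import dataclass


@dataclass
class Exemplar:
    question: str
    reasoning: str  # worked intermediate steps shown to the model
    answer: str


def build_cot_prompt(exemplars: list[Exemplar], question: str) -> str:
    """Assemble a few-shot prompt whose exemplars spell out their reasoning."""
    blocks = [
        f"Q: {ex.question}\nA: {ex.reasoning} The answer is {ex.answer}."
        for ex in exemplars
    ]
    # The final question is left open so the model continues with its own steps.
    blocks.append(f"Q: {question}\nA:")
    return "\n\n".join(blocks)


# Hypothetical exemplar, written for illustration only.
EXEMPLARS = [
    Exemplar(
        question="A shelf holds 3 boxes with 4 books each. How many books are there?",
        reasoning="There are 3 boxes and each has 4 books, so 3 * 4 = 12.",
        answer="12",
    )
]

prompt = build_cot_prompt(
    EXEMPLARS,
    "A crate holds 5 bags with 6 apples each. How many apples are there?",
)
print(prompt)
```

The resulting string would be sent to a language model as ordinary input; the only difference from a standard few-shot prompt is that each exemplar answer includes its reasoning before the final result.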

The model was first announced in April 2022 and remained private until March 2023, when Google launched an API for PaLM and several other technologies.[6] The API was initially available to a limited number of developers who joined a waitlist before it was released to the public.[7]

Google and DeepMind developed a version of PaLM 540B (with 540 billion parameters) called Med-PaLM, which is fine-tuned on medical data and outperforms previous models on medical question-answering benchmarks.[8][9] Med-PaLM was the first model to obtain a passing score on U.S. medical licensing questions. In addition to answering both multiple-choice and open-ended questions accurately, it provides reasoning and is able to evaluate its own responses.[10]

Google also extended PaLM using a vision transformer to create PaLM-E, a vision-language model that can be used for robotic manipulation without the need for retraining or fine-tuning.[11][12][13]

In May 2023, Google announced PaLM 2 at the annual Google I/O keynote.[14] PaLM 2 is reported to be a 340-billion-parameter model trained on 3.6 trillion tokens.[15]
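The reported figures can be put side by side with a short calculation. The sketch below uses only numbers stated in this article (540 billion parameters and 780 billion training tokens for PaLM; 340 billion parameters and 3.6 trillion tokens reported for PaLM 2). The constant and function names are chosen for illustration, and the PaLM 2 values are press reports rather than confirmed figures.

```python
PALM_PARAMS = 540e9    # parameters, PaLM 540B
PALM_TOKENS = 780e9    # training tokens, PaLM
PALM2_PARAMS = 340e9   # parameters, PaLM 2 (reported)
PALM2_TOKENS = 3.6e12  # training tokens, PaLM 2 (reported)


def tokens_per_parameter(tokens: float, params: float) -> float:
    """Training tokens seen per model parameter."""
    return tokens / params


data_ratio = PALM2_TOKENS / PALM_TOKENS
palm_tpp = tokens_per_parameter(PALM_TOKENS, PALM_PARAMS)
palm2_tpp = tokens_per_parameter(PALM2_TOKENS, PALM2_PARAMS)

print(f"PaLM 2 / PaLM training tokens: {data_ratio:.2f}x")  # 4.62x
print(f"PaLM tokens per parameter:     {palm_tpp:.2f}")     # 1.44
print(f"PaLM 2 tokens per parameter:   {palm2_tpp:.2f}")    # 10.59
```

On these figures, PaLM 2 would have been trained on roughly 4.6 times as many tokens as PaLM while having fewer parameters, which is consistent with the "nearly five times" description in the cited CNBC report.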

In June 2023, Google announced AudioPaLM for speech-to-speech translation, which uses the PaLM 2 architecture and initialization.[16]

Training

PaLM is pre-trained on a high-quality corpus of 780 billion tokens that covers a variety of natural language tasks and use cases. This dataset includes filtered webpages, books, Wikipedia articles, news articles, source code obtained from open source repositories on GitHub, and social media conversations.[1][2] It is based on the dataset used to train Google's LaMDA model.[2] Social media conversations make up 50% of the corpus, which aids the model in its conversational capabilities.[2]
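As a rough illustration of what the stated proportion means in token counts, the sketch below splits the 780-billion-token corpus by share. Only the 50% social media figure comes from the text above; the other sources are lumped into a single hypothetical `all_other_sources` bucket rather than guessed individually, and `token_budget` is an illustrative helper name.

```python
TOTAL_TOKENS = 780_000_000_000  # corpus size stated in the text

# Only the social media share is given in the text; the remaining sources
# are lumped together here rather than guessed individually.
SHARES_PERCENT = {
    "social_media_conversations": 50,
    "all_other_sources": 50,
}


def token_budget(total_tokens: int, shares_percent: dict[str, int]) -> dict[str, int]:
    """Convert percentage shares of a corpus into absolute token counts."""
    if sum(shares_percent.values()) != 100:
        raise ValueError("shares must sum to 100 percent")
    return {name: total_tokens * pct // 100 for name, pct in shares_percent.items()}


budget = token_budget(TOTAL_TOKENS, SHARES_PERCENT)
for name, tokens in budget.items():
    print(f"{name}: {tokens / 1e9:.0f}B tokens")
```

Under these assumptions, social media conversations account for about 390 billion of the 780 billion tokens.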

PaLM 540B was trained over two TPU v4 Pods, with 3,072 TPU v4 chips in each Pod attached to 768 hosts, connected using a combination of model and data parallelism; this was the largest TPU configuration described at the time.[2][17] This allowed for efficient training at scale, using 6,144 chips, and marked a record for the highest training efficiency achieved for LLMs at this scale: a hardware FLOPs utilization of 57.8%.[3]
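The hardware figures above can be checked with simple arithmetic, and the utilization metric can be sketched as achieved throughput divided by theoretical peak. In the sketch below, the pod, chip, and host counts come from the text; the per-chip peak and the achieved throughput are hypothetical placeholder values chosen so that the ratio reproduces the reported 57.8%, and `flops_utilization` is an illustrative helper rather than a function from any library.

```python
PODS = 2
CHIPS_PER_POD = 3072
HOSTS_PER_POD = 768

total_chips = PODS * CHIPS_PER_POD                # 6144
chips_per_host = CHIPS_PER_POD // HOSTS_PER_POD   # 4


def flops_utilization(achieved_flops_per_sec: float,
                      peak_flops_per_chip: float,
                      num_chips: int) -> float:
    """Fraction of the system's theoretical peak throughput actually achieved."""
    return achieved_flops_per_sec / (peak_flops_per_chip * num_chips)


# Hypothetical placeholders: the per-chip peak is NOT a TPU v4 specification,
# and the achieved value is back-computed to match the reported 57.8%.
HYPOTHETICAL_PEAK_PER_CHIP = 100e12
hypothetical_achieved = 0.578 * HYPOTHETICAL_PEAK_PER_CHIP * total_chips

utilization = flops_utilization(
    hypothetical_achieved, HYPOTHETICAL_PEAK_PER_CHIP, total_chips
)
print(f"total chips: {total_chips}, chips per host: {chips_per_host}")
print(f"utilization: {utilization:.1%}")
```

The chip count confirms the 6,144 figure in the text, and the host count implies four chips per host.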

References

  1. Narang, Sharan; Chowdhery, Aakanksha. "Pathways Language Model (PaLM): Scaling to 540 Billion Parameters for Breakthrough Performance". ai.googleblog.com. Retrieved 17 March 2023.
  2. Chowdhery, Aakanksha; Narang, Sharan; Devlin, Jacob; et al. (2022). "PaLM: Scaling Language Modeling with Pathways". arXiv:2204.02311 [cs.CL].
  3. Anadiotis, George (12 April 2022). "Google sets the bar for AI language models with PaLM". VentureBeat. Retrieved 17 March 2023.
  4. Bastian, Matthias (5 April 2022). "Google PaLM: Giant language AI can explain jokes". the decoder. Retrieved 17 March 2023.
  5. "Google: Why Is No One Talking About PaLM". seekingalpha.com. 12 December 2022. Retrieved 17 March 2023.
  6. Vincent, James (14 March 2023). "Google opens up its AI language model PaLM to challenge OpenAI and GPT-3". The Verge. Retrieved 17 March 2023.
  7. Huffman, Scott; Woodward, Josh. "PaLM API & MakerSuite: an approachable way to start prototyping and building generative AI applications". Retrieved 17 March 2023.
  8. Singhal, Karan; Azizi, Shekoofeh; Tu, Tao; et al. (2022). "Large Language Models Encode Clinical Knowledge". arXiv:2212.13138 [cs.CL].
  9. "MedPaLM: New Chatbots Will Soon Be Better Than Waiting For A Doctor". The Medical Futurist. 17 January 2023. Retrieved 17 March 2023.
  10. Matias, Yossi; Corrado, Greg (14 March 2023). "Our latest health AI research updates". Google. Retrieved 17 March 2023.
  11. Driess, Danny; Xia, Fei; Sajjadi, Mehdi S. M.; et al. (2023). "PaLM-E: An Embodied Multimodal Language Model". arXiv:2303.03378 [cs.LG].
  12. Driess, Danny; Florence, Pete. "PaLM-E: An embodied multimodal language model". ai.googleblog.com. Retrieved 17 March 2023.
  13. Edwards, Benj (7 March 2023). "Google's PaLM-E is a generalist robot brain that takes commands". Ars Technica. Retrieved 17 March 2023.
  14. Lardinois, Frederic (10 May 2023). "Google launches PaLM 2, its next-gen large language model". TechCrunch. Archived from the original on 10 May 2023. Retrieved 10 May 2023.
  15. Elias, Jennifer (16 May 2023). "Google's newest A.I. model uses nearly five times more text data for training than its predecessor". CNBC. Retrieved 18 May 2023.
  16. "AudioPaLM". google-research.github.io. Retrieved 30 June 2023.
  17. "An empirical analysis of compute-optimal large language model training". www.deepmind.com. 12 April 2022. Retrieved 17 March 2023.
Retrieved from "https://en.wikipedia.org/w/index.php?title=PaLM&oldid=1319096815"