A Comprehensive Survey of Scientific Large Language Models and Their Applications in Scientific Discovery (EMNLP'24)
A curated list of pre-trained language models in scientific domains (e.g., mathematics, physics, chemistry, materials science, biology, medicine, geoscience), covering different model sizes (from 100M to 100B parameters) and modalities (e.g., language, graph, vision, table, molecule, protein, genome, climate time series).
The repository is part of our survey paper A Comprehensive Survey of Scientific Large Language Models and Their Applications in Scientific Discovery and will be continuously updated.
NOTE 1: To avoid ambiguity, when we talk about the number of parameters in a model, "Base" refers to 110M (i.e., BERT-Base), and "Large" refers to 340M (i.e., BERT-Large). Other numbers will be written explicitly.
NOTE 2: In each subsection, papers are sorted chronologically. If a paper has a preprint (e.g., arXiv or bioRxiv) version, its publication date is taken from the preprint server; otherwise, it is taken from the conference proceedings or journal.
NOTE 3: We appreciate contributions. If you have any suggested papers, feel free to reach out to yuzhang@tamu.edu or submit a pull request. For format consistency, we will include a paper only after (1) it has a version with author names AND (2) its GitHub and/or Hugging Face links are available.
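Most entries below link to Hugging Face checkpoints. As a quick-start illustration, here is a minimal sketch of loading one of the listed encoders with the Hugging Face transformers library; the checkpoint ID shown (SciBERT's allenai/scibert_scivocab_uncased) is only an example, and any ID from a [Model] link can be substituted.

```python
# Minimal sketch: load a listed checkpoint with Hugging Face Transformers.
# The checkpoint ID below (SciBERT) is used for illustration; substitute the
# ID from the corresponding [Model] link of any entry in this list.
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("allenai/scibert_scivocab_uncased")
model = AutoModel.from_pretrained("allenai/scibert_scivocab_uncased")

inputs = tokenizer("Masked language models adapt well to scientific text.",
                   return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # [1, sequence_length, hidden_size]
```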
- General
- Mathematics
- Physics
- Chemistry and Materials Science
- Biology and Medicine
- Geography, Geology, and Environmental Science
- (SciBERT) SciBERT: A Pretrained Language Model for Scientific Text. EMNLP 2019. [Paper] [GitHub] [Model (Base)]
- (SciGPT2) Explaining Relationships between Scientific Documents. ACL 2021. [Paper] [GitHub] [Model (117M)]
- (CATTS) TLDR: Extreme Summarization of Scientific Documents. EMNLP 2020 Findings. [Paper] [GitHub] [Model (406M)]
- (SciNewsBERT) SciClops: Detecting and Contextualizing Scientific Claims for Assisting Manual Fact-Checking. CIKM 2021. [Paper] [Model (Base)]
- (ScholarBERT) The Diminishing Returns of Masked Language Models to Science. ACL 2023 Findings. [Paper] [Model (Large)] [Model (770M)]
- (AcademicRoBERTa) A Japanese Masked Language Model for Academic Domain. COLING 2022 Workshop. [Paper] [GitHub] [Model (125M)]
- (Galactica) Galactica: A Large Language Model for Science. arXiv 2022. [Paper] [Model (125M)] [Model (1.3B)] [Model (6.7B)] [Model (30B)] [Model (120B)]
- (DARWIN) DARWIN Series: Domain Specific Large Language Models for Natural Science. arXiv 2023. [Paper] [GitHub] [Model (7B)]
- (FORGE) FORGE: Pre-training Open Foundation Models for Science. SC 2023. [Paper] [GitHub] [Model (1.4B, General)] [Model (1.4B, Biology/Medicine)] [Model (1.4B, Chemistry)] [Model (1.4B, Engineering)] [Model (1.4B, Materials Science)] [Model (1.4B, Physics)] [Model (1.4B, Social Science/Art)] [Model (13B, General)] [Model (22B, General)]
- (SciGLM) SciInstruct: a Self-Reflective Instruction Annotated Dataset for Training Scientific Language Models. NeurIPS 2024. [Paper] [GitHub] [Model (6B)]
- (INDUS) INDUS: Effective and Efficient Language Models for Scientific Applications. EMNLP 2024. [Paper] [Model (38M)] [Model (125M)]
- (SciDFM) SciDFM: A Large Language Model with Mixture-of-Experts for Science. arXiv 2024. [Paper] [Model (18.2B)]
- (SPECTER) SPECTER: Document-level Representation Learning using Citation-informed Transformers. ACL 2020. [Paper] [GitHub] [Model (Base)]
- (OAG-BERT) OAG-BERT: Towards a Unified Backbone Language Model for Academic Knowledge Services. KDD 2022. [Paper] [GitHub]
- (ASPIRE) Multi-Vector Models with Textual Guidance for Fine-Grained Scientific Document Similarity. NAACL 2022. [Paper] [GitHub] [Model (Base)]
- (SciNCL) Neighborhood Contrastive Learning for Scientific Document Representations with Citation Embeddings. EMNLP 2022. [Paper] [GitHub] [Model (Base)]
- (SPECTER 2.0) SciRepEval: A Multi-Format Benchmark for Scientific Document Representations. EMNLP 2023. [Paper] [GitHub] [Model (113M)]
- (SciPatton) Patton: Language Model Pretraining on Text-Rich Networks. ACL 2023. [Paper] [GitHub]
- (SciMult) Pre-training Multi-task Contrastive Learning Models for Scientific Literature Understanding. EMNLP 2023 Findings. [Paper] [GitHub] [Model (138M)]
- (GenBERT) Injecting Numerical Reasoning Skills into Language Models. ACL 2020. [Paper] [GitHub]
- (MathBERT) MathBERT: A Pre-trained Language Model for General NLP Tasks in Mathematics Education. arXiv 2021. [Paper] [GitHub] [Model (Base)]
- (MWP-BERT) MWP-BERT: Numeracy-Augmented Pre-training for Math Word Problem Solving. NAACL 2022 Findings. [Paper] [GitHub] [Model (Base)]
- (BERT-TD) Seeking Patterns, Not just Memorizing Procedures: Contrastive Learning for Solving Math Word Problems. ACL 2022 Findings. [Paper] [GitHub]
- (GSM8K-GPT) Training Verifiers to Solve Math Word Problems. arXiv 2021. [Paper] [GitHub]
- (DeductReasoner) Learning to Reason Deductively: Math Word Problem Solving as Complex Relation Extraction. ACL 2022. [Paper] [GitHub] [Model (125M)]
- (NaturalProver) NaturalProver: Grounded Mathematical Proof Generation with Language Models. NeurIPS 2022. [Paper] [GitHub]
- (Minerva) Solving Quantitative Reasoning Problems with Language Models. NeurIPS 2022. [Paper]
- (Bhaskara) Lila: A Unified Benchmark for Mathematical Reasoning. EMNLP 2022. [Paper] [GitHub] [Model (2.7B)]
- (WizardMath) WizardMath: Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct. arXiv 2023. [Paper] [GitHub] [Model (7B)] [Model (13B)] [Model (70B)]
- (MAmmoTH) MAmmoTH: Building Math Generalist Models through Hybrid Instruction Tuning. ICLR 2024. [Paper] [GitHub] [Model (7B, LLaMA-2)] [Model (7B, Mistral)] [Model (13B, LLaMA-2)] [Model (70B, LLaMA-2)]
- (MetaMath) MetaMath: Bootstrap Your Own Mathematical Questions for Large Language Models. ICLR 2024. [Paper] [GitHub] [Model (7B, LLaMA-2)] [Model (7B, Mistral)] [Model (13B, LLaMA-2)] [Model (70B, LLaMA-2)]
- (ToRA) ToRA: A Tool-Integrated Reasoning Agent for Mathematical Problem Solving. ICLR 2024. [Paper] [GitHub] [Model (7B)] [Model (13B)] [Model (70B)]
- (MathCoder) MathCoder: Seamless Code Integration in LLMs for Enhanced Mathematical Reasoning. ICLR 2024. [Paper] [GitHub] [Model (7B)] [Model (13B)]
- (Llemma) Llemma: An Open Language Model For Mathematics. ICLR 2024. [Paper] [GitHub] [Model (7B)] [Model (34B)]
- (OVM) OVM, Outcome-Supervised Value Models for Planning in Mathematical Reasoning. NAACL 2024 Findings. [Paper] [GitHub] [Model (7B, LLaMA-2)] [Model (7B, Mistral)]
- (DeepSeekMath) DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models. arXiv 2024. [Paper] [GitHub] [Model (7B)]
- (InternLM-Math) InternLM-Math: Open Math Large Language Models Toward Verifiable Reasoning. arXiv 2024. [Paper] [GitHub] [Model (7B)] [Model (20B)]
- (OpenMath) OpenMathInstruct-1: A 1.8 Million Math Instruction Tuning Dataset. NeurIPS 2024. [Paper] [Model (7B, Mistral)] [Model (70B, LLaMA-2)]
- (Rho-Math) Rho-1: Not All Tokens Are What You Need. NeurIPS 2024. [Paper] [GitHub] [Model (1B)] [Model (7B)]
- (MAmmoTH2) MAmmoTH2: Scaling Instructions from the Web. NeurIPS 2024. [Paper] [GitHub] [Model (7B, Mistral)] [Model (8B, LLaMA-3)] [Model (8x7B, Mixtral)]
- (TheoremLlama) TheoremLlama: Transforming General-Purpose LLMs into Lean4 Experts. EMNLP 2024. [Paper] [GitHub] [Model (8B)]
- (Inter-GPS) Inter-GPS: Interpretable Geometry Problem Solving with Formal Language and Symbolic Reasoning. ACL 2021. [Paper] [GitHub]
- (Geoformer) UniGeo: Unifying Geometry Logical Reasoning via Reformulating Mathematical Expression. EMNLP 2022. [Paper] [GitHub]
- (SCA-GPS) A Symbolic Character-Aware Model for Solving Geometry Problems. ACM MM 2023. [Paper] [GitHub]
- (UniMath-Flan-T5) UniMath: A Foundational and Multimodal Mathematical Reasoner. EMNLP 2023. [Paper] [GitHub]
- (G-LLaVA) G-LLaVA: Solving Geometric Problem with Multi-Modal Large Language Model. arXiv 2023. [Paper] [GitHub] [Model (7B)] [Model (13B)]
- (TAPAS) TAPAS: Weakly Supervised Table Parsing via Pre-training. ACL 2020. [Paper] [GitHub] [Model (Base)] [Model (Large)]
- (TaBERT) TaBERT: Learning Contextual Representations for Natural Language Utterances and Structured Tables. ACL 2020. [Paper] [GitHub] [Model (Base)] [Model (Large)]
- (GraPPa) GraPPa: Grammar-Augmented Pre-training for Table Semantic Parsing. ICLR 2021. [Paper] [GitHub] [Model (355M)]
- (TUTA) TUTA: Tree-Based Transformers for Generally Structured Table Pre-training. KDD 2021. [Paper] [GitHub]
- (RCI) Capturing Row and Column Semantics in Transformer Based Question Answering over Tables. NAACL 2021. [Paper] [GitHub] [Model (12M)]
- (TABBIE) TABBIE: Pretrained Representations of Tabular Data. NAACL 2021. [Paper] [GitHub]
- (TAPEX) TAPEX: Table Pre-training via Learning a Neural SQL Executor. ICLR 2022. [Paper] [GitHub] [Model (140M)] [Model (406M)]
- (FORTAP) FORTAP: Using Formulas for Numerical-Reasoning-Aware Table Pretraining. ACL 2022. [Paper] [GitHub]
- (OmniTab) OmniTab: Pretraining with Natural and Synthetic Data for Few-shot Table-Based Question Answering. NAACL 2022. [Paper] [GitHub] [Model (406M)]
- (ReasTAP) ReasTAP: Injecting Table Reasoning Skills During Pre-training via Synthetic Reasoning Examples. EMNLP 2022. [Paper] [GitHub] [Model (406M)]
- (Table-GPT) Table-GPT: Table-tuned GPT for Diverse Table Tasks. SIGMOD 2024. [Paper]
- (TableLlama) TableLlama: Towards Open Large Generalist Models for Tables. NAACL 2024. [Paper] [GitHub] [Model (7B)]
- (TableLLM) TableLLM: Enabling Tabular Data Manipulation by LLMs in Real Office Usage Scenarios. arXiv 2024. [Paper] [GitHub] [Model (7B)] [Model (13B)]
- (astroBERT) Building astroBERT, a Language Model for Astronomy & Astrophysics. arXiv 2021. [Paper] [Model (Base)]
- (AstroLLaMA) AstroLLaMA: Towards Specialized Foundation Models in Astronomy. AACL 2023 Workshop. [Paper] [Model (7B)]
- (AstroLLaMA-Chat) AstroLLaMA-Chat: Scaling AstroLLaMA with Conversational and Diverse Datasets. Research Notes of the AAS 2024. [Paper] [Model (7B)]
- (PhysBERT) PhysBERT: A Text Embedding Model for Physics Scientific Literature. APL Machine Learning 2024. [Paper] [Model (Base)]
- (Astro-HEP-BERT) Astro-HEP-BERT: A Bidirectional Language Model for Studying the Meanings of Concepts in Astrophysics and High Energy Physics. arXiv 2024. [Paper] [Model (Base)]
- (ChemBERT) Automated Chemical Reaction Extraction from Scientific Literature. Journal of Chemical Information and Modeling 2022. [Paper] [GitHub] [Model (Base)]
- (MatSciBERT) MatSciBERT: A Materials Domain Language Model for Text Mining and Information Extraction. npj Computational Materials 2022. [Paper] [GitHub] [Model (Base)]
- (MatBERT) Quantifying the Advantage of Domain-Specific Pre-training on Named Entity Recognition Tasks in Materials Science. Patterns 2022. [Paper] [GitHub]
- (BatteryBERT) BatteryBERT: A Pretrained Language Model for Battery Database Enhancement. Journal of Chemical Information and Modeling 2022. [Paper] [GitHub] [Model (Base)]
- (MaterialsBERT) A General-Purpose Material Property Data Extraction Pipeline from Large Polymer Corpora using Natural Language Processing. npj Computational Materials 2023. [Paper] [Model (Base)]
- (Recycle-BERT) Recycle-BERT: Extracting Knowledge about Plastic Waste Recycling by Natural Language Processing. ACS Sustainable Chemistry & Engineering 2023. [Paper] [GitHub]
- (CatBERTa) Catalyst Property Prediction with CatBERTa: Unveiling Feature Exploration Strategies through Large Language Models. ACS Catalysis 2023. [Paper] [GitHub]
- (LLM-Prop) LLM-Prop: Predicting Physical and Electronic Properties of Crystalline Solids from Their Text Descriptions. arXiv 2023. [Paper] [GitHub]
- (ChemDFM) ChemDFM: Dialogue Foundation Model for Chemistry. arXiv 2024. [Paper] [GitHub] [Model (13B)]
- (CrystalLLM) Fine-Tuned Language Models Generate Stable Inorganic Materials as Text. ICLR 2024. [Paper] [GitHub]
- (ChemLLM) ChemLLM: A Chemical Large Language Model. arXiv 2024. [Paper] [Model (7B)]
- (LlaSMol) LlaSMol: Advancing Large Language Models for Chemistry with a Large-Scale, Comprehensive, High-Quality Instruction Tuning Dataset. COLM 2024. [Paper] [GitHub] [Model (6.7B, Galactica)] [Model (7B, LLaMA-2)] [Model (7B, Mistral)]
- (KALE-LM) KALE-LM: Unleash The Power Of AI For Science Via Knowledge And Logic Enhanced Large Model. arXiv 2024. [Paper] [Model (8B)]
- (Text2Mol) Text2Mol: Cross-Modal Molecule Retrieval with Natural Language Queries. EMNLP 2021. [Paper] [GitHub]
- (KV-PLM) A Deep-learning System Bridging Molecule Structure and Biomedical Text with Comprehension Comparable to Human Professionals. Nature Communications 2022. [Paper] [GitHub] [Model (Base)]
- (MolT5) Translation between Molecules and Natural Language. EMNLP 2022. [Paper] [GitHub] [Model (60M)] [Model (220M)] [Model (770M)]
- (MoMu) A Molecular Multimodal Foundation Model Associating Molecule Graphs with Natural Language. arXiv 2022. [Paper] [GitHub]
- (MoleculeSTM) Multi-modal Molecule Structure-text Model for Text-Based Retrieval and Editing. Nature Machine Intelligence 2023. [Paper] [GitHub]
- (Text+Chem T5) Unifying Molecular and Textual Representations via Multi-task Language Modelling. ICML 2023. [Paper] [GitHub] [Model (60M)] [Model (220M)]
- (GIMLET) GIMLET: A Unified Graph-Text Model for Instruction-Based Molecule Zero-Shot Learning. NeurIPS 2023. [Paper] [GitHub] [Model (60M)]
- (MolFM) MolFM: A Multimodal Molecular Foundation Model. arXiv 2023. [Paper] [GitHub]
- (MolCA) MolCA: Molecular Graph-Language Modeling with Cross-Modal Projector and Uni-Modal Adapter. EMNLP 2023. [Paper] [GitHub]
- (MolLM) MolLM: A Unified Language Model for Integrating Biomedical Text with 2D and 3D Molecular Representations. Bioinformatics 2024. [Paper] [GitHub]
- (InstructMol) InstructMol: Multi-Modal Integration for Building a Versatile and Reliable Molecular Assistant in Drug Discovery. COLING 2025. [Paper] [GitHub]
- (3D-MoLM) Towards 3D Molecule-Text Interpretation in Language Models. ICLR 2024. [Paper] [GitHub]
- (GIT-Mol) GIT-Mol: A Multi-modal Large Language Model for Molecular Science with Graph, Image, and Text. Computers in Biology and Medicine 2024. [Paper] [GitHub]
- (SMILES-BERT) SMILES-BERT: Large Scale Unsupervised Pre-training for Molecular Property Prediction. ACM BCB 2019. [Paper] [GitHub]
- (MAT) Molecule Attention Transformer. arXiv 2020. [Paper] [GitHub]
- (ChemBERTa) ChemBERTa: Large-Scale Self-Supervised Pretraining for Molecular Property Prediction. arXiv 2020. [Paper] [GitHub] [Model (125M)]
- (MolBERT) Molecular Representation Learning with Language Models and Domain-Relevant Auxiliary Tasks. arXiv 2020. [Paper] [GitHub] [Model (Base)]
- (rxnfp) Mapping the Space of Chemical Reactions using Attention-Based Neural Networks. Nature Machine Intelligence 2021. [Paper] [GitHub] [Model (Base)]
- (RXNMapper) Extraction of Organic Chemistry Grammar from Unsupervised Learning of Chemical Reactions. Science Advances 2021. [Paper] [GitHub]
- (MoLFormer) Large-Scale Chemical Language Representations Capture Molecular Structure and Properties. Nature Machine Intelligence 2022. [Paper] [GitHub] [Model (47M)]
- (Chemformer) Chemformer: A Pre-trained Transformer for Computational Chemistry. Machine Learning: Science and Technology 2022. [Paper] [GitHub] [Model (45M)] [Model (230M)]
- (R-MAT) Relative Molecule Self-Attention Transformer. Journal of Cheminformatics 2024. [Paper] [GitHub]
- (MolGPT) MolGPT: Molecular Generation using a Transformer-Decoder Model. Journal of Chemical Information and Modeling 2022. [Paper] [GitHub]
- (T5Chem) Unified Deep Learning Model for Multitask Reaction Predictions with Explanation. Journal of Chemical Information and Modeling 2022. [Paper] [GitHub]
- (ChemGPT) Neural Scaling of Deep Chemical Models. Nature Machine Intelligence 2023. [Paper] [Model (4.7M)] [Model (19M)] [Model (1.2B)]
- (Uni-Mol) Uni-Mol: A Universal 3D Molecular Representation Learning Framework. ICLR 2023. [Paper] [GitHub]
- (TransPolymer) TransPolymer: A Transformer-Based Language Model for Polymer Property Predictions. npj Computational Materials 2023. [Paper] [GitHub]
- (polyBERT) polyBERT: A Chemical Language Model to Enable Fully Machine-Driven Ultrafast Polymer Informatics. Nature Communications 2023. [Paper] [GitHub] [Model (86M)]
- (MFBERT) Large-Scale Distributed Training of Transformers for Chemical Fingerprinting. Journal of Chemical Information and Modeling 2022. [Paper] [GitHub]
- (SPMM) Bidirectional Generation of Structure and Properties Through a Single Molecular Foundation Model. Nature Communications 2024. [Paper] [GitHub]
- (BARTSmiles) BARTSmiles: Generative Masked Language Models for Molecular Representations. Journal of Chemical Information and Modeling 2024. [Paper] [GitHub] [Model (406M)]
- (MolGen) Domain-Agnostic Molecular Generation with Self-feedback. ICLR 2024. [Paper] [GitHub] [Model (406M, BART)] [Model (7B, LLaMA)]
- (SELFormer) SELFormer: Molecular Representation Learning via SELFIES Language Models. Machine Learning: Science and Technology 2023. [Paper] [GitHub] [Model (58M)] [Model (87M)]
- (PolyNC) PolyNC: A Natural and Chemical Language Model for the Prediction of Unified Polymer Properties. Chemical Science 2024. [Paper] [GitHub] [Model (220M)]
Acknowledgment: We referred to Wang et al.'s survey paper Pre-trained Language Models in Biomedical Domain: A Systematic Survey and He et al.'s survey paper Foundation Model for Advancing Healthcare: Challenges, Opportunities, and Future Directions when writing some parts of this section.
- (BioBERT) BioBERT: A Pre-trained Biomedical Language Representation Model for Biomedical Text Mining. Bioinformatics 2020. [Paper] [GitHub] [Model (Base)] [Model (Large)]
- (BioELMo) Probing Biomedical Embeddings from Language Models. NAACL 2019 Workshop. [Paper] [GitHub] [Model (93M)]
- (ClinicalBERT, Alsentzer et al.) Publicly Available Clinical BERT Embeddings. NAACL 2019 Workshop. [Paper] [GitHub] [Model (Base)]
- (ClinicalBERT, Huang et al.) ClinicalBERT: Modeling Clinical Notes and Predicting Hospital Readmission. arXiv 2019. [Paper] [GitHub] [Model (Base)]
- (BlueBERT, f.k.a. NCBI-BERT) Transfer Learning in Biomedical Natural Language Processing: An Evaluation of BERT and ELMo on Ten Benchmarking Datasets. ACL 2019 Workshop. [Paper] [GitHub] [Model (Base)] [Model (Large)]
- (BEHRT) BEHRT: Transformer for Electronic Health Records. Scientific Reports 2020. [Paper] [GitHub]
- (EhrBERT) Fine-Tuning Bidirectional Encoder Representations from Transformers (BERT)–Based Models on Large-Scale Electronic Health Record Notes: An Empirical Study. JMIR Medical Informatics 2019. [Paper] [GitHub]
- (Clinical XLNet) Clinical XLNet: Modeling Sequential Clinical Notes and Predicting Prolonged Mechanical Ventilation. EMNLP 2020 Workshop. [Paper] [GitHub]
- (ouBioBERT) Pre-training Technique to Localize Medical BERT and Enhance Biomedical BERT. arXiv 2020. [Paper] [GitHub] [Model (Base)]
- (COVID-Twitter-BERT) COVID-Twitter-BERT: A Natural Language Processing Model to Analyse COVID-19 Content on Twitter. Frontiers in Artificial Intelligence 2023. [Paper] [GitHub] [Model (Large)]
- (Med-BERT) Med-BERT: Pretrained Contextualized Embeddings on Large-Scale Structured Electronic Health Records for Disease Prediction. npj Digital Medicine 2021. [Paper] [GitHub]
- (Bio-ELECTRA) On the Effectiveness of Small, Discriminatively Pre-trained Language Representation Models for Biomedical Text Mining. EMNLP 2020 Workshop. [Paper] [GitHub] [Model (Base)]
- (BiomedBERT, f.k.a. PubMedBERT) Domain-Specific Language Model Pretraining for Biomedical Natural Language Processing. ACM Transactions on Computing for Healthcare 2021. [Paper] [Model (Base)] [Model (Large)]
- (MCBERT) Conceptualized Representation Learning for Chinese Biomedical Text Mining. arXiv 2020. [Paper] [GitHub] [Model (Base)]
- (BRLTM) Bidirectional Representation Learning from Transformers using Multimodal Electronic Health Record Data to Predict Depression. JBHI 2021. [Paper] [GitHub]
- (BioRedditBERT) COMETA: A Corpus for Medical Entity Linking in the Social Media. EMNLP 2020. [Paper] [GitHub] [Model (Base)]
- (BioMegatron) BioMegatron: Larger Biomedical Domain Language Model. EMNLP 2020. [Paper] [GitHub] [Model (345M)]
- (SapBERT) Self-Alignment Pretraining for Biomedical Entity Representations. NAACL 2021. [Paper] [GitHub] [Model (Base)]
- (ClinicalTransformer) Clinical Concept Extraction using Transformers. JAMIA 2020. [Paper] [GitHub] [Model (Base, BERT)] [Model (125M, RoBERTa)] [Model (12M, ALBERT)] [Model (Base, ELECTRA)] [Model (Base, XLNet)] [Model (149M, Longformer)] [Model (86M, DeBERTa)]
- (BioRoBERTa) Pretrained Language Models for Biomedical and Clinical Tasks: Understanding and Extending the State-of-the-Art. EMNLP 2020 Workshop. [Paper] [GitHub] [Model (125M)] [Model (355M)]
- (RAD-BERT) Highly Accurate Classification of Chest Radiographic Reports using a Deep Learning Natural Language Model Pre-trained on 3.8 Million Text Reports. Bioinformatics 2020. [Paper] [GitHub]
- (BioMedBERT) BioMedBERT: A Pre-trained Biomedical Language Model for QA and IR. COLING 2020. [Paper] [GitHub]
- (LBERT) LBERT: Lexically Aware Transformer-Based Bidirectional Encoder Representation Model for Learning Universal Bio-Entity Relations. Bioinformatics 2021. [Paper] [GitHub]
- (ELECTRAMed) ELECTRAMed: A New Pre-trained Language Representation Model for Biomedical NLP. arXiv 2021. [Paper] [GitHub] [Model (Base)]
- (KeBioLM) Improving Biomedical Pretrained Language Models with Knowledge. NAACL 2021 Workshop. [Paper] [GitHub]
- (SciFive) SciFive: A Text-to-Text Transformer Model for Biomedical Literature. arXiv 2021. [Paper] [GitHub] [Model (220M)] [Model (770M)]
- (BioALBERT) Benchmarking for Biomedical Natural Language Processing Tasks with a Domain Specific ALBERT. BMC Bioinformatics 2022. [Paper] [GitHub] [Model (12M)] [Model (18M)]
- (Clinical-Longformer) Clinical-Longformer and Clinical-BigBird: Transformers for Long Clinical Sequences. arXiv 2022. [Paper] [GitHub] [Model (149M, Longformer)] [Model (Base, BigBird)]
- (BioBART) BioBART: Pretraining and Evaluation of A Biomedical Generative Language Model. ACL 2022 Workshop. [Paper] [GitHub] [Model (140M)] [Model (406M)]
- (BioGPT) BioGPT: Generative Pre-trained Transformer for Biomedical Text Generation and Mining. Briefings in Bioinformatics 2022. [Paper] [GitHub] [Model (355M)] [Model (1.5B)]
- (Med-PaLM) Large Language Models Encode Clinical Knowledge. Nature 2023. [Paper]
- (GatorTron) A Large Language Model for Electronic Health Records. npj Digital Medicine 2022. [Paper] [GitHub] [Model (345M)] [Model (3.9B)] [Model (8.9B)]
- (ChatDoctor) ChatDoctor: A Medical Chat Model Fine-Tuned on a Large Language Model Meta-AI (LLaMA) using Medical Domain Knowledge. Cureus 2023. [Paper] [GitHub]
- (DoctorGLM) DoctorGLM: Fine-tuning your Chinese Doctor is not a Herculean Task. arXiv 2023. [Paper] [GitHub]
- (BenTsao, f.k.a. HuaTuo) HuaTuo: Tuning LLaMA Model with Chinese Medical Knowledge. arXiv 2023. [Paper] [GitHub]
- (MedAlpaca) MedAlpaca - An Open-Source Collection of Medical Conversational AI Models and Training Data. arXiv 2023. [Paper] [GitHub] [Model (7B)] [Model (13B)]
- (PMC-LLaMA) PMC-LLaMA: Towards Building Open-source Language Models for Medicine. JAMIA 2024. [Paper] [GitHub] [Model (7B)] [Model (13B)]
- (Med-PaLM 2) Toward Expert-Level Medical Question Answering with Large Language Models. Nature Medicine 2025. [Paper]
- (HuatuoGPT) HuatuoGPT, towards Taming Language Model to Be a Doctor. EMNLP 2023 Findings. [Paper] [GitHub] [Model (7B)] [Model (13B)]
- (MedCPT) MedCPT: Contrastive Pre-trained Transformers with Large-scale PubMed Search Logs for Zero-shot Biomedical Information Retrieval. Bioinformatics 2023. [Paper] [GitHub] [Model (Base)]
- (Zhongjing) Zhongjing: Enhancing the Chinese Medical Capabilities of Large Language Model through Expert Feedback and Real-world Multi-turn Dialogue. AAAI 2024. [Paper] [GitHub] [Model (13B)]
- (DISC-MedLLM) DISC-MedLLM: Bridging General Large Language Models and Real-World Medical Consultation. arXiv 2023. [Paper] [GitHub] [Model (13B)]
- (DRG-LLaMA) DRG-LLaMA: Tuning LLaMA Model to Predict Diagnosis-related Group for Hospitalized Patients. npj Digital Medicine 2024. [Paper] [GitHub]
- (Qilin-Med) Qilin-Med: Multi-stage Knowledge Injection Advanced Medical Large Language Model. arXiv 2023. [Paper] [GitHub]
- (AlpaCare) AlpaCare: Instruction-tuned Large Language Models for Medical Application. arXiv 2023. [Paper] [GitHub] [Model (7B, LLaMA)] [Model (7B, LLaMA-2)] [Model (13B, LLaMA)] [Model (13B, LLaMA-2)]
- (BianQue) BianQue: Balancing the Questioning and Suggestion Ability of Health LLMs with Multi-turn Health Conversations Polished by ChatGPT. arXiv 2023. [Paper] [GitHub] [Model (6B)]
- (HuatuoGPT-II) HuatuoGPT-II, One-stage Training for Medical Adaption of LLMs. COLM 2024. [Paper] [GitHub] [Model (7B)] [Model (13B)] [Model (34B)]
- (Taiyi) Taiyi: A Bilingual Fine-Tuned Large Language Model for Diverse Biomedical Tasks. JAMIA 2024. [Paper] [GitHub] [Model (7B)]
- (MEDITRON) MEDITRON-70B: Scaling Medical Pretraining for Large Language Models. arXiv 2023. [Paper] [GitHub] [Model (7B)] [Model (70B)]
- (PLLaMa) PLLaMa: An Open-source Large Language Model for Plant Science. arXiv 2024. [Paper] [GitHub] [Model (7B)] [Model (13B)]
- (BioMistral) BioMistral: A Collection of Open-Source Pretrained Large Language Models for Medical Domains. ACL 2024 Findings. [Paper] [Model (7B)]
- (Me-LLaMA) Me-LLaMA: Foundation Large Language Models for Medical Applications. arXiv 2024. [Paper] [GitHub]
- (BiMediX) BiMediX: Bilingual Medical Mixture of Experts LLM. EMNLP 2024 Findings. [Paper] [GitHub] [Model (8x7B)]
- (MMedLM) Towards Building Multilingual Language Model for Medicine. Nature Communications 2024. [Paper] [GitHub] [Model (7B, InternLM)] [Model (1.8B, InternLM2)] [Model (7B, InternLM2)] [Model (8B, LLaMA-3)]
- (BioMedLM, f.k.a. PubMedGPT) BioMedLM: A 2.7B Parameter Language Model Trained On Biomedical Text. arXiv 2024. [Paper] [GitHub] [Model (2.7B)]
- (Hippocrates) Hippocrates: An Open-Source Framework for Advancing Large Language Models in Healthcare. arXiv 2024. [Paper] [Model (7B, LLaMA-2)] [Model (7B, Mistral)]
- (BMRetriever) BMRetriever: Tuning Large Language Models as Better Biomedical Text Retrievers. EMNLP 2024. [Paper] [GitHub] [Model (410M, Pythia)] [Model (1B, Pythia)] [Model (2B, Gemma)] [Model (7B, Mistral)]
- (UltraMedical) UltraMedical: Building Specialized Generalists in Biomedicine. NeurIPS 2024. [Paper] [GitHub] [Model (8B, LLaMA-3)] [Model (70B, LLaMA-3)] [Model (8B, LLaMA-3.1)]
- (Panacea) Panacea: A Foundation Model for Clinical Trial Search, Summarization, Design, and Recruitment. arXiv 2024. [Paper] [GitHub] [Model (7B)]
- (HuatuoGPT-o1) HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs. arXiv 2024. [Paper] [GitHub] [Model (8B, LLaMA-3.1)] [Model (70B, LLaMA-3.1)] [Model (7B, Qwen2.5)] [Model (72B, Qwen2.5)]
- (G-BERT) Pre-training of Graph Augmented Transformers for Medication Recommendation. IJCAI 2019. [Paper] [GitHub]
- (CODER) CODER: Knowledge Infused Cross-Lingual Medical Term Embedding for Term Normalization. JBI 2022. [Paper] [GitHub] [Model (Base)]
- (MoP) Mixture-of-Partitions: Infusing Large Biomedical Knowledge Graphs into BERT. EMNLP 2021. [Paper] [GitHub]
- (BioLinkBERT) LinkBERT: Pretraining Language Models with Document Links. ACL 2022. [Paper] [GitHub] [Model (Base)] [Model (Large)]
- (DRAGON) Deep Bidirectional Language-Knowledge Graph Pretraining. NeurIPS 2022. [Paper] [GitHub] [Model (360M)]
- (ConVIRT) Contrastive Learning of Medical Visual Representations from Paired Images and Text. MLHC 2022. [Paper] [GitHub]
- (MMBERT) MMBERT: Multimodal BERT Pretraining for Improved Medical VQA. ISBI 2021. [Paper] [GitHub]
- (MedViLL) Multi-modal Understanding and Generation for Medical Images and Text via Vision-Language Pre-training. JBHI 2022. [Paper] [GitHub]
- (GLoRIA) GLoRIA: A Multimodal Global-Local Representation Learning Framework for Label-efficient Medical Image Recognition. ICCV 2021. [Paper] [GitHub]
- (LoVT) Joint Learning of Localized Representations from Medical Images and Reports. ECCV 2022. [Paper] [GitHub]
- (BioViL) Making the Most of Text Semantics to Improve Biomedical Vision-Language Processing. ECCV 2022. [Paper] [GitHub]
- (M3AE) Multi-Modal Masked Autoencoders for Medical Vision-and-Language Pre-training. MICCAI 2022. [Paper] [GitHub] [Model]
- (ARL) Align, Reason and Learn: Enhancing Medical Vision-and-Language Pre-training with Knowledge. ACM MM 2022. [Paper] [GitHub]
- (CheXzero) Expert-Level Detection of Pathologies from Unannotated Chest X-ray Images via Self-Supervised Learning. Nature Biomedical Engineering 2022. [Paper] [GitHub] [Model]
- (MGCA) Multi-Granularity Cross-modal Alignment for Generalized Medical Visual Representation Learning. NeurIPS 2022. [Paper] [GitHub] [Model]
- (MedCLIP) MedCLIP: Contrastive Learning from Unpaired Medical Images and Text. EMNLP 2022. [Paper] [GitHub]
- (BioViL-T) Learning to Exploit Temporal Structure for Biomedical Vision-Language Processing. CVPR 2023. [Paper] [GitHub] [Model]
- (BiomedCLIP) BiomedCLIP: A Multimodal Biomedical Foundation Model Pretrained from Fifteen Million Scientific Image-Text Pairs. NEJM AI 2024. [Paper] [Model]
- (PMC-CLIP) PMC-CLIP: Contrastive Language-Image Pre-training using Biomedical Documents. MICCAI 2023. [Paper] [GitHub] [Model]
- (Xplainer) Xplainer: From X-Ray Observations to Explainable Zero-Shot Diagnosis. MICCAI 2023. [Paper] [GitHub]
- (RGRG) Interactive and Explainable Region-Guided Radiology Report Generation. CVPR 2023. [Paper] [GitHub] [Model]
- (BiomedGPT) A Generalist Vision-Language Foundation Model for Diverse Biomedical Tasks. Nature Medicine 2024. [Paper] [GitHub] [Model (33M)] [Model (93M)] [Model (182M)]
- (Med-UniC) Med-UniC: Unifying Cross-Lingual Medical Vision-Language Pre-Training by Diminishing Bias. NeurIPS 2023. [Paper] [GitHub]
- (LLaVA-Med) LLaVA-Med: Training a Large Language-and-Vision Assistant for Biomedicine in One Day. NeurIPS 2023. [Paper] [GitHub] [Model (7B)]
- (MI-Zero) Visual Language Pretrained Multiple Instance Zero-Shot Transfer for Histopathology Images. CVPR 2023. [Paper] [GitHub] [Model]
- (XrayGPT) XrayGPT: Chest Radiographs Summarization using Medical Vision-Language Models. ACL 2024 Workshop. [Paper] [GitHub]
- (MONET) Transparent Medical Image AI via an Image–Text Foundation Model Grounded in Medical Literature. Nature Medicine 2024. [Paper] [GitHub]
- (QuiltNet) Quilt-1M: One Million Image-Text Pairs for Histopathology. NeurIPS 2023. [Paper] [GitHub] [Model]
- (MUMC) Masked Vision and Language Pre-training with Unimodal and Multimodal Contrastive Losses for Medical Visual Question Answering. MICCAI 2023. [Paper] [GitHub]
- (M-FLAG) M-FLAG: Medical Vision-Language Pre-training with Frozen Language Models and Latent Space Geometry Optimization. MICCAI 2023. [Paper] [GitHub]
- (PRIOR) PRIOR: Prototype Representation Joint Learning from Medical Images and Reports. ICCV 2023. [Paper] [GitHub]
- (Med-PaLM M) Towards Generalist Biomedical AI. NEJM AI 2024. [Paper] [GitHub]
- (CITE) Text-Guided Foundation Model Adaptation for Pathological Image Classification. MICCAI 2023. [Paper] [GitHub]
- (Med-Flamingo) Med-Flamingo: A Multimodal Medical Few-shot Learner. ML4H 2023. [Paper] [GitHub]
- (RadFM) Towards Generalist Foundation Model for Radiology by Leveraging Web-scale 2D&3D Medical Data. arXiv 2023. [Paper] [GitHub] [Model]
- (PLIP) A Visual–Language Foundation Model for Pathology Image Analysis using Medical Twitter. Nature Medicine 2023. [Paper] [GitHub] [Model]
- (MaCo) Enhancing Representation in Radiography-Reports Foundation Model: A Granular Alignment Algorithm Using Masked Contrastive Learning. Nature Communications 2024. [Paper] [GitHub]
- (CXR-CLIP) CXR-CLIP: Toward Large Scale Chest X-ray Language-Image Pre-training. MICCAI 2023. [Paper] [GitHub]
- (Qilin-Med-VL) Qilin-Med-VL: Towards Chinese Large Vision-Language Model for General Healthcare. arXiv 2023. [Paper] [GitHub] [Model]
- (BioCLIP) BioCLIP: A Vision Foundation Model for the Tree of Life. CVPR 2024. [Paper] [GitHub] [Model]
- (M3D) M3D: Advancing 3D Medical Image Analysis with Multi-Modal Large Language Models. arXiv 2024. [Paper] [GitHub] [Model]
- (Med-Gemini) Capabilities of Gemini Models in Medicine. arXiv 2024. [Paper]
- (Med-Gemini-2D/3D/Polygenic) Advancing Multimodal Medical Capabilities of Gemini. arXiv 2024. [Paper]
- (Mammo-CLIP) Mammo-CLIP: A Vision Language Foundation Model to Enhance Data Efficiency and Robustness in Mammography. MICCAI 2024. [Paper] [GitHub] [Model]
- (BiomedParse) A Foundation Model for Joint Segmentation, Detection and Recognition of Biomedical Objects across Nine Modalities. Nature Methods 2025. [Paper] [GitHub] [Model]
- (HuatuoGPT-Vision) Towards Injecting Medical Visual Knowledge into Multimodal LLMs at Scale. EMNLP 2024. [Paper] [GitHub] [Model (7B)] [Model (34B)]
- (ProtTrans) ProtTrans: Toward Understanding the Language of Life Through Self-Supervised Learning. TPAMI 2021. [Paper] [GitHub] [Model (420M, BERT)] [Model (224M, ALBERT)] [Model (409M, XLNet)] [Model (420M, ELECTRA)] [Model (3B, T5)] [Model (11B, T5)]
- (ESM-1b) Biological Structure and Function Emerge from Scaling Unsupervised Learning to 250 Million Protein Sequences. PNAS 2021. [Paper] [GitHub] [Model (650M)]
- (ESM-1v) Language Models Enable Zero-Shot Prediction of the Effects of Mutations on Protein Function. NeurIPS 2021. [Paper] [GitHub] [Model (650M)]
- (AminoBERT) Single-Sequence Protein Structure Prediction using a Language Model and Deep Learning. Nature Biotechnology 2022. [Paper] [GitHub]
- (ProteinBERT) ProteinBERT: A Universal Deep-Learning Model of Protein Sequence and Function. Bioinformatics 2022. [Paper] [GitHub] [Model (16M)]
- (ProtGPT2) ProtGPT2 is a Deep Unsupervised Language Model for Protein Design. Nature Communications 2022. [Paper] [Model (738M)]
- (ESM-IF1) Learning Inverse Folding from Millions of Predicted Structures. ICML 2022. [Paper] [GitHub] [Model (142M)]
- (ProGen) Large Language Models Generate Functional Protein Sequences across Diverse Families. Nature Biotechnology 2023. [Paper] [GitHub] [Model (1.6B)]
- (ProGen2) ProGen2: Exploring the Boundaries of Protein Language Models. Cell Systems 2023. [Paper] [GitHub] [Model (151M)] [Model (764M)] [Model (2.7B)] [Model (6.4B)]
- (ESM-2) Evolutionary-Scale Prediction of Atomic-Level Protein Structure with a Language Model. Science 2023. [Paper] [GitHub] [Model (8M)] [Model (35M)] [Model (150M)] [Model (650M)] [Model (3B)] [Model (15B)]
- (Ankh) Ankh: Optimized Protein Language Model Unlocks General-Purpose Modelling. arXiv 2023. [Paper] [GitHub] [Model (450M)] [Model (1.1B)]
- (ProtST) ProtST: Multi-Modality Learning of Protein Sequences and Biomedical Texts. ICML 2023. [Paper] [GitHub]
- (LM-Design) Structure-informed Language Models Are Protein Designers. ICML 2023. [Paper] [GitHub] [Model (659M)]
- (ProteinDT) A Text-Guided Protein Design Framework. arXiv 2023. [Paper] [GitHub]
- (gLM) Genomic Language Model Predicts Protein Co-Regulation and Function. Nature Communications 2024. [Paper] [GitHub] [Model (1B)]
- (Prot2Text) Prot2Text: Multimodal Protein's Function Generation with GNNs and Transformers. AAAI 2024. [Paper] [GitHub] [Model (256M)] [Model (283M)] [Model (398M)] [Model (898M)]
- (BioMedGPT) BioMedGPT: Open Multimodal Generative Pre-trained Transformer for BioMedicine. arXiv 2023. [Paper] [GitHub] [Model (10B)]
- (SaProt) SaProt: Protein Language Modeling with Structure-Aware Vocabulary. ICLR 2024. [Paper] [GitHub] [Model (35M)] [Model (650M)]
- (BioT5) BioT5: Enriching Cross-modal Integration in Biology with Chemical Knowledge and Natural Language Associations. EMNLP 2023. [Paper] [GitHub] [Model (220M)]
- (xTrimoPGLM) xTrimoPGLM: Unified 100B-Scale Pre-trained Transformer for Deciphering the Language of Protein. arXiv 2024. [Paper] [GitHub] [Model (1B)] [Model (3B)] [Model (10B)] [Model (100B)]
- (ProLLaMA) ProLLaMA: A Protein Large Language Model for Multi-Task Protein Language Processing. arXiv 2024. [Paper] [GitHub] [Model (7B)]
- (ProteinCLIP) ProteinCLIP: Enhancing Protein Language Models with Natural Language. bioRxiv 2024. [Paper] [GitHub]
- (ESM-3) Simulating 500 Million Years of Evolution with a Language Model. Science 2025. [Paper] [GitHub] [Model (98B)]
- (DNABERT) DNABERT: Pre-trained Bidirectional Encoder Representations from Transformers Model for DNA-Language in Genome. Bioinformatics 2021. [Paper] [GitHub] [Model (Base)]
- (GenSLMs) GenSLMs: Genome-Scale Language Models Reveal SARS-CoV-2 Evolutionary Dynamics. The International Journal of High Performance Computing Applications 2023. [Paper] [GitHub]
- (Nucleotide Transformer) Nucleotide Transformer: Building and Evaluating Robust Foundation Models for Human Genomics. Nature Methods 2024. [Paper] [GitHub] [Model (50M)] [Model (100M)] [Model (250M)] [Model (500M)]
- (SpeciesLM) Species-Aware DNA Language Models Capture Regulatory Elements and Their Evolution. Genome Biology 2024. [Paper] [GitHub] [Model (89M)]
- (GENA-LM) GENA-LM: A Family of Open-Source Foundational DNA Language Models for Long Sequences. Nucleic Acids Research 2025. [Paper] [GitHub] [Model (Base, BERT)] [Model (Large, BERT)] [Model (Base, BigBird)]
- (DNABERT-2) DNABERT-2: Efficient Foundation Model and Benchmark for Multi-Species Genome. ICLR 2024. [Paper] [GitHub] [Model (Base)]
- (HyenaDNA) HyenaDNA: Long-Range Genomic Sequence Modeling at Single Nucleotide Resolution. NeurIPS 2023. [Paper] [GitHub] [Model (0.4M)] [Model (3.3M)] [Model (6.6M)]
- (DNAGPT) DNAGPT: A Generalized Pre-trained Tool for Versatile DNA Sequence Analysis Tasks. arXiv 2023. [Paper] [GitHub] [Model (0.1B)] [Model (3B)]
- (GPN-MSA) GPN-MSA: An Alignment-Based DNA Language Model for Genome-Wide Variant Effect Prediction. Nature Biotechnology 2025. [Paper] [GitHub] [Model (86M)]
- (ENBED) Understanding the Natural Language of DNA using Encoder-Decoder Foundation Models with Byte-level Precision. Bioinformatics Advances 2024. [Paper] [GitHub]
- (LucaOne) LucaOne: Generalized Biological Foundation Model with Unified Nucleic Acid and Protein Language. bioRxiv 2024. [Paper] [GitHub]
- (AIDO.DNA) Accurate and General DNA Representations Emerge from Genome Foundation Models at Scale. bioRxiv 2024. [Paper] [GitHub] [Model (7B)]
- (RNABERT) Informative RNA-base Embedding for Functional RNA Structural Alignment and Clustering by Deep Representation Learning. NAR Genomics and Bioinformatics 2022. [Paper] [GitHub]
- (RNA-FM) Interpretable RNA Foundation Model from Unannotated Data for Highly Accurate RNA Structure and Function Predictions. arXiv 2022. [Paper] [GitHub]
- (SpliceBERT) Self-Supervised Learning on Millions of Primary RNA Sequences from 72 Vertebrates Improves Sequence-Based RNA Splicing Prediction. Briefings in Bioinformatics 2024. [Paper] [GitHub] [Model (19.4M)]
- (RNA-MSM) Multiple Sequence-Alignment-Based RNA Language Model and its Application to Structural Inference. Nucleic Acids Research 2024. [Paper] [GitHub]
- (CodonBERT) CodonBERT Large Language Model for mRNA Vaccines. Genome Research 2024. [Paper] [GitHub]
- (UTR-LM) A 5' UTR Language Model for Decoding Untranslated Regions of mRNA and Function Predictions. Nature Machine Intelligence 2024. [Paper] [GitHub]
- (GenerRNA) GenerRNA: A Generative Pre-trained Language Model for de novo RNA Design. PLoS One 2024. [Paper] [Model (350M)]
- (RNAErnie) Multi-Purpose RNA Language Modelling with Motif-Aware Pretraining and Type-Guided Fine-Tuning. Nature Machine Intelligence 2024. [Paper] [GitHub] [Model (105M)]
- (RNA-TorsionBERT) RNA-TorsionBERT: Leveraging Language Models for RNA 3D Torsion Angles Prediction. Bioinformatics 2025. [Paper] [GitHub]
- (PlantRNA-FM) An Interpretable RNA Foundation Model for Exploring Functional RNA Motifs in Plants. Nature Machine Intelligence 2024. [Paper] [GitHub] [Model (35M)]
- (AIDO.RNA) A Large-Scale Foundation Model for RNA Function and Structure Prediction. bioRxiv 2024. [Paper] [GitHub] [Model (1.6B)]
- (scBERT) scBERT as a Large-scale Pretrained Deep Language Model for Cell Type Annotation of Single-cell RNA-seq Data. Nature Machine Intelligence 2022. [Paper] [GitHub]
- (scGPT) scGPT: Towards Building a Foundation Model for Single-Cell Multi-omics using Generative AI. Nature Methods 2024. [Paper] [GitHub]
- (scFoundation) Large Scale Foundation Model on Single-cell Transcriptomics. Nature Methods 2024. [Paper] [GitHub] [Model (100M)]
- (Geneformer) Transfer Learning Enables Predictions in Network Biology. Nature 2023. [Paper] [Model (10M)] [Model (40M)]
- (CellLM) Large-Scale Cell Representation Learning via Divide-and-Conquer Contrastive Learning. arXiv 2023. [Paper] [GitHub]
- (CellPLM) CellPLM: Pre-training of Cell Language Model Beyond Single Cells. ICLR 2024. [Paper] [GitHub] [Model (82M)]
- (scMulan) scMulan: A Multitask Generative Pre-trained Language Model for Single-Cell Analysis. RECOMB 2024. [Paper] [GitHub]
- (ClimateBERT) ClimateBERT: A Pretrained Language Model for Climate-Related Text. arXiv 2021. [Paper] [GitHub] [Model (82M)]
- (SpaBERT) SpaBERT: A Pretrained Language Model from Geographic Data for Geo-Entity Representation. EMNLP 2022 Findings. [Paper] [GitHub] [Model (Base)] [Model (Large)]
- (MGeo) MGeo: Multi-Modal Geographic Pre-training Method. SIGIR 2023. [Paper] [GitHub]
- (K2) K2: A Foundation Language Model for Geoscience Knowledge Understanding and Utilization. WSDM 2024. [Paper] [GitHub] [Model (7B)]
- (OceanGPT) OceanGPT: A Large Language Model for Ocean Science Tasks. ACL 2024. [Paper] [GitHub] [Model (7B)]
- (ClimateBERT-NetZero) ClimateBERT-NetZero: Detecting and Assessing Net Zero and Reduction Targets. EMNLP 2023. [Paper] [Model (82M)]
- (GeoLM) GeoLM: Empowering Language Models for Geospatially Grounded Language Understanding. EMNLP 2023. [Paper] [GitHub]
- (GeoGalactica) GeoGalactica: A Scientific Large Language Model in Geoscience. arXiv 2024. [Paper] [GitHub] [Model (30B)]
- (UrbanKGent) UrbanKGent: A Unified Large Language Model Agent Framework for Urban Knowledge Graph Construction. NeurIPS 2024. [Paper] [GitHub] [Model (7B, LLaMA-2)] [Model (13B, LLaMA-2)] [Model (8B, LLaMA-3)]
- (UrbanGPT) UrbanGPT: Spatio-Temporal Large Language Models. KDD 2024. [Paper] [GitHub] [Model (7B)]
- (JiuZhou) JiuZhou: Open Foundation Language Models and Effective Pre-training Framework for Geoscience. International Journal of Digital Earth 2025. [Paper] [GitHub] [Model (7B)]
- (ERNIE-GeoL) ERNIE-GeoL: A Geography-and-Language Pre-trained Model and its Applications in Baidu Maps. KDD 2022. [Paper]
- (PK-Chat) PK-Chat: Pointer Network Guided Knowledge Driven Generative Dialogue Model. arXiv 2023. [Paper] [GitHub]
- (GeoCLIP) GeoCLIP: CLIP-Inspired Alignment between Locations and Images for Effective Worldwide Geo-localization. NeurIPS 2023. [Paper] [GitHub]
- (UrbanCLIP) UrbanCLIP: Learning Text-Enhanced Urban Region Profiling with Contrastive Language-Image Pretraining from the Web. WWW 2024. [Paper] [GitHub]
- (FourCastNet) FourCastNet: A Global Data-driven High-resolution Weather Model using Adaptive Fourier Neural Operators. arXiv 2022. [Paper] [GitHub]
- (Pangu-Weather) Accurate Medium-Range Global Weather Forecasting with 3D Neural Networks. Nature 2023. [Paper] [GitHub]
- (GraphCast) Learning Skillful Medium-Range Global Weather Forecasting. Science 2023. [Paper] [GitHub]
- (ClimaX) ClimaX: A Foundation Model for Weather and Climate. ICML 2023. [Paper] [GitHub]
- (FengWu) FengWu: Pushing the Skillful Global Medium-Range Weather Forecast beyond 10 Days Lead. arXiv 2023. [Paper] [GitHub]
- (W-MAE) W-MAE: Pre-trained Weather Model with Masked Autoencoder for Multi-Variable Weather Forecasting. arXiv 2023. [Paper] [GitHub]
- (FuXi) FuXi: A Cascade Machine Learning Forecasting System for 15-day Global Weather Forecast. npj Climate and Atmospheric Science 2023. [Paper] [GitHub]
- (Stormer) Scaling Transformer Neural Networks for Skillful and Reliable Medium-Range Weather Forecasting. NeurIPS 2024. [Paper] [GitHub]
- (Aurora) A Foundation Model for the Earth System. arXiv 2024. [Paper] [GitHub]
- (Prithvi WxC) Prithvi WxC: Foundation Model for Weather and Climate. arXiv 2024. [Paper] [GitHub] [Model (2.3B)]
If you find this repository useful, please cite the following paper:
@inproceedings{zhang2024comprehensive,
  title={A Comprehensive Survey of Scientific Large Language Models and Their Applications in Scientific Discovery},
  author={Zhang, Yu and Chen, Xiusi and Jin, Bowen and Wang, Sheng and Ji, Shuiwang and Wang, Wei and Han, Jiawei},
  booktitle={EMNLP'24},
  pages={8783--8817},
  year={2024}
}