Transformers for Natural Language Processing and Computer Vision: Take Generative AI and LLMs to the next level with Hugging Face, Google Vertex AI, ChatGPT, GPT-4V, and DALL-E 3, Third Edition
This repo is continually updated and upgraded.
Last updated: August 14, 2025
📝 For details on updates and improvements, see the Changelog.
🚩 If you see anything that doesn't run as expected, raise an issue, and we'll work on it!
Look for 🐬 to explore new bonus notebooks, such as DeepSeek-R1 and OpenAI o1 reasoning models, Midjourney's API, Google Vertex AI Gemini's API, and OpenAI asynchronous batch API calls!
Look for 🎏 to explore existing notebooks updated for the latest model or platform releases, such as OpenAI's latest models (GPT-4o and o1).
Look for 🛠 to run existing notebooks with new dependency versions and platform API constraints and tweaks.
This is the code repository for Transformers for Natural Language Processing and Computer Vision, published by Packt.
Explore Generative AI and Large Language Models with Hugging Face, ChatGPT, GPT-4V, and DALL-E 3
Transformers for Natural Language Processing and Computer Vision, Third Edition, explores Large Language Model (LLM) architectures, applications, and various platforms (Hugging Face, OpenAI, and Google Vertex AI) used for Natural Language Processing (NLP) and Computer Vision (CV).
Dive into generative vision transformers and multimodal model architectures and build applications, such as image and video-to-text classifiers. Go further by combining different models and platforms and learning about AI agent replication.
- Learn how to pretrain and fine-tune LLMs
- Learn how to work with multiple platforms, such as Hugging Face, OpenAI, and Google Vertex AI
- Learn about different tokenizers and the best practices for preprocessing language data
- Implement Retrieval-Augmented Generation (RAG) and rule bases to mitigate hallucinations
- Visualize transformer model activity for deeper insights using BertViz, LIME, and SHAP
- Create and implement cross-platform chained models, such as HuggingGPT
- Go in-depth into vision transformers with CLIP, DALL-E 2, DALL-E 3, and GPT-4V
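To make the RAG idea above concrete, here is a minimal sketch of the retrieval step. It is a hypothetical toy example: bag-of-words overlap scoring stands in for the embedding-based retrieval and vector stores the book's notebooks actually use, and all names (`score`, `retrieve`, the sample documents) are illustrative only.

```python
# Toy sketch of the retrieval step in Retrieval-Augmented Generation (RAG).
# Real pipelines score query/document similarity with embeddings; here a
# simple shared-word count stands in so the example stays self-contained.
from collections import Counter

def score(query: str, doc: str) -> int:
    """Count word occurrences shared between the query and a document."""
    q, d = Counter(query.lower().split()), Counter(doc.lower().split())
    return sum(min(q[w], d[w]) for w in q)

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Return the k documents most relevant to the query."""
    return sorted(docs, key=lambda d: score(query, d), reverse=True)[:k]

docs = [
    "Transformers use self-attention to process sequences in parallel.",
    "RAG grounds a language model's answer in retrieved documents.",
    "Stable Diffusion generates images from text prompts.",
]
context = retrieve("How does RAG ground a model's answer?", docs)
# The retrieved passages are prepended to the prompt sent to the LLM,
# so the model answers from the supplied context instead of memory alone.
prompt = "Answer using this context:\n" + "\n".join(context)
```

The design point RAG relies on is exactly this split: retrieval narrows the corpus to a few relevant passages, and generation is constrained to them, which is why it mitigates hallucinations without retraining the model.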
- What Are Transformers?
- Getting Started with the Architecture of the Transformer Model
- Emergent vs Downstream Tasks: The Unseen Depths of Transformers
- Advancements in Translations with Google Trax, Google Translate, and Gemini
- Diving into Fine-Tuning through BERT
- Pretraining a Transformer from Scratch through RoBERTa
- The Generative AI Revolution with ChatGPT
- Fine-Tuning OpenAI GPT Models
- Shattering the Black Box with Interpretable Tools
- Investigating the Role of Tokenizers in Shaping Transformer Models
- Leveraging LLM Embeddings as an Alternative to Fine-Tuning
- Toward Syntax-Free Semantic Role Labeling with ChatGPT and GPT-4
- Summarization with T5 and ChatGPT
- Exploring Cutting-Edge LLMs with Vertex AI and PaLM 2
- Guarding the Giants: Mitigating Risks in Large Language Models
- Beyond Text: Vision Transformers in the Dawn of Revolutionary AI
- Transcending the Image-Text Boundary with Stable Diffusion
- Hugging Face AutoTrain: Training Vision Models without Coding
- On the Road to Functional AGI with HuggingGPT and its Peers
- Beyond Human-Designed Prompts with Generative Ideation
Appendix: Answers to the Questions
You can run the notebooks directly from the table below:
| Chapter | Colab | Kaggle | Gradient | StudioLab |
|---|---|---|---|---|
| Part I: The Foundations of Transformer Models | | | | |
| Chapter 1: What Are Transformers? | | | | |
| 🐬 Getting started with DeepSeek-R1 reasoning models, integrated into the Hugging Face Hub and Together | | | | |
| Chapter 2: Getting Started with the Architecture of the Transformer Model | | | | |
| Explaining DeepSeek's training innovations, Part 1: RL | | | | |
| Explaining DeepSeek's training innovations, Part 2: RoPE | | | | |
| Chapter 3: Emergent vs Downstream Tasks: The Unseen Depths of Transformers | | | | |
| Chapter 4: Advancements in Translations with Google Trax, Google Translate, and Google Bard | | | | |
| Chapter 5: Diving into Fine-Tuning through BERT | | | | |
| Chapter 6: Pretraining a Transformer from Scratch through RoBERTa | | | | |
| Part II: The Rise of Suprahuman NLP | | | | |
| Chapter 7: The Generative AI Revolution with ChatGPT | | | | |
| OpenAI reasoning models: the o1 API | | | | |
| OpenAI reasoning models: the o1-preview API | | | | |
| Chapter 8: Fine-Tuning OpenAI Models | | | | |
| Fine-Tuning GPT-4.1 | | | | |
| 🐬 RAG as an alternative to fine-tuning: Building Scalable Knowledge-Based RAG-Driven Generative AI | | | | |
| Click here to access an open-source library to implement RAG | | | | |
| Chapter 9: Shattering the Black Box with Interpretable Tools | | | | |
| Chapter 10: Investigating the Role of Tokenizers in Shaping Transformer Models | | | | |
| Chapter 11: Leveraging LLM Embeddings as an Alternative to Fine-Tuning | | | | |
| Chapter 12: Towards Syntax-Free Semantic Role Labeling with BERT and OpenAI's ChatGPT | | | | |
| Chapter 13: Summarization with T5 and ChatGPT | | | | |
| Chapter 14: Exploring Cutting-Edge NLP with Google Vertex AI (PaLM and 🐬 Gemini with gemini-1.5-flash-001) | | | | |
| Gemini 2.5 Flash showcase of generative AI tasks | | | | |
| Chapter 15: Guarding the Giants: Mitigating Risks in Large Language Models | | | | |
| Part III: Generative Computer Vision: A New Way to See the World | | | | |
| Chapter 16: Vision Transformers in the Dawn of Revolutionary AI | | | | |
| Chapter 17: Transcending the Image-Text Boundary with Stable Diffusion | | | | |
| Stable Diffusion with Hugging Face | | | | |
| Chapter 18: Automated Vision Transformer Training | | | | |
| Chapter 19: On the Road to Functional AGI with HuggingGPT and its Peers | | | | |
| Chapter 20: Generative AI Ideation with Vertex AI, LangChain, and Stable Diffusion | | | | |
You can create an issue in this repository if you encounter one in the notebooks. We will be glad to provide support!
If you feel this book is for you, get your copy today!
You can get more engaged on the Discord server for the latest updates and community discussions: Discord
If you have already purchased a print or Kindle version of this book, you can get a DRM-free PDF version at no cost. Simply click on the link to claim your Free PDF.
We also provide a PDF file that has color images of the screenshots/diagrams used in this book at ColorImages.
Denis Rothman graduated from Sorbonne University and Paris-Cité University, designing one of the first patented encoding and embedding systems and teaching at Paris-I Panthéon Sorbonne. He authored one of the first patented word encoding systems and AI bots/robots. He began his career delivering a Natural Language Processing (NLP) chatbot for Moët et Chandon (LVMH) and an AI tactical defense optimizer for Airbus (formerly Aerospatiale). Denis then authored an AI optimizer for IBM and luxury brands, leading to an Advanced Planning and Scheduling (APS) solution used worldwide. LinkedIn