Transformers 3rd Edition

Denis2054/Transformers-for-NLP-and-Computer-Vision-3rd-Edition

by Denis Rothman

This repo is continually updated and upgraded.

Last updated: August 14, 2025
📝 For details on updates and improvements, see the Changelog.

🚩 If you see anything that doesn't run as expected, raise an issue, and we'll work on it!

Look for 🐬 to explore new bonus notebooks, such as DeepSeek-R1 and OpenAI o1 reasoning models, Midjourney's API, Google Vertex AI Gemini's API, and OpenAI asynchronous batch API calls!
Look for 🎏 to explore existing notebooks for the latest model or platform releases, such as OpenAI's latest models (GPT-4o and o1).
Look for 🛠 to run existing notebooks with new dependency versions and platform API constraints and tweaks.

Transformers-for-NLP-and-Computer-Vision-3rd-Edition

This is the code repository for Transformers for Natural Language Processing and Computer Vision, published by Packt.

Explore Generative AI and Large Language Models with Hugging Face, ChatGPT, GPT-4V, and DALL-E 3

About the book

Transformers for Natural Language Processing and Computer Vision, Third Edition, explores Large Language Model (LLM) architectures, applications, and various platforms (Hugging Face, OpenAI, and Google Vertex AI) used for Natural Language Processing (NLP) and Computer Vision (CV).

Dive into generative vision transformers and multimodal model architectures and build applications, such as image and video-to-text classifiers. Go further by combining different models and platforms and learning about AI agent replication.

What you will learn

  • Learn how to pretrain and fine-tune LLMs
  • Learn how to work with multiple platforms, such as Hugging Face, OpenAI, and Google Vertex AI
  • Learn about different tokenizers and the best practices for preprocessing language data
  • Implement Retrieval Augmented Generation and rules bases to mitigate hallucinations
  • Visualize transformer model activity for deeper insights using BertViz, LIME, and SHAP
  • Create and implement cross-platform chained models, such as HuggingGPT
  • Go in-depth into vision transformers with CLIP, DALL-E 2, DALL-E 3, and GPT-4V

Table of Contents

Chapters

  1. What Are Transformers?
  2. Getting Started with the Architecture of the Transformer Model
  3. Emergent vs Downstream Tasks: The Unseen Depths of Transformers
  4. Advancements in Translations with Google Trax, Google Translate, and Gemini
  5. Diving into Fine-Tuning through BERT
  6. Pretraining a Transformer from Scratch through RoBERTa
  7. The Generative AI Revolution with ChatGPT
  8. Fine-Tuning OpenAI GPT Models
  9. Shattering the Black Box with Interpretable Tools
  10. Investigating the Role of Tokenizers in Shaping Transformer Models
  11. Leveraging LLM Embeddings as an Alternative to Fine-Tuning
  12. Toward Syntax-Free Semantic Role Labeling with ChatGPT and GPT-4
  13. Summarization with T5 and ChatGPT
  14. Exploring Cutting-Edge LLMs with Vertex AI and PaLM 2
  15. Guarding the Giants: Mitigating Risks in Large Language Models
  16. Beyond Text: Vision Transformers in the Dawn of Revolutionary AI
  17. Transcending the Image-Text Boundary with Stable Diffusion
  18. Hugging Face AutoTrain: Training Vision Models without Coding
  19. On the Road to Functional AGI with HuggingGPT and its Peers
  20. Beyond Human-Designed Prompts with Generative Ideation

Appendix

Appendix: Answers to the Questions

Platforms

You can run the notebooks directly from the table below:

Chapter notebooks, each available in Colab, Kaggle, Gradient, and SageMaker Studio Lab:
Part I: The Foundations of Transformer Models
Chapter 1: What are Transformers?
  • 🛠O_1_and_Accelerators.ipynb
  • ChatGPT_Plus_writes_and_explains_AI.ipynb
Getting started with DeepSeek-R1 Reasoning models. Integrated into HuggingFace Hub and Together.
  • 🐬DeepSeek_Hugging_Face.ipynb
Chapter 2: Getting Started with the Architecture of the Transformer Model
  • 🛠Multi_Head_Attention_Sub_Layer.ipynb
  • positional_encoding.ipynb
Explaining DeepSeek's Training innovations; Part 1: RL
  • 🐬DeepSeek_R1_Zero_RL.ipynb
Explaining DeepSeek's Training innovations; Part 2: RoPE
  • 🐬DeepSeek_attention_head_RoPE.ipynb
Chapter 3: Emergent vs Downstream Tasks: the Unseen Depths of Transformers
  • From_training_to_emergence.ipynb
  • Transformer_tasks_with_Hugging_Face.ipynb
Chapter 4: Advancements in Translations with Google Trax, Google Translate, and Google Bard
  • WMT_translations.ipynb
  • Trax_Google_Translate.ipynb
Chapter 5: Diving into Fine-Tuning through BERT
  • BERT_Fine_Tuning_Sentence_Classification_GPU.ipynb
Chapter 6: Pretraining a Transformer from Scratch through RoBERTa
  • 🎏 KantaiBERT.ipynb
  • 🎏🛠 Customer_Support_for_X.ipynb
Part II: The Rise of Suprahuman NLP
Chapter 7: The Generative AI Revolution with ChatGPT
  • OpenAI_Models.ipynb
  • OpenAI_GPT_4_Assistant.ipynb
  • 🎏Getting_Started_GPT_4_API.ipynb (GPT-4o)
  • 🎏GPT_4_RAG.ipynb (GPT-4o)
OpenAI Reasoning models: the o1 API
  • 🐬OpenAI_Reasoning_models_o1_API.ipynb
OpenAI Reasoning models: the o1-preview API
  • 🐬OpenAI_Reasoning_models_o3_API.ipynb
Chapter 8: Fine-tuning OpenAI Models
  • Fine_tuning_OpenAI_Models.ipynb
  • 🎏Fine_tuning_GPT_4o_mini_SQuAd.ipynb
Fine-Tuning GPT-4.1
  • 🎏Fine_tuning_GPT_4.1_mini_SQuAd.ipynb
🐬 RAG as an alternative to fine-tuning: Building Scalable Knowledge-based RAG-Driven Generative AI
Click here to access an open-source library to implement RAG
Chapter 9: Shattering the Black Box with Interpretable Tools
  • BertViz_Interactive.ipynb
  • Hugging_Face_SHAP.ipynb
Chapter 10: Investigating the Role of Tokenizers in Shaping Transformer Models
  • Tokenizers.ipynb
  • Sub_word_tokenizers.ipynb
  • 🛠Exploring_tokenizers.ipynb
Chapter 11: Leveraging LLM Embeddings as an Alternative to Fine-Tuning
  • 🛠Embedding_with_NLKT_Gensim.ipynb
  • 🎏Question_answering_with_embeddings.ipynb
  • 🛠Transfer_Learning_with_Ada_Embeddings.ipynb
Chapter 12: Towards Syntax-Free Semantic Role Labeling with BERT and OpenAI's ChatGPT
  • Semantic_Role_Labeling_GPT-4.ipynb
Chapter 13: Summarization with T5 and ChatGPT
  • 🛠Summerizing_Text_T5.ipynb
  • Summarizing_ChatGPT.ipynb
Chapter 14: Exploring Cutting-Edge NLP with Google Vertex AI (PaLM and 🐬 Gemini with gemini-1.5-flash-001)
  • Google_Vertex_AI.ipynb
  • 🐬Google_Vertex_AI_Gemini.ipynb
Gemini 2.5 Flash showcase of Generative AI tasks
  • 🐬Google_Gemini_2.5_Flash.ipynb
Chapter 15: Guarding the Giants: Mitigating Risks in Large Language Models
  • 🎏Auto_Big_bench.ipynb (GPT-4o, synchronous)
  • 🎏Auto_Big_bench.ipynb (GPT-4o-mini, synchronous)
  • 🐬GPT API Speed++ with Asynchronous Batch Calls!
  • 🛠WandB_Prompts_Quickstart.ipynb
  • Encoder_decoder_transformer.ipynb
  • Mitigating_Generative_AI.ipynb
Part III: Generative Computer Vision: A New Way to See the World
Chapter 16: Vision Transformers in the Dawn of Revolutionary AI
  • ViT_CLIP.ipynb
  • Getting_Started_DALL_E_API.ipynb
  • 🎏GPT-4V.ipynb (GPT-4o)
Chapter 17: Transcending the Image-Text Boundary with Stable Diffusion
  • Stable_Diffusion_Keras.ipynb
  • Stable__Vision_Stability_AI.ipynb
  • Stable__Vision_Stability_AI_Animation.ipynb
  • Text_to_video_synthesis.ipynb
  • TimeSformer.ipynb
Stable Diffusion with Hugging Face
  • 🐬Stable_Diffusion_Hugging_Face.ipynb
Chapter 18: Automated Vision Transformer Training
  • 🛠Hugging_Face_AutoTrain.ipynb
Chapter 19: On the Road to Functional AGI with HuggingGPT and its Peers
  • Computer_Vision_Analysis.ipynb
Chapter 20: Generative AI Ideation with Vertex AI, LangChain, and Stable Diffusion
  • Automated_Design.ipynb
  • Midjourney_bot.ipynb
  • 🎏Automated_Ideation.ipynb
  • 🐬 MyMidjourney_API.ipynb

Raise an issue

You can create an issue in this repository if you encounter one in the notebooks. We will be glad to provide support!

Get my copy

If you feel this book is for you, get your copy today!

Know more on the Discord server

You can get more engaged on the Discord server for the latest updates and discussions in the community at Discord.

Download a free PDF

If you have already purchased a print or Kindle version of this book, you can get a DRM-free PDF version at no cost. Simply click on the link to claim your Free PDF.

We also provide a PDF file that has color images of the screenshots/diagrams used in this book at ColorImages.

Get to Know the Author

Denis Rothman graduated from Sorbonne University and Paris-Cité University, designing one of the first patented encoding and embedding systems and teaching at Paris-I Panthéon Sorbonne. He authored one of the first patented word encoding and AI bots/robots. He began his career delivering a Natural Language Processing (NLP) chatbot for Moët et Chandon (LVMH) and an AI tactical defense optimizer for Airbus (formerly Aerospatiale). Denis then authored an AI optimizer for IBM and luxury brands, leading to an Advanced Planning and Scheduling (APS) solution used worldwide.

LinkedIn
