Optimized LSTM-based character-level text generator trained on Shakespeare, achieving 3.5x faster training with mixed precision.


Umer-Farooq-CS/RNN-Character-Level-Text-Generation

A high-performance PyTorch implementation for character-level text generation using LSTM networks, optimized for GPU training with mixed precision and large batch sizes.

🎯 Project Overview

This project implements an optimized LSTM-based character-level text generator trained on Shakespeare's works. The model achieves 55.2% accuracy and trains 3.5x faster than the original version through GPU optimizations and mixed precision training.

Key Achievements

  • 🚀 3.5x faster training with mixed precision (AMP)
  • 🎭 High-quality Shakespeare-like text generation
  • 5.8M parameter model with optimized architecture
  • 🔧 GPU memory optimization for large batch sizes
  • 📊 Comprehensive evaluation metrics and analysis

🚀 Features

  • Mixed Precision Training (AMP) - 2x faster training
  • Large Batch Sizes - 512 samples per batch for better GPU utilization
  • Optimized Model Architecture - 5.8M parameters with 3 LSTM layers
  • GPU Memory Optimization - Efficient memory management
  • High-Quality Text Generation - Generates Shakespeare-like text

📊 Performance

  • Training Speed: 6.4 samples/sec (3.5x faster than original)
  • Model Size: 5.8M parameters (20x larger than original)
  • Accuracy: 55.2% (2.8% improvement)
  • GPU Utilization: Optimized for RTX 4070 with 12GB VRAM

🏗️ Architecture

  • Embedding Layer: 256 dimensions
  • LSTM Layers: 3 layers with 512 hidden units each
  • Dropout: 0.2 for regularization
  • Output Layer: Dense layer with softmax activation
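
As a sanity check on the 5.8M figure, the parameter count can be worked out by hand from the layer sizes above. The sketch below assumes a vocabulary of 65 characters (from the Notes section), unidirectional LSTM layers laid out the way PyTorch's `nn.LSTM` stores them (four gates, two bias vectors per layer), and no weight sharing between the embedding and output layers. The README does not confirm these details, so treat the result as an estimate.

```python
def lstm_layer_params(input_size: int, hidden_size: int) -> int:
    """Parameter count of one unidirectional LSTM layer (PyTorch-style layout)."""
    gates = 4  # input, forget, cell, and output gates
    weights = gates * hidden_size * (input_size + hidden_size)
    biases = 2 * gates * hidden_size  # separate input-hidden and hidden-hidden biases
    return weights + biases


VOCAB_SIZE = 65    # assumed from the Notes section
EMBED_DIM = 256
HIDDEN_SIZE = 512
NUM_LAYERS = 3

embedding = VOCAB_SIZE * EMBED_DIM
lstm = sum(
    # Only the first layer sees the embedding; later layers see hidden states.
    lstm_layer_params(EMBED_DIM if layer == 0 else HIDDEN_SIZE, HIDDEN_SIZE)
    for layer in range(NUM_LAYERS)
)
output = HIDDEN_SIZE * VOCAB_SIZE + VOCAB_SIZE  # dense weights plus bias

total = embedding + lstm + output
print(f"{total:,} parameters (~{total / 1e6:.1f}M)")
```

Under these assumptions the total is 5,829,441, which rounds to the 5.8M quoted above. Nearly all of it sits in the three LSTM layers.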

📁 Project Structure

    Q2/
    ├── main.py                  # Main execution script
    ├── requirements.txt         # Dependencies
    ├── README.md                # This file
    ├── shakespeare_data.pkl     # Preprocessed dataset
    ├── shakespeare.txt          # Raw Shakespeare text
    ├── src/
    │   ├── data_loader.py       # Data loading and preprocessing
    │   └── trainer.py           # Optimized training utilities
    ├── models/
    │   ├── rnn_model.py         # Optimized LSTM model
    │   └── shakespeare_rnn_optimized_optimized.pth   # Trained model
    └── plots/
        └── shakespeare_rnn_optimized_optimized_training_history.png

🚀 Quick Start

1. Setup Environment

    # Activate virtual environment
    source "/home/umer-farooq/Desktop/Uni/Gen AI/Assignment 1/genai_env_linux/bin/activate"

    # Install dependencies
    pip install torch numpy matplotlib scikit-learn

2. Run Training

    # Train with default settings (10 epochs, batch size 512)
    python3 main.py --mode train

    # Train with custom settings
    python3 main.py --mode train --epochs 20 --batch-size 1024 --learning-rate 0.001

3. Generate Text

    # Generate text with trained model
    python3 main.py --mode generate --max-chars 1000 --temperature 0.8

    # Generate with custom seed phrase
    python3 main.py --mode generate --seed-phrase "Once upon a time" --max-chars 500
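
The `--temperature` option used above controls how adventurous sampling is. A common way to apply it, sketched below in plain Python, is to divide the model's output scores by the temperature before the softmax and then draw the next character from the resulting distribution. The function names and the example scores are hypothetical; the repository's generation code may work differently.

```python
import math
import random


def temperature_probs(logits, temperature=0.8):
    """Softmax over logits divided by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_index(logits, temperature=0.8, rng=random):
    """Draw one index according to the temperature-scaled distribution."""
    probs = temperature_probs(logits, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


logits = [2.0, 1.0, 0.1]  # made-up scores for three candidate characters
print(temperature_probs(logits, 0.8))  # sharper than the plain softmax
print(temperature_probs(logits, 2.0))  # flatter, more random
print(sample_index(logits, 0.8, random.Random(0)))
```

Temperatures below 1 concentrate probability on the most likely characters, which tends to give more conservative text. Temperatures above 1 flatten the distribution and give more varied but less coherent output.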

4. Full Pipeline

    # Run complete pipeline (preprocess + train + generate)
    python3 main.py --mode full --epochs 5 --batch-size 512

⚙️ Command Line Options

| Option | Default | Description |
|---|---|---|
| --mode | full | Execution mode: preprocess, train, generate, or full |
| --epochs | 10 | Number of training epochs |
| --batch-size | 512 | Batch size for training |
| --learning-rate | 0.001 | Learning rate for optimizer |
| --hidden-size | 512 | Hidden size for LSTM layers |
| --num-layers | 3 | Number of LSTM layers |
| --model-name | shakespeare_rnn_optimized | Name for saving the model |
| --seed-phrase | "To be or not to be" | Seed phrase for text generation |
| --max-chars | 1000 | Maximum characters to generate |
| --temperature | 0.8 | Temperature for text generation |
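
For readers who want to reproduce this interface, the options above map naturally onto Python's standard `argparse` module. The following is a minimal sketch built only from the table; `build_parser` is a hypothetical name, and the actual `main.py` may define its arguments differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the options table (illustrative only)."""
    parser = argparse.ArgumentParser(
        description="Character-level Shakespeare text generator"
    )
    parser.add_argument("--mode", default="full",
                        choices=["preprocess", "train", "generate", "full"],
                        help="Execution mode")
    parser.add_argument("--epochs", type=int, default=10)
    parser.add_argument("--batch-size", type=int, default=512)
    parser.add_argument("--learning-rate", type=float, default=0.001)
    parser.add_argument("--hidden-size", type=int, default=512)
    parser.add_argument("--num-layers", type=int, default=3)
    parser.add_argument("--model-name", default="shakespeare_rnn_optimized")
    parser.add_argument("--seed-phrase", default="To be or not to be")
    parser.add_argument("--max-chars", type=int, default=1000)
    parser.add_argument("--temperature", type=float, default=0.8)
    return parser


# argparse turns dashes into underscores: --batch-size becomes args.batch_size.
args = build_parser().parse_args(
    ["--mode", "train", "--epochs", "20", "--batch-size", "1024"]
)
print(args.mode, args.epochs, args.batch_size, args.learning_rate)
# prints: train 20 1024 0.001
```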

🎯 Optimizations Implemented

  1. Mixed Precision Training (AMP)

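    • Runs most forward and backward computation in 16-bit precision, for roughly a 2x speed boost
    • Keeps numerically sensitive operations in 32-bit precision to preserve accuracy
    • Applies automatic loss scaling so small gradients do not underflow in 16-bit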
  2. Large Batch Sizes

    • 512 samples per batch (vs 64 in original)
    • Better GPU utilization
    • More stable gradients
  3. Optimized Model Architecture

    • Larger embedding dimensions (256 vs 128)
    • More LSTM layers (3 vs 2)
    • Larger hidden dimensions (512 vs 128)
    • Better weight initialization
  4. GPU Memory Optimization

    • Data moved to GPU once at start
    • No CPU-GPU transfers during training
    • Efficient memory management

📈 Training Results

After 3 epochs of training:

  • Training Loss: 1.3799
  • Validation Loss: 1.5663
  • Accuracy: 55.2%
  • Training Time: ~4 minutes
  • GPU Memory Usage: 1.9GB
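
One way to read the loss figures above is to convert them to per-character perplexity and bits per character. The arithmetic below assumes the losses are mean per-character cross-entropy values in nats, which is the default behavior of PyTorch's `nn.CrossEntropyLoss`. The README does not state the loss function, so this interpretation is an assumption.

```python
import math

train_loss = 1.3799  # from the results above
val_loss = 1.5663
vocab_size = 65      # from the Notes section


def perplexity(loss_nats: float) -> float:
    """Effective number of equally likely characters the model is choosing among."""
    return math.exp(loss_nats)


def bits_per_char(loss_nats: float) -> float:
    """Convert a cross-entropy in nats to bits."""
    return loss_nats / math.log(2)


uniform_loss = math.log(vocab_size)  # loss of a model guessing uniformly at random

for name, loss in [("train", train_loss), ("validation", val_loss), ("uniform", uniform_loss)]:
    print(f"{name:>10}: perplexity {perplexity(loss):.2f}, {bits_per_char(loss):.2f} bits/char")
```

Under that assumption, a validation perplexity a little under 5 means the model is, on average, about as uncertain as if it were choosing among five equally likely characters, compared with 65 for a uniform guess.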

🎭 Generated Text Examples

The model generates high-quality Shakespeare-like text with:

  • Proper character names (ANGELO, ISABELLA, LUCIO)
  • Dramatic dialogue structure
  • Coherent sentence flow
  • Shakespearean vocabulary and style

🔧 System Requirements

  • GPU: NVIDIA RTX 4070 or better (12GB+ VRAM recommended)
  • CUDA: Version 12.8+
  • PyTorch: Version 2.8.0+
  • Python: 3.12+
  • RAM: 8GB+ recommended

📝 Notes

  • The model uses character-level tokenization (65 unique characters)
  • Training data: Tiny Shakespeare dataset (1.1M characters)
  • Model saves automatically after training
  • Training plots are generated and saved to the plots/ directory
  • Generated text quality improves with more training epochs
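
Character-level tokenization, mentioned in the first note, simply assigns one integer to each distinct character in the corpus. A minimal sketch follows, using the default seed phrase as a tiny stand-in corpus. The names `stoi`, `itos`, `encode`, and `decode` are illustrative and are not taken from `data_loader.py`.

```python
text = "To be or not to be"  # stand-in corpus; the real one is shakespeare.txt

# Vocabulary: every distinct character, in a stable sorted order.
chars = sorted(set(text))
stoi = {ch: i for i, ch in enumerate(chars)}  # character -> integer id
itos = {i: ch for ch, i in stoi.items()}      # integer id -> character


def encode(s: str) -> list[int]:
    """Map a string to a list of integer ids."""
    return [stoi[c] for c in s]


def decode(ids: list[int]) -> str:
    """Map a list of integer ids back to a string."""
    return "".join(itos[i] for i in ids)


ids = encode("not to be")
print(len(chars), ids, decode(ids))
```

Applied to the full Tiny Shakespeare text, the same procedure would produce the 65-character vocabulary noted above. Note that `encode` raises `KeyError` for any character outside the vocabulary, so seed phrases must use characters that appear in the training text.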

🚀 Future Improvements

  • Increase batch size to 1024-2048 for even better GPU utilization
  • Implement learning rate scheduling
  • Add gradient accumulation for very large batch sizes
  • Use data parallelism for multi-GPU training
  • Implement early stopping to prevent overfitting
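
Of the items above, early stopping is the simplest to prototype independently of the training code. The sketch below is one possible shape for it: training halts once the validation loss has failed to improve for `patience` consecutive epochs. The `EarlyStopping` class and the loss values are made up for illustration and are not part of the repository.

```python
class EarlyStopping:
    """Signal a stop after `patience` epochs without validation improvement."""

    def __init__(self, patience: int = 3, min_delta: float = 0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss: float) -> bool:
        """Record one epoch's validation loss; return True when training should stop."""
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Made-up validation losses, for illustration only.
history = [1.90, 1.70, 1.60, 1.61, 1.62, 1.63, 1.50]
stopper = EarlyStopping(patience=3)
stopped_at = None
for epoch, val_loss in enumerate(history, start=1):
    if stopper.step(val_loss):
        stopped_at = epoch
        break

print(stopped_at, stopper.best)
# prints: 6 1.6
```

In a real training loop the model checkpoint would normally be saved whenever `best` improves, so that the weights from the best epoch can be restored after stopping.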
