Optimized LSTM-based character-level text generator trained on Shakespeare, achieving 3.5x faster training with mixed precision.
Umer-Farooq-CS/RNN-Character-Level-Text-Generation
A high-performance PyTorch implementation for character-level text generation using LSTM networks, optimized for GPU training with mixed precision and large batch sizes.
This project implements an optimized LSTM-based character-level text generator trained on Shakespeare's works. The model achieves 55.2% accuracy with 3.5x faster training through advanced GPU optimizations and mixed precision training.
- 🚀 3.5x faster training with mixed precision (AMP)
- 🎭 High-quality Shakespeare-like text generation
- ⚡ 5.8M parameter model with optimized architecture
- 🔧 GPU memory optimization for large batch sizes
- 📊 Comprehensive evaluation metrics and analysis
- Features
- Performance
- Architecture
- Project Structure
- Quick Start
- Command Line Options
- Optimizations
- Training Results
- Generated Text Examples
- System Requirements
- Notes
- Future Improvements

Features
- Mixed Precision Training (AMP) - 2x faster training
- Large Batch Sizes - 512 samples per batch for better GPU utilization
- Optimized Model Architecture - 5.8M parameters with 3 LSTM layers
- GPU Memory Optimization - Efficient memory management
- High-Quality Text Generation - Generates Shakespeare-like text

Performance
- Training Speed: 6.4 samples/sec (3.5x faster than original)
- Model Size: 5.8M parameters (20x larger than original)
- Accuracy: 55.2% (2.8% improvement)
- GPU Utilization: Optimized for RTX 4070 with 12GB VRAM

Architecture
- Embedding Layer: 256 dimensions
- LSTM Layers: 3 layers with 512 hidden units each
- Dropout: 0.2 for regularization
- Output Layer: Dense layer with softmax activation
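
As a sanity check on the 5.8M figure, the layer sizes listed above can be turned into a parameter count by hand. The sketch below is plain arithmetic, not the project's model code. It assumes a vocabulary of 65 characters (the figure given in the Notes), PyTorch's LSTM parameter layout (four gates, each with an input-to-hidden matrix, a hidden-to-hidden matrix, and two bias vectors), and no weight sharing between the embedding and output layers. The function names are hypothetical.

```python
def lstm_layer_params(input_size: int, hidden_size: int) -> int:
    """Parameters in one LSTM layer, assuming PyTorch's layout:
    4 gates x (input-to-hidden weights + hidden-to-hidden weights + 2 biases)."""
    per_gate = input_size * hidden_size + hidden_size * hidden_size + 2 * hidden_size
    return 4 * per_gate


def count_parameters(vocab_size=65, embed_dim=256, hidden_size=512, num_layers=3):
    """Break the parameter count down by component."""
    embedding = vocab_size * embed_dim
    # The first LSTM layer reads embeddings; later layers read hidden states.
    lstm = lstm_layer_params(embed_dim, hidden_size)
    lstm += (num_layers - 1) * lstm_layer_params(hidden_size, hidden_size)
    # Dense output layer: weight matrix plus bias.
    output = hidden_size * vocab_size + vocab_size
    return {
        "embedding": embedding,
        "lstm": lstm,
        "output": output,
        "total": embedding + lstm + output,
    }


counts = count_parameters()
for name, value in counts.items():
    print(f"{name:>9}: {value:,}")
```

Under these assumptions the total comes to about 5.83M, consistent with the stated 5.8M, and almost all of it sits in the three LSTM layers.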

Project Structure

    Q2/
    ├── main.py                 # Main execution script
    ├── requirements.txt        # Dependencies
    ├── README.md               # This file
    ├── shakespeare_data.pkl    # Preprocessed dataset
    ├── shakespeare.txt         # Raw Shakespeare text
    ├── src/
    │   ├── data_loader.py      # Data loading and preprocessing
    │   └── trainer.py          # Optimized training utilities
    ├── models/
    │   ├── rnn_model.py        # Optimized LSTM model
    │   └── shakespeare_rnn_optimized_optimized.pth    # Trained model
    └── plots/
        └── shakespeare_rnn_optimized_optimized_training_history.png

Quick Start

    # Activate virtual environment
    source "/home/umer-farooq/Desktop/Uni/Gen AI/Assignment 1/genai_env_linux/bin/activate"
    # Install dependencies
    pip install torch numpy matplotlib scikit-learn

    # Train with default settings (10 epochs, batch size 512)
    python3 main.py --mode train
    # Train with custom settings
    python3 main.py --mode train --epochs 20 --batch-size 1024 --learning-rate 0.001

    # Generate text with trained model
    python3 main.py --mode generate --max-chars 1000 --temperature 0.8
    # Generate with custom seed phrase
    python3 main.py --mode generate --seed-phrase "Once upon a time" --max-chars 500
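
The `--temperature` flag controls how adventurous generation is. The sketch below shows the usual technique, dividing the logits by the temperature before the softmax, in plain Python. It illustrates the general idea and is not taken from the project's generation code. The function name and the example logits are hypothetical.

```python
import math
import random


def sample_with_temperature(logits, temperature=0.8, rng=random):
    """Sample an index from logits after temperature scaling.

    Returns the sampled index and the probabilities it was drawn from.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]
    total = sum(weights)
    probs = [w / total for w in weights]
    index = rng.choices(range(len(probs)), weights=probs, k=1)[0]
    return index, probs


example_logits = [2.0, 1.0, 0.1]  # made-up scores for three characters
_, sharp = sample_with_temperature(example_logits, temperature=0.5)
_, base = sample_with_temperature(example_logits, temperature=1.0)
_, flat = sample_with_temperature(example_logits, temperature=2.0)
print("T=0.5:", [round(p, 3) for p in sharp])
print("T=1.0:", [round(p, 3) for p in base])
print("T=2.0:", [round(p, 3) for p in flat])
```

Temperatures below 1 concentrate probability on the most likely character, which gives safer and more repetitive text. Temperatures above 1 flatten the distribution, which gives more varied and more error-prone text.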

    # Run complete pipeline (preprocess + train + generate)
    python3 main.py --mode full --epochs 5 --batch-size 512

Command Line Options

| Option | Default | Description |
|---|---|---|
| --mode | full | Execution mode: preprocess, train, generate, or full |
| --epochs | 10 | Number of training epochs |
| --batch-size | 512 | Batch size for training |
| --learning-rate | 0.001 | Learning rate for optimizer |
| --hidden-size | 512 | Hidden size for LSTM layers |
| --num-layers | 3 | Number of LSTM layers |
| --model-name | shakespeare_rnn_optimized | Name for saving the model |
| --seed-phrase | "To be or not to be" | Seed phrase for text generation |
| --max-chars | 1000 | Maximum characters to generate |
| --temperature | 0.8 | Temperature for text generation |
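
For readers who want to mirror this interface in their own script, the table maps directly onto `argparse`. The sketch below reproduces the documented flags and defaults. It is an assumption about how such a parser could look, not a copy of `main.py`, and the `build_parser` name and help strings are hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with the flags and defaults listed in the options table."""
    parser = argparse.ArgumentParser(
        description="Character-level Shakespeare text generator (illustrative CLI)"
    )
    parser.add_argument("--mode", default="full",
                        choices=["preprocess", "train", "generate", "full"],
                        help="Execution mode")
    parser.add_argument("--epochs", type=int, default=10,
                        help="Number of training epochs")
    parser.add_argument("--batch-size", type=int, default=512,
                        help="Batch size for training")
    parser.add_argument("--learning-rate", type=float, default=0.001,
                        help="Learning rate for optimizer")
    parser.add_argument("--hidden-size", type=int, default=512,
                        help="Hidden size for LSTM layers")
    parser.add_argument("--num-layers", type=int, default=3,
                        help="Number of LSTM layers")
    parser.add_argument("--model-name", default="shakespeare_rnn_optimized",
                        help="Name for saving the model")
    parser.add_argument("--seed-phrase", default="To be or not to be",
                        help="Seed phrase for text generation")
    parser.add_argument("--max-chars", type=int, default=1000,
                        help="Maximum characters to generate")
    parser.add_argument("--temperature", type=float, default=0.8,
                        help="Temperature for text generation")
    return parser


# Parse an explicit argument list so the example runs without a real command line.
args = build_parser().parse_args(["--mode", "train", "--epochs", "20", "--batch-size", "1024"])
print(args.mode, args.epochs, args.batch_size, args.learning_rate)
```

`argparse` converts dashes in flag names to underscores, so `--batch-size` is read back as `args.batch_size`.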

Optimizations

Mixed Precision Training (AMP)
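- Runs most forward and backward computation in 16-bit precision for a roughly 2x speed boost
- Keeps model weights and precision-sensitive operations in 32-bit to preserve accuracy
- Applies automatic loss scaling so that small gradients do not underflow in 16-bit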
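
Why loss scaling matters can be shown without a GPU. The sketch below uses Python's `struct` module, which supports IEEE 754 half precision, to round a small gradient value to 16 bits with and without scaling. It is a framework-free illustration of the idea, not the project's training loop. The gradient value is made up, and the scale of 65536 is an assumption chosen to match the initial scale that PyTorch's `GradScaler` uses by default.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


tiny_grad = 1e-8   # made-up gradient, below the smallest positive fp16 value
scale = 65536.0    # 2**16, assumed loss scale

# Without scaling, the gradient is rounded to zero and the update is lost.
unscaled = to_fp16(tiny_grad)

# With scaling, the value is large enough to be represented in fp16.
scaled = to_fp16(tiny_grad * scale)

# Unscaling happens in higher precision (Python floats here, fp32 in practice).
recovered = scaled / scale

print("fp16 without scaling:", unscaled)
print("fp16 with scaling:   ", scaled)
print("recovered gradient:  ", recovered)
```

Multiplying the loss by a constant scales every gradient by the same factor, so dividing the gradients by that constant before the optimizer step leaves the update unchanged apart from rounding.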
Large Batch Sizes
- 512 samples per batch (vs 64 in original)
- Better GPU utilization
- More stable gradients
Optimized Model Architecture
- Larger embedding dimensions (256 vs 128)
- More LSTM layers (3 vs 2)
- Larger hidden dimensions (512 vs 128)
- Better weight initialization
GPU Memory Optimization
- Data moved to GPU once at start
- No CPU-GPU transfers during training
- Efficient memory management

Training Results
After 3 epochs of training:
- Training Loss: 1.3799
- Validation Loss: 1.5663
- Accuracy: 55.2%
- Training Time: ~4 minutes
- GPU Memory Usage: 1.9GB
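
The reported losses are easier to interpret as perplexities. The short calculation below converts them, under the assumption that both losses are mean cross-entropy per character measured in nats (the default for PyTorch's cross-entropy loss). The vocabulary size of 65 comes from the Notes. The derived numbers are not stated in the original results.

```python
import math

train_loss = 1.3799   # reported training loss
val_loss = 1.5663     # reported validation loss
vocab_size = 65       # unique characters, per the Notes

# Perplexity is the exponential of the mean cross-entropy (in nats).
train_ppl = math.exp(train_loss)
val_ppl = math.exp(val_loss)

# Loss of a model that guesses uniformly over the vocabulary, for reference.
uniform_loss = math.log(vocab_size)

# The same validation loss expressed in bits per character.
val_bits = val_loss / math.log(2)

print(f"train perplexity:      {train_ppl:.2f}")
print(f"validation perplexity: {val_ppl:.2f}")
print(f"uniform-guess loss:    {uniform_loss:.2f} nats")
print(f"validation loss:       {val_bits:.2f} bits/char")
```

A validation perplexity near 4.8 means the model is, on average, about as uncertain as a uniform choice among roughly five characters, compared with 65 for a model that has learned nothing.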

Generated Text Examples

The model generates high-quality Shakespeare-like text with:
- Proper character names (ANGELO, ISABELLA, LUCIO)
- Dramatic dialogue structure
- Coherent sentence flow
- Shakespearean vocabulary and style

System Requirements
- GPU: NVIDIA RTX 4070 or better (12GB+ VRAM recommended)
- CUDA: Version 12.8+
- PyTorch: Version 2.8.0+
- Python: 3.12+
- RAM: 8GB+ recommended

Notes
- The model uses character-level tokenization (65 unique characters)
- Training data: Tiny Shakespeare dataset (1.1M characters)
- Model saves automatically after training
- Training plots are generated and saved to the plots/ directory
- Generated text quality improves with more training epochs
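
Character-level tokenization simply assigns one integer to each distinct character. The sketch below shows one common way to build such a mapping. It is an illustration, not the contents of `src/data_loader.py`, and the function names are hypothetical. On the full Tiny Shakespeare text the same procedure would yield the 65-entry vocabulary mentioned above. The short sample here has far fewer characters.

```python
def build_vocab(text: str):
    """Map each distinct character to an integer id, and back."""
    chars = sorted(set(text))
    stoi = {ch: i for i, ch in enumerate(chars)}
    itos = {i: ch for ch, i in stoi.items()}
    return stoi, itos


def encode(text: str, stoi: dict) -> list:
    """Convert a string to a list of character ids."""
    return [stoi[ch] for ch in text]


def decode(ids: list, itos: dict) -> str:
    """Convert a list of character ids back to a string."""
    return "".join(itos[i] for i in ids)


sample = "To be or not to be"
stoi, itos = build_vocab(sample)
ids = encode(sample, stoi)
print("vocabulary size:", len(stoi))
print("encoded:", ids)
print("decoded:", decode(ids, itos))
```

Encoding raises a `KeyError` for characters that were not in the text used to build the vocabulary, so a seed phrase should only use characters that appear in the training data.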

Future Improvements

- Increase batch size to 1024-2048 for even better GPU utilization
- Implement learning rate scheduling
- Add gradient accumulation for very large batch sizes
- Use data parallelism for multi-GPU training
- Implement early stopping to prevent overfitting
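
Of the items above, early stopping is the simplest to sketch. The class below stops training once the validation loss has failed to improve for a set number of epochs. It is one possible design, not something the project currently implements. The class name, the `patience` and `min_delta` parameters, and the loss history are all hypothetical.

```python
class EarlyStopping:
    """Signal a stop when validation loss has not improved for `patience` epochs."""

    def __init__(self, patience: int = 3, min_delta: float = 0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best_loss = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss: float) -> bool:
        """Record one epoch's validation loss; return True if training should stop."""
        if val_loss < self.best_loss - self.min_delta:
            # Improvement: remember it and reset the counter.
            self.best_loss = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


stopper = EarlyStopping(patience=2)
history = [1.90, 1.70, 1.60, 1.62, 1.61, 1.59]  # made-up validation losses
stopped_at = None
for epoch, loss in enumerate(history, start=1):
    if stopper.step(loss):
        stopped_at = epoch
        break

print("stopped at epoch:", stopped_at)
print("best validation loss:", stopper.best_loss)
```

In a real training loop the model checkpoint would typically be saved whenever `best_loss` improves, so that the best weights can be restored after stopping.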