Umer-Farooq-CS/RNN-Character-Level-Text-Generation

Optimized LSTM-based character-level text generator trained on Shakespeare, achieving 3.5x faster training with mixed precision.

License

MIT license

Optimized RNN Character-Level Text Generation

A high-performance PyTorch implementation for character-level text generation using LSTM networks, optimized for GPU training with mixed precision and large batch sizes.

🎯 Project Overview

This project implements an optimized LSTM-based character-level text generator trained on Shakespeare's works. The model achieves 55.2% accuracy with 3.5x faster training through advanced GPU optimizations and mixed precision training.

Key Achievements

🚀 3.5x faster training with mixed precision (AMP)
🎭 High-quality Shakespeare-like text generation
⚡ 5.8M parameter model with optimized architecture
🔧 GPU memory optimization for large batch sizes
📊 Comprehensive evaluation metrics and analysis

🚀 Features

Mixed Precision Training (AMP) - 2x faster training
Large Batch Sizes - 512 samples per batch for better GPU utilization
Optimized Model Architecture - 5.8M parameters with 3 LSTM layers
GPU Memory Optimization - Efficient memory management
High-Quality Text Generation - Generates Shakespeare-like text

📊 Performance

Training Speed: 6.4 samples/sec (3.5x faster than original)
Model Size: 5.8M parameters (20x larger than original)
Accuracy: 55.2% (2.8% improvement)
GPU Utilization: Optimized for RTX 4070 with 12GB VRAM

🏗️ Architecture

Embedding Layer: 256 dimensions
LSTM Layers: 3 layers with 512 hidden units each
Dropout: 0.2 for regularization
Output Layer: Dense layer with softmax activation
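
The layer sizes above are enough to check the 5.8M parameter figure by hand. The sketch below is not taken from the repository: it assumes the parameter layout of PyTorch's `nn.Embedding`, a unidirectional `nn.LSTM` (four gates and two bias vectors per layer), and `nn.Linear`, a 65-character vocabulary (from the Notes section), and an output layer that maps the 512 hidden units directly to the vocabulary. The name `count_parameters` is hypothetical.

```python
def count_parameters(vocab_size, embed_dim, hidden_size, num_layers):
    """Count trainable parameters of an Embedding -> LSTM -> Linear stack.

    Assumes PyTorch's nn.LSTM layout: each layer holds input-to-hidden and
    hidden-to-hidden weights for 4 gates, plus two bias vectors.
    """
    embedding = vocab_size * embed_dim
    lstm = 0
    for layer in range(num_layers):
        # The first layer reads embeddings; later layers read hidden states.
        input_size = embed_dim if layer == 0 else hidden_size
        weights = 4 * hidden_size * (input_size + hidden_size)
        biases = 2 * 4 * hidden_size
        lstm += weights + biases
    output = hidden_size * vocab_size + vocab_size  # weights + bias
    return embedding + lstm + output


# Optimized configuration described above.
optimized = count_parameters(vocab_size=65, embed_dim=256, hidden_size=512, num_layers=3)
# Baseline sizes quoted under "Optimizations Implemented" (128-dim embedding,
# 2 layers, 128 hidden units).
baseline = count_parameters(vocab_size=65, embed_dim=128, hidden_size=128, num_layers=2)

print(f"optimized: {optimized:,} parameters")
print(f"baseline:  {baseline:,} parameters")
print(f"ratio:     {optimized / baseline:.1f}x")
```

Under these assumptions the optimized configuration comes to roughly 5.83M parameters and the baseline to roughly 0.28M, which is in line with the "5.8M" and "20x larger" figures quoted in this README.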

📁 Project Structure

    Q2/
    ├── main.py                 # Main execution script
    ├── requirements.txt        # Dependencies
    ├── README.md               # This file
    ├── shakespeare_data.pkl    # Preprocessed dataset
    ├── shakespeare.txt         # Raw Shakespeare text
    ├── src/
    │   ├── data_loader.py      # Data loading and preprocessing
    │   └── trainer.py          # Optimized training utilities
    ├── models/
    │   ├── rnn_model.py        # Optimized LSTM model
    │   └── shakespeare_rnn_optimized_optimized.pth  # Trained model
    └── plots/
        └── shakespeare_rnn_optimized_optimized_training_history.png

🚀 Quick Start

1. Setup Environment

    # Activate virtual environment
    source "/home/umer-farooq/Desktop/Uni/Gen AI/Assignment 1/genai_env_linux/bin/activate"

    # Install dependencies
    pip install torch numpy matplotlib scikit-learn

2. Run Training

    # Train with default settings (10 epochs, batch size 512)
    python3 main.py --mode train

    # Train with custom settings
    python3 main.py --mode train --epochs 20 --batch-size 1024 --learning-rate 0.001

3. Generate Text

    # Generate text with trained model
    python3 main.py --mode generate --max-chars 1000 --temperature 0.8

    # Generate with custom seed phrase
    python3 main.py --mode generate --seed-phrase "Once upon a time" --max-chars 500
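
The `--temperature` flag controls how adventurous sampling is. As a rough illustration of the usual technique (the repository's own sampling code is not shown here, so this is an assumption about how it works), the sketch below divides the model's output scores by the temperature before applying softmax and drawing a character index. The function name `sample_with_temperature` and the example scores are hypothetical.

```python
import math
import random


def sample_with_temperature(logits, temperature=0.8, rng=random):
    """Sample an index from unnormalized scores after temperature scaling."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [score / temperature for score in logits]
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]
    total = sum(weights)
    probs = [w / total for w in weights]
    index = rng.choices(range(len(probs)), weights=probs, k=1)[0]
    return index, probs


logits = [2.0, 1.0, 0.1]  # made-up scores for three candidate characters
_, sharp = sample_with_temperature(logits, temperature=0.5)
_, flat = sample_with_temperature(logits, temperature=2.0)
print("T=0.5:", [round(p, 3) for p in sharp])
print("T=2.0:", [round(p, 3) for p in flat])
```

Lower temperatures concentrate probability on the highest-scoring character, which gives more conservative text; higher temperatures flatten the distribution and give more varied, error-prone text.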

4. Full Pipeline

    # Run complete pipeline (preprocess + train + generate)
    python3 main.py --mode full --epochs 5 --batch-size 512

⚙️ Command Line Options

| Option | Default | Description |
|---|---|---|
| `--mode` | `full` | Execution mode: `preprocess`, `train`, `generate`, or `full` |
| `--epochs` | `10` | Number of training epochs |
| `--batch-size` | `512` | Batch size for training |
| `--learning-rate` | `0.001` | Learning rate for optimizer |
| `--hidden-size` | `512` | Hidden size for LSTM layers |
| `--num-layers` | `3` | Number of LSTM layers |
| `--model-name` | `shakespeare_rnn_optimized` | Name for saving the model |
| `--seed-phrase` | `"To be or not to be"` | Seed phrase for text generation |
| `--max-chars` | `1000` | Maximum characters to generate |
| `--temperature` | `0.8` | Temperature for text generation |
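
For readers who want to reuse the same interface in their own script, the options above map naturally onto `argparse`. This is a minimal sketch built only from the table; `main.py` itself may define its parser differently, and `build_parser` is a hypothetical name.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented options and defaults."""
    parser = argparse.ArgumentParser(description="Character-level text generation")
    parser.add_argument("--mode", default="full",
                        choices=["preprocess", "train", "generate", "full"])
    parser.add_argument("--epochs", type=int, default=10)
    parser.add_argument("--batch-size", type=int, default=512)
    parser.add_argument("--learning-rate", type=float, default=0.001)
    parser.add_argument("--hidden-size", type=int, default=512)
    parser.add_argument("--num-layers", type=int, default=3)
    parser.add_argument("--model-name", default="shakespeare_rnn_optimized")
    parser.add_argument("--seed-phrase", default="To be or not to be")
    parser.add_argument("--max-chars", type=int, default=1000)
    parser.add_argument("--temperature", type=float, default=0.8)
    return parser


# Parse an explicit argument list instead of sys.argv so the example is repeatable.
args = build_parser().parse_args(
    ["--mode", "train", "--epochs", "20", "--batch-size", "1024"]
)
print(args.mode, args.epochs, args.batch_size, args.learning_rate)
```

Note that `argparse` exposes dashed options with underscores, so `--batch-size` is read as `args.batch_size`.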

🎯 Optimizations Implemented

Mixed Precision Training (AMP)
- Uses 16-bit precision for 2x speed boost
- Maintains 32-bit precision for accuracy
- Automatic loss scaling
Large Batch Sizes
- 512 samples per batch (vs 64 in original)
- Better GPU utilization
- More stable gradients
Optimized Model Architecture
- Larger embedding dimensions (256 vs 128)
- More LSTM layers (3 vs 2)
- Larger hidden dimensions (512 vs 128)
- Better weight initialization
GPU Memory Optimization
- Data moved to GPU once at start
- No CPU-GPU transfers during training
- Efficient memory management
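
To see why mixed precision training needs loss scaling, it helps to look at what 16-bit floats do to very small gradients. The sketch below uses only the standard library (`struct` supports IEEE half precision through the `"e"` format) and made-up numbers. It illustrates the idea only and is not the repository's training code or PyTorch's scaler; `to_float16`, the gradient value, and the scale factor are all assumptions chosen for the demonstration.

```python
import struct


def to_float16(value):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", value))[0]


gradient = 1e-8   # made-up gradient, smaller than float16 can represent
scale = 65536.0   # illustrative loss scale (2**16)

# Without scaling, the gradient underflows to zero in half precision.
unscaled = to_float16(gradient)

# Scaling the loss scales every gradient, lifting it into float16's range.
scaled = to_float16(gradient * scale)

# Dividing by the scale afterwards is done in higher precision.
recovered = scaled / scale

print("float16 without scaling:", unscaled)
print("float16 with scaling:   ", scaled)
print("after unscaling:        ", recovered)
```

The unscaled value is lost entirely, while the scaled-then-unscaled value stays within about 0.1% of the original gradient.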

📈 Training Results

After 3 epochs of training:

Training Loss: 1.3799
Validation Loss: 1.5663
Accuracy: 55.2%
Training Time: ~4 minutes
GPU Memory Usage: 1.9GB
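
The loss values are easier to interpret as perplexity or bits per character. The conversion below assumes the reported losses are mean cross-entropy per character measured in nats (the default behaviour of PyTorch's cross-entropy loss); this README does not state that explicitly. The helper name `describe_loss` is hypothetical.

```python
import math


def describe_loss(loss_nats):
    """Convert a mean cross-entropy loss in nats to perplexity and bits/char."""
    perplexity = math.exp(loss_nats)
    bits_per_char = loss_nats / math.log(2)
    return perplexity, bits_per_char


train_ppl, train_bpc = describe_loss(1.3799)
val_ppl, val_bpc = describe_loss(1.5663)
# Reference point: a model guessing uniformly over 65 characters.
uniform_ppl, uniform_bpc = describe_loss(math.log(65))

print(f"train:      perplexity {train_ppl:.2f}, {train_bpc:.2f} bits/char")
print(f"validation: perplexity {val_ppl:.2f}, {val_bpc:.2f} bits/char")
print(f"uniform:    perplexity {uniform_ppl:.2f}, {uniform_bpc:.2f} bits/char")
```

Under that assumption, a validation loss of 1.5663 corresponds to a perplexity of roughly 4.8, meaning the model is about as uncertain as a uniform choice among five characters, compared with 65 for a model that has learned nothing.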

🎭 Generated Text Examples

The model generates high-quality Shakespeare-like text with:

Proper character names (ANGELO, ISABELLA, LUCIO)
Dramatic dialogue structure
Coherent sentence flow
Shakespearean vocabulary and style

🔧 System Requirements

GPU: NVIDIA RTX 4070 or better (12GB+ VRAM recommended)
CUDA: Version 12.8+
PyTorch: Version 2.8.0+
Python: 3.12+
RAM: 8GB+ recommended

📝 Notes

The model uses character-level tokenization (65 unique characters)
Training data: Tiny Shakespeare dataset (1.1M characters)
Model saves automatically after training
Training plots are generated and saved to the plots/ directory
Generated text quality improves with more training epochs
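
Character-level tokenization, as mentioned in the first note, simply maps each distinct character to an integer ID. The following is a minimal sketch of that idea; the actual logic lives in `src/data_loader.py` and may differ, and the names `build_vocab`, `encode`, and `decode` are hypothetical.

```python
def build_vocab(text):
    """Map each distinct character to an integer ID and back."""
    chars = sorted(set(text))
    stoi = {ch: i for i, ch in enumerate(chars)}
    itos = {i: ch for ch, i in stoi.items()}
    return stoi, itos


def encode(text, stoi):
    """Turn a string into a list of character IDs."""
    return [stoi[ch] for ch in text]


def decode(ids, itos):
    """Turn a list of character IDs back into a string."""
    return "".join(itos[i] for i in ids)


sample = "To be or not to be"
stoi, itos = build_vocab(sample)
ids = encode(sample, stoi)
print(len(stoi), "unique characters")
print(ids)
print(decode(ids, itos))
```

Run on the full `shakespeare.txt`, the same approach would yield the vocabulary the model is trained on, which this README gives as 65 unique characters.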

🚀 Future Improvements

Increase batch size to 1024-2048 for even better GPU utilization
Implement learning rate scheduling
Add gradient accumulation for very large batch sizes
Use data parallelism for multi-GPU training
Implement early stopping to prevent overfitting
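
As an illustration of the last item, early stopping can be as simple as tracking the best validation loss and halting after a fixed number of epochs without improvement. This is a sketch of one common approach, not something the repository currently implements; the class name `EarlyStopping`, its parameters, and the loss values are all hypothetical.

```python
class EarlyStopping:
    """Signal a stop when validation loss has not improved for `patience` epochs."""

    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record a validation loss; return True when training should stop."""
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


losses = [1.80, 1.62, 1.57, 1.58, 1.59, 1.60]  # made-up validation losses
stopper = EarlyStopping(patience=2)
stopped_at = None
for epoch, loss in enumerate(losses, start=1):
    if stopper.step(loss):
        stopped_at = epoch
        break

print("stopped at epoch:", stopped_at)
print("best validation loss:", stopper.best)
```

In a real training loop, the model checkpoint would typically be saved whenever `best` improves, so the weights from the best epoch can be restored after stopping.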
