Pinned
- Arxiv-Papers-Recommendations-using-RAG
(Public) A retrieval-augmented generation (RAG) pipeline using Elasticsearch & FAISS to index 3,000+ arXiv papers, improving search efficiency by 35%. Deployed on AWS (S3, DynamoDB) with Docker & CI/CD, ens…
Jupyter Notebook
- Efficient-Query-Processing-on-Distributed-Systems-
(Public) Designed a scalable ETL pipeline using Docker, Neo4j, Kafka, Hadoop, and Kubernetes to process 1.5B+ rows (~50 GB). Configured Minikube for Kubernetes orchestration, optimizing resource allocation &…
Python
- Log-based-System-Anomaly-Detection-
(Public) Developed a log anomaly detection pipeline for 575K+ log blocks using Drain, Sentence-BERT, and Bi-LSTM, achieving a 98.32% F1-score. Automated data preprocessing & model training (80% reduction in m…
Python
- Scalable-Data-Ingestion-Query-Optimization-for-Large-Scale-Reddit-Data
(Public) Built a high-performance ETL pipeline to process 10M+ Reddit comments into PostgreSQL, optimizing ingestion with pg_bulkload (insertion time under 300 s). Designed optimized SQL queries for subreddi…
Shell