Hi, I'm a machine learning engineer focusing on NLP, recommender systems, and quantitative finance.
I have experience in machine learning, backend engineering, and data engineering. These are the tech stacks I have used:
- Programming Languages: Python, C++, Java (only for data engineering), SQL, JavaScript, Shell, Rust (a little)
- Frameworks, Libs or Tools: Pulsar, Milvus, gRPC, Spark, Hive, K8S, PyTorch, FAISS, Redis, Flask, TensorFlow (a long time ago)
Here is my project index:
Side Projects
- simpler-distil-whisper: Reproduces the paper Distil-Whisper: Robust Knowledge Distillation via Large-Scale Pseudo Labelling.
- PLM-ICD-multi-label-classifier: Reproduces the paper PLM-ICD: Automatic ICD Coding with Pretrained Language Models.
- SlimPajama-DC Data Deduplicator: Reproduces the paper SlimPajama-DC: Understanding Data Combinations for LLM Training.
- feather: A C++ feature-hash lib with a Python binding provided.
- osimhash: A Python binding over the simhash C++ text deduplication lib.
- pypack: Generates a Python runtime tar.gz file (for PySpark) that is runnable across Python versions, operating systems, and platforms.
Code Reading
- fastTextAnnotation: A very detailed code annotation of Facebook's fastText lib.
- hnswlibAnnotation: A very detailed code annotation of hnswlib.
- finBERT: BERT for financial news sentiment classification.
Personal Use
- quicmd: Some useful commands for quick execution.
- config4: Some personal configs, currently for tmux and vim.
- wiki4codes: Records of trials, demos, and examples for various libs, frameworks, algorithms, and models.