data-processing-pipelines
Here are 18 public repositories matching this topic...
Scalable data preprocessing and curation toolkit for LLMs
- Updated Jul 18, 2025 - Python
convtools is a specialized Python library for dynamic, declarative data transformations with automatic code generation
- Updated Mar 18, 2025 - Python
Visual AI development framework for training and inference of ML models, scaling pipelines, and automating workflows with Python. ⭐ Leave a star to support us!
- Updated Apr 11, 2025 - Python
Making it easier to navigate and clean TAHMO weather station data for ML development
- Updated Sep 7, 2024 - Python
A simplistic, general-purpose pipeline framework.
- Updated Jul 21, 2022 - Python
Artifician is an event-driven framework designed to simplify and accelerate the process of preparing datasets for Artificial Intelligence models.
- Updated Jan 30, 2024 - Python
A pipeline that consumes Twitter data to extract meaningful insights about a variety of topics, built with the Twitter API, Kafka, MongoDB, and Tableau.
- Updated Aug 2, 2021 - Python
Understanding the customer life cycle; acquiring customer data; applying big data concepts to your customer relationships; finding high-propensity prospects; upselling by identifying related products and interests; generating customer loyalty by discovering response patterns; predicting customer lifetime value (CLV); identifying dissatisfied customers …
- Updated Oct 3, 2020 - Jupyter Notebook
Homework assignments for MFF UK course NDBI046 - Introduction to Data Engineering
- Updated May 23, 2023 - TypeScript
An open-source Python library for developing and running end-to-end AI pipelines for time series analysis
- Updated Feb 1, 2024 - Jupyter Notebook
The Resume Application Tracking System uses Google Gemini Pro Vision to automatically parse, analyze, and categorize resumes for efficient recruitment. It integrates AI-driven vision capabilities to enhance resume processing and candidate selection.
- Updated Feb 13, 2025 - Python
Notebooks from finance, general practice and Jovian courses on data analysis, ML and DL
- Updated Mar 1, 2024 - Jupyter Notebook
🎢 IaaS visual editor to create & deploy data processing pipelines - python, rmq, react, meteorjs
- Updated Jan 28, 2025 - JavaScript
Dataset
- Updated Jan 13, 2022 - Jupyter Notebook
A PySpark machine learning model that classifies whether a bank customer will churn, with an accuracy of more than 86% on the test set.
- Updated Aug 4, 2024 - Jupyter Notebook
Code for data flow between models, data post-processing, and visualization
- Updated Aug 28, 2023 - Jupyter Notebook
Experimental libraries: Azure Storage, multithreaded data processing pipelines, and many more ...
- Updated May 21, 2021 - Java
Data Engineering & Software Blog
- Updated Mar 2, 2024