glue-catalog


Here are 19 public repositories matching this topic...


Languages: Python (9), Jupyter Notebook (4), Dockerfile (2), HCL (1), Java (1), TypeScript (1)


aws/aws-sdk-pandas

Stars: 4k

pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, Neptune, OpenSearch, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).

mysql python emr aws data-science lambda aws-lambda athena etl pandas data-engineering redshift ray apache-parquet amazon-athena apache-arrow aws-glue modin glue-catalog amazon-sagemaker-notebook

Updated Mar 17, 2025
Python

dbt-labs/dbt-athena

Stars: 244

The athena adapter plugin for dbt (https://getdbt.com)

athena s3 dbt iceberg glue-catalog dbt-athena dbt-athena-community

Updated Feb 7, 2025
Python

bbenzikry/spark-eks

Stars: 27

Examples and custom spark images for working with the spark-on-k8s operator on AWS

docker kubernetes dockerfile aws spark kubernetes-operator metastore eks eks-cluster glue-catalog

Updated Feb 14, 2021
Dockerfile

webysther/aws-glue-docker

Stars: 23

🐋 Docker image for AWS Glue Spark/Python

python docker dockerfile aws development spark etl docker-image sam pandas aws-cli pytest data-engineering cdk apache-arrow aws-glue python-poetry glue-catalog aws-glue-docker glue-pyspark

Updated Sep 5, 2023
Dockerfile

miztiik/s3-to-rds-with-glue

Stars: 17

Extract, transform, and load data for analytic processing using AWS Glue

spark etl glue cdk glue-job cloud-development-kit glue-catalog miztiik-automation s3-to-rds

Updated May 2, 2021
Python

kyopark2014/case-study-wait-for-callback

Stars: 4

This is a case study showing how to deploy "Wait-for-Callback" using Step Functions

lambda-functions step-functions event-bridge glue-catalog

Updated Jan 4, 2023
TypeScript

marwan116/aws-parquet

Stars: 3

A toolkit that provides an object-oriented interface for working with Parquet datasets on AWS

python aws data-science athena etl pandas data-engineering apache-parquet amazon-athena apache-arrow aws-glue glue-catalog

Updated Jun 19, 2023
Python

GabrielDan92/AWS_Terraform_PySpark-ETL_Job

Stars: 3

Terraform configuration that creates several AWS services, uploads data to S3, and starts the Glue Crawler and Glue Job.

aws terraform s3-bucket pyspark glue-job glue-catalog aws-glue-crawler

Updated Feb 10, 2022
Python

PATRICIAJUNQUEIRA/DataLake_PipelineAWS

Stars: 1

ETL pipeline on AWS

api aws s3-bucket gold bronze silver visualization-dashboard glue-job glue-catalog lakeformation gluecrawer

Updated Jul 28, 2024
Python

BhawnaMehbubani/Process-and-Ingest-only-quality-movies-in-Redshift-Dara-Warehouse

Stars: 1

This repository contains a production-grade ETL (Extract, Transform, Load) pipeline built with AWS Glue and Amazon Redshift. The pipeline processes a raw IMDb movie dataset stored in Amazon S3, applies data quality validation, dynamically routes data based on validation results, and loads it into Amazon Redshift for advanced analytic

s3-bucket sns redshift crawlers eventbridge glue-catalog glue-low-code-etl

Updated Jan 24, 2025
Python

Zain970/Stock-market-realtime-data-pipeline

Stars: 0

Reads data from a source file using Python, produces it to a Kafka broker with a Kafka producer, consumes the messages with a Kafka consumer, uploads the data to an AWS S3 bucket, builds a crawler on top of that bucket, and queries the data using AWS Athena.

python aws kafka ec2 athena s3-bucket glue-catalog

Updated Aug 18, 2024
Jupyter Notebook

Shilpaar90/AWS-Capturing-Schema-Changes-In-S3

Stars: 0

A pipeline within AWS to capture schema changes in S3 files and to update them in a DB.

aws crawler aws-lambda dynamodb s3 aws-dynamodb aws-cloudwatch-logs aws-lambda-python aws-glue aws-eventbridge glue-catalog aws-glue-crawler

Updated Nov 30, 2021

edrrezend/ETL_Streaming_DataLake

Stars: 0

ETL using application streaming and creating a Data Lake

python crawler athena etl s3 kinesis kinesis-firehose kinesis-stream datalake dataengineering glue-job glue-catalog

Updated Apr 7, 2023
Jupyter Notebook

pranav-patil/aws-kinesis-analytics

Stars: 0

Gathers metrics (CPU, memory) from various computers, aggregates the Kinesis stream data using Kinesis Analytics (with Flink), and stores the stream data in an AWS S3 bucket, which Amazon Athena queries for analytics and for rendering charts in Grafana.

athena grafana s3 s3-bucket kinesis-firehose flink flink-stream-processing glue-catalog

Updated Jan 14, 2024
Java

gakas14/AWS-Serverless-Data-Lake

Stars: 0

This workshop builds a serverless data lake architecture using Amazon Kinesis Firehose for streaming data ingestion, AWS Glue for data integration (ETL, catalog management), Amazon S3 for data lake storage, and Amazon Athena for SQL big data analytics.

aws sql athena etl s3 data-lake kinesis-firehose kinesis-stream glue-etl glue-catalog

Updated Nov 23, 2022
Jupyter Notebook

mineshmelvin/aws-forecast-pipeline-iac

Stars: 0

IaC (Terraform) of AWS Forecast pipeline using Glue as workflow manager

aws spark terraform glue forecast iac rds crawlers quicksight glue-catalog

Updated Oct 3, 2024
Python

infraspecdev/terraform-aws-athena

Stars: 0

This Terraform module automates the setup of AWS Athena to query ALB access and connection logs stored in an S3 bucket.

athena glue-catalog terrform-module

Updated Nov 14, 2024
HCL

datahealer/jupyter-s3-parquet-redshift

Stars: 0

1️⃣ Querying Parquet file from S3 using AwsWrangler. 2️⃣ Querying from Redshift tables using Glue & AwsWrangler

python s3 pandas parquet redshift glue-catalog awswrangler

Updated Aug 8, 2022
Jupyter Notebook

KRISHNASAIRAJ/AWS-Driven-Sales-Performance-Outlook

Stars: 0

The project aims to establish a robust data pipeline for tracking and analyzing sales performance using various AWS services. The process involves creating a DynamoDB database, implementing Change Data Capture (CDC), using Kinesis streams, and finally storing and querying the data in Amazon Athena.

python aws-lambda dynamodb s3-bucket kinesis kinesis-firehose aws-athena glue-catalog aws-glue-crawler eventbridge-pipes

Updated Feb 11, 2024
Python
