#emr-cluster

Here are 101 public repositories matching this topic...

goodreads_etl_pipeline

BERT, AWS RDS, AWS Forecast, EMR Spark Cluster, Hive, Serverless, Google Assistant + Raspberry Pi, Infrared, Google Cloud Platform Natural Language, Anomaly detection, Tensorflow, Mathematics

  • Updated Aug 6, 2021
  • Jupyter Notebook

Classwork projects and homework completed for the Udacity Data Engineering Nanodegree

  • Updated Dec 12, 2023
  • Jupyter Notebook

Terraform module to provision an Elastic MapReduce (EMR) cluster on AWS

  • Updated Feb 5, 2025
  • HCL

pyspark-on-aws-emr

This project offers a ready-to-use AWS EMR template built on Spot Fleet and On-Demand Instances, so you can focus on writing PySpark code.

  • Updated Jun 13, 2022
  • Python

An ETL application on AWS that works with open sales and customer data, available here: https://github.com/camposvinicius/data/blob/main/AdventureWorks.zip. The archive contains several .csv files to which the transformations are applied.

  • Updated Feb 7, 2022
  • Smarty

Apache Spark TPC-DS benchmark setup, including EMR launch configuration

  • Updated Jul 11, 2022
  • Smarty

A Cassandra Architecture for GDELT Database 🌍

  • Updated Mar 7, 2019
  • Shell

Uses EMR clusters to export DynamoDB tables to S3 and generates import steps

  • Updated Sep 16, 2022
  • Shell

A boilerplate for Spark projects, with Docker support for local development and scripts for EMR support.

  • Updated Dec 2, 2017
  • Scala

Batch-ETL-with-AWS-EMR-and-MWAA

Creates a data pipeline on AWS that runs batch processing on a Spark cluster provisioned by Amazon EMR. The ETL runs on managed Airflow: it extracts data from S3, transforms it with Spark, and loads the transformed data back to S3.

  • Updated Jul 12, 2021
  • Python

A large-scale data framework that will enable us to store and analyze financial market data and drive future predictions for investment.

  • Updated Mar 7, 2020
  • TSQL

This project demonstrates the use of Amazon Elastic MapReduce (EMR) for processing large datasets with Apache Spark. It includes a Spark script for ETL (Extract, Transform, Load) operations, AWS command-line instructions for setting up and managing the EMR cluster, and a dataset for testing and demonstration purposes.

  • Updated Nov 12, 2023
  • Python

Generic Python library for provisioning EMR clusters from YAML config files (Configuration as Code)

  • Updated Dec 8, 2022
  • Python

Collection of code for submitting Spark/Hadoop/Hive/Pig tasks to EMR (AWS Elastic MapReduce) | #DE

  • Updated Jan 13, 2020
  • Scala

Event driven EMR via Serverless

  • Updated Nov 22, 2017
  • Python
