Jupyter notebooks and AWS CloudFormation template to show how Hudi, Iceberg, and Delta Lake work

Overview

Modern Data Lake Storage Layers

This repository contains supporting assets for my research in modern Data Lake storage layers like Apache Hudi, Apache Iceberg, and Delta Lake.

Specifically, there's a CloudFormation template to create an EMR cluster and EMR Studio with the necessary requirements and Jupyter notebooks with the example walkthroughs.

You can view the corresponding blog post and video

Pre-requisites

You'll need an AWS Account in which you have administrator privileges and the ability to deploy a CloudFormation template. The template will create an EMR Cluster and S3 bucket that will incur charges - be sure to either shut down the cluster when done or delete the CloudFormation stack. In order to delete the CloudFormation stack, you'll need to:

  • Manually delete any EMR Studio Workspaces you created
  • Manually empty the S3 bucket created by CloudFormation
  • Manually delete the VPC created by CloudFormation due to auto-created rules

Overview

The included CloudFormation template creates a new VPC and EMR Cluster for you to be able to run the notebooks. An EMR Studio is also created and you can find the Studio URL in the Outputs tab of your CloudFormation Stack.

Once the stack is done creating, you'll need to navigate to EMR Studio and create a new workspace attached to the "data-lakes" cluster.

Inside the workspace you either upload each notebook individually from the notebooks/ folder or simply connect to this repository by using the "Git" icon on the left-hand side.

Live Weather Updates using Flask and OpenWeather

AuraX Live Weather Updates using Flask and OpenWeather Installation To setup this project on your local machine, first clone this repository and insta

Ayush Gupta 3 Nov 02, 2021
Super Fast Telegram UserBot Made With Python.

Description Super Fast Telegram UserBot Made With Python. LOGO Made With Support of All Userbots Dev's Dark-Venom is a Light-Weight Userbot. It's unde

2 Sep 14, 2021
Cogs for Red-DiscordBot

matcha-cogs Cogs for Red-DiscordBot. Installation [p]repo add matcha-cogs

MatchaTeaLeaf 2 Aug 27, 2022
Tinyman Python SDK

tinyman-py-sdk Tinyman Python SDK Design Goal This SDK is designed for automated interaction with the Tinyman AMM. It will be most useful for develope

Tinyman 113 Dec 30, 2022
Roaster - this gui app + program bundle roasts.

Roaster - this gui app + program bundle roasts.

Harsh ADV) 1 Jan 04, 2022
Flaga ze Szturmu na AWS.

Witaj Jesteś na GitHub'ie i czytasz właśnie plik README.md który znajduje się wewnątrz repozytorium Flaga z 7 i 8 etapu Szturmu na AWS. W tym etapie w

9 May 16, 2022
A telegram bot to forward messages automatically when they arrived.

Telegram Message Forwarder Bot A telegram bot, which can forward messages from channel, group or chat to another channel, group or chat automatically.

Adnan Ahmad 181 Jan 07, 2023
Yet another Wahrheit-oder-Pflicht bot for Telegram, because all the others suck.

Der WoPperBot Yet another Wahrheit-oder-Pflicht bot for Telegram, because all the others suck. The existing bots are all defunct or incomplete. So I w

Ben Wiederhake 9 Nov 15, 2022
Sadew Jayasekara 23 Oct 21, 2022
“ Hey there 👋 I'm Sophia „ TG Group management bot with Some Extra features..

❤️ Sophia ❤️ Avaiilable on Telegram as SophiaBot 🏃‍♂️ Easy Deploy Mandatory Vars [+] Make Sure You Add All These Mandatory Vars. [-] APP_ID: You ca

THEEKSHANA 5 Dec 09, 2021
BleachBit system cleaner for Windows and Linux

BleachBit BleachBit cleans files to free disk space and to maintain privacy. Running from source To run BleachBit without installation, unpack the tar

1.9k Jan 06, 2023
TwitterDataStreaming - Twitter data streaming using APIs

Twitter_Data_Streaming Twitter data streaming using APIs Use Case 1: Streaming r

Rita Kushwaha 1 Jan 21, 2022
Python client to do LispTick requests

lisptick-python LispTick Python client library It allows to send request and receive result from a LispTick server. Get a socket connection to a LispT

Kereon Intelligence 1 Oct 25, 2021
Fully undetected auto skillcheck hack for dead by daylight that works decently well

Auto-skillcheck was made by Love ❌ code ✅ ❔ ・How to use Start off by installing python ofc Open cmd in the same directory and type pip install -r requ

Rdimo 10 Aug 13, 2022
Fix Twitter video embeds in Discord

TwitFix very basic flask server that fixes twitter embeds in discord by using youtube-dl to grab the direct link to the MP4 file and embeds the link t

Robin Universe 682 Dec 28, 2022
A simple script & container to pull COVID data from covidlive.com.au and post a summary to a slack channel

CovidLive AU Summary Slackbot This bot is a very simple slackbot that pulls data, summarises and posts up to date AU COVID stats to a provided slack c

James 3 Dec 18, 2021
Clippin n grafting Backend

Clipping' n Grafting Presenting you, 🎉 Clippin' n Grafting 🎉 , your very own ecommerce website displaying all your artsy-craftsy stuff. Not only the

Google-Developer-Student-Club-ISquareIT (GDSC I²IT) 2 Oct 22, 2021
Group Management Bot

❤️ 𝗦𝗛𝗔𝗗𝗜𝗬𝗢 ❤️ A Powerful, Smart And Advance Group Manager ... Written with AioGram , Pyrogram and Telethon... ⭐️ Thanks to everyone who starred

Abdisamad Omar Mohamed 4 Dec 01, 2021
AminoLab Library For AminoApps using aminoapps.com/api

AminoLab AminoLab Api For AminoApps using aminoapps.com/api Installing pip install AminoLab Example #Login import AminoLab client = AminoLab.Client()

10 Sep 26, 2022
A simple python oriented telegram bot to give out creative font style's

Font-Bot A simple python oriented telegram bot to give out creative font style's REQUIREMENTS tgcrypto pyrogram==1.2.9 Installation Fork this reposito

BL4CK H47 4 Jan 30, 2022