Overview

S22-W4111-HW-1-0:
W4111 - Intro to Databases HW0 and HW1

Introduction

This project is the implementation template for HW 0 and HW 1 for both the programming and non-programming tracks.

HW 0 - All Students

You have completed the first step, which is cloning the project template.

Note: You are Columbia students. You should be able to install SW and follow instructions.

MySQL:

  • Download the installation files for MySQL Community Server.

    • Make sure you download for the correct operating system.
    • If you are on a Mac, make sure you choose the correct architecture. ARM is for Apple silicon. x86 is for Intel-based Macs.
    • On Windows, you can download and use the MSI.
  • Follow the installation instructions for MySQL. There are official instructions and many online tutorials.

  • Remember the root user ID and password that you set during installation. Also, choose "Legacy Authentication" when prompted.

    • If you forget your root user or password, you are on your own. The TAs and I will not fix any problems due to forgetting the information.
    • Also, if you say something like, "It did not prompt me for a user ID and password when I installed ...," we will laugh. We will say something like, "Sure. 20 million MySQL installations asked for the information, but it decided not to ask you."
    • If you tell us that you are sure that you are entering the correct user ID and password, we will laugh. We will say something like, "Which is more likely: that a DATABASE forgot something, or that you did?"
  • You only need to install the server. All other SW packages are optional.
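
Once the server is installed, one quick sanity check is to see whether anything accepts TCP connections on the MySQL port. The sketch below uses only the Python standard library. It assumes the server runs on your own machine on the default port 3306, and the function name mysql_port_open is made up for this example. It only tells you that the port is open; it does not check your user ID or password.

```python
import socket


def mysql_port_open(host="127.0.0.1", port=3306, timeout=2.0):
    """Return True if something accepts TCP connections on host:port.

    Assumption: MySQL listens on the default port 3306 on this machine.
    This checks that the port is open, not that your login works.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


if __name__ == "__main__":
    if mysql_port_open():
        print("A server is accepting connections on port 3306.")
    else:
        print("Nothing is listening on port 3306. Is the MySQL server started?")
```

If the check fails right after installation, the server is probably not started yet, or it was configured with a different port during setup.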

Anaconda:

  • I strongly recommend uninstalling any existing version of Anaconda. If you choose not to uninstall previous versions, you may hit issues. You are on your own if you hit issues due to conflicting versions of Anaconda during the semester.

  • Download the most recent version of Anaconda.

  • Follow the installation instructions. Choose "Install for me" when prompted. If you hit a problem and I find your Anaconda installation in the wrong directory, you are on your own. If you say something like, "But, it did not give me that option," you can guess what will happen.
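
If you want to see where Anaconda ended up, the sketch below looks for the conda command and reports whether it lives inside your home directory. This rests on an assumption that is not stated in these instructions: that an "Install for me" installation is placed under your home directory. The function name is_under_home is made up for this example, and on Windows conda may only be on the PATH inside the Anaconda Prompt.

```python
import shutil
from pathlib import Path


def is_under_home(executable, home=None):
    """Return True if the executable path is inside the home directory.

    Assumption: a per-user Anaconda install lives under your home directory.
    """
    home_dir = (Path(home) if home is not None else Path.home()).resolve()
    exe = Path(executable).resolve()
    return home_dir in exe.parents


if __name__ == "__main__":
    conda = shutil.which("conda")  # None when conda is not on the PATH
    if conda is None:
        print("conda is not on your PATH.")
    elif is_under_home(conda):
        print(f"conda found inside your home directory: {conda}")
    else:
        print(f"conda found outside your home directory: {conda}")
```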

DataGrip:

  • Download DataGrip. Make sure you choose the correct OS and silicon.

  • Follow the installation instructions.

  • Apply for a student license.

  • When you receive confirmation of your student license, set the license information in DataGrip.

HW0: Non-Programming

Step 1: Initial Files

  1. Create a folder in the project of the form <UNI>_src, where <UNI> is your UNI. I created an example, which is dff9_src.

  2. Create a file in the directory <UNI>_HW0.

  3. Copy the Jupyter notebook file from dff9_src/dff9_HW0.ipynb into the directory you created and replace dff9 with your UNI.

  4. Do the same for dff9_HW0.py
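
The copy-and-rename steps above can also be sketched in Python. This is a minimal illustration, not part of the template: the function name create_hw0_files is made up, ab1234 in the usage note stands in for your UNI, and the sketch assumes both example files sit in dff9_src. It renames the files only and does not edit their contents.

```python
import shutil
from pathlib import Path


def create_hw0_files(project_root, uni, example_uni="dff9"):
    """Copy the example HW0 files into <uni>_src, renamed for your UNI.

    Assumption: the examples are <example_uni>_src/<example_uni>_HW0.ipynb
    and <example_uni>_src/<example_uni>_HW0.py.
    """
    root = Path(project_root)
    source_dir = root / f"{example_uni}_src"
    target_dir = root / f"{uni}_src"
    target_dir.mkdir(exist_ok=True)

    copied = []
    for suffix in (".ipynb", ".py"):
        source = source_dir / f"{example_uni}_HW0{suffix}"
        target = target_dir / f"{uni}_HW0{suffix}"
        if target.exists():
            # Never overwrite work you have already started.
            continue
        shutil.copy(source, target)
        copied.append(target)
    return copied
```

Running create_hw0_files(".", "ab1234") from the root of the cloned project would create ab1234_src containing ab1234_HW0.ipynb and ab1234_HW0.py. Skipping existing files means a second run cannot clobber a notebook you have already edited.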

Step 2: Jupyter Notebook

  • Start Anaconda.

  • Open Jupyter Notebook in Anaconda.

  • Navigate to the directory where you cloned the repository, and then go into the folder you created.

  • Open the notebook (the file ending in .ipynb).

  • The remaining steps in HW0: Non-Programming are in the notebook that you opened.

HW0: Programming

  • Complete the steps for HW0: Non-Programming.

  • The programming track is not "harder" than non-programming. The initial set up is a little more work, however.

  • Download and install PyCharm. Make sure you choose the Professional edition.

  • Follow the instructions to set the license key using the JetBrains account you used to get the DataGrip license.

  • Start PyCharm, navigate to and open the project that you cloned from GitHub.

  • Follow the instructions for creating a new virtual Conda environment for the project.

  • Select the root folder in the project, right click, and add a new Python Package named <UNI>_web_src, where <UNI> is your UNI. My example is dff9_web_src.

  • Copy the files from dff9_web_src into the package you created.

  • Follow the instructions for adding a package to your virtual environment. You should add the package flask.

  • Right click on the file application.py that you copied and select run. A console window will open and show a URL. Copy the URL.

  • Open a browser. Paste the URL and append '/health'. My URL looks like http://172.20.1.14:5000/health. Yours may be a little different.

  • Hit enter. You should see a health message. Take a screenshot of the browser window and add the file to the directory. My example is ""
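
The browser check above can also be done from Python. The sketch below is a minimal illustration that uses only the standard library; the function names health_url and check_health are made up for this example, and it assumes the application started from application.py is still running at the URL shown in your console. You still need the browser screenshot for the homework.

```python
import urllib.request


def health_url(base_url):
    """Append '/health' to the URL shown in the console."""
    return base_url.rstrip("/") + "/health"


def check_health(base_url, timeout=5.0):
    """Fetch the health endpoint and return (status_code, body_text).

    Raises urllib.error.URLError if the application is not reachable.
    """
    with urllib.request.urlopen(health_url(base_url), timeout=timeout) as response:
        return response.status, response.read().decode("utf-8")
```

With the application running, calling check_health("http://172.20.1.14:5000") would request the same address as the example URL above. Replace the address with the one from your own console.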

Owner
Donald F. Ferguson
Senior Technical Fellow, Chief SW Architect, Ansys, Inc. Adjunct Professor, Dept. of Computer Science, Columbia University. CTO and Co-Founder, Seeka.TV