Jupyter Notebook extension leveraging pandas DataFrames by integrating DataTables and ChartJS.

Overview

Jupyter DataTables

Jupyter Notebook extension to leverage pandas DataFrames by integrating DataTables JS.


About

Data scientists and in fact many developers work with pd.DataFrame on daily basis to interpret data to process them. In my typical workflow. The common workflow is to display the dataframe, take a look at the data schema and then produce multiple plots to check the distribution of the data to have a clearer picture, perhaps search some data in the table, etc...

What if those distribution plots were part of the standard DataFrame and we had the ability to quickly search through the table with minimal effort? What if it was the default representation?

The jupyter-datatables uses jupyter-require to draw the table.


Installation

pip install jupyter-datatables

Usage

import numpy as np
import pandas as pd

from jupyter_datatables import init_datatables_mode

init_datatables_mode()

That's it, your default pandas representation will now use Jupyter DataTables!

df = pd.DataFrame(np.abs(np.random.randn(50, 5)), columns=list(string.ascii_uppercase[:5]))

Jupyter Datatables table representation


In most cases, you don't need to worry too much about the size of your data. Jupyter DataTables calculates required sample size based on a confidence interval (by default this would be 0.95) and margin of error and ceils it to the highest 'smart' value.

For example, for a data containing 100,000 samples, given 0.975 confidence interval and 0.02 margin of error, the Jupyter DataTables would calculate that 3044 samples are required and it would round it up to 4000.

Jupyter Datatables long table sample size

With additional note:

Sample size: 4,000 out of 100,000


We can also handle wide tables with ease.

df = pd.DataFrame(np.abs(np.random.randn(50, 20)), columns=list(string.ascii_uppercase[:20]))

Jupyter Datatables wide table representation


As per 0.3.0, there is a support for interactive tooltips:

Jupyter Datatables wide table representation

And also support for custom indices including Date type:

dft = pd.DataFrame({'A': np.random.rand(5),
                    'B': [1, 1, 3, 2, 1],
                    'C': 'This is a very long sentence that should automatically be trimmed',
                    'D': [pd.Timestamp('20010101'), pd.Timestamp('20010102'), pd.Timestamp('20010103'), pd.Timestamp('20010104'), pd.Timestamp('20010105')],
                    'E': pd.Series([1.0] * 5).astype('float32'),
                    'F': [False, True, False, False, True],
                   })

dft.D = dft.D.apply(pd.to_datetime)
dft.set_index('D', inplace=True)

Jupyter Datatables wide table representation



Current status and future plans:

Check out the Project Board where we track issues and TODOs for our Jupyter tooling!


Author: Marek Cermak [email protected], @AICoE

Owner
Marek Čermák
DevOps Engineer @ LivesportTV
Marek Čermák
CONTRIBUTIONS ONLY: Voluptuous, despite the name, is a Python data validation library.

CONTRIBUTIONS ONLY What does this mean? I do not have time to fix issues myself. The only way fixes or new features will be added is by people submitt

Alec Thomas 1.8k Dec 31, 2022
Manim is an animation engine for explanatory math videos.

A community-maintained Python framework for creating mathematical animations.

12.4k Dec 30, 2022
https://there.oughta.be/a/macro-keyboard

inkkeys Details and instructions can be found on https://there.oughta.be/a/macro-keyboard In contrast to most of my other projects, I decided to put t

Sebastian Staacks 209 Dec 21, 2022
Generate "Jupiter" plots for circular genomes

jupiter Generate "Jupiter" plots for circular genomes Description Python scripts to generate plots from ViennaRNA output. Written in "pidgin" python w

Robert Edgar 2 Nov 29, 2021
Data Visualizations for the #30DayChartChallenge

The #30DayChartChallenge This repository contains all the charts made for the #30DayChartChallenge during the month of April. This project aims to exp

Isaac Arroyo 7 Sep 20, 2022
A programming language built on top of Python to easily allow Swahili speakers to get started with programming without ever knowing English

pyswahili A programming language built over Python to easily allow swahili speakers to get started with programming without ever knowing english pyswa

Jordan Kalebu 72 Dec 15, 2022
A Jupyter - Leaflet.js bridge

ipyleaflet A Jupyter / Leaflet bridge enabling interactive maps in the Jupyter notebook. Usage Selecting a basemap for a leaflet map: Loading a geojso

Jupyter Widgets 1.3k Dec 27, 2022
1900-2016 Olympic Data Analysis in Python by plotting different graphs

🔥 Olympics Data Analysis 🔥 In Data Science field, there is a big topic before creating a model for future prediction is Data Analysis. We can find o

Sayan Roy 1 Feb 06, 2022
Plot and save the ground truth and predicted results of human 3.6 M and CMU mocap dataset.

Visualization-of-Human3.6M-Dataset Plot and save the ground truth and predicted results of human 3.6 M and CMU mocap dataset. human-motion-prediction

Gaurav Kumar Yadav 5 Nov 18, 2022
A high-level plotting API for pandas, dask, xarray, and networkx built on HoloViews

hvPlot A high-level plotting API for the PyData ecosystem built on HoloViews. Build Status Coverage Latest dev release Latest release Docs What is it?

HoloViz 697 Jan 06, 2023
Create 3d loss surface visualizations, with optimizer path. Issues welcome!

MLVTK A loss surface visualization tool Simple feed-forward network trained on chess data, using elu activation and Adam optimizer Simple feed-forward

7 Dec 21, 2022
A research of IT labor market based especially on hh.ru. Salaries, rate of technologies and etc.

hh_ru_research Проект реализован в учебных целях анализа рынка труда, в особенности по hh.ru Input data В качестве входных данных используются сериали

3 Sep 07, 2022
mysql relation charts

sqlcharts 自动生成数据库关联关系图 复制settings.py.example 重命名为settings.py 将数据库配置信息填入settings.DATABASE,目前支持mysql和postgresql 执行 python build.py -b,-b是读取数据库表结构,如果只更新匹

6 Aug 22, 2022
Personal IMDB Graphs with Bokeh

Personal IMDB Graphs with Bokeh Do you like watching movies and also rate all of them in IMDB? Would you like to look at your IMDB stats based on your

2 Dec 15, 2021
PolytopeSampler is a Matlab implementation of constrained Riemannian Hamiltonian Monte Carlo for sampling from high dimensional disributions on polytopes

PolytopeSampler PolytopeSampler is a Matlab implementation of constrained Riemannian Hamiltonian Monte Carlo for sampling from high dimensional disrib

9 Sep 26, 2022
Data visualization using matplotlib

Data visualization using matplotlib project instructions Top 5 Most Common Coffee Origins In this visualization I used data from Ankur Chavda on Kaggl

13 Oct 27, 2021
Open-source demos hosted on Dash Gallery

Dash Sample Apps This repository hosts the code for over 100 open-source Dash apps written in Python or R. They can serve as a starting point for your

Plotly 2.7k Jan 07, 2023
Python Data Structures for Humans™.

Schematics Python Data Structures for Humans™. About Project documentation: https://schematics.readthedocs.io/en/latest/ Schematics is a Python librar

Schematics 2.5k Dec 28, 2022
Draw interactive NetworkX graphs with Altair

nx_altair Draw NetworkX graphs with Altair nx_altair offers a similar draw API to NetworkX but returns Altair Charts instead. If you'd like to contrib

Zachary Sailer 206 Dec 12, 2022
A Bokeh project developed for learning and teaching Bokeh interactive plotting!

Bokeh-Python-Visualization A Bokeh project developed for learning and teaching Bokeh interactive plotting! See my medium blog posts about making bokeh

Will Koehrsen 350 Dec 05, 2022