Finds, downloads, parses, and standardizes public bikeshare data into a standard pandas dataframe format

Last update: Dec 01, 2021

opendata

Finds, downloads, parses, and standardizes public bikeshare data into a common pandas DataFrame format. For example, to load a sample of Bay Wheels trips:

import asyncio
from opendata.sources.bikeshare.bay_wheels import trips as bay_wheels

trips_df, _ = asyncio.run(bay_wheels.async_load(trip_sample_rate=1000))

len(trips_df.index)
# 8731

trips_df.columns
# Index(['started_at', 'ended_at', 'start_station_id', 'end_station_id',
#        'start_station_name', 'end_station_name', 'rideable_type', 'ride_id',
#        'start_lat', 'start_lng', 'end_lat', 'end_lng', 'gender', 'user_type',
#        'bike_id', 'birth_year'],
#       dtype='object')

An example analysis can be found here: https://observablehq.com/@brady/bikeshare
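Because trips are normalized to the same column names, a basic analysis can be written against those columns directly. The following is a minimal sketch, not taken from the linked notebook: it uses a small hand-made DataFrame as a stand-in for `trips_df` (the values are invented), and it assumes `started_at` and `ended_at` can be parsed as timestamps, which the README does not state explicitly.

```python
import pandas as pd

# Stand-in for trips_df with a subset of the standardized columns.
# The rows below are made up purely for illustration.
trips_df = pd.DataFrame(
    {
        "started_at": ["2021-06-01 08:00:00", "2021-06-01 08:30:00", "2021-06-01 17:15:00"],
        "ended_at": ["2021-06-01 08:12:00", "2021-06-01 09:00:00", "2021-06-01 17:20:00"],
        "start_station_name": ["A St", "B St", "A St"],
        "user_type": ["member", "casual", "member"],
    }
)

# Assumption: the timestamp columns are strings or datetimes that pandas can parse.
started = pd.to_datetime(trips_df["started_at"])
ended = pd.to_datetime(trips_df["ended_at"])

# Trip duration in minutes.
trips_df["duration_min"] = (ended - started).dt.total_seconds() / 60

# Mean duration per rider type, and number of trips per start hour.
mean_duration_by_user = trips_df.groupby("user_type")["duration_min"].mean()
trips_by_hour = started.dt.hour.value_counts().sort_index()

print(mean_duration_by_user)
print(trips_by_hour)
```

With real data, `trips_df` would come from a loader such as `bay_wheels.async_load` instead of being built by hand.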

Supports sampling and local file caching to improve performance.
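The README does not describe how sampling and caching are implemented, so the following is only an illustrative sketch of the two ideas. The helpers `cached_fetch` and `sample_rows` are hypothetical names and not part of opendata, the "keep 1 row in N" reading of a sample rate is an assumption, and the download is replaced by a fake function so the example runs offline.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Callable

import pandas as pd


def cached_fetch(url: str, fetch: Callable[[str], bytes], cache_dir: Path) -> bytes:
    """Hypothetical helper: return the payload for url, calling fetch only on a cache miss."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Use a hash of the URL as a filesystem-safe cache key.
    path = cache_dir / hashlib.sha256(url.encode()).hexdigest()
    if path.exists():
        return path.read_bytes()
    data = fetch(url)
    path.write_bytes(data)
    return data


def sample_rows(df: pd.DataFrame, rate: int) -> pd.DataFrame:
    """Hypothetical helper: keep every rate-th row (a deterministic 1-in-rate sample)."""
    return df.iloc[::rate].reset_index(drop=True)


calls = []


def fake_fetch(url: str) -> bytes:
    """Stand-in for a real download; records each call so cache hits are visible."""
    calls.append(url)
    return b"ride_id\n" + b"\n".join(str(i).encode() for i in range(10))


with tempfile.TemporaryDirectory() as tmp:
    first = cached_fetch("https://example.com/trips.csv", fake_fetch, Path(tmp))
    second = cached_fetch("https://example.com/trips.csv", fake_fetch, Path(tmp))

sampled = sample_rows(pd.DataFrame({"ride_id": range(10)}), rate=5)
print(len(calls), sampled["ride_id"].tolist())
```

The second call is served from disk, so the fake download runs only once. Sampling before heavier parsing keeps memory use low when only an approximate view of the data is needed.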

Markets supported

import opendata.sources.bikeshare.bay_wheels
import opendata.sources.bikeshare.bixi
import opendata.sources.bikeshare.divvy
import opendata.sources.bikeshare.capital_bikeshare
import opendata.sources.bikeshare.citi_bike
import opendata.sources.bikeshare.cogo
import opendata.sources.bikeshare.niceride
import opendata.sources.bikeshare.bluebikes
import opendata.sources.bikeshare.metro_bike_share
import opendata.sources.bikeshare.indego
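Since the Bay Wheels loader is a coroutine, several markets could in principle be loaded concurrently and combined. The sketch below assumes each market exposes a loader with the same shape as `bay_wheels.async_load`, returning a `(trips_df, extra)` tuple. Only the Bay Wheels call is shown in this README, so treat the others as an assumption. To stay runnable without the package or network access, it uses a hypothetical `fake_async_load` stand-in.

```python
import asyncio

import pandas as pd


async def fake_async_load(market: str, trip_sample_rate: int = 1000):
    """Hypothetical stand-in for a market's async_load; returns (trips_df, extra)."""
    await asyncio.sleep(0)  # yield control, as a real download would
    df = pd.DataFrame({"ride_id": [f"{market}-1", f"{market}-2"]})
    return df, None


async def load_markets(markets):
    """Load all markets concurrently and stack them into one DataFrame."""
    # asyncio.gather returns results in the same order as the input awaitables.
    results = await asyncio.gather(*(fake_async_load(m) for m in markets))
    frames = [df.assign(market=m) for m, (df, _) in zip(markets, results)]
    return pd.concat(frames, ignore_index=True)


all_trips = asyncio.run(load_markets(["bay_wheels", "divvy", "citi_bike"]))
print(all_trips)
```

Tagging each frame with a `market` column before concatenating keeps the origin of every trip recoverable in the combined DataFrame.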

Bootstrap

Set up your environment

brew install chromedriver
brew install python3
python3 -m pip install pre-commit

pre-commit install --install-hooks
python3 -m venv venv
source venv/bin/activate
python3 -m pip install -r requirements.txt

Entering virtualenv

python3 -m venv venv
source venv/bin/activate
python3 -m pip install -r requirements.txt

Usage

Try the test export to CSV:

python3 test.py

Updating pip requirements

pip-compile

Pre-commit setup

pre-commit install --install-hooks

Owner

Brady Law
