Python PostgreSQL adapter to stream results of multi-statement queries without a server-side cursor

Overview

streampq CircleCI Test Coverage

Stream results of multi-statement PostgreSQL queries from Python without server-side cursors. Has benefits over some other Python PostgreSQL libraries:

  • Streams results from complex multi-statement queries even though SQL doesn't allow server-side cursors for such queries - suitable for large amounts of results that don't fit in memory.

  • CTRL+C (SIGINT) by default behaves as expected even during slow queries - a KeyboardInterrupt is raised and quickly bubbles up through streampq code. Unless client code prevents it, the program will exit.

  • Every effort is made to cancel queries on KeyboardInterrupt, SystemExit, or errors - the server doesn't continue needlessly using resources.

Particularly useful when temporary tables are needed to store intermediate results in multi-statement SQL scripts.

Installation

pip install streampq

The libpq binary library is also required. This is typically either already installed, or installed by:

  • macOS + brew: brew install libpq
  • Linux (Debian): apt install libpq5
  • Linux (Red Hat):yum install postgresql-libs

The only runtime dependencies are libpq and Python itself.

Usage

from streampq import streampq_connect

# libpq connection paramters
# https://www.postgresql.org/docs/current/libpq-connect.html#LIBPQ-PARAMKEYWORDS
#
# Any can be ommitted and environment variables will be used instead
# https://www.postgresql.org/docs/current/libpq-envars.html
connection_params = (
    ('host', 'localhost'),
    ('port', '5432'),
    ('dbname', 'postgres'),
    ('user', 'postgres'),
    ('password', 'password'),
)

# SQL statement(s) - if more than one, separate by ;
sql = '''
    SELECT * FROM my_table;
    SELECT * FROM my_other_table;
'''

# Connection and querying is via a context manager
with streampq_connect(connection_params) as query:
    for (columns, rows) in query(sql):
        print(columns)  # Tuple of column names
        for row in rows:
            print(row)  # Tuple of row  values

PostgreSQL types to Python type decoding

There are 164 built-in PostgreSQL data types (including array types), and streampq converts them to Python types. In summary:

PostgreSQL types Python type
null None
text (e.g. varchar), xml, network addresses, and money str
byte (e.g. bytea) bytes
integer (e.g. int4) int
inexact real number (e.g. float4) float
exact real number (e.g. numeric) Decimal
date date
timestamp datetime (without timezone)
timestamptz datetime (with offset timezone)
json and jsonb output of json.loads
interval streampq.Interval
range (e.g. daterange) streampq.Range
multirange (e.g. datemultirange) tuples of streampq.Range
arrays and vectors tuple (of any of the above types, or of nested tuples)

To customise these, override the default value of the get_decoders parameter of the streampq_connect function in streampq.py.

In general, built-in types are preferred over custom types, and immutable types are preferred over mutable.

streampq.Interval

The Python built-in timedelta type is not used for PostgreSQL interval since timedelta does not offer a way to store PostgreSQL intervals of years or months, other than converting to days which would be a loss of information.

Instead, a namedtuple is defined, streampq.Interval, with members:

Member Type
years int
months int
days int
hours int
minutes int
seconds Decimal

streampq.Range

There is no Python built-in type for a PosgreSQL range. So for these, a namedtuple is defined, streampq.Range, with members:

Member Type
lower int, date, datetime (without timezone), or datetime (with offset timezone)
upper int, date, datetime (without timezone), or datetime (with offset timezone)
bounds str - one of (), (], [), or []

Bind parameters - literals

Dynamic SQL literals can be bound using the literals parameter of the query function. It must be an iterable of key-value pairs.

sql = '''
    SELECT * FROM my_table WHERE my_col = {my_col_value};
'''

with streampq_connect(connection_params) as query:
    for (columns, rows) in query(sql, literals=(
        ('my_col_value', 'my-value'),
    )):
        for row in rows:
            pass

Bind parameters - identifiers

Dynamic SQL identifiers, e.g. column names, can be bound using the identifiers parameter of the query function. It must be an iterable of key-value pairs.

sql = '''
    SELECT * FROM my_table WHERE {column_name} = 'my-value';
'''

with streampq_connect(connection_params) as query:
    for (columns, rows) in query(sql, identifiers=(
        ('column_name', 'my_col'),
    )):
        for row in rows:
            pass

Identifiers and literals use different escaping rules - hence the need for 2 different parameters.

Single-statement SQL queries

While this library is specialsed for multi-statement queries, it works fine when there is only one. In this case the iterable returned from the query function yields only a single (columns, rows) pair.

Exceptions

Exceptions derive from streampq.StreamPQError. If there is any more information available on the error, it's added as a string in its args property. This is included in the string representation of the exception by default.

Exception hierarchy

  • StreamPQError

    Base class for all explicitly-thrown exceptions

    • ConnectionError

      An error occurred while attempting to connect to the database.

    • QueryError

      An error occurred while attempting to run a query. Typically this is due to a syntax error or a missing column.

    • CancelError

      An error occurred while attempting to cancel a query.

    • CommunicationError

      An error occurred communicating with the database after successful connection.

Owner
Department for International Trade
Department for International Trade
Class to connect to XAMPP MySQL Database

MySQL-DB-Connection-Class Class to connect to XAMPP MySQL Database Basta fazer o download o mysql_connect.py e modificar os parâmetros que quiser. E d

Alexandre Pimentel 4 Jul 12, 2021
Python script to clone SQL dashboard from one workspace to another

Databricks dashboard clone Unofficial project to allow Databricks SQL dashboard copy from one workspace to another. Resource clone Setup: Create a fil

Quentin Ambard 12 Jan 01, 2023
MySQLdb is a Python DB API-2.0 compliant library to interact with MySQL 3.23-5.1 (unofficial mirror)

==================== MySQLdb Installation ==================== .. contents:: .. Prerequisites ------------- + Python 2.3.4 or higher * http://ww

Sébastien Arnaud 17 Oct 10, 2021
Database connection pooler for Python

Nimue Strange women lying in ponds distributing swords is no basis for a system of government! --Dennis, Peasant Nimue is a database connection pool f

1 Nov 09, 2021
A Python-based RPC-like toolkit for interfacing with QuestDB.

pykit A Python-based RPC-like toolkit for interfacing with QuestDB. Requirements Python 3.9 Java Azul

QuestDB 11 Aug 03, 2022
Create a database, insert data and easily select it with Sqlite

sqliteBasics create a database, insert data and easily select it with Sqlite Watch on YouTube a step by step tutorial explaining this code: https://yo

Mariya 27 Dec 27, 2022
dask-sql is a distributed SQL query engine in python using Dask

dask-sql is a distributed SQL query engine in Python. It allows you to query and transform your data using a mixture of common SQL operations and Python code and also scale up the calculation easily

Nils Braun 271 Dec 30, 2022
Lazydata: Scalable data dependencies for Python projects

lazydata: scalable data dependencies lazydata is a minimalist library for including data dependencies into Python projects. Problem: Keeping all data

629 Nov 21, 2022
aioodbc - is a library for accessing a ODBC databases from the asyncio

aioodbc aioodbc is a Python 3.5+ module that makes it possible to access ODBC databases with asyncio. It relies on the awesome pyodbc library and pres

aio-libs 253 Dec 31, 2022
A simple password manager I typed with python using MongoDB .

Python with MongoDB A simple python code example using MongoDB. How do i run this code • First of all you need to have a python on your computer. If y

31 Dec 06, 2022
DBMS Mini-project: Recruitment Management System

# Hire-ME DBMS Mini-project: Recruitment Management System. 💫 ✨ Features Python + MYSQL using mysql.connector library Recruiter and Client Panel Beau

Karan Gandhi 35 Dec 23, 2022
This is a repository for a task assigned to me by Bilateral solutions!

Processing-Files-using-MySQL This is a repository for a task assigned to me by Bilateral solutions! Task: Make Folders named Processing,queue and proc

Kandal Khandeka 1 Nov 07, 2022
PyRemoteSQL is a python SQL client that allows you to connect to your remote server with phpMyAdmin installed.

PyRemoteSQL Python MySQL remote client Basically this is a python SQL client that allows you to connect to your remote server with phpMyAdmin installe

ProbablyX 3 Nov 04, 2022
Implementing basic MongoDB CRUD (Create, Read, Update, Delete) queries, using Python.

MongoDB with Python Implementing basic MongoDB CRUD (Create, Read, Update, Delete) queries, using Python. We can connect to a MongoDB database hosted

MousamSingh 4 Dec 01, 2021
Simple Python demo app that connects to an Oracle DB.

Cloud Foundry Sample Python Application Connecting to Oracle Simple Python demo app that connects to an Oracle DB. The app is based on the example pro

Daniel Buchko 1 Jan 10, 2022
A simple wrapper to make a flat file drop in raplacement for mongodb out of TinyDB

Purpose A simple wrapper to make a drop in replacement for mongodb out of tinydb. This module is an attempt to add an interface familiar to those curr

180 Jan 01, 2023
Little wrapper around asyncpg for specific experience.

Little wrapper around asyncpg for specific experience.

Nikita Sivakov 3 Nov 15, 2021
Implementing basic MySQL CRUD (Create, Read, Update, Delete) queries, using Python.

MySQL with Python Implementing basic MySQL CRUD (Create, Read, Update, Delete) queries, using Python. We can connect to a MySQL database hosted locall

MousamSingh 5 Dec 01, 2021
High level Python client for Elasticsearch

Elasticsearch DSL Elasticsearch DSL is a high-level library whose aim is to help with writing and running queries against Elasticsearch. It is built o

elastic 3.6k Jan 03, 2023
Find graph motifs using intuitive notation

d o t m o t i f Find graph motifs using intuitive notation DotMotif is a library that identifies subgraphs or motifs in a large graph. It looks like t

APL BRAIN 45 Jan 02, 2023