A simple XLSX/CSV reader - to dictionary converter

Overview

Test Python package codecov Upload Python Package to PyPI PythonVersions Sourcery Black Snyk Downloads Twitter Follow

sheet2dict

A simple XLSX/CSV reader - to dictionary converter

Installing

To install the package from pip, first run:

python3 -m pip install --no-cache-dir sheet2dict

Required pip packages for sheet2doc: csv, openpyxl

Usage

This library has 2 main features: reading a spreadsheet files and converting them to array of python dictionaries.

- XLSX

Use xlsx_to_dict() method when converting form spreadsheets.
Supported file formats for spreadsheets are: .xlsx,.xlsm,.xltx,.xltm
Spreadsheets with multiple worksheets are supported. If no sheet is specified, the active sheet is selected. If there is only one sheet, it is considered active.

# Import the library
from sheet2dict import Worksheet

# Create an object
ws = Worksheet()

# Convert active sheet (without specifying sheet name)
ws.xlsx_to_dict(path='inventory.xlsx')

# Convert the 'Main Warehouse' sheet of the 'inventory.xslx' spreadsheet file.
ws.xlsx_to_dict(path='inventory.xlsx', select_sheet='Main Warehouse')

# object.header returns first row with the data in a spreadsheet 
print(ws.header)

# object.sheet_items returns converted rows as dictionaries in the array 
print(ws.sheet_items)

You can parse data when worksheet is an object

# Import the library
from sheet2dict import Worksheet

# Example: read spreadsheet as object
path = 'inventory.xlsx'
xlsx_file = open(path, 'rb')
xlsx_file = BytesIO(xlsx_file.read())

# Parse spreadsheet from object
ws = Worksheet()
ws.xlsx_to_dict(path=xlsx_file)
print(ws.header)

- CSV

Use csv_to_dict() method when converting form csv.
CSV is a format with many variations, better handle encodings and delimiters on user side and not within module itself.

# Import the library
from sheet2dict import Worksheet

# Create an object
ws = Worksheet()

# Read CSV file
csv_file = open('inventory.csv', 'r', encoding='utf-8-sig')

# Convert 
ws.csv_to_dict(csv_file=csv_file, delimiter=';')

# object.header returns first row with the data in a spreadsheet 
print(ws.header)

# object.sheet_items returns converted rows as dictionaries in the array 
print(ws.sheet_items)

- Other functions

Worksheet object.header returns first row with the data in a spreadsheet

>> ws.header {'country': 'SK', 'city': 'Bratislava', 'citizens': '400000', 'random_field': 'cc'}">
Python 3.9.1
[Clang 12.0.0 (clang-1200.0.32.28)] on darwin
>>> from sheet2dict import Worksheet
>>> ws = Worksheet()
>>> ws.xlsx_to_dict(path="inventory.xlsx")

>>> ws.header
{'country': 'SK', 'city': 'Bratislava', 'citizens': '400000', 'random_field': 'cc'}

Worksheet object.sanitize_sheet_items removes None or empty dictionary keys from sheet_items

>> ws.sheet_items [ {'country': 'CZ', 'city': 'Prague', 'citizens': '600000', None: '22', 'random_field': 'cc'}, {'country': 'UK', 'city': 'London', 'citizens': '2000000', None: '33', 'random_field': 'cc'} ] >>> ws.sanitize_sheet_items [ {'country': 'CZ', 'city': 'Prague', 'citizens': '600000', 'random_field': 'cc'}, {'country': 'UK', 'city': 'London', 'citizens': '2000000', 'random_field': 'cc'} ]">
>>> from sheet2dict import Worksheet
>>> ws = Worksheet()
>>> ws.xlsx_to_dict(path="inventory.xlsx")

>>> ws.sheet_items
[
  {'country': 'CZ', 'city': 'Prague', 'citizens': '600000', None: '22', 'random_field': 'cc'},
  {'country': 'UK', 'city': 'London', 'citizens': '2000000', None: '33', 'random_field': 'cc'}
]

>>> ws.sanitize_sheet_items
[
  {'country': 'CZ', 'city': 'Prague', 'citizens': '600000', 'random_field': 'cc'},
  {'country': 'UK', 'city': 'London', 'citizens': '2000000', 'random_field': 'cc'}
]

Contributing and Code of Conduct

Contributing to sheet2dict

As an open source project, sheet2dict welcomes contributions of many forms.
Please read and follow our Contributing to sheet2dict

Contributors:

  • Thanks to 白一百 (bái-yī-bǎi) for making sheet2dict work with multi-sheet Excel files.

Code of Conduct

As a contributor, you can help us keep the sheet2dict project open and inclusive.
Please read and follow our Code of Conduct

You might also like...
SCTYMN is a GitHub repository that includes some simple scripts(currently only python scripts) that can be useful.

Simple Codes That You Might Need SCTYMN is a GitHub repository that includes some simple scripts(currently only python scripts) that can be useful. In

Żmija is a simple universal code generation tool.

Żmija Żmija is a simple universal code generation tool. It is intended to be used as a means to generate code that is both efficient and easily mainta

 Simple yet powerful CAD (Computer Aided Design) library, written with Python.
Simple yet powerful CAD (Computer Aided Design) library, written with Python.

Py-MADCAD it's time to throw parametric softwares out ! Simple yet powerful CAD (Computer Aided Design) library, written with Python. Installation

A simple USI Shogi Engine written in python using python-shogi.

Revengeshogi My attempt at creating a USI Shogi Engine in python using python-shogi. Current State of Engine Currently only generating random moves us

A simple document management REST based API for collaboratively interacting with documents

documan_api A simple document management REST based API for collaboratively interacting with documents.

Django-Text-to-HTML-converter - The simple Text to HTML Converter using Django framework

Django-Text-to-HTML-converter This is the simple Text to HTML Converter using Dj

Single API for reading, manipulating and writing data in csv, ods, xls, xlsx and xlsm files

pyexcel - Let you focus on data, instead of file formats Support the project If your company has embedded pyexcel and its components into a revenue ge

A flask extension using pyexcel to read, manipulate and write data in different excel formats: csv, ods, xls, xlsx and xlsm.

Flask-Excel - Let you focus on data, instead of file formats Support the project If your company has embedded pyexcel and its components into a revenu

Scrapes mcc-mnc.com and outputs 3 files with the data (JSON, CSV & XLSX)

mcc-mnc.com-webscraper Scrapes mcc-mnc.com and outputs 3 files with the data (JSON, CSV & XLSX) A Python script for web scraping mcc-mnc.com Link: mcc

LightCSV - This CSV reader is implemented in just pure Python.

LightCSV Simple light CSV reader This CSV reader is implemented in just pure Python. It allows to specify a separator, a quote char and column titles

JSON and CSV data for Swahili dictionary with over 16600+ words

kamusi JSON and CSV data for swahili dictionary with over 16600+ words. This repo consists of data from swahili dictionary with about 16683 words toge

Sheet Data Image/PDF-to-CSV Converter

Sheet Data Image/PDF-to-CSV Converter

DB-Drive-CSV - This is app is can be used to access CSV file as JSON from Google Drive.

DB Drive CSV This is app is can be used to access CSV file as JSON from Google Drive. How To Use Create file/ upload file to Google Drive There's 2 fi

Very simple NCHW and NHWC conversion tool for ONNX. Change to the specified input order for each and every input OP. Also, change the channel order of RGB and BGR. Simple Channel Converter for ONNX.
Very simple NCHW and NHWC conversion tool for ONNX. Change to the specified input order for each and every input OP. Also, change the channel order of RGB and BGR. Simple Channel Converter for ONNX.

scc4onnx Very simple NCHW and NHWC conversion tool for ONNX. Change to the specified input order for each and every input OP. Also, change the channel

A Python module for creating Excel XLSX files.
A Python module for creating Excel XLSX files.

XlsxWriter XlsxWriter is a Python module for writing files in the Excel 2007+ XLSX file format. XlsxWriter can be used to write text, numbers, formula

A Python module for creating Excel XLSX files.
A Python module for creating Excel XLSX files.

XlsxWriter XlsxWriter is a Python module for writing files in the Excel 2007+ XLSX file format. XlsxWriter can be used to write text, numbers, formula

É uma API feita em Python e Flask que pesquisa informações em uma tabela .xlsx e retorna o resultado.
É uma API feita em Python e Flask que pesquisa informações em uma tabela .xlsx e retorna o resultado.

API de rastreamento de pacotes É uma API feita em Python e Flask que pesquisa informações de rastreamento de pacotes em uma tabela .xlsx e retorna o r

Transfers a image file(.png) to an Excel file(.xlsx)
Transfers a image file(.png) to an Excel file(.xlsx)

Transfers a image file(.png) to an Excel file(.xlsx)

An XLSX spreadsheet renderer for Django REST Framework.

drf-renderer-xlsx provides an XLSX renderer for Django REST Framework. It uses OpenPyXL to create the spreadsheet and returns the data.

Comments
  • Empty cells in xlsx become the non-empty string

    Empty cells in xlsx become the non-empty string "None" in the dictionary

    Thank you for writing this. I have recently moved from using csv files to excel files and this has made the transition easier.

    One issue I have encountered is that empty cells are not being returned as a Falsey value like None, or the empty string ''.

    From what I have been able to understand, openpyxl reads the empty cell as None, and the following code in sheet2dict uses str to convert this into the string 'None'.

    https://github.com/Pytlicek/sheet2dict/blob/93bf731d327d620755ffd5585aedb0b23a17a6a8/sheet2dict/main.py#L30-L34

    I have filtered my own dictionaries to turn 'None' into '' but that will cause issues when any of my sheet really do contain the word None.

    I just did some quick testing, and it seems that the csv.DictReader() in the python standard library uses the empty string '' for missing values in the a csv file, so that might be a good thing to do for these cases.

    enhancement 
    opened by JoshuaCapel 5
  • Code Proposal: Allow User to Specify Sheet (Work with multi-sheet Excel Files)

    Code Proposal: Allow User to Specify Sheet (Work with multi-sheet Excel Files)

    This library is awesome! I'm using it a lot since it makes working with Excel files very simple.

    I have a proposal: Allow the user to specify the sheet they want to modify.

    I write a lot of functional programming and I can't remember if it's possible to overwrite a class function at import, so instead I've modified the sheet2dict to return a specified sheet when openpyxl opens the file.

    in main.py:

    def xlsx_to_dict(self, path, select_sheet=None):
            """
            Read a Worksheet and return it as array of dictionaries
            :param self: Worksheet Object
            :param path: Path to XLSX file
            :return: Array of rows as dictionaries
            """
            book = load_workbook(path)
            if select_sheet == None:
                  sheet = book.active
            else:
                  sheet = book[select_sheet]
    

    The user would then read an Excel file like this:

    ws =Worksheet()
    ws.xlsx_to_dict(path=<insert path>, select_sheet='<sheetname>'
    

    I don't know if this is the most elegant way to select a sheet, but it works for me. I suppose to import all worksheets by default - I think the maximum is 32 - into a larger dictionary, <edit I didn't complete my thought>, that there could be a clever way to put dictionaries in dictionaries, but this might be too convoluted since I assume most users will just use one sheet. Another option is just to continue creating Worksheet() objects with different names and sheets selected.

    enhancement 
    opened by bai-yi-bai 3
  • Add an option to parse all worksheets in one function call

    Add an option to parse all worksheets in one function call

    Added an option to parse all sheets from a workbook at once while keeping the way existing features worked untouched.

    Had to modify sheet_items to a dict with sheet names as keys and sheet's objects as values.

    enhancement 
    opened by 68Kamil68 2
  • Update README.md

    Update README.md

    Changed "the first sheet" to "active sheet" to avoid confusion if someone saves a spreadsheet and the active worksheet is, fe. second in line. Also, what is considered an active worksheet when there is only one worksheet in the spreadsheet. Add an example with and without the "select_sheet" option.

    documentation 
    opened by Pytlicek 1
Releases(0.1.5)
Owner
Tomas Pytel
PyConSK Organizer, Python Developer & DevOps
Tomas Pytel
Python 3 wrapper for the Vultr API v2.0

Vultr Python Python wrapper for the Vultr API. https://www.vultr.com https://www.vultr.com/api This is currently a WIP and not complete, but has some

CSSNR 6 Apr 28, 2022
FireEye Related Projects

FireEye FireEye Related Projects Tor-IP-Collector Simple python script that will collect a list of TOR IPs from the SecOps Institute Github and inject

Taran Ulrich 2 Nov 12, 2022
OpenTelemetry Python API and SDK

Getting Started • API Documentation • Getting In Touch (GitHub Discussions) Contributing • Examples OpenTelemetry Python This page describes the Pytho

OpenTelemetry - CNCF 1.1k Jan 08, 2023
Code and pre-trained models for "ReasonBert: Pre-trained to Reason with Distant Supervision", EMNLP'2021

ReasonBERT Code and pre-trained models for ReasonBert: Pre-trained to Reason with Distant Supervision, EMNLP'2021 Pretrained Models The pretrained mod

SunLab-OSU 29 Dec 19, 2022
A python package to avoid writing and maintaining duplicated python docstrings.

docstring-inheritance is a python package to avoid writing and maintaining duplicated python docstrings.

Antoine Dechaume 15 Dec 07, 2022
📘 OpenAPI/Swagger-generated API Reference Documentation

Generate interactive API documentation from OpenAPI definitions This is the README for the 2.x version of Redoc (React-based). The README for the 1.x

Redocly 19.2k Jan 02, 2023
Read write method - Read files in various types of formats

一个关于所有格式文件读取的方法 1。 问题描述: 各种各样的文件格式,读写操作非常的麻烦,能够有一种方法,可以整合所有格式的文件,方便用户进行读取和写入。 2

2 Jan 26, 2022
Course Materials for Math 340

UBC Math 340 Materials This repository aims to be the one repository for which you can find everything you about Math 340. Lecture Notes Lecture Notes

2 Nov 25, 2021
sphinx builder that outputs markdown files.

sphinx-markdown-builder sphinx builder that outputs markdown files Please ★ this repo if you found it useful ★ ★ ★ If you want frontmatter support ple

Clay Risser 144 Jan 06, 2023
Test utility for validating OpenAPI documentation

DRF OpenAPI Tester This is a test utility to validate DRF Test Responses against OpenAPI 2 and 3 schema. It has built-in support for: OpenAPI 2/3 yaml

snok 106 Jan 05, 2023
MkDocs plugin for setting revision date from git per markdown file

mkdocs-git-revision-date-plugin MkDocs plugin that displays the last revision date of the current page of the documentation based on Git. The revision

Terry Zhao 48 Jan 06, 2023
Poetry plugin to export the dependencies to various formats

Poetry export plugin This package is a plugin that allows the export of locked packages to various formats. Note: For now, only the requirements.txt f

Poetry 90 Jan 05, 2023
JTEX is a command line tool (CLI) for rendering LaTeX documents from jinja-style templates.

JTEX JTEX is a command line tool (CLI) for rendering LaTeX documents from jinja-style templates. This package uses Jinja2 as the template engine with

Curvenote 15 Dec 21, 2022
Jupyter Notebooks as Markdown Documents, Julia, Python or R scripts

Have you always wished Jupyter notebooks were plain text documents? Wished you could edit them in your favorite IDE? And get clear and meaningful diff

Marc Wouts 5.7k Jan 04, 2023
Openapi-core is a Python library that adds client-side and server-side support for the OpenAPI Specification v3.

Openapi-core is a Python library that adds client-side and server-side support for the OpenAPI Specification v3.

A 186 Dec 30, 2022
Plugins for MkDocs.

Plugins for MkDocs and Python Markdown pip install neoteroi-mkdocs This package includes the following plugins and extensions: Name Description Type m

35 Dec 23, 2022
Pystm32ai - A Python wrapper for the stm32ai command-line tool

PySTM32.AI A python wrapper for the stm32ai command-line tool to analyse deep le

Thibaut Vercueil 5 Jul 28, 2022
Show Rubygems description and annotate your code right from Sublime Text.

Gem Description for Sublime Text Show Rubygems description and annotate your code. Just mouse over your Gemfile's gem definitions to show the popup. s

Nando Vieira 2 Dec 19, 2022
NoVmpy - NoVmpy with python

git clone -b dev-1 https://github.com/wallds/VTIL-Python.git cd VTIL-Python py s

263 Dec 23, 2022
Tutorial for STARKs with supporting code in python

stark-anatomy STARK tutorial with supporting code in python Outline: introduction overview of STARKs basic tools -- algebra and polynomials FRI low de

121 Jan 03, 2023