A common, beautiful interface to tabular data, no matter the format

Overview

rows

Join the chat at https://gitter.im/turicas/rows Current version at PyPI Downloads per month on PyPI Supported Python Versions Software status License: LGPLv3

No matter in which format your tabular data is: rows will import it, automatically detect types and give you high-level Python objects so you can start working with the data instead of trying to parse it. It is also locale-and-unicode aware. :)

Want to learn more? Read the documentation (or build and browse the docs locally by running make docs-serve after installing requirements-development.txt).

Installation

The easiest way to getting the hands dirty is install rows, using pip.

PyPI

pip install rows

For another ways to instal refer to the Installation section documentation.

Contribution start guide

The preferred way to start contributing for the project is creating a virtualenv (you can do by using virtualenv, virtualenvwrapper, pyenv or whatever tool you'd like).

Create the virtualenv:

mkvirtualenv rows

Install all plugins' dependencies:

pip install --editable .[all]

Install development dependencies:

pip install -r requirements-development.txt
Comments
  • OverflowError

    OverflowError

    Após instalar as dependências requeridas para-o pacote socios-brasil, ao tentar descompactar como indicado, obtenho o erro abaixo:

    Traceback (most recent call last):
     File "extract_dump.py", line 27, in <module> 
        import rows
     File "C:\Users\milcent\AppData\Local\Continuum\Anaconda3\lib\site-packages\row s\__init__.py", line 22, in <module>
        import rows.plugins as plugins
     File "C:\Users\milcent\AppData\Local\Continuum\Anaconda3\lib\site-packages\row s\plugins\__init__.py", line 20, in <module>
        from . import plugin_csv as csv # NOQA
     File "C:\Users\milcent\AppData\Local\Continuum\Anaconda3\lib\site-packages\row s\plugins\plugin_csv.py", line 34, in <module>
        unicodecsv.field_size_limit(sys.maxsize) 
    OverflowError: Python int too large to convert to C long
    

    Rodando em Windows 7, Anaconda 64 bits, Python 3.6. Grato, Marcel Milcent

    opened by milcent 13
  • PDF Plugin

    PDF Plugin

    Create an algorithm to automatically extract tables from PDFs (available in text format). Could use pdftables, but the code is not up-to-date, does not work with Python3 etc.

    enhancement plugin 
    opened by turicas 7
  • Converter PDF x TXT

    Converter PDF x TXT

    Bom dia, estou tentando converter um arquivo pdf escaneado para texto (o pdf contém tabelas). Consegui instalar a biblioteca rows e as dependências rows[pdf], rows[cli]. Quando eu tento rodar o código em prompt command: rows pdf-to-text teste.pdf result.txt Eu tenho o seguinte erro: image

    Alguma ideia do que possa ser o problema?

    opened by Danielydsm 6
  • Autodetect delimiter in CSV files

    Autodetect delimiter in CSV files

    Currently the import_from_csv method have the parameter 'delimiter' that assumes ',' as default, but sometimes we don't know what is the delimiter and need it autodetect. Specially usefull in case of CSV files generated in MS Excell that uses ';' as delimiter.

    A quick and dirty possibility to make this works is counting the number of times ',', ';' and 'tab' is used in the file and assumes as delimiter the most used.

    enhancement help wanted plugin 
    opened by jeanferri 6
  • OverflowError: Python int too large to convert to C long

    OverflowError: Python int too large to convert to C long

    Bom dia!

    Estou aprendendo Python, então este pode ser um erro bem simples de resolver, mesmo assim não faço ideia do que pode ser feito:

    Ao tentar importar o rows aparece a mensagem do título.

    duplicate 
    opened by tbmpereira 5
  • Text plugin is not working on `rows convert`

    Text plugin is not working on `rows convert`

    The file cha-de-bebe.txt is not being read correctly on the command line (try rows print cha-de-bebe.txt or rows convert cha-de-bebe.txt cha-de-bebe.csv) -- but it was generated correctly using rows print http://some-url/ > cha-de-bebe.txt.

    @jsbueno could you please help checking it? I think this bug started after your PR #270 .

    bug 
    opened by turicas 5
  • locale.Error: unsupported locale setting

    locale.Error: unsupported locale setting

    ======================================================================
    ERROR: test_DecimalField (tests.tests_fields.FieldsTestCase)
    ----------------------------------------------------------------------
    Traceback (most recent call last):
      File "/home/brain/git/fedora/python-rows/rows-0.3.0/tests/tests_fields.py", line 203, in test_DecimalField
        with rows.locale_context(locale_name):
      File "/usr/lib64/python3.5/contextlib.py", line 59, in __enter__
        return next(self.gen)
      File "/home/brain/git/fedora/python-rows/rows-0.3.0/rows/localization.py", line 23, in locale_context
        locale.setlocale(category, name)
      File "/usr/lib64/python3.5/locale.py", line 594, in setlocale
        return _setlocale(category, locale)
    locale.Error: unsupported locale setting
    
    ======================================================================
    ERROR: test_FloatField (tests.tests_fields.FieldsTestCase)
    ----------------------------------------------------------------------
    Traceback (most recent call last):
      File "/home/brain/git/fedora/python-rows/rows-0.3.0/tests/tests_fields.py", line 171, in test_FloatField
        with rows.locale_context(locale_name):
      File "/usr/lib64/python3.5/contextlib.py", line 59, in __enter__
        return next(self.gen)
      File "/home/brain/git/fedora/python-rows/rows-0.3.0/rows/localization.py", line 23, in locale_context
        locale.setlocale(category, name)
      File "/usr/lib64/python3.5/locale.py", line 594, in setlocale
        return _setlocale(category, locale)
    locale.Error: unsupported locale setting
    
    ======================================================================
    ERROR: test_IntegerField (tests.tests_fields.FieldsTestCase)
    ----------------------------------------------------------------------
    Traceback (most recent call last):
      File "/home/brain/git/fedora/python-rows/rows-0.3.0/tests/tests_fields.py", line 144, in test_IntegerField
        with rows.locale_context(locale_name):
      File "/usr/lib64/python3.5/contextlib.py", line 59, in __enter__
        return next(self.gen)
      File "/home/brain/git/fedora/python-rows/rows-0.3.0/rows/localization.py", line 23, in locale_context
        locale.setlocale(category, name)
      File "/usr/lib64/python3.5/locale.py", line 594, in setlocale
        return _setlocale(category, locale)
    locale.Error: unsupported locale setting
    
    ======================================================================
    ERROR: test_PercentField (tests.tests_fields.FieldsTestCase)
    ----------------------------------------------------------------------
    Traceback (most recent call last):
      File "/home/brain/git/fedora/python-rows/rows-0.3.0/tests/tests_fields.py", line 250, in test_PercentField
        with rows.locale_context(locale_name):
      File "/usr/lib64/python3.5/contextlib.py", line 59, in __enter__
        return next(self.gen)
      File "/home/brain/git/fedora/python-rows/rows-0.3.0/rows/localization.py", line 23, in locale_context
        locale.setlocale(category, name)
      File "/usr/lib64/python3.5/locale.py", line 594, in setlocale
        return _setlocale(category, locale)
    locale.Error: unsupported locale setting
    
    ======================================================================
    ERROR: test_locale_context (tests.tests_localization.LocalizationTestCase)
    ----------------------------------------------------------------------
    Traceback (most recent call last):
      File "/home/brain/git/fedora/python-rows/rows-0.3.0/tests/tests_localization.py", line 41, in test_locale_context
        with locale_context(name):
      File "/usr/lib64/python3.5/contextlib.py", line 59, in __enter__
        return next(self.gen)
      File "/home/brain/git/fedora/python-rows/rows-0.3.0/rows/localization.py", line 23, in locale_context
        locale.setlocale(category, name)
      File "/usr/lib64/python3.5/locale.py", line 594, in setlocale
        return _setlocale(category, locale)
    locale.Error: unsupported locale setting
    
    opened by ignatenkobrain 5
  • Porting rows to Python3

    Porting rows to Python3

    This is a work in progress.

    I could make all tests pass on Python3, but 3 are broken on Python2 because of something I can't find yet on the type identification system.

    This PR is just to share it with you. Maybe your familiarity with the code can help fixing the tests.

    []'s!

    opened by henriquebastos 5
  • UserWarning: Call to deprecated function or class get_active_sheet

    UserWarning: Call to deprecated function or class get_active_sheet

    Hi, when I build package for Debian, debhelper tools runs pybuild, showing this warnings [1] I use the lastest source: git20151115.837b41.

    Is there something here or other has the same problem? thanks.

    [1] pybuild --test --test-nose -i python{version} -p 2.7 --dir . I: pybuild base:184: cd /pkgs/pkg-rows/rows-0.1.1+git20151115.837b41/.pybuild/pythonX.Y_2.7/build; python2.7 -m nose tests ...................................................................................................../usr/lib/python2.7/dist-packages/openpyxl/workbook/workbook.py:102: UserWarning: Call to deprecated function or class get_active_sheet (Use the .active property). def get_active_sheet(self): /usr/lib/python2.7/dist-packages/openpyxl/workbook/workbook.py:102: UserWarning: Call to deprecated function or class get_active_sheet (Use the .active property). def get_active_sheet(self): ./usr/lib/python2.7/dist-packages/openpyxl/workbook/workbook.py:102: UserWarning: Call to deprecated function or class get_active_sheet (Use the .active property). def get_active_sheet(self): ./usr/lib/python2.7/dist-packages/openpyxl/workbook/workbook.py:102: UserWarning: Call to deprecated function or class get_active_sheet (Use the .active property). def get_active_sheet(self):

    ..........................

    Ran 129 tests in 1.936s

    OK

    opened by kretcheu 5
  • Add sphinx documentation

    Add sphinx documentation

    Hello dear reviewer,

    I basically did three things:

    • Add the sphinx to the requirements-development.txt
    • Create a basic documentation, based on the Readme, with few improvements i've made.
    • Move some basic project information (intro and archtecture) to the init.py of the rows module

    I think the Sphinx doc can also be used as a website, and maybe can be hosted at github pages.

    []'s I hope this will be usefull! :)

    opened by raphapassini 5
  • Could not find import_from_pdf function

    Could not find import_from_pdf function

    I need to import data from pdf and found this example: https://gist.github.com/turicas/6b9ca83dcd531a6cd4fd87ced2a28c70

    But I was unable to run it, since the import_from_pdf is not available to me.

    I have already run the command: pip install rows[all]

    Is pdf format no longer supported?

    opened by marcellalves 4
  • New release on pypi

    New release on pypi

    I started using the "rows" lib today, and I've lost several hours of work because of a bug on empty cells in ods input. Here is my story.

    I was learning/discovering the "rows" lib with an ODS file, and I fall across a strange behavior. Of course, I thought it was because I didn't use the lib properly : so I tried all possible options, searched on the Internet... etc. After several hours, I eventually tried the same code with an equivalent XLSX file and I found out that the behavior was different ! So I realized that I had found a bug on my first day of use of the rows lib !

    I decided that I should report the bug. I took the time to write a script to illustrate my bug report. I was using rows 0.4.1 from pypi, but, before creating the bug report on github, I thought I should check if the bug is still present in the "develop" branch... and my script shows that the bug is fixed in the "develop" branch !

    Release 0.4.1 is dated Feb 14, 2019... almost 4 years old ! There has been 210 commits since 0.4.1 ; among these 210 commits, I counted about 45 fixes. While counting the commit messages with a fix message, I found the commit that fixes my bug: issue #320 fixed on Match 27 2019 in this commit https://github.com/turicas/rows/commit/c569f9415f2c76b2f6e9afbe1d748946e759711f

    So, in December 2022, some users are wasting hours because of a bug that was found and fixed 3,5 years ago :-( No comment !

    So, please, push a new release on pypi !

    opened by alexis-via 2
  • Replace unicodecsv by standard csv module

    Replace unicodecsv by standard csv module

    unicodecsv is not maintained since a while now [1]. It was preferred over standard csv because of the unicode support. Now that Python3 csv module [2] supports it, let's use it.

    For more context, we hit issues while rebuilding uncicodecsv during Fedora Python3.11 mass rebuild [3][4].

    [1] https://github.com/jdunck/python-unicodecsv [2] https://docs.python.org/3/library/csv.html [3] https://copr.fedorainfracloud.org/coprs/g/python/python3.11/package/python-unicodecsv/ [4] https://bugzilla.redhat.com/show_bug.cgi?id=2021938

    opened by jcapiitao 1
  • NameError: name 'obj' is not defined

    NameError: name 'obj' is not defined

    Esse erro rolou quando fui tentar usar o método closest_same_column em rows.plugins.pdf image

    Aparentemente aqui no código está faltando a parte em que pegamos o o objeto que tem o valor passado como parâmetro para trabalharmos com ele (e aparentemente isso também acontece com o outro método closest_same_line

    opened by dehatanes 0
  • Python 3.10: cannot import name 'Iterator' from 'collections'

    Python 3.10: cannot import name 'Iterator' from 'collections'

    File "/data/data/com.termux/files/usr/lib/python3.10/site-packages/rows/plugins/utils.py", line 20, in <module> 
    from collections import Iterator, OrderedDict            
    ImportError: cannot import name 'Iterator' from 'collections'
    

    Maybe this will be fix:

    try:
        from collections.abc import Iterator
    except ImportError:
        from collections import Iterator
    
    opened by fagci 0
  • [pgimport] Option to do not store values as NULL

    [pgimport] Option to do not store values as NULL

    NULL values can be confusing when analyzing data and there will be some cases where we prefer to add empty values as empty strings instead of NULL. The function pgimport (and the CLI equivalent) should have an option to deal with this scenario.

    enhancement cli plugin utils 
    opened by turicas 0
Releases(v0.4.1)
Owner
Álvaro Justen
Free/libre software hacker, hypnotist, remote worker, teacher, coffee lover/roaster
Álvaro Justen
Org agenda in the console

This Python script reads an org agenda file (i.e. a regular org file with some active dates) and displays an interactive and colored year calendar with detailed information for each day when the mous

Nicolas P. Rougier 113 Jan 03, 2023
An attempt at furthering Factorio Calculator to work in more general contexts.

factorio-optimizer Lets do Factorio Calculator but make it optimize. Why not use Factorio Calculator? Becuase factorio calculator is not general. The

Jonathan Woollett-Light 1 Jun 03, 2022
Repositório de código de curso de Djavue ministrado na Python Brasil 2021

djavue-python-brasil Repositório de código de curso de Djavue ministrado na Python Brasil 2021 Completamente baseado no curso Djavue. A diferença está

Buser 15 Dec 26, 2022
Data Science Course at Dept. of Computer Engineering, Chula 2022

2110446 Data Science Course at Chula 2022 Short links for exercises: Week1: Intro to Numpy, Pandas Numpy: https://colab.research.google.com/github/kao

Kao Panboonyuen 17 Nov 27, 2022
Apache Superset out of box version(Windows 64-bit)

superset_app Apache Superset out of box version (Windows 64bit) prepare job download 3 files python-3.8.10-embed-amd64.zip get-pip.py python_geohash‑0

Steven Lee 9 Oct 02, 2022
Pre-1.0 door/chest sound injector for Minecraft

doorjector Pre-1.0 door/chest sound injector for Minecraft. While the game is running, doorjector hotswaps the new sounds for the old right before the

Sam 1 Nov 20, 2021
Modelling and Implementation of Cable Driven Parallel Manipulator System with Tension Control

Cable Driven Parallel Robots (CDPR) is also known as Cable-Suspended Robots are the emerging and flexible end effector manipulation system. Cable-driven parallel robots (CDPRs) are categorized as a t

Siddharth U 0 Jul 19, 2022
Auto Join Zoom Meeting

Auto-Join-Zoom-Meeting Join a zoom meeting with out filling in meeting id's or passcodes, one button for it all! Setup See attached excel document. MA

JareBear 1 Jan 25, 2022
A small Blender addon for changing an object's local orientation while in edit mode

A small Blender addon for changing an object's local orientation while in edit mode.

Jonathan Lampel 50 Jan 06, 2023
本仓库整理了腾讯视频、爱奇艺、优酷、哔哩哔哩等视频网站中,能够观看的「豆瓣电影 Top250 榜单」影片。

Where is top 250 movie ? 本仓库整理了腾讯视频、爱奇艺、优酷、哔哩哔哩等视频网站中,能够观看的「豆瓣电影 Top250 榜单」影片,点击 Badge 可跳转至相应的电影首页。

MayanDev 123 Dec 22, 2022
Easily map device and application controls to a midi controller

pymidicontroller Introduction Easily map device and application controls to a midi controller

Tane Barriball 24 May 16, 2022
Mata kuliah Bahasa Pemrograman

praktikum2 MENGHITUNG LUAS DAN KELILING LINGKARAN FLOWCHART : OUTPUT PROGRAM : PENJELASAN : Tetapkan nilai pada variabel sesuai inputan dari user :

2 Nov 09, 2021
With Christmas and New Year ahead, it is time for some festive coding. Here is a Christmas Card for you all!

Christmas Card With Christmas and New Year ahead, it is time for some festive coding! Here is a Christmas Card for you all! NOTE: I have not made this

CodeMaster7000 1 Dec 25, 2021
Repository for my Monika Assistant project

Monika_Assistant Repository for my Monika Assistant project Major changes: Added face tracker Added manual daily log to see how long it takes me to fi

3 Jan 10, 2022
A modern Python build backend

trampolim A modern Python build backend. Features Task system, allowing to run arbitrary Python code during the build process (Planned) Easy to use CL

Filipe Laíns 39 Nov 08, 2022
Rufus port to linux, writed on Python3

Rufus-for-Linux Rufus port to linux, writed on Python3 Программа будет иметь тот же интерфейс что и оригинал, и тот же функционал. Программа создается

6 Jan 07, 2022
Statistics Calculator module for all types of Stats calculations.

Statistics-Calculator This Calculator user the formulas and methods to find the statistical values listed. Statistics Calculator module for all types

2 May 29, 2022
CEI Natural Disaster Tracking Portal

CEI Natural Disaster Tracking Portal (cc) Climatic Eye of ISCI We are an initiative that conducts studies in the field of Space Science, publishes pro

Baris Dincer 7 Dec 24, 2022
SimCSE在中文任务上的简单实验

SimCSE 中文测试 SimCSE在常见中文数据集上的测试,包含ATEC、BQ、LCQMC、PAWSX、STS-B共5个任务。 介绍 博客:https://kexue.fm/archives/8348 论文:《SimCSE: Simple Contrastive Learning of Sente

苏剑林(Jianlin Su) 504 Jan 04, 2023
A collection of existing KGQA datasets in the form of the huggingface datasets library, aiming to provide an easy-to-use access to them.

KGQA Datasets Brief Introduction This repository is a collection of existing KGQA datasets in the form of the huggingface datasets library, aiming to

Semantic Systems research group 21 Jan 06, 2023