Black for Python docstrings and reStructuredText (rst).

Overview

Style-Doc

Style-Doc is Black for Python docstrings and reStructuredText (rst). It formats docstrings (Google docstring format) in Python files as well as reStructuredText files.

This project is maintained by the One Conversation team of Deutsche Telekom AG.
It is based on the style_doc.py script from the HuggingFace Inc. team.

Installation

Style-Doc is available on the Python Package Index (PyPI). It can be installed with pip:

$ pip install style-doc

Usage

$ style-doc --help
usage: style-doc [-h] [--max_len MAX_LEN] [--check_only] [--py_only]
                 [--rst_only]
                 files [files ...]

positional arguments:
  files              The file(s) or folder(s) to restyle.

optional arguments:
  -h, --help         show this help message and exit
  --max_len MAX_LEN  The maximum length of lines.
  --check_only       Whether to only check and not fix styling issues.
  --py_only          Whether to only check py files.
  --rst_only         Whether to only check rst files.
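
As a rough illustration, an interface like the one above could be declared with Python's argparse module as sketched below. This is not the project's actual source: the build_parser name and the DEFAULT_MAX_LEN value of 99 are assumptions made for the example. Giving --max_len an integer default means later length comparisons never see None when the option is omitted, and a mutually exclusive group lets argparse itself reject --py_only combined with --rst_only (it exits with status 2).

```python
import argparse

# Hypothetical placeholder; the project's real default line length is not stated here.
DEFAULT_MAX_LEN = 99


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the options listed in the help output."""
    parser = argparse.ArgumentParser(prog="style-doc")
    parser.add_argument("files", nargs="+", help="The file(s) or folder(s) to restyle.")
    # An int default avoids comparing a length with None when the flag is omitted.
    parser.add_argument("--max_len", type=int, default=DEFAULT_MAX_LEN,
                        help="The maximum length of lines.")
    parser.add_argument("--check_only", action="store_true",
                        help="Whether to only check and not fix styling issues.")
    # Only one of the two file-type filters can be given per run.
    group = parser.add_mutually_exclusive_group()
    group.add_argument("--py_only", action="store_true",
                       help="Whether to only check py files.")
    group.add_argument("--rst_only", action="store_true",
                       help="Whether to only check rst files.")
    return parser


args = build_parser().parse_args(["--check_only", "src", "docs"])
print(args.max_len, args.check_only, args.files)
```

With this layout, passing both --py_only and --rst_only makes argparse print a usage error and stop before any file is touched.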

Examples

  • format all docstrings (.py files) and rst files in the src and docs folders with a line length of 99:
    style-doc --max_len 99 src docs
  • check all docstrings (.py files) and rst files in the src and docs folders with a line length of 99:
    style-doc --max_len 99 --check_only src docs
  • format all docstrings (.py files only) in the src folder with a line length of 99:
    style-doc --max_len 99 --py_only src
  • check all docstrings (.py files only) in the src folder with a line length of 99:
    style-doc --max_len 99 --check_only --py_only src
  • format only the rst files in the docs folder with a line length of 99:
    style-doc --max_len 99 --rst_only docs
  • check only the rst files in the docs folder with a line length of 99:
    style-doc --max_len 99 --check_only --rst_only docs

To integrate Style-Doc (and more checks) into your GitHub Actions, see our static_checks.yml example and our configuration in setup.py.
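
For a CI script written in Python, the check-mode invocations above can be assembled and run with subprocess. The sketch below is only an illustration: build_style_doc_command and docs_are_styled are hypothetical helper names, and it assumes that style-doc returns a non-zero exit status when files would need restyling.

```python
import subprocess
from typing import List, Optional, Sequence


def build_style_doc_command(
    paths: Sequence[str],
    max_len: int = 99,
    check_only: bool = True,
    only: Optional[str] = None,
) -> List[str]:
    """Assemble a style-doc command line from the documented options.

    `only` may be None, "py" or "rst"; using a single parameter makes it
    impossible to request --py_only and --rst_only together.
    """
    if only not in (None, "py", "rst"):
        raise ValueError(f"only must be None, 'py' or 'rst', got {only!r}")
    command = ["style-doc", "--max_len", str(max_len)]
    if check_only:
        command.append("--check_only")
    if only is not None:
        command.append(f"--{only}_only")
    command.extend(paths)
    return command


def docs_are_styled(paths: Sequence[str], max_len: int = 99) -> bool:
    """Run style-doc in check mode (requires style-doc on PATH)."""
    result = subprocess.run(build_style_doc_command(paths, max_len), check=False)
    # Assumption: a non-zero exit status means some files need restyling.
    return result.returncode == 0


print(build_style_doc_command(["src", "docs"]))
print(build_style_doc_command(["src"], only="py"))
```

A CI job could call docs_are_styled(["src", "docs"]) and fail the build when it returns False.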

Support and Feedback

The following channels are available for discussions, feedback, and support requests:

Contribution

Our commitment to open source means that we enable, and in fact encourage, all interested parties to contribute and become part of our developer community.

Contributions and feedback are encouraged and always welcome. For more information about how to contribute, see our Contribution Guidelines. By participating in this project, you agree to abide by its Code of Conduct at all times.

Code of Conduct

This project has adopted the Contributor Covenant, version 2.0, as its code of conduct. Please see the details in our CODE_OF_CONDUCT.md. All contributors must abide by the code of conduct.

Working Language

We use English as the primary project language.

Consequently, all content will be made available primarily in English. We also ask everyone to use English when creating issues, in code (comments, documentation, etc.), and when sending requests to us. The application itself and all end-user-facing content will be made available in other languages as needed.

Licensing

Copyright (c) 2020 The HuggingFace Inc. team
Copyright (c) 2021 Philip May, Deutsche Telekom AG

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

Comments
  • --max-len seems mandatory, not optional parameter

    I run `style-doc . --check` and get an error, while `style-doc . --check --max-len 80` works.

    The error message is:

      File "c:\users\epogr\anaconda3\lib\site-packages\style_doc\style_doc.py", line 460, in style_docstring 
        if len(docstring) < max_len and "\n" not in docstring:
    TypeError: '<' not supported between instances of 'int' and 'NoneType'
    
    opened by epogrebnyak 2
  • How should we "communicate" an error?

    "You must not set --py_only and --rst_only at the same time." with sys.exit(1) or -1 or raise ValueError(...

    raise ValueError(f"{len(changed)} files should be restyled!") or use sys.exit...

    enhancement help wanted 
    opened by PhilipMay 2
  • Ignore commented-out classes/functions/etc.

    Currently the search for """ does not respect commented-out code:

        # For future implementation
        # def base_url(self) -> str:
        #     """
        #     Generate SCIM base url
        #     """
        #     return "https://app.asana.com/api/1.0/scim/"
    

    becomes:

        # For future implementation
        # def base_url(self) -> str:
        #     """
        # Generate SCIM base url #
        """
        #     return "https://app.asana.com/api/1.0/scim/"
    

    This is a syntax error, because one of the """ markers ends up uncommented.

    opened by dragonpaw 2
  • Create a git pre-commit hook for style-doc

    Have you considered packaging style-doc for use as a git pre-commit hook, and listing it with the pre-commit project? It seems like it would be a great addition, and make it very easy for people to integrate the docstring formatter into their existing workflows and get automatic updates when new releases happen.

    opened by zaneselvans 1
  • Fix issues when code has `"""` but is not a docstring

    We had to apply this workaround:

    https://github.com/telekom/style-doc/blob/db352ed72ae4473a805d485692df58ec4511a673/style_doc/style_doc.py#L495-L497

    # fmt: off and # fmt: on are needed so that black does not convert it back to '"""'.

    bug 
    opened by PhilipMay 0
  • Add option to use config file

    Use pyproject.toml

    see black

    • https://github.com/psf/black/blob/7567cdf3b4f32d4fb12bd5ca0da838f7ff252cfc/src/black/files.py#L69
    • https://github.com/psf/black/blob/017aafea992ca1c6d7af45d3013af7ddb7fda12a/src/black/__init__.py#L44
    enhancement good first issue low priority 
    opened by PhilipMay 0
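
One possible way to approach the commented-out code issue reported above is to let Python's tokenize module locate string literals, since text inside a # comment is never emitted as a STRING token. The sketch below is not how style-doc currently searches for docstrings; triple_quoted_strings is a hypothetical helper written for illustration.

```python
import io
import tokenize
from typing import List, Tuple


def triple_quoted_strings(source: str) -> List[Tuple[int, str]]:
    """Return (start line, text) for each real triple-quoted string token."""
    found = []
    for token in tokenize.generate_tokens(io.StringIO(source).readline):
        if token.type != tokenize.STRING:
            continue
        # Strip a string prefix such as r, b or u before checking the quotes.
        body = token.string.lstrip("rRbBuUfF")
        if body.startswith(('"""', "'''")):
            found.append((token.start[0], token.string))
    return found


# The snippet from the issue: every triple quote sits inside a comment.
commented_out = '''# For future implementation
# def base_url(self) -> str:
#     """
#     Generate SCIM base url
#     """
#     return "https://app.asana.com/api/1.0/scim/"
'''

# The same function without the comment markers.
real_code = '''def base_url(self) -> str:
    """
    Generate SCIM base url
    """
    return "https://app.asana.com/api/1.0/scim/"
'''

print(triple_quoted_strings(commented_out))
print(triple_quoted_strings(real_code))
```

The commented-out snippet yields no candidates, while the uncommented version yields a single triple-quoted string starting on line 2, so only real docstrings would be passed on for restyling.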
Releases(0.2.0)
Owner
Telekom Open Source Software
published by Deutsche Telekom AG and partner companies