The RAP community of practice includes all analysts and data scientists who are interested in adopting the working practices included in reproducible analytical pipelines (RAP) at NHS Digital.

Overview

Warning - this repository is a snapshot of a repository internal to NHS Digital. This means that links to videos and some URLs may not work.

Repository owner: NHS Digital Analytical Services

Email: [email protected]

To contact us, raise an issue on GitHub or send us an email and we will respond promptly.

RAP community of practice

Welcome to the landing page for the RAP community of practice repo.

You can learn all about reproducible analytical pipelines (RAP) on our what is RAP page. In a nutshell, RAP is becoming the standard for publishing analytical outputs in government. It combines a number of ways of working that help to improve the reliability, transparency, and speed of statistics publications. RAP follows the principles of the Aqua Book guidelines, which centre on analysis being reproducible, auditable, transparent, and quality assured.

The RAP community of practice includes all analysts and data scientists who are interested in adopting the working practices included in reproducible analytical pipelines (RAP). This repo is a central repository for resources and guidance to help teams adopting RAP practices. There is an associated [MS Teams page] where you can introduce yourself, ask for help, or discuss different approaches. Over time we hope to build up a community of people who can self-support and further develop these ways of working.

The community of practice aims to support teams in adopting RAP practices through:

  1. Offering in-person support as teams establish new working practices
  2. Producing learning materials that offer reusable templates adapted for the NHSD analytical environment

This work is prompted by the observation that teams can struggle to adopt RAP practices without direct support. While no one element of RAP is particularly difficult, learning several new skills at the same time as delivering business as usual (BAU) is challenging, and teams can struggle to find the protected time to embed these practices. See the Statistics Authority report on the barriers to RAP adoption for more information. Luckily, in NHSD we have strong senior support for RAP and many teams have already begun to adopt many of its practices. Consequently, we already have a large pool of skilled, enthusiastic analysts who are willing to help others. These resources also aim to support the goals laid out in the Goldacre report Bringing NHS data analysis into the 21st century and to align with Tim Berners-Lee's Five star data principles.

Support and training

If your team is embarking upon a RAP journey, you should look at our what is RAP page and try to complete the self-assessment. From there, we recommend reaching out for some in-person support. The RAP Champion Function (within the Data Science Skilled Team) can offer support in many forms:

  • Reviewing your RAP work and assessing your progress against the levels of RAP
  • Peer review of code
  • Workshops for a specific RAP capability
  • Consultancy style engagement where we plan a migration strategy
  • Pair coding
  • Shadowing another team

If you want to talk about any of this then please reach out on the [RAP community of practice MS Teams] page (internal to NHSD).

We maintain a list of people who are willing to dedicate some time to support others. Please add your name to the mix if you are willing to support someone else. You don't need to be an expert - just willing to share what you know.

Tutorials and resources

As we work alongside teams, we try to produce reusable learning materials pitched specifically at supporting NHSD teams. We try (with partial success) to avoid reproducing guidance that is easily available online; instead, we link to lots of external resources where you can self-serve. Our own focus is on bespoke guidance that lays out how you would accomplish these practices in the NHSD setting.

Here are some of the initial resources:

These resources are demand-driven so if you want something then please ask on the [MS Teams page]. We would also ask you to contribute if you can improve on any of the resources or can fill in any other gaps.

The resources are not intended to be prescriptive. There are many ways to accomplish a task and teams have valid reasons for choosing other approaches. Instead the intention of the resources provided here is to offer a way in for teams who want to adopt good practices that they have heard about but don't know where to start.

Misc

We have taken inspiration from the NHSD software engineering COP. It has tons of great material, so we encourage you to read and reflect on these working practices.

Licence

The RAP Community of Practice codebase is released under the MIT License.

The documentation is © Crown copyright and available under the terms of the Open Government Licence v3.0.

Comments
  • Dead link

    opened by abbieprescott 4
  • Dependency management "not possible in DAE"

    In Levels of RAP it says: Does your repo include dependency management? (i.e. requirements.txt or conda environment for RDS users. Not possible in DAE)

    It's not strictly true that this cannot be described for DAE - though it is more limited. One can describe the cluster used (runtime, libraries etc).

    opened by SamHollings 2
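
As a rough illustration of the dependency description discussed in this issue, the sketch below lists every installed package with its exact version, in the same `name==version` form that a requirements.txt uses. It assumes a standard Python environment (for example RDS); `pinned_requirements` is a hypothetical helper name, not something from this repository. On a managed cluster, the equivalent would be to record the runtime and library versions in the repo by whatever means the platform offers.

```python
from importlib.metadata import distributions


def pinned_requirements() -> list[str]:
    """Return sorted 'name==version' lines for the installed distributions."""
    pins = set()
    for dist in distributions():
        name = dist.metadata["Name"]
        if name:  # skip entries whose metadata has no package name
            pins.add(f"{name}=={dist.version}")
    return sorted(pins, key=str.lower)


if __name__ == "__main__":
    # Print the pins so they can be redirected into a requirements file.
    print("\n".join(pinned_requirements()))
```

Committing the output alongside the code gives a reviewer a record of the environment the pipeline was last run in.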
  • RAP Publishing Checks - Clarify what are credentials and secrets

    We've had some feedback that the part of the publishing checks that says "no credentials or secrets" is not clear, as analysts have not seen these terms before.

    The following text might make things easier to understand:

    Credentials or secrets are essentially passwords that computers use for encrypted communication or access to services. For example, with many APIs (like the Google Maps API) you must supply a credential code to access the service. These codes often look like long, strange combinations of letters and numbers (l79sDgH9s...). We must not share our passwords publicly, so you should not commit credentials and secrets.

    opened by goodyguts 2
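
One common way to keep a credential out of committed code is to read it from an environment variable at run time. The sketch below is a minimal example of that pattern; the function name `get_api_key` and the variable name `MAPS_API_KEY` are made up for illustration and are not part of the publishing checks.

```python
import os


def get_api_key(variable: str = "MAPS_API_KEY") -> str:
    """Read a secret from the environment instead of hard-coding it.

    The variable name is a placeholder; use whatever your team agrees on.
    """
    key = os.environ.get(variable)
    if not key:
        raise RuntimeError(
            f"Set the {variable} environment variable before running the pipeline."
        )
    return key
```

Because the value lives outside the repository, the code can be published without exposing the credential, and each user supplies their own.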
  • Environment and dependency management - needs to be clearer

    In the "levels of RAP", people become confused by environment and dependency management. We need to link to a page that very clearly describes what these are, what the point of them is, and how people can know whether they are meeting this requirement.

    opened by SamHollings 2
  • Pyspark guidance

    I'm not a fan of referring to it as a "flavour of python" (on the about PySpark page).

    I think PySpark should be contained underneath Python.

    I also think it should make it clear that distribution of processing only occurs if it is set up correctly: Spark on a normal laptop will not be any more powerful than, say, pandas. On a big cluster in Databricks it is a different story.

    I think this page might also need a reference to other python datastructures - and how there is a right tool for the right job.

    duplicate 
    opened by SamHollings 1
  • Split out Terminal guidance from "git" guidance

    The terminal guidance is contained within the git guidance - but the terminal is a separate tool which can be used for many purposes - probably better to have it as its own level alongside Python, git etc, and then for these pages to be referenced by the other technologies.

    opened by SamHollings 1
  • code in the open - topics and add to data-analytics-services

    On the "how to publish your code in the open" page, we should tell people to add their publication to the page https://github.com/NHSDigital/data-analytics-services and to set appropriate topics for their publication, e.g. nhs-digital-publication

    opened by SamHollings 1
  • Signpost resources to ensure accessibility requirements are met

    This is most relevant for any outputs produced. See guidance.

    As a starting point, the python visualisation guide should include tips on how to make visualisations more accessible:

    • The Home Office has some posters on accessible design
    • There are also countless online resources on accessibility relating to colour-blindness, visual impairments etc.

    We should also consider including a note on accessibility in the design of RAP. A pipeline would be difficult to reproduce if a user could not access any part of the pipeline. This includes README files, as well as output types.

    opened by harrietrs 1
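
One accessibility check that can be automated for chart colours is the contrast ratio defined in the WCAG guidelines. The sketch below computes it for two hex colours using the WCAG relative luminance formula; the function names are illustrative and this is not taken from the python visualisation guide. WCAG level AA asks for a ratio of at least 4.5:1 for normal-sized text.

```python
def _linear(channel: int) -> float:
    """Convert an 8-bit sRGB channel to its linear value (WCAG definition)."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_colour: str) -> float:
    """Relative luminance of a colour given as '#RRGGBB'."""
    h = hex_colour.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linear(r) + 0.7152 * _linear(g) + 0.0722 * _linear(b)


def contrast_ratio(foreground: str, background: str) -> float:
    """WCAG contrast ratio, from 1 (identical) up to 21 (black on white)."""
    l1 = relative_luminance(foreground)
    l2 = relative_luminance(background)
    lighter, darker = max(l1, l2), min(l1, l2)
    return (lighter + 0.05) / (darker + 0.05)
```

A check like this covers contrast only; it does not replace testing a palette against the different forms of colour-blindness.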
  • Environment management external links

    We should do more to explain how environment management plays into reproducibility.

    This page is quite useful and would save us duplicating: https://realpython.com/python-virtual-environments-a-primer/

    opened by connor1q 1
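
To make the link between environment management and reproducibility concrete, the sketch below reports which Python version is running and whether the interpreter is inside a virtual environment. The function names are hypothetical. The check compares `sys.prefix` with `sys.base_prefix`, which differ inside an environment created with `venv` or `virtualenv`; a conda environment is not detected this way.

```python
import sys


def in_virtual_environment() -> bool:
    """True when running inside a venv/virtualenv-style environment."""
    return sys.prefix != sys.base_prefix


def environment_summary() -> dict:
    """Collect a few facts worth recording alongside a pipeline run."""
    return {
        "python_version": sys.version.split()[0],
        "in_virtual_environment": in_virtual_environment(),
        "prefix": sys.prefix,
    }


if __name__ == "__main__":
    for key, value in environment_summary().items():
        print(f"{key}: {value}")
```

If two analysts get different results from the same code, comparing these summaries is a quick first step in finding out whether their environments differ.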
  • Broken link

    https://github.com/NHSDigital/rap-community-of-practice/blob/main/python/project-structure-and-packaging.md#generic-package-template

    There is a broken link to the generic package template in the section above

    opened by connor1q 1
  • Contributions section

    We're keen to encourage external improvements to these resources but we don't yet have a contributions section that explains how we will review and moderate.

    opened by connor1q 1
  • Code review page ideas

    We have recently been doing some code reviewing. Here are a few things that we think might make the page more helpful.

    Code review before merge request

    Code should be reviewed with someone before submitting a merge request. The reviewer should consider whether the code needs to be refactored or redesigned.

    I'm not sure that I always agree with this. Merge requests make it really easy to leave comments on different parts of the code, and in some ways make the life of the reviewer and the merge request submitter easier. Maybe rephrase as

    You don't have to save reviewing your code until the end. You can do small reviewing and also pair programming while developing the ticket. Seeking feedback sooner could mean you save time because you do not have to change as much when the final review happens later.

    Different types of code review

    There are different types of code review that you can get. It may be worth highlighting them.

    1. Merge request code review

      A standard review process that checks whether changes to the codebase are acceptable. You focus only on the code that has changed. It should be relatively quick, and very regular (one every time you implement a new feature). Normally done by a member of the team.

    2. Full code review

      A code review where someone looks at all your code together, and gives you overall feedback. This review allows someone to look at the bigger picture, rather than one individual feature. These reviews take longer, and are less regular. Normally done by members outside your team, so that it is a fresh pair of eyes.

    3. Fitness to publish checks

      A code review to check the code is okay to publish. Note that, in the code review, you will normally limit yourself to making suggestions that you want completed before the code is published. This may mean you avoid suggesting big changes to the code, and instead focus in on checks like ensuring documentation is well written, or removing passwords from the code.

    Maybe split code review checklist into beginner and advanced items?

    One of the items on the code review checklist is

    Documentation is hosted for easy access. GitHub Pages and Read the Docs provide a free service for hosting documentation publicly.

    Even with advanced teams in data services I do not see them doing this. It might be worth prioritizing, so that the checklist is less overwhelming.

    Maybe organise the checklist items by the RAP level the team is aiming for.

    on jira workplan 
    opened by goodyguts 2
  • 03_quality-assuring-analytical-ouputs page not clearly linked with levels of RAP

    The AQUA page (https://github.com/NHSDigital/rap-community-of-practice/blob/main/implementing_RAP/general_guidance/quality-assuring-analytical-ouputs.md) is not clearly associated with the levels of RAP and so people can find it a bit confusing when and how they should be following it.

    We need to link it more clearly into people's workflows when planning out RAP (some of it goes beyond RAP and is more general guidance on managing analytical work), and perhaps reduce duplication by removing the parts already covered by the "levels of RAP" and making this clear.

    on jira workplan 
    opened by SamHollings 1
  • Clean code guidance

    Some teams want to use clean code. We need guidance on the best way to approach this for analytical code, why you would want to do it, and what to watch out for.

    on jira workplan 
    opened by SamHollings 2
Releases(v1.1.0)
  • v1.1.0(Dec 21, 2022)

    What's Changed

    Automatic Release Notes

    • Release v1.1.0 by @xiyaozhuang in https://github.com/NHSDigital/rap-community-of-practice/pull/35

    New Contributors

    • @xiyaozhuang made their first contribution in https://github.com/NHSDigital/rap-community-of-practice/pull/35

    Full Changelog: https://github.com/NHSDigital/rap-community-of-practice/compare/v1.0.0...v1.1.0

  • v1.0.0(Dec 6, 2022)

    What Changed

    Automatic release notes

    • Hr 1188 r git by @helrich in https://github.com/NHSDigital/rap-community-of-practice/pull/2
    • Add Intro to R link by @helrich in https://github.com/NHSDigital/rap-community-of-practice/pull/3
    • Improving layout and expanding rollout section by @connor1q in https://github.com/NHSDigital/rap-community-of-practice/pull/4
    • Cq updates by @connor1q in https://github.com/NHSDigital/rap-community-of-practice/pull/5
    • Hr changes by @helrich in https://github.com/NHSDigital/rap-community-of-practice/pull/9
    • Hr updates to git by @helrich in https://github.com/NHSDigital/rap-community-of-practice/pull/10
    • Update publishing code in the open by @harrietrs in https://github.com/NHSDigital/rap-community-of-practice/pull/20
    • Sh new front page by @SamHollings in https://github.com/NHSDigital/rap-community-of-practice/pull/22
    • Restructure and edit files by @abbieprescott in https://github.com/NHSDigital/rap-community-of-practice/pull/23
    • Create gh-pages version by @harrietrs in https://github.com/NHSDigital/rap-community-of-practice/pull/31
    • add two new guides and pr prep by @helrich in https://github.com/NHSDigital/rap-community-of-practice/pull/32
    • Publishes when to stop coding guide by @josephwilson8-nhs in https://github.com/NHSDigital/rap-community-of-practice/pull/33
    • Added new improved guides on virtual environments by @xiyaozhuang in https://github.com/NHSDigital/rap-community-of-practice/pull/34

    New Contributors

    • @helrich made their first contribution in https://github.com/NHSDigital/rap-community-of-practice/pull/2
    • @connor1q made their first contribution in https://github.com/NHSDigital/rap-community-of-practice/pull/4
    • @harrietrs made their first contribution in https://github.com/NHSDigital/rap-community-of-practice/pull/20
    • @SamHollings made their first contribution in https://github.com/NHSDigital/rap-community-of-practice/pull/22
    • @abbieprescott made their first contribution in https://github.com/NHSDigital/rap-community-of-practice/pull/23
    • @josephwilson8-nhs made their first contribution in https://github.com/NHSDigital/rap-community-of-practice/pull/33
    • @xiyaozhuang made their first contribution in https://github.com/NHSDigital/rap-community-of-practice/pull/34

    Full Changelog: https://github.com/NHSDigital/rap-community-of-practice/commits/v1.0.0
