Optimal skincare partition finder using graph theory

Pigment

License: ISC CC BY-SA 4.0

Partitioning a skincare regimen into parts such that no two ingredients within a part interfere with each other is equivalent to the minimum clique cover problem on a compatibility graph, which can in turn be transformed into vertex colouring of the complementary conflict graph. Both problems are NP-hard, so no efficient algorithm for finding optimal solutions is known. This project is a brute-force proof-of-concept that exhaustively solves the problem of good skincare product grouping!

Usage

  1. Modify the ingredient conflict dictionary (named conflicts in the pigment.py mainline) to reflect your skincare products. If you say A conflicts with B, you don't have to also write the rule that B conflicts with A; the script handles the symmetry.

  2. Run the program (you need Python 3):

    python3 pigment.py
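As a rough sketch of what handling that symmetry could look like (the helper name `symmetrize` is hypothetical and is not taken from pigment.py), a one-directional conflict dictionary can be expanded so that every conflict is listed both ways:

```python
from collections import OrderedDict


def symmetrize(conflicts):
    """Return an adjacency mapping in which every conflict is listed both ways."""
    adjacency = OrderedDict()
    for ingredient, others in conflicts.items():
        adjacency.setdefault(ingredient, set())
        for other in others:
            # Record the edge in both directions so lookups never depend on order.
            adjacency[ingredient].add(other)
            adjacency.setdefault(other, set()).add(ingredient)
    return adjacency


# Hypothetical one-directional input: only "A" lists its conflicts.
example = OrderedDict((("A", ["B", "C"]),))
adjacency = symmetrize(example)
print(sorted(adjacency["B"]))  # ['A']
```

Sets are used for the neighbours here so that a conflict written twice by hand collapses into a single edge.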

Algorithm

This algorithm takes in an adjacency list for a conflict graph where each edge between two nodes represents an instance of two ingredients conflicting.

It then exhaustively generates every possible partition using a recursive backtracking depth-first-search algorithm where for each ingredient, it explores every sub-tree consisting of adding the ingredient to every existing part before finally creating a new part. Each terminal/leaf node represents a generated partition, which we exhaustively check: for each part in the partition, we check to see if any pair exists as an edge in the conflict dictionary. If no such pairs exist among any part, the partition is valid.

partition tree
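The search described above can be sketched roughly as follows. This is an illustration under assumptions, not the code from pigment.py: the names `generate_partitions` and `is_valid` are hypothetical, and the conflict dictionary is assumed to map each ingredient to a list of ingredients it conflicts with.

```python
def generate_partitions(items):
    """Yield every set partition of `items` as a list of lists."""
    parts = []

    def place(index):
        if index == len(items):
            # Leaf of the search tree: every ingredient has been placed.
            yield [list(part) for part in parts]
            return
        item = items[index]
        # First explore adding the ingredient to each existing part in turn.
        for part in list(parts):
            part.append(item)
            yield from place(index + 1)
            part.pop()  # backtrack
        # Finally explore opening a new part that holds only this ingredient.
        parts.append([item])
        yield from place(index + 1)
        parts.pop()  # backtrack

    yield from place(0)


def is_valid(partition, conflicts):
    """Check that no part contains a conflicting pair (in either direction)."""
    for part in partition:
        for i, a in enumerate(part):
            for b in part[i + 1:]:
                if b in conflicts.get(a, ()) or a in conflicts.get(b, ()):
                    return False
    return True


conflicts = {"A": ["B", "C"]}
all_partitions = list(generate_partitions(["A", "B", "C"]))
valid = [p for p in all_partitions if is_valid(p, conflicts)]
print(len(all_partitions))  # 5
print(valid)  # [[['A'], ['B', 'C']], [['A'], ['B'], ['C']]]
```

Each leaf copies the current parts before yielding them, so later backtracking does not mutate partitions that have already been handed to the caller.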

The algorithm looks for the valid partition with the fewest parts.

The number of partitions generated by brute force for n ingredients is the nth Bell number (sequence A000110 in the OEIS).
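To get a feel for how quickly that count grows, the Bell numbers can be computed with the Bell triangle. This is a small standalone sketch; the function name `bell_numbers` is hypothetical and not part of the project.

```python
def bell_numbers(count):
    """Return the first `count` Bell numbers, B_0 .. B_(count - 1), via the Bell triangle."""
    numbers = []
    row = [1]
    for _ in range(count):
        # The first entry of each triangle row is the next Bell number.
        numbers.append(row[0])
        # The next row starts with the last entry of the current row;
        # each further entry adds the entry to its left and the one above-left.
        next_row = [row[-1]]
        for value in row:
            next_row.append(next_row[-1] + value)
        row = next_row
    return numbers


print(bell_numbers(6))       # [1, 1, 2, 5, 15, 52]
print(bell_numbers(11)[10])  # 115975 partitions for 10 ingredients
```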

It runs in O(a fuckton of time): every one of those partitions is generated and checked, and the Bell numbers grow faster than any exponential. If you have a lot of stuff in your skincare routine, this algorithm may take forever to run. It is recommended that you do not add vanity elements (i.e. adding an element just for it to show up in the final result) such as:

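from collections import OrderedDict

# "D" lists no conflicts, so it only enlarges the search space.
CONFLICTS = OrderedDict((
    ("A", ["B", "C"]),
    ("D", []),
))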

In this case, "D" is a vanity element; it contributes nothing to conflict data but bloats the state space (which, in a brute-force algorithm like this, is not good). If an element doesn't conflict with anything, then use it as liberally as you like without restriction.

You have been warned.
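One way to keep conflict-free elements out of the search is to filter them out before running it. The helper below is a hypothetical sketch (the name `split_isolated` is not from pigment.py), assuming the same dictionary shape as above:

```python
def split_isolated(conflicts):
    """Separate ingredients that take part in no conflict from the rest."""
    involved = set()
    for ingredient, others in conflicts.items():
        if others:
            # Both endpoints of a conflict matter to the search.
            involved.add(ingredient)
            involved.update(others)
    isolated = [name for name in conflicts if name not in involved]
    trimmed = {name: others for name, others in conflicts.items() if name in involved}
    return trimmed, isolated


trimmed, isolated = split_isolated({"A": ["B", "C"], "D": []})
print(trimmed)   # {'A': ['B', 'C']}
print(isolated)  # ['D']
```

Anything reported in `isolated` can be used with any part, so it never needs to enter the brute-force search.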

Modelling

Say, for the purposes of illustration (as these opinions are still hotly debated in the skincare community today), we have the following ingredients:

  • Retinol
  • AHAs/BHAs
  • Copper peptides
  • Ferulic acid

and the following interactions:

  • Retinol and AHAs/BHAs conflict with each other
  • Copper peptides interfere with AHAs/BHAs
  • Ferulic acid interferes with copper peptides

We can therefore model compatible products as an undirected graph where each node represents a skincare ingredient and each edge between node a and node b represents the sentence "ingredient a is compatible with ingredient b". We can represent the relation above as such:

compatibility graph

Ideally, we would take all four of these ingredients at once; however, as the conflicts above show, that isn't possible. If we can't create 1 part, the next best solution is to try to create 2 parts. We know that in our model, retinol is compatible with copper peptides, and ferulic acid is compatible with AHAs/BHAs. Retinol is also compatible with ferulic acid, but we cannot add retinol to ferulic acid's part, because that part contains AHAs/BHAs, which are not compatible with retinol (as shown by the lack of an edge).

minimum clique

This is the optimal solution. In one skincare session, we take retinol with the copper peptides, and in another session we take AHAs/BHAs and ferulic acid.
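The same result can be reproduced by brute force. The sketch below is self-contained and makes several assumptions: the dictionary simply encodes the three illustrative conflicts listed earlier, and the names `partitions`, `conflicts_with` and `is_valid` are hypothetical rather than taken from pigment.py.

```python
from itertools import combinations

# The three illustrative conflicts from the list above, written one way only.
CONFLICTS = {
    "Retinol": ["AHAs/BHAs"],
    "Copper peptides": ["AHAs/BHAs"],
    "Ferulic acid": ["Copper peptides"],
}


def partitions(items):
    """Yield every set partition of `items` (a compact recursive formulation)."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for smaller in partitions(rest):
        # Put `first` into each existing part, then into a part of its own.
        for i, part in enumerate(smaller):
            yield smaller[:i] + [[first] + part] + smaller[i + 1:]
        yield [[first]] + smaller


def conflicts_with(a, b):
    """Conflicts are symmetric, so look the pair up in both directions."""
    return b in CONFLICTS.get(a, ()) or a in CONFLICTS.get(b, ())


def is_valid(partition):
    return not any(
        conflicts_with(a, b)
        for part in partition
        for a, b in combinations(part, 2)
    )


ingredients = ["Retinol", "AHAs/BHAs", "Copper peptides", "Ferulic acid"]
best = min((p for p in partitions(ingredients) if is_valid(p)), key=len)
print(sorted(sorted(part) for part in best))
# [['AHAs/BHAs', 'Ferulic acid'], ['Copper peptides', 'Retinol']]
```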

Our major goal, therefore, is to partition the ingredients list into as few parts as possible such that each part's ingredients form a clique, where a clique is an induced subgraph that is complete. In layperson's terms, we are looking to create subgraphs of ingredients such that each ingredient has an edge connected to every other ingredient node in the subgraph. As shown below, when two ingredients are compatible with each other, the resultant clique has a single edge between two nodes (as shown by K2: 1). For four ingredients, the resultant clique has six edges between the four nodes (as shown by K4: 6). To see ten ingredients compatible with each other is somewhat uncommon.

complete graphs These images are taken from Wikipedia.org and are by koko90. See attribution for details
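As a small illustration of the clique definition, the check below tests whether a group of ingredients is fully connected, and counts the edges of a complete graph with the formula n(n - 1)/2. The edge set is derived from the illustrative conflicts above (every pair not listed as conflicting is assumed compatible), and the function names are hypothetical.

```python
from itertools import combinations

# Compatibility edges for the four-ingredient example; undirected, so stored as frozensets.
COMPATIBLE = {
    frozenset(("Retinol", "Copper peptides")),
    frozenset(("Retinol", "Ferulic acid")),
    frozenset(("Ferulic acid", "AHAs/BHAs")),
}


def is_clique(nodes, edges):
    """True if every pair of `nodes` is joined by an edge."""
    return all(frozenset(pair) in edges for pair in combinations(nodes, 2))


def complete_graph_edges(n):
    """Number of edges in the complete graph K_n."""
    return n * (n - 1) // 2


print(is_clique(["Retinol", "Copper peptides"], COMPATIBLE))            # True
print(is_clique(["Retinol", "Ferulic acid", "AHAs/BHAs"], COMPATIBLE))  # False
print(complete_graph_edges(2), complete_graph_edges(4))                 # 1 6
```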

Minimum Clique Cover

In formal terms, a "clique cover" or "partition into cliques" of an undirected graph is a partition of its nodes into groups, each of which is a clique. Our problem is to find a "minimum" clique cover, i.e. one that uses the fewest cliques (or splits) possible. As shown in the figure above, the trivial case is K1: 0, as each individual ingredient is its own clique, but that's the worst-case scenario we are trying to avoid. It would mean that no skincare ingredient is compatible with anything else, e.g. you may have to take each of 10 skincare ingredients on separate days, which would be a scheduling nightmare.

Graph Colouring

We can make things more readable by looking at an equivalent problem.

Given a graph G, the complement of the graph, let's call it G2, is a graph with the same nodes as G in which every edge of the original graph is missing and every missing edge of the original graph is present. In layperson's terms, the complement graph G2 contains exactly the edges needed to turn G into a complete graph, as shown by this diagram:

complement of the Petersen graph Image edited by Claudio Rocchini; derived from David Eppstein. See attribution for details

We can invert the minimum clique cover problem by mapping not whether two skincare products are compatible with each other, but rather whether they conflict. This makes specifications a whole lot easier to write, as we can now assume that anything not connected by an edge is compatible. If we change our first graph to model conflicts instead of synergies, we get the following:

conflict graph
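The switch from compatibilities to conflicts is just a graph complement. The sketch below assumes the compatibility edges implied by the four-ingredient example; the name `complement` is hypothetical.

```python
from itertools import combinations

NODES = ["Retinol", "AHAs/BHAs", "Copper peptides", "Ferulic acid"]

# Compatibility edges implied by the example (undirected, stored as frozensets).
COMPATIBLE = {
    frozenset(("Retinol", "Copper peptides")),
    frozenset(("Retinol", "Ferulic acid")),
    frozenset(("Ferulic acid", "AHAs/BHAs")),
}


def complement(nodes, edges):
    """Return the edges missing from `edges` among all pairs of `nodes`."""
    every_pair = {frozenset(pair) for pair in combinations(nodes, 2)}
    return every_pair - set(edges)


conflict_edges = complement(NODES, COMPATIBLE)
for edge in sorted(sorted(edge) for edge in conflict_edges):
    print(" conflicts with ".join(edge))
```

Taking the complement twice gives back the original edge set, which is why the two formulations describe the same problem.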

Our problem is now to split the nodes into groups such that no two nodes within a group have an edge between them. Each group induces its own subgraph. In this example, we induce the subgraphs for the nodes {Retinol, Copper peptides} as well as for {Ferulic acid, AHAs/BHAs}, as each subgraph has no edges:

coloured conflict graph

Those with a background in CS will immediately notice that this is actually the well-studied graph colouring problem known as "vertex colouring": colouring the nodes of a graph such that no two adjacent nodes share a colour. In this case, each colour group represents a part of the partition, like from earlier. Again, the optimization problem is NP-hard, with no known efficient algorithm, which is why the algorithm solves the colouring problem in the ugliest, most brute force way possible.
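For illustration, here is one brute-force way to find a colouring with the fewest colours. It tries every colour assignment for an increasing number of colours, which is a different enumeration from the partition search the project describes, and the name `minimum_colouring` is hypothetical.

```python
from itertools import product

NODES = ["Retinol", "AHAs/BHAs", "Copper peptides", "Ferulic acid"]

# Conflict edges from the illustrative example.
CONFLICT_EDGES = [
    ("Retinol", "AHAs/BHAs"),
    ("Copper peptides", "AHAs/BHAs"),
    ("Ferulic acid", "Copper peptides"),
]


def minimum_colouring(nodes, edges):
    """Return (k, colouring) for the smallest k that admits a proper colouring."""
    for k in range(1, len(nodes) + 1):
        # Exhaustively try every way of assigning one of k colours to each node.
        for assignment in product(range(k), repeat=len(nodes)):
            colour = dict(zip(nodes, assignment))
            if all(colour[a] != colour[b] for a, b in edges):
                return k, colour
    return 0, {}  # only reached for an empty graph


k, colour = minimum_colouring(NODES, CONFLICT_EDGES)
print(k)  # 2
print(colour)
```

Ingredients that receive the same colour form one part, so here retinol shares a part with copper peptides, and AHAs/BHAs share a part with ferulic acid.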

Bibliography

Attribution

  • Graphs were made by me using Dreampuf's Dot Grapher and are licensed under CC BY-SA 4.0, as the project is.
  • Complete graphs K1, K2, and K3 are simple geometry and thus are in the public domain (author is David Benbennick).
  • Simplex graphs 4, 5, 6, 7, 8, 9, 10, and 11 were released by Koko90 under GFDL and CC BY-SA 3.0 and will be coalesced into the license of this project, thus making them CC BY-SA 4.0.
  • The Petersen graph complement image was edited by Claudio Rocchini from an original by David Eppstein, and was also released under GFDL and CC BY-SA 3.0. It is CC BY-SA 4.0 as per the project.
Owner
Jason Nguyen
CS @ University of Guelph