Python codecs extension featuring CLI tools for encoding/decoding anything
dhondta/python-codext
CodExt is a (Python2-3 compatible) library that extends the native codecs library (namely for adding new custom encodings and character mappings) and provides 120+ new codecs, hence its name combining CODecs EXTension. It also features a guess mode for decoding multiple layers of encoding and CLI tools for convenience.
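As a rough illustration of what "extending the native codecs library" means, the sketch below registers a toy codec with the standard library only. The codec name `demo_reverse` and the helper functions are hypothetical and are not part of CodExt; the library's own registration helpers may work differently.

```python
import codecs


def _reverse_encode(text, errors="strict"):
    # Text-to-bytes direction: reverse the string, then emit UTF-8 bytes.
    return text[::-1].encode("utf-8"), len(text)


def _reverse_decode(data, errors="strict"):
    # Bytes-to-text direction: decode UTF-8, then undo the reversal.
    raw = bytes(data)
    return raw.decode("utf-8")[::-1], len(raw)


def _search(name):
    # The registry calls every search function with a normalized codec name;
    # returning None lets the lookup fall through to other search functions.
    if name == "demo_reverse":
        return codecs.CodecInfo(_reverse_encode, _reverse_decode, name=name)
    return None


codecs.register(_search)

print(codecs.encode("Test string", "demo_reverse"))
print(b"gnirts tseT".decode("demo_reverse"))
```

Once a search function is registered, the new name can be used anywhere the standard library accepts an encoding name for one-shot encoding and decoding.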
$ pip install codext
| Want to contribute a new codec ? | Want to contribute a new macro ? |
|---|---|
| Check the documentation first, then PR your new codec | PR your updated version of macros.json |
$ codext -i test.txt encode dna-1
GTGAGCGGGTATGTGA
$ echo -en "test" | codext encode morse
- . ... -
$ echo -en "test" | codext encode braille
⠞⠑⠎⠞
$ echo -en "test" | codext encode base100
👫👜👪👫
$ echo -en "Test string" | codext encode reverse
gnirts tseT
$ echo -en "Test string" | codext encode reverse morse
--. -. .. .-. - ... / - ... . -
$ echo -en "Test string" | codext encode reverse morse dna-2
AGTCAGTCAGTGAGAAAGTCAGTGAGAAAGTGAGTGAGAAAGTGAGTCAGTGAGAAAGTCAGAAAGTGAGTGAGTGAGAAAGTTAGAAAGTCAGAAAGTGAGTGAGTGAGAAAGTGAGAAAGTC
$ echo -en "Test string" | codext encode reverse morse dna-2 octal
101107124103101107124103101107124107101107101101101107124103101107124107101107101101101107124107101107124107101107101101101107124107101107124103101107124107101107101101101107124103101107101101101107124107101107124107101107124107101107101101101107124124101107101101101107124103101107101101101107124107101107124107101107124107101107101101101107124107101107101101101107124103
$ echo -en "AGTCAGTCAGTGAGAAAGTCAGTGAGAAAGTGAGTGAGAAAGTGAGTCAGTGAGAAAGTCAGAAAGTGAGTGAGTGAGAAAGTTAGAAAGTCAGAAAGTGAGTGAGTGAGAAAGTGAGAAAGTC" | codext -d dna-2 morse reverse
test string
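The chaining shown in these commands can be pictured as function composition: encoders are applied left to right, and decoding applies the inverse steps in the opposite order. The following is a minimal standalone sketch of that idea for the `reverse` and `morse` steps; the helper names are hypothetical, the Morse table covers letters only, and none of this is CodExt's actual implementation.

```python
MORSE = {
    "a": ".-", "b": "-...", "c": "-.-.", "d": "-..", "e": ".", "f": "..-.",
    "g": "--.", "h": "....", "i": "..", "j": ".---", "k": "-.-", "l": ".-..",
    "m": "--", "n": "-.", "o": "---", "p": ".--.", "q": "--.-", "r": ".-.",
    "s": "...", "t": "-", "u": "..-", "v": "...-", "w": ".--", "x": "-..-",
    "y": "-.--", "z": "--..",
}
MORSE_INVERSE = {symbol: letter for letter, symbol in MORSE.items()}


def reverse(text):
    # Reversal is its own inverse, so it serves for both directions.
    return text[::-1]


def morse_encode(text):
    # Letters are separated by a space, words by " / "; case is lost.
    words = text.lower().split(" ")
    return " / ".join(" ".join(MORSE[c] for c in word) for word in words)


def morse_decode(code):
    words = code.split(" / ")
    return " ".join("".join(MORSE_INVERSE[s] for s in w.split()) for w in words)


def apply_chain(text, steps):
    # Apply each step in order, feeding the output of one into the next.
    for step in steps:
        text = step(text)
    return text


encoded = apply_chain("Test string", [reverse, morse_encode])
decoded = apply_chain(encoded, [morse_decode, reverse])
print(encoded)
print(decoded)
```

Because Morse has no notion of case, the round trip returns the lowercase text, which is also what the last command above prints.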
$ codext add-macro my-encoding-chain gzip base63 lzma base64
$ codext list macros
example-macro, my-encoding-chain
$ echo -en "Test string" | codext encode my-encoding-chain
CQQFAF0AAIAAABuTgySPa7WaZC5Sunt6FS0ko71BdrYE8zHqg91qaqadZIR2LafUzpeYDBalvE///ug4AA==
$ codext remove-macro my-encoding-chain
$ codext list macros
example-macro
$ echo "Test string !" | base122
*.7!ft9�-f9Â
$ echo "Test string !" | base91
"ONK;WDZM%Z%xE7L
$ echo "Test string !" | base91 | base85
B2P|BJ6A+nO(j|-cttl%
$ echo "Test string !" | base91 | base85 | base36 | base58-flickr
QVx5tvgjvCAkXaMSuKoQmCnjeCV1YyyR3WErUUErFf
$ echo "Test string !" | base91 | base85 | base36 | base58-flickr | base58-flickr -d | base36 -d | base85 -d | base91 -d
Test string !
$ echo "Test string !" | base91 | base85 | base36 | base58-flickr | unbase -m 3
Test string !
$ echo "Test string !" | base91 | base85 | base36 | base58-flickr | unbase -f Test
Test string !

Getting the list of available codecs:
>>> import codecs
>>> import codext
>>> codext.list()
['ascii85', 'base85', 'base100', 'base122', ..., 'tomtom', 'dna', 'html', 'markdown', 'url', 'resistor', 'sms', 'whitespace', 'whitespace-after-before']
>>> codext.encode("this is a test", "base58-bitcoin")
'jo91waLQA1NNeBmZKUF'
>>> codext.encode("this is a test", "base58-ripple")
'jo9rA2LQwr44eBmZK7E'
>>> codext.encode("this is a test", "base58-url")
'JN91Wzkpa1nnDbLyjtf'
>>> codecs.encode("this is a test", "base100")
'👫👟👠👪🐗👠👪🐗👘🐗👫👜👪👫'
>>> codecs.decode("👫👟👠👪🐗👠👪🐗👘🐗👫👜👪👫", "base100")
'this is a test'
>>> for i in range(8):
...     print(codext.encode("this is a test", "dna-%d" % (i + 1)))
...
GTGAGCCAGCCGGTATACAAGCCGGTATACAAGCAGACAAGTGAGCGGGTATGTGA
CTCACGGACGGCCTATAGAACGGCCTATAGAACGACAGAACTCACGCCCTATCTCA
ACAGATTGATTAACGCGTGGATTAACGCGTGGATGAGTGGACAGATAAACGCACAG
AGACATTCATTAAGCGCTCCATTAAGCGCTCCATCACTCCAGACATAAAGCGAGAC
TCTGTAAGTAATTCGCGAGGTAATTCGCGAGGTAGTGAGGTCTGTATTTCGCTCTG
TGTCTAACTAATTGCGCACCTAATTGCGCACCTACTCACCTGTCTATTTGCGTGTC
GAGTGCCTGCCGGATATCTTGCCGGATATCTTGCTGTCTTGAGTGCGGGATAGAGT
CACTCGGTCGGCCATATGTTCGGCCATATGTTCGTCTGTTCACTCGCCCATACACT
>>> codext.decode("GTGAGCCAGCCGGTATACAAGCCGGTATACAAGCAGACAAGTGAGCGGGTATGTGA", "dna-1")
'this is a test'
>>> codecs.encode("this is a test", "morse")
'- .... .. ... / .. ... / .- / - . ... -'
>>> codecs.decode("- .... .. ... / .. ... / .- / - . ... -", "morse")
'this is a test'
>>> with open("morse.txt", 'w', encoding="morse") as f:
...     f.write("this is a test")
...
14
>>> with open("morse.txt", encoding="morse") as f:
...     f.read()
...
'this is a test'
>>> codext.decode(""" = X : x n r y Y y p a ` n | ao h ` g o z """, "whitespace-after+before")
'CSC{not_so_invisible}'
>>> print(codext.encode("An example test string", "baudot-tape"))
***.** .****.** . .** .* .*** .****.**** .** .** .**.* .***.**.** .**.**.****.*.****.** .*
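Decoding several stacked layers without knowing the chain, as the guess mode and the `unbase` tool do, can be approximated by a search over candidate decoders. The sketch below is only an assumption about the general idea, not CodExt's algorithm: it uses the standard `base64` module, a hypothetical `guess` helper, and a simple "stop when this substring appears" criterion similar in spirit to `unbase -f Test`.

```python
import base64
from collections import deque

# Candidate decoders; each raises a ValueError subclass on invalid input.
DECODERS = {
    "base16": lambda b: base64.b16decode(b),
    "base32": lambda b: base64.b32decode(b),
    "base64": lambda b: base64.b64decode(b, validate=True),
    "base85": lambda b: base64.b85decode(b),
}


def guess(data, stop, max_depth=4):
    """Breadth-first search over decoder chains until `stop` shows up."""
    queue = deque([(data, [])])
    while queue:
        current, path = queue.popleft()
        if stop in current:
            return current, path
        if len(path) == max_depth:
            continue
        for name, decode in DECODERS.items():
            try:
                decoded = decode(current)
            except ValueError:
                continue  # this decoder does not apply to the current layer
            if decoded:
                queue.append((decoded, path + [name]))
    return None, []


layered = base64.b16encode(base64.b32encode(base64.b64encode(b"Test string !")))
result, path = guess(layered, b"Test")
print(result, path)
```

A breadth-first order returns the shortest chain that satisfies the stop condition; a real tool would also need scoring heuristics to prune decoders that "succeed" on data they were never meant for.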
- base1: useless, but for the sake of completeness
- base2: simple conversion to binary (with a variant with a reversed alphabet)
- base3: conversion to ternary (with a variant with a reversed alphabet)
- base4: conversion to quaternary (with a variant with a reversed alphabet)
- base8: simple conversion to octal (with a variant with a reversed alphabet)
- base10: simple conversion to decimal
- base11: conversion to digits plus the letter "a"
- base16: simple conversion to hexadecimal (with a variant holding an alphabet with digits and letters inverted)
- base26: conversion to alphabet letters
- base32: classical conversion according to RFC4648 with all its variants (zbase32, extended hexadecimal, geohash, Crockford)
- base36: Base36 conversion to letters and digits (with a variant inverting both groups)
- base45: Base45 DRAFT algorithm (with a variant inverting letters and digits)
- base58: multiple versions of Base58 (bitcoin, flickr, ripple)
- base62: Base62 conversion to lower- and uppercase letters and digits (with a variant with letters and digits inverted)
- base63: similar to base62 with the "_" added
- base64: classical conversion according to RFC4648 with its URL (or file) variant (it also holds a variant with letters and digits inverted)
- base67: custom conversion using some more special characters (also with a variant with letters and digits inverted)
- base85: all variants of Base85 (Ascii85, z85, Adobe, (x)btoa, RFC1924, XML)
- base91: Base91 custom conversion
- base100 (or emoji): Base100 custom conversion
- base122: Base122 custom conversion
- base-genericN: see base encodings; supports any possible base
This category also contains ascii85, adobe, [x]btoa and zeromq, which are provided by the base85 codec.
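Many of the base codecs listed here amount to positional conversion over a chosen alphabet. The sketch below shows that general technique with hypothetical helper names; it treats the whole input as one big integer, so leading zero bytes are lost, and it should not be read as how CodExt (or any specific Base58/Base62 variant) handles padding and edge cases.

```python
import string


def to_base(number, alphabet):
    # Repeated division by the base yields digits from least to most significant.
    if number < 0:
        raise ValueError("only non-negative integers are supported")
    if number == 0:
        return alphabet[0]
    base = len(alphabet)
    digits = []
    while number:
        number, remainder = divmod(number, base)
        digits.append(alphabet[remainder])
    return "".join(reversed(digits))


def from_base(text, alphabet):
    # Horner's scheme: multiply the accumulator by the base and add each digit.
    index = {char: value for value, char in enumerate(alphabet)}
    number = 0
    for char in text:
        number = number * len(alphabet) + index[char]
    return number


def encode_bytes(data, alphabet):
    return to_base(int.from_bytes(data, "big"), alphabet)


def decode_bytes(text, alphabet):
    number = from_base(text, alphabet)
    return number.to_bytes((number.bit_length() + 7) // 8, "big")


BASE36 = string.digits + string.ascii_lowercase
print(to_base(255, "01"), to_base(255, "0123456789abcdef"))
print(decode_bytes(encode_bytes(b"this is a test", BASE36), BASE36))
```

Swapping the alphabet string is enough to obtain the "inverted" or "reversed" variants mentioned in the list, under the same assumptions.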
- baudot: supports CCITT-1, CCITT-2, EU/FR, ITA1, ITA2, MTK-2 (Python3 only), UK, ...
- baudot-spaced: variant of baudot; groups of 5 bits are whitespace-separated
- baudot-tape: variant of baudot; outputs a string that looks like a perforated tape
- bcd: Binary Coded Decimal, encodes characters from their (zero-left-padded) ordinals
- bcd-extended0: variant of bcd; encodes characters from their (zero-left-padded) ordinals using prefix bits 0000
- bcd-extended1: variant of bcd; encodes characters from their (zero-left-padded) ordinals using prefix bits 1111
- excess3: uses Excess-3 (aka Stibitz code) binary encoding to convert characters from their ordinals
- gray: aka reflected binary code
- manchester: XORs each bit of the input with 01
- manchester-inverted: variant of manchester; XORs each bit of the input with 10
- rotateN: rotates characters by the specified number of bits (N belongs to [1, 7]; Python 3 only)
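Two of the binary codecs above have compact textbook definitions. The following sketch shows the reflected binary (Gray) code on integers and the Manchester scheme as described in the list (each bit XORed with a two-bit pattern). Function names are hypothetical, and the way CodExt maps characters to bit strings and formats its output is not covered here.

```python
def gray_encode(number):
    # Adjacent integers differ by exactly one bit in their Gray code.
    return number ^ (number >> 1)


def gray_decode(code):
    # Fold every right shift of the code back in to undo the XOR.
    number = 0
    while code:
        number ^= code
        code >>= 1
    return number


def manchester_encode(bits, pattern="01"):
    # Each input bit becomes two output bits: bit XOR pattern[0], bit XOR pattern[1].
    return "".join(str(int(b) ^ int(p)) for b in bits for p in pattern)


def manchester_decode(encoded, pattern="01"):
    # The first bit of each pair is enough to recover the original bit.
    first = int(pattern[0])
    return "".join(str(int(encoded[i]) ^ first) for i in range(0, len(encoded), 2))


print([gray_encode(i) for i in range(4)])
print(manchester_encode("10"), manchester_encode("10", pattern="10"))
```

Passing `pattern="10"` gives the inverted variant under the same assumptions.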
- a1z26: keeps words whitespace-separated and uses a custom character separator
- cases: set of case-related encodings (including camel-, kebab-, lower-, pascal-, upper-, snake- and swap-case, slugify, capitalize, title)
- dummy: set of simple encodings (including integer, replace, reverse, word-reverse, substitute and strip-spaces)
- octal: dummy octal conversion (converts to 3-digit groups)
- octal-spaced: variant of octal; dummy octal conversion, handling whitespace separators
- ordinal: dummy character ordinals conversion (converts to 3-digit groups)
- ordinal-spaced: variant of ordinal; dummy character ordinals conversion, handling whitespace separators
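The "3-digit groups" idea behind the octal codec can be sketched in a few lines. The helpers below are hypothetical and assume character ordinals below 512 so that three octal digits always suffice; they reproduce the start of the `dna-2 octal` sample shown earlier in this README (`AGTC` becomes `101107124103`).

```python
def octal_encode(text):
    # Zero-pad every ordinal to exactly three octal digits.
    return "".join("%03o" % ord(char) for char in text)


def octal_decode(digits):
    # Read the string back three digits at a time.
    return "".join(chr(int(digits[i:i + 3], 8)) for i in range(0, len(digits), 3))


print(octal_encode("AGTC"))
print(octal_decode("101107124103"))
```

The fixed width is what makes decoding unambiguous without separators.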
- gzip: standard Gzip compression/decompression
- lz77: compresses the given data with the algorithm of Lempel and Ziv of 1977
- lz78: compresses the given data with the algorithm of Lempel and Ziv of 1978
- pkzip_deflate: standard Zip-deflate compression/decompression
- pkzip_bzip2: standard BZip2 compression/decompression
- pkzip_lzma: standard LZMA compression/decompression
⚠️ Compression functions are definitely NOT encoding functions; they are implemented for leveraging the .encode(...) API from codecs.
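The standard library already exposes compression through the same `encode`/`decode` interface, which gives a feel for why this is convenient when chaining. The sketch below uses only the built-in `zlib` and `base64` bytes-to-bytes codecs, not the CodExt codecs listed above.

```python
import codecs

data = b"Test string " * 20

# Compression and ASCII armoring go through the same one-shot API.
compressed = codecs.encode(data, "zlib")
armored = codecs.encode(compressed, "base64")

# Decoding applies the inverse steps in reverse order.
restored = codecs.decode(codecs.decode(armored, "base64"), "zlib")

print(len(data), len(compressed))
print(restored == data)
```

Since both steps share one calling convention, a chain such as the `my-encoding-chain` macro shown earlier is just a list of names applied in sequence.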
- affine: aka Affine Cipher
- atbash: aka Atbash Cipher
- bacon: aka Baconian Cipher
- barbie-N: aka Barbie Typewriter (N belongs to [1, 4])
- citrix: aka Citrix CTX1 password encoding
- railfence: aka Rail Fence Cipher
- rotN: aka Caesar cipher (N belongs to [1,25])
- scytaleN: encrypts using the number of letters on the rod (N ≥ 1)
- shiftN: shift ordinals (N belongs to [1,255])
- xorN: XOR with a single byte (N belongs to [1,255])
⚠️ Crypto functions are definitely NOT encoding functions; they are implemented for leveraging the .encode(...) API from codecs.
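For reference, the Caesar rotation and the single-byte XOR named in the list above are simple enough to sketch directly. The helper names are hypothetical and the code is a generic illustration of the two techniques, not CodExt's implementation.

```python
import string


def rot(text, n):
    # Rotate ASCII letters by n positions; everything else is left untouched.
    n %= 26
    lower, upper = string.ascii_lowercase, string.ascii_uppercase
    table = str.maketrans(
        lower + upper,
        lower[n:] + lower[:n] + upper[n:] + upper[:n],
    )
    return text.translate(table)


def xor(data, key):
    # XOR every byte with the same key byte; applying it twice restores the input.
    return bytes(byte ^ key for byte in data)


print(rot("Hello", 13))
print(rot(rot("Test string", 3), 23))
print(xor(xor(b"Test string", 42), 42))
```

Decoding a rotation by N is a rotation by 26 - N, while XOR with a single byte is its own inverse.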
- blake: includes BLAKE2b and BLAKE2s (Python 3 only; relies on hashlib)
- checksums: includes Adler32 and CRC32 (relies on zlib)
- crypt: Unix's crypt hash for passwords (Python 3 and Unix only; relies on crypt)
- md: aka Message Digest; includes MD4 and MD5 (relies on hashlib)
- sha: aka Secure Hash Algorithms; includes SHA1, 224, 256, 384, 512 (Python2/3) but also SHA3-224, -256, -384 and -512 (Python 3 only; relies on hashlib)
- shake: aka SHAKE hashing (Python 3 only; relies on hashlib)
⚠️ Hash functions are definitely NOT encoding functions; they are implemented for convenience with the .encode(...) API from codecs and are useful for chaining codecs.
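The list above states that these codecs rely on `hashlib` and `zlib`. The sketch below calls those standard modules directly, with a hypothetical `digest` helper, to show the kind of one-way output such an "encoding" step produces; there is no matching decode step.

```python
import hashlib
import zlib


def digest(data, algorithm):
    # hashlib.new accepts any algorithm name supported by the local build.
    return hashlib.new(algorithm, data).hexdigest()


print(digest(b"abc", "sha256"))
print(digest(b"abc", "sha1"))
print(hex(zlib.crc32(b"123456789")))
print(zlib.adler32(b""))
```

Placed at the end of a chain, a hash step turns any earlier output into a fixed-size fingerprint.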
- braille: well-known braille language (Python 3 only)
- ipsum: aka lorem ipsum
- galactic: aka galactic alphabet or Minecraft enchantment language (Python 3 only)
- leetspeak: based on minimalistic elite speaking rules
- morse: uses whitespace as a separator
- navajo: only handles letters (not full words from the Navajo dictionary)
- radio: aka NATO or radio phonetic alphabet
- southpark: converts letters to Kenny's language from Southpark (whitespace is also handled)
- southpark-icase: case insensitive variant of southpark
- tap: converts text to tap/knock code, commonly used by prisoners
- tomtom: similar to morse, using slashes and backslashes
- dna: implements the 8 rules of DNA sequences (N belongs to [1,8])
- letter-indices: encodes consonants and/or vowels with their corresponding indices
- markdown: unidirectional encoding from Markdown to HTML
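A DNA rule can be read as a mapping from bit pairs to nucleotides. The mapping in the sketch below (00 to A, 01 to G, 10 to C, 11 to T) is inferred from the `dna-1` sample output shown earlier in this README rather than taken from CodExt's source, and the helper names are hypothetical.

```python
# Bit-pair mapping inferred from the dna-1 sample output (an assumption).
RULE_1 = {"00": "A", "01": "G", "10": "C", "11": "T"}


def dna_encode(text, rule=RULE_1):
    # Each character contributes 8 bits, i.e. 4 nucleotides.
    bits = "".join(format(byte, "08b") for byte in text.encode("latin-1"))
    return "".join(rule[bits[i:i + 2]] for i in range(0, len(bits), 2))


def dna_decode(sequence, rule=RULE_1):
    inverse = {nucleotide: pair for pair, nucleotide in rule.items()}
    bits = "".join(inverse[n] for n in sequence)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8)).decode("latin-1")


print(dna_encode("this is a test"))
print(dna_decode("GTGAGCGGGTATGTGA"))
```

The other rules would presumably differ only in which nucleotide is assigned to each bit pair.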
- hexagram: uses Base64 and encodes the result to a charset of I Ching hexagrams (as implemented here)
- klopf: aka Klopf code; Polybius square with trivial alphabetical distribution
- resistor: aka resistor color codes
- rick: aka Rick cipher (in reference to Rick Astley's song "Never gonna give you up")
- sms: also called T9 code; uses "-" as a separator for encoding, "-" or "_" or whitespace for decoding
- whitespace: replaces bits with whitespaces and tabs
- whitespace_after_before: variant of whitespace; encodes characters as new characters with whitespaces before and after according to an equation described in the codec name (e.g. "whitespace+2*after-3*before")
- html: implements entities according to this reference
- url: aka URL encoding
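Both web encodings have close counterparts in the Python standard library. The sketch below uses `html.escape` and `urllib.parse.quote` for illustration only; the CodExt codecs may cover a different set of entities or reserved characters.

```python
import html
from urllib.parse import quote, unquote

escaped = html.escape("<b>T&C</b>")
print(escaped)
print(html.unescape(escaped))

quoted = quote("Test string !")
print(quoted)
print(unquote(quoted))
```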



