TexZK/bytesparse

Overview

Library to handle sparse bytes within a virtual memory space.

Free software: BSD 2-Clause License

Objectives

This library aims to provide utilities to work with a virtual memory, which consists of a virtual addressing space where sparse chunks of data can be stored.

In order to be easy to use, its interface should be close to that of a bytearray, which is the closest pythonic way to store dynamic data. The main downside of a bytearray is that it requires a contiguous data allocation starting from address 0. This is not good when sparse data have to be stored, such as when emulating the addressing space of a generic microcontroller.
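To make that downside concrete, here is a minimal sketch using only built-in types. The address 0x8000 and the dict used as a sparse stand-in are assumptions chosen for illustration; they are not how the library stores data.

```python
# Illustrative only: the address and the dict stand-in are assumptions.
ADDRESS = 0x8000  # hypothetical peripheral register address

# A bytearray must cover every address from 0 up to the target.
dense = bytearray(ADDRESS + 1)
dense[ADDRESS] = 0xFF

# A sparse mapping stores only the addresses actually written.
sparse = {ADDRESS: 0xFF}

dense_size = len(dense)    # bytes allocated by the dense layout
sparse_size = len(sparse)  # items stored by the sparse layout
```

The dense layout allocates one byte per address below the target, while the sparse one grows only with the data written.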

The main idea is to provide a bytearray-like class with the possibility to internally hold the sparse blocks of data. A block is ideally a tuple (start, data) where start is the start address and data is the container of data items (e.g. bytearray). The length of the block is len(data). Those blocks are usually not overlapping nor contiguous, and sorted by start address.
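The block layout described above can be modeled with plain Python objects. This is a toy sketch, not the library's code: the helper names block_endex and is_well_formed are hypothetical, and "endex" is used here to mean the exclusive end address.

```python
# A toy model of the (start, data) block layout; not the library's code.
blocks = [
    (-4, bytearray(b'boot')),     # start addresses are plain ints
    (0x100, bytearray(b'data')),
]


def block_endex(block):
    """Exclusive end address of a (start, data) block (hypothetical helper)."""
    start, data = block
    return start + len(data)


def is_well_formed(blocks):
    """True if blocks are sorted by start, not overlapping, not contiguous."""
    return all(
        block_endex(left) < right[0]
        for left, right in zip(blocks, blocks[1:])
    )
```

Two blocks that touch, such as (0, b'ab') and (2, b'cd'), fail this check because they could be merged into a single block.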

Python implementation

This library provides a pure Python implementation, for maximum compatibility.

Its implementation should be correct and robust, whilst trying to be as fast as common sense suggests. This means that the code should be reasonably optimized for general use, while still providing features that are less likely to be used, yet compatible with the existing Python API (e.g. bytearray or dict).

The Python implementation can also leverage the capabilities of its powerful int type, so that a virtually infinite addressing space can be used, even with negative addresses!

Data chunks are stored as common mutable bytearray objects, whose size is limited by the Python engine (i.e. that of size_t).

The bytesparse package provides the following virtual memory types:

bytesparse.Memory, a generic virtual memory with infinite address range.
bytesparse.bytesparse, a subclass behaving more like bytearray.

All the implementations inherit the behavior of collections.abc.MutableSequence and collections.abc.MutableMapping. Please refer to the collections.abc reference manual for more information about the interface API methods and capabilities.
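As a rough illustration of what inheriting from collections.abc provides, the sketch below defines a hypothetical ToySparseBytes class (not part of the library) with only the five abstract methods of MutableMapping. Methods such as update, pop and items then come from the abstract base class.

```python
from collections.abc import MutableMapping


class ToySparseBytes(MutableMapping):
    """Hypothetical address -> byte value mapping backed by a dict."""

    def __init__(self):
        self._cells = {}

    def __getitem__(self, address):
        return self._cells[address]

    def __setitem__(self, address, value):
        self._cells[address] = value

    def __delitem__(self, address):
        del self._cells[address]

    def __iter__(self):
        # Iterate addresses in ascending order.
        return iter(sorted(self._cells))

    def __len__(self):
        return len(self._cells)


toy = ToySparseBytes()
toy.update({3: 77, 4: 49})     # mixin method from MutableMapping
popped = toy.pop(4)            # mixin method from MutableMapping
remaining = list(toy.items())  # mixin method from Mapping
```

The real classes implement much more than this, so the toy only shows where the dict-like methods used in the examples can come from.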

Cython implementation

The library also provides an experimental Cython implementation. It tries to mimic the same algorithms of the Python implementation, while leveraging the speedup of compiled C code.

Please refer to the cbytesparse Python package for more details.

Examples

Here's a quick usage example of bytesparse objects:

>>> from bytesparse import Memory
>>> from bytesparse import bytesparse

>>> m = bytesparse(b'Hello, World!')  # creates from bytes
>>> len(m)  # total length
13
>>> str(m)  # string representation, with bounds and data blocks
"<[(0, b'Hello, World!')]>"
>>> bytes(m)  # exports as bytes
b'Hello, World!'
>>> m.to_bytes()  # exports the whole range as bytes
b'Hello, World!'

>>> m.extend(b'!!')  # more emphasis!!!
>>> bytes(m)
b'Hello, World!!!'

>>> i = m.index(b',')  # gets the address of the comma
>>> m[:i] = b'Ciao'  # replaces 'Hello' with 'Ciao'
>>> bytes(m)
b'Ciao, World!!!'

>>> i = m.index(b',')  # gets the address of the comma
>>> m.insert(i, b'ne')  # inserts 'ne' to make 'Ciaone' ("big ciao")
>>> bytes(m)
b'Ciaone, World!!!'

>>> i = m.index(b',')  # gets the address of the comma
>>> m[(i - 2):i] = b' ciao'  # makes 'Ciaone' --> 'Ciao ciao'
>>> bytes(m)
b'Ciao ciao, World!!!'

>>> m.pop()  # less emphasis --> 33 == ord('!')
33
>>> bytes(m)
b'Ciao ciao, World!!'

>>> del m[m.index(b'l')]  # makes 'World' --> 'Word'
>>> bytes(m)
b'Ciao ciao, Word!!'

>>> m.popitem()  # less emphasis --> pops 33 (== '!') at address 16
(16, 33)
>>> bytes(m)
b'Ciao ciao, Word!'

>>> m.remove(b' ciao')  # self-explanatory
>>> bytes(m)
b'Ciao, Word!'

>>> i = m.index(b',')  # gets the address of the comma
>>> m.clear(start=i, endex=(i + 2))  # makes empty space between the words
>>> m.to_blocks()  # exports as data block list
[(0, b'Ciao'), (6, b'Word!')]
>>> m.contiguous  # multiple data blocks (emptiness inbetween)
False
>>> m.content_parts  # two data blocks
2
>>> m.content_size  # excluding emptiness
9
>>> len(m)  # including emptiness
11
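The effect of clearing a range can be pictured on a plain list of (start, data) blocks. The function clear_range below is a hypothetical stand-alone sketch, assuming start <= endex; it is not the library's algorithm, only a model that reproduces the block list shown above.

```python
def clear_range(blocks, start, endex):
    """Return new blocks with the addresses in [start, endex) emptied."""
    result = []
    for block_start, data in blocks:
        block_endex = block_start + len(data)

        # Part of the block lying before the cleared range.
        head_endex = min(block_endex, start)
        if block_start < head_endex:
            result.append((block_start, data[:head_endex - block_start]))

        # Part of the block lying after the cleared range.
        tail_start = max(block_start, endex)
        if tail_start < block_endex:
            result.append((tail_start, data[tail_start - block_start:]))
    return result


cleared = clear_range([(0, b'Ciao, Word!')], 4, 6)
```

A block that straddles the cleared range is split in two, which is why the memory stops being contiguous.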

>>> m.flood(pattern=b'.')  # replaces emptiness with dots
>>> bytes(m)
b'Ciao..Word!'
>>> m[-2]  # 100 == ord('d')
100

>>> m.peek(-2)  # 100 == ord('d')
100
>>> m.poke(-2, b'k')  # makes 'Word' --> 'Work'
>>> bytes(m)
b'Ciao..Work!'

>>> m.crop(start=m.index(b'W'))  # keeps 'Work!'
>>> m.to_blocks()
[(6, b'Work!')]
>>> m.span  # address range of the whole memory
(6, 11)
>>> m.start, m.endex  # same as above
(6, 11)

>>> m.bound_span = (2, 10)  # sets memory address bounds
>>> str(m)
"<2, [(6, b'Work')], 10>"
>>> m.to_blocks()
[(6, b'Work')]

>>> m.shift(-6)  # shifts to the left; NOTE: address bounds will cut 2 bytes!
>>> m.to_blocks()
[(2, b'rk')]
>>> str(m)
"<2, [(2, b'rk')], 10>"
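Why the shift loses two bytes can be seen with a small model of bounded blocks. The helper shift_and_crop is hypothetical and written only for this illustration: it moves each block by an offset and then clips it to the bounds, which is one plausible reading of the behavior shown above.

```python
def shift_and_crop(blocks, offset, bound_start, bound_endex):
    """Move blocks by offset, then keep only what lies within the bounds."""
    result = []
    for start, data in blocks:
        start += offset
        endex = start + len(data)

        # Clip the moved block to [bound_start, bound_endex).
        keep_start = max(start, bound_start)
        keep_endex = min(endex, bound_endex)
        if keep_start < keep_endex:
            result.append(
                (keep_start, data[keep_start - start:keep_endex - start])
            )
    return result


shifted = shift_and_crop([(6, b'Work')], -6, 2, 10)
```

With an offset of 0 the same helper also models the earlier step, where setting the bounds to (2, 10) trimmed the trailing '!' from 'Work!'.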

>>> a = bytesparse(b'Ma')
>>> a.write(0, m)  # writes (2, b'rk') --> 'Mark'
>>> a.to_blocks()
[(0, b'Mark')]

>>> b = Memory.from_bytes(b'ing', offset=4)
>>> b.to_blocks()
[(4, b'ing')]

>>> a.write(0, b)  # writes (4, b'ing') --> 'Marking'
>>> a.to_blocks()
[(0, b'Marking')]

>>> a.reserve(4, 2)  # inserts 2 empty bytes after 'Mark'
>>> a.to_blocks()
[(0, b'Mark'), (6, b'ing')]

>>> a.write(4, b'et')  # --> 'Marketing'
>>> a.to_blocks()
[(0, b'Marketing')]

>>> a.fill(1, -1, b'*')  # fills asterisks between the first and last letters
>>> a.to_blocks()
[(0, b'M*******g')]

>>> v = a.view(1, -1)  # creates a memory view spanning the asterisks
>>> v[::2] = b'1234'  # replaces even asterisks with numbers
>>> a.to_blocks()
[(0, b'M1*2*3*4g')]
>>> a.count(b'*')  # counts all the asterisks
3
>>> v.release()  # release memory view

>>> c = a.copy()  # creates a (deep) copy
>>> c == a
True
>>> c is a
False

>>> del a[a.index(b'*')::2]  # deletes every other byte from the first asterisk
>>> a.to_blocks()
[(0, b'M1234')]

>>> a.shift(3)  # moves away from the trivial 0 index
>>> a.to_blocks()
[(3, b'M1234')]
>>> list(a.keys())
[3, 4, 5, 6, 7]
>>> list(a.values())
[77, 49, 50, 51, 52]
>>> list(a.items())
[(3, 77), (4, 49), (5, 50), (6, 51), (7, 52)]

>>> c.to_blocks()  # reminder
[(0, b'M1*2*3*4g')]
>>> c[2::2] = None  # clears (empties) every other byte from the first asterisk
>>> c.to_blocks()
[(0, b'M1'), (3, b'2'), (5, b'3'), (7, b'4')]
>>> list(c.intervals())  # lists all the block ranges
[(0, 2), (3, 4), (5, 6), (7, 8)]
>>> list(c.gaps())  # lists all the empty ranges
[(None, 0), (2, 3), (4, 5), (6, 7), (8, None)]
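The relation between block ranges and empty ranges can be sketched on a plain block list. The two generator functions below are stand-alone illustrations that happen to share names with the methods above; they are not the library's implementation. None marks an unbounded side, as in the output shown.

```python
def intervals(blocks):
    """Yield (start, endex) for each (start, data) block."""
    for start, data in blocks:
        yield start, start + len(data)


def gaps(blocks):
    """Yield the empty ranges around and between the blocks."""
    previous_endex = None  # None marks the unbounded side
    for start, endex in intervals(blocks):
        yield previous_endex, start
        previous_endex = endex
    yield previous_endex, None


blocks = [(0, b'M1'), (3, b'2'), (5, b'3'), (7, b'4')]
block_ranges = list(intervals(blocks))
empty_ranges = list(gaps(blocks))
```

With n blocks this sketch always yields n + 1 gaps, the first and last being open-ended.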

>>> c.flood(pattern=b'xy')  # fills empty spaces
>>> c.to_blocks()
[(0, b'M1x2x3x4')]

>>> t = c.cut(c.index(b'1'), c.index(b'3'))  # cuts an inner slice
>>> t.to_blocks()
[(1, b'1x2x')]
>>> c.to_blocks()
[(0, b'M'), (5, b'3x4')]
>>> t.bound_span  # address bounds of the slice (automatically activated)
(1, 5)

>>> k = bytesparse.from_blocks([(4, b'ABC'), (9, b'xy')], start=2, endex=15)  # bounded
>>> str(k)  # shows summary
"<2, [(4, b'ABC'), (9, b'xy')], 15>"
>>> k.bound_span  # defined at creation
(2, 15)
>>> k.span  # superseded by bounds
(2, 15)
>>> k.content_span  # actual content span (min/max addresses)
(4, 11)
>>> len(k)  # superseded by bounds
13
>>> k.content_size  # actual content size
5
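The numbers above follow from simple arithmetic on the blocks and the bounds. The functions below are hypothetical stand-ins that share names with the attributes for readability; they assume a non-empty, sorted block list and are only a sketch of how the values relate.

```python
def content_span(blocks):
    """(start, endex) of the stored data; assumes non-empty, sorted blocks."""
    first_start = blocks[0][0]
    last_start, last_data = blocks[-1]
    return first_start, last_start + len(last_data)


def span(blocks, bound_start=None, bound_endex=None):
    """Like content_span, but each bound that is set takes precedence."""
    content_start, content_endex = content_span(blocks)
    start = content_start if bound_start is None else bound_start
    endex = content_endex if bound_endex is None else bound_endex
    return start, endex


blocks = [(4, b'ABC'), (9, b'xy')]
k_content_span = content_span(blocks)
k_span = span(blocks, bound_start=2, bound_endex=15)
k_length = k_span[1] - k_span[0]                        # counts emptiness
k_content_size = sum(len(data) for _, data in blocks)   # data bytes only
```

Without bounds the two spans coincide, as in the earlier m.span example where the single block (6, b'Work!') gave (6, 11).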

>>> k.flood(pattern=b'.')  # floods between span
>>> k.to_blocks()
[(2, b'..ABC..xy....')]

Background

This library started as a spin-off of hexrec.blocks.Memory. That was based on a simple Python implementation using immutable objects (i.e. tuple and bytes). While good enough to handle common hexadecimal files, it was totally unsuited for dynamic/interactive environments, such as emulators, IDEs, data editors, and so on. Instead, bytesparse should be more flexible and faster, hopefully suitable for generic usage.

While developing the Python implementation, why not also jump on the Cython bandwagon, which permits even faster algorithms? Moreover, Cython itself is an interesting intermediate language, which approaches the speed of C whilst staying close enough to Python.

Too bad, one great downside is that debugging Cython-compiled code is quite a hassle -- that is why I debugged it in a crude way I cannot even mention, and the reason why there must be dozens of bugs hidden around there, despite the test suite :-) Moreover, the Cython implementation is still experimental, with some features yet to be polished (e.g. reference counting).

Documentation

For the full documentation, please refer to:

https://bytesparse.readthedocs.io/en/latest/

Installation

From PyPI (might not be the latest version found on GitHub):

$ pip install bytesparse

From the source code root directory:

$ pip install .

Development

To run all the tests:

$ pip install tox
$ tox
