- Notifications
You must be signed in to change notification settings - Fork3
Convert arrays to bitmask representations to quickly operate with them through bitwise operations. (uses JSBI)
License
codescrum/bitmask-set-jsbi-js
Folders and files
| Name | Name | Last commit message | Last commit date | |
|---|---|---|---|---|
Repository files navigation
Convert arrays to bitmask representations to quickly operate with them through bitwise operations.
In general, this approach can be applied to whenever you want to:
- Quickly retrieve the intersection betwen two (or more) arrays of elements.
- Quickly add/remove elements of other arrays to/from a particular array.
The approach implemented here might work best if you need to operatewith many arrays/sets instead of just a couple.
Important: This package depends on the JSBI — pure-JavaScript BigInts library for compatibility.
Seehttps://github.com/GoogleChromeLabs/jsbi,https://www.npmjs.com/package/jsbiandhttps://v8.dev/features/bigint for information on this.
You can check the state of BigInt adoption here:https://caniuse.com/?search=bigint.
For a more modern version of this package (using native BigInt) please see:https://github.com/codescrum/bitmask-set-js
npm install @codescrum/bitmask-set-jsbi
import{BitmaskSet,Bitmask}from"@codescrum/bitmask-set-jsbi"// Given an array that you want to operate onletmyArray=[1,2,3,4,5,6,7,8,9]// We first define the `BitmaskSet` which we will work with.letset=newBitmaskSet(myArray)// Then we define some bitmasks, representing different sets of elements// you pass either their elements, or a strings of 1's and 0'sleta=set.bitmask([1,3,5,7,9])// let a = set.bitmask("101010101")letb=set.bitmask([2,4,6,8])// let b = set.bitmask("010101010")letc=set.bitmask([1,3,5])// let c = set.bitmask("101010000")letd=set.bitmask([6,8])// let d = set.bitmask("000001010")lete=set.bitmask([1,2,3,4,5])// let e = set.bitmask("111110000")letf=set.bitmask([6,7,8,9])// let e = set.bitmask("000001111")letg=set.bitmask([1,9])// let g = set.bitmask("100000001")// For ease of visualization we defined the elements in order when// creating the bitmasks, but you can pass the elements in any order, even// if they include duplicates when creating the bitmasks.//// Note that you must use the `BitmaskSet` instance to create the// bitmasks based on it, or alternatively instantiate them as:letbitmask=newBitmask(set,[1,2,3])// same as set.bitmask([1,3,5])// Print a bitmask's string representationconsole.log("bitmask: "+bitmask)// bitmask: 111000000//// All the following methods return bitmasks// These methods are the most basic ones and can only take other// bitmasks as argumentsa.and(b)// 000000000 bitwise anda.or(b)// 111111111 bitwise ora.xor(e)// 010100101 bitwise xora.invert()// 010101010 (like bitwise not)// These methods can take both bitmasks and elements as arguments// Also, they always return a new bitmask (they don't mutate the bitmask// you call them on.//// Note that some are just the same with different names for// semantics convenience (expressiveness)a.add([1,2,3,4,5])// 111110101 you can add an array of elementsa.add(e)// 111110101 you can add elements from another bitmask tooa.union(e)// 000000101 same as add (added for convenience)a.include(e)// 111110101 same as add (added for convenience)a.remove([1,2,3,4,5])// 000000101 you can remove an array of elementsa.remove(e)// 000000101 you can remove elements from another bitmask tooa.exclude(e)// 000000101 same as remove (added for convenience)a.distinct(e)// 000000101 same as xor (added for convenience)a.unlike(e)// 000000101 same as xor (added for convenience)a.different(e)// 000000101 same as xor (added for convenience)a.intersection(e)// 101010000 return elements in commona.intersect(e)// 101010000 same as intersection (added for convenience)a.is_in(b)// true - checks if all elements of a are in b.a.is_not_in(b)// false - a.is_not_in(b) == !a.is_in(b)a.is_full()// false - checks if all elements are present (i.e. "111111111")a.is_empty()// false - checks if no elements are present (i.e. "000000000")a.is_zero()// false - same as is_empty (i.e. "000000000")// All bitmask methods are `lazy` in the sense they don't compute any// elements while you perform operations on them.// As long as you are in "bitmask space" you manipulate everything through// bitwise operations.// Once you are ready to get your final items, just call `elements()`a.invert().elements()// [2,4,6,8] (note: this result will be memoized)//// Some more examples// Compare to string representationconsole.log("a == '101010101': ",(a=='101010101'))// true// Equality between bitmasksconsole.log("a.invert().equals(b): ",a.invert().equals(b))// trueconsole.log(`${a}.is_in(${b}): `,a.is_in(b))// falseconsole.log(`${a}.is_in(${c}): `,a.is_in(c))// falseconsole.log(`${c}.is_in(${a}): `,c.is_in(a))// trueconsole.log(`${d}.is_in(${f}): `,d.is_in(f))// trueconsole.log(`${g}.is_in(${a}): `,g.is_in(a))// trueconsole.log(`${g}.is_in(${b}): `,g.is_in(b))// false// Note that you can chain methods since they keep returning// bitmasks, so a given selection of elements can be expressed// as follows:letresult=a// Take all elements from `a`.distinct(b)// mutually excluding those from `b`.and(e)// that are also in `e`.add(f)// then add the ones in `f`.remove(g)// then remove those in `g`.invert()// invert the current selectionconsole.log("result: "+result)// result: 100000001// After manipulations, compute final resulting elements.// Elements get computed when you first call `elements()`// in your resulting bitmask:console.debug(result.elements())// [ 1, 9 ]
A few things to keep in mind:
- The array of elements you pass in to create the
BitmaskSetis assumed to be of unique values (no duplicates). Thismay not be an issue, since the internal presentation will remove duplicate elements anyway. - The order of the elements you pass in to create the BitmaskSet or the bitmasksis really not important. But the internal operations do rely on this order being preserved.
- The array of elements cannot be empty, you must check this case beforeattempting to create the
BitmaskSet.
The basic idea is very simple, it does the following:
Given an array of unique values, say
[1,2,3,4,5]we create bitmasks of thesame lenght as the numbers of elements (such as10101) to represent whichelements are present and which are not. Take these examples:11111would represent all 5 elements[1,2,3,4,5]10101would represent elements[1,3,5]00000would represent an empty set[]
These bitmasks are then encoded as BigInt values which we can operate withbitwise operations. You use these methods instead of looping and comparingin the usual way.
Please note that althogh here we use the wordset the actualimplementation is based onarrays since we need the indexes of elements tomaintain the internal consistency of the associated bitmasks.
- Clone repo.
npm installnpm test
If you want to interactively debug the code/tests you might want to invoke mocha as follows:
Place adebugger call anywhere to create a breakpoint for use with the NodeJS built-in debugger.
Then, run:
mocha inspectand inputc to continue to your breakpoint.
Or, alternatively run with theNODE_INSPECT_RESUME_ON_START=1 flag to continue on to your breakpoint, without pausing at the high-level imports (which is a bit confusing initially).
NODE_INSPECT_RESUME_ON_START=1 mocha inspectSeehttps://mochajs.org/#-inspect-inspect-brk-inspect andhttps://nodejs.org/api/debugger.html for more information.
Note: If themocha command is not found, you might want to either install globally (npm install -g mocha) or add the specific<project_root>/node_modules/.bin to your $PATH.
It may be possible to add the following features to this implementation:
- Provide reasonable behaviour and API whenever the set's elements is empty on initialization.
- Be able to extend the
BitmaskSetwith more elements dynamically. - Remove JSBI for native BigInt.https://github.com/GoogleChromeLabs/babel-plugin-transform-jsbi-to-bigint. Or perhaps completely use ArrayBuffer.
- Provide benchmarking against some other more "tradditional" approaches.
- Remove JSBI library.
This package was extracted from a particular implementation we had to createsome time ago.
Actual performance for this library hasn't been measured yet, but at the timethis approach was the one to be easier to reason about and worked very well forour use case, which was finding elements in common with many dozens ofthousand-element arrays quickly in the browser (for UI updates).
The approach implemented here might work best if you need to operate with manyarrays/sets instead of just a few.
Benchmarking this properly may require hundreds of arrays of thousands ofelements before computing the final elements, which is closer to the originalissue we had to solve.
Yes!, actually, try these ones first:
- https://lodash.com/
- https://github.com/lovasoa/fast_array_intersect
- https://github.com/YuJianrong/fast-array-diff
- https://github.com/chouguleds/array-operations
- https://socket.dev/npm/package/fast-loops
- https://npmmirror.com/package/nv-array-fast
- https://www.npmjs.com/package/gonfalon
- https://www.npmjs.com/package/big-bit-mask
- https://www.npmjs.com/package/easy-bits
- https://github.com/namuol/bm
We didn't tested them all, we just put a reasonble list for your reference.
MIT
Miguel Diaz (@gato-omega)[https://github.com/gato-omega]
About
Convert arrays to bitmask representations to quickly operate with them through bitwise operations. (uses JSBI)
Topics
Resources
License
Uh oh!
There was an error while loading.Please reload this page.
Stars
Watchers
Forks
Releases
Packages0
Uh oh!
There was an error while loading.Please reload this page.