Advanced Usage

Remarks on Storage

Before we dive deeper into the usage of TinyDB, we should stop for a moment and discuss how TinyDB stores data.

To convert your data to a format that is writable to disk, TinyDB uses the Python JSON module by default. It’s great when only simple data types are involved, but it cannot handle more complex data types like custom classes. On Python 2 it also converts strings to Unicode strings upon reading (described here).

If that causes problems, you can write your own storage that uses a more powerful (but also slower) library like pickle or PyYAML.
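
To make the limitation concrete, here is a small standard-library sketch that does not touch TinyDB itself. The Point class and the sample records are made up for illustration. It only shows that json rejects a value it has no encoding for, while pickle round-trips a non-JSON type such as a date.

```python
import datetime
import json
import pickle


class Point:
    """Hypothetical custom class, used only for this illustration."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


# The json module has no default encoding for arbitrary objects.
try:
    json.dumps({'name': 'John', 'position': Point(1, 2)})
    json_accepts_custom_class = True
except TypeError:
    json_accepts_custom_class = False

# pickle can serialize many types that JSON cannot, for example dates.
record = {'name': 'John', 'birthday': datetime.date(1990, 1, 1)}
restored = pickle.loads(pickle.dumps(record))
```

A storage built on pickle or PyYAML would trade this flexibility for speed, as noted above.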

Hint

Opening multiple TinyDB instances on the same data (e.g. with the JSONStorage) may result in unexpected behavior due to query caching. See Query Caching on how to disable the query cache.

Queries

With that out of the way, let’s start with TinyDB’s rich set of queries. There are two main ways to construct queries. The first one resembles the syntax of popular ORM tools:

>>> from tinydb import Query
>>> User = Query()
>>> db.search(User.name == 'John')

As you can see, we first create a new Query object and then use it to specify which fields to check. Searching for nested fields is just as easy:

>>> db.search(User.birthday.year == 1990)

Fields whose names are not valid Python identifiers cannot be accessed this way. In this case, you can switch to dict access notation:

>>> # This would be invalid Python syntax:
>>> db.search(User.country-code == 'foo')
>>> # Use this instead:
>>> db.search(User['country-code'] == 'foo')

In addition, you can apply an arbitrary transform function to a field, for example:

>>> from unidecode import unidecode
>>> db.search(User.name.map(unidecode) == 'Jose')
>>> # will match 'José' etc.

The second, traditional way of constructing queries is as follows:

>>> from tinydb import where
>>> db.search(where('field') == 'value')

Using where('field') is a shorthand for the following code:

>>> db.search(Query()['field'] == 'value')

Accessing nested fields with this syntax can be achieved like this:

>>> db.search(where('birthday').year == 1900)
>>> db.search(where('birthday')['year'] == 1900)

Advanced queries

In the Getting Started section you’ve learned about the basic comparisons (==, <, >, …). In addition to these, TinyDB supports the following queries:

>>> # Existence of a field:
>>> db.search(User.name.exists())

>>> # Regex:
>>> # Full item has to match the regex:
>>> db.search(User.name.matches('[aZ]*'))
>>> # Case insensitive search for 'John':
>>> import re
>>> db.search(User.name.matches('John', flags=re.IGNORECASE))
>>> # Any part of the item has to match the regex:
>>> db.search(User.name.search('b+'))

>>> # Custom test:
>>> test_func = lambda s: s == 'John'
>>> db.search(User.name.test(test_func))

>>> # Custom test with parameters:
>>> def test_func(val, m, n):
...     return m <= val <= n
>>> db.search(User.age.test(test_func, 0, 21))
>>> db.search(User.age.test(test_func, 21, 99))

Another case is when you have a dict and want to find all documents that match this dict. We call this searching for a fragment:

>>> db.search(Query().fragment({'foo': True, 'bar': False}))
[{'foo': True, 'bar': False, 'foobar': 'yes!'}]

You can also search for documents where a specific field matches the fragment:

>>> db.search(Query().field.fragment({'foo': True, 'bar': False}))
[{'field': {'foo': True, 'bar': False, 'foobar': 'yes!'}}]

When a field contains a list, you can also use the any and all methods. There are two ways to use them: with lists of values and with nested queries. Let’s start with the first one. Assuming we have a user object with a groups list like this:

>>> db.insert({'name': 'user1', 'groups': ['user']})
>>> db.insert({'name': 'user2', 'groups': ['admin', 'user']})
>>> db.insert({'name': 'user3', 'groups': ['sudo', 'user']})

Now we can use the following queries:

>>> # User's groups include at least one value from ['admin', 'sudo']
>>> db.search(User.groups.any(['admin', 'sudo']))
[{'name': 'user2', 'groups': ['admin', 'user']},
 {'name': 'user3', 'groups': ['sudo', 'user']}]
>>>
>>> # User's groups include all values from ['admin', 'user']
>>> db.search(User.groups.all(['admin', 'user']))
[{'name': 'user2', 'groups': ['admin', 'user']}]

In some cases you may want to have more complex any/all queries. This is where nested queries come in handy. Let’s set up a table like this:

>>> Group = Query()
>>> Permission = Query()
>>> groups = db.table('groups')
>>> groups.insert({
...     'name': 'user',
...     'permissions': [{'type': 'read'}]})
>>> groups.insert({
...     'name': 'sudo',
...     'permissions': [{'type': 'read'}, {'type': 'sudo'}]})
>>> groups.insert({
...     'name': 'admin',
...     'permissions': [{'type': 'read'}, {'type': 'write'}, {'type': 'sudo'}]})

Now let’s search this table using nested any/all queries:

>>> # Group has a permission with type 'read'
>>> groups.search(Group.permissions.any(Permission.type == 'read'))
[{'name': 'user', 'permissions': [{'type': 'read'}]},
 {'name': 'sudo', 'permissions': [{'type': 'read'}, {'type': 'sudo'}]},
 {'name': 'admin', 'permissions':
        [{'type': 'read'}, {'type': 'write'}, {'type': 'sudo'}]}]
>>> # Group has ONLY permission 'read'
>>> groups.search(Group.permissions.all(Permission.type == 'read'))
[{'name': 'user', 'permissions': [{'type': 'read'}]}]

As you can see, any tests if there is at least one document matching the query while all ensures all documents match the query.

The opposite operation, checking if a single item is contained in a list, is also possible using one_of:

>>> db.search(User.name.one_of(['jane', 'john']))

Query modifiers

TinyDB also allows you to use logical operations to modify and combine queries:

>>> # Negate a query:
>>> db.search(~ (User.name == 'John'))

>>> # Logical AND:
>>> db.search((User.name == 'John') & (User.age <= 30))

>>> # Logical OR:
>>> db.search((User.name == 'John') | (User.name == 'Bob'))

Note

When using & or |, make sure you wrap the conditions on both sides with parentheses or Python will mess up the comparison.

Also, when using negation (~) you’ll have to wrap the query you want to negate in parentheses.

The reason for these requirements is that the Python operators used for query modifiers (~, & and |) have a higher operator precedence than comparison operators. Simply put, ~User.name == 'John' is parsed by Python as (~User.name) == 'John' instead of ~(User.name == 'John'). See also the Python docs on operator precedence for details.
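
The grouping can be checked without TinyDB by asking Python’s own parser. The following sketch uses the standard ast module. The expressions are only parsed, never evaluated, so the names in them (User, a, b) do not need to exist.

```python
import ast

# How Python groups "~User.name == 'John'": the ~ is applied first.
negation = ast.parse("~User.name == 'John'", mode='eval').body
outer_node = type(negation).__name__       # the comparison is outermost
inner_node = type(negation.left).__name__  # ~User.name is its left operand

# How Python groups an unparenthesized "&" between two comparisons:
# "1 & b" is grouped first, leaving the chained comparison a == (1 & b) == 2.
conjunction = ast.parse("a == 1 & b == 2", mode='eval').body
middle_node = type(conjunction.comparators[0]).__name__
```

Here outer_node is 'Compare', inner_node is 'UnaryOp' and middle_node is 'BinOp', which is why the parentheses are required.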

Recap

Let’s review the query operations we’ve learned:

Queries
  • Query().field.exists(): Match any document where a field called field exists
  • Query().field.matches(regex): Match any document with the whole field matching the regular expression
  • Query().field.search(regex): Match any document with a substring of the field matching the regular expression
  • Query().field.test(func, *args): Match any document for which the function returns True
  • Query().field.all(query | list): If given a query, match all documents where all documents in the list field match the query. If given a list, match all documents where all documents in the list field are a member of the given list
  • Query().field.any(query | list): If given a query, match all documents where at least one document in the list field matches the query. If given a list, match all documents where at least one document in the list field is a member of the given list
  • Query().field.one_of(list): Match if the field is contained in the list
Logical operations on queries
  • ~ (query): Match documents that don’t match the query
  • (query1) & (query2): Match documents that match both queries
  • (query1) | (query2): Match documents that match at least one of the queries

Handling Data

Next, let’s look at some more ways to insert, update and retrieve data from your database.

Inserting data

As already described, you can insert a document using db.insert(...). In case you want to insert multiple documents, you can use db.insert_multiple(...):

>>> db.insert_multiple([
...     {'name': 'John', 'age': 22},
...     {'name': 'John', 'age': 37}])
>>> db.insert_multiple({'int': 1, 'value': i} for i in range(2))

In some cases it may also be useful to specify the document ID yourself when inserting data. You can do that by using the Document class:

>>> from tinydb.table import Document
>>> db.insert(Document({'name': 'John', 'age': 22}, doc_id=12))
12

The same is possible when using db.insert_multiple(...):

>>> db.insert_multiple([
...     Document({'name': 'John', 'age': 22}, doc_id=12),
...     Document({'name': 'Jane', 'age': 24}, doc_id=14),
... ])
[12, 14]

Note

Inserting a Document with an ID that already exists will result in a ValueError being raised.

Updating data

Sometimes you want to update all documents in your database. In this case, you can leave out the query argument:

>>> db.update({'foo': 'bar'})

When passing a dict to db.update(fields, query), it only allows you to update a document by adding or overwriting its values. But sometimes you may need to e.g. remove one field or increment its value. In that case you can pass a function instead of fields:

>>> from tinydb.operations import delete
>>> db.update(delete('key1'), User.name == 'John')

This will remove the key key1 from all matching documents. TinyDB comes with these operations:

  • delete(key): delete a key from the document
  • increment(key): increment the value of a key
  • decrement(key): decrement the value of a key
  • add(key, value): add value to the value of a key (also works for strings)
  • subtract(key, value): subtract value from the value of a key
  • set(key, value): set key to value

Of course you also can write your own operations:

>>> def your_operation(your_arguments):
...     def transform(doc):
...         # do something with the document
...         # ...
...     return transform
...
>>> db.update(your_operation(arguments), query)
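
As a concrete instance of this pattern, here is a hypothetical rename_key operation (not part of TinyDB). It is applied to a plain dict so the sketch runs on its own. The transform changes the document in place, like the built-in operations do.

```python
def rename_key(old, new):
    """Hypothetical custom operation: rename a key if it is present."""
    def transform(doc):
        # The document is modified in place; nothing is returned.
        if old in doc:
            doc[new] = doc.pop(old)
    return transform


# Applying the operation to a plain dict shows what each document undergoes.
sample = {'name': 'John', 'mail': 'john@example.com'}
rename_key('mail', 'email')(sample)
```

With a database, the same operation would be passed as db.update(rename_key('mail', 'email'), query).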

In order to perform multiple update operations at once, you can use the update_multiple method like this:

>>> db.update_multiple([
...     ({'int': 2}, where('char') == 'a'),
...     ({'int': 4}, where('char') == 'b'),
... ])

You can also mix normal updates with update operations:

>>> db.update_multiple([
...     ({'int': 2}, where('char') == 'a'),
...     (delete('int'), where('char') == 'b'),
... ])

Data access and modification

Upserting data

In some cases you’ll need a mix of both update and insert: upsert. This operation is provided a document and a query. If it finds any documents matching the query, they will be updated with the data from the provided document. On the other hand, if no matching document is found, it inserts the provided document into the table:

>>> db.upsert({'name': 'John', 'logged-in': True}, User.name == 'John')

This will update all users with the name John to have logged-in set to True. If no matching user is found, a new document is inserted with both the name set and the logged-in flag.

To use the ID of the document as the matching criterion, a Document with doc_id is passed instead of a query:

>>> db.upsert(Document({'name': 'John', 'logged-in': True}, doc_id=12))

Retrieving data

There are several ways to retrieve data from your database. For instance, you can get the number of stored documents:

>>> len(db)
3

Hint

This will return the number of documents in the default table (see the notes on the default table).

Then of course you can use db.search(...) as described in the Getting Started section. But sometimes you want to get only one matching document. Instead of using

>>> try:
...     result = db.search(User.name == 'John')[0]
... except IndexError:
...     pass

you can use db.get(...):

>>> db.get(User.name == 'John')
{'name': 'John', 'age': 22}
>>> db.get(User.name == 'Bobby')
None

Caution

If multiple documents match the query, there is no guarantee which one of them will be returned!

Often you don’t want to search for documents but only know whether they are stored in the database. In this case db.contains(...) is your friend:

>>> db.contains(User.name == 'John')

In a similar manner you can look up the number of documents matching a query:

>>> db.count(User.name == 'John')
2

Recap

Let’s summarize the ways to handle data:

Inserting data
  • db.insert_multiple(...): Insert multiple documents
Updating data
  • db.update(operation, ...): Update all matching documents with a special operation
Retrieving data
  • len(db): Get the number of documents in the database
  • db.get(query): Get one document matching the query
  • db.contains(query): Check if the database contains a matching document
  • db.count(query): Get the number of matching documents

Note

This was a new feature in v3.6.0

Using Document IDs

Internally TinyDB associates an ID with every document you insert. It’s returned after inserting a document:

>>> db.insert({'name': 'John', 'age': 22})
3
>>> db.insert_multiple([{...}, {...}, {...}])
[4, 5, 6]

In addition you can get the ID of already inserted documents using document.doc_id. This works both with get and all:

>>> el = db.get(User.name == 'John')
>>> el.doc_id
3
>>> el = db.all()[0]
>>> el.doc_id
1
>>> el = db.all()[-1]
>>> el.doc_id
12

Different TinyDB methods also work with IDs, namely: update, remove, contains and get. The first two also return a list of affected IDs.

>>> db.update({'value': 2}, doc_ids=[1, 2])
>>> db.contains(doc_id=1)
True
>>> db.remove(doc_ids=[1, 2])
>>> db.get(doc_id=3)
{...}
>>> db.get(doc_ids=[1, 2])
[{...}, {...}]

Using doc_id/doc_ids instead of a Query() is also slightly faster.

Recap

Let’s sum up the way TinyDB supports working with IDs:

Getting a document’s ID
  • db.insert(...): Returns the inserted document’s ID
  • db.insert_multiple(...): Returns the inserted documents’ IDs
  • document.doc_id: Get the ID of a document fetched from the db
Working with IDs
  • db.get(doc_id=...): Get the document with the given ID
  • db.contains(doc_id=...): Check if the db contains a document with the given ID
  • db.update({...}, doc_ids=[...]): Update all documents with the given IDs
  • db.remove(doc_ids=[...]): Remove all documents with the given IDs

Tables

TinyDB supports working with multiple tables. They behave just the same as the TinyDB class. To create and use a table, use db.table(name).

>>> table = db.table('table_name')
>>> table.insert({'value': True})
>>> table.all()
[{'value': True}]
>>> for row in table:
...     print(row)
{'value': True}

To remove a table from a database, use:

>>> db.drop_table('table_name')

If on the other hand you want to remove all tables, use the counterpart:

>>> db.drop_tables()

Finally, you can get a set with the names of all tables in your database:

>>> db.tables()
{'_default', 'table_name'}

Default Table

TinyDB uses a table named _default as the default table. All operations on the database object (like db.insert(...)) operate on this table. The name of this table can be changed by setting default_table_name, either on a single instance or as a class variable to change it for all instances:

>>> #1: for a single instance only
>>> db = TinyDB(storage=SomeStorage)
>>> db.default_table_name = 'my-default'
>>> #2: for all instances
>>> TinyDB.default_table_name = 'my-default'

Query Caching

TinyDB caches query results for performance. That way re-running a query won’t have to read the data from the storage as long as the database hasn’t been modified. You can tune the query cache size by passing cache_size to the table(...) function:

>>> table = db.table('table_name', cache_size=30)

Hint

You can set cache_size to None to make the cache unlimited in size. Also, you can set cache_size to 0 to disable it.

Hint

It’s not possible to open the same table multiple times with different settings. After the first invocation, all the subsequent calls will return the same table with the same settings as the first one.

Hint

The TinyDB query cache doesn’t check if the underlying storage that the database uses has been modified by an external process. In this case the query cache may return outdated results. To clear the cache and read data from the storage again you can use db.clear_cache().

Hint

When using an unlimited cache size and test() queries, TinyDB will store a reference to the test function. As a result of that behavior, long-running applications that use lambda functions as a test function may experience memory leaks.
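
The general mechanism behind this hint can be illustrated without TinyDB. The sketch below uses a plain dict as a hypothetical stand-in for an unbounded cache keyed by the test function. It is not TinyDB’s actual cache implementation, only a picture of why fresh lambdas accumulate while one reused function does not.

```python
# Hypothetical stand-in for an unbounded cache keyed by the test function.
cache = {}


def remember(test_func):
    # Keeping the function as a key also keeps the function object alive.
    cache[test_func] = 'cached result'


# A fresh lambda is a new object on every iteration, so each call adds an entry.
for _ in range(3):
    remember(lambda s: s == 'John')
entries_with_lambdas = len(cache)

cache.clear()


# A single named function is the same object every time, so one entry is kept.
def is_john(s):
    return s == 'John'


for _ in range(3):
    remember(is_john)
entries_with_named_function = len(cache)
```

Under that assumption, limiting cache_size or reusing one function object keeps the number of retained references bounded.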

Storage & Middleware

Storage Types

TinyDB comes with two storage types: JSON and in-memory. By default TinyDB stores its data in JSON files, so you have to specify the path where to store it:

>>> from tinydb import TinyDB, where
>>> db = TinyDB('path/to/db.json')

To use the in-memory storage, use:

>>> from tinydb.storages import MemoryStorage
>>> db = TinyDB(storage=MemoryStorage)

Hint

All arguments except for the storage argument are forwarded to the underlying storage. For the JSON storage you can use this to pass additional keyword arguments to Python’s json.dump(…) function. For example, you can set it to create prettified JSON files like this:

>>> db = TinyDB('db.json', sort_keys=True, indent=4, separators=(',', ': '))
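
To see what these keyword arguments do, the following standard-library sketch calls json.dumps directly on a made-up nested dict. The shape of the sample data is an assumption for illustration and is not read from a real TinyDB file.

```python
import json

# Made-up sample data; the nesting is only illustrative.
data = {'_default': {'1': {'name': 'John', 'age': 22}}}

# Default output: everything on a single line.
compact = json.dumps(data)

# Same options as in the hint: sorted keys, 4-space indent, no trailing spaces.
pretty = json.dumps(data, sort_keys=True, indent=4, separators=(',', ': '))
```

Both strings decode to the same data; only the layout of the file differs.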

To modify the default storage for all TinyDB instances, set the default_storage_class class variable:

>>> TinyDB.default_storage_class = MemoryStorage

In case you need to access the storage instance directly, you can use the storage property of your TinyDB instance. This may be useful to call methods directly on the storage or middleware:

>>> db = TinyDB(storage=CachingMiddleware(MemoryStorage))
>>> db.storage
<tinydb.middlewares.CachingMiddleware at 0x10991def0>
>>> db.storage.flush()

Middleware

Middleware wraps around an existing storage, allowing you to customize its behaviour.

>>> from tinydb.storages import JSONStorage
>>> from tinydb.middlewares import CachingMiddleware
>>> db = TinyDB('/path/to/db.json', storage=CachingMiddleware(JSONStorage))

Hint

You can nest middleware:

>>> db = TinyDB('/path/to/db.json',
...             storage=FirstMiddleware(SecondMiddleware(JSONStorage)))

CachingMiddleware

The CachingMiddleware improves speed by reducing disk I/O. It caches all read operations and writes data to disk after a configured number of write operations.

To make sure that all data is safely written when closing the table, use one of these ways:

# Using a context manager:
with database as db:
    # Your operations

# Using the close function
db.close()

MyPy Type Checking

TinyDB comes with type annotations that MyPy can use to make sure you’re using the API correctly. Unfortunately, MyPy doesn’t understand all code patterns that TinyDB uses. For that reason TinyDB ships a MyPy plugin that helps to correctly type check code that uses TinyDB. To use it, add it to the plugins list in the MyPy configuration file (typically located in setup.cfg or mypy.ini):

[mypy]
plugins = tinydb.mypy_plugin

What’s next

Congratulations, you’ve made it through the user guide! Now go and build something awesome or dive deeper into TinyDB with these resources:

« Getting Started | How to Extend TinyDB »