Schemas overview Stay organized with collections Save and categorize content based on your preferences.
This page discusses Spanner schema requirements, how to use the schemato create hierarchical relationships, and schema features. It also introducesinterleaved tables, which can improve query performance when querying tables ina parent-child relationship.
A schema is a namespace that contains database objects, such as tables, views,indexes, and functions. You use schemas to organize objects, apply fine-grained access controlprivileges, and avoid naming collisions. You must define a schema for eachdatabase in Spanner.
You can also further segment and store rows in your database table acrossdifferent geographic regions. For more information, see theGeo-partitioning overview.
Strongly typed data
Data in Spanner is strongly typed. Data types include scalar andcomplex types, which are described inData types in GoogleSQLandPostgreSQL data types.
Choose a primary key
Spanner databases can contain one or more tables. Tables arestructured as rows and columns. The table schema defines one or more tablecolumns as the table'sprimary key which uniquely identifies each row. Primarykeys are always indexed for quick row lookup. If you want to update or deleteexisting rows in a table, then the table must have a primary key. A table withno primary key columns can only have one row. Only GoogleSQL-dialect databasescan have tables without a primary key.
Often your application already has a field that's a natural fit for use as theprimary key. For example, for aCustomers table, there might be anapplication-suppliedCustomerId that serves well as the primary key. In othercases, you might need to generate a primary key when inserting the row. Thiswould typically be a unique integer value with no business significance (asurrogate primary key).
In all cases, you should be careful not to create hotspots with the choice ofyour primary key. For example, if you insert records with a monotonicallyincreasing integer as the key, you'll always insert at the end of your keyspace. This is undesirable because Spanner divides data amongservers by key ranges, which means your inserts will be directed at a singleserver, creating a hotspot. There are techniques that can spread the load acrossmultiple servers and avoid hotspots:
- Hash the key and store it ina column. Use the hash column (or the hash column and the unique key columnstogether) as the primary key.
- Swap the order of thecolumns in the primary key.
- Use a Universally Unique Identifier(UUID). Version 4UUID is recommended, because it usesrandom values in the high-order bits. Don't use a UUID algorithm (such asversion 1 UUID) that stores the timestamp in the high order bits.
- Bit-reversesequential values.
Parent-child table relationships
There are two ways to define parent-child relationships inSpanner:table interleaving andforeign keys.
Spanner's table interleaving is a good choice for manyparent-child relationships. With interleaving, Spanner physicallycolocates child rows with parent rows in storage. Co-location can significantlyimprove performance. For example, if you have aCustomers table and anInvoices table, and your application frequently fetches all the invoices for acustomer, you can defineInvoices as an interleaved child table ofCustomers. In doing so, you're declaring a data locality relationship betweentwo independent tables. You're telling Spannerto store one or more rows ofInvoices with oneCustomers row. Thisparent-child relationship is enforced when interleaved with theINTERLEAVE IN PARENT clause.INTERLEAVE IN child tables share the samephysical row interleaving characteristics, but Spanner doesn't enforce referentialintegrity between parent and child.
You associate a child table with a parent table by using DDL that declares thechild table as interleaved in the parent, and by including the parent tableprimary key as the first part of the child table composite primary key.
For more information about interleaving, seeCreate interleaved tables.
Foreign keys are a more general parent-child solution and address additional usecases. They are not limited to primary key columns, and tables can have multipleforeign key relationships, both as a parent in some relationships and a child inothers. However, a foreign key relationship does not imply co-location of thetables in the storage layer.
Google recommends that you choose to represent parent-child relationships eitheras interleaved tables or as foreign keys, but not both. For more information onforeign keys and their comparison to interleaved tables, seeForeign keysoverview.
Primary keys in interleaved tables
For interleaving, every table must have a primary key. If you declare a table tobe an interleaved child of another table, the table must have a compositeprimary key that includes all of the components of the parent's primary key, inthe same order, and, typically, one or more additional child table columns.
Spanner stores rows in sorted order by primary key values, withchild rows inserted between parent rows. See an illustration of interleaved rowsinCreate interleaved tables later in this page.
In summary, Spanner can physically colocate rows of relatedtables. Theschema examples show what this physical layoutlooks like.
Database splits
You can define hierarchies of interleaved parent-child relationships up to sevenlayers deep, which means that you can colocate rows of seven independent tables.If the size of the data in your tables is small, a single Spannerserver can probably handle your database. But what happens when your relatedtables grow and start reaching the resource limits of an individual server?Spanner is a distributed database, which means that as yourdatabase grows, Spanner divides your data into chunks called"splits." Individual splits can move independently from each other and getassigned to different servers, which can be in different physical locations. Asplit holds a range of contiguous rows. The start and end keys of this range arecalled "split boundaries". Spanner automatically adds and removessplit boundaries based on size and load, which changes the number of splits inthe database.
Load-based splitting
As an example of how Spanner performs load-based splitting tomitigate read hotspots, suppose your database contains a table with 10 rows thatare read more frequently than all of the other rows in the table.Spanner can add split boundaries between each of those 10 rows sothat they're each handled by a different server, rather than allowing all thereads of those rows to consume the resources of a single server.
As a general rule, if you followbest practices for schema design,Spanner can mitigate hotspots such that the read throughputshould improve every few minutes until you saturate the resources in yourinstance or run into cases where no new split boundaries can be added (becauseyou have a split that covers just a single row with no interleaved children).
Named schemas
Named schemas help you organize similar data together. This helps you to quicklyfind objects in the Google Cloud console, apply privileges, and avoid namingcollisions.
Named schemas, like other database objects, are managed using DDL.
Spanner named schemas permit you to use fully qualified names(FQNs) to query for data. FQNs let you combine the schema name and theobject name to identify database objects. For example, you could create a schemacalledwarehouse for the warehouse business unit. The tables that use thisschema could include:product,order, andcustomer information. Or youcould create a schema calledfulfillment for the fulfillment business unit.This schema could also have tables calledproduct,order, andcustomerinformation. In the first example, the FQN iswarehouse.product and in thesecond example, the FQN isfulfillment.product. This prevents confusion insituations where multiple objects share the same name.
In theCREATE SCHEMA DDL, table objects are given both an FQN, for example,sales.customers, and a short name, for example,sales.
The following database objects support named schemas:
TABLECREATEINTERLEAVE IN [PARENT]FOREIGN KEYSYNONYM
VIEWINDEXSEARCH INDEXFOREIGN KEYSEQUENCE
For more information about using named schemas, seeManage namedschemas.
Use fine-grained access control with named schemas
Named schemas let you grant schema-level access to each object in the schema.This applies to schema objects that exist at the time that you grant access.You must grant access to objects that are added later.
Fine-grained access control limits access to entire groups of database objects, such astables, columns, and rows in the table.
For more information, seeGrant fine-grained access control privileges to namedschemas.
Schema examples
The schema examples in this section show how to create parent and child tableswith and without interleaving, and illustrate the corresponding physical layoutsof data.
Create a parent table
Suppose you're creating a music application and you need a table that storesrows of singer data:
Note that the table contains one primary key column,SingerId, which appearsto the left of the bolded line, and that tables are organized by rows andcolumns.
You can define the table with the following DDL:
GoogleSQL
CREATETABLESingers(SingerIdINT64NOTNULLPRIMARYKEY,FirstNameSTRING(1024),LastNameSTRING(1024),SingerInfoBYTES(MAX),);
PostgreSQL
CREATETABLEsingers(singer_idBIGINTPRIMARYKEY,first_nameVARCHAR(1024),last_nameVARCHAR(1024),singer_infoBYTEA);
Note the following about the example schema:
Singersis a table at the root of the database hierarchy (because it's notdefined as an interleaved child of another table).- For GoogleSQL-dialect databases, primary key columns are usually annotated with
NOT NULL(though you can omit this annotation if you want to allowNULLvalues inkey columns. For more information, seeKeyColumns). - Columns that are not included in the primary key are called non-key columns,and they can have an optional
NOT NULLannotation. - Columns that use the
STRINGorBYTEStype in GoogleSQL must bedefined with a length, which represents the maximum number of Unicodecharacters that can be stored in the field. The length specification isoptional for the PostgreSQLvarcharandcharacter varyingtypes. For more information, seeScalar Data Typesfor GoogleSQL-dialect databases andPostgreSQL datatypes for PostgreSQL-dialect databases.
What does the physical layout of the rows in theSingers table look like? Thefollowing diagram shows rows of theSingers table stored by primary key("Singers(1)", and then "Singers(2)", where the number in parentheses isthe primary key value.
The preceding diagram illustrates an example split boundary between the rowskeyed bySingers(3) andSingers(4), with the data from the resulting splitsassigned to different servers. As this table grows, it's possible for rows ofSingers data to be stored in different locations.
Create parent and child tables
Assume that you now want to add some basic data about each singer's albums tothe music application.
Note that the primary key ofAlbums is composed of two columns:SingerId andAlbumId, to associate each album with its singer. The following example schemadefines both theAlbums andSingers tables at the root of the databasehierarchy, which makes them sibling tables.
-- Schema hierarchy:-- + Singers (sibling table of Albums)-- + Albums (sibling table of Singers)
GoogleSQL
CREATETABLESingers(SingerIdINT64NOTNULLPRIMARYKEY,FirstNameSTRING(1024),LastNameSTRING(1024),SingerInfoBYTES(MAX),);CREATETABLEAlbums(SingerIdINT64NOTNULL,AlbumIdINT64NOTNULL,AlbumTitleSTRING(MAX),)PRIMARYKEY(SingerId,AlbumId);
PostgreSQL
CREATETABLEsingers(singer_idBIGINTPRIMARYKEY,first_nameVARCHAR(1024),last_nameVARCHAR(1024),singer_infoBYTEA);CREATETABLEalbums(singer_idBIGINT,album_idBIGINT,album_titleVARCHAR,PRIMARYKEY(singer_id,album_id));
The physical layout of the rows ofSingers andAlbums looks like thefollowing diagram, with rows of theAlbums table stored by contiguous primarykey, then rows ofSingers stored by contiguous primary key:
One important note about the schema is that Spanner assumes nodata locality relationships between theSingers andAlbums tables, becausethey are top-level tables. As the database grows, Spanner can addsplit boundaries between any of the rows. This means the rows of theAlbumstable could end up in a different split from the rows of theSingers table,and the two splits could move independently from each other.
Depending on your application's needs, it might be fine to allowAlbums datato be located on different splits fromSingers data. However, this might incura performance penalty due to the need to coordinate reads and updates acrossdistinct resources. If your application frequently needs to retrieve informationabout all the albums for a particular singer, then you should createAlbums asan interleaved child table ofSingers, which colocates rows from the twotables along the primary key dimension. The next example explains this in moredetail.
Create interleaved tables
Aninterleaved table is a table that you declare to be an interleaved child ofanother table because you want the rows of the child table to be physicallystored with the associated parent row. As mentioned earlier, the parent tableprimary key must be the first part of the child table composite primary key.
After you interleave a table, it's permanent. You can't undo the interleaving.Instead, you need to create the table again and migrate data to it.
As you're designing your music application, suppose you realize that the appneeds to frequently access rows from theAlbums table when it accesses aSingers row. For example, when you access the rowSingers(1), you also needto access the rowsAlbums(1, 1) andAlbums(1, 2). In this case,SingersandAlbums need to have a strong data locality relationship. You can declarethis data locality relationship by creatingAlbums as an interleaved childtable ofSingers.
-- Schema hierarchy:-- + Singers-- + Albums (interleaved table, child table of Singers)
The bolded line in the following schema shows how to createAlbums as aninterleaved table ofSingers.
GoogleSQL
CREATETABLESingers(SingerIdINT64NOTNULLPRIMARYKEY,FirstNameSTRING(1024),LastNameSTRING(1024),SingerInfoBYTES(MAX),);CREATETABLEAlbums(SingerIdINT64NOTNULL,AlbumIdINT64NOTNULL,AlbumTitleSTRING(MAX),)PRIMARYKEY(SingerId,AlbumId),INTERLEAVEINPARENTSingersONDELETECASCADE;
PostgreSQL
CREATETABLEsingers(singer_idBIGINTPRIMARYKEY,first_nameVARCHAR(1024),last_nameVARCHAR(1024),singer_infoBYTEA);CREATETABLEalbums(singer_idBIGINT,album_idBIGINT,album_titleVARCHAR,PRIMARYKEY(singer_id,album_id))INTERLEAVEINPARENTsingersONDELETECASCADE;
Notes about this schema:
SingerId, which is the first part of the primary key of the child tableAlbums, is also the primary key of its parent tableSingers.- The
ON DELETE CASCADEannotation signifies that when a row from the parent table is deleted, itschild rows are automatically deleted as well. If a child table doesn't havethis annotation, or the annotation isON DELETE NO ACTION, then you mustdelete the child rows before you can delete the parent row. - Interleaved rows are ordered first by rows of the parent table, then bycontiguous rows of the child table that share the parent's primary key. Forexample, "Singers(1)", then "Albums(1, 1)", and then "Albums(1, 2)".
- The data locality relationship of each singer and their album data ispreserved if this database splits, provided that the size of a
Singersrowand all itsAlbumsrows stays below the split size limit and that there isno hotspot in any of theseAlbumsrows. - The parent row must exist before you can insert child rows. The parent rowcan either already exist in the database or can be inserted before theinsertion of the child rows in the same transaction.
Suppose you'd like to modelProjects and theirResources as interleavedtables. Certain scenarios could benefit fromINTERLEAVE IN - the ability tonot require theProjects row to exist for entities under it to exist (say, aProject has been deleted, but its resources need to be cleaned up beforedeleting).
GoogleSQL
CREATETABLEProjects(ProjectIdINT64NOTNULL,ProjectNameSTRING(1024),)PRIMARYKEY(ProjectId);CREATETABLEResources(ProjectIdINT64NOTNULL,ResourceIdINT64NOTNULL,ResourceNameSTRING(1024),)PRIMARYKEY(ProjectId,ResourceId),INTERLEAVEINProjects;
PostgreSQL
CREATETABLEProjects(ProjectIdBIGINTPRIMARYKEY,ProjectNameVARCHAR(1024),);CREATETABLEResources(ProjectIdBIGINT,ResourceIdBIGINT,ResourceNameVARCHAR(1024),PRIMARYKEY(ProjectId,ResourceId))INTERLEAVEINProjects;
Note that in this example we use theINTERLEAVE IN Projects clause, rather thanINTERLEAVE IN PARENT Projects. This indicates we don't enforce the parent-childrelationship between Projects and Resources.
In this example, theResources(1, 10) andResources(1, 20) rows can exist in thedatabase even if theProjects(1) row doesn't exist.Projects(1) can bedeleted even ifResources(1, 10) andResources(1, 20) still exist, and thedeletion doesn't affect theseResources rows.
Create a hierarchy of interleaved tables
The parent-child relationship betweenSingers andAlbums can be extended tomore descendant tables. For example, you could create an interleaved tablecalledSongs as a child ofAlbums to store the track list of each album:
Songs must have a primary key that includes all the primary keys of the tablesthat are at a higher level in the hierarchy, that is,SingerId andAlbumId.
-- Schema hierarchy:-- + Singers-- + Albums (interleaved table, child table of Singers)-- + Songs (interleaved table, child table of Albums)
GoogleSQL
CREATETABLESingers(SingerIdINT64NOTNULLPRIMARYKEY,FirstNameSTRING(1024),LastNameSTRING(1024),SingerInfoBYTES(MAX),);CREATETABLEAlbums(SingerIdINT64NOTNULL,AlbumIdINT64NOTNULL,AlbumTitleSTRING(MAX),)PRIMARYKEY(SingerId,AlbumId),INTERLEAVEINPARENTSingersONDELETECASCADE;CREATETABLESongs(SingerIdINT64NOTNULL,AlbumIdINT64NOTNULL,TrackIdINT64NOTNULL,SongNameSTRING(MAX),)PRIMARYKEY(SingerId,AlbumId,TrackId),INTERLEAVEINPARENTAlbumsONDELETECASCADE;
PostgreSQL
CREATETABLEsingers(singer_idBIGINTPRIMARYKEY,first_nameVARCHAR(1024),last_nameVARCHAR(1024),singer_infoBYTEA);CREATETABLEalbums(singer_idBIGINT,album_idBIGINT,album_titleVARCHAR,PRIMARYKEY(singer_id,album_id))INTERLEAVEINPARENTsingersONDELETECASCADE;CREATETABLEsongs(singer_idBIGINT,album_idBIGINT,track_idBIGINT,song_nameVARCHAR,PRIMARYKEY(singer_id,album_id,track_id))INTERLEAVEINPARENTalbumsONDELETECASCADE;
The following diagram represents a physical view of interleaved rows.
In this example, as the number of singers grows, Spanner addssplit boundaries between singers to preserve data locality between a singer andits album and song data. However, if the size of a singer row and its child rowsexceeds the split size limit, or a hotspot is detected in the child rows,Spanner attempts to add split boundaries to isolate that hotspotrow along with all child rows below it.
In summary, a parent table along with all of its child and descendant tablesforms a hierarchy of tables in the schema. Although each table in the hierarchyis logically independent, physically interleaving them this way can improveperformance, effectively pre-joining the tables and allowing you to accessrelated rows together while minimizing storage accesses.
Joins with interleaved tables
If possible, join data in interleaved tables by primary key. Because eachinterleaved row is usually stored physically in the same split as its parentrow, Spanner can perform joins by primary key locally, minimizingstorage access and network traffic. In the following example,Singers andAlbums are joined on the primary keySingerId.
GoogleSQL
SELECTs.FirstName,a.AlbumTitleFROMSingersASsJOINAlbumsASaONs.SingerId=a.SingerId;
PostgreSQL
SELECTs.first_name,a.album_titleFROMsingersASsJOINalbumsASaONs.singer_id=a.singer_id;
Locality groups
Note: This feature is available with the Spanner Enterprise edition and Enterprise Plus edition. For more information, see theSpanner editions overview.
Spanner uses locality groups to preserve data localityrelationships across table columns. If you don't explicitly create any localitygroups for your tables, Spanner groups all columns into thedefault locality group and stores the data of all tables on SSD storage. Youcan use locality groups to do the following:
Usetiered storage. Tiered storage is afully-managed storage feature that lets you choose whether to store your dataon solid-state drives (SSD) or hard disk drives (HDD). By default, withoutusing tiered storage, Spanner stores all data on SSD storage.
Use column-grouping to store specified columns separately from other columns.Because the data for the specified columns are stored separately, reading datafrom those columns is faster than if all data is grouped together. To usecolumn-grouping, you need tocreate a locality group without specifying anytiered storage options. Spanner uses locality groups to storethe specified columns separately. If specified, the columns inherit theirtiered storage policy from the table or default locality group. Then, use the
CREATE TABLEDDL statement toset a locality group for the specified columns or use theALTER TABLEDDL statement toalter the locality group used by a table's column.The DDL statement determines the columns that are stored in the localitygroup. Finally, you canread data in these columns moreefficiently.
Key columns
This section includes some notes about key columns.
Change table keys
The keys of a table can't change; you can't add a key column to an existingtable or remove a key column from an existing table.
Store NULLs in a primary key
In GoogleSQL, if you would like to store NULL in a primary key column,omit theNOT NULL clause for that column in the schema. (PostgreSQL-dialect databases don'tsupport NULLs in a primary key column.)
Here's an example of omitting theNOT NULL clause on the primary key columnSingerId. Note that becauseSingerId is the primary key, there can be onlyone row that storesNULL in that column.
CREATETABLESingers(SingerIdINT64PRIMARYKEY,FirstNameSTRING(1024),LastNameSTRING(1024),);
The nullable property of the primary key column must match between the parentand the child table declarations. In this example,NOT NULL for the columnAlbums.SingerId is not allowed becauseSingers.SingerId omits it.
CREATETABLESingers(SingerIdINT64PRIMARYKEY,FirstNameSTRING(1024),LastNameSTRING(1024),);CREATETABLEAlbums(SingerIdINT64NOTNULL,AlbumIdINT64NOTNULL,AlbumTitleSTRING(MAX),)PRIMARYKEY(SingerId,AlbumId),INTERLEAVEINPARENTSingersONDELETECASCADE;
Disallowed types
The following columns cannot be of typeARRAY:
- A table's key columns.
- An index's key columns.
Design for multi-tenancy
You might want to implement multi-tenancy if you are storing data that belongsto different customers. For example, a music service might want to store eachindividual record label's content separately.
Classic multi-tenancy
The classic way to design for multi-tenancy is to create a separate database foreach customer. In this example, each database has its ownSingers table:
| SingerId | FirstName | LastName |
|---|---|---|
| 1 | Marc | Richards |
| 2 | Catalina | Smith |
| SingerId | FirstName | LastName |
|---|---|---|
| 1 | Alice | Trentor |
| 2 | Gabriel | Wright |
| SingerId | FirstName | LastName |
|---|---|---|
| 1 | Benjamin | Martinez |
| 2 | Hannah | Harris |
Schema-managed multi-tenancy
Another way to design for multi-tenancy in Spanner is to have allcustomers in a single table in a single database, and to use a different primarykey value for each customer. For example, you could include aCustomerId keycolumn in your tables. If you makeCustomerId the first key column, then thedata for each customer has good locality. Spannercan then effectively usedatabase splits to maximizeperformance based on data size and load patterns. In the following example,there is a singleSingers table for all customers:
| CustomerId | SingerId | FirstName | LastName |
|---|---|---|---|
| 1 | 1 | Marc | Richards |
| 1 | 2 | Catalina | Smith |
| 2 | 1 | Alice | Trentor |
| 2 | 2 | Gabriel | Wright |
| 3 | 1 | Benjamin | Martinez |
| 3 | 2 | Hannah | Harris |
If you must have separate databases for each tenant, there are constraints to beaware of:
- There arelimits on the number of databases per instanceand the number of tables and indexes per database. Depending on the numberof customers, it might not be possible to have separate databases or tables.
- Adding new tables and non-interleaved indexescan take a longtime. You might notbe able to get the performance you want if your schema design depends onadding new tables and indexes.
If you want to create separate databases, you might have more success if youdistribute your tables across databases in such a way that each database has alow number of schema changes per week.
If you create separate tables and indexes for each customer of your application,don't put all of the tables and indexes in the same database. Instead, splitthem across many databases, to mitigate theperformanceissues with creating a large numberof indexes.
To learn more about other data management patterns and application design formulti-tenancy, seeImplementing Multi-Tenancy inSpanner
Except as otherwise noted, the content of this page is licensed under theCreative Commons Attribution 4.0 License, and code samples are licensed under theApache 2.0 License. For details, see theGoogle Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2026-02-05 UTC.