We a running an instance ofApache syncope with around 130k users (ca. 33k withactive orpending state). When searching for users by attributes we see long query times often over 50 seconds. One example query might be
select * from user_search_attr where schema_id = 'familyName' and stringvalue = 'Bergmann'user_search_attr is a view defined as follows:
CREATE OR REPLACEALGORITHM = UNDEFINED VIEW `user_search_attr` ASselect `u`.`id` AS `any_id`, `attrs`.`schema_id` AS `schema_id`, `attrs`.`booleanvalue` AS `booleanvalue`, `attrs`.`datevalue` AS `datevalue`, `attrs`.`doublevalue` AS `doublevalue`, `attrs`.`longvalue` AS `longvalue`, `attrs`.`stringvalue` AS `stringvalue`from (`SyncopeUser` `u`left join JSON_TABLE(coalesce(`u`.`plainAttrs`, '[]'), '$[*]' COLUMNS (`schema_id` varchar(255) PATH '$.schema', NESTED PATH '$.values[*]' COLUMNS (`booleanvalue` int(11) PATH '$.booleanValue', `datevalue` varchar(32) PATH '$.dateValue', `doublevalue` double PATH '$.doubleValue', `longvalue` bigint(20) PATH '$.longValue', `stringvalue` varchar(255) PATH '$.stringValue'))) `attrs` on (1 = 1))where `attrs`.`schema_id` is not null and (`attrs`.`booleanvalue` is not null or `attrs`.`datevalue` is not null or `attrs`.`doublevalue` is not null or `attrs`.`longvalue` is not null or `attrs`.`stringvalue` is not null);As you can see the data comes from the tableSyncopeUser which is defined as follows:
CREATE TABLE `SyncopeUser` ( `id` varchar(36) NOT NULL, `creationContext` varchar(255) DEFAULT NULL, `creationDate` datetime(3) DEFAULT NULL, `creator` varchar(255) DEFAULT NULL, `lastChangeContext` varchar(255) DEFAULT NULL, `lastChangeDate` datetime(3) DEFAULT NULL, `lastModifier` varchar(255) DEFAULT NULL, `status` varchar(255) DEFAULT NULL, `changePwdDate` datetime(3) DEFAULT NULL, `cipherAlgorithm` varchar(20) DEFAULT NULL, `failedLogins` int(11) DEFAULT NULL, `lastLoginDate` datetime(3) DEFAULT NULL, `mustChangePassword` int(11) DEFAULT NULL, `password` varchar(255) DEFAULT NULL, `passwordHistory` text DEFAULT NULL, `securityAnswer` varchar(255) DEFAULT NULL, `suspended` int(11) DEFAULT NULL, `token` text DEFAULT NULL, `tokenExpireTime` datetime(3) DEFAULT NULL, `username` varchar(255) DEFAULT NULL, `plainAttrs` longtext CHARACTER SET utf8mb4 COLLATE utf8mb4_bin DEFAULT NULL CHECK (json_valid(`plainAttrs`)), `REALM_ID` varchar(36) DEFAULT NULL, `SECURITYQUESTION_ID` varchar(36) DEFAULT NULL, PRIMARY KEY (`id`), UNIQUE KEY `U_SYNCPSR_USERNAME` (`username`), UNIQUE KEY `SyncopeUser_username` (`username`), KEY `SECURITYQUESTION_ID` (`SECURITYQUESTION_ID`), KEY `SyncopeUser_realm_id` (`REALM_ID`), CONSTRAINT `SyncopeUser_ibfk_1` FOREIGN KEY (`REALM_ID`) REFERENCES `Realm` (`id`), CONSTRAINT `SyncopeUser_ibfk_2` FOREIGN KEY (`SECURITYQUESTION_ID`) REFERENCES `SecurityQuestion` (`id`)) ENGINE=InnoDB DEFAULT CHARSET=utf8mb3 COLLATE=utf8mb3_general_ci;The relevant columnplainAttrs contains the users attibutes as JSON string. One example might be[{"uniqueValue":{"stringValue":"123456789"},"schema":"lockSystemId"},{"values":[{"stringValue":"Bergmann"}],"schema":"familyName"}].
The following indexes are defined onSyncopeUser:
Table |Non_unique|Key_name |Seq_in_index|Column_name |Collation|Cardinality|Sub_part|Packed|Null|Index_type|Comment|Index_comment|Ignored|-----------+----------+--------------------+------------+-------------------+---------+-----------+--------+------+----+----------+-------+-------------+-------+SyncopeUser| 0|PRIMARY | 1|id |A | 149635| | | |BTREE | | |NO |SyncopeUser| 0|U_SYNCPSR_USERNAME | 1|username |A | 149635| | |YES |BTREE | | |NO |SyncopeUser| 0|SyncopeUser_username| 1|username |A | 149635| | |YES |BTREE | | |NO |SyncopeUser| 1|SECURITYQUESTION_ID | 1|SECURITYQUESTION_ID|A | 1| | |YES |BTREE | | |NO |SyncopeUser| 1|SyncopeUser_realm_id| 1|REALM_ID |A | 1| | |YES |BTREE | | |NO |As most normal users have over 15 attributes theuser_search_attr view contains over 2M rows.
The servers are VMs with currently 8 GB of RAM and 6 CPUs but this can be changed. We are currently running version 11.4.7 of MariaDB in a galera cluster of 3 Servers. Syncope is only connected to one of the servers as this cluster is also used for other applications and we don't want to influnce these other applications.
Here are some of the IMHO relevant configuration options:
[mysqld]aria_pagecache_buffer_size = 32Mbasedir = /usrbind-address = ::binlog_format = ROWcharacter_set_server = utf8collation_server = utf8_general_ciinnodb_adaptive_hash_index = OFFinnodb_autoinc_lock_mode = 2innodb_buffer_pool_instances = 1innodb_buffer_pool_size = 4096Minnodb_flush_log_at_trx_commit = 0innodb_log_buffer_size = 16MBinnodb_log_file_size = 512Mkey_buffer_size = 1Mlog-error = /var/log/mariadb/mariadb.loglog_queries_not_using_indexes = OFFlong_query_time = 4.0max_allowed_packet = 128Mmax_binlog_size = 100Mmax_connections = 400max_heap_table_size = 256Mperformance_schema = ONquery_cache_limit = 16Mquery_cache_size = 0query_cache_type = OFFskip-external-lockingskip-name-resolve = 1slow_query_log = 0slow_query_log_file = /var/log/mariadb/slow.logsort_buffer_size = 4Mtable_definition_cache = 800table_open_cache = 1000thread_cache_size = 8thread_stack = 256Ktmp_table_size = 256Mwsrep_on = ONwsrep_provider = /usr/lib64/galera-4/libgalera_smm.soAs most queries only search foractive orpending users one option that I have not tried is indexing or partitioning the table for thestatus column.
Currently we are not in the position of changing the design of the database or even switching to another IDM sollution. We hope to find optimazion potential on the DB layer or maybe some improvements on some queries.
- thanks for the question @Luuk. I am not the designer of the software but only a user. I think that the reason for the
varchar(255)in the case of thestatuscolumn is that the values are not fixed by design. The status of a user is determined by a workflow and the status is the state the workflow ends up in after changes to a user.Clemens Bergmann– Clemens Bergmann2025-08-03 10:38:12 +00:00CommentedAug 3 at 10:38 - "I am not the designer of the software but only a user" Fro slow queries you should also contact the designer, dis you do that ?Luuk– Luuk2025-08-03 11:05:51 +00:00CommentedAug 3 at 11:05
- Hi, yes. I posted a link to this question in the developer mailinglist. Maybe they can contribute here to.Clemens Bergmann– Clemens Bergmann2025-08-03 13:06:16 +00:00CommentedAug 3 at 13:06
- 1refmailing list post.
ANALYZE FORMAT=JSON SELECT * from user_search_attr ...will show a table scan. Suggestions for view: 1) JOIN (not LEFT), 2) dateValue is date(time) type 3) WHERE criteria in view - make part of tablecheck constraint.danblack– danblack2025-08-04 03:41:03 +00:00CommentedAug 4 at 3:41
1 Answer1
JSON_TABLE unfortunately isn't a Magicly indexed construct that copes well yet with large searches.
Its possible to create a materialised table that is maintained with triggers that will be fast to search.
Step 1: Create the table
CREATE OR REPLACE TABLE `SyncopeAttrs` (`any_id` varchar(36) NOT NULL, `schema_id` varchar(255), `booleanvalue` bool, `datevalue` bigint, `doublevalue` DOUBLE, `longvalue` bigint, `stringvalue` varchar(255), key(id, schema_id), key(schema_id)) ASSELECT u.id, attrs. schema_id, attrs.`booleanvalue`, `attrs`.`datevalue`, `attrs`.`doublevalue`, `attrs`.`longvalue`, `attrs`.`stringvalue`FROM `SyncopeUser` `u`LEFT JOIN JSON_TABLE(coalesce(`u`.`plainAttrs`, '[]'), '$[*]' COLUMNS (`schema_id` varchar(255) PATH '$.schema', NESTED PATH '$.values[*]' COLUMNS ( `booleanvalue` bool PATH '$.booleanValue', `datevalue` bigint PATH '$.dateValue', `doublevalue` DOUBLE PATH '$.doubleValue', `longvalue` bigint(20) PATH '$.longValue', `stringvalue` varchar(255) PATH '$.stringValue'))) `attrs`;This is a representation of the table at the time of the query:
Step 2: Create Triggers
A DELETE, INSERT and UPDATE triggers are required to keep the table in sync
Step 2a: Create the Update trigger
Note this will only trigger on the applicable fields of the table. I don't expect forid to the the column that changes but let us account for that anyway.
DELIMITER //CREATE TRIGGER trg_syncopeuser_updateAFTER UPDATE ON SyncopeUser OF plainAttrs, idFOR EACH ROWBEGIN DELETE FROM SyncopeAttrs WHERE id = NEW.id; DELETE FROM SyncopeAttrs WHERE id = OLD.id; INSERT INTO SyncopeAttrs ( any_id, schema_id, booleanvalue, datevalue, doublevalue, longvalue, stringvalue ) SELECT NEW.id, jt.schema_id, jt.booleanvalue, jt.datevalue, jt.doublevalue, jt.longvalue, jt.stringvalue FROM JSON_TABLE( COALESCE(NEW.plainAttrs, '[]'), '$[*]' COLUMNS ( schema_id VARCHAR(255) PATH '$.schema', NESTED PATH '$.values[*]' COLUMNS ( booleanvalue INT PATH '$.booleanValue', datevalue BIGINT PATH '$.dateValue', doublevalue DOUBLE PATH '$.doubleValue', longvalue BIGINT PATH '$.longValue', stringvalue VARCHAR(255) PATH '$.stringValue' ) ) ) AS jt; END IF;END;//2.INSERT
CREATE TRIGGER trg_syncopeuser_insertAFTER INSERT ON SyncopeUserFOR EACH ROWBEGINDELETE FROM SyncopeAttrs WHERE id = NEW.id;INSERT INTO SyncopeAttrs (any_id, schema_id, booleanvalue, datevalue, doublevalue, longvalue, stringvalue) SELECT NEW.id, jt.schema_id, jt.booleanvalue, jt.datevalue, jt.doublevalue, jt.longvalue, jt.stringvalue FROM JSON_TABLE( COALESCE(NEW.plainAttrs, '[]'), '$[*]' COLUMNS ( schema_id VARCHAR(255) PATH '$.schema', NESTED PATH '$.values[*]' COLUMNS ( booleanvalue INT PATH '$.booleanValue', datevalue BIGINT PATH '$.dateValue', doublevalue DOUBLE PATH '$.doubleValue', longvalue BIGINT PATH '$.longValue', stringvalue VARCHAR(255) PATH '$.stringValue' ) ) ) AS jt;END;2.DELETE
CREATE TRIGGER trg_syncopeuser_deleteBEFORE DELETE ON SyncopeUserFOR EACH ROW DELETE FROM SyncopeAttrs WHERE any_id = OLD.id;Step 3 Replace the view
CREATE OR REPLACEALGORITHM = UNDEFINED VIEW `user_search_attr` ASselect `any_id` AS `any_id`, `schema_id` AS `schema_id`, `booleanvalue` AS `booleanvalue`, `datevalue` AS `datevalue`, `doublevalue` AS `doublevalue`, `longvalue` AS `longvalue`, `stringvalue` AS `stringvalue`from SyncopeAttrs attrs;After this queries on the attributes of the table will be faster because they are using an indexed real table.
Note:booleanvalue change tobool for my compulsive type enforcement disorder :-)
Edit 1:datevalue in Syncope appears to a be a millisecond epoch so used a bigint for a lossless compatible form (seemed better thanvarchar), but date/datetime couldn't be used.
Edit 2: normalised naming so the view is pretty consistent with table name columns. Is possible to make the table as the view nameuser_search_attr
ref:https://sqlize.online/sql/mariadb118/51f5a71cfc6670d06c566f0db6ade4f2/
- That looks like a good idea. I try this in the testing environment and report back. This might take until next week as I am on vacation this week. One more question: what is the reason for using another table and not replacing the view with a table?Clemens Bergmann– Clemens Bergmann2025-08-05 04:47:44 +00:00CommentedAug 5 at 4:47
- quite right, it should be able to use a table instead of a view. The
idis replaced withany_idwhich might be important for Syncope. Alsoschema_idmight need to be nullable if there isn't any attributes for an id. Feedback (edits if able) welcome and I'll incorporate it into the answer. Have a good vacation.danblack– danblack2025-08-05 05:40:11 +00:00CommentedAug 5 at 5:40 - Hi, I was not happy about the fact that the code contains a variation of the initial view in three places. This makes it hard to maintain should the view change. Also there is not only this JSON_TABLE view but 6 in total so the code could get quite large and repetative. One solution could be to keep the view around (renaming it to
user_search_attr_view) and replace the inserts in the triggers with something likeINSERT INTO user_search_attr SELECT * FROM user_search_attr_view where any_id = NEW.id;. Why is this a bad idea?Clemens Bergmann– Clemens Bergmann2025-08-10 07:36:45 +00:00CommentedAug 10 at 7:36 - I was seeing the construction of
SyncopeAttrsas a way that the behaviour could be validated before replacinguser_search_attr. Renaming the view first and then making the 1 table and 3 triggers sounds perfectly reasonable to allow the removal of the old view simply at the end step. The original view included a reference toSyncopeUserso a trigger on aSyncopeUsercould be selecting for a view that is on the same row. I suspect there could be an error when this occurs.danblack– danblack2025-08-10 08:11:16 +00:00CommentedAug 10 at 8:11
Explore related questions
See similar questions with these tags.
