I am working withC#,Npgsql,EF Core andPostgres.
I defined an endpoint for a paginated search, where the filters and orderBy column are dynamic. The endpoint accepts aPaginationOptions instance:
public class PaginationOptions{ public int Page { get; set; } public int ItemsPerPage { get; set; } public string OrderBy { get; set; } public bool Desc { get; set; } public IList<FilterValues> Filters { get; set; }}public class FilterValues{ public string FieldName { get; set; } public IEnumerable<string> Values { get; set; }}The following method performs the search and returns a Tuple with the sorted items and a counter for the total items in the table:
public Tuple<IList<T>, int> Search(PaginationOptions paginationOptions){ if (!string.IsNullOrEmpty(paginationOptions.OrderBy)) { CheckFilterField(paginationOptions.OrderBy); } int offset = (paginationOptions.Page - 1) * paginationOptions.ItemsPerPage; string orderBy = !string.IsNullOrEmpty(paginationOptions.OrderBy) ? paginationOptions.OrderBy : $"{prefix}.title"; string order = paginationOptions.Desc ? "DESC" : "ASC"; using (NpgsqlConnection connection = GetConnection()) { string query = $"{GetQueryfields()} {GetFromClause()} {BuildWhere(paginationOptions.Filters)}"; string itemsQuery = $"SELECT {query} ORDER BY {orderBy} {order}"; NpgsqlCommand command = BuildCommand(connection, itemsQuery, paginationOptions.Filters); IDataReader reader = command.ExecuteReader(); ISet<Guid> guids = new HashSet<Guid>(paginationOptions.ItemsPerPage); while (reader.Read()) { Guid guid = reader.GetGuid(0); if (!guids.Contains(guid)) { guids.Add(guid); } } ISet<Guid> filteredGuids = guids.Skip(offset).Take(paginationOptions.ItemsPerPage).ToHashSet(); IList<T> items = GetItems(filteredGuids); return Tuple.Create(items, guids.Count); }}In words: In each entity there are the query fields and the FROM clause defiend. They are splitted because I need the FROM clause in another method as well.
The WHERE (prepared statement) and ORDER BY are built dynamically using the parameters. TheBuildCommand creates theNpgsqlCommand and sets the parameters. Then I use Dapper for a raw query in order to get the ids of the requested items, then I load them using theEF and at the end ISkip andTake in order to have the right pagination.
The problem ist thatEF does not allow to add an ORDER BY clause for raw queries, it is only available throug theLinq expression:
context.AnEntity.FromSqlRaw("Select * from users ORDER BY id").OrderBy(x => x.Title);ORDER BY id is ignored, items are sorted by the expression. If no orderby linq expression is used, the framework addsORDER BY entity.id. Otherwise I could have done followings:
string itemsQuery = $"SELECT {query} ORDER BY {orderBy} {order}";context.AnEntity.FromSqlRaw(itemsQuery).Skip(offset).Take(limit)...It works. Even on a table with 1mil a query takes 2,8sec
Comments? Improvment hints?
Edit:
I ended up with a query which loads the paged data in 2,2sec over a table with 1mil rows. Is it an acceptable result?
- \$\begingroup\$Is the Guid-column the primary key?\$\endgroup\$Heslacher– Heslacher2021-07-22 14:20:00 +00:00CommentedJul 22, 2021 at 14:20
- \$\begingroup\$Did you look at DynamicLinq or even making your own expression tree for the orderbys? example -codereview.stackexchange.com/a/256123/52662. Also you state its an endpoint if it's a webapi project what about just adding OData support for the pagenation? OData has some flexibility to intercept the iqueryable.\$\endgroup\$CharlesNRice– CharlesNRice2021-07-22 14:24:16 +00:00CommentedJul 22, 2021 at 14:24
- \$\begingroup\$yes, the
id (uuid)column is the primary key.\$\endgroup\$Emaborsa– Emaborsa2021-07-22 14:24:44 +00:00CommentedJul 22, 2021 at 14:24 - \$\begingroup\$@CharlesNRice Didn't know about
DynamicLinq, will take a look. I am not fan ofODatabut I will give it a try.\$\endgroup\$Emaborsa– Emaborsa2021-07-23 06:22:31 +00:00CommentedJul 23, 2021 at 6:22
2 Answers2
- [
PaginationOptions] RenameDestoOrderByDescendingand removeOrderBy, this would be more manageable as you already have a defaultorder byin your query. - I can't see a check for the fields ? if not, try to ensure that every field name exists on your table before building the query.
- Use
Dapperto count and select queries, as they're faster in dapper thanEF. - Since this is a Web API, a
Tuplewould be an overkill for your controller, you might consider returningIEnumerableorIDictionary<int, IList<T>>or any simpler type, just try to simplify the results to the consumer. - you can use
FETCH clauseinstead of LINQ'sTakeandSkip, which would make it faster and also would decrease the allocated memory. referrer to thispage to read more about it.(FYI : PostgreSQL, SQL Server, Oracle, and MySQL supportsFETCH clausewith some caveats). guidsis already aHashSet<Guid>so checking the guid in the loop is unnecessary, as theguids.Add(guid)would ignore the value if it's already in the collection.
- \$\begingroup\$For the
CountandSelectI am using Npgsql, no EF nor Dapper. ThisSearchmethod is not in the controller, it is in an abstract class, called by the controller where i manage the returned items.TakeandSkipare not used on a IO operation since the Guids are already in memory, so it makes no difference, right? Thanks for the rest!\$\endgroup\$Emaborsa– Emaborsa2021-07-23 05:48:16 +00:00CommentedJul 23, 2021 at 5:48 - \$\begingroup\$@Emaborsa if you are using
Npgsqlyou should state that, and edit your post to clearify this, as the first line of your post states that you're usingDapperandEF. for the method, I know it's not the controller. what I meant is the service should always returns simple types, as you don't want to add more data processing inside your controller.\$\endgroup\$iSR5– iSR52021-07-23 08:20:07 +00:00CommentedJul 23, 2021 at 8:20 - \$\begingroup\$not sure how things work on your application, but if you just storing guids to re-query the DB with the full data, unless if it's a requirement, you might loose some performance for that. though using
Fetch clausewould give you what you need with faster results.\$\endgroup\$iSR5– iSR52021-07-23 08:27:14 +00:00CommentedJul 23, 2021 at 8:27 - \$\begingroup\$About the method: But I need to return the pagend data AND the total items count. About of performance: I agree that is non sense to retrieve all the data and then filter it for the paging, but since the performance is not too bad, I though it was a good solution anyway. I will review the query,.\$\endgroup\$Emaborsa– Emaborsa2021-07-23 08:34:22 +00:00CommentedJul 23, 2021 at 8:34
Just a quick shot at
IDataReader reader = command.ExecuteReader();ISet<Guid> guids = new HashSet<Guid>(paginationOptions.ItemsPerPage);while (reader.Read()){ Guid guid = reader.GetGuid(0); if (!guids.Contains(guid)) { guids.Add(guid); }}ISet<Guid> filteredGuids = guids.Skip(offset).Take(paginationOptions.ItemsPerPage).ToHashSet();Because the Guid-column is your primary-key-column, you should just use aList<Guid>. You won't need to check ifreader.GetGuid(0) exists because that just can't happen.
I wouldn't create the formerHashSet<Guid> by using the capacity ofpaginationOptions.ItemsPerPage but would use a fixed number which should be in the lower range but not too low. Because if you use this overloaded constructor, then you should use it in a way that it scales good. Having a low number for capacity, you may reach the point fast that there aren't any slots left meaning theHashSet needs to be resized. If the number for capacity is too high it may take more time at initialization because its createing two array with the length of the passed capacity.
Take a look at thereference-source forHashSet<T>
This applies for aList<T> as well.
To come to an end I would change the code a bove to
IDataReader reader = command.ExecuteReader();List<Guid> guids = new List<Guid>(2000);while (reader.Read()){ guids.Add(reader.GetGuid(0));}ISet<Guid> filteredGuids = guids.Skip(offset).Take(paginationOptions.ItemsPerPage).ToHashSet();but would extract the2000 to a meaningful named constant.
- \$\begingroup\$Thanks for the hint, but If the query has a join with, the result could include two rows with the same id and different joined values which are neede for filtering. That's why I used a ISet. I could use a IList, but in this case I
d need the checkif contains...`.\$\endgroup\$Emaborsa– Emaborsa2021-07-23 05:53:58 +00:00CommentedJul 23, 2021 at 5:53 - \$\begingroup\$I've just tried to change from
ISet<Guid> guids = new HashSet<Guid>(paginationOptions.ItemsPerPage);toList<Guid> guids = new List<Guid>(2000);. For the same query the first takes 500ms, the latter 12sec.\$\endgroup\$Emaborsa– Emaborsa2021-07-23 06:15:32 +00:00CommentedJul 23, 2021 at 6:15
You mustlog in to answer this question.
Explore related questions
See similar questions with these tags.

