1
\$\begingroup\$

I am working withC#,Npgsql,EF Core andPostgres.
I defined an endpoint for a paginated search, where the filters and orderBy column are dynamic. The endpoint accepts aPaginationOptions instance:

public class PaginationOptions{    public int Page { get; set; }    public int ItemsPerPage { get; set; }    public string OrderBy { get; set; }    public bool Desc { get; set; }    public IList<FilterValues> Filters { get; set; }}public class FilterValues{    public string FieldName { get; set; }    public IEnumerable<string> Values { get; set; }}

The following method performs the search and returns a Tuple with the sorted items and a counter for the total items in the table:

public Tuple<IList<T>, int> Search(PaginationOptions paginationOptions){    if (!string.IsNullOrEmpty(paginationOptions.OrderBy))    {        CheckFilterField(paginationOptions.OrderBy);    }    int offset = (paginationOptions.Page - 1) * paginationOptions.ItemsPerPage;    string orderBy = !string.IsNullOrEmpty(paginationOptions.OrderBy) ? paginationOptions.OrderBy : $"{prefix}.title";    string order = paginationOptions.Desc ? "DESC" : "ASC";    using (NpgsqlConnection connection = GetConnection())    {        string query = $"{GetQueryfields()} {GetFromClause()} {BuildWhere(paginationOptions.Filters)}";        string itemsQuery = $"SELECT {query} ORDER BY {orderBy} {order}";        NpgsqlCommand command = BuildCommand(connection, itemsQuery, paginationOptions.Filters);        IDataReader reader = command.ExecuteReader();        ISet<Guid> guids = new HashSet<Guid>(paginationOptions.ItemsPerPage);        while (reader.Read())        {            Guid guid = reader.GetGuid(0);            if (!guids.Contains(guid))            {                guids.Add(guid);            }        }        ISet<Guid> filteredGuids = guids.Skip(offset).Take(paginationOptions.ItemsPerPage).ToHashSet();        IList<T> items = GetItems(filteredGuids);        return Tuple.Create(items, guids.Count);    }}

In words: In each entity there are the query fields and the FROM clause defiend. They are splitted because I need the FROM clause in another method as well.
The WHERE (prepared statement) and ORDER BY are built dynamically using the parameters. TheBuildCommand creates theNpgsqlCommand and sets the parameters. Then I use Dapper for a raw query in order to get the ids of the requested items, then I load them using theEF and at the end ISkip andTake in order to have the right pagination.
The problem ist thatEF does not allow to add an ORDER BY clause for raw queries, it is only available throug theLinq expression:

context.AnEntity.FromSqlRaw("Select * from users ORDER BY id").OrderBy(x => x.Title);

ORDER BY id is ignored, items are sorted by the expression. If no orderby linq expression is used, the framework addsORDER BY entity.id. Otherwise I could have done followings:

string itemsQuery = $"SELECT {query} ORDER BY {orderBy} {order}";context.AnEntity.FromSqlRaw(itemsQuery).Skip(offset).Take(limit)...

It works. Even on a table with 1mil a query takes 2,8sec
Comments? Improvment hints?

Edit:
I ended up with a query which loads the paged data in 2,2sec over a table with 1mil rows. Is it an acceptable result?

pacmaninbw's user avatar
pacmaninbw
26.2k13 gold badges47 silver badges114 bronze badges
askedJul 22, 2021 at 14:01
Emaborsa's user avatar
\$\endgroup\$
4
  • \$\begingroup\$Is the Guid-column the primary key?\$\endgroup\$CommentedJul 22, 2021 at 14:20
  • \$\begingroup\$Did you look at DynamicLinq or even making your own expression tree for the orderbys? example -codereview.stackexchange.com/a/256123/52662. Also you state its an endpoint if it's a webapi project what about just adding OData support for the pagenation? OData has some flexibility to intercept the iqueryable.\$\endgroup\$CommentedJul 22, 2021 at 14:24
  • \$\begingroup\$yes, theid (uuid) column is the primary key.\$\endgroup\$CommentedJul 22, 2021 at 14:24
  • \$\begingroup\$@CharlesNRice Didn't know aboutDynamicLinq, will take a look. I am not fan ofOData but I will give it a try.\$\endgroup\$CommentedJul 23, 2021 at 6:22

2 Answers2

1
\$\begingroup\$
  • [PaginationOptions] RenameDes toOrderByDescending and removeOrderBy, this would be more manageable as you already have a defaultorder by in your query.
  • I can't see a check for the fields ? if not, try to ensure that every field name exists on your table before building the query.
  • UseDapper to count and select queries, as they're faster in dapper thanEF.
  • Since this is a Web API, aTuple would be an overkill for your controller, you might consider returningIEnumerable orIDictionary<int, IList<T>> or any simpler type, just try to simplify the results to the consumer.
  • you can useFETCH clause instead of LINQ'sTake andSkip, which would make it faster and also would decrease the allocated memory. referrer to thispage to read more about it.(FYI : PostgreSQL, SQL Server, Oracle, and MySQL supportsFETCH clause with some caveats).
  • guids is already aHashSet<Guid> so checking the guid in the loop is unnecessary, as theguids.Add(guid) would ignore the value if it's already in the collection.
answeredJul 23, 2021 at 3:13
iSR5's user avatar
\$\endgroup\$
4
  • \$\begingroup\$For theCount andSelect I am using Npgsql, no EF nor Dapper. ThisSearch method is not in the controller, it is in an abstract class, called by the controller where i manage the returned items.Take andSkip are not used on a IO operation since the Guids are already in memory, so it makes no difference, right? Thanks for the rest!\$\endgroup\$CommentedJul 23, 2021 at 5:48
  • \$\begingroup\$@Emaborsa if you are usingNpgsql you should state that, and edit your post to clearify this, as the first line of your post states that you're usingDapper andEF. for the method, I know it's not the controller. what I meant is the service should always returns simple types, as you don't want to add more data processing inside your controller.\$\endgroup\$CommentedJul 23, 2021 at 8:20
  • \$\begingroup\$not sure how things work on your application, but if you just storing guids to re-query the DB with the full data, unless if it's a requirement, you might loose some performance for that. though usingFetch clause would give you what you need with faster results.\$\endgroup\$CommentedJul 23, 2021 at 8:27
  • \$\begingroup\$About the method: But I need to return the pagend data AND the total items count. About of performance: I agree that is non sense to retrieve all the data and then filter it for the paging, but since the performance is not too bad, I though it was a good solution anyway. I will review the query,.\$\endgroup\$CommentedJul 23, 2021 at 8:34
-1
\$\begingroup\$

Just a quick shot at

IDataReader reader = command.ExecuteReader();ISet<Guid> guids = new HashSet<Guid>(paginationOptions.ItemsPerPage);while (reader.Read()){    Guid guid = reader.GetGuid(0);    if (!guids.Contains(guid))    {        guids.Add(guid);    }}ISet<Guid> filteredGuids = guids.Skip(offset).Take(paginationOptions.ItemsPerPage).ToHashSet();

Because the Guid-column is your primary-key-column, you should just use aList<Guid>. You won't need to check ifreader.GetGuid(0) exists because that just can't happen.

I wouldn't create the formerHashSet<Guid> by using the capacity ofpaginationOptions.ItemsPerPage but would use a fixed number which should be in the lower range but not too low. Because if you use this overloaded constructor, then you should use it in a way that it scales good. Having a low number for capacity, you may reach the point fast that there aren't any slots left meaning theHashSet needs to be resized. If the number for capacity is too high it may take more time at initialization because its createing two array with the length of the passed capacity.

Take a look at thereference-source forHashSet<T>

This applies for aList<T> as well.

To come to an end I would change the code a bove to

IDataReader reader = command.ExecuteReader();List<Guid> guids = new List<Guid>(2000);while (reader.Read()){    guids.Add(reader.GetGuid(0));}ISet<Guid> filteredGuids = guids.Skip(offset).Take(paginationOptions.ItemsPerPage).ToHashSet();

but would extract the2000 to a meaningful named constant.

answeredJul 22, 2021 at 14:44
Heslacher's user avatar
\$\endgroup\$
2
  • \$\begingroup\$Thanks for the hint, but If the query has a join with, the result could include two rows with the same id and different joined values which are neede for filtering. That's why I used a ISet. I could use a IList, but in this case Id need the checkif contains...`.\$\endgroup\$CommentedJul 23, 2021 at 5:53
  • \$\begingroup\$I've just tried to change fromISet<Guid> guids = new HashSet<Guid>(paginationOptions.ItemsPerPage); toList<Guid> guids = new List<Guid>(2000);. For the same query the first takes 500ms, the latter 12sec.\$\endgroup\$CommentedJul 23, 2021 at 6:15

You mustlog in to answer this question.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.