Posted on
Resolving API Records Limit Issue with Pagination
For a bit of context, Kronos is a workforce management software use to help employers manage their employees, time and attendance, scheduling and payroll. This post isn't necessarily supposed to be a tutorial, but I wanted to document the experience with this since it was helpful to me, it may be helpful to someone else. Please feel free to comment if you have questions or would like further clarification or context on something.
I was writing a program that would pull records from saved reports using their report IDs from Kronos. It was known that the API call had a limit of 100,000 records so the reports were saved in a way that they had under 100k records.
This was fine until all of the sudden, I'm calling the API to get reports data in order to insert into a SQL database, and the result set is only 50,000 records when previously it was around 60-70k records.
After checking the documentation, I realized I could resolve this issue using pagination. Here's how I tested and resolved the issue...
Set up and Testing in Postman
I created a collection in Postman titled 'Kronos Reports' and set up my requests. The first POST request for logging in to the tenant and retrieving the token, setting it as an environment variable to use in the next POST request for retrieving the saved reports. The APIs used:
{{baseurl}}/ta/rest/v1/login
{{baseurl}}/ta/rest/v1/report/saved/{{reportID}}
I initially had started out using their GET request for the reports, however, in order to apply pagination you need to use a post request and put the following json in the response body:
{ "page":{ "number":0, "size":50000 }}
Page number needs to start at 0, and will return the first 50000 records (or whichever amount you choose). I set it at 50,000 since that seemed to be the cap after throttling.
Once I set my relevant headers, I created a Post-response script to write the record counts to the console:
constxml2js=require('xml2js');letjsonResponse;xml2js.parseString(pm.response.text(),{explicitArray:false},(err,result)=>{ jsonResponse=result;});constrows=jsonResponse.result.body.row;constrowCount=Array.isArray(rows)?rows.length:(rows?1:0);console.log("Number of rows:",rowCount);
Then I could call the API and see that for page 0, records returned were 50000, set the page to 1 and ran again which showed the records returned were 18927. Perfect, we're done.
Implementing in the code
My project is a .NET 8.0 Console App written in C# and using the latest version of RestSharp (version 111.4.1 at the time of this writing).
I adjusted my API call to useMethod.Post
, and wrapped it in a while loop. I set a bool variablehasMorePages
initially to true, and created variables for the page number and page size, initially set to0
and50000
, respectively.
RestSharp has anAddJsonBody()
method in which I used to insert the pagination json code to the request body:
request3.AddJsonBody(new{page=new{number=pageNumber,size=pageSize}});
I also created another variableList<row> allRecords
to add the results to. The code loops through as long ashasMorePages
is true. If the results count is less than the page size,hasMorePages
is assigned to false and exits the loop. Otherwise, thepageNumber
is incremented by1
.
Adding Records from Multiple Reports to be added to Database together
Since we have the 100,000 record limit on pulling saved reports data, this resulted in splitting our report into 3 parts. Since we will run this daily to update the data, we want to combine all of the reports in the code so that we only need to run it and insert into the database once.
For this I just wrapped my API call in the while loop in a foreach loop that loops through a list of the reportIDs, passing the temporaryreportId
variable in the RestRequest.
Here is the final API call code:
varoptions=newRestClientOptions(strBaseURL){ConfigureMessageHandler=handler=>{varhttpClientHandler=(HttpClientHandler)handler;httpClientHandler.CookieContainer=newCookieContainer();httpClientHandler.UseCookies=true;returnhttpClientHandler;},Timeout=Timeout.InfiniteTimeSpan,UserAgent="Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.93 Safari/537.36"};varclient=newRestClient(options);varrequest2=newRestRequest("/ta/rest/v1/login",Method.Post);request2.AddHeader("Api-Key",strAPIKey);request2.AddHeader("Accept","application/json");request2.AddParameter("application/json",JsonConvert.SerializeObject(me),ParameterType.RequestBody);varresponse2=awaitclient.ExecuteAsync(request2);Kronos_Login_ResponseBodyresponsebody=JsonConvert.DeserializeObject<Kronos_Login_ResponseBody>(response2.Content);stringstrToken=responsebody.token;boolhasMorePages=true;intpageNumber=0;intpageSize=50000;varoptions2=newRestClientOptions(strBaseURL){ConfigureMessageHandler=handler=>{varhttpClientHandler=(HttpClientHandler)handler;httpClientHandler.CookieContainer=newCookieContainer();httpClientHandler.UseCookies=true;returnhttpClientHandler;},Timeout=Timeout.InfiniteTimeSpan,};varclient2=newRestClient(options2);List<Row>allRecords=newList<Row>();// just example report numbersList<string>reportIds=newList<string>{"123456789","987654321","456478415"};foreach(stringreportIdinreportIds){while(hasMorePages){varrequest3=newRestRequest($"/ta/rest/v1/report/saved/{reportId}",Method.Post);request3.AddHeader("Authentication","Bearer "+strToken);request3.AddHeader("Accept","application/xml");request3.AddHeader("company",strCompanySysID);request3.AddHeader("Content-Type","application/json");request3.AddJsonBody(new{page=new{number=pageNumber,size=pageSize}});varresponse3=awaitclient2.ExecuteAsync(request3);XmlDocumentdoc=newXmlDocument();doc.LoadXml(response3.Content);stringKronosXMLResponse=JsonConvert.SerializeXmlNode(doc);RootKronosEmployeeHoursReport=JsonConvert.DeserializeObject<Root>(KronosXMLResponse);allRecords.AddRange(KronosEmployeeHoursReport.result.body.row);if(KronosEmployeeHoursReport.result.body.row.Count<pageSize){hasMorePages=false;}else{pageNumber++;}}hasMorePages=true;}
Further, you want to add ordering by fields to the body so that each page doesn't produce duplicates. You can get the field names by calling the GET request with the additional /metadata on the end after the report id to include in the body of the post request. Here I used the employeeID and date:
{"page":{"number":0,"size":50000},"fields":["EmplEmployeeId","CalendarDate"],"orders":[{"field_name":"EmplEmployeeId","is_descending":false},{"field_name":"CalendarDate","is_descending":false}]}
C# code:
request4.AddJsonBody(new{page=new{number=pageNumber,size=pageSize},orders=new[]{new{field_name="EmplEmployeeId",is_descending=false},new{field_name="CalendarDate",is_descending=false}}});
Resources:
https://secure2.saashr.com/ta/docs/rest/public/?r=v1report_saved_(settings_id)
https://restsharp.dev/docs/usage/request/#json-body
https://community.postman.com/t/occational-warning-that-xml2json-is-deptrecated/59745/5
Top comments(0)
For further actions, you may consider blocking this person and/orreporting abuse