Make requests to fetch a project's author's info concurrently #6
Currently, fetching a project (level 1), its author (level 2) and the author's party (level 3) takes 9.90 seconds (query: São Paulo, results: 15) on Firefox because all those requests happen sequentially, one after another. The requests from level 1 to 2 could all run concurrently.
I know beforehand how many requests I will make from level 1 to 2 since I'm only showing at most 15 projects per page.
Therefore I could iterate over those projects and fetch the author of each one concurrently inside the loop. From level 2 to 3 I don't really have an option, since I need to know the author's ID to fetch information on their party, but optimizing from level 1 to 2 would still make the page much faster.
Let's examine a simple scenario, leaving level 3 out:
If I have N projects and M authors in total, the page will only load after N+M requests made one after another.
But with the M author requests running concurrently, the page would load after roughly the time of N+1 requests.
I expect the load time to halve. Nevertheless, further optimizations must be made from level 2 to 3.
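As a rough check of the expected halving, assume every request takes the same time t (an assumption, not a measurement) and that there is one author request per project, so M = N:

\[
T_{\text{seq}} = (N + M)\,t, \qquad
T_{\text{conc}} \approx (N + 1)\,t, \qquad
\frac{T_{\text{conc}}}{T_{\text{seq}}} \approx \frac{N + 1}{N + M}
\]

For a full page, N = M = 15 gives

\[
\frac{T_{\text{conc}}}{T_{\text{seq}}} \approx \frac{16}{30} \approx 0.53,
\]

which is close to half under this simplified model.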
There are two issues I have noticed so far:
CIDAuthors to the model Project although a project on the API does not contain this property. I'm not sure yet if this is the best solution.
I have considered a few things to circumvent them.
One option would be to download the API's tables, transform them into a SQL database with the structure I need, and fetch information directly from my DB. Another would be to download the tables, create the DB with their structure as it is, and fetch from the DB only the data that is unlikely to change soon or ever, like a member's name, their picture, a project's description etc.
The former "solution" would probably cause many problems as the API changes, and also as I learn about my misconceptions of the API and the project's domain.
The first draft (commit 34079647f4dc02cd) reduced the page load time by half for queries like 'São Paulo', but it is still about 1 second slower on average than the official website. It does not use mutexes, nor does it range over the channel's values yet; it does use buffered channels, though. The next draft should contain mutexes and range over the values properly. Ideally I can trim it down by one more second.
I thought of iterating over the channel's values, but it wasn't possible because ranging over a channel requires closing it. If I close the channel, either manually (infinite range) or automatically (range over channel), the program will panic, since some goroutines might still be about to send a value and eventually do so. What I did instead, which decreased the load time by about 44ms, was to run the goroutine that fetches an author's information inside the goroutine that fetches all authors.
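For reference, one common way to range over the channel without hitting the panic on a closed channel is to close it only after every sender has finished. The sketch below is an alternative pattern, not what the draft does: it uses `sync.WaitGroup`, and `fetchAuthorName` and `collectNames` are hypothetical names with the request replaced by a local stub.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// fetchAuthorName is a hypothetical stand-in for the real author request.
func fetchAuthorName(authorID int) string {
	return fmt.Sprintf("author-%d", authorID)
}

// collectNames ranges over the channel. This is safe only because close is
// called after every sending goroutine has finished.
func collectNames(authorIDs []int) []string {
	names := make(chan string, len(authorIDs))
	var wg sync.WaitGroup

	for _, id := range authorIDs {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			names <- fetchAuthorName(id)
		}(id)
	}

	// Closer goroutine: waits for all senders, then closes exactly once.
	go func() {
		wg.Wait()
		close(names)
	}()

	var out []string
	for name := range names { // ends once the channel is closed and drained
		out = append(out, name)
	}
	sort.Strings(out) // arrival order is not deterministic
	return out
}

func main() {
	fmt.Println(collectNames([]int{3, 1, 2}))
}
```

Because the close happens in a separate goroutine after `wg.Wait()`, no goroutine can send on an already closed channel.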
I deliberately decided not to use WaitGroups. Although they would make it clearer that I'm waiting for my goroutines to finish, compared to looping through all projects, they would just add code overhead and not necessarily improve anything. I would definitely have to use them if I did not know in advance how many results to expect from all goroutines, but this is not the case: number of results = number of goroutines = number of projects, as each goroutine only sends one value over.
Mutex is next, but I fear the page load time cannot be decreased any further.
Food for thought:
i) Use an in-memory cache (redis, memcached) for the user's favorites.
ii) Use a caching layer -> on a dedicated service, retrieve projects, authors and related data from the API as CSV files*, transform and store them in my DB as my own entity with the properties I need (aggregating information scattered across requests into one entity), and fetch from there. Depending on how fresh the data needs to be, retrieve from the API accordingly.
*their website claims to provide CSV files with whole tables, updated daily. I could use those files instead of making requests timed by a cronjob, which could overload their system unnecessarily.
@aniram wrote in https://git.marina.sh/aniram/cidadon/issues/6#issuecomment-5:
Using a mutex made the code clearer and more intuitive, but it seems like it also added a few hundredths of a millisecond to the page's load time. I guess that's the price of readability.
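A minimal sketch of what a mutex-based version could look like, assuming the goroutines write into a shared map instead of sending the authors over a channel. `authorStore` and `fillStore` are hypothetical names, the request is replaced by a stub, and the counted `done` channel is only one possible way to wait for the goroutines.

```go
package main

import (
	"fmt"
	"sync"
)

// authorStore is a hypothetical shared container guarded by a mutex.
type authorStore struct {
	mu        sync.Mutex
	byProject map[int]string
}

// set stores one author name; the lock makes concurrent writes safe.
func (s *authorStore) set(projectID int, name string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.byProject[projectID] = name
}

// fillStore starts one goroutine per project. Each writes its result into
// the shared store and then signals completion on a buffered channel.
func fillStore(projectIDs []int) map[int]string {
	store := &authorStore{byProject: make(map[int]string, len(projectIDs))}
	done := make(chan struct{}, len(projectIDs))

	for _, id := range projectIDs {
		go func(id int) {
			// Stub standing in for the real author request.
			store.set(id, fmt.Sprintf("author-of-%d", id))
			done <- struct{}{}
		}(id)
	}

	for range projectIDs { // one signal per goroutine, no close needed
		<-done
	}
	return store.byProject
}

func main() {
	fmt.Println(len(fillStore([]int{1, 2, 3})), "authors stored")
}
```

Every write takes the lock, which is a plausible source of the small extra cost mentioned above, although this sketch does not measure it.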
aniram referenced this issue 2026-01-28 19:31:26 +00:00