How We Attracted 40 Million Users to the Online Automotive Marketplace
And why constant optimization is not always worth it.
Over my years in IT, and at NIX United in particular, I have noticed that the larger a project is and the faster its development moves, the more often the team has to change the logic and rework functionality. In large projects, constant refactoring is inevitable. Sometimes it brings problems, but you shouldn't be afraid of them: these are the moments when you have a good chance to gain new skills and deepen your expertise. And having coped with the difficulties, you'll earn even greater trust from the client.
In this article, I will talk about the mistakes our team kept making until we figured out how to fix them. I have highlighted the cases that, in my opinion, demanded the greatest efficiency from the developers.
That Comes With a Story
The product we were dealing with was a US car dealer website. When we joined the project, an MVP was still being planned; 80% of the current functionality wasn't even on the roadmap at that point. But the project kept growing and changing, and many new integrations were added. Even when the customer thought the product had moved into the support stage, we were still shipping new features. It got to the point where we made releases almost every week.
One of those features was inventory: a back-end process that determined how up to date the information shown to users was, even though users never knew it existed. Inventory was responsible for the daily processing of data from dealers, that is, information about cars and their photos. Every day the website was updated with new data on an average of 5 million cars. On top of that, the process had to be implemented quickly.
At first, the customer did not have a clear vision of the end result, and later everything was done in a hurry. The resulting inaccuracies meant that data processing took 14 hours. There was no adequate logging: we couldn't tell which steps had run, or where and why something had crashed. The failure rate was also high, for example when a malformed file arrived. And the real problem was not the number of failures but the fact that each time, data processing had to be restarted manually and babysat through every step.
Taken together, this created a high probability of data loss. If the process crashed at night and nothing was logged, then on a manual restart the next morning the system ignored the new file. The new data was lost and yesterday's information was never updated, so users saw stale data. The customer suffered, and so did we.
To solve the problem, we parallelized the processing and minimized manual control. Now the whole run takes a couple of hours. Since it starts about two hours before the customer's business day, the site has time to update fully, and if there is a hitch, the run can be restarted with one click.
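To give an idea of the shape of the solution (a minimal sketch, not the project's actual code), here is roughly how a chunked, parallelized import with logging might look, assuming a recent Laravel with job batching; `ProcessDealerFeedChunk` is a hypothetical queued job:

```php
<?php
// Sketch only: split the dealer feed into chunks and process them in
// parallel on a queue, logging every outcome. Assumes Laravel 8+ job
// batching; ProcessDealerFeedChunk is a hypothetical job that upserts
// one chunk of car records.

use Illuminate\Bus\Batch;
use Illuminate\Support\Facades\Bus;
use Illuminate\Support\Facades\Log;

$chunks = collect($feedRecords)->chunk(10_000); // $feedRecords: the parsed dealer feed

$jobs = $chunks->map(fn ($chunk) => new ProcessDealerFeedChunk($chunk->all()))->all();

Bus::batch($jobs)
    ->allowFailures() // one malformed file must not kill the whole nightly run
    ->then(fn (Batch $batch) => Log::info('Inventory import finished', [
        'batch'  => $batch->id,
        'failed' => $batch->failedJobs,
    ]))
    ->catch(fn (Batch $batch, Throwable $e) => Log::error('Inventory chunk failed', [
        'batch' => $batch->id,
        'error' => $e->getMessage(),
    ]))
    ->dispatch();
```

With failures isolated to individual chunks and every step logged, a restart no longer means starting from scratch or watching the process by hand.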
Indexing an SPA
Search crawlers couldn't see our website. Pages were either not indexed or indexed incorrectly, which affected metrics, search rankings, and the customer's mood in general.
Our SPA (single-page application) was built in React. For the past five years, SEO has been a hot topic for such applications. Searching the internet for a solution, we found two completely polar opinions: some believe there are no problems with indexing SPAs, while others say the problem is real and SPAs are even inferior to a monolith in this respect.
When the project started in 2017, popular search engines, with the exception of Google, had partial or no support for indexing JS applications. We hoped that Prerender would solve all our problems, but we were not out of the woods yet.
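The idea behind prerendering is simple: crawlers get a ready-made HTML snapshot while regular users get the SPA. Here is a rough sketch of that idea as Laravel-style middleware, purely for illustration; the bot list and the internal service URL are assumptions, and in practice you would use a ready-made package or an Nginx rule:

```php
<?php
// Rough sketch of the prerendering idea, not a production setup:
// crawlers receive a ready-made HTML snapshot, everyone else gets the SPA.
// The bot list and prerender service URL are assumptions.

namespace App\Http\Middleware;

use Closure;
use Illuminate\Http\Request;
use Illuminate\Support\Facades\Http;

class PrerenderForBots
{
    private array $botAgents = ['googlebot', 'bingbot', 'yandexbot', 'duckduckbot'];

    public function handle(Request $request, Closure $next)
    {
        $agent = strtolower($request->userAgent() ?? '');

        foreach ($this->botAgents as $bot) {
            if (str_contains($agent, $bot)) {
                // Fetch the pre-rendered snapshot for the requested URL.
                $snapshot = Http::get(
                    'http://prerender.internal/render?url=' . urlencode($request->fullUrl())
                );

                return response($snapshot->body(), 200)
                    ->header('Content-Type', 'text/html');
            }
        }

        return $next($request); // regular users still get the React SPA
    }
}
```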
On top of the indexing problems, another difficulty arose: the large amount of our own JS and the third-party JS scripts we integrated with. The number of scripts only grew, since this was the customer's initiative, and we realized that at some point it would hurt performance. So we made a radical change and rewrote the site's main pages, the search results page and the car listings page, as monolith pages. They were the most requested and the most heavily loaded with JS. Positive changes were not long in coming. Along the way, we learned to investigate problems deeply and to get the customer to do it with us.
Huge Database and Its Processing
We quickly reached 15 million users (spoiler: we hit 40 million along the way). Given our previous experience, we knew we had to set up logging that wouldn't get in the way of data processing, so that we could implement the full feature set.
During our research, we looked at large companies with tasks similar to ours, such as Stack Overflow, Netflix, SoundCloud, and GitHub. All of them used the same two tools, Elasticsearch and Redis, so we chose them for our project as well.
What did we need to build? A high-load system able to operate on a huge database. The website had various dealership programs for its users, and these took part in searching and filtering data. We also had to let users save the cars and search queries they liked. This was a real pain point: after spending time on the site, applying a pile of filters and sort options, a user could lose it all in an instant.
There were over 20 types of interconnected filters; choosing one directly affected what the others could offer. The sorting involved complicated conditions across a variety of indicators. One of the sort orders, for example, had multi-stage conditions and affected the priority of a particular car in the list.
With Elasticsearch and Redis, we immediately delineated areas of responsibility. Elasticsearch served as the store of core information and handled data processing and filtered search results. Redis was initially assigned the role of a small cache store but eventually grew in both variability and functionality. This solved our data problems; the new challenges were functional, mostly on-the-fly changes from the client, and we dealt with them fairly quickly and efficiently.
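To illustrate the division of labor (a simplified sketch, not the project's code), here is roughly how a filtered, multi-stage-sorted search might look with elasticsearch-php 7.x fronted by a short-lived Redis cache; the index, field names, and TTL are invented:

```php
<?php
// Simplified sketch of the Elasticsearch/Redis split. Assumes
// elasticsearch-php 7.x (where search() returns an array) and Laravel's
// Redis cache store; the index, fields, and TTL are invented.

use Elasticsearch\ClientBuilder;
use Illuminate\Support\Facades\Cache;

function searchCars(array $filters, int $page = 1): array
{
    $cacheKey = 'search:' . md5(json_encode($filters) . ':' . $page);

    // Redis: a short-lived cache in front of the heavier Elasticsearch query.
    return Cache::store('redis')->remember($cacheKey, 300, function () use ($filters, $page) {
        $client = ClientBuilder::create()->build();

        // Each selected filter narrows the query (e.g. make, model, year).
        $must = [];
        foreach ($filters as $field => $value) {
            $must[] = ['term' => [$field => $value]];
        }

        $response = $client->search([
            'index' => 'cars',
            'body'  => [
                'query' => ['bool' => ['filter' => $must]],
                // Multi-stage sorting: promoted dealer programs first, then price.
                'sort'  => [
                    ['dealer_priority' => 'desc'],
                    ['price' => 'asc'],
                ],
                'from'  => ($page - 1) * 20,
                'size'  => 20,
            ],
        ]);

        return $response['hits']['hits'];
    });
}
```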
When a Competitor Has More and Is Faster
Every customer wants the largest, most productive, most recognizable, and, most importantly, most profitable product, and our client was no exception: he was constantly looking for ways to improve the resource. Car dealer websites are very popular in the USA, so there are many similar sites and, accordingly, many competitors. We needed a way to stand out.
In the course of the work, we compared sites on two indicators:
- Visual load: the point when the user can see the page and interact with it.
- Full load: the point when all the scripts and processes behind the visible content have finished loading.
Because of our many integrations and some shortcomings on our part, we were far behind our competitors on full load, and the low performance of the search page was especially disappointing. Urgent optimization was required so that we could overtake, or at least catch up with, our competitors and become more user-friendly.
It is important to note that at some point the client began to actively propose innovations.
We changed and added a lot in a hurry. The final business logic ended up differing greatly from the original one, and deadlines kept slipping.
For lack of time, we did not think the new business logic through in sufficient detail, so there were several shortcomings on our part. This, among other things, was what pushed us to start optimizing the product.
Among all the optimizations, I want to highlight caching.
The process was divided into two parts:
- Our admins were responsible for caching in Nginx;
- Developers handled the PHP optimization side.
Having analyzed the entities and structures, we identified the pieces we could cache separately. We also worked out which entities change most and least often, what information they carry, and how important they are to users. As a result, we not only cached them separately but also gave them different cache lifetimes, plus processes that could rebuild the cache off-schedule under certain conditions. For example, new filters were far less likely to appear than a dealer was to send a new car matching the current filters (make, model), so in that case the cache's lifetime was shorter. In this way, we minimized the risk of serving stale data.
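As an illustration of the idea (a sketch under assumed names and lifetimes, not the production code), separate cache entries with different TTLs plus an off-schedule invalidation hook might look like this in Laravel:

```php
<?php
// Sketch of per-entity cache lifetimes. The keys, TTLs, and loader
// functions (loadFilters, and searchCars from the sketch above) are
// illustrative assumptions.

use Illuminate\Support\Facades\Cache;

// Filters change rarely, so a long lifetime is safe.
function availableFilters(): array
{
    return Cache::remember('filters:all', now()->addDay(), fn () => loadFilters());
}

// Car lists change constantly, so their lifetime is kept short.
function carsFor(string $make, string $model): array
{
    return Cache::remember(
        "cars:{$make}:{$model}",
        now()->addMinutes(5),
        fn () => searchCars(['make' => $make, 'model' => $model])
    );
}

// Off-schedule invalidation: a freshly imported car that matches a cached
// make/model combination drops that entry ahead of its normal expiry.
function onCarImported(string $make, string $model): void
{
    Cache::forget("cars:{$make}:{$model}");
}
```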
After improving caching, along with the other optimizations, we saw a positive result across the entire site.
The statistics spoke for themselves: on visual load we overtook our competitors, and on full load we came close to the leading competitor, which can also be counted as a success.
Pursuit of Better Performance
I like a line from Donald Knuth, the American mathematician and computer scientist, author of "The Art of Computer Programming":
“If you optimize everything, you will always be unhappy.”
But the client liked the result of the optimization so much that he now aimed for the fastest performance possible. I found several tools to measure it, and analyzing the site with them took the lion's share of our effort.
We tried several, for example Dareboost and Google Audits (the Lighthouse-powered Audits tab in Chrome DevTools).
Dareboost was a full-fledged site with a lot of metrics that could show frame-by-frame page loading down to the millisecond; Google Audits was a regular tab in the Chrome inspector that did the same, but in less detail.
In Google Audits, you can select the type of application to analyze (mobile or desktop) and the indicators to check the site against.
The resource divides its checks into five categories:
- Performance;
- Progressive Web App;
- Best Practices;
- Accessibility;
- SEO.
Our product was not a PWA, so we tested it against the other four categories.
All of Dareboost's functionality is available with a paid subscription. The resource can even determine which technologies were used to build the page. From the results of the analysis, you can build graphs and identify what to improve next: things worth fixing even though they aren't critical, and problems requiring urgent solutions.
On our site, Dareboost and Google Audits agreed with each other. Most often they flagged missing meta tags and HTTP headers, or ones that, in the tools' opinion, were not configured securely enough.
Gradually, our pursuit of better performance came down to improving caching, adding HTTP headers and meta tags to pages, and making the mobile version more user-friendly. All the changes were made in the shortest possible time. The rest of the time we spent trying to convince the customer that any further changes would be a waste of resources rather than deliver the desired effect.
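For context, the headers in question were of the standard hardening variety. A minimal Laravel-style middleware along these lines might look as follows; the exact header set and values are illustrative, not the project's actual configuration:

```php
<?php
// Minimal sketch of the kind of headers such audit tools flag when
// missing. The exact set and values are illustrative assumptions.

namespace App\Http\Middleware;

use Closure;
use Illuminate\Http\Request;

class SecurityHeaders
{
    public function handle(Request $request, Closure $next)
    {
        $response = $next($request);

        // Typical hardening headers reported by performance/security audits.
        $response->headers->set('X-Content-Type-Options', 'nosniff');
        $response->headers->set('X-Frame-Options', 'SAMEORIGIN');
        $response->headers->set('Referrer-Policy', 'strict-origin-when-cross-origin');
        $response->headers->set('Strict-Transport-Security', 'max-age=31536000; includeSubDomains');

        return $response;
    }
}
```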
It got to the point of absurdity: Google Audits complained that we had no brand color in the theme-color meta tag, which tints the browser bar around the URL on mobile. We added it and nothing visibly changed, though we spent time on the task.
Keeping Dependencies Up to Date
A routine update of Elasticsearch and Laravel left some of the libraries we were using with no support at all, and we had to add some compatibility layers by hand. Easy, but time-consuming: in our case, that one regular update cost almost 50 hours of one specialist's work.
We thought: shouldn't we write scripts to monitor the libraries? The release notes for major Laravel versions list updates and security fixes, which is very important; I don't think a single customer would refuse to spend time resolving security issues. As soon as we took control of this, the process became easier, and the following updates were regular and took minimal time.
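As a sketch of what such a monitoring script could look like (the reporting side is up to the project; `composer outdated` with JSON output is a standard Composer feature):

```php
<?php
// Sketch of a dependency watchdog: run `composer outdated` and report
// direct dependencies that have fallen behind. Delivery of the report
// (Slack, email, ticket) is up to the project; here we just print it.

$json = shell_exec('composer outdated --direct --format=json');
$outdated = json_decode($json ?: '{}', true)['installed'] ?? [];

foreach ($outdated as $package) {
    printf(
        "%s: installed %s, latest %s (%s)\n",
        $package['name'],
        $package['version'],
        $package['latest'],
        $package['latest-status'] // e.g. "semver-safe-update" or "update-possible"
    );
}
```

Run on a schedule, a script like this turns dependency updates from a 50-hour surprise into routine maintenance.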
What I would like to say in the end: failures are an integral part of life on a large project. There is no need to get upset about them, point fingers, or panic before each new difficulty. Just roll up your sleeves, get the system back up, and reassure the client. By figuring out what you didn't know before, or explaining something new to a colleague, you will be able to handle bigger challenges faster in the future.