Removing Performance Bottlenecks in a Distributed System


One of our clients is a global leader in investment banking. One of their Order Management Systems had performance problems. When salespeople began to experience delays while entering orders, fixing order-entry performance became a critical priority for the development team.

The production support team had noticed performance issues in the RDBMS. Because the system depended heavily on the RDBMS, solving those issues was central to application performance.

The first step was to identify long-running queries on the database and tune them. Tools like AppDynamics helped identify them. However, improving query performance turned out to be non-trivial: the query plans showed none of the usual warning signs, such as table scans. The real problem lay elsewhere.

The database logs and AppDynamics revealed heavy contention in the database. Queries were spending more time waiting for locks than actually executing. The most contended object was a table used to generate sequential IDs for all entities. This table stored the last ID used in the sequence for each entity. The IDs were generated by a stored procedure running in a transaction.

The issue was that ID generation ran inside the same transaction that processed each request, so any lock taken on the ID table was held until that whole transaction finished. The DB schema is normalized, as expected for an OLTP system, so processing any request involves a transaction across multiple tables. Most requests required new IDs. ID generation therefore became a point of contention across the system.
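The effect can be reproduced in miniature. The sketch below is an illustration only, not the client's system: it uses Python's built-in `sqlite3` module, and the table name `id_sequence`, the entity name `order`, and the function `second_writer_blocked` are all hypothetical. SQLite locks the whole database file rather than a single row, so it only approximates the row-level waits described above, but the shape of the problem is the same: a request that has taken an ID and not yet committed blocks every other request that needs one.

```python
import os
import sqlite3
import tempfile


def second_writer_blocked(db_path):
    """Return True if request B cannot get an ID while request A is still open."""
    # timeout=0 makes a blocked writer fail at once instead of waiting.
    a = sqlite3.connect(db_path, timeout=0)
    b = sqlite3.connect(db_path, timeout=0)
    try:
        # Hypothetical ID table: one row per entity, holding the last ID used.
        a.execute("CREATE TABLE id_sequence (entity TEXT PRIMARY KEY, last_id INTEGER)")
        a.execute("INSERT INTO id_sequence VALUES ('order', 0)")
        a.commit()

        # Request A takes an ID inside its request transaction and does NOT
        # commit yet, because the rest of the request is still being processed.
        a.execute("UPDATE id_sequence SET last_id = last_id + 1 WHERE entity = 'order'")

        # Request B now needs an ID as well.
        try:
            b.execute("UPDATE id_sequence SET last_id = last_id + 1 WHERE entity = 'order'")
            return False
        except sqlite3.OperationalError:
            # "database is locked": B has to wait for A's whole transaction.
            return True
    finally:
        a.rollback()
        b.rollback()
        a.close()
        b.close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(second_writer_blocked(os.path.join(tmp, "demo.db")))
```

In a server RDBMS, request B would typically wait rather than fail, which is where the time spent waiting for locks comes from.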

The solution was to take ID generation out of the main request-processing transaction. The application first worked out all the entity IDs it would need to process a request. Those IDs were then generated in a single “outside” transaction and used in the request processing.

In summary, RDBMS performance was hampered by contention rather than by bad query plans. It was fixed by moving the contentious queries to a separate transaction.
