The goal of performance testing is to expose performance bottlenecks and other performance issues. However, you want to make sure that any bottlenecks you find are genuine and not a product of an unrealistic (not production-like) environment, load or conditions.
PERFORMANCE TESTING IN NON-PRODUCTION SPEC ENVIRONMENTS
One of the first issues to look at is that performance and load test environments, for cost reasons, tend to be much smaller versions of the production environment. Such differences can include:
• They have less memory.
• They have fewer and less powerful physical CPUs.
• They have fewer and less efficient disk arrays.
• They have fewer server instances, or only a single instance, e.g. 2 database servers in performance test but 3 in production, or even no clustering at all.
• The system software & hardware versions and specification can vary.
• The data can be older and the amount of data can be significantly lower than in production.
• There also may be different authentication, firewall or load balancing configurations in place.
• The operating system may be 32-bit instead of 64-bit, or vice versa.
The further the performance test environment is from production, the more you will need to caveat your tests and results accordingly. For example, on a performance test environment which is about 50% of the specification of production, you may want to factor down the traffic you generate in the performance test by the same amount.
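As a rough illustration of factoring down traffic, the following Python sketch scales a production peak load by an estimated environment ratio. The function name scaled_load, the environment_ratio parameter and the example figures are assumptions made for illustration. The linear scaling is itself a simplification, since systems rarely scale in a straight line with hardware.

```python
def scaled_load(production_peak_users: int, environment_ratio: float) -> int:
    """Scale a production peak load down to a smaller test environment.

    environment_ratio is a hypothetical, manually estimated figure:
    1.0 means production specification, 0.5 means roughly half of it.
    """
    if not 0 < environment_ratio <= 1:
        raise ValueError("environment_ratio must be in the range (0, 1]")
    # Round to the nearest whole user, but never drop below one virtual user.
    return max(1, round(production_peak_users * environment_ratio))


# Example: production peaks at 2000 concurrent users and the test
# environment is estimated at about 50% of the production specification.
print(scaled_load(2000, 0.5))  # prints 1000
```

Treat the result as a starting point for the test design rather than a precise equivalent of production load.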
DIFFICULTY IN REPLICATING PRODUCTION TRAFFIC
As well as the performance test environment not being production-like, it is difficult to replicate traffic which accurately mirrors the activity that will be experienced in production.
To address this, we perform volumetric analysis prior to designing any performance test scenarios. This process attempts to capture the amount of traffic to simulate in the performance test, but even with such a study, replicating the exact production traffic is tricky.
• Firstly, in a performance test engagement a number of use cases will be generated to replicate the user journeys happening in production. However, time or budget constraints may mean that not all of them can be scripted, and thus only a subset of use cases / functionalities is automated.
• Data can be mass produced in a uniform manner, affecting the way data is stored and accessed on the database. Some database tables can contain too little data, others too much compared to production.
• The behaviour of users can be unrealistic. For instance, if a requisition is raised on a system, it is not generally fulfilled in reality until a few days later. In a performance test, which lasts only a matter of hours, it may have to be just a few minutes later.
• The workload being processed varies in the normal course of events, whereas during a performance test it can remain uniform.
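When only a subset of use cases is scripted, it can help to compare the planned rate of each scripted function against the peak rate taken from the volumetric analysis. The Python sketch below shows one way to do that. The function names, the transactions-per-hour unit and all figures are hypothetical and used purely for illustration.

```python
def overdriven_functions(test_rates, production_peak_rates):
    """Return the functions whose planned test rate exceeds production peak.

    Both arguments map a function name to transactions per hour.
    The result maps each flagged function to its test/production ratio.
    """
    flagged = {}
    for name, test_rate in test_rates.items():
        peak = production_peak_rates[name]
        if test_rate > peak:
            flagged[name] = test_rate / peak
    return flagged


# Hypothetical figures: peak rates from a volumetric analysis, and the
# rates planned for the scripted subset in the performance test.
production_peak = {"search": 6000, "raise_requisition": 400, "approve": 300}
test_plan = {"search": 6000, "raise_requisition": 900, "approve": 300}

print(overdriven_functions(test_plan, production_peak))
# prints {'raise_requisition': 2.25}
```

A function flagged here is being driven harder than it would be in production, so any problem it shows deserves extra scrutiny before it is reported.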
All of this can create many problems for the performance tester, and can lead the tester to ask how accurate the performance test is. Understanding the differences between any performance test environment and the production environment is essential. This can help detect and understand an artificial bottleneck, which is something quite different from a performance bottleneck.
An artificial bottleneck is essentially a performance problem that is a direct result of a difference between the production environment or workload and the performance test environment or workload. It is not a performance bottleneck. A performance bottleneck is something that could happen, or is happening, in production.
When a bottleneck is found, the performance tester must investigate the symptoms to pinpoint the cause of the issue. Care must be taken to distinguish between a genuine performance bottleneck and an artificial bottleneck brought about purely because of differences between performance test and production.
EXAMPLES OF ARTIFICIAL BOTTLENECKS
Unrealistic user activity and traffic, plus inferior environments for performance tests, may create unrealistic bottlenecks. It is the responsibility of the performance tester to assess any bottlenecks found and to determine whether they could have been caused by conditions which are not production-like.
• DATABASE LOCKING - Performance testing results in a subset of functionality being automated. The entire workload is then spread across this subset of automation, resulting in some functions being driven at a higher frequency during performance test than would be seen in production. This can result in database locking which would not occur in production. The solution? The problem function should be driven no higher than the maximum peak workload expected in production.
• DATA - If a large amount of data has been recently added to the database via a data creation exercise, database tables can become disorganised, resulting in inefficient use of indexes. Additionally, if performance testing started with the database tables nearly empty of data, the optimiser can incorrectly decide that index use is not required. The solution? Run a database reorganisation and refresh optimiser statistics at regular intervals if large amounts of data are being added to the database.
• RESPONSE TIMES & EXCESSIVE DB I/O - The assumption here is that the database buffer pool is smaller in performance test than it is in production due to a shortage of memory in the performance test environment. A smaller buffer pool holds less data in memory, so data-heavy functions cause more physical I/O. Unfortunately, this is a case of the poor performance of some users impacting the performance of all users. The solution? Once the problem function (or functions) is identified, attempt to decrease the workload for those functions to minimise physical I/O. If this is not possible, it may be time for a memory upgrade of the server. Memory upgrades are usually relatively straightforward in terms of cost and time.
• MEMORY LEAK - This term describes how memory is gradually consumed over time, implying that at some point there will be no free physical memory available on the server. Performance test environments often differ from production environments in terms of house-keeping. In production, the application could for example be restarted each night (thus refreshing the physical memory and removing any objects in memory which cannot be recycled), whereas in performance test it may have been nine weeks since it was last restarted. The amount of memory in production is also often substantially more than in performance test. The solution? Base memory leak calculations on the production memory availability with reference to the production house-keeping regime. Remember that once physical memory is fully consumed, virtual memory is still available, so it’s not the end of the world as we know it.
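The memory leak calculation can be sketched in Python as follows. This is a minimal illustration, assuming the leak rate observed in the performance test is constant and would be the same in production, which may not hold if the workload differs. The function names and all figures are hypothetical.

```python
def hours_until_exhausted(free_memory_mb: float, leak_rate_mb_per_hour: float) -> float:
    """Estimate the hours until free physical memory is fully consumed."""
    if leak_rate_mb_per_hour <= 0:
        return float("inf")  # no measurable leak
    return free_memory_mb / leak_rate_mb_per_hour


def leak_matters_in_production(prod_free_memory_mb: float,
                               leak_rate_mb_per_hour: float,
                               hours_between_restarts: float) -> bool:
    """True if memory would run out before the next scheduled restart."""
    hours = hours_until_exhausted(prod_free_memory_mb, leak_rate_mb_per_hour)
    return hours < hours_between_restarts


# Hypothetical figures: 50 MB/hour leak observed in performance test,
# 16384 MB free in production, application restarted every 24 hours.
print(hours_until_exhausted(16384, 50))           # prints 327.68
print(leak_matters_in_production(16384, 50, 24))  # prints False
```

With these example figures the leak would take nearly two weeks to exhaust production memory, so a nightly restart would hide it. The same leak on a server that is restarted only every few months would be a genuine problem.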
Even when the performance tester deems that a genuine problem has been found, the initial reaction from the developers, the DBAs, the architects, the project management, pretty much everyone really, is that the problem observed is not a real problem but an artifact of the test tool. There seems to be a disconnect from reality, a lack of understanding that the behaviour of the application changes depending on the workload. Nonetheless, the performance tester should be able to articulate why the problem found is genuine, not artificial.
It is down to the performance tester to come up with an approach that will satisfy everyone concerned that the problem being observed is or is not an artificial bottleneck generated by the performance test tool or unrealistic scenarios. This is a scientific approach: develop a theory, then design a test that should prove or disprove that theory. It does not really matter if the theory is right or wrong, as every time you devise a new test and execute it, you learn something new about the behaviour of the application. The performance tester has to work hard to gain the trust of the project and, through thorough performance analysis, demonstrate that a performance problem is just that: a performance bottleneck that would cause an impact in production.
TEST TOOL GENERATED ARTIFICIAL BOTTLENECKS
Sometimes the test tool and design of the performance test scenarios can be responsible for generating artificial bottlenecks. Issues could include:
• ACTIVITY RAMP-UP - Ramping up the workload too quickly in the performance test tool. If virtual users start too quickly, the application struggles to open connections and serve requests. For most applications, this is abnormal behaviour that you would not normally observe in a production environment.
• INCONSISTENT WORKLOAD - Configuring the workload in such a way that it is spread unevenly across the period of an hour. For instance, users with just 2 iterations to execute may wait an excessive amount of time between iterations, causing a quiet patch in the middle of a performance test.
• APPLICATION RESTART BEFORE TEST - Restarting the application just before performance testing starts could cause issues. While this can help to maintain consistency of results, throwing a large number of users at an application where the data cache is not populated and software and hardware connections are not established can cause an unrealistic spike in performance as well as much longer than expected response times.
In reality, production application starts normally happen early in the morning when few or no users are around. The first few users of the day will experience longer response times, but these will be limited. By the time a much larger workload starts to ramp up (such as in a performance test), the data cache is populated and threads and connections are already open and available, so users do not experience these long response times.
• THINK TIMES INCLUDED IN TRANSACTION TIMES - Putting a think time inside the start and end transaction markers denoting response time so that the recorded response time for a transaction is much longer than it actually is.
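The think time issue can be illustrated with a small Python sketch that keeps the pause outside the measured section. It is not tied to any particular test tool. The function names timed_transaction and run_iteration are hypothetical, and time.sleep stands in for both the request and the user's think time.

```python
import time


def timed_transaction(action):
    """Run action() and return its elapsed time in seconds.

    The start and end of this function play the role of the
    start and end transaction markers.
    """
    start = time.perf_counter()
    action()
    return time.perf_counter() - start


def run_iteration(action, think_time_seconds):
    """Measure one transaction, then pause outside the measured section."""
    elapsed = timed_transaction(action)  # end marker is reached here
    time.sleep(think_time_seconds)       # think time is not measured
    return elapsed


# Stand-in for a real request: an action that takes about 10 ms,
# followed by a think time of half a second.
response_time = run_iteration(lambda: time.sleep(0.01), 0.5)
print(f"recorded response time: {response_time:.3f}s")
```

Had the sleep been placed inside timed_transaction, the recorded response time would include the half second of think time and overstate how long the application took to respond.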
In summary, it is the performance tester's job to validate that any bottlenecks and performance issues found during testing are genuine, as it would be a waste of both testers' and developers' time to fix issues which will simply not happen in production. As a performance tester, it is important to back up your findings and state your case to ensure legitimate issues are progressed and fixed before they become issues in production.