It is quite common for customers to ask for a performance study of their systems. Performance is a big pain in the ass because, as I always say, it depends on everyone's point of view. Let's take the FBL3N transaction as an example. If we execute it without any filter it will probably take ages to show the data, even if it finishes correctly (no memory dump). Does this mean we have a performance issue in the system? Absolutely not, it means the user doesn't know how to use the transaction/report properly…
Why am I telling you this? Because a lot of the time there won't actually be a performance issue. Even so, we need to know when there is one, so how do we find out? I want to show you a small performance study I did for an SAP ERP system a year ago.
The first step is to know how we can measure system performance and which tools we have available. In this case my client wanted to check an SAP ERP. The details of the SAP system are:
The first thing we should do to check whether there is a performance issue is to execute transaction ST03N. This transaction shows the response times for each task type, including the breakdown of the response time across the elements of the system. So I took the response time, CPU time, etc. of the last 3 months and built some charts and tables:
Considering the data in both the table and the chart, is there a performance issue in the system? The recommended value for the Average Response Time per Dialog Step is between 1.000ms and 1.500ms. In this example we have an Average Response Time of 1.089ms, which is above the lower threshold of the recommended range. It is not great, but it is not the end of the world… In any case, we can improve the response time, so the next step is to check which elements can be improved. We go back to the table and see which values have “Yes” in the Anomaly field:
You can see it more easily in the following table: the Database and Roll Wait times have the highest percentages:
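Conceptually, that table is just each component's time divided by the total response time. A toy shell sketch of the same calculation (the numbers below are invented, not the real ST03N figures):

```shell
# Toy version of the component breakdown: share of each element in the
# total response time. Numbers are invented, not the real ST03N figures.
printf '%s\n' \
  'db_time 450' \
  'roll_wait 280' \
  'cpu_time 200' \
  'load_gen_time 60' \
  'enqueue_time 10' |
awk '{t[$1] = $2; total += $2}
     END {for (c in t) printf "%-14s %5.1f%%\n", c, 100 * t[c] / total}' |
sort -k2 -rn
```

With these made-up numbers the database and roll wait rows come out on top, which is exactly the situation described above.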
Once we have found which elements have the highest response times, the next step is to check how they are working and how they can be improved. In this case we focus on the database, since it was the element with the most room for improvement.
There are several actions we can take on the database to improve the response time:
It is really important to check the database version to see if it is up to date. In this case the version was 11.2.0.2 and the SBP (SAP Bundle Patch) was 11.2.0.2.8.201211. If we check the Support Portal we can find the latest version and the latest SBP available. Considering the current version and SBP, it is a good idea to update the database to at least 11.2.0.4 and install the latest SBP.
A newer database version and SBP can fix performance issues, security issues, etc. SAP Support may also ask you to update your database version if they see it is not up to date, so remember to update it frequently.
SAP releases notes listing the proper database parameters depending on the database version. There are notes available for versions 9i, 10g, 11g and 12c. It is really important to understand that SAP needs some parameters set to specific values. It doesn't matter if you have been administering Oracle databases for a thousand years, just follow the SAP Note.
In this case the SAP Note that lists the parameters for Oracle 11g is 1431798 – Oracle 11.2.0: Database Parameter Settings. If you want to check the database parameters automatically you can use the script from SAP Note 1171650 – Automated Oracle DB parameter check. If you want to update the database parameters automatically you can use the Linux script included in SAP Note 2124058 – Oracle Parameter Updater. I'm not going to list the parameters I changed, just remember to adjust the event parameters properly!
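I can't share the real parameters, but conceptually the check from Note 1171650 boils down to comparing current values against recommended ones and reporting the drift. A toy sketch of that idea (parameter names and values here are illustrative, not the Note's actual recommendations):

```shell
# Toy illustration of a parameter check: compare current values against
# recommended ones and report the differences. The names/values below are
# examples only, NOT SAP's actual recommendations.
cat > current.txt <<'EOF'
optimizer_index_cost_adj 100
star_transformation_enabled false
EOF
cat > recommended.txt <<'EOF'
optimizer_index_cost_adj 20
star_transformation_enabled true
EOF
# join matches on the parameter name (both files must be sorted by it);
# awk prints only the parameters whose values differ
join current.txt recommended.txt |
awk '$2 != $3 {printf "%s: current=%s recommended=%s\n", $1, $2, $3}'
```

The real script is of course far more thorough; this just shows the shape of the check.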
The next step is to do a small database analysis using the Workload Reporting in transaction ST04 (you can also use the DB02 or DBACOCKPIT transactions). Choose the begin and end snapshots and click on AWR Report:
Don't be afraid: this report shows a lot of information and you may feel a little overwhelmed. The first step is to check the database buffers. As you can see in the next screenshot, the Hit Ratio for the database buffers is quite good:
On the database event side we can see that the top one is db file sequential read. This can be caused by different issues, so we have to dig deeper:
If we take a closer look we see that the % DB time is far higher for db file sequential read than for the rest of the events in the database:
The database events on a chart look like the following image.
As for the wait classes within the database, the top one is User I/O:
So it is quite clear we have an I/O problem in the database, right? The next step is to discover what could be causing the high I/O times so we can fix it and improve both the I/O times and the database performance. In order to improve the I/O within the database we can:
The AWR Report includes a section showing the performance of the database datafiles. As you can see in the following image, there are some datafiles with high read times:
The idea is to check the performance of the disks that contain those datafiles, to see if there is any problem at that level. You can check the I/O performance at the operating system level using the command iostat -Nxm 1 -d. Check whether any device shows high await and %util values, and whether those devices hold the same filesystems that appear in the report.
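As a rough sketch of what to look for, here is a toy filter over iostat -x style output that flags suspicious devices (the sample lines and thresholds are invented for illustration; real iostat output has more columns):

```shell
# Toy filter for iostat -x style output: flag devices whose await (ms) or
# %util cross a threshold. Sample data and thresholds are made up.
printf '%s\n' \
  'Device  r/s  w/s  rMB/s  wMB/s  await  %util' \
  'sda      10    5    0.5    0.2    3.1   12.0' \
  'sdb     220   80   25.0    9.5   45.7   93.4' |
awk 'NR > 1 && ($6 > 20 || $7 > 80) {print $1, "await=" $6 "ms", "util=" $7 "%"}'
```

Here only sdb would be flagged; that is the kind of device you then match back against the datafiles listed in the AWR Report.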
I checked in the EWA Report which tables had the most physical and logical reads in the database. The top one on the physical reads side was a Z table with 60% of the physical reads. On the logical reads side the top one was another Z table:
I checked the queries executed on these tables and found that they did full table scans (SELECT * FROM). A couple of them did use some fields in the WHERE clause, but there was no index on the table for those fields. The idea here was to optimize the queries as much as possible so that no full scans are done on these tables. Just remember: fewer full scans means fewer reads on the table, which means fewer reads on the database, which means better I/O.
Another way to reduce the database I/O is to tune the database memory parameters related to the SGA, PGA, etc. We have to distinguish between the PGA and the SGA (sorry, but I'm going to quote the Oracle documentation):
From Oracle 11g onwards it is possible to use Oracle Automatic Memory Management (AMM), so it is not required to set the SGA and PGA parameters manually. You only have to set the parameters MEMORY_TARGET and MEMORY_MAX_TARGET and Oracle will manage the memory automatically. Anyway, I wanted to set the PGA and SGA parameters myself and not use AMM, so I checked the current values:
The values were not bad but they could be improved. For the SGA you can execute the following query in your database and it will show the recommended value:
SELECT * FROM v$sga_target_advice ORDER BY sga_size;
In this database the query returned no rows, so I decided to raise the SGA Max Size to 14GB considering the use and size of the database. Also related to the SGA, we checked the recommended values for the Data Buffer and the Shared Pool. In transaction ST04 we can see the recommended values for the Data Buffer:
As you can see, if we raise the Data Buffer from 4896 to 9600 the disk time goes from 85,30% to 41,50%, so I changed the parameter DB_CACHE_SIZE to the recommended value. As for the Shared Pool, the ratios were quite good so I decided not to change it:
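For reference, the change itself is a one-liner. A sketch only, assuming the ST04 figure is in MB and that a restart window is available; always double-check memory parameter changes against SAP Note 1431798 first:

```shell
# Sketch only: apply the Data Buffer value recommended by ST04 (assuming the
# figure is in MB). Requires an Oracle environment; SCOPE = SPFILE means the
# new value only takes effect after the instance is restarted.
sqlplus / as sysdba <<'EOF'
ALTER SYSTEM SET DB_CACHE_SIZE = 9600M SCOPE = SPFILE;
SHUTDOWN IMMEDIATE
STARTUP
EOF
```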
As for the PGA, the current value was 1,6GB and the cache hit percentage was 96%. Checking the recommended values I saw that the cache hit percentage didn't improve much, so I left the PGA with the same value:
Another important topic on the database side is the database statistics. I have lost count of how many databases I have seen with really good hardware and correct parameters but terrible performance because of missing statistics. The job that creates the statistics should be executed every night via transaction DB13. I checked the DB13 schedule on the system and found the following statistics job:
The interesting part is the last one:
-t ALL -c 1 -p 10
These parameters mean we will generate statistics for all tables (-t ALL) with at least 1% changes since the last optimizer run (-c 1), using 10 parallel processes (-p 10). This job should be enough as a daily maintenance task. Anyway, we will also schedule a monthly task to generate deeper statistics. This should be done during a maintenance window so it won't affect the users:
brconnect -u / -c -f stats -t oradict_stats >> statistics_log.txt
brconnect -u / -c -f stats -t system_stats >> statistics_log.txt
brconnect -u / -c -f stats -t info_cubes -c 1 -p 6 >> statistics_log.txt
brconnect -u / -c -f stats -o SAPSR3 -t all -m +I -s P10 -f allsel,collect,method,precision,keep -p 4 >> statistics_log.txt
The 4 commands took about 12 hours to complete and were executed with the SAP instance stopped.
The current size of the database is 1.540GB, with 77,15GB of free space. This makes a total size of 1.553,30GB.
The database growth during the last few months is shown below:
The total growth during the last 12 months is about 400GB, which is quite high. Considering this, we can say we have a long-term problem with the database size. A bigger database means that queries will take longer if they are not optimized, since the amount of data involved is higher. In order to reduce the size we can take the following actions:
Option 1 takes a long time and should be done really carefully. Option 3 is an independent project and usually takes time. So our choice is to go for option 2, since it is available as a short-term action. We take a look at the database objects with the highest growth in the past weeks:
We check the tablespaces and find one named PSAPSR3731. This tablespace was used for an SPS update a long time ago; the objects it contains can be moved to the PSAPSR3 tablespace, and afterwards we can drop it.
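To put the growth figure in perspective, here is a quick projection from the ~1.553GB total at ~400GB per year (linear growth is an assumption; in practice growth often accelerates as usage increases):

```shell
# Quick linear projection of the DB size: ~1553GB today, ~400GB growth/year
awk 'BEGIN {size = 1553; for (y = 1; y <= 3; y++) {size += 400; printf "year %d: %dGB\n", y, size}}'
```

Crossing 2TB in barely more than a year is a good argument for starting the archiving project sooner rather than later.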
I checked the queries executed on the database using transaction ST04, clicking on Performance – SQL Statement Analysis – Shared Cursor Cache. There you can sort the queries by disk reads, number of executions, CPU Time, etc. The idea here is to find queries with high values for number of executions, disk reads, CPU Time, CPU Time/Exec, etc. in order to analyze them. After checking the results I saw that there were a lot of Z queries. I'm sorry, but I cannot give you details about the queries. The steps followed in order to improve them were:
I also checked the EWA report to see which database objects were affected by high CPU use, I/O, execution time, etc. You should focus on improving the queries that use those objects, or take additional measures for them — for example, placing the datafiles that contain those objects on faster disks.
After finishing the study, the tasks proposed to improve the performance of the system were the following:
After performing some of the tasks described above, I checked the response time of the SAP system to see if there was any improvement:
As you can see, the Average Response Time went from 1.752ms in the worst case to 642,8ms in the best case. This is the graph:
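Comparing those two extremes, the improvement works out to roughly 63%:

```shell
# Relative improvement between the extremes: 1752ms down to 642.8ms
awk 'BEGIN {printf "%.0f%% faster\n", (1752 - 642.8) / 1752 * 100}'
```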
I was quite happy with the result we achieved after the whole process. This was the usual case of an SAP system that grows in data, users, etc. over the years without proper capacity management. It is also a good example of how an end user may think there is a performance issue in the system even though the overall performance is within the values recommended by SAP. There is always room for a little improvement.
I hope you guys liked the article; I will appreciate any comments you have about it. Cheers!