In this post, I would show how I did troubleshooting on a long running job. My investigation found that runtime of the job was mostly spent on reading a SAP table VLKPA. The SQL access path chosen by database to read the table is correct based on then-existing VLKPA index. But the index used is not efficient based on SQL trace analysis. I used SAP transaction SM37, SM51/ SM50, SAP ST05, /SDF/MON in this analysis:
- SM37 to check job status and find where the job is running.
- SM51/SM50 is to hop to the server where the job is running and check what the job was doing.
- ST05 is to do a quick SQL trace on process when only database access performance is a concern.
- /SDF/MON is to review job history to see where the time is spent by the job.
1. The performance issue
Our customer is using SAP program SAP_DELIVERY_COMPLETE to update delivery status. Number of deliveries per month can be 100K-500K. They were asking why this particular job was still running when it was expected to finish and what was going wrong in this particular case based on experience from other production box..
2. The trouble-shooting of the performance issue
2.1 Collect information, Verify and monitor job/process running status
With job name and starting date, process ID and executing server of the job can be obtained, then corresponding SAP work process can be reviewed via SAP work process monitor SM50/SM51. I monitored the worked process in SM50 for some minutes by refreshing the screen again and again, and found that the work process stuck on deleting VLKPA table for a while, did something else, then quickly back to action of deleting VLKPA table again – this repeated again and again. The deletion was often lasted over 30 seconds but under 1 minutes. Following is a SM50 screen copy: Figure 1 -SM50 action and table SM50 confirmed that the job was spending a lot of time on “DB” side. Meanwhile, I checked that DB01 database lock and confirmed that lock contention was not a cause which slowed down the deletion during monitor period. System monitor ST06 confirmed that CPU/Memory was not a concern.
2.2 Run and analyze SQL trace on the running work process
Since SM50 monitoring showed that the running process spent most of time deletion. So I did a ST05 SQL trace for several minutes. Yes, ST12 trace would do the same purpose. But ST05 is more efficient than ST12 in such case – since there are less step/key-strokes to enable the trace. Figure 2 is the SQL trace summary or Summarized SQL statements screens from ST05 – the default screen which is sorted by duration. Figure 2 – ST05 Summarized SQL statements (Default screen – sort by duration) Based on ST05 SQL summarized statement screen, you can see that 8 execution of the delete statement costs total 270 seconds and deleted 8 records. Each deletion is about 33 seconds. Based on this performance, 100K deliveries might take 100K x 33 = 3,300K (seconds) which is 916 hrs ( 3,300K/3.6K = 916hrs) to be deleted from VLKPA table. Is there other SQL delete statement and what are their performance, I sorted ST05 Summarized SQL statement screen by SQL statement. I have following screen: Figure 3 – SQL Summarized Statement sort by statements Now you review Figure 3 screen and pay attention to Time/Exec column. You can see clearly that one execution of “delete from VLKPA” SQL statement took 33 seconds is longer one. For other SQLs, time/execution is less than 13 micro seconds. Especially, deleting VLPMA SQL statement has deleted more records during each execution but spent much less time, why? Let’s now review SQL execution plan to see what the table access strategy for VLKPA and VLKMA tables: From Figure 4, You can see that VLKPA~0 index is used to delete VLKPA based on field VBELN – VBELN is quite selective but it is last field of the index. So database has to scan through the whole primary index tree to find corresponding entry to be deleted which has a high cost especially when table involved is big – based on number of distinct VBELN value “22M” – it is not a small table as well. Figure 5 table VLPMA access strategy Figure 5 shows local index VLPMA~Z01 is used to access the table VLPMA. The only one field in Z01 index is “VBELN” – which is very selective based on number of distinctive volume. This explains why the deletion is efficient as well. Pay attention to attention to the “Estimated Costs” info as well. Lower cost indicates “better” access path normally. VLKPA – deliveries by partner function. VLPMA – delivery items by material. Both are SD index tables. VLPMA is much bigger than VLKPA. VLKPA has about 20M entries while VLPMA has about 400M. But SQL trace showed that execution of deleting VLKMA SQL statement is over 2,000 times faster than the execution of SQL statement for deleting VLKPA SQL statement. So Z01 index is making difference. I also list table access strategy for table ARFCSSTATE here, so you can understand why deleting from ARFCSSTATE is even more efficient – it is using primary index with highlighted fields. Figure 6 ARFCSSTATE table access strategy How to know whether a table access path chosen by database is the best one to a situation and what are other alternatives if table access path is not efficient – like create a new index, database hints, statistics updates etc., I am planning to come out a post on this in the future.
2.3 Checking job history
You can use /SDF/MON to understand what a “live” job was doing in the past. For the case on hand, /SDF/MON shows that process was doing a lot of data deletion from VLKPA table for the whole day up to the point when I started investigation. Following is work-process overview ( for a 8 minutes at the starting and end of period which was reviewed). Figure 7 SDF/MON – work process overview You can click How to use /sdf/mon to review work process status history for a running SAP job?if you are wondering how to get all work process status history for a job.
2.4 Summary of program performance investigation
Based on SM50 monitoring and /SDF/MON job running history, we know that the program was spending a lot of time on deleting data from VLKPA table. SQL trace and analysis review find that VLKPA SQL deleting statement is using primary index which is not efficient. Based on performance of VLPMA SQL deleting statement from the same program, we know the performance deletion can be over 2,000K times faster if a right index can be introduced.
3 Solution to the performance issue
The solution to performance issue is to create an index on VLKPA table using field Mandt and VBELN. The program was running fine in another production box –where VLKPA table has an index with Mandt and VBELN
4. Further clarification
If you are only need to find out execution plan/table access path for a SQL which is currently executing, there are several ways to do this.
- You can use SAP database performance monitor ST04 to find out execution plan for a “live” SQL statement.
- You can use ST04 SQL cache.
- You can get the SQL statement via work process monitor SM50 and then use ST05 to check the execution plan without trace.
- You can use /SDF/MON work process snap-shots to catch SQL and SQL execution plan etc..
So you do not have to do SQL trace or ABAP trace to solve application performance issue which is related to inefficient table access path. This post goes through performance trouble-shooting on an expensive SQL statement (related to database) via SM50 monitoring and /SDF/MON work process snapshots analysis etc. related to a ‘live’ SAP job. I have another post which goes through performance trouble-shooting on expensive ABAP operation (not related to database) via SM50 and /SDF/MON etc., you can click here to read the post for details. So my case posting covered two most typical performance issues trouble-shooting related to single program design. In this post, I mentioned several performance tools. You can read my post – how to analyze SAP program performance via SAP transaction ST12 if you would like to know performance tuning process and analysis tips. You can read my post- how to run SAP SM50/SM66 to monitor SAP work process if you would like to know how to run SM50 and understand SAP work process monitor. SAP /SDF/MON is my favorite SAP performance transaction which should be scheduled for every system to facilitate performance monitoring and trouble-shooting. You can read my post – /SDF/MON steps to collect SAP performance snap-shots if you are interested. Again it is on my radar screen to write a post on SAP database performance monitor ST04 in the future.