I was contacted on one production issue – Almost all SAP background work processes were in “on-hold” status with RFC info in SAP SM50 work process monitor transaction and seemed like not moving. I was asked why? Please continue reading to understand the reason which I identified through analysis via transaction SM50, STAD, RZ12. This post would cover:
- The issue – SAP work process in “on hold” status with RFC info,
- Trouble-shooting of SAP work process in “on hold status”,
- Impact of SAP work process on hold and
- Further clarification
1. The issue
SAP workload monitor are showing that majority of SAP background work process were in “on-hold” status with status information or reason “RFC response” as showed in below Figure 1.
Figure 1 SAP SM50 – WORK PROCESS IN “ON HOLD” STATUS WITH “RFC”
If you need help on SAP SM50, you can refer to my post on how to run SAP SM50/SM66.
SM50 in Figure 1 shows BTC work processes on hold were related to “RFC Response”. Figure 1 also tell us all Background processes were busy, that should be a result of long running BTC due to “RFC Call” performance. Are those RFC calls to external system? Or Are those RFC calls to internal system? Do we have RFC resource issue locally or do we have RFC resource issue remotely?
2.1 System was having No Free DIA WP for handling RFC call
So I checked whether the system has free DIA work processes for handling RFC call via SAP transaction SPBT. You can check it via SAP SMQS as well.
Figure 2 SAP SPBT – RFC resource availabilities check
Figure 2 shows that the system allows up to 25 DIA WP for RFC processing but at the moment, there was no free DIA WP for handling any new “RFC” call because “ALL PBT resources are currently busy”. So you might wonder who was exhausting system RFC resources and what we can do to fix or mitigate the situation.
SAP transaction SPBT is mentioned in section of how to validate RFC configuration in my post how to configure RFC server group via RZ12.
2.2 Who were exhausting system RFC resources?
Who were exhausting the system RFC resources? I checked this via SAP workload monitor SM50. Following is the result
Figure 3 SAP SM50 – DIA WP consumption
Figure 3 told us that most of DIA work processes were occupied by one single user – this is not a normal online user. Then you might wonder origin of those RFC calls.
2.3 Where were origin of RFC calls?
I used STAD to find out what SAP job/program were firing those RFC calls or where were those RFC calls from?
Figure 4 STAD – RFC calls
You can normally find the origin/parent of RFCs via SAP transaction STAD, Display details info and then check “Client” info. In this case, we found that RFC showed in Figure 4 were coming from another SAP system – where many work processes were running under a single online SAP user which needs information from the system I was looking into.
The reason that SAP work processes were in “Stop” or “on hold” status with a reason of “RFC response” was due to contention of RFC resources in the system. Available RFC resources (DIA work processes) were occupied by a storming RFC calls from a remote SAP system.
4. Impact of SAP background work process in “On-hold”
Impact is on two areas – SAP background job could not start on time in the system due to shortage of background process created by the on-hold or stop status and job started would run longer if it needed to make RFC calls.
4.1 Long running of back ground processes
Following screen shows processing an idoc took over 1 and half hours while it can normally finished in seconds.
Figure 5 Background process ran longer than normal
Figure 6 SAP STAD – Work process with big RFC time
Based on Figure 6, we know over 99.9% of runtime is spent on RFC call. Further review, you can know that RBDAPP01 is making a RFC call which was supposed to be executed locally and it spent most of RFC time on waiting. Please refer to my post on explanation of STAD statistical data if you would like to know details.
4.2 Long waiting of background processes
“Delay” column in SAP SM37 Screen shows how long a background job has to wait before it is dispatched to an idle BTC work process. Following screen shot is tried to give you an impact of the issue.
Figure 7 SAP SM37 – Job delay due to shortage of BTC work processes created by RFC issue
You can see that job delay is varied – that depends on job schedule time and status of system.
5 Further clarification
Technically, you can fix the issue immediately by terminating the corresponding program/process in the remote system. But in our case, I just added 8 more DIA WPs to RFC resources VIA RZ12 since almost no DIA sap users were online at that earlier AM system time, the issue were gone by itself about half hours after the addition. So in this situation, just waiting and let time cure the issue if there is no business impact some times. Too many RFC work processes can cause SAP work process dead-locks sometimes.
A SAP work process can be in “RFC” status for many reasons like network, remote server issue, expensive RFC call etc. But this case, it is due to local resource issue.
Normally more BTC work processes can be seen left in “on hold” or “stop” status than DIA WP. A DIA WP would be rolled out when it is in “waiting” status after connection is established, so it is available for other online/RFC request. This is a typical SAP design. Interact activities executed by online SAP users are conducted through SAP DIA work process. You normally has much more SAP online users than available SAP DIA work process.
Further follow-up is needed to understand the interface configuration and volume control and end user activities in the remote system, so a system configuration tuning or program design change can be made to avoid the reoccurrence of this issue.