Troubleshooting Tips & Tricks for workflow issues

This note will provide you with some troubleshooting tips so you can analyse any workflow problems in your system. The note is divided into the following areas.
[1] Agent Assignment/Agent Determination
Work items are not going to the correct users or to any users in your system. You need to troubleshoot the issue. Once you have identified the workflow template and step where the problem occurs you need to check the following:
Possible Agents

Identify the task assigned to the step in the workflow template via the builder and click on the 'Agent Assignment' button. This brings you to the 'Maintain agent assignment' screen for the task. (Alternatively, display the task via transaction PFTS or double-click on the task ID in the step. From the menu, go to Additional Data => Agent Assignment => Maintain or Display.)
If no possible agents are assigned to the task and the task is not set to status 'General' (available to all users), then this is most likely the reason for the agent assignment issue. You need to add users/positions/Org Units/etc. to the task or set it as General.
Responsible Agents

Responsible agents are assigned in the workflow step in the 'Agents' section. Typically you will see Positions, Expressions or Rules used to determine the responsible agents. You must check to see if any agents are returned at runtime. If the possible agent assignment of your task is ok then you need to test the responsible agents. Troubleshoot as follows:
Rule

If you use a Rule then you can simulate the rule execution via the 'Simulate rule resolution' button in PFAC. A rule receives data at runtime from the workflow container, so check the binding in the workflow definition to see what data is passed from the workflow container to the rule container. Then select the work item instance that has the problem and use the data from its workflow container for your simulation. When using the Simulation option in PFAC make sure you use F4 to enter the test container data. If the agents returned are not what you expected then you must check the rule implementation, e.g. do you use a custom function module, or have you maintained Responsibilities?

Special feature for authorizations in the workflow
Expression

If you use an expression to determine the responsible agents then you will need to check the work item instance where the problem has occurred and see what value is contained in the expression in the workflow container. If the value is unexpected then you must identify where the expression is filled in the workflow and investigate why the value is incorrect. A good way to find out exactly which steps use a particular container element is to right-click on the container element in the workflow builder and select "Display where-used list in graphic". This will highlight all the steps where the container element/expression is used.
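The relationship between the two checks above can be pictured with a small Python sketch. This is a simplified model and not the workflow runtime: it assumes a user receives the work item only if the user is both a possible agent of the task and a responsible agent of the step, and that a step without any responsible agent definition falls back to all possible agents. The function name and the user names are hypothetical.

```python
def determine_recipients(possible_agents, responsible_agents, all_users,
                         general_task=False):
    """Toy model of agent determination (assumption, not SAP code).

    possible_agents:    users assigned to the task
    responsible_agents: users returned by the rule/expression/position of
                        the step, or None if the step defines none
    all_users:          every user in the system
    general_task:       True if the task is set to 'General'
    """
    # A general task makes every user a possible agent.
    possible = set(all_users) if general_task else set(possible_agents)
    # Without a responsible agent definition, all possible agents qualify.
    if responsible_agents is None:
        return possible
    # Otherwise only users in both groups receive the work item.
    return possible & set(responsible_agents)


users = {"ANNA", "BOB", "CARL"}
print(sorted(determine_recipients({"ANNA", "BOB"}, {"BOB", "CARL"}, users)))
# Task without possible agents and not general: nobody receives the item.
print(sorted(determine_recipients(set(), {"BOB"}, users)))
```

In this model an empty result points either to the task (no possible agents, not general) or to the step (rule or expression returns nobody), which is the same order of checks as described above.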
Useful Transactions

PFAC - Maintain/Display Standard Rule
PFTC - Task Maintain/Display
SWDD - Workflow Builder
PPOSW - Display Organisational Plan
[2] Hanging Workflow
This area is broken down into 4 sections

Work item enqueue
SM58/SWU2
Short Dump during method execution
Transactions to restart work item in error

Work Item ENQUEUE
If a work item is being executed and the workflow runtime system tries to access the work item then it will not be successful due to the lock or enqueue currently on the work item. Some examples are:
Example 1

An asynchronous task is being executed by a user. While the work item is being executed, the terminating event is raised in the system and tries to set the work item status to COMPLETE but cannot due to the lock/enqueue. In this case the event is buffered in the event queue. As soon as the work item is released, the buffered event is delivered immediately.
Example 2

You use parallel processing where you have a fork with 2 branches (1 branch necessary for completion). In one branch is a dialog activity step and in the other is a 'Wait for Event' step. While the dialog step is being executed by a user, the Wait for Event step receives its event, continues along the branch and completes the fork (remember, only 1 branch is needed for completion). Once the end of the fork is reached the dialog work item should be set to status Logically Deleted, but this does not occur due to the lock/enqueue on the work item while the user is executing it. Since a callback is essential for a workflow to continue running, this callback is suspended (stored in the SWP_SUSPEN table). These callbacks are started again via the RSWWERRE report. If you have not scheduled report RSWWERRE then the entries will remain in table SWP_SUSPEN, the work items will not receive their callback and therefore the workflow will not continue.
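The behaviour in Example 2 can be illustrated with a toy model in Python. It is only a sketch under stated assumptions: the class, its attributes and the status labels are hypothetical, the list `suspended` merely stands in for table SWP_SUSPEN, and `redeliver()` stands in for one run of report RSWWERRE.

```python
class ToyRuntime:
    """Toy model of callback suspension; not the SAP implementation."""

    def __init__(self):
        self.locked = set()   # work item IDs currently executed by a user
        self.suspended = []   # stands in for table SWP_SUSPEN
        self.status = {}      # work item ID -> status label

    def callback(self, wi_id, new_status):
        # A callback on a locked work item cannot be applied, so it is parked.
        if wi_id in self.locked:
            self.suspended.append((wi_id, new_status))
            return False
        self.status[wi_id] = new_status
        return True

    def redeliver(self):
        # Stands in for a run of RSWWERRE: retry every parked callback.
        pending, self.suspended = self.suspended, []
        for wi_id, new_status in pending:
            self.callback(wi_id, new_status)


rt = ToyRuntime()
rt.status[4711] = "STARTED"
rt.locked.add(4711)                       # user is executing the dialog step
rt.callback(4711, "LOGICALLY_DELETED")    # fork completes, callback is parked
rt.locked.discard(4711)                   # user leaves the work item
rt.redeliver()                            # without this call nothing changes
print(rt.status[4711])                    # LOGICALLY_DELETED
```

If `redeliver()` is never called, the entry stays in `suspended` and the status never changes, which corresponds to the hanging work item described above.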
Troubleshooting
If work items are hanging, check table SWP_SUSPEN to see if the callback work item ID is there. If it is, make sure the RSWWERRE job is running in order to redeliver it. If RSWWERRE is running and the entry is still not being delivered, check for notes using the search terms "RSWWERRE" and "SWP_SUSPEN". If there is no entry in SWP_SUSPEN, check the workflow definition to see if the work item is asynchronous, i.e. needs a terminating event as in Example 1 above. Check the event queue via transaction SWEQADM to see if the terminating event is being buffered there. If it is, it should be redelivered automatically; if that does not happen, do a notes search in relation to the event queue.
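The order of these checks can be written down as a small decision helper. This Python sketch only restates the steps above; the function and parameter names are hypothetical, and the final fallback (moving on to the SM58/SWU2 and short dump checks) is an assumption based on the structure of this note.

```python
def next_check(in_swp_suspen, rswwerre_scheduled,
               needs_terminating_event, event_in_queue):
    """Return the next troubleshooting step for a hanging work item."""
    if in_swp_suspen:
        if not rswwerre_scheduled:
            return "Schedule report RSWWERRE to redeliver the callback."
        return "Search for notes with the terms RSWWERRE and SWP_SUSPEN."
    if needs_terminating_event and event_in_queue:
        return ("Event is buffered in the event queue (SWEQADM); "
                "search for event queue notes if it is not redelivered.")
    # Assumption: no lock-related cause found, so try the other sections.
    return "Continue with SM58/SWU2 and ST22."


print(next_check(in_swp_suspen=True, rswwerre_scheduled=False,
                 needs_terminating_event=False, event_in_queue=False))
```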
SM58 or SWU2
Sometimes, if there are network problems, a system crash, or tRFC & qRFC issues in your system, you may find that work items hang. If you check the workflow technical log you will see work items in status "In Process" or "Complete" that do not continue to the next step. The best option here is to check SM58 to see if there are entries in relation to the tRFC execution.
If you want to find the work item that corresponds to an entry in SM58, double-click on the entry under 'Transaction ID'. This provides some useful information such as the standard task, business object, object instance, related event and work item ID. These details are not always provided, but you should get enough information to track down the work item ID, e.g. if it has the Task ID then go to SWI1, enter the Task ID as well as the date & time of the entry and you should be able to narrow it down.
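The narrowing-down step is a filter on task ID and a time window around the SM58 entry. The Python sketch below shows the idea on an exported list of work items; the record layout, the task ID and the 5-minute window are hypothetical and only serve the illustration.

```python
from datetime import datetime, timedelta


def candidates(work_items, task_id, entry_time, window_minutes=5):
    """Keep work items of the given task created close to the SM58 entry."""
    window = timedelta(minutes=window_minutes)
    return [wi for wi in work_items
            if wi["task"] == task_id
            and abs(wi["created"] - entry_time) <= window]


# Hypothetical work item list, e.g. copied from a SWI1 selection.
items = [
    {"id": 1001, "task": "TS99900001", "created": datetime(2024, 3, 1, 10, 2)},
    {"id": 1002, "task": "TS99900001", "created": datetime(2024, 3, 1, 14, 30)},
    {"id": 1003, "task": "TS99900002", "created": datetime(2024, 3, 1, 10, 3)},
]
hits = candidates(items, "TS99900001", datetime(2024, 3, 1, 10, 0))
print([wi["id"] for wi in hits])  # [1001]
```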
Useful Information
The entries in SM58 can be restarted as long as the issue that caused them to be there in the first place has been resolved. To restart one entry, select it in the list and go to the menu: EDIT => Execute LUW. If you want to process many entries then go to the menu EDIT => Execute LUWs or schedule the report RSARRCEX with the correct data.
Short Dump during method execution
Check ST22 to see if there are any short dumps related to the hanging work item. If you search ST22 with the date and time the work item was executed, as well as the executing user or WF-BATCH, you can usually find out exactly where the dump occurred and whether it is in custom code or SAP standard code. A short dump may occur in custom code, for example, due to poor exception handling within the method code. If this is the case then the best option is to test the object method via transaction SWO1 with the data from the work item container, which you can get from the workflow log.
If the object method is SAP standard code then you will need to identify which application area is responsible for the object method and create a message in this component.
Useful Information

Go to SWO1, enter the business object and click the Display button. Then click on the Basic Data button. The program name associated with the business object is listed here, so run report RSSTATUS to find the component that owns the object, e.g.
Object: FORMABSENC
Program name: SWUFORMA
Run report RSSTATUS: Component BC-BMT
Transactions to restart work item in error
SWPR - Workflow Restart After Error
SWPC - Continue Workflows After System Crash
SWF_ADM_SUSPEND - Restart Suspended Workflow Callbacks
SWF_ADM_SWWWIDH - Restart Suspended Deadline Callbacks
[3] Performance in SAP Business Workplace (SBWP)
The overall performance of SBWP is influenced by a number of factors, which are listed below.
Archiving
Table access & performance
Performance of SBWP (Available BAdI's)
RFC
Archiving
Please put a WORKITEM archiving strategy in place so you can keep your workflow table sizes under control. Use transaction SARA with archiving object WORKITEM to archive work items; the transaction lists the tables that will be affected. The lower the number of entries in the workflow runtime tables, the better the performance of the workflow engine.
Table access & performance

Performance of SBWP (Available BAdI's)
If you are experiencing performance issues in SBWP, the main reason is usually that users have far too many work items in their inboxes (several thousand). Some customers have a business need to have all work items in their inboxes rather than use more specific agent assignment (call centre scenario). Therefore several BAdIs were provided to improve performance, e.g. by reducing the number of work items in users' inboxes:
  • WF_BWP_SELECT_FILTER                                         
    This BAdI enables you to limit the number of the work items displayed by filtering. It is mainly suited to scenarios where all users are working on the same inventory of work items (for example, call center).
     
  • WF_BWP_DYN_COLUMN                                           
    Hiding the dynamic columns improves performance in the Business Workplace. If this is not possible, you can implement the BAdI WF_BWP_DYN_COLUMN to determine the values of the dynamic columns directly from the application data.
     
  • WF_BWP_OBJ_ATTRIBUTE                                         
    With this BAdI, it is possible to set the default attributes of the dominant object (_WI_Object_ID) and the grouping characteristic (_WI_Group_ID). The default attributes are used for grouping according to content, and grouping according to sort key and for hiding the group object column and work item content. The BAdI is available with Note
RFC
If you go with registering the destination you must have your RFC parameters set correctly so that there are no delays in work item execution due to a lack of available dialog processes and so on.

[4] Events
Deactivation of Event Linkage
When the event linkage gets deactivated, a mail is sent to the SBWP of the WF administrator, detailing the cause of the error. Usually this is due to incorrect data being passed from the object event to the event receiver/workflow template. Please check the inbox (not the workflow inbox) of the WF administrator for mails around the time of the deactivation and you should get your answer.
Have you thought about using the Event Queue (Transaction SWEQADM)?

You can use the event queue to try to balance the load of workflows being triggered. The event queue stores the results of the event linkage temporarily in a database table after the start conditions and check functions have been evaluated (if you use Start Conditions).
However, you can also use the event queue to store events that are in error; once you have corrected the cause you can redeliver the events.
In SWE2, set the flag 'Enable event queue' in the event linkage.
Also set the 'Behaviour upon feedback' = Do not change linkage.
You can set the basic data of the event queue with transaction SWEQADM.
Troubleshooting Deactivation
Activate the Event log (transaction SWELS). As soon as the linkage gets deactivated again, check the event log with transaction SWEL. It should have an entry for every successful event and also an entry for the one that deactivated the linkage. If you double-click on this entry it should give some information on why this is happening. (It is not advised to leave the Event log switched on in Production for long periods as it may cause performance issues. Switch it off once the linkage has been deactivated.)
What event is triggering the workflow? Is it an SAP standard or a custom-designed event? Sometimes with custom-designed/raised events, not all information is passed from the event to the receiver/workflow, and this causes the deactivation.
In order to analyse the event container you could do the following: Define a new entry in SWE2:
Enter the Event, for the receiver type enter a user name (your user name), and for the receiver function module enter SWE_EVENT_MAIL. You will then get the event container in an email for every event and can see if some data is missing. It could be possible that a mandatory input element in the workflow is not passed from the event.
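Once you have the event container from the mail, the check is a comparison between the elements the event delivered and the mandatory import elements of the workflow. The Python sketch below illustrates this comparison; the function name and the element names are hypothetical, and in practice you read both lists from the mail and from the workflow container definition.

```python
def missing_mandatory(event_container, mandatory_elements):
    """Return mandatory workflow elements the event did not fill."""
    return sorted(name for name in mandatory_elements
                  if event_container.get(name) in (None, ""))


# Hypothetical event container as received in the mail.
container = {"ObjectKey": "", "Creator": "USER1"}
print(missing_mandatory(container, ["ObjectKey", "Creator", "Amount"]))
# ['Amount', 'ObjectKey']
```

Any name in the result is a candidate cause for the deactivation and should be traced back to the binding between the event and the workflow.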
Useful Transactions

SWELS - Switch the event trace on/off
SWEL - View the event trace
SWU0 - Event Simulation
SWUE - Create Event
SWEQADM - Event Queue
SWETYP - Type Linkages
SWEINST - Instance Linkages
[5] Transport

(a) Workflow Templates

Versions & Activation
This is how versions work in relation to transporting Workflow Templates
Create or make a change to a workflow in your development system and transport it to Test.
If the workflow already exists in Test and it has running instances then a new version of the workflow is created automatically in Test. This is done so that the existing instances can use the old version and new instances created after the transport can use the new version. If the workflow already exists but had no existing instances (In Test) then it would simply be overwritten with the new transport.
You do NOT need to have versions synchronized between systems.
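The version rule can be summarised in a few lines of Python. This is a toy model of the behaviour described above, not the transport logic itself; the function name and the use of plain strings for workflow definitions are assumptions made for illustration.

```python
def import_transport(versions, running_instances, new_definition):
    """Toy model of a workflow import into the target system.

    versions:          definitions already in the target (last one is active)
    running_instances: number of running instances in the target
    """
    if versions and running_instances > 0:
        # Keep the old version so running instances can finish on it.
        return versions + [new_definition]
    # No running instances (or workflow is new): overwrite or create.
    return versions[:-1] + [new_definition]


print(import_transport(["def_A"], 3, "def_B"))  # ['def_A', 'def_B']
print(import_transport(["def_A"], 0, "def_B"))  # ['def_B']
```

The two calls show why version numbers can differ between systems: the same transport creates a new version in one case and overwrites the existing one in the other.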
Troubleshoot versions & activation
Do the Source and Target systems have the same system date & time?
If you created any new container elements in your workflow, have you made sure that their data references also exist in the test system?
Did you create any new tasks and add them to the workflow in the development system? If so, please make sure that you also transported the tasks to the test system.
Have you checked transaction SWDM -> Extras -> Transported workflows in the target system? It will show up in red if there are any issues.
Useful Transactions
SWDM - Business Workflow Explorer
(b) Agent Assignment
The options for transporting agent assignment (e.g. Possible agents of tasks) or organization structures are as follows:
[1] Automatic transport of all Org plan changes.
    Table T77S0, entry TRSP.CORR is set to space.
[2] Manual transport of Org plan changes, by manually selecting what
    part of the Org plan is to be transported. Table T77S0, entry
    TRSP.CORR is set to 'X'. Program RHMOVE30 performs the transport.
[3] Transport via an object lock by using a change request. Table
    T77S0, entry TRSP.CORR is set to 'T'. Program RHMOVE50 performs the
    transport.

[6] Extended Notifications & Delta Pull
If you are experiencing performance issues with SWN_SELSEN or RSWNUWLSEL then please review the following notes and tips.
The selection of the notifications/work items is logged in the application log. You can influence the log level via transaction SWNCONFIG: call transaction SWNCONFIG and go to the General Settings. There is a setting called MAX_PROBCLASS. If you choose, for example, log level 1 then only the very important information is logged. Changing the log level may help with performance.
The main tasks of program RSWNUWLSEL (in FULL mode) are
Selection of all open dialog and deadline work items in the system. Open means status READY, SELECTED, STARTED and COMMITTED
Determine the agents of these work items
Create notifications (relation work item to user)
Store them in table SWN_NOTIF. This means insert new notifications, logically delete obsolete notifications and update notifications.
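The four tasks above can be sketched in Python as follows. This is only an illustration under assumptions: the input is taken to be already restricted to dialog and deadline work items, the dictionary `store` stands in for table SWN_NOTIF, and the function names and the "ACTIVE"/"DELETED" labels are hypothetical.

```python
OPEN_STATUSES = {"READY", "SELECTED", "STARTED", "COMMITTED"}


def full_selection(work_items, agents_of):
    """Build (work item ID, user) pairs for all open work items."""
    notifications = set()
    for wi in work_items:
        if wi["status"] not in OPEN_STATUSES:
            continue
        # Agent determination; an empty result costs time but yields nothing.
        for user in agents_of(wi):
            notifications.add((wi["id"], user))
    return notifications


def sync_store(store, notifications):
    """Insert/update current notifications, logically delete obsolete ones."""
    for key in store:
        if key not in notifications:
            store[key] = "DELETED"
    for key in notifications:
        store[key] = "ACTIVE"


items = [{"id": 1, "status": "READY"}, {"id": 2, "status": "COMPLETED"}]
agents = {1: ["ANNA", "BOB"]}
store = {(9, "CARL"): "ACTIVE"}
sync_store(store, full_selection(items, lambda wi: agents.get(wi["id"], [])))
print(sorted(store.items()))
```

The sketch also shows why open work items without agents are wasted effort: they pass the status filter and go through agent determination without ever producing a notification.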
Check the following:
Are there open work items belonging to tasks where there may be an agent assignment issue, e.g. a task or tasks that have a position assigned to them but the position no longer exists? If this is the case then RSWNUWLSEL tries to determine the agents of these work items without any result, which slows performance. One way to determine this is to run transaction SWI2_ADM1 to see if there are many work items without any agents. If this is the case then reorganizing the work items in your system will help. Also check if the work items are still relevant; if not, maybe they can be archived or deleted. For information about deleting and archiving work items, please take a look at note 573656.
In addition there is program RSWNNOTIFDEL. This program deletes entries from table SWN_NOTIF which are in status logically deleted. A notification gets status logically deleted when the corresponding work item is completed (processed). To keep the number of entries in SWN_NOTIF as small as possible you should schedule RSWNNOTIFDEL periodically. A suggestion would be to run it with the default value of 10 days. If this is not sufficient you could change the value later on.
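The effect of such a periodic cleanup can be sketched in Python. The sketch assumes that the value of 10 days is a minimum age, i.e. that only logically deleted entries older than this are removed; this interpretation, the function name and the record layout are assumptions and not taken from the program itself.

```python
from datetime import datetime, timedelta


def purge(notifications, now, keep_days=10):
    """Drop logically deleted notifications older than keep_days."""
    cutoff = now - timedelta(days=keep_days)
    return [n for n in notifications
            if not (n["status"] == "DELETED" and n["changed"] < cutoff)]


now = datetime(2024, 1, 31)
rows = [
    {"id": 1, "status": "DELETED", "changed": datetime(2024, 1, 1)},
    {"id": 2, "status": "DELETED", "changed": datetime(2024, 1, 25)},
    {"id": 3, "status": "ACTIVE", "changed": datetime(2023, 1, 1)},
]
print([n["id"] for n in purge(rows, now)])  # [2, 3]
```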
[7] Transactions for troubleshooting
SWUD - Workflow Diagnosis
SWDM - Business Workflow Explorer
SWUS - Test Workflow
SWU8/SWF_TRC - Workflow Trace On/Off
SWU9/SWF_TRC - Workflow Trace Display
SWF_APPL_DISPLAY - Display Application Log
SWU0 - Event Simulation
SWELS - Event Trace On/Off
SWEL - Event Trace Display
SWPR - Workflow Restart After Error
SWPC - Continue Workflows After System Crash
SWI2_DIAG - Diagnosis of Workflows with Errors
SWU2/SM58 - Workflow RFC Monitor