Note 766349 - FAQ: Oracle SQL optimization

Summary
Symptom
    1. What is SQL optimization?
    2. How important is SQL optimization?
    3. Which criteria are used to determine the load of an SQL statement?
    4. Where can I find information on the executed SQL statements?
    5. How do I find the most processing-intensive SQL statements overall?
    6. How do I find information on an SQL statement that is currently running?
    7. How can I determine what values are hidden behind the bind variables?
    8. Why does the SQL statement appear different on database level than in the ABAP source code?
    9. How many processing-intensive SQL statements should be observed as part of SQL optimization?
    10. What are selection and join conditions?
    11. What is a run schedule?
    12. How can I display the run schedule for an SQL statement?
    13. What are the characteristic components of a run schedule?
    14. What is an optimal run schedule?
    15. What utilities are available for further analysis of a processing-intensive SQL statement?
    16. How can I optimize a processing-intensive SQL statement?
    17. What must be considered in terms of the index design?
    18. What must I take into consideration with regard to the query design?
    19. What analysis tools are available at Oracle level?
    20. What information do the "Estimated Costs" and "Estimated Rows" provide for an SQL statement?
    21. Is it possible to determine a buffer quality for individual tables?
    22. What happens when there is an overflow of the Buffer Get?
    23. How can I determine when a complex access is made in the execution plan?
    24. Can I see a sample analysis?
    25. Is the ST05 data always reliable?
    26. Where can I find further information on SQL optimization?
Other terms
FAQ, frequently asked questions
Solution
    1. What is SQL optimization?
              SQL optimization (or SQL statement tuning or Shared Cursor cache analysis) refers to the identification, analysis and optimization of SQL statements that are responsible for the highest load in relation to I/O and CPU consumption on database level. These statements are also called "processing-intensive SQL statements".
              Due to the complexity of the topic, this note can only offer an outline description of an analysis. However, SAP offers various books, training courses and services on the subject (see also the final question in this note).
    2. How important is SQL optimization?
              Regular analysis and optimization of processing-intensive SQL statements is the MOST important basis for efficient system operation. Resource-intensive SQL statements are directly responsible for increased I/O and CPU activities and indirectly responsible for subsequent problems such as buffer busy waits or unusually high I/O times.
    3. Which criteria are used to determine the load of an SQL statement?
              The two main criteria for the load that an SQL statement creates are:
  • Number of disk reads (or physical reads)
    • Number of blocks that were not in the Oracle buffer pool in memory and therefore had to be read from disk
    • Measure of the I/O load of an SQL statement
  • Number of buffer gets (or logical reads or reads)
    • Number of blocks read from the Oracle buffer pool in memory
    • Measure of the CPU and memory load of an SQL statement
              The following criteria are also relevant and should not be neglected, even though they generally have less influence upon the database performance:
  • Number of processed rows
    • Number of table entries returned by an SQL statement
    • Measurement for the network load of an SQL statement
  • Number of executions (or better: number of fetches, if available)
    • Measurement for the communication load with the database
  • CPU time
    • Measurement of the CPU load of an SQL statement
  • Elapsed time
    • Measurement of the actual duration of an SQL statement
  • Number of direct writes (for Oracle >=10g)
    • Measurement for PSAPTEMP and direct path write activities (for example, sorting, LOB accesses, parallel query, hash joins, bitmap operations)
              The criteria most relevant for SQL optimization are "Disk Reads" and "Buffer Gets". It is irrelevant whether a statement is executed often, causing only a small load each time, or whether it is executed just once, causing a large load. It is the total load triggered by the statement that is decisive, not the load per execution.
              In the R/3 environment, statements are parsed with bind variables (:A0, :A1, ...). Different concrete values can be hidden behind these bind variables. Nevertheless, all statements that differ only in these values are treated as ONE statement (at bind variable level).
    4. Where can I find information on the executed SQL statements?
              The central source of information for SQL statements is the Shared Cursor Cache, which you can access through an SAP system by selecting
              Transaction ST04
-> Detail Analysis Menu
-> SQL Request
                This cache includes the following information for all SQL statements executed since the last database start that were not displaced again from the shared cursor cache:
  • Executions: Number of executions
  • Disk reads: Number of blocks read from the disk
  • Disk reads/execution: Number of blocks read from the disk per execution
  • Buffer gets: Number of blocks read from the buffer pool
  • Buffer gets/execution: Number of blocks read per execution from the buffer pool
  • Rows processed: Number of processed rows
  • Rows processed/execution: Number of rows processed for each execution
  • Buffer gets/Row: Number of blocks read per processed row from the buffer pool
  • CPU Time: Consumed CPU time (SAP basis >= 6.40)
  • Elapsed Time: Duration of the execution (SAP basis >= 6.40)
  • SQL statement: Text of the SQL statement
  • Program name: Name of the calling program that was executed when parsing the statement.
  • "Callpoint in the ABAP program" button: Exact location in the ABAP source code from where the statement call originates at the time of parsing.
              There are also buttons to define a reset time and start an evaluation since this reset time only.
    5. How do I find the most processing-intensive SQL statements overall?
              You can find the SQL statements that are currently most processing-intensive in terms of one of the three load criteria by sorting the Shared Cursor Cache entries by disk reads, buffer gets, or processed rows. It is best to sort by each of the three criteria in turn and to optimize the statements with the highest load in each case.
              To automatically determine the most processing-intensive statements, you can also use the report /SDF/RSORADLD_NEW (or RSORADLD or /SDF/RSORADLD), which lists all statements by default that are responsible for more than 2% of the disk reads or 5% of the buffer gets.
              The SAP Early Watch Alert Services also gives you an overview of the most processing-intensive SQL statements.
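              If no SAP front end is available (for example, in a Java-only environment), you can retrieve a comparable overview directly from the Shared Cursor Cache at Oracle level. The following query is only a minimal sketch: the limit of 20 rows is arbitrary, and you can sort by DISK_READS DESC instead to evaluate the I/O load:

SELECT * FROM
  (SELECT
     BUFFER_GETS,
     DISK_READS,
     EXECUTIONS,
     SUBSTR(SQL_TEXT, 1, 80) SQL_TEXT
   FROM
     V$SQL
   ORDER BY BUFFER_GETS DESC
  )
WHERE ROWNUM <= 20;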
    6. How do I find information on an SQL statement that is currently running?
              If a work process is busy accessing a table in SM50/SM66 for an extended period, you can determine the relevant Oracle session using the Client-PID as described in Note 618868.
              To obtain details of the block accesses, you can determine this statement in the second step in the Shared Cursor Cache.
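              If you want to perform this mapping directly at Oracle level, you can use a sketch such as the following, assuming the client PID determined in SM50/SM66 is known (V$SESSION.PROCESS contains the operating system process ID of the client, that is, of the work process):

SELECT
  SID,
  SERIAL#,
  STATUS,
  SQL_HASH_VALUE    -- as of Oracle 10g, SQL_ID is also available
FROM
  V$SESSION
WHERE
  PROCESS = '<client_pid>';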
    7. How can I determine what values are hidden behind the bind variables?
              In some cases the ABAP source code contains literals or constants that are transferred 1:1 to the database. Therefore an examination of the ABAP source code will provide information about the values transferred to the database.
              Otherwise, up to Oracle 9i you can only determine the content of the bind variables by taking measures such as the following BEFORE executing an SQL statement:
  • Activate an SQL trace using Transaction ST05
  • J2EE Environment: Activate an SQL trace using a web browser URL (http://<hostname>:<port>/SQLTrace) or using SAP J2EE Engine Visual Administrator
  • Activate an ORADEBUG trace (Note 613872)
  • Debug the ABAP program that launches the SQL statement
  • Oracle 10g or higher: Enhanced SQL auditing (Note 1128663)
              Up to and including Oracle 9i, no further information about the contents of the bind variables can be obtained retrospectively.
              As of Oracle 10g, the view V$SQL_BIND_CAPTURE is available, which stores the contents of bind variables for the executed SQL statements. With this information, you can use the following query to determine the bind variable contents specified for an SQL statement:

SELECT
  SUBSTR(SBC.NAME, 1, 10) BIND,
  SUBSTR(SBC.VALUE_STRING, 1, 50) VALUE,
  COUNT(*) "NUMBER"
FROM
  V$SQL_BIND_CAPTURE SBC, V$SQL S
WHERE
  S.SQL_TEXT LIKE '<sql_statement_pattern>' AND
  S.SQL_ID = SBC.SQL_ID
GROUP BY NAME, VALUE_STRING
ORDER BY 1, 2;
              The bind variables are updated in V$SQL_BIND_CAPTURE every 15 minutes at the earliest. If you require a faster refresh for analysis purposes, you can temporarily set the underlying underscore parameter _CURSOR_BIND_CAPTURE_INTERVAL to a value in seconds other than the default of 900 (that is, 15 minutes):

ALTER SYSTEM SET "_CURSOR_BIND_CAPTURE_INTERVAL"=<seconds>;
              By default, only the first 400 bytes of the bind variable content of an SQL statement are saved. For statements with many bind variables, this may mean that the values of the last bind variables are not captured. If you require variable content beyond 400 bytes, you can set the _CURSOR_BIND_CAPTURE_AREA_SIZE parameter to a value <bytes> greater than 400, for example:

ALTER SYSTEM SET "_CURSOR_BIND_CAPTURE_AREA_SIZE"=<bytes>;
    8. Why does the SQL statement appear different on database level than in the ABAP source code?
              The Open-SQL statements executed from R/3 are transferred to the database via the database interface (DBI). In many cases, the statement is modified in the DBI before being transferred to the database:
  • If a column is compared with an empty variable in the WHERE clause, the DBI omits this condition.
  • When you use FOR ALL ENTRIES, the DBI splits the value list into statements with short IN lists or OR concatenations, depending on the DBI parameters described in Note 48230 (particularly rsdb/max_blocking_factor and rsdb/max_in_blocking_factor); see the sketch after this list.
           Note that RSPARAM always displays a value of -1 for these parameters when you use the default values. You can determine the value actually in use by referring to the dev_w* work process trace (transaction ST11).
           If the FOR ALL ENTRIES list is empty, all data is generally selected from the current client ("SELECT ... FROM <table> WHERE MANDT = :A0"). All conditions in the WHERE part are ignored.
  • Accesses to R/3 pool and cluster tables (that are not available as independent tables at an Oracle level) are converted into accesses on the related table pools or table clusters.
  • If tables in R/3 are completely or generically buffered, the buffers are reloaded, if necessary, with special DBI statements (for example, "SELECT * FROM <table> WHERE MANDT = :A0 ORDER BY <primary_key_fields>" for completely buffered tables) that may be completely different to the statement from the ABAP source code.
  • The DBI adds the FIRST_ROWS hint and "WHERE ROWNUM <= ..." to ABAP statements that use "SELECT SINGLE" or "UP TO n ROWS" without ORDER BY (Note 135048).
  • Some operations (for example, kernel procedures, bulk operations, generations) can generate SQL statements although there is no regular SQL statement in the ABAP source code.
  • Even when you use conversion exits, there can be significant differences between ABAP source code and database-side statements (such as additional conditions that are not explicitly present in the ABAP source code).
  • IN conditions from the ABAP source code may be converted into any number of possible conditions on the database level, depending on your selection criteria: "=", "IN", "LIKE", "BETWEEN", ">", "<", ">=", "<="
  • Columns that appear in both the selection list and the WHERE condition are removed from the selection list if it is clear from the WHERE condition what the column's value must be.
  • If an expression ends with a space followed by a placeholder, the system generates an OR connection as follows:
           SQL statement: ... WHERE <column> LIKE '<string> %'
           Statement after DBI transformation: ... WHERE (<column> LIKE '<string> %' OR <column> LIKE '<string>')
  • The existence of synchronous matchcodes provides an explanation for SELECT COUNT operations on the database in connection with DML operations (INSERT, UPDATE, DELETE) in ABAP. For more information, see Note 1093107.
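              To illustrate the FOR ALL ENTRIES transformation mentioned above: with a hypothetical internal table of document numbers and rsdb/max_in_blocking_factor = 5, one of the statements repeatedly executed at database level might look as follows (the table "VBAK" and its columns are only examples):

SELECT "VBELN", "ERDAT"
FROM "VBAK"
WHERE "MANDT" = :A0 AND "VBELN" IN (:A1, :A2, :A3, :A4, :A5)

              The DBI executes this statement as often as necessary until the entire value list has been processed.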
    9. How many processing-intensive SQL statements should be observed as part of SQL optimization?
              There is no single, general response to this question. The answer depends primarily on the goals of the optimization and the current system status. In one case, optimizing a single highly processing-intensive SQL statement can make a previously barely usable system function again, while in another case more than 50 statements must be optimized to achieve the required improvement in performance.
              Indicators regarding whether a system is already well tuned or not are outlined in Note 618868.
    10. What are selection and join conditions?
              In a selection condition, a column is compared with one or more specific values (for example, "MANDT = :A0", "BDAT > '20050821'"). In a join condition, two column values from different tables are compared (for example, "T_00.MANDT = T_01.MANDT").
              For the initial table access of a join, only the selection conditions can be used. Join conditions can only be used when the inner table of a nested loop join (see below) is accessed.
              Selection and join conditions are specified in the WHERE part of the SQL statement and - in the case of views - are also specified within the view definition in Transaction SE11. These view conditions MUST NOT be overlooked when analyzing SQL statements.
    11. What is a run schedule?
              A run schedule (or explain, access path) shows how Oracle accesses the requested data (index access, full table scan, join type ...).
    12. How can I display the run schedule for an SQL statement?
              Up to and including Oracle 8i, the run schedule could be determined using the Oracle Explain function. As of Oracle 9i it is also buffered in the V$SQL_PLAN performance view.
              From the shared cursor cache overview, you can go to the execution plan using the Explain button or by clicking on the SQL statement and selecting the Explain button on the dialog box.
              Whether the explains displayed in R/3 are based on V$SQL_PLAN or on the conventional Oracle explain depends on the R/3 release and Support Package level. If the R/3 explain display does not contain any further information, the normal Oracle explain is used. If you find explicit information concerning V$SQL_PLAN, the data comes from this view.
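              As of Oracle 9i, you can also query V$SQL_PLAN directly, for example with the following sketch (the DEPTH column is available as of Oracle 9.2; restrict the statement using a suitable text pattern):

SELECT
  LPAD(' ', P.DEPTH) || P.OPERATION || ' ' || P.OPTIONS ||
    DECODE(P.OBJECT_NAME, NULL, NULL, ' (' || P.OBJECT_NAME || ')') ACTION
FROM
  V$SQL_PLAN P, V$SQL S
WHERE
  S.SQL_TEXT LIKE '<sql_statement_pattern>' AND
  P.ADDRESS = S.ADDRESS AND
  P.CHILD_NUMBER = S.CHILD_NUMBER
ORDER BY
  P.CHILD_NUMBER, P.ID;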
    13. What are the characteristic components of a run schedule?
  • TABLE ACCESS FULL
           During a full table scan, all table blocks up to the high water mark (the highest block that has ever contained data) are read.
  • INDEX RANGE SCAN
           Depending on the selection conditions, a subarea of an index is read
  • INDEX UNIQUE SCAN
           In the selection conditions, all key fields of a primary index are specified with "=" so that no more than one entry of the index is returned.
  • INDEX FULL SCAN
           The entire index tree is read from left to right (-> sorted result)
  • INDEX FAST FULL SCAN (as of Oracle 9i)
           The blocks of the index are read in the order in which they are stored on disk (due to DB_FILE_MULTIBLOCK_READ_COUNT, this is faster than an INDEX FULL SCAN, but the result is not sorted)
  • INDEX SKIP SCAN (as of Oracle 9i)
           This is comparable with several INDEX RANGE SCANs; it can also use index fields that follow an index field that is not specified in the selection conditions or is not specified with "=".
  • TABLE ACCESS BY INDEX ROWID
           Read the complete data record of the table based on the ROWID that was determined with a preceding index access
  • NESTED LOOPS
           Two tables are joined: the system first accesses the table that appears directly under NESTED LOOPS in the run schedule, and then accesses the second table once for each entry returned by that first table.
  • SORT JOIN / MERGE JOIN
           With the sort merge join, the relevant entries are read from both tables, the two result sets are sorted, and the sorted sets are then merged into the final result set.
  • MERGE JOIN CARTESIAN / BUFFER SORT
           A Cartesian MERGE JOIN (which often appears together with a BUFFER SORT) generates the Cartesian product of the two result sets. If the result sets have x and y elements, the Cartesian product contains x * y records.
  • HASH JOIN
           A hash table is built from the access table (the table directly under HASH JOIN in the run schedule). The entries of the second table are then probed against this hash table. If the join conditions are fulfilled, the data record is returned.
  • VIEW index$_join$_001
           A line such as this in the execution plan indicates an index join. You can use this join type to combine columns from several indexes using a hash join, before the table needs to be accessed. Index joins are especially useful if there are several indexes that are moderately selective, but that are very selective if used in combination.
    14. What is an optimal run schedule?
              An optimal run schedule only needs a minimum of blocks to be read. The following is especially important for optimal data access:
  • In most cases, an index access is better than a full table scan. A full table scan is mainly useful when a large portion of the table entries must be read.
  • Otherwise, you must ensure that the selective conditions are supported by an index. The greater the selectivity of a condition, the more it restricts the result set. Note: there is no direct relationship between the selectivity and the number of different values of a column. For example, if a column only ever contains the default value ' ' (and therefore has only a single value), but in the selection conditions you search for a value other than ' ', the condition is very selective. On the other hand, if a column contains 1,000,000 different values, but 90% of the entries have the value '000000000000000', a comparison of the column with '000000000000000' is very unselective.
           Also refer to the criteria specified below for the index design.
  • Hash joins were deactivated in the OLTP environment up to and including Oracle 9i. In OLAP environments and as of Oracle 10g, they can be used effectively.
  • Merge Join Cartesians should hardly ever occur. If they do, there is likely a serious error (completely obsolete statistics, missing Join or selection conditions, and so on). Also, ensure that the Oracle parameter OPTIMIZER_INDEX_CACHING has a value of 50 or lower.
  • Nested loop joins are usually preferable to sort merge joins. Sort merge joins are only useful if, based on the selection conditions, the expected result sets of both tables are small and there is no good join condition.
  • The access table of a nested loop join or hash join should be the table with the smaller result set.
    15. What utilities are available for further analysis of a processing-intensive SQL statement?
              The following examinations can be used for further analysis, depending on the analyses in the shared cursor cache in ST04:
      a) Actual values in bind variables
                        To decide which conditions really are selective (that is, which significantly limit the result set), typical values of the bind variables must be determined. Use transaction ST05 to create an SQL trace that includes the execution of the statement. The trace must be started on the instance on which the transaction with the relevant statement is executed. Selecting the statement and choosing the "Replace Variables" button gives you a statement with the actual values instead of bind variables.
                        If an SQL trace is not possible ad hoc, it may also help to look at the calling ABAP program, because in various places literals are used instead of ABAP variables, so the values used are visible in plain text.
      b) Number of suitable table entries
                        You can use the actual values from an SQL statement to determine the selectivity by selecting the number of table entries that match the conditions. For the current client, this can be done using transaction SE16: enter the individual conditions for the table and choose the "Number of Entries" button.
                       At Oracle level, you can determine the number of suitable entries using the following command:

SELECT COUNT(*) FROM <table> WHERE <condition_list>;
                        The advantage of this method is that, unlike SE16, it is not limited to the current client.
                       Important: A large number of blocks are imported with these methods, which can increase the I/O load. Do not execute these actions on a large scale when the system has a heavy load.
      c) Column value distribution
                       Use Transaction DB05 for a rough overview of what and how often value combinations of one or several columns appear in the table.
                        DB05 does not state WHICH column combination appears how often. In addition, the frequency is only broken down into intervals (1-10, 11-100, ...). The result of a DB05 analysis is therefore only of limited use.
                       It is particularly useful for columns with a limited number of value combinations to determine each of these combinations including the exact number of occurrences. This can be done as follows at Oracle level:

ALTER SESSION SET DB_FILE_MULTIBLOCK_READ_COUNT = 128;
SELECT <column1>, ..., <columnN>, COUNT(*) FROM <table>
GROUP BY <column1>, ..., <columnN>;
                        The ALTER SESSION accelerates the subsequent SELECT (during the full table scan, the system then reads 128 blocks instead of only 8 or 32 per I/O request from the disk). Since the parameter affects the CBO cost calculation, it should only be set to 128 for this session, not generally.
                       As of Release 4.6C, you can use transaction TAANA to determine the value distribution of columns.
                       Important: A large number of blocks are imported with these methods, which can increase the I/O load. Do not execute these actions on a large scale when the system has a heavy load.
      d) Index fragmentation
                       To prevent a large number of block accesses being caused by a fragmented index, check the index fragmentation as described in Note 771929.
      e) Source of the SQL statement
                        You can use the report name and the source code button in the shared cursor cache to reach the call point in the ABAP program. If the context in which the source code runs is still unclear, a where-used list can provide further information.
                       Information on the triggering transaction can also be determined by capturing a current query in SM66 or by a permanent SQL trace restricted to the table using ST05.
                        If no report name is displayed in the shared cursor cache, the statement was called by a tool or script outside the R/3 system. There are also a number of external tools that can be identified by a separate name.
                       If branching using a source code button fails, you may have dynamically generated source code.
                       If several reports execute an identical statement, the report that first executed the statement is always displayed.
                       You cannot branch to the source code for statements that originated in the R/3 kernel.
      f) Block accesses for a specific statement execution
                       To test alternative access plans or partial accesses of a join, SQL statements can be executed at Oracle level. To ensure that it is easy to find their key performance indicators (such as the number of buffer gets and disk reads for each execution) in the shared cursor cache, you should add a unique and easily identifiable comment to the statement to be executed:

SELECT /* <comment> */ ...
                       Now you can easily search for the statement by using the search term <comment> in the Shared Cursor Cache.
      g) Access paths for varied SQL statements
                       In various cases it can be useful to determine the access path for varied SQL statements, for instance:
    • Effect of hints
    • Effect of more or fewer OR concatenations or IN list entries
    • Effect of changed Oracle parameters
    • Determining the costs for partial statements
                       You can use the "Enter SQL statement" button to determine the access path for any statement in transaction ST05. However, you can only change dynamic Oracle parameters with newer releases and Support Packages.
    16. How can I optimize a processing-intensive SQL statement?
              Before carrying out technical tuning of a processing intensive SQL statement, you should always check from the application view to what extent the SQL statement is even useful. For example, if the SQL statement is the result of an inadequately filled input template, a poorly-qualified SE16 SELECT (report /1BCDWB/DB*), an incorrect query (report AQ*) or a design error in a customer-specific development, the statement must be optimized from a non-technical view or avoided completely. If the statement comes from the standard SAP system, you can check whether a general solution is already available during an SAP note search.
              Below is a general overview of classes of expensive SQL statements and possible technical tuning approaches:
  • Accesses to objects of the Oracle DDIC (DBA views)
           Queries on DBA views are often very processing-intensive. For statements on DBA views, it does not make sense to optimize the access by creating new indexes or specifying hints.
           These accesses are usually caused by monitoring tools (transaction RZ20, BR*TOOLS, external monitoring tools). You must therefore first determine the initiating tool and check to what extent the query can be deactivated completely or executed less frequently.
           If long runtimes and a high number of BUFFER GETs occur during DBA_SEGMENT access in particular, then refer to Note 871455.
  • Accesses within the F4 search help (M_V* views)
           Search help runtimes may vary, depending on the selection criteria. Since not all possible queries can be optimized to the same extent, it is important to concentrate on the main query constellations. Furthermore, you can implement a &SUBSTITUTE_LITERALS& hint in accordance with Note 1008433. In some cases, this hint results in optimal accesses even if there are no histograms.
  • Large number of executions
           If an SQL statement is processing intensive, primarily due to the large number of executions, you must check whether the number of executions can be reduced in the application.
           If the statement in question contains a FOR ALL ENTRIES expression, it may also be possible to increase the blocking factors as described in Note 881083.
  • Large number of processed rows for each execution
           If an SQL statement reads a very large number of data records with each execution, you must check in the application whether the scope of the dataset can be reduced (for example, by specifying more conditions in the WHERE part).
  • Large number of buffer gets in each execution
           Check if the use of resources by Oracle is optimal. Consider the following issues:
    • Does the Oracle Optimizer fail to choose the current best possible access path?
                     If a detailed analysis of the SQL statement shows that Oracle does not choose the optimal access path, you should first check whether the CBO statistics correctly reflect the current dataset. To make sure, you can generate new statistics that are as accurate as possible (Note 588668). You should also ensure that a statistics run is scheduled at least once a week, as described in Note 132861.
                     If this corrects the access path and no side effects occur, you can also create statistics as described in Note 756335 for tables that usually do not receive any statistics as described in Note 122718.
                    If the access path is also not optimal with good statistics, this is often due to "features" of the RBO and CBO, which are described in Note 176754. In such cases you can improve Oracle's access by using hints (see Note 130480) or by adjusting the statistic values (see Note 724545).
                    Additional reasons for an incorrect access path are the wrong Oracle parameter settings or the use of an obsolete patch set. For more information, see Note 618868.
                    To understand the CBO decisions better, also refer to Note 750631, which contains approximation formulas for the cost accounting.
                    Under certain conditions, the displayed access path is also incorrect because Oracle performs another access internally. For more information, see Note 723879.
    • Can the access be optimized by creating an additional index?
                    If the existing selective conditions are not sufficiently supported by an index, a corresponding index can be created for the optimization or - if no side effects are possible - an existing index can be adjusted.
                     Note that in the R/3 system, some central tables are accessed using special index tables (Note 185530 for SD, Note 191492 for MM/WM, and Note 187906 for PP/PM). In the cases described, the access should be performed using an index table so that no additional index has to be created. The same applies to accesses to the table BSEG and its index tables such as BSIS, BSAS, BSID, and BSAD.
    • Is the number of buffer gets inexplicably high?
                     If possible, roughly calculate how many blocks must be read for the statement. A maximum of five block accesses is required per execution for each INDEX UNIQUE SCAN (more precisely: BLEVEL of the index + 1); the query after this list shows how to determine the BLEVEL.
                     If INDEX RANGE SCANs read significantly more blocks than your estimate would lead you to expect, this may be due to index fragmentation. For more information, see Note 771929.
                    When you use LIKE conditions, a placeholder at the beginning or in the middle of the search template may cause a greater number of buffer gets, since Oracle can only optimally use placeholders that are at the end of the search term.
                    Another possible reason for a large number of Buffer Gets is a high volume of network communication between the database and SAP. However, this effect is usually negligible.
                     If you execute mass-data operations ("FROM TABLE <internal_table>" expressions in the ABAP statement), the underlying SQL statement is executed once for each data record of <internal_table>, while the system records only a single execution in the shared cursor cache. As a result, the number of block accesses per recorded execution appears correspondingly high.
                    If a table contains more than 255 columns, for technical reasons, Oracle splits the records into fragments that each contain 255 columns or fewer ("Intra Block Chaining"). If the system accesses columns with a COLUMN_ID greater than 255, an additional buffer get is required. For columns with a COLUMN_ID greater than 511, two additional buffer gets are required, and so on.
                    If the query contains a sorting such as ORDER BY, the system may have to sort a large data volume before it returns the first data records based on this sorting. Therefore, a large number of buffer gets may be required, even if the system eventually transfers only few data records. In these cases, an index that supports sorting or the prevention of ORDER BY can solve the problem.
                    In individual cases, the high number of buffer gets is caused by the actual run schedule differing from the displayed run schedule. This can be checked as described in Note 723879.
                    Also refer to SAP Note 712624 for further reasons for a large number of Buffer Gets.
  • Large number of disk reads per execution
           A high number of disk reads is often associated with full table scans, because the blocks read during a full table scan are displaced from the buffer more quickly than blocks that were read during an index access. In such a case, you must use appropriate measures to force an index access (create a new index, adjust the statistics, hints, ...).
           A large number of disk reads often occurs in connection with index range scans and subsequent table accesses with ROWID if the clustering factor of the index is high. For more information, see Note 832343.
           To avoid blocks of a critical table being displaced from the buffer pool after some time and later being read from disk again, you can set up a keep pool as described in Note 762808.
           If the application needs to read a lot, or all, of the entries of a table repeatedly, and the table is not too large, consider buffering the table completely in the SAP system. You must ensure that the delayed SAP buffer synchronization will cause no problems for the application, and that the table's change rate is not too high.
           Otherwise the same applies during the analysis of statements with a high number of disk reads as for statements with a high number of buffer gets.
  • High CPU-time
           The usual cause of high CPU load is a large number of BUFFER GETs. Note 712624 contains analysis options and more characteristic reasons for increased CPU consumption.
  • High Elapsed time
           In many cases, a high elapsed time that cannot be explained by DISK READs or BUFFER GETs is caused by lock situations, especially enqueues (refer to Note 745639). If necessary, you can use ORADEBUG (refer to Note 613872) to create a trace of the statement in question to obtain more information.
  • Large number of direct write operations
           Statements with a high number of direct write operations can be analyzed as follows (depending on the triggering operation):
     • Sorting: Check whether the statement must sort an unnecessarily large amount of data, or whether the sorting can be avoided completely (for example, by choosing an index that supports the sort order).
    • LOB accesses: Check whether it would be useful to cache LOB data, as described in Note 563359.
    • Parallel query: Check whether the relevant SQL statement can be processed usefully without a parallel query (see Note 651060).
    • Hash operations: Check whether the data scope of a hash operation can be reduced, or if the hash operation can be completely avoided (for example, by avoiding an undesirable hash join).
    • Bitmap operations: Check whether a bitmap operation (for example star transformation) can be usefully optimized.
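              To support the buffer get estimate mentioned in the list above (BLEVEL of the index + 1 for an INDEX UNIQUE SCAN), you can determine the height of the relevant indexes as follows, assuming the index statistics are current:

SELECT
  INDEX_NAME,
  BLEVEL,
  LEAF_BLOCKS
FROM
  DBA_INDEXES
WHERE
  TABLE_NAME = '<table>';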
    17. What must be considered in terms of the index design?
              For more information, see Note 912620.
    18. What must I take into consideration with regard to the query design?
              To ensure optimal SQL performance, you must take the following rules into account when you create an SQL statement:
  • Ensure that the selective conditions in the WHERE clause are supported by an index.
  • Select as few columns as possible.
  • Check whether it is sufficient to use only the data existing in an index for a query so that a table access becomes obsolete.
  • Avoid using not-equal conditions because these cannot be analyzed optimally by the cost-based optimizer (CBO). Use range conditions instead (if possible).
  • Specify as many selective conditions as possible.
  • Avoid using unnecessary conditions that are not selective (such as "GJAHR" BETWEEN '0000' and '9999') because these are confusing for the cost-based optimizer (CBO).
  • Avoid using unneeded range conditions because the cost-based optimizer (CBO) can only analyze them in a limited way (for example, "GJAHR" BETWEEN '2010' AND '2010' --> better: "GJAHR" = '2010'). See the examples after this list.
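              The following schematic statements illustrate the last two rules; the table "BKPF" and its columns are only examples:

-- Confusing for the CBO: unselective range condition
SELECT * FROM "BKPF"
WHERE "BUKRS" = :A0 AND "GJAHR" BETWEEN '0000' AND '9999';
-- Better: omit the unselective condition, select only needed columns
SELECT "BELNR", "GJAHR" FROM "BKPF"
WHERE "BUKRS" = :A0;

-- Unneeded range condition
SELECT "BELNR" FROM "BKPF"
WHERE "BUKRS" = :A0 AND "GJAHR" BETWEEN '2010' AND '2010';
-- Better: use "=" instead of the range
SELECT "BELNR" FROM "BKPF"
WHERE "BUKRS" = :A0 AND "GJAHR" = '2010';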
    19. What analysis tools are available at Oracle level?
              For various reasons, an SQL analysis may be useful at database level directly (for example, if no R/3 functions are available in a Java-only environment). Therefore, useful Oracle functions for SQL analysis are described below:
  • Creating an SQL trace
           You can use ORADEBUG (Note 613872) or DBMS_SYSTEM (Note 602667) to create an SQL trace of any session.
           As of Oracle 10g, you can use the AUDIT command to activate an enhanced auditing of SQL statements in accordance with Note 1128663.
  • Displaying an Explain plan
           With the following command, the run schedule is displayed for all subsequent SQL statements, instead of them being executed:
           SET AUTOTRACE TRACEONLY EXPLAIN
           You can deactivate this mode with:
           SET AUTOTRACE OFF
           If you want to create an Explain for a statement with bind variables, the above approach does not work. You can use the following option instead:
           EXPLAIN PLAN FOR <sql_statement>;
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
  • Display block accesses of an SQL statement
           Among other things, the following command counts the Disk Reads and Buffer Gets of the subsequent SQL statements (without the actual result of the query being returned):
           SET AUTOTRACE TRACEONLY
           You can deactivate this mode with:
           SET AUTOTRACE OFF
  • PLAN statistics (Oracle >=9i)
           As of Oracle 9i you can collect detailed PLAN statistics such as processed records, buffer gets, disk reads or required time. Since the collection of these extensive statistics has a significant effect on database performance, this option is normally deactivated. You can use SQLPLUS to activate and evaluate these PLAN statistics on session level for individual tests. To do this, proceed as follows:
    • Activating the PLAN statistics:

      ALTER SESSION SET STATISTICS_LEVEL=ALL;
    • Executing the relevant SQL statements
    • Query from V$SQL_PLAN_STATISTICS_ALL

      SELECT
        SUBSTR(LPAD(' ', PS.DEPTH) || OPERATION || ' ' ||
          OPTIONS || DECODE(OBJECT_NAME, NULL, NULL, ' (' ||
          OBJECT_NAME || ')'), 1, 40) ACTION,
        PS.COST,
        PS.CARDINALITY,
        PS.LAST_OUTPUT_ROWS "ROWS_PROC",
        PS.LAST_CR_BUFFER_GETS + PS.LAST_CU_BUFFER_GETS BUFFER_GETS,
        PS.LAST_DISK_READS DISK_READS,
        PS.LAST_ELAPSED_TIME TIME
      FROM
        V$SQL_PLAN_STATISTICS_ALL PS,
        V$SQL S
      WHERE
        S.SQL_TEXT LIKE '<sql_statement_pattern>' AND
        PS.ADDRESS=S.ADDRESS AND
        PS.CHILD_NUMBER = S.CHILD_NUMBER
      ORDER BY
        PS.CHILD_NUMBER,
        PS.ID;
                    The columns COST and CARDINALITY contain the CBO estimates, while the columns ROWS_PROC, BUFFER_GETS, DISK_READS and TIME (in microseconds) display the actual statistics of the last execution.
    • Deactivating the PLAN statistics:

      ALTER SESSION SET STATISTICS_LEVEL=TYPICAL;
  • PLAN statistics (Oracle >= 10g)
           As of Oracle 10g, you can also use the GATHER_PLAN_STATISTICS hint:
    • Execute the SQL statement using the GATHER_PLAN_STATISTICS hint:

      SELECT /*+ GATHER_PLAN_STATISTICS */ ...
    • Indicate the execution plan using the following command:

      SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,
        NULL, 'ALLSTATS'));
                    The records and the time for each execution step are contained in the columns "A-Rows" and "A-Time".
  • Sample data
           In some situations, it is useful to get an impression of the data composition without having to scan the entire table or without having to rely on how representative the first records of a table are. You can use the following command to randomly select data records:

SELECT * FROM <table> SAMPLE (<percent>);
    20. What information do the "Estimated Costs" and "Estimated Rows" provide for an SQL statement?
              An EXPLAIN displays "Estimated Costs" and "Estimated Rows", which are simply the CBO's calculation results (refer to Note 750631). Since these results are based on a number of assumptions (equal distribution of column values, up-to-date statistics) and depend on the database parameter settings, the calculated costs and rows are useful only within a margin of error. High "Estimated Costs" and "Estimated Rows" are therefore neither a sufficient nor a necessary indication of an expensive SQL statement. Also, the calculated costs have no actual effect on the performance: the deciding factors are always the actual costs, in the form of BUFFER GETs, DISK READs, and processed rows.
    21. Is it possible to determine a buffer quality for individual tables?
              The Oracle Buffer Quality (see Note 618868 for more information) is a global statistic of the database. Since individual SQL statements can strongly affect this global value (both positively and negatively), a table-specific buffer quality is desirable. You can use the following SQL statement (based on V$SEGMENT_STATISTICS) to determine the 20 tables that have the poorest buffer quality (this query only considers tables with at least 10,000 buffer gets):

SELECT * FROM
  (SELECT
     S1.OBJECT_NAME OBJECT,
     S1.VALUE LOGICAL_READS,
     S2.VALUE PHYSICAL_READS,
     ROUND((S1.VALUE - S2.VALUE) / S1.VALUE * 100, 2) "QUALITY (%)"
   FROM
     V$SEGMENT_STATISTICS S1, V$SEGMENT_STATISTICS S2
   WHERE
     S1.OBJ# = S2.OBJ# AND
     S1.VALUE > 10000 AND
     S1.STATISTIC_NAME = 'logical reads' AND
     S2.STATISTIC_NAME = 'physical reads'
   ORDER BY 4 ASC
  )
WHERE ROWNUM <= 20;
    22. What happens when there is an overflow of the Buffer Get?
              For Oracle 9i or lower, if an SQL statement executes more than two billion buffer gets, the value in the Shared Cursor Cache overflows into the negative range. For Oracle 10g, the limit is four billion; when this limit is reached, the count starts again at 0. In both cases, a large number of buffer gets can no longer be recognized immediately. In addition, the Shared Cursor Cache analysis in many cases ignores statements with negative buffer gets values, so the most expensive SQL statements in terms of buffer gets are not displayed at all. To recognize the affected SQL statements despite the overflow, you have the following options:
  • Check whether there are accesses with a negative number of Buffer Gets:

    SELECT * FROM V$SQL WHERE BUFFER_GETS < 0;
  • Check the plausibility of the data - if the system displays a number of Buffer Gets that is significantly lower than is possible, an overflow may be responsible.
  • When you use Oracle 10g, also consider the AWR data from DBA_HIST_SQLSTAT, which is aggregated on a snapshot basis, since overflows between two snapshots (generally taken at one-hour intervals) are improbable; see the example below.
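              The following sketch (Oracle 10g or higher, provided AWR snapshots are available) displays the development of the buffer gets of a single statement for each snapshot interval:

SELECT
  SNAP_ID,
  EXECUTIONS_DELTA,
  BUFFER_GETS_DELTA,
  DISK_READS_DELTA
FROM
  DBA_HIST_SQLSTAT
WHERE
  SQL_ID = '<sql_id>'
ORDER BY
  SNAP_ID;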
    23. How can I determine when a complex access is made in the execution plan?
              We are unable to provide a general answer to this question. However, the following individual approaches are available:
  • As of Oracle 10g, the Active Session History (V$ACTIVE_SESSION_HISTORY) can provide very detailed information about a session's current activities, including the segment currently being accessed; see the example after this list.
  • If the session executes read I/O accesses such as "db file sequential read" or "db file scattered read", you can use the event parameters P1 (file number) and P2 (block number) to determine the segment currently being accessed (see Note 619188).
  • If CPU usage is high, you can use V$SEGMENT_STATISTICS to determine whether the number of buffer gets for individual objects involved in the join increases significantly. However, note that V$SEGMENT_STATISTICS contains global data and does not solely refer to the long-running program.
  • If the session executes extensive sort operations and hash operations, you can use V$SQL_WORKAREA_ACTIVE to determine which steps in the access plan currently require the PGA memory and PSAPTEMP tablespace, and to what extent.
  • For certain activities such as full table scans and sorting activities, you can also use V$SESSION_LONGOPS to determine the current activity.
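              For the Active Session History option mentioned in the list above, a query such as the following sketch displays the most recent samples of a session. CURRENT_OBJ# can be mapped to a segment using DBA_OBJECTS.OBJECT_ID; a value of -1 means that no segment access was sampled:

SELECT
  SAMPLE_TIME,
  SESSION_STATE,
  EVENT,
  SQL_ID,
  CURRENT_OBJ#
FROM
  V$ACTIVE_SESSION_HISTORY
WHERE
  SESSION_ID = <sid>
ORDER BY
  SAMPLE_TIME DESC;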
    24. Can I see a sample analysis?
              The following SQL statement causes several thousand disk reads and buffer gets per execution, although hardly any entries are returned:

SELECT *
FROM "LTAP" T_00 , "LTAK" T_01
WHERE
  ( T_01 . "MANDT" = :A0 AND
    T_01 . "LGNUM" = T_00 . "LGNUM" AND
    T_01 . "TANUM" = T_00 . "TANUM" ) AND
  T_00 . "MANDT" = :A1 AND
  T_00 . "LGNUM" = :A2 AND
  T_00 . "PQUIT" = :A3
              The execution path contains a sort merge join:

SELECT STATEMENT ( Estimated Costs = 1,712 , Estimated #Rows = 52,660 )
     MERGE JOIN
        TABLE ACCESS BY INDEX ROWID LTAP
            INDEX RANGE SCAN LTAP~0
        SORT JOIN
             TABLE ACCESS BY INDEX ROWID LTAK
                INDEX RANGE SCAN LTAK~R
              A nested loop join would most likely be preferable to the sort merge join here. Because - leaving aside the unselective MANDT - the WHERE clause contains selection conditions only for LTAP (columns LGNUM and PQUIT), you can check whether LTAP is suitable as the access table for a nested loop join. A suitable index already exists:

NONUNIQUE  Index  LTAP~M

Column Name                            #Distinct

MANDT                                          1
LGNUM                                          7
PQUIT                                          2
MATNR                                     24,280
              As you can see in the #Distinct column, neither LGNUM nor PQUIT have many different values. However, this information does not yet allow any direct conclusions to be drawn on the selectivity of the conditions.
              Even without an SQL trace, in this case you can already read the PQUIT value ' ' from the ABAP source code:

SELECT LTAP~LGNUM LTAP~TANUM LTAP~TAPOS
     LTAP~VLTYP LTAP~NLTYP LTAK~BWLVS
    LTAK~BDATU LTAK~BZEIT
    INTO CORRESPONDING FIELDS OF TABLE IT_LTAP
      FROM LTAP JOIN LTAK
         ON  LTAK~LGNUM = LTAP~LGNUM AND
            LTAK~TANUM = LTAP~TANUM
      WHERE LTAP~LGNUM = P_LGNUM
         AND LTAP~PQUIT = ' '.
              Now you can determine the exact value distribution for LGNUM and PQUIT using a GROUP BY:

SELECT LGNUM, PQUIT, COUNT(*) FROM LTAP GROUP BY LGNUM, PQUIT;

LGN P   COUNT(*)
--- - ----------
I01 X         10
I02            5
I02 X      33955
I03 X       3088
P01          164
P01 X      81941
S01            2
S01 X      67807
S02           10
S02 X       3201
W01           33
W01 X     139158
              You can tell that only very few entries exist for the condition PQUIT = ' ' that is used. The condition is therefore very selective, even though PQUIT assumes only two different values. A nested loop join with access via LTAP is therefore far preferable to a sort merge join.
              The following options are available to persuade Oracle to use a nested loop join:
  • Specification of a hint (see Note 772497)
  • Sufficiently increase the statistics value for the distinct values of PQUIT, to make access using the LTAP~M index more attractive for Oracle (Note 724545).
              One solution within the application is to split the join into two separate statements as described in Note 187935.
              As an alternative, you can also use a SUBSTITUTE_LITERALS hint and histograms (refer to Note 811852).
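              For illustration, at Oracle level a nested loop hint for the statement above might look like the following sketch (in ABAP, the hint would be specified as described in Note 772497; LEADING and USE_NL are standard Oracle hints):

SELECT /*+ LEADING(T_00) USE_NL(T_01) */ *
FROM "LTAP" T_00 , "LTAK" T_01
WHERE
  ( T_01 . "MANDT" = :A0 AND
    T_01 . "LGNUM" = T_00 . "LGNUM" AND
    T_01 . "TANUM" = T_00 . "TANUM" ) AND
  T_00 . "MANDT" = :A1 AND
  T_00 . "LGNUM" = :A2 AND
  T_00 . "PQUIT" = :A3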
    25. Is the ST05 data always reliable?
              In a number of situations, the measured values may not be correct in transaction ST05. For example:
  • Some operating systems may not measure times on a microsecond basis. Due to an incorrect time measurement on TRU64, for example, all time data may be a multiple of 976 microseconds, because 976 microseconds (1,000,000 / 1,024 microseconds) is the most precise time resolution possible there.
  • Check whether there are problems with time measurement in the system (for example, AMD x64 -> Notes 1036194, 1047734). If in doubt, you can check the plausibility of high relative time data by comparing it to the absolute time data in the extended trace. Only if the absolute times increase at least at the same rate are the statement-specific relative runtimes correct.
  • The system may assign runtimes to the wrong statements, and may not display other statements at all, if the SAP kernel mixes up data during the time measurement. This may occur in particular if the system writes messages of the following type to the work process trace file (dev_w<number>) at the same time:

    *** WARNING ==> TrStackPeek: record 1 requested, which is not filled
                                in stack (0)
           In such a case, log a customer message with SAP, so the cause for this problem can be analyzed.
  • SQL accesses may not be logged at all if there are problems at a deeper level. It is also worth looking at the relevant work process trace file. The following error indicates that SQL trace records could not be written successfully:

    M  *** ERROR => TrWriteEntry: TrBuffAddRec failed with rc=5 (No Memory)
                                  (253 bytes) [trfile.c     1342]
    26. Where can I find further information on SQL optimization?
              In addition to covering general performance topics, the final section of Note 618868 contains references to books, SAP training and SAP services that have the SQL optimization as a central topic.
Header Data
Release Status:Released for Customer
Released on:03.01.2013  09:09:27
Master Language:German
Priority:Recommendations/additional info
Category:FAQ
Primary Component:BC-DB-ORA Oracle
Secondary Components:SV-BO Backoffice Service Delivery
Affected Releases
Release-Independent
Related Notes
1601951 - Self Service 'SQL Statement Tuning' - Prerequisites and FAQ
1128663 - Oracle 10g: SQL trace using auditing
1100926 - FAQ: Network performance
1093107 - Costly SQL statement due to synchronous matchcodes
1037755 - Performance problems with ASSM tablespaces
913247 - Performance problems due to open changes
912620 - FAQ: Oracle indexes
896717 - Problems with tables in ASSM tablespaces
881083 - Blocking factors on Oracle-based systems
853576 - Oracle 10g: Performance analysis w/ ASH and Oracle Advisors
832343 - FAQ: Clustering factor
825653 - Oracle: Common misconceptions
806554 - FAQ: I/O-intensive database operations
797629 - FAQ: Oracle histograms
793113 - FAQ: Oracle I/O configuration
772497 - FAQ: Oracle Hints
771929 - FAQ: Index fragmentation
767414 - FAQ: Oracle latches
762808 - FAQ: Keep pool
756335 - Statistics in tables w/ heavily fluctuating volumes of data
750631 - Approximations for cost calculation of the CBO
745639 - FAQ: Oracle enqueues
724545 - Adjusting the CBO statistics manually using DBMS_STATS
723879 - Explain: System displays an incorrect access path
712624 - High CPU consumption by Oracle
706478 - Preventing Basis tables from increasing considerably
651060 - FAQ: Oracle Parallel Execution
619188 - FAQ: Oracle wait events
618868 - FAQ: Oracle performance
613872 - Oracle traces with ORADEBUG
602667 - Perform a SQL Trace on an oracle utility or external program
588668 - FAQ: Database statistics
563359 - Performance optimization for tables with LOB columns
502782 - Composite SAP Note ora-4030
354080 - Note collection for Oracle performance problems
329914 - Expensive SQL statements composite note
191492 - Performance: Customer developments in MM/WM
187906 - Performance: Customer developments in PP and PM
185530 - Performance: Customer developments in SD
176754 - Problems with CBO and RBO
135048 - Long runtimes statements w/ ROWNUM <= condition
132861 - CBO: Statistics creation with SAPDBA or BRCONNECT
131372 - CBO: Tuning of expensive SQL statements
130480 - Database hints in Open SQL for Oracle
122718 - CBO: Tables with special treatment
48230 - Parameterization for SELECT ... FOR ALL ENTRIES statement
3807 - Error messages regarding rollback and undo segments