Note 407260 - FAQs: Compression of InfoCubes

Summary
Symptom
This note explains the compression of InfoCubes on ORACLE as the database platform.
Compression on other database platforms may differ from this.
Other terms
InfoCubes, Compression, Aggregates, F-table, E-table, partitioning,
ORA-4030, ORACLE, Performance
Reason and Prerequisites
Questions:
    1. What is the extent of compression we should expect from the data we are loading?
    2. If the compression is stopped, will we lose any data from the cube?
    3. What is the optimum size for a chunk of data to be compressed?
    4. Does compression lock the entire fact table, even if only selected records are being compressed?
    5. Should compression run with the indexes on or off?
    6. What can I do if the performance of the compression is bad or becomes bad? What can I do if query performance after compression is bad?
Solution
In general:
First of all, you should check whether the P-index on the E fact table exists. If this index is missing, compression is practically impossible. If the index does not exist, you can recreate it by activating the cube again. Check the activation log to see whether the creation was successful.
There is one exception to this rule: if only one request is chosen for compression and it is the first request to be compressed for that cube, the P-index is dropped before the compression and recreated automatically afterwards. This is done for performance reasons.
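
At the database level you can verify the index with a short query. The following is a sketch only, assuming a hypothetical cube MYCUBE whose E fact table is /BIC/EMYCUBE; run it as the SAP schema owner and adjust the names to your system:

    -- List all indexes on the E fact table of the hypothetical cube MYCUBE.
    -- The concatenated dimension index usually ends in 'P', the primary
    -- index in '0'; STATUS shows whether an index is usable.
    SELECT index_name, uniqueness, status
      FROM user_indexes
     WHERE table_name = '/BIC/EMYCUBE';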

Answers:
    1. The compression ratio is completely determined by the data you are loading. Compression only means that records with an identical 'logical' key in the fact table (the logical key comprises all dimension identities except the 'technical' package dimension) are combined into a single record; the first SQL sketch after the answers illustrates this.
              For example, if you load data on a daily basis but your cube only contains the month as its finest time characteristic, you might get a compression ratio of 1/30.
              At the other extreme, if every record you load differs from all records loaded before (e.g. because each record contains a sequence number), the compression ratio will be 1, which means there is no compression at all. Nevertheless, even in this case you should compress the data if you are using partitioning on the E fact table, because partitioning is only used for compressed data. See Note 385163 for more details about partitioning.
              If you are absolutely sure that there are no duplicates in the records, you can consider the optimization described in Note 375132.
    2. The data should never become inconsistent through compression. Even if you stop the process manually, a consistent state is reached. Whether the requests (or at least some of them) remain compressed or whether the changes are rolled back depends on the phase the compression was in when it was canceled.
              The compression of a single request can be divided into two main phases.
      a) In the first phase the following actions are executed:
                       Insert or update every row of the request to be compressed into the E fact table.
                       Delete the entry for the request from the package dimension of the cube.
                       Change the 'compr-dual' flag in table RSMDATASTATE.
                       Finally, a COMMIT is executed.
      b) In the second phase the remaining data in the F fact table is deleted.
                       This is done either by a 'DROP PARTITION' or by a 'DELETE'. As this data is no longer accessible in queries (the entry in the package dimension has been deleted), it does not matter if this phase is terminated.
                       Concluding this:
                       If the process is terminated while the compression of a request is in phase (a), the data is rolled back; if it is terminated in phase (b), no rollback is executed. The only problem in the latter case is that the F fact table may contain unusable data. This data can be deleted with the function module RSCDS_DEL_OLD_REQUESTS. To run this function module, you only have to enter the name of the InfoCube. Optionally, you can also specify the dimension ID of the request to delete (if you know it); if no ID is specified, the module deletes all entries without a corresponding entry in the package dimension. (The second SQL sketch after the answers shows how to find such entries.)
                       If you are compressing several requests in a single run and the process breaks during the compression of request x, all smaller requests are committed, and only request x is handled as described above.
    3. The only size limitation for compression is that the complete rollback information for the compression of a single request must fit into the rollback segments. For every record of the request, either the corresponding record in the E fact table is updated or the record is newly inserted. Since the deletion from the F fact table is normally done with a 'DROP PARTITION', it is not critical for the rollback. As neither operation is expensive in terms of rollback space, this limit should rarely be a problem.
              Performance depends heavily on the hardware. As a rule of thumb, you can expect to compress about 2 million rows per hour if the cube contains no non-cumulative key figures, and about 1 million rows per hour if it does.
    4. It is not allowed to run two compressions concurrently on the same cube. Loading into a cube on which a compression is running is possible, however, as long as you do not try to compress requests that are still in the phase of loading/updating data into the cube. Compression is also forbidden while a selective deletion is running on the cube and while an attribute/hierarchy change run is active.
    5. It is very important that either the 'P' index or the primary index '0' exists on the E fact table during the compression.
              Verify the existence of this index with transaction DB02. Without one of these indexes, the compression will not run.
              If you are running queries in parallel to the compression, you have to leave the secondary indexes active.
              If you encounter the error ORA-4030 during the compression, drop the secondary indexes on the E fact table; this can be done with transaction SE14. If you use the tab strip in the Administrator Workbench, the secondary indexes on the F fact table are dropped as well. (If there are requests smaller than 10 percent of the F fact table, the indexes on the F fact table should remain active, because reading those requests can then be sped up by the secondary index on the package dimension.) Afterwards, start the compression again.
              Deleting the secondary indexes on the E fact table of an InfoCube that is to be compressed can be useful (sometimes even necessary) to prevent resource shortages on the database. Since the secondary indexes are needed for reporting (not for compression), queries may take much longer while the secondary E table indexes are missing.
              If you want to delete the secondary indexes only on the E fact table, use the function module RSDU_INFOCUBE_INDEXES_DROP (specify the parameters I_INFOCUBE = <techn. cube name> and I_FACTTAB = <name of E fact table>). To rebuild the indexes, use the function module RSDU_INFOCUBE_INDEXES_REPAIR (same parameters as above).
              To check which indexes exist, you can use transaction RSRV and select the elementary database check for the indexes of an InfoCube and its aggregates. This check is more informative than the lights on the performance tab strip in the InfoCube maintenance.
    6. As already stated above, it is absolutely necessary that a concatenated index over all dimensions exists. This index normally has the suffix 'P'. Without this index, compression is not possible. If the index does not exist, the compression tries to build it; if that fails (for whatever reason), the compression terminates.
              If you normally do not drop the secondary indexes during compression, these indexes may degenerate after a number of compression runs; you should therefore rebuild them from time to time. Otherwise you may see performance degrade over time.
              As compression changes the distribution of data between the E fact table and the F fact table, query performance can be influenced significantly. Normally compression should lead to better performance, but you have to make sure that the statistics are up to date so that the optimizer can choose an appropriate access path. In particular, after the first compression of a significant amount of data, the E fact table of the cube should be analyzed; otherwise the optimizer still assumes that this table is empty. For the same reason, you should not analyze the F fact table at a moment when all requests are compressed, because the optimizer would then assume that the F fact table is empty. Analyze the F fact table when it contains a normal amount of uncompressed requests. (The third sketch after the answers shows a corresponding statistics call.)
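
Illustrative SQL sketches:
The following three sketches are illustrations only, not part of the compression procedure. All table, column, and schema names (cube MYCUBE, tables /BIC/EMYCUBE, /BIC/FMYCUBE, /BIC/DMYCUBEP, schema SAPR3) are placeholders that must be adjusted to your system.
For answer 1: logically, compression aggregates the F fact table into the E fact table by grouping on all dimension keys except the package dimension. A minimal sketch with made-up names (a real fact table has one KEY_ column per dimension, and the real compression inserts or updates rather than only inserting):

    -- Illustration only: rows that agree on every dimension key except the
    -- package dimension collapse into a single row of the E fact table.
    INSERT INTO e_fact (key_time, key_cust, amount)
    SELECT key_time, key_cust, SUM(amount)
      FROM f_fact
     GROUP BY key_time, key_cust;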
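
For answer 2: before running RSCDS_DEL_OLD_REQUESTS you can check at the database level whether the F fact table still contains rows whose request has already been removed from the package dimension. A sketch for the hypothetical cube MYCUBE (package key column KEY_MYCUBEP, package dimension table /BIC/DMYCUBEP):

    -- Rows left over from a terminated compression: F fact table rows
    -- without a matching entry in the package dimension. Such rows are
    -- invisible to queries and can be removed by RSCDS_DEL_OLD_REQUESTS.
    SELECT f.key_mycubep, COUNT(*) AS leftover_rows
      FROM "/BIC/FMYCUBE" f
     WHERE NOT EXISTS (SELECT 1
                         FROM "/BIC/DMYCUBEP" d
                        WHERE d.dimid = f.key_mycubep)
     GROUP BY f.key_mycubep;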
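
For answer 6: statistics are normally refreshed from within BW (performance tab strip or BRCONNECT); at the Oracle level the effect corresponds to a DBMS_STATS call like this sketch (schema and table name are placeholders):

    -- Gather fresh optimizer statistics on the E fact table after the first
    -- compression of a significant amount of data, so that the optimizer no
    -- longer assumes the table is empty.
    BEGIN
      DBMS_STATS.GATHER_TABLE_STATS(
        ownname => 'SAPR3',
        tabname => '/BIC/EMYCUBE',
        cascade => TRUE);  -- also refresh statistics of the indexes
    END;
    /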

Header Data
Release Status: Released for Customer
Released on: 14.07.2010 07:17:36
Master Language: English
Priority: Recommendations/additional info
Category: Consulting
Primary Component: BW-BEX-OT-DBIF-CON Condensor
Secondary Components: BW-SYS-DB-ORA BW ORACLE
Affected Releases
Software Component   Release   From Release   To Release   And Subsequent
SAP_APPL             46C       46C            46C
SAP_BW               20        20B            20B
SAP_BASIS            46        46D            46D
Correction delivered in Support Package
Support Packages   Release   Package Name
SAP_BW             20B
Related Notes
1461926 - FAQ: BW report SAP_INFOCUBE_DESIGNS
1233601 - Message RSDA 228 when archived data is deleted
590370 - Too many uncompressed requests (F table partitions)
486395 - Error during request deletion of F-fact table, 20BSP20-21CSP
455281 - Selective deletion dumps with ORA-03113 in report GP
430486 - Overview/repair of F fact table of a BW InfoCube
428440 - Selective deletion dumps at RAISE x_message
409338 - Compression and upgrade from BW1.2 to BW2.0
385163 - Partitioning on ORACLE since BW 2.0
375132 - Performance optimization for InfoCube condensation
