How to use Repository Diagnostic Tool (reposcan)

Introduction

The Repository Diagnostic Tool (RDT) scans for and repairs inconsistencies between the File Repository Server (FRS) and the CMS database. Such inconsistencies can result from, among other causes, hot backups, disaster recovery, or a network failure during a database write, insert, or update. The tool is not guaranteed to catch or fix every inconsistency between the FRS and the CMS database.
The RDT scans for two types of inconsistencies, listed below by type.

a - Object to file inconsistencies:

These are inconsistencies that can occur between InfoObjects in the CMS database and the corresponding files in the File Repositories. For example:
  • The object exists in the CMS database, but there is no corresponding file in the FRS.
  • The file exists in the FRS, but there is no corresponding object in the CMS database.
  • The size of the file does not match the InfoObject file size.
  • The FRS folder is empty.

b - InfoObject metadata inconsistencies:

These are inconsistencies that can occur in an InfoObject's object definition (metadata) in the CMS database. For example:
  • The object has a missing or invalid Parent Object ID.
  • The object has a missing or invalid Owner Object ID.
  • The object has a missing or invalid Submitter Object ID.
  • The object's last successful instance is missing or invalid.
  • The object references a calendar that doesn't exist.
  • The preferred server does not exist.
  • The event or events that this object is waiting on do not exist.
  • This object triggers an event that does not exist.
  • Orphaned Access Control entry.
  • A specific user account has multiple favorites folders.

How to execute the RDT

The RDT can be found in the following locations:
  • On Windows: reposcan.exe is in "<INSTALLDIR>\BusinessObjects Enterprise 12.0\win64_x64\"
    Usage: reposcan.exe -dbdriver <dbdriver> -dbkey <cluster_key> -connect <dbconnectstring> -inputfrsdir <inputfrsdir> -outputfrsdir <outputfrsdir> [options...]
  • On Unix: boe_reposcan.sh is in "<INSTALLDIR>/bobje/"
    Usage: $BOBJDIR/boe_reposcan.sh -dbdriver <dbdriver> -dbkey <cluster_key> -connect <dbconnectstring> -inputfrsdir <inputfrsdir> -outputfrsdir <outputfrsdir> [options...]
  • The list of options is identical on all operating systems, except for the dbdriver "sqlserverdatabasesubsystem", which does not exist on Unix.
Details of each option can be found in the official RDT documentation for BusinessObjects Enterprise XI 3 and SAP Business Intelligence Platform 4.x.
The options are:
-dbdriver <dbdriver>
the name of the driver for the database to connect to. Examples:
For DB2: -dbdriver db2databasesubsystem -connect "UID=<user>;PWD=<password>;DSN=<dsn>"
For MySQL: -dbdriver mysqldatabasesubsystem -connect "UID=<user>;PWD=<password>;DSN=<databasename>;HOSTNAME=<hostname>;PORT=<port>"
For Oracle: -dbdriver oracledatabasesubsystem -connect "UID=<user>;PWD=<password>;DSN=<dsn>"
For SQL Server: -dbdriver sqlserverdatabasesubsystem -connect "UID=<user>;PWD=<password>;DSN=<dsn>"
For SQL Anywhere: -dbdriver sqlanywheredatabasesubsystem -connect "UID=<user>;PWD=<password>;DSN=<dsn>"
For Sybase: -dbdriver sybasedatabasesubsystem -connect "UID=<user>;PWD=<password>;DSN=<dsn>"
For SAP HANA: -dbdriver newdbdatabasesubsystem -connect "UID=<user>;PWD=<password>;DSN=<dsn>"
-dbkey <cluster_key>
(BI 4.x and later only) the cluster key for your BI platform deployment. For more information on cluster keys, see “Securing BI Platform” in the SAP BusinessObjects Business Intelligence Platform Administrator Guide.
-connect <dbconnectstring>
the database connection string has to be in the format "UID=<user>;PWD=<password>;DSN=<dsn>".
-inputfrsdir <inputfrsdir>
the root directory of the Input File Repository Server's file store.
-outputfrsdir <outputfrsdir>
the root directory of the Output File Repository Server's file store.
-optionsfile <optionsfile>
the path to the file that specifies the reposcan options.
-outputdir <outputdir>
the directory for storing XML result files.
-repair
reposcan should repair whichever inconsistencies it finds (default: off).
-count <max errors>
the maximum number of inconsistencies to scan for (default: 1000).
-scanfrs
only scan inconsistencies related to the FRS (default: always scan the FRS).
-scancms
only scan inconsistencies in infoobjects (default: always scan infoobjects).
-startid <objid>
start searching for inconsistencies from the given object ID. (default: 0)
-submitterid <objid>
repair submitter ID inconsistencies with the given user ID. (default: no repair)
-requestport <port> (*)
specify which port to use for CORBA
-numericip (*)
specify that numeric IPs should be used for CORBA
-ipv6 <interface> (*)
specify which numeric IPv6 address to use for CORBA
-port <interface> (*)
specify which IPv4 interface to use for CORBA
-threads <count> (*)
specify how many CORBA threads to use
-protocol ssl (**)
specify that Reposcan should run in SSL mode
-ssl_certdir <directory> (**)
specify which directory contains the SSL certificates and configuration
-ssl_trustedcertificate <file> (**)
the filename of the signed CA certificate
-ssl_mycertificate <file> (**)
the filename of the signed certificate
-ssl_mykey <file> (**)
the filename of the private key
-ssl_mykey_passphrase <file> (**)
the filename that contains the plaintext passphrase
(*) These parameters are used when running the Repository Diagnostic Tool against a live system with a clustered CMS database and firewalls.
(**) These parameters are used when the RDT uses SSL to communicate with the CMS database that it scans.
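
As a rough illustration of how the mandatory options fit together, the sketch below assembles a scan-only Unix command line and prints it instead of running it. The function name build_scan_cmd, the sample credentials, the DSN and the FRS root path are all hypothetical placeholders; only the option names and the UID/PWD/DSN connection string format come from the list above.

```shell
#!/bin/sh
# Hypothetical helper: prints a scan-only reposcan command line (dry run).
# It does not execute the tool; every value passed in is a placeholder.
build_scan_cmd() {
    driver=$1; key=$2; user=$3; pass=$4; dsn=$5; frsroot=$6
    # Connection string in the documented UID/PWD/DSN format.
    connect="UID=${user};PWD=${pass};DSN=${dsn}"
    # '$BOBJDIR' is printed literally so the result can be pasted into a shell.
    printf '%s -dbdriver %s -dbkey %s -connect "%s" -inputfrsdir "%s/Input" -outputfrsdir "%s/Output"\n' \
        '$BOBJDIR/boe_reposcan.sh' "$driver" "$key" "$connect" "$frsroot" "$frsroot"
}

# Example call with made-up values.
build_scan_cmd oracledatabasesubsystem MyClusterKey boeuser secret BOEDSN /opt/boe/FileStore
```

Because "-repair" is not part of the generated command, running the printed line only scans and reports.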

Output of RDT

The output files are generated in the directory <BOE install>\BusinessObjects Enterprise 12.0\reposcan. This directory also contains two XSL files, rdt_scan.xsl and rdt_repair.xsl, which render the output XML in a readable form when it is opened in a web browser.
The name format of the scan output file is Repo_Scan_YYYY_MM_DD_HH_MM_SS.xml (e.g. Repo_Scan_2009_10_26_11_24_21.xml).
The name format of the repair output file is Repo_Repair_YYYY_MM_DD_HH_MM_SS.xml (e.g. Repo_Repair_2009_10_26_11_24_21.xml).
Example of scan output
Example of repair output
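
Because the timestamp in the file names described above is zero-padded and ordered from year to second, sorting the names alphabetically also sorts them by time. The sketch below uses that property to pick the most recent scan report in a directory; the function name latest_scan_report is a hypothetical helper, not part of the RDT.

```shell
#!/bin/sh
# Hypothetical helper: prints the newest Repo_Scan_*.xml file in a directory.
# Relies on the zero-padded timestamp sorting lexicographically in time order.
latest_scan_report() {
    dir=$1
    # Print every matching file, sort by name, keep the last (most recent) one.
    # If nothing matches, the unexpanded pattern fails the -e test and is skipped.
    for f in "$dir"/Repo_Scan_*.xml; do
        [ -e "$f" ] && printf '%s\n' "$f"
    done | sort | tail -n 1
}
```

Calling latest_scan_report with the reposcan output directory prints the path of the latest scan file, or nothing if no scan has been written yet.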

How to simplify the command line 

To simplify the command line, users may use the "-optionsfile" parameter, which gives the path to a parameter file. The parameter file is a text file that lists each command-line option and its values, with one parameter per line.
For example, users can run the command line:
   $BOBJDIR/boe_reposcan.sh -optionsfile "./reposcan_options.txt"
where the text file "reposcan_options.txt" contains the following options:
example: reposcan_options.txt
-dbdriver mysqldatabasesubsystem
-dbkey G0ldeneye
-connect "UID=dbroot;PWD=xxxxxxx;DSN=BOE120;HOSTNAME=serverlinux1;PORT=6306"
-inputfrsdir "<XI3.1 installation path>/bobje/enterprise120/FileStore/Input"
-outputfrsdir "<XI3.1 installation path>/bobje/enterprise120/FileStore/Output"
-scanfrs
-scancms
Alternatively, because the RDT is a command-line tool, users may wrap the full command in a shell script or batch file instead of using the optionsfile parameter.
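
One way to keep an options file consistent with the one-parameter-per-line rule is to generate and check it from a small script. The sketch below is only an illustration: write_options_file and check_options_file are hypothetical helper names, and every value in the generated file is a placeholder to be replaced with real settings.

```shell
#!/bin/sh
# Hypothetical helper: writes a reposcan options file, one option per line.
# All values are placeholders copied from the usage syntax, not real settings.
write_options_file() {
    target=$1
    cat > "$target" <<'EOF'
-dbdriver mysqldatabasesubsystem
-dbkey <cluster_key>
-connect "UID=<user>;PWD=<password>;DSN=<databasename>;HOSTNAME=<hostname>;PORT=<port>"
-inputfrsdir "<inputfrsdir>"
-outputfrsdir "<outputfrsdir>"
EOF
}

# Hypothetical sanity check: succeeds only if every non-empty line
# starts with an option name such as "-dbdriver".
check_options_file() {
    if grep -v '^$' "$1" | grep -qv '^-[a-z]'; then
        return 1
    fi
    return 0
}
```

The check only looks at the shape of each line; it does not verify that the option names or values are ones the RDT accepts.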

Best practices

In what scenarios should the RDT be run?

     1. Before and after a migration: scan the source machine before the migration and the target machine after it.
     2. After disaster recovery or a restore from backup.
     3. After a CMS server crash.
     4. After an Input or Output FRS crash.
     5. When users cannot access or view reports, for example when report links have disappeared, or the links appear but the reports cannot be opened even though users have sufficient access rights.
     6. If the CMS server fails to start, users may use the RDT as a diagnostic tool.

What requires special care when running the RDT

Be careful with the "-repair" parameter, which tells the RDT to repair all the inconsistencies it finds. Repair actions that the RDT takes while users are accessing the deployment can create further inconsistencies, so do not run the RDT against a live CMS database and FRS, or at the very least do not use the "-repair" option against a live CMS.
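
A wrapper script can make the precaution above harder to skip by mistake. The sketch below adds "-repair" only when the operator has explicitly confirmed that the system is offline. The function name repair_option and the environment variable RDT_SYSTEM_OFFLINE are made up for this example and are not features of reposcan.

```shell
#!/bin/sh
# Hypothetical guard: prints the extra option to append to a reposcan call.
# It emits "-repair" only when RDT_SYSTEM_OFFLINE=yes has been set by the
# operator (an invented variable); otherwise it prints nothing on stdout.
repair_option() {
    if [ "${RDT_SYSTEM_OFFLINE:-no}" = "yes" ]; then
        printf '%s\n' '-repair'
    else
        echo 'Refusing -repair: system not confirmed offline; scanning only.' >&2
    fi
}
```

A wrapper could then append $(repair_option) to its reposcan command line, so that the default behaviour is a scan without repair.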

How often to run the RDT

There is no guideline on how often to run the RDT. A normally operating system does not need the RDT to be run periodically, unless one of the cases listed above occurs.
However, if the environment is unstable from a software or hardware perspective, for example because of occasional network or hard disk failures, users may consider running the RDT as a routine maintenance task.

What is the speed of the RDT

The speed depends on system performance. As an approximate indication, scanning a BOE system with 20000 InfoObjects on a 2-CPU (2.4 GHz) machine takes 5 seconds to set up the database connection and initialize the scan, 45 seconds to scan, and 10 seconds to generate the output files.
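
For a very rough planning figure, the single sample above can be scaled to other repository sizes. The sketch below assumes that the 15 seconds of setup and output time stay fixed and that scan time grows linearly with the number of InfoObjects; neither assumption is confirmed by the source, and estimate_scan_seconds is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical estimate, in seconds, scaled from the single sample above:
# 15 s fixed overhead (5 s setup + 10 s output) plus 45 s per 20000 InfoObjects.
# Assumes scan time grows linearly with object count, which is unverified.
estimate_scan_seconds() {
    objects=$1
    echo $(( 15 + objects * 45 / 20000 ))
}

estimate_scan_seconds 20000    # reproduces the sample: 60
```

Actual times depend on the hardware, the database and the number of inconsistencies found, so the result is only an order-of-magnitude guide.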

What types of inconsistencies are not supported

Some types of inconsistencies cannot be detected by the RDT, for example:
  • Invalid Target ID (SI_TARGETID)
  • Invalid alias target ID (SI_ALIAS_TARGETID)
  • Invalid SubGroup ID (item in SI_SUBGROUPS collection)
  • Invalid ServerGroup ID (item in SI_SERVERGROUPS collection)
  • Invalid Input File ID (item in SI_INPUT_FILES collection)
  • Invalid Output File ID (item in SI_OUTPUT_FILES collection)
  • Invalid Relationship ID
  • Invalid SI_SCOPEBATCH_SCOPE_PRINCIPALS

List of inconsistencies and the corresponding actions taken by the RDT

Inconsistencies between the CMS and the FRS

  • Inconsistency: The object exists in the CMS database, but there is no corresponding file in the FRS.
    Action: Removes the object from the CMS database.
  • Inconsistency: The file exists in the FRS, but there is no corresponding object in the CMS database.
    Action: When you republish the file, an object is created in the CMS database.
  • Inconsistency: The size of the file does not match the InfoObject file size.
    Action: The RDT updates the file size in the CMS database.
  • Inconsistency: The FRS folder is empty.
    Action: Removes the empty directory.

Inconsistencies in the CMS metadata

  • Inconsistency: The object has a missing or invalid Parent Object ID.
    Action: Moves the object and any child objects to a repair folder. Only the administrator has access to this folder.
  • Inconsistency: The object has a missing or invalid Owner Object ID.
    Action: Assigns the value of the Administrator's ID to the object's Owner ID.
  • Inconsistency: The object has a missing or invalid Submitter Object ID.
    Action: If you provide a value with the -submitterid parameter, the RDT applies the value as the object's submitter ID.
  • Inconsistency: The object's last successful instance is missing or invalid.
    Action: When you reschedule the object, the CMS automatically recalculates the ID.
  • Inconsistency: The object references a calendar that doesn't exist.
    Action: When you reschedule the object, the CMS applies a calendar to the object.
  • Inconsistency: The preferred server does not exist.
    Action: When you reschedule the object, the CMS applies a server group to the object.
  • Inconsistency: The event or events that this object is waiting on do not exist.
    Action: The RDT removes the missing events.
  • Inconsistency: This object triggers an event that does not exist.
    Action: Removes the missing events.
  • Inconsistency: Orphaned Access Control entry.
    Action: Removes the missing principal(s).
  • Inconsistency: The preferred server does not exist.
    Action: Removes the missing entries from the object's server group list.
  • Inconsistency: A specific user account has multiple favorites folders.
    Action: The RDT consolidates the user's Favorites folders into a single folder.

References and links

Official documentation
