What does Db2Diff compare anyway?

It compares a whole lot! You can view the high-level items being compared in the image shown above. This initial version compares the following:

Operating System:
- OS NAME
- KERNEL VERSION
- TOTAL CPUs
- CONFIGURED CPUs
- TOTAL MEMORY
- OS VERSION
- ARCHITECTURE TYPE

Registry Variables:
- MISSING/UNSET registry variables between the two environments
- Registry variables set to different values

Instance Environment:
- INSTANCE NAME
- RELEASE NUMBER
- SERVICE LEVEL
- BUILD LEVEL
- PROGRAM TEMPORARY FIX (PTF)
- FIXPACK NUMBER
- DB2 VERSION

DBM Configuration (DBM CFG):
- Compares DBM CFG parameters that are not set to AUTOMATIC and reports mismatches
- When either side of a DBM CFG parameter is set to AUTOMATIC, reports only when the other side is not set to AUTOMATIC

DB Configuration (DB CFG):
- Compares DB CFG parameters that are not set to AUTOMATIC and reports mismatches
- When either side of a DB CFG parameter is set to AUTOMATIC, reports only when the other side is not set to AUTOMATIC

Database Objects:
- As of now, only tables and all their dependent objects, such as PKs, FKs, indexes, tablespaces, bufferpools, and their properties, are compared.
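The AUTOMATIC rule above can be sketched as a small comparison routine. This is a hypothetical Java sketch of the rule as described, not Db2Diff's actual code: stay silent when both sides are AUTOMATIC, flag a mismatch when exactly one side is, and otherwise compare the literal values.

```java
// Hypothetical sketch of the CFG comparison rule described above (not Db2Diff's actual code):
// - both sides AUTOMATIC: nothing to report, regardless of the underlying values;
// - exactly one side AUTOMATIC: report an AUTOMATIC flag mismatch;
// - neither side AUTOMATIC: compare the literal values and report when they differ.
public class CfgDiff {
    static String compare(String param, String src, String trg) {
        boolean srcAuto = "AUTOMATIC".equalsIgnoreCase(src);
        boolean trgAuto = "AUTOMATIC".equalsIgnoreCase(trg);
        if (srcAuto && trgAuto) return null;               // both automatic: silent
        if (srcAuto != trgAuto)                            // only one side automatic
            return "Parameter \"" + param + "\" AUTOMATIC FLAG MISMATCH --> SOURCE \""
                 + (srcAuto ? "AUTOMATIC" : "NONE") + "\" TARGET \""
                 + (trgAuto ? "AUTOMATIC" : "NONE") + "\"";
        if (!src.equals(trg))                              // neither automatic: plain value diff
            return "Parameter \"" + param + "\" MISMATCH --> SOURCE \"" + src
                 + "\" TARGET \"" + trg + "\"";
        return null;
    }

    public static void main(String[] args) {
        System.out.println(compare("SORTHEAP", "AUTOMATIC", "AUTOMATIC")); // null: silent
        System.out.println(compare("LOCKLIST", "4096", "AUTOMATIC"));      // flag mismatch
        System.out.println(compare("LOGPRIMARY", "3", "13"));              // value mismatch
    }
}
```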

How do you run Db2Diff?

In the interest of readers' time, before discussing the hoops I had to jump through to arrive at a solution, I am going to show how to run Db2Diff to compare two DB2 environments. Remember, you can even compare two different schemas within the same database.

1. Determine the version of your Java runtime environment. You can run this utility from Linux, UNIX, or Windows as long as you have Java installed; it comes prepackaged with all the libraries you need. Run the following command to determine the version of your JRE:

java -version
java version "1.6.0_40"
OpenJDK Runtime Environment (IcedTea6 1.13.12) (rhel-1.13.12.6.el6_8-x86_64)
OpenJDK 64-Bit Server VM (build 23.40-b40, mixed mode)

2. Download the appropriate db2utils.jar and db2diff.prop files, depending on the version of your JRE, from the link below.

Download db2utils.jar

3. Update the properties file (db2diff.prop) with the proper connection information for your source and target databases. If you want to compare the entire databases between the two environments, simply leave the srcschema and trgschema properties empty. If you want to compare just a single schema (the schemas can be different, e.g. srcschema="PROD" and trgschema="TEST"), populate srcschema and trgschema respectively.
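A sketch of what db2diff.prop might look like. Only srcschema and trgschema are named in this post; the connection-related key names below are illustrative guesses, so check the downloaded file for the real keys.

```properties
# Hypothetical db2diff.prop sketch -- key names other than srcschema/trgschema
# are assumptions; consult the file shipped with db2utils.jar for the real ones.
srcdburl=jdbc:db2://server1:12345/sample
srcuser=dbuser
srcpassword=********
trgdburl=jdbc:db2://server2:12345/test
trguser=dbuser
trgpassword=********
# Leave both empty to compare the entire databases; populate both to compare one schema.
srcschema=PROD
trgschema=TEST
```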

4. Run the downloaded jar file as follows, remembering to keep db2diff.prop in the same directory as the db2utils.jar file. It will display a clean, formatted report of all related database objects for the tables, as shown below.

5. It is important to remember that in the command "java -cp db2utils-1.x.jar Db2Diff", Db2Diff is case sensitive: it is the name of the Java class that gets invoked when you run the command. This utility should work on all DB2 versions above V10.1.

6. If you run this utility to compare just one schema, the environment comparison is bypassed, i.e. Db2Diff will not compare the hardware info, registry variables, DBM CFG, or DB CFG.

[U:\Desktop\db2utils]java -cp db2utils-1.7.jar Db2Diff
Connected Successfully to jdbc:db2://server1:12345/sample:user=dbuser
Connected Successfully to jdbc:db2://server2:12345/test:user=dbuser
COMPARING SYSTEM CONFIGURATION BETWEEN SOURCE AND TARGET:
Comparing HARDWARE & OS Configurations:
No Discrepancies
Comparing REGISTRY VARIABLE Configuration:
MISSING/UNSET REGISTRY VARIABLE IN TARGET --> "DB2DBDFT", SOURCE set to "SAMPLE"
Comparing DATABASE MANAGER Configuration:
DBM CFG Parameter "NODETYPE" MISMATCH --> SOURCE "DPF_SERVER" TARGET "DSF_SERVER"
DBM CFG Parameter "SVCENAME" MISMATCH --> SOURCE "50010" TARGET "60008"
DBM CFG Parameter "SPM_NAME" MISMATCH --> SOURCE "xxxxxxx" TARGET "yyyyyyy"
DBM CFG Parameter "CUR_EFF_CODE_LVL" MISMATCH --> SOURCE "V:10 R:5 M:0 F:8 I:0 SB:0" TARGET "V:10 R:5 M:0 F:6 I:0 SB:0"
DBM CFG Parameter "CUR_EFF_ARCH_LVL" MISMATCH --> SOURCE "V:10 R:5 M:0 F:8 I:0 SB:0" TARGET "V:10 R:5 M:0 F:6 I:0 SB:0"
Comparing DATABASE Configuration:
DB CFG Parameter "SORTHEAP" AUTOMATIC FLAG MISMATCH --> SOURCE "NONE" TARGET "AUTOMATIC"
DB CFG Parameter "LOCKLIST" AUTOMATIC FLAG MISMATCH --> SOURCE "NONE" TARGET "AUTOMATIC"
DB CFG Parameter "PCKCACHESZ" AUTOMATIC FLAG MISMATCH --> SOURCE "NONE" TARGET "AUTOMATIC"
DB CFG Parameter "SHEAPTHRES_SHR" AUTOMATIC FLAG MISMATCH --> SOURCE "NONE" TARGET "AUTOMATIC"
DB CFG Parameter "CATALOGCACHE_SZ" MISMATCH --> SOURCE "-1" TARGET "300"
DB CFG Parameter "LOGPRIMARY" MISMATCH --> SOURCE "3" TARGET "13"
DB CFG Parameter "PAGESIZE" MISMATCH --> SOURCE "8192" TARGET "4096"
DB CFG Parameter "MAXLOCKS" AUTOMATIC FLAG MISMATCH --> SOURCE "NONE" TARGET "AUTOMATIC"
DB CFG Parameter "CHNGPGS_THRESH" MISMATCH --> SOURCE "60" TARGET "80"
DB CFG Parameter "LOGSECOND" MISMATCH --> SOURCE "10" TARGET "12"
DB CFG Parameter "LOGBUFSZ" MISMATCH --> SOURCE "256" TARGET "2149"
DB CFG Parameter "SELF_TUNING_MEM" MISMATCH --> SOURCE "OFF" TARGET "ON (Active)"
DB CFG Parameter "DB_SEED" MISMATCH --> SOURCE "3802569814" TARGET "3957787418"
DB CFG Parameter "LOGFILSIZ" MISMATCH --> SOURCE "1000" TARGET "1024"
COMPARING SCHEMA(S) BETWEEN SOURCE AND TARGET:
Db2Diff is now comparing the source schema TEST with target schema TEST
Following table(s) do not exist in target schema TEST
MISSING TABLE --> CUSTMER
Comparing table ACCOUNT between source and target
WARNING SYSTEM GENERATED PRIMARY KEY NAMES DIFFER --> SOURCE "SQL170901163232470" TARGET "SQL171220162026040"
MISMATCH PRIMARY KEY COLUMN SEQUENCE --> SOURCE "ACCTID" POSITION 1 TARGET "ACCTID" POSITION 2
MISMATCH PRIMARY KEY COLUMN SEQUENCE --> SOURCE "NAME" POSITION 2 TARGET "NAME" POSITION 1
DATA TS BUFFERPOOL "NPAGES" MISMATCH --> SOURCE "BP_16K" -2 TARGET BP_16K 10000
DATA TS BUFFERPOOL NUMBER OF BLOCK PAGES MISMATCH --> SOURCE "BP_16K" 0 TARGET BP_16K 1000
DATA TS BUFFERPOOL "BLOCKSIZE" MISMATCH --> SOURCE "BP_16K" 0 TARGET BP_16K 50
Comparing table CHECKACCT between source and target
WARNING SYSTEM GENERATED PRIMARY KEY NAMES DIFFER --> SOURCE "SQL170901163233010" TARGET "SQL171220162026080"
MISMATCH FOREIGN KEY CHECKACCT_FK's COLUMN "NAME" SEQUENCE --> SOURCE POSITION 2 TARGET POSITION 1
MISMATCH FOREIGN KEY CHECKACCT_FK's COLUMN "ACCTID" SEQUENCE --> SOURCE POSITION 1 TARGET POSITION 2
DATA TS BUFFERPOOL "NPAGES" MISMATCH --> SOURCE "BP_16K" -2 TARGET BP_16K 10000
DATA TS BUFFERPOOL NUMBER OF BLOCK PAGES MISMATCH --> SOURCE "BP_16K" 0 TARGET BP_16K 1000
DATA TS BUFFERPOOL "BLOCKSIZE" MISMATCH --> SOURCE "BP_16K" 0 TARGET BP_16K 50
Comparing table CHECKBOOK between source and target
WARNING SYSTEM GENERATED PRIMARY KEY NAMES DIFFER --> SOURCE "SQL170901163233370" TARGET "SQL171220162026120"
DATA TS BUFFERPOOL "NPAGES" MISMATCH --> SOURCE "BP_16K" -2 TARGET BP_16K 10000
DATA TS BUFFERPOOL NUMBER OF BLOCK PAGES MISMATCH --> SOURCE "BP_16K" 0 TARGET BP_16K 1000
DATA TS BUFFERPOOL "BLOCKSIZE" MISMATCH --> SOURCE "BP_16K" 0 TARGET BP_16K 50
Comparing table CUSTOMER between source and target
TABLE "COLCOUNT" MISMATCH --> SOURCE 2 TARGET 3
TABLE "COMPRESSION" MISMATCH --> SOURCE R TARGET N
TABLE "DATACAPTURE" MISMATCH --> SOURCE Y TARGET N
MISMATCH COLUMN ORDER BETWEEN SOURCE AND TARGET --> "NAME" SOURCE POSITION 1 TARGET POSITION 0
MISMATCH COLUMN ORDER BETWEEN SOURCE AND TARGET --> "ID" SOURCE POSITION 0 TARGET POSITION 1
INDEX TS "PAGESIZE" MISMATCH --> SOURCE USERSPACE1 8192 TARGET MYTBSP 4096
INDEX TS BUFFERPOOL "NPAGES" MISMATCH --> SOURCE "IBMDEFAULTBP" 1000 TARGET IBMDEFAULTBP -2
INDEX TS BUFFERPOOL "PAGESIZE" MISMATCH --> SOURCE "IBMDEFAULTBP" 8192 TARGET IBMDEFAULTBP 4096
DATA TS "PAGESIZE" MISMATCH --> SOURCE USERSPACE1 8192 TARGET MYTBSP 4096
DATA TS BUFFERPOOL "NPAGES" MISMATCH --> SOURCE "IBMDEFAULTBP" 1000 TARGET IBMDEFAULTBP -2
DATA TS BUFFERPOOL "PAGESIZE" MISMATCH --> SOURCE "IBMDEFAULTBP" 8192 TARGET IBMDEFAULTBP 4096
Comparing table SALARY between source and target
TABLE INDEX TS NAME MISMATCH --> SOURCE INDEX_TS TARGET INDEX_TS1
MISMATCH COLUMN "NAME" DATATYPE --> SOURCE VARCHAR(30,0) TARGET VARCHAR(25,0)
MISMATCH COLUMN "EMP_DESC" CLOB(102400,0) INLINE LENGTH --> SOURCE 6000 TARGET 140
MISMATCH COLUMN "BONUS" NULLABILITY --> SOURCE Y TARGET N
INDEX "SALARY_IDX1" PROPERTY UNIQUERULE MISMATCH --> SOURCE U TARGET D
INDEX "SALARY_IDX1" PROPERTY COMPRESSION MISMATCH --> SOURCE N TARGET Y
INDEX TS "PAGESIZE" MISMATCH --> SOURCE INDEX_TS 8192 TARGET INDEX_TS1 4096
INDEX TS BUFFERPOOL "NPAGES" MISMATCH --> SOURCE "IBMDEFAULTBP" 1000 TARGET IBMDEFAULTBP -2
INDEX TS BUFFERPOOL "PAGESIZE" MISMATCH --> SOURCE "IBMDEFAULTBP" 8192 TARGET IBMDEFAULTBP 4096
DATA TS "PAGESIZE" MISMATCH --> SOURCE USERSPACE1 8192 TARGET USERSPACE1 4096
DATA TS BUFFERPOOL "NPAGES" MISMATCH --> SOURCE "IBMDEFAULTBP" 1000 TARGET IBMDEFAULTBP -2
DATA TS BUFFERPOOL "PAGESIZE" MISMATCH --> SOURCE "IBMDEFAULTBP" 8192 TARGET IBMDEFAULTBP 4096
Comparing table VACATION between source and target
DATA TS BUFFERPOOL "NPAGES" MISMATCH --> SOURCE "BP_16K" -2 TARGET BP_16K 10000
DATA TS BUFFERPOOL NUMBER OF BLOCK PAGES MISMATCH --> SOURCE "BP_16K" 0 TARGET BP_16K 1000
DATA TS BUFFERPOOL "BLOCKSIZE" MISMATCH --> SOURCE "BP_16K" 0 TARGET BP_16K 50

My Journey Towards The Solution:

Method 1 (SQL):

I tried multiple things before getting to the point of building an all-in-one jar file. Please read on if you would like to know a little more about the journey…

Like any normal DBA, I first tried to solve this problem using export, import, and custom SQL with multiple joins, and ended up with horribly performing SQL. I cannot run a nasty SQL statement against a database in good conscience. On top of the poor performance, relying on native SQL capabilities meant moving data from one database to the other just to compare them. At that point I simply accepted that using SQL to solve this problem is a bad idea; even federation is not flexible enough.

Method 2 (Perl):

My next attempt was a Perl script: use DBI to pull table and column metadata from the SYSCAT catalog views on both sides, build a nested data structure for each, and compare them. (The <STDIN> reads below were eaten by the blog formatting and have been restored.)

#!/usr/bin/perl
# Compare two schemas or tables for names, types and length.
# Author           Version  Comments
# Raghu.Cherukuru  v1.0     Initial version
# Raghu.Cherukuru  v1.1     Made this program strict pragma compatible
# Raghu.Cherukuru  v1.2     Version prior to DBI changes
# Raghu.Cherukuru  v1.3     Added DBI code to build source and target data structures for comparison
# Raghu.Cherukuru  v1.4     Moved variables to the right section, e.g. DBI variables; also made the password user enterable
# Raghu.Cherukuru  v1.5     Made this script more functional by putting code in functions, good for making modules later
# Commented out all the say statements for successful returns or matches. Turn on for debugging.
use v5.10;
use strict;
use DBI;

# Script variables
my $srctabhash_ref;
my $srctab_name;
my $trgtabhash_ref;
my $trgtab_name;
my $srccol_name;
my $trgcol_name;
my %srctabs_lookup=();
my %trgtabs_lookup=();
my %srccol_lookup=();
my %trgcol_lookup=();
my %src_tab_to_cols_hash=();
my %trg_tab_to_cols_hash=();
my %srctab_to_coltype_lookup_hoh=();
my %trgtab_to_coltype_lookup_hoh=();

# DBI UPDATABLE variables
my $sourcedb="XXXXXXXX";
my $targetdb="XXXXXXXX";
my $src_hostname="xxxxxxx.yyy.zzz";
my $trg_hostname="xxxxxxx.yyy.zzz";
my $src_port="12345";
my $trg_port="12345";
my $source_dsn="DBI:DB2:database=$sourcedb;hostname=$src_hostname;port=$src_port";
my $target_dsn="DBI:DB2:database=$targetdb;hostname=$trg_hostname;port=$trg_port";
my $src_schema="XXXXXXX";
my $trg_schema="XXXXXXX";
my $src_uname="xxxxxxxx";
my $trg_uname="xxxxxxxx";

# Temporarily disable screen echo, so the password is not visible to other users
system("stty -echo");
say "Please enter the source db password:";
chomp(my $src_passwd=<STDIN>);
say "Please enter the target db password:";
chomp(my $trg_passwd=<STDIN>);
# Turn echo back on for the rest of the program
system("stty echo");

# DBI non-updatable variables
my $src_dbh=DBI->connect($source_dsn, $src_uname, $src_passwd) or die "Could not connect to database $sourcedb";
my $trg_dbh=DBI->connect($target_dsn, $trg_uname, $trg_passwd) or die "Could not connect to database $targetdb";
my $src_tab_sth;
my $trg_tab_sth;
my $src_col_sth;
my $trg_col_sth;
my %SOURCE_TAB_DS=();
my %TARGET_TAB_DS=();

sub Build_Sourcetab_Ds {
    my $src_int_arrayref;
    my @src_table_names;
    $src_tab_sth = $src_dbh->prepare("SELECT TABNAME FROM SYSCAT.TABLES WHERE TABSCHEMA='$src_schema'");
    $src_tab_sth->execute();
    while (my $tab_row = $src_tab_sth->fetchrow_hashref()) {
        # Build the source tables list into an array
        push @src_table_names, $tab_row->{TABNAME};
    }
    foreach my $src_tab_name (@src_table_names) {
        $src_col_sth = $src_dbh->prepare("SELECT COLNAME,TYPENAME,LENGTH FROM SYSCAT.COLUMNS WHERE TABNAME='$src_tab_name' AND TABSCHEMA='$src_schema'");
        $src_col_sth->execute();
        # Empty out the table hash before each iteration, so the array has only one hash per table.
        # Remember %tab_to_cols_hash=undef vs %tab_to_cols_hash=() are not the same: the first
        # assigns an undef key and value to the hash, nasty...
        my %tab_to_cols_hash=();
        # Needed to build a HoH with the table name as key and a hashref as value, where the
        # hashref maps column names to datatypes (datatypes are not unique per table and column).
        my %cols_to_types_hash=();
        while (my $col_row = $src_col_sth->fetchrow_hashref()) {
            $cols_to_types_hash{$col_row->{COLNAME}}="$col_row->{TYPENAME}"."($col_row->{LENGTH})";
        }
        $tab_to_cols_hash{$src_tab_name}={%cols_to_types_hash};
        # Each array element is an anonymous hashref with the source table name as key and another
        # anonymous hashref as value; the deepest hashref maps column names to their data types.
        push @{$src_int_arrayref}, {%tab_to_cols_hash};
    }
<<'TESTCODE';
    foreach my $int_hashref (@{$src_int_arrayref}) {
        say "$int_hashref";
        while ( my ($tab_key, $col_hash_value) = each %{$int_hashref} ) {
            say "Table Name: $tab_key, $col_hash_value";
        }
        say;
    }
TESTCODE
    # Build the final data structure for the source schema
    $SOURCE_TAB_DS{$src_schema}=[@{$src_int_arrayref}];
}

sub Build_Targettab_Ds {
    # Same logic as Build_Sourcetab_Ds, applied to the target schema
    my $trg_int_arrayref;
    my @trg_table_names;
    $trg_tab_sth = $trg_dbh->prepare("SELECT TABNAME FROM SYSCAT.TABLES WHERE TABSCHEMA='$trg_schema'");
    $trg_tab_sth->execute();
    while (my $tab_row = $trg_tab_sth->fetchrow_hashref()) {
        # Build the target tables list into an array
        push @trg_table_names, $tab_row->{TABNAME};
    }
    foreach my $trg_tab_name (@trg_table_names) {
        $trg_col_sth = $trg_dbh->prepare("SELECT COLNAME,TYPENAME,LENGTH FROM SYSCAT.COLUMNS WHERE TABNAME='$trg_tab_name' AND TABSCHEMA='$trg_schema'");
        $trg_col_sth->execute();
        my %tab_to_cols_hash=();
        my %cols_to_types_hash=();
        while (my $col_row = $trg_col_sth->fetchrow_hashref()) {
            $cols_to_types_hash{$col_row->{COLNAME}}="$col_row->{TYPENAME}"."($col_row->{LENGTH})";
        }
        $tab_to_cols_hash{$trg_tab_name}={%cols_to_types_hash};
        push @{$trg_int_arrayref}, {%tab_to_cols_hash};
    }
<<'TESTCODE';
    foreach my $int_hashref (@{$trg_int_arrayref}) {
        say "$int_hashref";
        while ( my ($tab_key, $col_hash_value) = each %{$int_hashref} ) {
            say "Table Name: $tab_key, $col_hash_value";
        }
        say;
    }
TESTCODE
    $TARGET_TAB_DS{$trg_schema}=[@{$trg_int_arrayref}];
}

sub Compare_Source_To_Target {
    foreach my $srcschema (keys %SOURCE_TAB_DS) {
        if (exists $TARGET_TAB_DS{$srcschema}) {
            # say "Schema $srcschema exists in target.\n";
        } else {
            die "Schema $srcschema does not exist in target.\n";
        }
        # Build lookup hashes with the tables list for the source schema from SOURCE_TAB_DS.
        foreach my $srctabhash_ref (@{$SOURCE_TAB_DS{$srcschema}}) {
            foreach my $srctab_name (keys %{$srctabhash_ref}) {
                # Hash of tables in the source schema, for lookup
                $srctabs_lookup{$srctab_name}=1;
                # Hash keyed by source table name, values referencing the deepest hash with column information
                $srctab_to_coltype_lookup_hoh{$srctab_name}={%{$srctabhash_ref->{$srctab_name}}};
            }
        }
        # Build lookup hashes with the tables list for the target schema from TARGET_TAB_DS.
        foreach my $trgtabhash_ref (@{$TARGET_TAB_DS{$srcschema}}) {
            foreach my $trgtab_name (keys %{$trgtabhash_ref}) {
                $trgtabs_lookup{$trgtab_name}=1;
                $trgtab_to_coltype_lookup_hoh{$trgtab_name}={%{$trgtabhash_ref->{$trgtab_name}}};
            }
        }
        # Check that all the source tables exist in the target.
        foreach my $srctab_name (keys %srctabs_lookup) {
            if (! exists $trgtabs_lookup{$srctab_name}) {
                say "$srctab_name table DOES NOT EXIST in target schema\n";
            } else {
                # say "$srctab_name table exists in target schema";
                # Now compare the columns and the types of each of those columns
                &Col_Typediff_Checker(\%{$srctab_to_coltype_lookup_hoh{$srctab_name}},\%{$trgtab_to_coltype_lookup_hoh{$srctab_name}},$srctab_name);
            }
        }
    }
}

# Compare whether the column names and types are identical between source and target, otherwise complain
sub Col_Typediff_Checker {
    my ($source_coltype_hashref, $target_coltype_hashref, $tabname)=@_;
    my @source_cols=(keys %{$source_coltype_hashref});
    my @target_cols=(keys %{$target_coltype_hashref});
    my $missing_col_ind=0;
    if ( @source_cols != @target_cols ) {
        say "\tNumber of columns for source: " . scalar(@source_cols) . " differ from number of columns on Target: " . scalar(@target_cols);
    } else {
        # say "\tNumber of columns between source: " . scalar(@source_cols) . " and target: " . scalar(@target_cols) . " are same";
    }
    foreach my $srccol_name (keys %{$source_coltype_hashref}) { $srccol_lookup{$srccol_name}=1; }
    foreach my $trgcol_name (keys %{$target_coltype_hashref}) { $trgcol_lookup{$trgcol_name}=1; }
    # Are the column names the same, and are all the source columns present in the target (and NOT vice versa)?
    foreach my $srccol_name (keys %{$source_coltype_hashref}) {
        if (! exists $trgcol_lookup{$srccol_name}) {
            ++$missing_col_ind;
            say "\t$srccol_name column is NOT PRESENT in target";
        } else {
            # say "\t$srccol_name column is present in target";
        }
    }
    # Are the datatypes the same for both source and target? Let's check...
    if (scalar grep { ($source_coltype_hashref->{$_} ne $target_coltype_hashref->{$_}) and (defined $target_coltype_hashref->{$_}) } (keys %{$source_coltype_hashref})) {
        say "Table: $tabname";
        say "\t\tAt least one of the data types between Source and Target columns DID NOT MATCH";
        # Use a while loop to save memory: each call to each() returns only one (key, value) pair.
        while (my ($key,$value) = each %{$source_coltype_hashref}) {
            say "\t\tSource column: $key -> $value, Target column: $key -> $target_coltype_hashref->{$key}" if (($target_coltype_hashref->{$key} ne $value) and (defined $trgcol_lookup{$key}));
        }
        say; # Format, I say :-)
    } else {
        say "\t\tThere are some missing column(s): otherwise all the remaining columns' data types between source and target matched" if ($missing_col_ind >= 1);
        # say "\tAll the columns' data types between source and target matched" if ($missing_col_ind == 0);
    }
    %srccol_lookup=();    # Empty the lookup hash for the next table's columns iteration
    undef %trgcol_lookup; # Apparently this does the same as above
}

# Ask whether the user wants to compare tables between the source and target schemas
say;
say "\t$0 will let you control what database objects to compare by prompting you for yes/no";
say "\tEnter either y|yes (case does not matter) to compare, or anything else to skip";
print "\t$0 will compare tables between source and target schemas, would you like to proceed? [y|yes] ";
chomp(my $tab_ind=<STDIN>);
say;
if ($tab_ind =~ /(y|yes)/i) {
    Build_Sourcetab_Ds();
    Build_Targettab_Ds();
    Compare_Source_To_Target();
}
$src_dbh->disconnect() or say "Failed to disconnect from $sourcedb";
$trg_dbh->disconnect() or say "Failed to disconnect from $targetdb";

"; } else { # say "$srctab_name table exists in target schema"; # Now lets compare the columns and types for each of those columns &Col_Typediff_Checker(\%{$srctab_to_coltype_lookup_hoh{$srctab_name}} , \ % {$trgtab_to_coltype_lookup_hoh{$srctab_name}} , $srctab_name ) ; } } } } # In this function we will compare to see if the column names and types are identical between source and target otherwise complain sub Col_Typediff_Checker { my ( $source_coltype_hashref , $target_coltype_hashref , $tabname ) = @_; my @source_cols = ( keys % {$source_coltype_hashref} ) ; my @target_cols = ( keys % {$target_coltype_hashref} ) ; my $missing_col_ind = 0; if ( @source_cols ! = @target_cols ) { say "\tNumber of columns for source: " . scalar ( @source_cols ) . " differ from number of columns on Target: " . scalar ( @target_cols ) ; } else { # say "\tNumber of columns between source: " . scalar ( @source_cols ) . " and target: " . scalar ( @target_cols ) . " are same"; } foreach my $srccol_name ( keys % {$source_coltype_hashref} ) { $srccol_lookup{$srccol_name} = 1; } foreach my $trgcol_name ( keys % {$target_coltype_hashref} ) { $trgcol_lookup{$trgcol_name} = 1; } # Are the column names same for both source and target and whether all the columns in source are present in target and NOT viceversa foreach my $srccol_name ( keys % {$source_coltype_hashref} ) { if ( ! exists $trgcol_lookup{$srccol_name} ) { + + $missing_col_ind; say "\t$srccol_name column is NOT PRESENT in target"; } else { # say "\t$srccol_name column is present in target"; } } # Are the datatypes same for both source and target ? Lets check... if ( scalar grep { ( $source_coltype_hashref - > {$_} ne $target_coltype_hashref - > {$_} ) and ( defined $target_coltype_hashref - > {$_} ) } ( keys % {$source_coltype_hashref} ) ) { say "Table: $tabname"; say "\t\tAtleast one of the data types between Source and Target columns DID NOT MATCH"; # Use a while loop if you would like to save memory. 
Every time each is called it only returns a pair of ( key , value ) element. while ( my ( $key , $value ) = each % {$source_coltype_hashref} ) { say "\t\tSource column: $key - > $value , Target column: $key - > $target_coltype_hashref - > {$key}" if ( ( $target_coltype_hashref - > {$key} ne $value ) and ( defined $trgcol_lookup{$key} ) ) ; } say; # Format I say : - ) } else { say "\t\tThere are some missing column ( s ) : Otherwise all the remaining column's data types between source and target columns matched" if ( $missing_col_ind >= 1 ) ; # say "\tAll the column's data types between source and target columns matched" if ( $missing_col_ind = = 0 ) ; } % srccol_lookup = ( ) ; # Empty the hash lookup for the next table's columns iteration undef % trgcol_lookup; # Apparently this does the same as above } # Check if user wants to compare tables between source and target schemas ? say; say "\t$0 will let you control what database objects to compare by prompting you for yes / no"; say "\tEnter either y|yes , case does not matter to compare or anything else to skip"; print "\t$0 will compare tables between source and target schemas would you like to proceed ?[y|yes]"; chomp ( my $tab_ind = ) ; say; if ( $tab_ind = ~ / ( y|yes ) / i ) { Build_Sourcetab_Ds ( ) ; Build_Targettab_Ds ( ) ; Compare_Source_To_Target ( ) ; } $src_dbh - > disconnect ( ) or say "Failed to disconnect from $sourcedb"; $trg_dbh - > disconnect ( ) or say "Failed to disconnect from $targetdb";

Next, I turned to my favorite scripting language, Perl, and I must admit things were looking really good at first. Perl's DBI and DBD drivers let you handle connections to any number of databases, and you can fetch the results of SQL queries into different data structures, or fabricate data structures from the data to suit your needs. Although I made significant progress toward a solution with Perl, a few things bothered me about this approach. A few people other than myself will use this tool, and they may run it from different environments; for the Perl version to work, each user would need to configure the Perl DB2 driver, which takes some time and effort. What if I need to run this utility on a Windows box? Not every Windows machine has ActivePerl installed, and Perl was initially developed with Unix flavors in mind. Perl is outstanding for quick-and-dirty work, but as a program's complexity and line count grow, it can bring maintenance difficulties (especially when there are many interdependent data structures). I learned that first hand by rereading what I had written. Take a look at the Perl code I wrote to solve this problem and you will see what I mean.

So, I wanted a solution that I could use without worrying about any adverse impact to the environment, and that I could run anywhere (my desktop, my laptop, a server, etc.) without any extra setup. Specifically:

I need a tool that can be executed from a variety of environments with minimal configuration.

I need a language with good support for data structures in an object-oriented fashion (as objects may contain nested data structures).

The language should have automatic garbage collection, as this tool should run with a minimal memory footprint.

Method 3 (Java):

Advantages of using Db2Diff:

It can be run from any command line (Windows, Linux, OS X, AIX, etc.) as long as a Java runtime environment is installed. It has a minimal memory footprint, as Java is pretty good at garbage collection. Only metadata is fetched from the database; the data structures and their comparison are handled inside the JVM (which could be on your desktop or Mac), so the utility runs separately from the database engine.
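The "comparison happens inside the JVM" idea can be sketched in Java. The actual Db2Diff source is not shown here, so the class and method names below (`ColumnDiff`, `compare`) are hypothetical; the sketch mirrors the column/type check that the Perl version does in `Col_Typediff_Checker`, but operating on plain in-memory maps of column name to datatype.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: compare column -> datatype maps for one table
// entirely inside the JVM, so no comparison work runs on the database engine.
public class ColumnDiff {

    public static List<String> compare(Map<String, String> source, Map<String, String> target) {
        List<String> diffs = new ArrayList<>();
        for (Map.Entry<String, String> e : source.entrySet()) {
            String targetType = target.get(e.getKey());
            if (targetType == null) {
                // Column exists in source but not in target.
                diffs.add(e.getKey() + " column is NOT PRESENT in target");
            } else if (!targetType.equals(e.getValue())) {
                // Column exists in both but the datatypes differ.
                diffs.add(e.getKey() + " type mismatch: source " + e.getValue()
                        + " vs target " + targetType);
            }
        }
        return diffs;
    }

    public static void main(String[] args) {
        Map<String, String> src = Map.of("ID", "INTEGER(4)", "NAME", "VARCHAR(50)");
        Map<String, String> trg = Map.of("ID", "INTEGER(4)", "NAME", "VARCHAR(100)");
        compare(src, trg).forEach(System.out::println);
    }
}
```

In the real utility the two maps would be populated from catalog metadata (e.g. SYSCAT.COLUMNS) over JDBC, which is the only data that ever leaves the database.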

What's next for Db2Diff:

The initial version of db2utils.jar shipped with deep comparison of only the most fundamental RDBMS object, the table. Future versions will handle routines, stored procedures, functions, and other object comparisons. I would love to hear back from the community on what other functionality you would like to see in this utility. Please leave comments or suggestions if you find any bugs while using it, and I will try to fix them as soon as I can.

I took a step back and clearly wrote down the problem definition. It was at this point that I realized most people try to solve a problem with whatever tools they know well, even when other tools provide a more comprehensive solution. After doing some analysis, I determined that I could easily address all the aforementioned concerns from the SQL and Perl attempts using Java. With an object-oriented language such as Java, you can easily model things (tables, indexes, etc.) as objects. In our case, think of a table as an object, a PK as an object, an index as an object, and so on. You can put the PK object, the index objects, etc. inside the table object. Once you have the nested object built, you simply compare the objects between source and target. Java has been around for a while, has strong garbage collection, and is platform independent, so I decided to go ahead with Java as the programming language of choice.
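The nested-object idea above can be sketched in a few lines. This is not the actual Db2Diff code; the type names are hypothetical and the sketch assumes Java 16+ records, whose generated equals() compares components, so comparing a source table to a target table reduces to a single equals() call over the whole nested structure.

```java
import java.util.List;

// Hypothetical sketch: a Table object contains its Column and Index objects,
// so a deep source-vs-target comparison is just a structural equals() check.
public class TableModel {

    // Records get component-wise equals()/hashCode() for free, and List.of
    // lists compare element by element, so equality recurses down the nesting.
    record Column(String name, String type, int length) {}
    record Index(String name, List<String> columns) {}
    record Table(String name, List<Column> columns, List<Index> indexes) {}

    public static void main(String[] args) {
        Table src = new Table("EMP",
                List.of(new Column("ID", "INTEGER", 4)),
                List.of(new Index("EMP_PK", List.of("ID"))));
        Table trg = new Table("EMP",
                List.of(new Column("ID", "INTEGER", 4)),
                List.of(new Index("EMP_PK", List.of("ID"))));
        System.out.println(src.equals(trg)); // true: record equals() compares components
    }
}
```

A mismatch anywhere in the nesting (a column's length, an index's column list) makes the top-level equals() false, which is exactly the "build nested objects, then compare" approach described above; a real tool would additionally walk the structures to report what differed.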