27 May, 2008

Creating a COMPRESSed Table

I haven't yet used Table Compression in Oracle (see this OTN article) but wanted to see what I space savings I could achieve.

I started off with a table with 82.5 million rows. (Note : The table is called "TEST_APPEND" because I had been running a bunch of tests with INSERT /*+ APPEND */ against LOGGING, NOLOGGING and the presence/absence of an index earlier. I had built up the table as a multiplied copy of DBA_OBJECTS, modifying data as it was inserted into TEST_APPEND).

I then created 4 tables from TEST_APPEND, using 8KB and 16KB block sizes and without and with ordering the data before inserting.

As I expected, the target table in a 16KB tableespace with data pre-ordered by the best compressible columns achieved the best space savings. So, this might be a case for larger block sizes ? Table Compression works at the Block Level -- compressing repeated values it finds within a data block, irrespective of whether the same values may or may not exist in contiguous blocks. Therefore, fitting the largest number of rows in a block with the greatest possible repetition (ie ordered by low cardinality keys) achieves the best compression.

SQL>
SQL>
SQL> set numformat 99,999,999,999
SQL> set timing on
SQL> -- get the current size of TEST_APPEND
SQL> rem exec dbms_Stats.gather_table_stats(ownname=>'HEMANT',tabname=>'TEST_APPEND',method_opt=>'FOR ALL COLUMNS SIZE 1',estimate_percent=>100,degree=>4);
SQL> select blocks, bytes/1048576 from user_segments where segment_name = 'TEST_APPEND';

BLOCKS BYTES/1048576
--------------- ---------------
1,244,672 9,724

Elapsed: 00:00:05.79
SQL> select blocks, num_rows,avg_row_len,num_rows*avg_row_len*1.2/8192 Expected_block from user_tables where table_name = 'TEST_APPEND';

BLOCKS NUM_ROWS AVG_ROW_LEN EXPECTED_BLOCK
--------------- --------------- --------------- ---------------
1,242,880 82,499,184 103 1,244,739

Elapsed: 00:00:00.09
SQL>
SQL> rem Setting a large S_A_S manually (Temporary tablespace has been recreated with 100M extents)
SQL> rem alter session set workarea_size_policy='MANUAL';
SQL> rem alter session set sort_Area_size=104857600;
SQL>
SQL>
SQL>
SQL> drop table ta_comp_8k_noord;

Table dropped.

Elapsed: 00:00:00.87
SQL> drop table ta_comp_8k_ordered;

Table dropped.

Elapsed: 00:00:00.24
SQL>
SQL> drop table ta_comp_16k_noord;

Table dropped.

Elapsed: 00:00:00.13
SQL> drop table ta_comp_16k_ordered;

Table dropped.

Elapsed: 00:00:00.17
SQL>
SQL> create table ta_comp_8k_noord tablespace test_compress_8k as select * from test_append where 1=2;

Table created.

Elapsed: 00:00:00.32
SQL> create table ta_comp_8k_ordered tablespace test_compress_8k as select * from test_append where 1=2;

Table created.

Elapsed: 00:00:00.19
SQL>
SQL> create table ta_comp_16k_noord tablespace test_compress_16k as select * from test_append where 1=2;

Table created.

Elapsed: 00:00:02.28
SQL> create table ta_comp_16k_ordered tablespace test_compress_16k as select * from test_append where 1=2;

Table created.

Elapsed: 00:00:00.11
SQL>
SQL> alter table ta_comp_8k_noord compress;

Table altered.

Elapsed: 00:00:00.10
SQL> alter table ta_comp_8k_ordered compress;

Table altered.

Elapsed: 00:00:00.01
SQL>
SQL> alter table ta_comp_16k_noord compress;

Table altered.

Elapsed: 00:00:00.04
SQL> alter table ta_comp_16k_ordered compress;

Table altered.

Elapsed: 00:00:00.03
SQL>
SQL> alter table ta_comp_8k_noord nologging;

Table altered.

Elapsed: 00:00:00.02
SQL> alter table ta_comp_8k_ordered nologging;

Table altered.

Elapsed: 00:00:00.01
SQL>
SQL> alter table ta_comp_16k_noord nologging;

Table altered.

Elapsed: 00:00:00.02
SQL> alter table ta_comp_16k_ordered nologging;

Table altered.

Elapsed: 00:00:00.01
SQL>
SQL> alter session enable parallel dml;

Session altered.

Elapsed: 00:00:00.01
SQL>
SQL> insert /*+ APPEND PARALLEL (2) */ into ta_comp_8k_noord select /*+ PARALLEL (test_a 2) */ * from test_append test_a;

82499184 rows created.

Elapsed: 00:11:10.02
SQL> commit;

Commit complete.

Elapsed: 00:00:00.13
SQL>
SQL> insert /*+ APPEND PARALLEL (2) */ into ta_comp_16k_noord select /*+ PARALLEL (test_a 2) */ * from test_append test_a;

82499184 rows created.

Elapsed: 00:13:20.69
SQL> commit;

Commit complete.

Elapsed: 00:00:00.05
SQL>
SQL>
SQL> select sn.name, s.value
2 from v$statname sn, v$mystat s
3 where sn.statistic#=s.statistic#
4 and sn.name in ('CPU used by this session', 'consistent gets', 'db block changes', 'db block gets', 'physical reads direct','physical writes direct','redo entries', 'redo size','undo change vector size')
5 order by sn.name;

NAME VALUE
---------------------------------------------------------------- ---------------
CPU used by this session 188,251
consistent gets 2,503,958
db block changes 39,542
db block gets 406,400
physical reads direct 2,482,350
physical writes direct 360,009
redo entries 42,086
redo size 4,791,160
undo change vector size 825,280

9 rows selected.

Elapsed: 00:00:00.11
SQL>
SQL> -- get the current size of TA_COMP_8K_NOORD
SQL> exec dbms_Stats.gather_table_stats(ownname=>'HEMANT',tabname=>'TA_COMP_8K_NOORD',method_opt=>'FOR ALL COLUMNS SIZE 1',estimate_percent=>100,degree=>4);

PL/SQL procedure successfully completed.

Elapsed: 00:14:45.67
SQL> select blocks, bytes/1048576 from user_segments where segment_name = 'TA_COMP_8K_NOORD';

BLOCKS BYTES/1048576
--------------- ---------------
243,200 1,900

Elapsed: 00:00:00.03
SQL> select blocks, num_rows,avg_row_len from user_tables where table_name = 'TA_COMP_8K_NOORD';

BLOCKS NUM_ROWS AVG_ROW_LEN
--------------- --------------- ---------------
242,938 82,499,184 103

Elapsed: 00:00:00.01
SQL>
SQL> -- get the current size of TA_COMP_16K_NOORD
SQL> exec dbms_Stats.gather_table_stats(ownname=>'HEMANT',tabname=>'TA_COMP_16K_NOORD',method_opt=>'FOR ALL COLUMNS SIZE 1',estimate_percent=>100,degree=>4);

PL/SQL procedure successfully completed.

Elapsed: 00:15:06.52
SQL> select blocks, bytes/1048576 from user_segments where segment_name = 'TA_COMP_16K_NOORD';

BLOCKS BYTES/1048576
--------------- ---------------
118,272 1,848

Elapsed: 00:00:00.23
SQL> select blocks, num_rows,avg_row_len from user_tables where table_name = 'TA_COMP_16K_NOORD';

BLOCKS NUM_ROWS AVG_ROW_LEN
--------------- --------------- ---------------
118,138 82,499,184 103

Elapsed: 00:00:00.07
SQL>
SQL>
SQL>
SQL> insert /*+ APPEND PARALLEL (2) */ into ta_comp_8k_ordered select /*+ PARALLEL (test_a 2) */ * from test_append test_a order by owner,object_type,object_name;

82499184 rows created.

Elapsed: 02:42:06.73
SQL> commit;

Commit complete.

Elapsed: 00:00:00.09
SQL>
SQL> -- get the current size of TA_COMP_8K_ORDERED
SQL> exec dbms_Stats.gather_table_stats(ownname=>'HEMANT',tabname=>'TA_COMP_8K_ORDERED',method_opt=>'FOR ALL COLUMNS SIZE 1',estimate_percent=>100,degree=>4);

PL/SQL procedure successfully completed.

Elapsed: 00:10:50.87
SQL> select blocks, bytes/1048576 from user_segments where segment_name = 'TA_COMP_8K_ORDERED';

BLOCKS BYTES/1048576
--------------- ---------------
132,096 1,032

Elapsed: 00:00:00.14
SQL> select blocks, num_rows,avg_row_len from user_tables where table_name = 'TA_COMP_8K_ORDERED';

BLOCKS NUM_ROWS AVG_ROW_LEN
--------------- --------------- ---------------
131,950 82,499,184 103

Elapsed: 00:00:00.04
SQL>
SQL>
SQL>
SQL> insert /*+ APPEND PARALLEL (2) */ into ta_comp_16k_ordered select /*+ PARALLEL (test_a 2) */ * from test_append test_a order by owner,object_type,object_name;

82499184 rows created.

Elapsed: 02:44:44.25
SQL> commit;

Commit complete.

Elapsed: 00:00:00.04
SQL>
SQL> -- get the current size of TA_COMP_16K_ORDERED
SQL> exec dbms_Stats.gather_table_stats(ownname=>'HEMANT',tabname=>'TA_COMP_16K_ORDERED',method_opt=>'FOR ALL COLUMNS SIZE 1',estimate_percent=>100,degree=>4);

PL/SQL procedure successfully completed.

Elapsed: 00:10:59.63
SQL> select blocks, bytes/1048576 from user_segments where segment_name = 'TA_COMP_16K_ORDERED';

BLOCKS BYTES/1048576
--------------- ---------------
64,512 1,008

Elapsed: 00:00:00.10
SQL> select blocks, num_rows,avg_row_len from user_tables where table_name = 'TA_COMP_16K_ORDERED';

BLOCKS NUM_ROWS AVG_ROW_LEN
--------------- --------------- ---------------
64,291 82,499,184 103

Elapsed: 00:00:00.04
SQL>
SQL>




As we can see in the listing, table TA_COMP_16K_ORDERED is, effectively, 1008MB (64,512 blocks of 16KB) against the original table TEST_APPEND of 9,724MB (1,244,672 blocks of 8KB)
Note : The test_compress_8k and test_compress_16k tablespaces were created with UNIFORM SIZE 4M instead of AUTOALLOCATE, thus saving space on the "last extent" (which could be 64K in AUTOALLOCATE).


TableName BlkSz HWM SegBlks SegSzMB
TEST_APPEND 8KB 1,242,880 1,244,672 9,724

TA_COMP_8K_NOORD 8KB 242,938 243,200 1,900
TA_COMP_8K_ORDERED 8KB 131,950 132,096 1,032

TA_COMP_16K_NOORD 16KB 118,138 118,272 1,848
TA_COMP_16K_ORDERED 16KB 64,291 64,512 1,008




The above table summarises the result, except that it can take
time to get the data ordered if the source is very large.

No comments: