Enter the Vector, Victor!
Hello sir, I have a few questions and I hope
that you would help me clear my doubts…! What are the situations
when we add a EVI to a table on AS/00? What are the advantages
of adding a EVI to a table? I referred IBM manuals but I was not
able to interpret it clearly. Could you throw some light on this
sir…..!! Ashwin, (Bangalore –india)
Ashwin, this is a really good question as EVI indexes can add
a lot of performance to your query applications. I cover EVI
indexes and their use in my book, iSeries and AS/400 SQL
at Work, so here is an excerpt from the book explains
EVI indexes. If you want to know more, you can order the book
from www.sqlthing.com/book
Excerpt from iSeries and AS/400 SQL at
Work Copyright 2000, Howard F. Arner, Jr. All Rights Reserved
Encoded Vector Indexes
With the advent of V4R3, a new type of index has been added
to your arsenal of query response time weapons, the Encoded
Vector Index (EVI). A logical file, regular index, on the AS/400
is always a Binary Radix Tree. An EVI index is a bitmap of key
values. These indexes are useful when executing a query that
could return a large number of values because they can be
scanned more efficiently.
The AS/400 will use an index only if it thinks that the query
will return less than 20% of the total number of records in the
physical file. EVI indexes are useful to queries that will
return or access between 20% and 60% of the data in a file.
Above, 60% the AS/400 will usually elect to table scan. So,
EVI’s fill an important performance niche on the AS/400.
TRANSNUM |
STATE |
QUANTITY |
PARTID |
0001 |
FL |
22 |
AB21 |
0002 |
FL |
32 |
KL32 |
0003 |
TN |
91 |
AB19 |
0004 |
FL |
13 |
KL32 |
0005 |
FL |
21 |
AB01 |
0006 |
TN |
9 |
AB32 |
0007 |
WY |
2 |
AB01 |
Table 6.1: This table represents
transactions in physical file.
TRANSNUM STATE QUANTITY PARTID 0001 FL 22 AB21 0002 FL 32
KL32 0003 TN 91 AB19 0004 FL 13 KL32 0005 FL 21 AB01 0006 TN 9
AB32 0007 WY 2 AB01 The heart of the EVI is that it is based on
a bitmap of values that represent whether the database record at
that offset contains the value or not. Let’s use the table of
data shown in Table 6.1 as an example. The table contains 7 rows
of data. If a bitmap index were created on the STATE column, the
resulting bitmap index will conceptually look like the data
shown in Figure 6.1.
Figure 6.1: This figure
shows the bitmap index that was created on the STATE column.
Notice the values in the bitmap column for the FL key. The
first bit has a value of 1 indicating that the record at offset
1 in the table has the key value. The third bit is 0, indicating
that the record at offset 3 in the table does not have the key
value. If you were to query the database for records where the
STATE column equals FL, the AS/400 could retrieve the bitmap and
now knows the offsets for each FL record in the database.
In addition, the bitmaps can be combined via a logical OR to
find records with multiple values. For example, querying for
state equal to FL or WY would return a bitmap of 1101101 which
indicates to retrieve the records at offset 1,2,4,5 and 7.
The problem with bitmap structures is that they take a lot of
space. Let’s say the example transaction table has 200,000
records where all fifty states are referenced. The storage
requirement for a bitmap index is 200,000 * 50 or 10,000,000
bits. These can get unwieldy when you have a large number of
distinct key values in a bitmap index. They also cause problems
in updating data as the index consists of X number of arrays of
bits where X is the number of distinct key values. Adding a lot
of data to the database causes the arrays to be re-dimensioned
and extended continually. Enter the vector
EVI indexes are an advancement from IBM that allow all of the
benefits of a bitmap index but do not have the storage overhead.
Instead of using an array of bits for each key value, the AS/400
stores a single array of 1,2 or 4 byte vectors. The number of
vectors in the array is equal to the number of records in the
table. Each distinct value in the database will be given a
vector as it is encountered, and each offset in the array will
contain the vector value that represents the column value stored
in that record in the table. So an EVI index is a combination of
a table that maps keys to vectors and a singular array
representing the vector locations in the database. Figure 6.2
shows a conceptual vector table for the transaction data in
Table 6.1.
Figure 6.2: This
represents the conceptual view of an EVI index over the data in
Table 6.1.
When the AS/400 needs to find records in using the above
index, it will read the table of vectors to retrieve the vector
key for the column values it is searching for and then compare
those values to each element in the vector array. This is quite
speedy.
The number of bytes required to represent the column data is
related directly to the number of distinct values in the column.
If the number of distinct values is less than 256, the each
vector will be only one byte. If the number of values is less
than 32768, it can be represented using only a 2-byte vector.
Database columns having more than 32768 distinct values will be
represented by a 4-byte vector.
The point of the EVI methodology is that it takes less space
than the traditional bitmap index. In addition, the EVI is
maintainable in a production environment, whereas bitmap indexes
are notorious for causing problems. Another advantage of EVI
indexes is that the AS/400 can combine multiple EVI indexes
together in memory when resolving a database query. Imagine you
have two EVI indexes on the transaction table shown in Table
6.1. The first is on IDATE and the second is on STATE. Now
imagine executing the following query:
SELECT *
FROM TRANSACTIONS
WHERE STATE IN (‘FL’,’MO’,’HI’,’CA’)
AND QUANTITY BETWEEN 1 and 12 and
IDATE in (‘11/01/2002’,’11/12/2002’,’12/12/2002’)
The query optimizer may choose to merge the two indexes
together to get the result set. This merge can be very fast for
statements that will retrieve large numbers of records, so you
need to play with EVI indexes on your large tables.
|