ADO Best Practices
For Development of iSeries n-tier Applications
By Howard F. Arner, Jr.
for Client Server Development, Inc.
Introduction
ADO and ADO.Net are technologies for developing
distributed applications that reside on Microsoft platforms and
communicate with the iSeries.
The purpose of this document is to outline the best policies
for developing with ADO, when ADO is not appropriate and how to use a
development methodology with the iSeries that ensures the data access
layer of your distributed application is scalable and reliable. This
document reveals when a programmer should use the Client Access OLEDB
provider, when to use the Client Access ODBC provider via MSDASQL and
when to use direct programming to the ODBC interface. The intended
audience for this paper is a programmer familiar with ADO, VBScript or
Visual Basic and ADO.Net.
Which Access Method
At the time of this writing, there is not a
managed provider available for accessing the iSeries via ADO.Net.
However, there is an OLEDB provider for accessing the iSeries via ADO
or OLEDB. This provider allows you to access specific capabilities of
the iSeries, some of which are not readily available via an ODBC
provider. These are:
o
Access to data queues
o
Record level access to physical and
logical files
o
Access to iSeries API code
o
Access to CL Commands
o
Running SQL statements
There is no support in the ODBC provider for
record level access to physical and logical files. If this is
important to the section of code you are working on, it would be
appropriate to use the OLEDB provider. If you need to access data
queues, you can do this with the ODBC provider as long as you have
created stored procedures for manipulating data queues. In addition,
you can access and use iSeries API’s via the OLEDB provider, whereas
via ODBC you would need to call a native program as a stored
procedure. For these types of accesses, the OLEDB provider gives good
performance and integrity.
However, the OLEDB provider lacks good support
for manipulating the capabilities of DB2 when using SQL.
In tests performed by Client Server Development, we have found
that using the ODBC provider via the MSDASQL provider provides a 45%
improvement in performance when fetching more than one record. In
addition, the OLEDB provider does not support the packaging of SQL
statements and instead relies on the system and job level SQL
statement cache. Because most SQL statements are against legacy tables
with a large number of physical files, we feel the packaging of SQL
statements is of paramount importance in producing a scalable and
reliable n-tier system. Finally, the OLEDB provider only supports the
SQL naming convention for execution of SQL statements against the
iSeries. Sometimes, a program needs the *SYS naming convention which
is supported by ODBC in order to avoid effects programmed into the
*SQL naming convention for objects. It is for these reasons that we
recommend all SQL access to the iSeries be accomplished via the
MSDASQL provider against the Client Access ODBC driver.
If programming in the .Net framework, there is an
ODBC provider available in version 1.1, system.data.odbc, which allows
managed code access to ODBC drivers.
In our tests, this provider has performed quite well when
connecting to the iSeries and we feel that is should be considered if
you need managed code access to the iSeries.
However, one area where we recommend programming
directly to the ODBC API is when you must write more than one record
to a table. ODBC supports the binding of arrays of records to
parameter markers in SQL statements. This feature can significantly
improve the performance of database writes by decreasing the number
round trips between the client and server. In a recent test on a
client connected to the iSeries via a 256KB Internet connection we
were able to write 14 records per second using ADO/ODBC. On the same
connection using direct calls to ODBC and binding arrays of records we
were able to achieve writes of 400 records per second. This is a
significant performance difference.
Developing for Performance (Methodology)
The following is the recommended methodology for
developing distributed applications with the iSeries.
Data centric design
The programmer should begin by designing and
coding the SQL statement in a tool like SQLThing Enterprise Edition,
before any client code is written.
The statement should be tested against the iSeries as the
statement will be used in the component, (i.e. using parameter markers
rather than hard coded values in the where and having clauses). When
testing the developed statement, the programmer should place the
iSeries server job into debug mode so that they can view optimization
messages in the job log. Care should be taken to look for messages
that can negatively impact performance.
The following metrics should be obtained against
statements:
o
Prepare time
o
Execution time
o
Records per second
o
Overall execution time
If there are no contraindications in the job log
and times appear adequate to support the suspected load, the
programmer should copy the statement directly from the testing tool
and place it into his component. At this point, the programmer knows
that if there is a performance problem with the fielded application,
it is not in the data access layer as a quality assurance process has
vetted the statements.
The programmer should have the following data at
hand during this process
o
Index selectivity reports for each
logical file associated with the physical files
o
A report describing all physical files
and columns that are in the target tables
The programmer can use the job log output to see
which indexes, (logical files), are being used in the resolution of
the query and why other logical files are not being accessed. This
information can be valuable to spotting inconsistent data values or
comparisons between files and other actions that can negatively impact
performance.
Client Programming Best Practices
The following information recounts best
programming practices for achieving optimal performance when accessing
iSeries data.
Use of Prepared Statements
When programming in ADO, many programmers use a
technique called SQL mashing to put a select statement together. This
involves taking a stub of the SQL statement and concatenating together
the required where clause using string concatenation techniques and
then opening the recordset object using the concatenated statement.
This causes the iSeries to have to optimize the SQL statement when it
is received, which is a very bad thing when accessing legacy files.
Unlike most relational databases, iSeries databases support record
level access techniques and direct reading of indexes.
Whereas a typical relational database might have two or three
indexes over a table, the iSeries might have seventy or more indexes
over a physical file in order to support various programs that need to
access the data in different sequences.
Because the query optimizer attempts to evaluate each index
available, the optimization process can take an inordinate amount of
time and becomes very expensive.
Therefore, use of dynamic SQL should be avoided.
The alternative to dynamic SQL is to use static
SQL. This is similar to the creating a stored procedure in a product
like SQL server. When the iSeries first sees a static statement, it
will completely optimize the statement to find the best access plan
and then store a copy of the execution plan in an SQL Package file.
When you attempt to use the statement again, the iSeries will
first search the package file for the occurrence of the statement, and
if found it will forgo the optimization process and use the stored
execution plan. This is very efficient and saves both CPU and disk IO
access on the iSeries.
In order to use the packaging facility, you must
use a JDBC or ODBC driver to access the iSeries as packaged statements
are not currently supported by the OLEDB or DRDA access methods. In
addition, when programming in ADO the programmer should make use of
parameter markers in the statements as the iSeries will only package
statements that have no hard-coded values in the where or having
clauses. Here is an example of a dynamic SQL statement:
SELECT * FROM PARTS WHERE PART_CLASS = ‘Filters’
Here is an example of a static SQL statement
SELECT * FROM PARTS WHERE PART_CLASS = ?
In order to use static SQL, the programmer should
use the ADO Command object.
Here is a code snippet that will execute the above statement.
Dim Con1 as New ADODB.Command
Dim Cmd1 as New ADODB.Connection
Dim Rs as New ADODB.Recordset
Con1.Open “DSN=MYAS400;UID=HOWIE;PWD=SECRET;”
Cmd1.ActiveConnection = Con1
Cmd1.CommandText = “SELECT * FROM PARTS WHERE
PART_CLASS = ?”
Cmd1.Parameters.Refresh
Cmd1.Parameters(0).Value = “Filters”
Rs.CacheSize = 200
Rs.Open Cmd1
While Not Rs.Eof()
…
Rs.MoveNext
Wend
Rs.Close
Con1.Close
The above is close to best practice code, as it
uses parameter markers and adjusts the cachesize of the recordset
before opening the recordset to anticipate the number of records being
returned by the client. Setting the cachesize allows the iSeries to
send up to 200 records per server fetch, thus reducing round trips
between client and server and increasing throughput. The DSN used in
this connection has packaging turned on, which will cause the iSeries
to store the execution plan of the command so that subsequent
executions of the statement need not be optimized.
To make this code into best practice code, the
programmed should use a connection string that does not incorporate a
DSN. DSNless connection string avoid a hit against the Windows
registry and allow for greater control over connection options.
Using DSNLess Connection Strings
The following information relates how to create a
connection string and use the connection string with an ADO or
ODBC.Net provider.
A DSN, Data Source Name, is defined via the ODBC
Driver manager in the Windows Control Panel.
This causes an entry to be made into the system registry
recording all of the connection information that you enter into the
Control Panel. The
problem with using a DSN is that every time you connect to the AS/400,
the target computer must look up the connection information in the
Windows registry. This is
not a factor if you are casually connecting from applications like
Excel, Access or SQLThing, but if you are doing Active Server Pages
these registry lookups do add a small amount of time to the connection
process. Also, using a
DSN-less connection can give you finer control of connection
attributes; in fact you can control your connection attributes on a
page-by-page basis. In addition, by not requiring a DSN you make you
application or web pages easier to distribute to target computers,
i.e. you do not have to go to the target computer and use the control
panel to create a DSN.
There are a host of connectivity
options that you can use when connecting to the AS/400, and these can
lead your connection string to become quite complex.
Instead of typing all of the options in, you can let Windows
create a connection string for you by creating a Universal Data Link
(UDL) file. A UDL file is a file persisted set of information on how
to connect to a data source.
In this case, we will be using the UDL file as a quick way to
build an ODBC or OLEDB connection string.
To start, open a Command Prompt window, switch to a directory
you wish to use and type COPY CON A.UDL and press Enter.
Next, press the F6 key to send a ^Z command to the console.
You should see the text 1 file copied and then be
returned to a command prompt.
You have just created a 0 length file called A.UDL
in your target directory.
Next, pull up the Windows Explorer and right
click on the A.UDL file and select the Properties menu
item. This will bring up
an interactive window where you can set the provider and all
connection options for that provider.
Once you have your settings correct, press Apply then OK to
save the UDL information.
Next, open the UDL file with notepad, or some other editor, and Volla,
you have a preformed almost-perfect connection string.
The only problem with the string is that it still points to an
ODBC DSN. To make the
string perfect, remove the DSN information from the extended
properties and replace the quote (“) characters with double quotes
(“”) so that the string will not be confused in your ASP page.
Finally, after the Provider=MSDASQL.1; place the following
information: Driver=Client Access ODBC
Driver (32-bit); This
string lets the MSDASQL OLEDB provider know to use the Client Access
ODBC driver when it attempts to connect.
Here is a code snippet that shows a proper connection string
with lots of options:
ConStr = "Provider=MSDASQL.1;Driver=” ConStr =
ConStr & “Client Access ODBC Driver (32-bit);Password=SECRET;”
ConStr = ConStr & “Persist Security Info=True;User ID=HFA;Extended”
ConStr = ConStr & “Properties=""SYSTEM=172.16.2.200;CMT=0;” ConStr
= ConStr & “DBQ=WEBOE,HD1100PD,HD1100PO;NAM=0;” ConStr = ConStr &
“DFT=5;DSP=1;TFT=0;TSP=0;DEC=0;XDYNAMIC=1;” ConStr = ConStr &
“RECBLOCK=2;BLOCKSIZE=32;SCROLLABLE=0;” ConStr = ConStr &
“TRANSLATE=1;LAZYCLOSE=1;LIBVIEW=0; ConStr = ConStr &
“REMARKS=1;CONNTYPE=0;SORTTYPE=0; ConStr = ConStr &
“PREFETCH=0;DFTPKGLIB=QGPL;LANGUAGEID=ENU;” ConStr = ConStr &
“SORTWEIGHT=0;SSL=2;MAXFIELDLEN=32;” ConStr = ConStr &
“COMPRESSION=1;ALLOWUNSCHAR=0;” ConStr = ConStr &
“SEARCHPATTERN=0;MGDSN=0;"""
Set Con1 = Server.CreateObject(“ADODB.Connection”)
Con1.Open ConStr
Note that the connection string keyword
parameters randomly change for no apparent reason between versions of
Client Access, why is left as a mystery to be contemplated by minds
superior to ours. Because these connection keywords change, it is
recommended that you use the above method to create the desired
DSNless connection string. This gives you complete access to the GUI
interface IBM providers for setting up ODBC data sources, without
having to hit the registry to retrieve the information.
Bulk Insertion of Records
When writing record to the iSeries the best
practice is to use a command object and parameter markers so that the
insert statement is optimized and packaged once. However, in some
cases you are not writing one or two records during a transaction, but
many records. In these cases, it is desirable to use bulk insert
techniques in order to avoid multiple round trips between client and
server. Since the MSDASQL provider does not support this technique,
you must program directly to the ODBC API to take advantage of bulk
inserts.
Client Server Development provides a COM object
that eases the use of ODBC for performing bulk insert operations. This
COM object is written in VB6 and has a small companion DLL written in
C++ for memory management. The object has been commercially available
since 1996 and is currently in use by over 3,000 organizations to
abstract the VB and COM programmer from the intricacies of ODBC API
manipulation.
The COM object contains three objects
representing the ODBC Environment handle, the ODBC Connection handle
and the ODBC Statement handle. Each object exposes methods and
properties that allow the programmer to control how ODBC behaves and
invoke ODBC API functions.
To install the object, the programmer copies the two DLL files
to the Windows/System32 folder and then uses the REGSVR32 program to
register the ODBC400 DLL. Once installed, the COM object appears as a
reference-able object in the list of references in VB and other COM
programming environments. The following is a set of code used to bind
multiple records to an insert statement using the COM object. This
program opens a dBase file and reads records and binds them to the SQL
statement. When the number of records read is equal to the number of
records to bind per execute, the statement handle is executed and the
records are sent to the iSeries. The logic is continued until all
records are read from the dBase file and sent to the iSeries.
Dim OEnv As New sqltODBC400.odbcenv
Dim OCon As sqltODBC400.odbccon
Dim rc As Integer
Dim ADC As New ADODB.Connection
Dim rs As New ADODB.Recordset
Dim i As Long, ii As Long, e As Long
Dim eRcol As New Collection
Dim RW As Long, rr As Long
Dim begtime, endtime
Dim NBR
‘initialize ODBC objects
OEnv.Sql_Init
‘connect to the iSeries.
Set OCon = New sqltODBC400.odbccon
X = OCon.Connect(OEnv, txtDSN.Text, txtUID.Text,
txtPWD.Text)
‘Set number of records to bind before execute
NBR = CInt(250)
‘set recordset cachesize for performance
rs.CacheSize = 2500
‘open dBase file
ADC.Open "Provider=Microsoft.Jet.OLEDB.4.0;Data
Source=" & App.Path & ";Extended Properties=""dBASE 5.0;"";Persist
Security Info=False"
rs.Open "GLSUM", ADC
Dim Stmt As String
Dim BinAR() As Variant
‘prepare odbc statement handle
Dim s1 As New sqltODBC400.odbcstmt
s1.Make OCon
‘make a table to hold information
rc = s1.Prep("drop table YYGLSUM")
rc = s1.Execute()
Stmt = ""
Stmt = Stmt & "CREATE TABLE YYGLSUM ("
Stmt = Stmt & " ACCOUNT CHAR(10) not null,"
Stmt = Stmt & " COMPANY integer not null,"
Stmt = Stmt & " BRANCH integer not null,"
Stmt = Stmt & " RMN integer not null,"
Stmt = Stmt & " PRMN integer not null,"
Stmt = Stmt & " YR integer not null,"
Stmt = Stmt & " MO integer not null,"
Stmt = Stmt & " AMOUNT decimal(12,2) not null,"
Stmt = Stmt & " YTD decimal(12,2) not null,"
Stmt = Stmt & " PYAMOUNT decimal(12,2) not null,"
Stmt = Stmt & " PYTD decimal(12,2) not null,"
Stmt = Stmt & " PK integer not null primary key)"
‘prepare create table statement
rc = s1.Prep(Stmt)
If rc <> True Then
OEnv.Errd
Exit
Sub
End If
‘execute statement
rc = s1.Execute()
If rc <> True Then
OEnv.Errd
End If
‘kill the statement object
s1.Kill
‘make a new one
Set s1 = New sqltODBC400.odbcstmt
s1.Make OCon
begtime = Timer()
‘note use of parameter markers
Stmt = "insert into YYGLSUM values
(?,?,?,?,?,?,?,?,?,?,?,?)"
‘the mrprep method prepares the statement and
indicates
‘the maximum number of rows we will bind per execute
rc = s1.MrPrep(Stmt, NBR)
‘mrpreprows prepares the memory buffers
rc = s1.MrPrepRows()
Dim RowRead As Long
RowRead = 1
RW = 0
rr = 0
While Not rs.EOF
rr =
rr + 1
For i
= 1 To 11
‘use MRBindParm method to place data into buffer
rc = s1.MrBindParm(rs.Fields(i - 1).Value, i, RowRead - 1)
If rc <> True Then
s1.Errd
End If
Next
'add
primary key
rc =
s1.MrBindParm(rr, 12, RowRead - 1)
If rc
<> True Then
s1.Errd
End If
If
RowRead = NBR Then
rc = s1.MrExecute(NBR)
If rc <> -1 Then
If s1.Errnum = 170 Then
‘ this is to counter a bug sometimes found
‘when communicating with SQL Server
s1.MrExecute (NBR)
Else
s1.Errd
End If
End If
DoEvents
If Not rc Then
e = e + 1
End If
RowRead = 0
End If
rs.MoveNext
DoEvents
RowRead = RowRead + 1
RW = RW + 1
Wend
‘commit the transactions
OCon.Commit
endtime = Timer()
MsgBox "ET: " & endtime - begtime & " RPS " & RW /
(endtime - begtime) & " for " & RW & " rows"
Performance Verification
Finally, it is important that we verify the
actions of the completed component against the iSeries. It is my
recommendation that the component be run while the IBM Database
Monitor is active and journaling SQL information. Once all aspects of
the component have been exercised the monitor is turned off and data
in the monitor file is examined using SQL statements and programs
provided by Client Server Development. By examining the database
monitor, we can spot potential performance problems like creation of
temporary indexes, unnecessary code-page sorts, use of dynamic SQL or
dynamic rebuilding of access plans. This audit will ensure that the
system is robust, scalable and have little impact on normal iSeries
operations.
|