Navigation:  Articles > Nov-2001 >

ADO.Net Revolution: Data Adapter

Previous pageReturn to chapter overviewNext page

Peter Vogel        

Following up on Rob Macdonald's article on how ADO.NET has grown from ADO, Peter Vogel looks at how ADO.NET is different from the ADO that you know.

In the August 2001 issue of Smart Access, Rob Macdonald discussed how ADO.NET can be considered an evolution of ADO. Rob looked at ADO.NET as a version of ADO with the cursor location property set permanently on "cursor-side." In this article, I'll concentrate on the aspects of ADO.NET that are different from ADO.

ad6468x60
 

How is ADO.NET like ADO? Like ADO, ADO.NET has a Connection object that you use to manage your connection to your data source. Like ADO, ADO.NET also has a Command object. The Command object is used to submit SQL statements or to execute stored procedures. Like the ADO Command object, the ADO.NET Command object accepts parameters and supports re-executing compiled commands. Things start to get different when you look at the ADO.NET DataSet object. In many ways, the DataSet object looks like a weird amalgam of all of the old DAO objects holding tables, relations, and columns.

The ADO Recordset object held a collection of records that you could scroll through. The ADO.NET DataSet can do the same but can hold several sets of records and the relationships between them. The ADO.NET DataSet is like a portable database containing tables and views along with the data description information that defines them. You can consider an ADO.NET DataSet to be an in-memory database that you hold in your application's memory space (see the sidebar, "The DataReader," for the one exception to ADO.NET's separation of client and server).

The DataAdapter

The completely new object in the ADO.NET world is the DataAdapter. The purpose of the DataAdapter is embedded in its name: It performs the activities necessary to get the data from the data source on the server into the database that's held in the DataSet. To do that, the DataAdapter lets you specify the commands that should be carried out to retrieve and update data.

In ADO, if you used a client-side Recordset, your data was disconnected from the data source. This had lots of advantages, not the least of which was that you could close your application's connection to the data source and still work with the data. This would free up the connection so that it could be used by other applications, improving your application's scalability.

Disconnected Recordsets also made applications more scalable in other ways. In addition to reducing the load on the data server by reducing the number of active connections, the data server's load was also reduced because activities performed on disconnected data had no impact on the server. In order to transmit changes back to the data source, you used ADO's UpdateBatch method. The process was straightforward: You created a disconnected Recordset; you made changes to the data; you called the UpdateBatch method; and... magic happened. If you had carefully crafted SQL statements or stored procedures, it didn't matter. UpdateBatch performed your updates any way that it saw fit. Some of Microsoft's documentation suggests that your original Recordset was re-created on the server, and the records scrolled through, reconciling the changes between the Recordset from the client and its half-sibling on the server, a very inefficient process.

The DataAdapter improves on this process. The object provides four properties that allow you to control how updates are made to the server: SelectCommand, UpdateCommand, InsertCommand, and DeleteCommand. The four properties are set to Command objects that are used when data is manipulated.

For instance, when you call the DataAdapter's Fill method to retrieve data from a data source and pour it into a DataSet, the Command object in the SelectCommand property is used. The DataAdapter is the gatekeeper that sits between your DataSet and the data source. Instead of using Command objects directly with a Connection, the DataAdapter manages your Command objects as they interact with the data source.

There are problems with this strategy of using the DataAdapter object to bridge the gap between the data source and the data set. One is some really ugly syntax. For instance, here's the code to retrieve all of the records from the Customers table:

Dim conn As SqlConnection

Dim cmd As SqlCommand

Dim adt As SqlDataAdapter

Dim dts As DataSet

 

conn= New SqlConnection(…connection string…)

cmd = New SqlCommand("SELECT * FROM Customers", conn)

adt = New SqlDataAdapter

adt.SelectCommand = cmd

dts = New DataSet

As you can see, the code creates a Connection object, a Command object that uses the connection object, and a DataAdapter. One word of warning: To save space, I've shortened the syntax down. Where I have "SqlConnection," I should have "System.Data.SqlClient.SqlConnection." Just make the substitution in your head as you read this code, and it will save another tree.

Once created, the DataAdapter object's SelectCommand property is set to the Command object. This section of code completes with the creation of a DataSet. With the preparatory work done, it's time to add to the load on the data server. This line of code pours the data into the DataSet using that ugly syntax that I referred to:

adt.Fill(dts, "Customers")

The Fill method of the DataAdapter is passed the DataSet and a string. The string is used to name the DataTable created by the Fill method. Once the DataSet is Filled, you have your own personal database containing a table with the data that you retrieved.

You can bypass some of this code by passing the Select command and connection to the DataAdapter when you create it:

conn= New SqlConnection(…connection string…)

adt = New SqlDataAdapter("SELECT * FROM Customers", conn)

dts = New DataSet

adt.Fill(dts, "Customers")

 

The philosophy of object data access

What's not to like? In an ideal object-oriented world, the methods of an object are things that the object knows how to do to itself. In other words, when you call a method, something happens to the object or to the real-world resources that the object controls. For instance, if you call the UpdateBatch method of an ADO Recordset, the records managed by the Recordset are updated. The design goal is to reduce or limit the side effects of any action. However, the Fill method of the DataAdapter does nothing to the DataAdapter. Instead, the DataSet passed to the Fill method is populated.

This is just wishful thinking on my part, but I would have preferred a syntax that put a Fill method on the DataSet object, passing the DataAdapter to be used:

dts.Fill(adt, "Customers")

Oh well.

There's no doubt that some of the elegant simplicity of ADO has been lost in the move to ADO.NET. Microsoft's original data access strategy, DAO, was a set of objects that looked like a database (to be exact, like a Jet database). ADO swept aside a lot of the underbrush to boil down data access to three objects—connections to the database, commands issued against the database, and the records retrieved from the database.

ADO.NET, for the most part, creates a division between the client-side DataSet and the server-side data source. This reflects the current emphasis on scalability: Reduce the demand on the shared resource (the server) and provide as much disconnected functionality to the individual clients as possible. The result is a more complicated DataSet object and a more complicated relationship between the data source and the DataSet.

Using the DataAdapter

Using Command objects with the DataAdapter isn't all that different from using them in ADO—you create the Command objects, set their parameters, and Fill your DataSet. Rewriting the previous code to retrieve a single customer would look like this:

conn= New SqlConnection(…connection string…)

cmd = New SqlCommand("SELECT * FROM Customers" & _

              "Where CustomerId = @cust", conn)

cmd.Parameters.Add("@Cust", SQLDbType.VarChar, 5)

 

adt = New SqlDataAdapter

adt.SelectCommand = cmd

dts = New DataSet

 

adt.SelectCommand.Parameters("@cust").Value = "ALFKI"

adt.Fill(dts, "Customers")

This particular syntax will only work if you're using the ADO.NET objects designed to work with SQL Server. The SQL Server objects allow you to define parameters by using names prefixed with an "@." Unlike ADO, you must explicitly add your parameters to the Command object's Parameters collection.

If you use the general-purpose objects that communicate with any data server, you define parameters by inserting a question mark into your SQL statement. With the general-purpose objects, parameters are referenced by position. Here's the relevant code from my previous example:

cmd = New OleDbCommand("SELECT * FROM Customers " & _

                "Where CustomerId = ?", conn)

cmd.Parameters.Add("@Cust", DbType.String, 5)

 

Mixing and matching

Much of flexibility of ADO remains in ADO.NET. You can, for instance, use multiple DataAdapters with a single DataSet. This code fills a single dataset from multiple DataAdapters:

dts = New DataSet

 

cmd = New SqlCommand("SELECT * FROM Customers", conn)

adt = New SqlDataAdapter

adt.SelectCommand = cmd

 

cmd2 = New SqlCommand("SELECT * FROM Orders", conn)

adt2 = New SqlDataAdapter

adt2.SelectCommand = cmd2

 

adt.Fill(dts, "Customers")

adt2.Fill(dts, "Orders")

In this example both of the DataAdapters used the same Connection. There's no reason that you can't have DataAdapters that use different Connections and so pull data from a variety of sources. A DataSet is really a container that holds tables, and it doesn't matter where the tables come from or what their data sources are. A single DataSet can provide you with a single client-side database that allows you to manipulate all of the data that your application needs, no matter how many sources it's drawn from.

When you call the Fill method, the DataAdapter uses the Command object in the SelectCommand property and the Command object uses the Connection it's associated with. If the Connection is closed, then it's opened for the time that it takes to retrieve the data and then closed. However, if the Command object finds the Connection open, it leaves it open, which gives you an opportunity to make your code more efficient.

In the previous example, each call to the Fill method would open and close a connection to the data source. In general, it's probably more efficient to open the connection once for the duration of the two Fills and close it when you're done:

conn.Open()

adt.Fill(dts, "Customers")

adt2.Fill(dts, "Orders")

conn.Close()

The relative efficiency of the two techniques will depend on the data source, connection pooling, and other environment-dependent variables; so test before you commit.

For a DataSet to truly be a database, you need to be able to establish relationships between your tables. To do that, you need to use a DataRelation object. DataRelation objects live in the DataSet's Relations collection and are added to it using the collection's Add method. The Add method accepts three parameters: a name for the relation and the two fields involved in the parent-child relationship. The following example establishes a relationship between the Customer and Products DataTables that I've put in my DataSet. The syntax for accessing DataTables in a DataSet and a Column in a DataTable is pretty obvious:

dts.Relations.Add("CustOrders", _

  dts.Tables("Customers").Columns("CustomerID"), _

  dts.Tables("Orders").Columns("CustomerID"))

You can even fill a DataTable multiple times from different Fill methods. The Fill method will compare the primary keys of records in the DataTable with the primary keys of the records in the data source. If the record already exists in the table, the incoming and existing rows are merged; if no matching record is found for the incoming data, it's appended to the DataTable.

Updating data

As with selecting records, when you've finished modifying the data in a DataSet, it's the DataAdapter that handles moving the data from your local in-memory database to the data source. The DataAdapter's Update method accepts a DataSet and the name of the DataTable whose updates are to be processed. This gives the Update method a certain symmetry with the Fill method:

adt.Fill(dts, "Customers")

...code to update the DataSet...

adt.Update(dts, "Customers")

During Update processing, the DataSet is analyzed for changes and the appropriate Update, Delete, or Insert command is called. The DataAdapter's UpdateCommand, DeleteCommand, and InsertCommand properties must have been set to Command objects that will do the necessary updates. Creating an Update command consists of defining an SQL statement with the necessary parameters for the fields to update and the primary key for the record. This code creates an Update statement that changes the CompanyName and ContactName fields for a specific CustomerId:

Dim ucmd As SqlCommand

ucmd = New SqlCommand("UPDATE Customers " & _

       "Set CompanyName = @CompanyName, " & _

       "    ContactName = @ContactName " & _

       "Where CustomerId = @CustomerId", conn)

adt.UpdateCommand = ucmd

Once the SQL statement is defined, you can define the entries in the DataAdapter's parameters collection. The Parameters collection's Add method accepts a fourth parameter that's the name of the field that the parameter is to be filled with when the updates are done. These four parameters will have their values set from the CompanyName, ContactName, and CustomerId fields:

adt.UpdateCommand.Parameters.Add( _

       "@CompanyName", SqlDbType.NVarChar, 15, _

       "CompanyName")

adt.UpdateCommand.Parameters.Add( _

       "@ContactName", SqlDbType.NVarChar, 15, _

       "ContactName")

adt.UpdateCommand.Parameters.Add( _

       "@CustomerId", SqlDbType.NVarChar, 5, _

       "CustomerId")

As you can imagine, creating the Update and Insert commands can be time-consuming for tables with lots of fields. The Delete command is typically simpler since you only need to provide a parameter for the primary key field:

Dim dcmd As SqlCommand

dcmd = New SqlCommand("Delete Customers " & _

       "WHERE CustomerId = @CustomerId", conn)

adt.DeleteCommand = dcmd

adt.DeleteCommand.Parameters.Add( _

       "@CustomerId", SqlDbType.NVarChar, 5, _

       "CustomerId")

While you can craft your own SQL statements, if your DataSet consists of a single DataTable that was drawn from a single table in the data source, you can have the commands automatically generated for you through a CommandBuilder object. There are restrictions on using the CommandBuilder: You must have retrieved a unique key field for the rows, and the field names can't contain any unusual characters (spaces, punctuation marks, and so forth). To use the CommandBuilder, just pass the DataAdapter (after you create it) to the CommandBuilder object when you create the CommandBuilder:

Dim cmb As SqlCommandBuilder

cmb = New SqlCommandBuilder(adt)

adt.Fill(dts, "Customers")

dts.Tables("Customers").Columns("CompanyName") _

                       = "PH&V"

adt.Update(dts, "Customers")

As usual, there's more that I could have covered here. This covers the core of what you need to know to use the DataAdapter object. As you've seen, there's no equivalent object to the DataAdapter in ADO. Besides, too many people were getting to know how ADO worked. If too many people understand how these things work, it'll drive down our billing rates.

Sidebar: The DataReader

A DataReader resembles an ADO Recordset more than anything else either in ADO or ADO.NET. The intent of the ADO.NET DataReader is to replace the default ADO Recordset and be a highly efficient way to run through records. Also like the ADO Recordset, an ADO.NET DataReader is created from a Command object. The ExecuteReader method of the Command object is used to create a DataReader:

Dim cmd As SqlCommand

Dim rdr As SqlDataReader

cmd = New SqlCommand("Select * From Customers", conn)

rdr = cmd.ExecuteReader()

To pull records from the DataReader, you use the Read method. The Read function makes the next record being pulled from the data source available for processing. If there's no next record, the Read method returns False. As an example, this code moves through the records being retrieved by the DataReader and stops when there are no more records to process:

Do While rdr.Read()

 …record processing…

Loop

Essentially, the Read method combines the functionality of ADO's MoveNext method with ADO's EOF property. This eliminates the single major cause of infinite loops in ADO data processing: leaving the MoveNext out of the body of the loop.

The DataReader stream is unbuffered, so there's a minimal demand on memory. However, records are retrieved only when you call the Read method. As a result, the DataReader must maintain its connection to the data source. The connection isn't closed until you call the DataReader's Close method:

rdr.Close

The records in the DataReader are read-only, so there's not much manipulation that you'll do with the data. Typically, you'll use the DataReader to retrieve records and write them out to a grid on your form or into a table in a Web page. For maximum scalability, you should retrieve only the fields and records that you need, process the records as quickly as possible, and close your DataReader as soon as possible.

You're not restricted to retrieving a single set of records with the DataReader. You can use multiple Select statements or calls to stored procedures. You move from one set of records to another using the DataReader's NextResult method. In this example, the records from the Orders and Customers tables are retrieved. However, the code skips over the Orders records to loop through the Customer records:

Dim cmd As SqlCommand

Dim rdr As SqlDataReader

 

cmd = New SqlCommand("SELECT * From Orders;" & _

                    "SELECT * From Customers;", _

                      nwindConn)

rdr = cmd.ExecuteReader()

conn.Open()

rdr.NextResult

Do While myReader.Read()

…Customer records processing…

Loop

rdr.Close()

Like the Read method, the NextResult method returns False if there isn't another result set to retrieve.