Improving Performance by Batching Azure Table Storage Inserts

DZone 's Guide to

Improving Performance by Batching Azure Table Storage Inserts

· Cloud Zone ·
Free Resource

This is a short post to share the results of a little investigation I did that was inspired by comments on a post I wrote about using SQL Azure for handling session data. The comment was by someone reporting that SQL Azure seemed to be faster than Azure Table Storage for handling session data. My experiments show that SQL Azure and Table Storage have very similar performance when doing single writes (YMMV), so I can’t verify or refute the claim. However, I got to wondering which is faster for inserting and retrieving many “rows” of data. I know that Table Storage is supposed to be faster, but I wondered how much faster. So I wrote a two-part PHP script that does the following:

  1. Connects to SQL Azure.
  2. Inserts 100 rows to an existing database.
  3. Retrieves the 100 rows.

Here’s the code:

$conn = sqlsrv_connect(SQLAZURE_SERVER_ID.".database.windows.net,1433", array("UID"=>SQLAZURE_USER."@".SQLAZURE_SERVER_ID
                                                                            , "PWD"=>SQLAZURE_PASSWORD
                                                                            , "Database"=>SQLAZURE_DB
                                                                            , "ReturnDatesAsStrings"=>true));    
for($i = 0; $i < 100; $i++)
    $id = $i;
    $data = "GolferMessage".$i;
    $params = array($id, $data);
    $stmt1 = sqlsrv_query($conn, "INSERT INTO Table_1 (id, data) VALUES (?,?)", $params);
    if($stmt1 === false)
$stmt2 = sqlsrv_query($conn, "SELECT id, data, timestamp FROM Table_1");
while($row = sqlsrv_fetch_array($stmt2, SQLSRV_FETCH_ASSOC))

Note: The code above uses the SQL Server Driver for PHP to connect to SQL Azure.

The second part of the script does the equivalent for Table Storage:

  1. Connects to Azure Storage.
  2. Inserts 100 entities to an existing table.
  3. Retrieves the 100 entities.

Here’s the code:

$tableStorageClient = new Microsoft_WindowsAzure_Storage_Table('table.core.windows.net', STORAGE_ACCOUNT_NAME, STORAGE_ACCOUNT_KEY);
$batch = $tableStorageClient->startBatch();

for($i = 0; $i < 100; $i++)
    $name = $i;
    $message = "GolferMessage".$i;
    $mbEntry = new MessageBoardEntry();
    $mbEntry->golferName = $name;
    $mbEntry->golferMessage = $message;
    $tableStorageClient->insertEntity('MessageBoardEntry', $mbEntry);
$messages = $tableStorageClient->retrieveEntities("MessageBoardEntry", null, "MessageBoardEntry");
foreach($messages as $message)

Note: The code above uses the Windows Azure SDK for PHP to connect to Azure Storage.

The result of the test was that Table Storage was consistently 4 to 5 times faster than SQL Azure (again, YMMV). The key, however, was to use the $tableStorageClient->startBatch() and $batch->commit() methods with Table Storage. Without using batches, Table Storage opens and closes a new HTTP connection for each write, which results in slower performance than SQL Azure (which keeps a connection open for writes). When using batches with Table Storage, the connection is kept open for all writes.

Note: Many thanks to Maarten Balliauw who, when I was perplexed about the results of my tests without batching (I expected Table Storage to be faster, but because I didn’t know about batches for Table Storage, I was not getting the results I expected), suggested I try batching.

The complete script (with set up/tear down of database and Table) is attached in case you want to try for yourself.




Source: http://blogs.msdn.com/b/silverlining/archive/2012/01/25/improving-performance-by-batching-azure-table-storage-inserts.aspx


Opinions expressed by DZone contributors are their own.

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}