How do I reduce transaction log growth for batched nvarchar(max) updates?

Our app needs to add large amounts of text to a SQL Server 2005 database (up to 1 GB for a single record). For performance reasons, this is done in chunks, by making a stored procedure call for each chunk (say, usp_AddChunk). usp_AddChunk does not use any explicit transactions.

What I’m seeing is that reducing the chunk size from 100MB to 10MB results in massively larger transaction logs. I’ve been told this is because each time usp_AddChunk is called, an “implicit” (my term) transaction will log all of the existing text. So, for a 150MB record:

  • 100MB chunk size: 100 (0 bytes logged) + 50 (100 MB logged) = 100 MB logged

    will be smaller than

    10 MB chunk size: 10 (0 bytes logged) + 10 (10 MB logged) + 10 (20 MB logged) … + 10 (140 MB logged) = 1050 MB logged

    I thought that by opening a transaction in my C# code (before I add the first chunk, and commit after the last chunk), this “implicit” transaction would not happen, and I could avoid the huge log files. But my tests show the transaction log growing 5x bigger using the ADO.NET transaction.

    I won’t post the code, but here are a few details:

    1. I call SqlConnection.BeginTransaction()
    2. I use a different SqlCommand for each chunk
    3. I assign the SqlTransaction from (1) to each SqlCommand
    4. I usually close the connection after each SqlCommand execution, but I’ve also tried not closing the connection with the same results

    What’s the flaw in this scheme? Let me know if you need more info. Thanks!

    Note: using a simple or bulk-logged recovery model is not an option

  • 2 solutions collected from the web for “How do I reduce transaction log growth for batched nvarchar(max) updates”

    If by ‘chunks’ you mean something like:

    -- table and key are reserved words in T-SQL, so they are bracketed here
    UPDATE [table]
    SET blob = blob + @chunk
    WHERE [key] = @key;

    Then you are right that the operation is fully logged. You should follow the BLOB usage guidelines and use the .WRITE clause for chunked updates:

    -- a NULL offset appends @chunk to the end of the existing value
    UPDATE [table]
    SET blob.WRITE(@chunk, NULL, NULL)
    WHERE [key] = @key;

    This will minimally log the update (if possible, see Operations That Can Be Minimally Logged):

    The UPDATE statement is fully logged;
    however, partial updates to large
    value data types using the .WRITE
    clause are minimally logged.

    Not only is this minimally logged, but because the update is an explicit write at the end of the BLOB, the engine knows that you only updated a portion of the BLOB and will log only that portion. When you update with SET blob = blob + @chunk, the engine sees that the entire BLOB has received a new value. It does not detect that you really only changed the BLOB by appending new data, so it logs the entire BLOB (several times, as you already found out).
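A minimal sketch of how usp_AddChunk could be written around the .WRITE clause. The table name dbo.Documents, its columns, and the procedure's parameter list are assumptions made for illustration; only the procedure name comes from the question.

```sql
-- Hypothetical schema: adjust names and types to the real table.
CREATE TABLE dbo.Documents
(
    DocKey INT           NOT NULL PRIMARY KEY,
    -- .WRITE cannot be applied to a NULL value, so start from an empty string.
    Body   NVARCHAR(MAX) NOT NULL DEFAULT N''
);
GO

-- Assumed signature for the usp_AddChunk procedure mentioned in the question.
CREATE PROCEDURE dbo.usp_AddChunk
    @key   INT,
    @chunk NVARCHAR(MAX)
AS
BEGIN
    SET NOCOUNT ON;

    -- A NULL @Offset appends @chunk to the end of the existing value;
    -- @Length is ignored in that case.
    UPDATE dbo.Documents
    SET Body.WRITE(@chunk, NULL, NULL)
    WHERE DocKey = @key;
END;
GO
```

The row has to exist with a non-NULL value before the first chunk is appended, which is why the sketch gives the column an empty-string default.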

    BTW, you should use chunk sizes that are multiples of 8040 bytes:

    For best performance, we recommend
    that data be inserted or updated in
    chunk sizes that are multiples of 8040
    bytes.
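As a rough illustration of that recommendation, the following snippet rounds a desired chunk size down to a multiple of 8040 bytes and converts it to a character count. The 10 MB target is just an example value, and the conversion assumes nvarchar data at 2 bytes per character.

```sql
-- SQL Server 2005 does not allow DECLARE with an initializer, hence the SETs.
DECLARE @targetBytes BIGINT, @chunkBytes BIGINT, @chunkChars BIGINT;

SET @targetBytes = 10 * 1024 * 1024;             -- desired chunk size (example: 10 MB)
SET @chunkBytes  = (@targetBytes / 8040) * 8040; -- integer division rounds down to a multiple of 8040
SET @chunkChars  = @chunkBytes / 2;              -- assumes 2 bytes per nvarchar character

SELECT @chunkBytes AS ChunkBytes, @chunkChars AS ChunkChars;
```

The client would then send @chunkChars characters per call, with only the final chunk being shorter.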

    What you may have to do is surround each “chunk” or group of chunks with its own transaction and commit after each group. Surrounding the entire thing with your own ADO.NET transaction is essentially the same as what the implicit transaction does, so that won’t help. You have to commit in smaller units to keep the log smaller.
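A hedged sketch of the per-group commit described above, written in T-SQL rather than ADO.NET. It assumes usp_AddChunk takes @key and @chunk parameters (the question does not show its signature), and the chunk contents are placeholders.

```sql
-- Assumes a procedure dbo.usp_AddChunk(@key, @chunk) already exists.
DECLARE @key INT, @chunk NVARCHAR(MAX);
SET @key   = 1;
SET @chunk = REPLICATE(CAST(N'x' AS NVARCHAR(MAX)), 4020);  -- placeholder chunk

BEGIN TRY
    BEGIN TRANSACTION;   -- one transaction per group of chunks
    EXEC dbo.usp_AddChunk @key = @key, @chunk = @chunk;
    EXEC dbo.usp_AddChunk @key = @key, @chunk = @chunk;
    COMMIT TRANSACTION;  -- commit before starting the next group
END TRY
BEGIN CATCH
    IF @@TRANCOUNT > 0
        ROLLBACK TRANSACTION;

    -- Re-raise the original message so the caller sees the failure.
    DECLARE @msg NVARCHAR(4000);
    SET @msg = ERROR_MESSAGE();
    RAISERROR(N'%s', 16, 1, @msg);
END CATCH;
```

Under the full recovery model, a commit does not change how many bytes each update writes to the log. What it does is end the active transaction, so that log space can be reused once a log backup has run.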
