This one was interesting and if it wasn’t for a quite obscure article on page 3 of a Google search (who even goes to page 3 anymore) it would have taken me a lot longer to figure out the problem.
Basically we’re hitting a Kafka queue in SSIS to get and process messages as they appear. There’s some info here on how to connect SSIS to Kafka.
The messages that are coming down from the queue are in UTF8 format and I wanted to store them in an NVARCHAR(MAX) column, in case we ever need to process multilingual messages.
As we get the messages down I’m storing them as a string in UTF8, which as per the below isn’t unicode:
And then when we push them to the output buffer we’re using UTF8, still not unicode:
The problem came when I tried to push the field into an NVARCHAR(MAX) column in the database, now this IS unicode So we end up getting an error when attempting to push the value from the buffer into the table:
“Unicode data is odd byte size for column 4. Should be even byte size.”
It was an easy fix, just use Unicode as the encoding when you’re pushing the value to the otuput buffer.
This is caused by SQL Server expecting an even number of bytes in an NVARCHAR column as per the below:
“Notice that compressed Unicode strings are always an odd number of bytes. This is how SQL Server determines that the string has actually been compressed, because an uncompressed Unicode string—which needs 2 bytes for each character—will always be an even number of bytes” Source (You’ll need to search for ‘odd’ to find this section..
If nothing else this was an interesting error that I couldn’t find any more info on! So hopefully this helps someone..