What’s a BULK INSERT?
If you want to get larger amounts of data into a database, the wonder weapon in the SQL Server environment is of course SQL Server Integration Services. For a SQL database in the cloud, however, no SSIS engine is available for the time being; there you need an Azure Data Factory with an SSIS Integration Runtime. But if it is just a matter of loading a formatted file into a table, you may be able to take a slightly "easier" route and load the data into the table using a BULK INSERT statement.
How can you use BULK INSERT?
The BULK INSERT command writes the data from a file into a table, and it automatically works in batches whose size can be configured. To have data that we can import, we export the result of a query on the WideWorldImporters database to a CSV file. To do this, we execute the following query in Azure Data Studio:
SELECT StockItemID
      ,StockItemName
      ,ColorName
      ,RecommendedRetailPrice
      ,TaxRate
      ,UnitPrice
      ,SupplierName
      ,SupplierCategoryName
FROM [Warehouse].[StockItems] si
LEFT JOIN [Warehouse].[Colors] c
    ON si.ColorID = c.ColorID
LEFT JOIN [Purchasing].[Suppliers] sup
    ON si.SupplierID = sup.SupplierID
LEFT JOIN [Purchasing].[SupplierCategories] supc
    ON sup.SupplierCategoryID = supc.SupplierCategoryID
We now export this data as CSV. Both in SQL Server Management Studio and in Azure Data Studio we can do this by selecting the whole result set, copying it, and pasting it into Excel, where we save the result as CSV. It would be easier to save directly from Azure Data Studio by right-clicking the result and selecting "Save as CSV", but then we have no control over thousands separators and column separators, which we may need to adjust, because even in 2020 computers are not yet ready to paste data you have exported from a database directly back into it.
Now we want to load this CSV file into a new table via BULK INSERT. For this we first have to create the table. A few weeks ago you already got to know the SELECT TOP 0 ... INTO pattern, which we now use to create it:
SELECT TOP 0
       StockItemID
      ,StockItemName
      ,ColorName
      ,RecommendedRetailPrice
      ,TaxRate
      ,UnitPrice
      ,SupplierName
      ,SupplierCategoryName
INTO Sales.SpecialItems
FROM [Warehouse].[StockItems] si
LEFT JOIN [Warehouse].[Colors] c
    ON si.ColorID = c.ColorID
LEFT JOIN [Purchasing].[Suppliers] sup
    ON si.SupplierID = sup.SupplierID
LEFT JOIN [Purchasing].[SupplierCategories] supc
    ON sup.SupplierCategoryID = supc.SupplierCategoryID
Now you can use BULK INSERT to write to this empty table, but you must make sure that the file exists on your database server, not on your client. This means you must either copy it there, put it on a network share, or, if you are using a Docker container as we do, copy it into that container:
docker cp WWI_Items.csv container_name:/tmp/
If you have provided the file cleanly (comma as column separator, no thousands separators, and a dot as decimal separator) and removed all NULL values (the string "NULL" is not a NULL value during import), you can now load the file into your table:
BULK INSERT Sales.SpecialItems
FROM '/tmp/WWI_Items.csv'
WITH (FORMAT = 'CSV');
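If cleaning the literal string "NULL" out of the file up front is inconvenient, you can also repair it after loading. A minimal sketch for our example table; ColorName is one nullable column here, and you would repeat this for every column affected in your data:

```sql
-- Replace the literal string 'NULL' with a real NULL after the import.
UPDATE Sales.SpecialItems
SET ColorName = NULL
WHERE ColorName = 'NULL';
```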
In the WITH block you have further options for controlling the import; for example, you can set the column separator.
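A sketch of some of these options, under the assumption that your file uses a semicolon as column separator, has a header row, and should be committed in batches of 5,000 rows. FIELDTERMINATOR, ROWTERMINATOR, FIRSTROW and BATCHSIZE are documented BULK INSERT options; the values here are only illustrative:

```sql
BULK INSERT Sales.SpecialItems
FROM '/tmp/WWI_Items.csv'
WITH (
    FORMAT = 'CSV',          -- CSV parsing per RFC 4180 (SQL Server 2017 and later)
    FIELDTERMINATOR = ';',   -- column separator
    ROWTERMINATOR = '\n',    -- row separator
    FIRSTROW = 2,            -- skip a header row
    BATCHSIZE = 5000         -- commit every 5,000 rows
);
```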
Provided your CSV files are clean and correspond to the desired input format, you now have a technically simple way to quickly load data from a CSV file into a table in your database.
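One caveat for the cloud scenario from the beginning: an Azure SQL Database has no local file system you could copy the file to, but BULK INSERT can read from Azure Blob Storage via an external data source. A sketch with placeholder names for the credential, the data source, the storage account and the container:

```sql
-- Placeholder names throughout: adjust credential, account, container and SAS token.
CREATE DATABASE SCOPED CREDENTIAL BlobCredential
WITH IDENTITY = 'SHARED ACCESS SIGNATURE',
     SECRET = '<your SAS token>';

CREATE EXTERNAL DATA SOURCE ImportBlobStorage
WITH (
    TYPE = BLOB_STORAGE,
    LOCATION = 'https://youraccount.blob.core.windows.net/yourcontainer',
    CREDENTIAL = BlobCredential
);

BULK INSERT Sales.SpecialItems
FROM 'WWI_Items.csv'
WITH (DATA_SOURCE = 'ImportBlobStorage', FORMAT = 'CSV');
```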