T-SQL Ninja #24

SELECT TOP 0 INTO

SELECT TOP 0 INTO

Occasionally you may want to create an empty table to store the data from a query. Of course, you can now check the metadata of all columns in the query from all source tables and write a DDL statement. But this can be quite complicated with queries with multiple joins and possibly manipulations in the columns. If you want to copy data into a new table you can do this with SELECT INTO. If you only want to copy the structure instead, you can simply copy the TOP 0 rows, i.e. no rows, which will give you a table that only reflects the structure of the source statement.

What can you use SELECT TOP 0 INTO for?

As described above, you can use SELECT TOP 0 INTO to create an empty table that is structurally identical to your query. We will do this once by looking at some of the attributes of the Customers table in the WideworldImporters database that we want to store in a MainCustomers table:

DROP TABLE IF EXISTS [Sales].[MainCustomers]
SELECT TOP (0) 
       [CustomerID]
      ,[CustomerName]
      ,[CreditLimit]
      ,[AccountOpenedDate]
      ,[StandardDiscountPercentage]
      ,[IsStatementSent]
      ,[IsOnCreditHold]
      ,[PaymentDays]
      ,[PhoneNumber]
      ,[FaxNumber]
      ,[DeliveryRun]
      ,[RunPosition]
      ,[WebsiteURL]
      ,[DeliveryAddressLine1]
      ,[DeliveryAddressLine2]
      ,[DeliveryPostalCode]
      ,[DeliveryLocation]
      ,[PostalAddressLine1]
      ,[PostalAddressLine2]
      ,[PostalPostalCode]
  INTO [Sales].[MainCustomers]
  FROM [Sales].[Customers]

In your database you will now find a MainCustomers table, which contains exactly the fields selected here with identical metadata. Of course, the whole thing gets a bit more complex if you execute a JOIN instead of just copying the metadata from a table. To do this, we first determine the total turnover and the number of orders from our customers:

SELECT 
[CustomerID]
,SUM(TransactionAmount) as [TotalCustomerValue]
,COUNT(Distinct CustomerTransactionId) as [NumberOfTransactions]
FROM [Sales].[CustomerTransactions]
GROUP BY [CustomerID]

We now execute this in a CTE, which we join to the above result set and again create a table in which we could write the result:

DROP TABLE IF EXISTS [Sales].[MainCustomers]

;WITH CTE_SalesPerCustomer AS (
    SELECT 
         [CustomerID]
        ,SUM(TransactionAmount) as [TotalCustomerValue]
        ,COUNT(Distinct CustomerTransactionId) as [NumberOfTransactions]
    FROM [Sales].[CustomerTransactions]
    GROUP BY [CustomerID]
)
SELECT TOP (0) 
       c.[CustomerID]
      ,[CustomerName]
      ,[CreditLimit]
      ,[AccountOpenedDate]
      ,[StandardDiscountPercentage]
      ,[IsStatementSent]
      ,[IsOnCreditHold]
      ,[PaymentDays]
      ,[PhoneNumber]
      ,[FaxNumber]
      ,[DeliveryRun]
      ,[RunPosition]
      ,[WebsiteURL]
      ,[DeliveryAddressLine1]
      ,[DeliveryAddressLine2]
      ,[DeliveryPostalCode]
      ,[DeliveryLocation]
      ,[PostalAddressLine1]
      ,[PostalAddressLine2]
      ,[PostalPostalCode]
      ,[TotalCustomerValue]
      ,[NumberOfTransactions]
  INTO [Sales].[MainCustomers]
  FROM [Sales].[Customers] c
  LEFT JOIN CTE_SalesPerCustomer spc 
  ON c.CustomerID = spc.CustomerID

As a result we now have a table whose metadata we want to look at:

SELECT c.name AS column_name
    ,c.column_id
    ,SCHEMA_NAME(t.schema_id) AS type_schema
    ,t.name AS type_name
    ,t.is_user_defined
    ,t.is_assembly_type
    ,c.max_length
    ,c.precision
    ,c.scale
FROM sys.columns AS c 
JOIN sys.types AS t ON c.user_type_id=t.user_type_id
WHERE c.object_id = OBJECT_ID('Sales.MainCustomers')
ORDER BY c.column_id;

Here you see as the last row of the result set the entries TotalCustomerValue as Decimal(38,2) column and NumberOfTransactions as Int column.

But now we want to do it even wilder and add a calculated column to the table:

ALTER TABLE [Sales].[MainCustomers] 
    ADD [AverageTransactionValue] 
    AS [TotalCustomerValue]/CAST(NumberOfTransactions AS Decimal(10,2))

From this table, we now execute another SELECT TOP 0 INTO:

SELECT TOP 0 * 
INTO [Sales].[MainCustomersExtended]
FROM [Sales].[MainCustomers] 

And let us now query the metadata of both tables:

SELECT c.name AS column_name
    ,c.column_id
    ,SCHEMA_NAME(t.schema_id) AS type_schema
    ,t.name AS type_name
    ,t.is_user_defined
    ,t.is_assembly_type
    ,c.max_length
    ,c.precision
    ,c.scale
FROM sys.columns AS c 
JOIN sys.types AS t ON c.user_type_id=t.user_type_id
WHERE (
           c.object_id = OBJECT_ID('Sales.MainCustomers') 
        OR c.object_id = OBJECT_ID('Sales.MainCustomersExtended')
      ) AND c.name = 'AverageTransactionValue'
ORDER BY c.column_id;

Then we see two identical columns. However, if we now take the CREATE TABLE statements of both tables, we see that in MainCustomers the definition of the Computed Column was taken over, while in MainCustomersExtended the column simply exists as an empty column.

This means that with a SELECT TOP 0 all the information beyond the pure metadata (which data type is stored here) is lost. This includes information such as filegroups, indexes, primary keys, and so on. The quick workaround for creating tables with the same structure is therefore a practical tool, but in the end, with productive databases, you still have to pay attention to the details, because these may end up hurting you in the end.

References

  • Today without further links.

Ninja-Notebooks @ GitHub

Leave a Reply

Your email address will not be published. Required fields are marked *