I’m working on an app for a client with some interesting requirements. The app has its on Azure SQL DB thats ultimately for staging data, but is part of a project with a Data Warehouse. We needed some office location data from the data warehouse to populate a drop down in the app and I decided to try the EXTERNAL DATA SOURCE feature.

External data sources seem similar in concept to linked tables but are much different in practice. They let you connect to hadoop data, azure blob storage, do elastic tables, and expose a table in one database to another. The last one is the least sexy, but the one I want. It’s also the option with the least attention in books online and blogs. I’m here to fix that.

Defining our problem

Of course these are not the real names of my clients servers, dbs and tables.

The data warehouse is on a server called mywarehouse.database.windows.net. The database name is myWarehouse. It has a table called dbo.dimFacilities. We want to expose a subset of that tables columns to a database called loadingDb on myapp.database.windows.net.

Doing the work

loadingDb is going to need to authenticate to myWarehouse and perform some selects. Therefore we need to create a user in myWarehouse. for loadingDb and assign it to the db_datareader role.

CREATE USER app_dw_reader WITH PASSWORD = '************';

GO

EXEC sp_addrolemember @rolename = 'db_datareader', @membername = 'app_dw_reader';

GO

It’s probably better to create a custom role with just the selects on the table we want, but that’s an exercise for the reader. You are not limited to contained users here. You can also use AzureAD users or users associated with a login.

Now its time to turn our attention to loadingDb. The first two objects we will create are a MASTER KEY and a SCOPED CREDENTIAL.

CREATE MASTER KEY ENCRYPTION BY PASSWORD = '**************';

GO

CREATE DATABASE SCOPED CREDENTIAL dw_reader WITH IDENTITY = 'app_dw_reader', SECRET = '********';

GO

Note the MASTER KEY password is a distinct password the SCOPED CREDENTIAL password must match the users password. The master key is used to encrypt the scoped credential. When you change the password for app_dw_reader, you need to use ALTER DATABASE SCOPED CREDENTIAL to update the secret.

Now that we have that the next object is the external data source This somewhat analogous to adding a linked server entry.

CREATE EXTERNAL DATA SOURCE data_warehouse

WITH (

TYPE = RDBMS,

LOCATION = 'mywarehouse.database.windows.net',

DATABASE_NAME = 'myWarehouse',

CREDENTIAL = dw_reader

);

GO

Note this command will succeed even if the credential is invalid as long as the credential object exists. You don’t get an error until you try to query the linked table. Luckily its a useful error.

Now we need to define the table and the columns. You can define just a subset of the columns, but column definitions must match exactly or you get an error on select. Luckily, the errors returned if the select failed list the column in question.

CREATE EXTERNAL TABLE [dbo].[dimFacility]

(

[facilityId] int NOT NULL,

[facilityName] NVARCHAR(100) NULL,

[country] NVARCHAR(64) NULL,

)

WITH (

DATA_SOURCE = dev_data_warehouse

);

GO



Tangential note on column definitions. If you are using sys.columns.max_length to get the length of NVARCHAR columns divide that by two because that is length in bytes. The azure portal query editor doesn’t show column lengths, so I decided to use DMVs.

Now if everything worked out well you should be able to do a SELECT * FROM dbo.dimFacility; in loadingDb and get you data. Congratulations!