Recently I wrote a post about my first experiences with synapse and wanting to talk to them from Logic Apps. In this post I want to explore a scenario where we have an application database which wants to query the list of populations from my Synapse workspace.
The original article is here: https://mikestephenson.me/2021/08/03/query-synapse-serverless-sql-from-logic-apps/
In my scenario I have built a Power App which uses a SQL Database as its backend data source. My app looks like below.
The scenario I have is that some of the data I want to use in my app is held in our Synapse platform in a view which runs against our datalake. I want to be able to import on demand some data meeting the criteria that the user wants so that they can work on a specific dataset.
In this case I want my user to supply a set of criteria and then they would execute a stored procedure which would pull in the data we want from Synapse by querying the view.
To simulate this scenario I have a table of population data in my SQL database. I then want to have a stored procedure which will clear out the population data and replace it with the population data matching the criteria I supply. The below diagram shows how we will do this.
Lets have a quick walk through how I did this.
Synapse Configuration
Remember that from the previous post I already have a view called dbo.populationView which is the object I want to query. I then need to setup a credential which my application database can use to connect to Synapse. I also need to give it access to the view and to be able to reference the scoped credential which is used by the view. The below 4 statements will create the logic, user and provide access to the view in Synapse.
CREATE LOGIN ClientSQLDB WITH PASSWORD = 'password goes here';
CREATE USER ClientSQLDB FROM LOGIN ClientSQLDB;
GRANT SELECT ON OBJECT::[dbo].[populationView] TO [ClientSQLDB]
GRANT REFERENCES ON DATABASE SCOPED CREDENTIAL::WorkspaceIdentity to [ClientSQLDB]
Application SQL DB Configuration
In the application database I need to then setup access as an external table the view from Synapse.
The first thing I need to do is setup a database scoped credential to allow my database to connect to Synapse.
CREATE DATABASE SCOPED CREDENTIAL SynapseSqlCredential
WITH IDENTITY = 'ClientSQLDB', SECRET = 'Matching password we setup in Synapse';
GO
I will then create an external data source which will point to my Synapse workspace and my test database which references the credential I just setup.
CREATE EXTERNAL DATA SOURCE SynapseSqlDataSource
WITH (
TYPE = RDBMS,
LOCATION = 'blog-ms-demo-ondemand.sql.azuresynapse.net',
DATABASE_NAME = 'Test',
CREDENTIAL = SynapseSqlCredential
);
GO
I will then create an external table which has the schema I need to reference the view.
CREATE EXTERNAL TABLE dbo.populationView(
[country_code] varchar(5) NULL,
[country_name] varchar(100) NULL,
[year] smallint NULL,
[population] bigint NULL
)
WITH ( DATA_SOURCE = SynapseSqlDataSource );
This will now let me query the view in synapse if I run the following query
SELECT * FROM dbo.populationView
I now want to have a local table which contains the data I want to work with. This will be my local copy of the data that I want to use while working on my app functionality.
CREATE TABLE [dbo].[local_populations](
[county_code] [varchar](5) NULL,
[country_name] [varchar](100) NULL,
[year] [smallint] NULL,
[population] [bigint] NULL
) ON [PRIMARY]
GO
Next I need a stored procedure which I can run which will refresh my local table with just the data which I want. In this case I will run the procedure with a year so ill clean out my local table and re-populate it with the data matching the year that I am interested in from the Synapse view.
CREATE PROCEDURE [dbo].[RefreshPopulations]
@year smallint
AS
BEGIN
-- SET NOCOUNT ON added to prevent extra result sets from
-- interfering with SELECT statements.
SET NOCOUNT ON
Delete From [dbo].[local_populations]
INSERT INTO [dbo].[local_populations]
SELECT * FROM dbo.populationView
Where [year] = @year
END
I can now test the procedure running it as follows and it will refresh the table from Synapse and then I can work with the data set locally in my Power App.
Exec [dbo].[RefreshPopulations] 2013
Select * From [dbo].[local_populations]
As you can see below I just have the data for 2013
In my Power App I can now add functionality so that a user can trigger the stored procedure with a year and the app will run a refresh of the data we are interested in when the user needs to do a refresh.
Summary
There are a number of ways which we can refresh data between databases including things like data flows in Synapse to push data to systems that want the data, and im sure there are a number of different ways you can build elaborate pipelines which can be triggered by users.
In this case I wanted to test the ability to query from a SQL DB to a Synapse Serverless SQL pool’s view to refresh some data and see if there is a simple way to implement a refresh on demand approach. Im sure that there will be some constraints to keep in mind with this approach like the amount of data you might want to copy across and other considerations. Probably you also need to be cautious about the cost of how often the data is queried too but in the right scenarios Im sure this can work well and simple is always good!