This morning I was pondering a mapping thing I needed to do and I thought, about how cool Kusto is when querying and transforming data with queries in products like application insights, data explorer and log analytics and I was wondering if I might be able to use Kusto to transform data from Logic Apps.
Am I crazy?
There are a number of different ways you can transform data with Logic Apps including:
- Integration Accounts
- XSLT
- Liquid
- Javascript
- Compose Shape
- Call an Azure Function / API
They all have their pros and cons. XSLT and Liquid has some language limitations, some need the cost of an integration account if your on consumption plan and then others like functions or APIM need the external resource to be setup.
I was thinking that I have both app insights and log analytics available so can I just pass some data to a kusto query and execute it and transform the data.
How might you do it?
I wanted to have a reasonably interesting scenario so I will have 2 json array datasets. The first is a list of railcars which were are out travelling to a customer. I get an update batch from my transport partner listing the cars, their weight and the product. The below is the example data set. In the real world I might have received a flat file and parsed it into a friendly json format in my Logic App.
[
{"rail_car_id":1,"load_id":"121","gross_weight":10000,"product":"gold"},
{"rail_car_id":3,"load_id":"122","gross_weight":10000,"product":"gold"},
{"rail_car_id":2,"load_id":"123","gross_weight":10000,"product":"gold"},
{"rail_car_id":3,"load_id":"124","gross_weight":10000,"product":"gold"},
{"rail_car_id":1,"load_id":"125","gross_weight":10000,"product":"gold"},
{"rail_car_id":1,"load_id":"126","gross_weight":10000,"product":"gold"},
{"rail_car_id":1,"load_id":"127","gross_weight":10000,"product":"gold"}
]
Next I lookup the inventory of my railcars so I have a list of all railcars, their supplier and the weight of the car.
[
{"rail_car_id":1,"supplier":"acme", "car_weight": 900},
{"rail_car_id":2,"supplier":"fabrikam", "car_weight": 700},
{"rail_car_id":3,"supplier":"contoso", "car_weight": 1000}
]
What I want to do is merge the 2 data sets together so I can create a combined dataset which joins so that for each railcar I can add the supplier column and the weight and then Ill perform a calculation to workout the net weight of product by removing the car weight from the gross weight.
In Kusto one of the things you can do in addition to reading a log table is set a dynamic variable which acts like an in memory table.
If we look at the below query where I have inline json I can query this as if its a table.
let railcarData = print railcarJson = dynamic(
[
{"rail_car_id":1,"supplier":"acme", "car_weight": 900},
{"rail_car_id":2,"supplier":"fabrikam", "car_weight": 700},
{"rail_car_id":3,"supplier":"contoso", "car_weight": 1000}
])
| mvexpand railcarJson
| evaluate bag_unpack(railcarJson);
railcarData;
If I extend the query to be like below then I can do a join between the two tables and also project the values I want and create a new calculated column to add my net weight. The query would look like the below.
let railcarData = print railcarJson = dynamic(
[
{"rail_car_id":1,"supplier":"acme", "car_weight": 900},
{"rail_car_id":2,"supplier":"fabrikam", "car_weight": 700},
{"rail_car_id":3,"supplier":"contoso", "car_weight": 1000}
])
| mvexpand railcarJson
| evaluate bag_unpack(railcarJson);
let loadedRailcarData = print loadedRailcarJson = dynamic(
[
{"rail_car_id":1,"load_id":"121","gross_weight":10000,"product":"gold"},
{"rail_car_id":3,"load_id":"122","gross_weight":10000,"product":"gold"},
{"rail_car_id":2,"load_id":"123","gross_weight":10000,"product":"gold"},
{"rail_car_id":3,"load_id":"124","gross_weight":10000,"product":"gold"},
{"rail_car_id":1,"load_id":"125","gross_weight":10000,"product":"gold"},
{"rail_car_id":1,"load_id":"126","gross_weight":10000,"product":"gold"},
{"rail_car_id":1,"load_id":"127","gross_weight":10000,"product":"gold"}
])
| mvexpand loadedRailcarJson
| evaluate bag_unpack(loadedRailcarJson);
loadedRailcarData
| join kind=leftouter railcarData on rail_car_id
| extend net_weight = gross_weight - car_weight
| project rail_car_id, supplier, load_id, product, net_weight, gross_weight, tare_weight=car_weight
This shows I can do a query to transform 2 json arrays using Kusto but what about doing it from a Logic App.
In the Logic App I am using 2 compose shapes to simulate the source data shown below.
Next I can use one of a couple of different shapes to execute the kusto query. In my blog azure account I have an app insights already available so ill use that, but I could also use the Azure Data Explorer or Log Analytics Query. In this case I wont be writing any data to the logs so it doesnt really matter which I use.
In the app insights action to execute a query I will dynamically add my json from the compose shapes to the query from earlier.
Now when I run the Logic App it will execute my query and ill get the json output as shown below.
To show the output json in more detail you can see the below.
[
{
"rail_car_id": 1,
"supplier": "acme",
"load_id": "127",
"product": "gold",
"net_weight": 9100,
"gross_weight": 10000,
"tare_weight": 900
},
{
"rail_car_id": 1,
"supplier": "acme",
"load_id": "126",
"product": "gold",
"net_weight": 9100,
"gross_weight": 10000,
"tare_weight": 900
}
]
We can see that I have merged the 2 tables and Ive added the columns and created the calculated column. All works pretty nicely and its pretty easy to implement.
Would I actually do this in the real world?
In this post I was really just experimenting to see if you could do it. There are a bunch of constraints you need to be careful about depending if you choose to use App Insights, Log Analytics or Azure Data Explorer to run your query. There are constraints such as concurrent runs, query duration, size of data returned. You will have questions about the size of the query message I am sure too.
I think if you were looking at doing something with Log Analytics or App Insights, you could do it but only if your load was pretty low.
With Data Explorer there seems to be more configurable options and you could probably do it at higher load.
One of the places where you would potentially use this would be to log the data coming in and then aggregate it with a query from 1 or more logs. Maybe you could use an approach similar to the above to link some relatively static reference data to some log data using a join.
What would be cool thou?
While the aim of the above was really a proof of concept to see what I could do, it would be pretty cool if there was a connector or something where I could just execute some kusto queries on 1 or more simple json objects to give access to all of those cool kusto functions.