Kusto: Creating an IfExists function to check if a table exists


One of the things I find lacking in Kusto is an explicit way to test for the existence of a table: in both Azure SQL and Azure Data Lake, the ifexists function and exists compiler directive, respectively served this purpose.

Kusto doesn’t seem to have an explicit statement supporting this, but you can roll your own using the isfuzzy union argument. The isfuzzy argument basically says that a union should run as long as at least one table exists.

So here’s how to create your own “ifexists” function for kusto.

Write a function that takes a table name as a string input. Within the function itself, create a datatable with a single row named Exists with a value of 0.

Then union that datatable with the function input using the table operator like so and do a count of rows alias with the name Exists. I’d limit this to a single row, so you minimize execution.

Finally, sum up the Exists column result of your union so that you have only a single row. If the tablename from the input exists and has at least one row, your function will return 1. If it doesn’t exist or is empty, your function will return 0.

When using this, you convert the table output of the function to a scalar value using the toscalar function, like so. That’s it.

Of course, you don’t have to use this as a distinct function: you could simply have a fuzzy union in your code.

There is another way to do this if you’re writing an ETL function that acts differently depending on whether the table exists. We do this with very large telemetry sets when we just want to pull new values rather than pulling months’ worth of data and overwriting.

When we’re actively developing such a function, we may need to change the schema or do a compete reload. So if the table doesn’t exist, we want to pull a much larger set of data rather than just the past day or so.

Here’s the trick: once a function is compiled, it will run regardless of whether the tables upon which it relies exist. When you aggregate a non-existent table to a single row and convert the output to a scalar—say, the most recent datetime—Kusto returns a null value without any error. We check if that value isempty, and if it is, we grab several months’ worth of data, rather than just the new rows.

Leave a Reply

Your email address will not be published. Required fields are marked *