Using Lua script for discovering sensitive data in JSON files


When working with the Data Discovery feature, DataSunrise lets you use a variety of prebuilt search filters for different types of sensitive data. But is there a way to search for data that those filters do not cover? Yes: you can use Lua. A Lua script used with Data Discovery lets you search for practically any text value not covered by the existing templates.


This article describes how you can use Lua to locate the database columns of interest in JSON files. A dedicated script does the work, so you can base your own search on the algorithm described below. Note that this works not only with JSON but with any type of file; you just need to write an appropriate Lua script.


You can copy the script used in this article from here:

-- Specify the values you want to discover in the sensitive_from_json list,
-- e.g. {"data","id","name"}
local sensitive_from_json = {"{wefsdf","123","data"}
-- valStr will contain the JSON as text (columnValue is the value being checked)
local valStr = tostring(columnValue)
-- identify whether the column contains JSON-formatted data:
-- the value must start with '{' and end with '}'
if (string.sub(valStr, 1, 1) == '{') and (string.sub(valStr, -1) == '}') then
  for i = 1, #sensitive_from_json do
    -- if the JSON contains at least one desired value, return 1
    -- (the fourth argument makes this a plain-text search, so characters
    -- such as '.' or '-' are not treated as Lua pattern syntax)
    if string.find(valStr, tostring(sensitive_from_json[i]), 1, true) ~= nil then
      return 1
    end
  end
end
-- not JSON, or none of the desired values was found
return 0
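
To try the matching logic outside DataSunrise, it can help to wrap it in a function, because a standalone Lua interpreter does not define `columnValue`. The following is only a minimal sketch under that assumption: the function name `contains_sensitive` and the sample values are hypothetical and are not part of the article's script or of DataSunrise.

```lua
-- Hypothetical helper for local testing only. Inside DataSunrise the value
-- arrives via columnValue and the script itself returns 1 or 0.
local function contains_sensitive(column_value, wanted)
  local text = tostring(column_value)
  -- Treat the value as JSON-like only if it is wrapped in braces.
  if text:sub(1, 1) ~= "{" or text:sub(-1) ~= "}" then
    return 0
  end
  for _, value in ipairs(wanted) do
    -- Plain search (fourth argument true): the search value is matched
    -- literally, not interpreted as a Lua pattern.
    if text:find(tostring(value), 1, true) then
      return 1
    end
  end
  return 0
end

-- Sample search values (hypothetical).
local wanted = {"passport", "card_number"}
print(contains_sensitive('{"card_number": "4111"}', wanted))  -- 1
print(contains_sensitive('{"city": "Oslo"}', wanted))         -- 0
print(contains_sensitive('card_number=4111', wanted))         -- 0, not wrapped in braces
```

Keep in mind that this is a substring check, not JSON parsing: a search value is reported whether it appears as a key or inside a value.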

First, create a Lua script that searches for your own data of interest. Note that the script we created for this article also checks, among other things, whether the processed value is formatted like JSON (it starts with "{" and ends with "}"). For other file types, use a different validation algorithm. Fill in the values you want to find in the list at the top of the script. For your convenience, we left some comments there.
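
As an illustration of a different validation algorithm, the sketch below checks for XML-like text instead of JSON. It is an assumption made for this example, not something taken from DataSunrise: the function names `looks_like_xml` and `xml_has_tag` are hypothetical, and the check is deliberately crude (it only looks at the first and last non-whitespace characters).

```lua
-- Hypothetical validator: a value is treated as XML-like if, after trimming
-- surrounding whitespace, it starts with "<" and ends with ">".
local function looks_like_xml(value)
  local text = tostring(value):match("^%s*(.-)%s*$")
  return text:sub(1, 1) == "<" and text:sub(-1) == ">"
end

-- Hypothetical search: returns 1 if an XML-like value contains an opening
-- tag that starts with one of the wanted names, otherwise 0.
local function xml_has_tag(value, tags)
  if not looks_like_xml(value) then
    return 0
  end
  local text = tostring(value)
  for _, tag in ipairs(tags) do
    -- Plain search for "<name"; note that this also matches longer tag
    -- names with the same prefix, such as "<ssn_backup".
    if text:find("<" .. tag, 1, true) then
      return 1
    end
  end
  return 0
end

print(xml_has_tag("<user><ssn>123</ssn></user>", {"ssn"}))  -- 1
print(xml_has_tag('{"ssn": "123"}', {"ssn"}))               -- 0, not XML-like
```

In a DataSunrise script, the same idea would be applied to `tostring(columnValue)` with a top-level `return 1` or `return 0`, as in the JSON script above.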

Now our script is ready, and we can go to the DataSunrise Web Console.


We navigate to Data Discovery -> Information Types and create a new Information Type.


We add a new Attribute and, in the attribute’s settings, select “Column Data”. In “Column Data Type”, we select “Strings Only”. In “Search Method”, we select “Lua Script”.


Then we click “Edit Lua Script” to open the script’s code. We paste our script into the Script field and save it.


Now we can create a new Data Discovery task. In the Search Filters subsection, we select Information Types and choose our Information Type to use for discovery.
