DataSunrise Dynamic Data Masking for MS SQL Server

MS SQL Server Masking guarantees SQL Server sensitive data protection against any possible unauthorized access or hacker attack. The tool analyzes all queries to the database, filters them and allows to SQL Server only those admitted in accordance with requirements specified by an administrator in settings. All responses to unwanted queries are masked. The response to an unauthorized query represents random fake values.

Tech Info

Data masking (data obfuscation) is an efficient method of preventing sensitive information exposure when giving database access to third-party. There are a lot of situations when organizations need to share data from their production databases with third party. Here are some examples:

  • A certain organization hires outsourced IT specialists to customize its business system or perform database testing, upgrade etc.
  • Healthcare company provides medical researchers with information on clinical trials.
  • Retail company shares sales information with market researchers.

The point is that in most cases third-party specialists don’t need real data the database contains. An environment mimicking a real production database is enough. The best way to protect sensitive data while transferring it to third-party is to replace it with some neutral data. And the most efficient way to do this is masking.

The important point is that masked data should be consistent enough to support proper functioning of the third party’s application. In simple words, the main purpose of masking is to make sensitive data useless for evil-doers while keeping it useful for the ones that should receive it.

Sometimes in such situations static masking is used. It is about providing the third party with a stand-alone copy of real database containing some neutral data instead of sensitive data. While this method is reliable, it could be pricey and time-consuming.

That’s why in most cases dynamic data masking is preferable. Unlike static data masking, dynamic data masking is about obfuscating sensitive data on-the-fly, while transferring it to third party. In this case actual database contents remains intact and only database output is obfuscated.

This is how it works

DataSunrise Dynamic Data Masking solution is deployed as a proxy between clients (those third-party specialists) and your SQL Server database. Clients contact production database through DataSunrise proxy only (any direct access to database is disabled).

MS SQL Server masking tool by DataSunrise intercepts client query, changes it according to existing security policies and redirects modified (“masked”) query to the database. Having received “masked” query, the database outputs fake (obfuscated) data instead of real values originally requested by the client. Since no real data leaves the database, this method of data obfuscation is very reliable.

And this is how it looks like

Original tableFor example, here we have MS SQL table which resembles a list of customers including addresses, emails and credit card numbers. Before exporting this list to a third-party system, we need to obfuscate customers’ personal data.

To do this, some masking rules were created. Along with general-purpose masking methods DataSunrise provides obfuscation algorithms for emails and credit card numbers. So, we’ve created some masking rules to obfuscate data in columns that contain customers' emails and credit card numbers. Let’s assume that leaving other columns unmasked is acceptable.

Masked table

And here how the table looks like after the masking is applied. As you can see, data in specified columns were obfuscated and thus, it became useless for potential wrong-doer.

Masking internals

Now let's see how masking looks like in SQL.

Here's the original client query:

SELECT TOP 1000 [Order] ,
           [First Name] ,
           [Last Name] ,
           [Address] ,
           [ZIP] ,
           [Email] ,
           [Card]
FROM [sales].[dbo].[Customers]

And SQL query the database gets after masking is applied:

SELECT TOP 1000 [Order] ,
           [First Name] ,
           [Last Name] ,
           [Address] ,
           [ZIP] ,
           CAST(CASE PATINDEX('%.%', LTRIM(RTRIM(CAST([EMAIL] AS VARCHAR(MAX)))))
                    WHEN 0 THEN REPLICATE('*', LEN(LTRIM(RTRIM(CAST([EMAIL] AS VARCHAR(MAX))))))
                    ELSE CASE PATINDEX('%@%', SUBSTRING(LTRIM(RTRIM(CAST([EMAIL] AS VARCHAR(MAX)))), 1, LEN(LTRIM(RTRIM(CAST([EMAIL] AS VARCHAR(MAX))))) - PATINDEX('%.%', REVERSE(LTRIM(RTRIM(CAST([EMAIL] AS VARCHAR(MAX))))))))
                             WHEN 0 THEN REPLICATE('*', LEN(LTRIM(RTRIM(CAST([EMAIL] AS VARCHAR(MAX))))) - PATINDEX('%.%', REVERSE(LTRIM(RTRIM(CAST([EMAIL] AS VARCHAR(MAX)))))))
                             ELSE STUFF(REPLICATE('*', LEN(LTRIM(RTRIM(CAST([EMAIL] AS VARCHAR(MAX))))) - PATINDEX('%.%', REVERSE(LTRIM(RTRIM(CAST([EMAIL] AS VARCHAR(MAX))))))), PATINDEX('%@%', LTRIM(RTRIM(CAST([EMAIL] AS VARCHAR(MAX))))), 1, '@')
                         END + SUBSTRING(LTRIM(RTRIM(CAST([EMAIL] AS VARCHAR(MAX)))), LEN(LTRIM(RTRIM(CAST([EMAIL] AS VARCHAR(MAX))))) - PATINDEX('%.%', REVERSE(LTRIM(RTRIM(CAST([EMAIL] AS VARCHAR(MAX)))))) + 1, PATINDEX('%.%', REVERSE(LTRIM(RTRIM(CAST([EMAIL] AS VARCHAR(MAX)))))))
                END AS varchar(50)) AS "EMAIL" ,
           CAST(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(CASE
           WHEN LEN(LTRIM(RTRIM(CAST([CARD] AS VARCHAR(MAX))))) >= 4 THEN LEFT(LTRIM(RTRIM(CAST([CARD] AS VARCHAR(MAX)))),
           LEN(LTRIM(RTRIM(CAST([CARD] AS VARCHAR(MAX))))) - 4)
           ELSE LTRIM(RTRIM(CAST([CARD] AS VARCHAR(MAX))))
           END, '0', 'X'), '1', 'X'), '2', 'X'), '3', 'X'), '4', 'X'), '5', 'X'), '6', 'X'), '7', 'X'), '8', 'X'), '9', 'X') + CASE
           WHEN LEN(LTRIM(RTRIM(CAST([CARD] AS VARCHAR(MAX))))) >= 4 THEN RIGHT(LTRIM(RTRIM(CAST([CARD] AS VARCHAR(MAX)))), 4)
           ELSE ''

           END AS varchar(20)) AS "CARD"
FROM [sales].[dbo].[Customers]

Related Articles