What Is Data Masking

Data masking is the practice of replacing sensitive information with fake values in order to protect the actual data. In simple terms, it confuses an intruder by hiding the real data behind a protective layer of realistic-looking but useless data.

Many people confuse data masking with data access restriction, but the two are entirely different concepts. Access restriction prevents users from seeing the data, and users clearly realize that the data is hidden. Data masking, in contrast, presents users with fake data in place of the real values.

Why Is Data Masking Important?

A data leak, or inappropriate exposure of sensitive information, can affect a company on multiple levels.

Legally: Each organization is responsible for its clients’ private data. If the company loses that data in any way, affected clients can take legal action against it.

Defamation: Public exposure of production or private data contained in a company’s database may damage the company’s reputation.

Loss of Future Prospects: If your competitors get hold of your company’s information, they learn your plans and can act to beat you in the market. A competitor can also manipulate the information and use it against you.

What Is Data Masking Used For?

Just because your production database contains real sensitive information does not mean that databases intended for testing should contain it as well. Various data masking routines are used to limit how far sensitive data is exposed.

Level-I Masking or Compound Masking

A set of related columns is masked as a group so that the masked data retains the same relationships across the columns. For instance, ZIP, city and state entries need to remain consistent with each other after masking is applied.
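As a minimal sketch of masking related columns as a group, the Python example below replaces ZIP, city and state together with one consistent triple. The SAMPLE_LOCATIONS list, the mask_location function and the row field names are hypothetical and chosen only for illustration; a real routine would draw on a much larger lookup table.

```python
import random

# Hypothetical lookup table of consistent (ZIP, city, state) triples.
SAMPLE_LOCATIONS = [
    ("10001", "New York", "NY"),
    ("60601", "Chicago", "IL"),
    ("94105", "San Francisco", "CA"),
    ("73301", "Austin", "TX"),
]


def mask_location(row: dict, rng: random.Random) -> dict:
    """Return a copy of row with ZIP, city and state replaced as one unit."""
    zip_code, city, state = rng.choice(SAMPLE_LOCATIONS)
    masked = dict(row)  # leave the original row untouched
    masked.update(zip=zip_code, city=city, state=state)
    return masked


customer = {"name": "J. Smith", "zip": "98101", "city": "Seattle", "state": "WA"}
masked_customer = mask_location(customer, random.Random())
```

Because the three values are always picked together from the same entry, the masked row never ends up with a ZIP that belongs to a different city or state.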

Level-II Masking or Deterministic Masking

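Level-II masking ensures that a given value is always masked to the same value across all databases. For instance, a customer number or ID must map to the same masked value wherever it appears, so that records in different masked databases can still be matched.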
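One common way to obtain this kind of repeatable mapping is a keyed hash. The sketch below assumes that approach; it is not a description of any particular product. The function name mask_customer_id, the secret value and the ten-digit output format are all assumptions made for the example.

```python
import hashlib
import hmac


def mask_customer_id(customer_id: str, secret: bytes, digits: int = 10) -> str:
    """Map an ID to a fixed-width numeric token; same input and key, same output."""
    digest = hmac.new(secret, customer_id.encode("utf-8"), hashlib.sha256).digest()
    # Reduce the first 8 bytes of the digest to the requested number of digits.
    number = int.from_bytes(digest[:8], "big") % (10 ** digits)
    return f"{number:0{digits}d}"


secret = b"example-secret-keep-out-of-source-control"  # hypothetical key
token_in_crm = mask_customer_id("C-000123", secret)
token_in_billing = mask_customer_id("C-000123", secret)
```

Both calls return the same token, so the masked customer can still be matched across databases. Note that truncating the hash to a fixed number of digits makes collisions possible, which a production routine would need to handle.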

Level-III Masking or Lock-Key Masking

When a company has to send its data to another company or any other third party for reporting, analysis or another business process, Lock-Key masking is used. The original data is masked using a secure lock-key masking function. Once the company gets the data back from the third party, it can recover the original data by using the same key that was used to mask it. This is also called key-based reversible masking.
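To illustrate the idea of key-based reversible masking, the sketch below masks a 64-bit integer identifier with a small Feistel network whose round function is HMAC-SHA256. This construction is an assumption made for the example, not the lock-key function of any specific product, and the names lock and unlock are hypothetical. For real data, an established, vetted cipher should be used instead.

```python
import hashlib
import hmac

ROUNDS = 8
HALF_BITS = 32
HALF_MASK = (1 << HALF_BITS) - 1


def _round_value(key: bytes, round_no: int, half: int) -> int:
    """Keyed round function: 32 bits derived from HMAC-SHA256."""
    message = bytes([round_no]) + half.to_bytes(4, "big")
    digest = hmac.new(key, message, hashlib.sha256).digest()
    return int.from_bytes(digest[:4], "big")


def lock(value: int, key: bytes) -> int:
    """Mask a 64-bit integer; the result is another 64-bit integer."""
    if not 0 <= value < (1 << 64):
        raise ValueError("value must fit in 64 bits")
    left, right = value >> HALF_BITS, value & HALF_MASK
    for round_no in range(ROUNDS):
        left, right = right, left ^ _round_value(key, round_no, right)
    return (left << HALF_BITS) | right


def unlock(masked: int, key: bytes) -> int:
    """Recover the original integer by running the rounds in reverse."""
    left, right = masked >> HALF_BITS, masked & HALF_MASK
    for round_no in reversed(range(ROUNDS)):
        left, right = right ^ _round_value(key, round_no, left), left
    return (left << HALF_BITS) | right


key = b"shared-only-inside-the-company"  # hypothetical key
masked_id = lock(4815162342, key)
original_id = unlock(masked_id, key)
```

The third party only ever sees masked_id. When the data comes back, unlock with the same key restores the original value, while a different key yields an unrelated number.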

Data Masking Techniques

Substitution: Database content is randomly replaced with something similar but not the same. For example, real surnames are replaced with surnames picked at random from a list.
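A minimal Python sketch of substitution is shown below. The FAKE_SURNAMES list and the substitute_surname function are hypothetical; in practice the replacement list would be far larger.

```python
import random

# Hypothetical replacement list; real lists are much longer.
FAKE_SURNAMES = ["Archer", "Bennett", "Carver", "Dalton", "Ellis", "Foster"]


def substitute_surname(surname: str, rng: random.Random) -> str:
    """Pick a random replacement surname that differs from the original."""
    candidates = [name for name in FAKE_SURNAMES if name != surname]
    return rng.choice(candidates)


rng = random.Random()
real_surnames = ["Smith", "Ellis", "Nguyen"]
masked_surnames = [substitute_surname(name, rng) for name in real_surnames]
```

Excluding the original value from the candidates guarantees that a real surname which happens to be in the list is never "replaced" with itself.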

Shuffling: In substitution, the replacement data is fetched from an outside source, whereas in shuffling the replacement data is taken from the column itself. The values are randomly moved between rows until there is no reasonable correlation left between the shuffled entries and their original rows.
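Shuffling can be sketched in a few lines of Python, assuming the table is held in memory as a list of dictionaries. The shuffle_column function and the sample rows are hypothetical.

```python
import random


def shuffle_column(rows: list, column: str, rng: random.Random) -> list:
    """Return new rows in which the values of one column are permuted."""
    values = [row[column] for row in rows]
    rng.shuffle(values)  # in-place random permutation of the column values
    return [{**row, column: value} for row, value in zip(rows, values)]


employees = [
    {"name": "Ann", "salary": 52000},
    {"name": "Bob", "salary": 61000},
    {"name": "Cid", "salary": 75000},
    {"name": "Dee", "salary": 48000},
]
masked_employees = shuffle_column(employees, "salary", random.Random())
```

The set of salaries is unchanged, so column-level statistics stay realistic, but a given salary no longer necessarily belongs to the person in its row. A single random permutation can leave some values in place, and on small tables the original pairing may be easy to guess, so shuffling suits larger columns better.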

Number and Date Difference: This technique is useful if you need to protect numeric or date data. Each original value is replaced by a value that differs from it by a random percentage within a set range. For instance, salary data may be varied by up to ±5%, so that some values increase and others decrease by as much as 5%.
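The variance idea could be sketched in Python as follows. The function names vary_number and vary_date are hypothetical, and the 30-day window for dates is an assumption added for the example; the ±5% figure comes from the salary example above.

```python
import datetime
import random


def vary_number(value: float, rng: random.Random, max_pct: float = 5.0) -> float:
    """Shift a number by a random percentage within +/- max_pct."""
    factor = 1 + rng.uniform(-max_pct, max_pct) / 100
    return round(value * factor, 2)


def vary_date(value: datetime.date, rng: random.Random, max_days: int = 30) -> datetime.date:
    """Shift a date by a random whole number of days within +/- max_days."""
    return value + datetime.timedelta(days=rng.randint(-max_days, max_days))


rng = random.Random()
masked_salary = vary_number(100000.0, rng)                # between 95000 and 105000
masked_hired = vary_date(datetime.date(2020, 3, 15), rng)
```

The masked values stay close enough to the originals for testing and rough analysis while no longer revealing the exact figures.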

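Encryption: The original content is converted into an unreadable form using a cipher and a key. Simple encodings such as Morse code or binary are not encryption, because anyone can reverse them without a key; a company should use an established cipher so that only holders of the key can recover the data.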

Nullifying: As the word “nullifying” suggests, a database entry or the contents of a column are replaced with NULL values.
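In Python, where None plays the role of SQL NULL, nullifying might look like the sketch below. The nullify_columns function and the sample row are hypothetical.

```python
def nullify_columns(rows: list, columns: set) -> list:
    """Return new rows with the given columns set to None (SQL NULL)."""
    return [
        {key: (None if key in columns else value) for key, value in row.items()}
        for row in rows
    ]


patients = [{"id": 1, "name": "Ann Lee", "ssn": "000-00-0000", "city": "Boston"}]
masked_patients = nullify_columns(patients, {"name", "ssn"})
```

Nullifying is simple and leaks nothing, but unlike the other techniques it makes it obvious that data has been removed and can break tests that expect non-empty values.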

Other data masking techniques include the following:

X-Masking, Internal Row Synchronization, Internal Table Synchronization, Table-To-Table Synchronization, User defined SQL commands, Flat File Masking.

All these methods serve one purpose only: to keep your data from getting into the wrong hands.

DataSunrise Database Security Firewall includes Data Auditing, Data Security and Data Masking tools. It is an integrated software product designed to protect your organization’s confidential and sensitive data.

DataSunrise supports all major databases and data warehouses such as Oracle, Exadata, IBM DB2, IBM Netezza, MySQL, MariaDB, Greenplum, Amazon Aurora, Amazon Redshift, Microsoft SQL Server, Azure SQL, Teradata and more. You are welcome to download a free trial if you would like to install it on your premises. If you are a cloud user and run your database on Amazon AWS or Microsoft Azure, you can get it from AWS Marketplace or Azure Marketplace.