What is Data Masking?

Data masking, sometimes called data obfuscation, is the process of hiding original data by replacing it with modified content.

The main reason to mask data is to protect sensitive data, such as personal data, stored in proprietary databases. At the same time, the masked data has to remain usable for other corporate activities, for example testing and further application development. Data masking is very useful when a company needs to give outsourced and third-party IT companies access to its databases. The masked data must look consistent and realistic, so that hackers and other malicious actors believe they are dealing with genuine data. Data masking also helps mitigate operator errors. Companies usually trust their employees to make good and secure decisions, yet many breaches result from operator error. If the data is masked, the consequences of such an error are far less catastrophic. Finally, not every database operation needs entirely real, accurate data.

Data masking can be done either statically or dynamically. With static masking, database administrators create a copy of the original data, keep the original somewhere safe, and replace the sensitive values in the copy with fake data. In other words, the content of a database is duplicated into a test environment, masked, and can then be shared with third-party contractors and others. As a result, the test database contains only masked values, while the real data stays in the production database. However well static masking may suit work with third-party contractors, it can be a big problem for applications that need real data from production databases.
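
The static approach described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the rows, the FAKE_NAMES lookup list, and the static_mask function are hypothetical, and a real deployment would read from and write to actual databases.

```python
import copy
import random

# Hypothetical production rows; in practice these would come from a database.
production = [
    {"id": 1, "name": "Alice Smith", "phone": "555-0101"},
    {"id": 2, "name": "Bob Jones", "phone": "555-0102"},
]

# Assumed lookup list of replacement names (not taken from any real product).
FAKE_NAMES = ["Pat Doe", "Sam Roe", "Lee Poe"]


def static_mask(rows, seed=0):
    """Return a masked copy of rows; the original rows are left untouched."""
    rng = random.Random(seed)
    masked = copy.deepcopy(rows)
    for row in masked:
        # Replace the real name with one drawn from the lookup list.
        row["name"] = rng.choice(FAKE_NAMES)
        # Replace every digit with a random digit, keeping the phone format.
        row["phone"] = "".join(
            str(rng.randrange(10)) if ch.isdigit() else ch
            for ch in row["phone"]
        )
    return masked


# The masked copy is what would be handed to a test environment.
test_copy = static_mask(production)
```

The masked copy keeps the same columns and formats, so test code can still run against it, while the production rows themselves are never modified.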

With dynamic masking, data is obfuscated on the fly, at the moment an unauthorized database user tries to retrieve data not intended for that user. Real-time masking also means that the real data never leaves the production database and, as a result, is less susceptible to security threats. Users who are not authorized to see the data are never exposed to it, because the contents are obfuscated in real time.
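
As a rough sketch of the dynamic idea, the Python below masks a value at read time depending on who is asking, while the stored value stays unchanged. The role names, the AUTHORIZED_ROLES set, and the fetch_row and mask_card functions are hypothetical; real dynamic masking tools typically sit between the client and the database rather than in application code.

```python
# Hypothetical set of roles that are allowed to see real values.
AUTHORIZED_ROLES = {"dba", "billing"}


def mask_card(number):
    """Keep the last four digits and replace all other digits with '*'."""
    total_digits = sum(ch.isdigit() for ch in number)
    digits_seen = 0
    out = []
    for ch in number:
        if ch.isdigit():
            digits_seen += 1
            # Only the final four digits are kept as they are.
            out.append(ch if digits_seen > total_digits - 4 else "*")
        else:
            out.append(ch)  # separators such as '-' are preserved
    return "".join(out)


def fetch_row(row, role):
    """Return the row as the given role is allowed to see it."""
    result = dict(row)  # never modify the stored row
    if role not in AUTHORIZED_ROLES:
        result["card"] = mask_card(row["card"])
    return result


stored = {"id": 7, "card": "4111-1111-1111-1234"}
print(fetch_row(stored, "dba"))         # real card number
print(fetch_row(stored, "contractor"))  # masked card number
```

The same stored row yields different results for different roles, which is the essential difference from static masking: nothing is copied, and masking happens per request.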

Both static and dynamic masking have their pros and cons, and the teams responsible for database protection have to choose the method best suited to protecting their sensitive data. The advantages and disadvantages of each masking method, along with detailed instructions on how to mask data using DataSunrise Database Security Suite, are described in the other articles in this data masking section.

As mentioned earlier, masked data has to remain meaningful at several levels:

  1. The data has to remain meaningful and valid for the application logic.
  2. The data must undergo enough changes so that it can’t be reverse engineered.
  3. The obfuscated data may need to be consistent across multiple databases within an organization when each of those databases contains the data element being masked.
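
One possible way to meet the consistency requirement is deterministic masking: the same real value always maps to the same fake value, so masked records still match across databases. The sketch below uses a keyed hash (HMAC) from the Python standard library for this. The key, the FAKE_SURNAMES list, and the consistent_surname function are hypothetical, and this is an illustration rather than the method of any particular product.

```python
import hashlib
import hmac

# Hypothetical secret; a real key should be stored outside the source code.
SECRET_KEY = b"example-key-change-me"

# Assumed lookup list of replacement surnames.
FAKE_SURNAMES = ["Archer", "Baker", "Cooper", "Dyer", "Fisher", "Mason"]


def consistent_surname(real_value, key=SECRET_KEY):
    """Map a real surname to a fake one, identically on every call."""
    digest = hmac.new(key, real_value.encode("utf-8"), hashlib.sha256).digest()
    # Turn the first 8 bytes of the digest into an index into the lookup list.
    index = int.from_bytes(digest[:8], "big") % len(FAKE_SURNAMES)
    return FAKE_SURNAMES[index]


# The same input gives the same masked value in every database that uses the key.
crm_value = consistent_surname("Smith")
billing_value = consistent_surname("Smith")
```

Because the mapping depends on a secret key, the masked value alone does not reveal the original. With a lookup list this small, different real surnames will often share a fake one; a larger list reduces such collisions.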

The following techniques may be used for masking (obfuscating) data:

  • Substitution. It is one of the most popular and effective methods of data masking. With this method, real data is replaced with fake but authentic-looking data. Substitution is usually applied to phone numbers, zip codes, credit card numbers, Social Security and Medicare numbers, etc. When substitution is applied to names, replacement names can be drawn at random from a supplied or customized lookup file.
  • Shuffling is another very popular way of masking data. It is very similar to substitution, with the only difference that the replacement values are taken from the same column of data that is being masked. To put it simply, the data is randomly shuffled within the column.
  • Encryption is one of the most complex methods of data obfuscation. The data is encrypted, and viewing it requires a “key” that is granted based on user rights and privileges.
  • Nulling values out or deleting them. Applying a null value to a particular field may look like a simple yet efficient way to mask data. However, this approach is only useful for preventing direct visibility of the data. In most cases it is less effective than it seems, because it breaks the logic of most applications.
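
Three of the techniques above can be illustrated with a short Python sketch. The function names and sample values are hypothetical and exist only for this example. Encryption is left out because the Python standard library has no symmetric cipher, so it would need a third-party library.

```python
import random


def substitute(values, lookup, rng):
    """Substitution: replace each value with one drawn from a lookup list."""
    return [rng.choice(lookup) for _ in values]


def shuffle_column(values, rng):
    """Shuffling: reorder the column using only its own values."""
    shuffled = list(values)  # work on a copy so the input is untouched
    rng.shuffle(shuffled)
    return shuffled


def null_out(values):
    """Nulling out: replace every value with None."""
    return [None for _ in values]


rng = random.Random(42)  # fixed seed so the example is repeatable
names = ["Alice", "Bob", "Carol", "Dave"]
lookup = ["Pat", "Sam", "Lee", "Kim", "Alex"]  # assumed lookup list

substituted = substitute(names, lookup, rng)
shuffled = shuffle_column(names, rng)
nulled = null_out(names)
```

Note that a random shuffle can leave some values in their original position, and that the nulled column no longer holds anything an application could validate or display, which is why nulling tends to break application logic.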

Static and dynamic data masking are both included in DataSunrise Database Security Suite, so you can choose the most suitable solution for your company. Whichever you choose, your sensitive data will be masked.
