How can we be sure that data in a computer system is correct? What checks can we build in to prevent bad data? Let's learn about validation checks in this post.
You might remember the definition of validation from Chapter 40. Just remember, validation gives an answer to the following question:
Having correct data stored on computer systems is key for the good functioning of applications. In order to reduce the chance of bad data making its way into our systems, we can implement checks called Validation Checks.
What data needs validation?
Here we list a few real-world examples of data that should be validated in programs:
The second value in a Time Variable must be between 0 and 59.
The age value for a person of Legal Age is at least 18.
An email address contains the @ symbol.
A holiday start date is before the holiday end date.
Types of Validation Checks
Range Check This checks the value of data to see if it is within a certain range, e.g. the month of the year must be between 1 and 12.
Constraint Check This checks that an entry obeys a given constraint or set of constraints. For example, a password may be required to meet a minimum length and contain characters from multiple categories.
Consistency Check This check ensures that data is logical. For example, the delivery date of an order cannot be earlier than its shipment date.
Let us test our understanding! Now that we've seen 3 types of validation checks, can you guess which type corresponds to our first set of examples?
The second value in a Time Variable must be between 0 and 59. [Range]
The age value for a person of Legal Age is at least 18. [Constraint]
An email address contains the @ symbol. [Constraint]
A holiday start date is before the holiday end date. [Consistency]
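The four examples above can be sketched as small Python functions. This is only a minimal illustration: the function names are made up for this post, and a real email check would need far more than looking for an @ symbol.

```python
from datetime import date


def valid_seconds(seconds: int) -> bool:
    """Range check: seconds must lie between 0 and 59 inclusive."""
    return 0 <= seconds <= 59


def valid_adult_age(age: int) -> bool:
    """Constraint check: a person of legal age is at least 18."""
    return age >= 18


def valid_email(address: str) -> bool:
    """Constraint check: the address must contain the @ symbol."""
    return "@" in address


def valid_holiday(start: date, end: date) -> bool:
    """Consistency check: the holiday must start before it ends."""
    return start < end
```

For instance, `valid_seconds(60)` returns `False`, and `valid_holiday(date(2024, 8, 10), date(2024, 8, 1))` returns `False` because the start date falls after the end date.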
Uses of Validation Checks
Validation checks can be implemented in many different situations, but one of the most common uses is at the point when data is being entered into a database through a form. In general, the sooner we validate data, the cheaper it is to do so.
In fact, a lot of effort goes into ensuring that data integrity is maintained within an organisation.
What does data integrity mean? Data integrity is the overall accuracy, completeness, and consistency of data. Why does it matter? We always say that the most important task of a computer is the processing of our data. If we are to trust computer systems, it is extremely important that the data being handled is sound.
In reality, multiple factors can affect the integrity of our data. Up until now, we were mostly referring to human error. All risks to data integrity need to be addressed, and this is no easy feat. Data is a big part of computer science, so many topics touch on this important notion in some way. For example, in Chapter 47 we talk about bugs and how important it is to contain them.
We can go on about this forever, so to bring this chapter to a conclusion, we will lastly address how we can maintain data integrity when computers are communicating in a network.
Validating Data Entries in a Database
Database Management Systems like Microsoft Access enable programmers to configure validation rules for particular fields to ensure that direct data entry to the database will not jeopardise the quality of the data.
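To get a feel for field-level rules, here is a rough sketch using SQLite through Python's built-in `sqlite3` module. Note that this swaps in SQLite's `CHECK` constraints for Access validation rules, purely for illustration; the `members` table, its columns and the `try_insert` helper are all hypothetical.

```python
import sqlite3

# An in-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Each CHECK clause acts as a validation rule attached to one field.
conn.execute(
    """
    CREATE TABLE members (
        name  TEXT    NOT NULL,
        email TEXT    NOT NULL CHECK (email LIKE '%@%'),
        age   INTEGER NOT NULL CHECK (age >= 18)
    )
    """
)


def try_insert(name, email, age):
    """Return True if the row is accepted, False if a rule rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO members (name, email, age) VALUES (?, ?, ?)",
                (name, email, age),
            )
        return True
    except sqlite3.IntegrityError:
        # Raised by SQLite when a CHECK or NOT NULL rule fails.
        return False
```

With these rules in place, a row with a missing @ symbol or an age below 18 is rejected by the database itself, no matter which program tries to insert it.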
Validating Data Input in Programs
There is a very good chance that our programs will accept data input from the user. We have done this many times in class. But perhaps we should show you this to refresh your memory...
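A minimal sketch of such a program in Python might look like this. The function name `ask_age`, the prompt text and the `read` parameter are assumptions made for illustration; `read` defaults to the built-in `input` and is only there so the function can be tried out without a keyboard.

```python
def ask_age(read=input):
    """Ask for an age and convert the reply to a whole number."""
    reply = read("Please enter your age: ")
    # int() raises ValueError if the reply is not a whole number.
    return int(reply)


# Interactive use would simply be:
#     age = ask_age()
```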
We would like to think that the user will comply... but there is no way of stopping the user from misbehaving, right? What do you think will happen if the user enters their name instead of their age?
If you guessed that the program will crash, well done, you got it right! The input is invalid: the program expects a number and cannot work with the text it was given.
Still, letting a program crash like that is not very nice. After all, the user could simply have made a genuine mistake. Let us make a small change to our program so that it does not crash when the user misbehaves...
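One possible way to make that change, sketched in Python, is to catch the conversion error and ask again. As before, the names `ask_age`, `read` and `show` and the message text are assumptions for illustration, not a fixed recipe.

```python
def ask_age(read=input, show=print):
    """Keep asking until the reply can be converted to a whole number."""
    while True:
        reply = read("Please enter your age: ")
        try:
            return int(reply)
        except ValueError:
            # The reply was not a whole number: explain and ask again
            # instead of letting the program crash.
            show("Sorry, that is not a whole number. Please try again.")


# Interactive use would simply be:
#     age = ask_age()
```

Catching only `ValueError` is a deliberate choice here: it handles the mistake we expect (text that is not a number) without hiding unrelated errors.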
This is the result...
Much better, right? You can try it out for yourself!
Validating Data Transfers 👨💻👩💻
Data Alteration (also known as diddling) can be defined as the illegal or unauthorised, fraudulent alteration of data. It is the process of modifying data before or after it is entered into the system, generating a faulty output. When transferring data over a network, it is especially important to detect whether the data has been altered along the way! A number of approaches exist; we will look into the easiest ones here.
Even/Odd Parity Check
This is just a single check bit that is added to a binary pattern. There are two variants: even and odd. The check bit is set to a value of 1 or 0 so that the message contains either an even or an odd number of 1s. For example, given a set of bits...
We need to set a check bit at the end. If we are using even parity then the check bit should be set to 1, so that the total number of 1s in the message is even. On the other hand, if we are using odd parity then the check bit should be set to 0, so that the total number of 1s in the message stays odd.
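The idea can be sketched in a few lines of Python. The function names and the bit pattern `1011000` used below are hypothetical and chosen only for illustration; bits are kept as a text string of 0s and 1s to keep things readable.

```python
def parity_bit(bits: str, even: bool = True) -> str:
    """Return the check bit ('0' or '1') for a string of 0s and 1s."""
    odd_number_of_ones = bits.count("1") % 2 == 1
    if even:
        # Even parity: add a 1 only if the count of 1s is currently odd.
        return "1" if odd_number_of_ones else "0"
    # Odd parity: add a 1 only if the count of 1s is currently even.
    return "0" if odd_number_of_ones else "1"


def add_parity(bits: str, even: bool = True) -> str:
    """Append the check bit to the end of the data bits."""
    return bits + parity_bit(bits, even)


def check_parity(message: str, even: bool = True) -> bool:
    """Return True if the received message still has the expected parity."""
    ones_are_even = message.count("1") % 2 == 0
    return ones_are_even if even else not ones_are_even
```

For example, `1011000` contains three 1s, so `add_parity("1011000")` gives `10110001` under even parity and `add_parity("1011000", even=False)` gives `10110000` under odd parity. If a single bit is flipped in transit, `check_parity` returns `False`. Keep in mind that a single parity bit cannot notice two flipped bits, since the count of 1s ends up with the same parity.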
The Check Digit
The check digit is a form of redundancy check used on identification numbers, such as bank account numbers or ISBN numbers. This check is good enough to detect common transcription errors, such as a single mistyped digit. It is less useful against deliberate alteration, since anyone who changes the number can also recalculate the check digit. It is quite similar to a binary parity bit used to check for errors in computer-generated data.
Above we have a picture of an ISBN number. The check digit (in the green box) is the last digit. Its value is calculated from the other 9 digits and provides, as its name implies, a check on the validity of the ISBN. If in transcribing the ISBN a mistake is made then there is a good chance that the resulting ISBN will be invalid thus indicating the error.
How does it work? Let us consider an example...
The check digit is calculated by using the nine digits above. The steps are as follows:
Start with the leftmost digit and multiply it by 10.
Moving from left to right, multiply each following digit by one less than the multiplier used before it. For example, the second digit is multiplied by 9, the third by 8, and so on... The ninth digit is multiplied by 2.
Add up all the products and apply Modulus 11, that is, find the remainder after dividing the sum by 11.
The check digit will be eleven minus the remainder. If this gives 11 the check digit is 0, and if it gives 10 the check digit is written as X.
225 divided by 11 = 20 remainder 5.
The check digit is modulus - remainder, 11 - 5 = 6
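The steps above can be sketched in Python as follows. The function name `isbn10_check_digit` is made up for this post, and the ISBN used in the usage note is a separate, commonly quoted ISBN-10 and not the one from the worked example.

```python
def isbn10_check_digit(first_nine: str) -> str:
    """Work out the ISBN-10 check digit from the first nine digits."""
    if len(first_nine) != 9 or not first_nine.isdecimal():
        raise ValueError("expected exactly nine digits")

    # Multipliers run 10, 9, 8, ..., 2 from the leftmost digit.
    multipliers = range(10, 1, -1)
    total = sum(m * int(d) for m, d in zip(multipliers, first_nine))

    # Eleven minus the remainder; the outer % 11 turns 11 into 0.
    value = (11 - total % 11) % 11

    # A value of 10 does not fit in one digit, so it is written as X.
    return "X" if value == 10 else str(value)
```

For instance, `isbn10_check_digit("030640615")` returns `"2"`: the products add up to 130, 130 divided by 11 leaves a remainder of 9, and 11 - 9 = 2.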
Now that we have seen how to get the check digit, all that is left is looking into the validation process. This is actually the easy part! All we have to do is perform the addition of products (as we did before), this time including the check digit multiplied by 1, and divide the total by 11. If the result is a whole number (there is no remainder) then the data is valid!
231 (the sum of products, 225 + 6 × 1) divided by 11 (the modulus) = 21 remainder 0. Zero remainder = valid ISBN.
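The validation step can be sketched in Python too. The function name `is_valid_isbn10` is hypothetical, and the sketch assumes the ISBN arrives as text that may contain hyphens or spaces.

```python
def is_valid_isbn10(isbn: str) -> bool:
    """Return True if the weighted sum of all ten characters divides by 11."""
    chars = isbn.replace("-", "").replace(" ", "").upper()
    if len(chars) != 10:
        return False

    total = 0
    # Multipliers run 10, 9, ..., 2 and finally 1 for the check digit.
    for multiplier, ch in zip(range(10, 0, -1), chars):
        if ch == "X" and multiplier == 1:
            value = 10  # X is only allowed as the check digit
        elif ch.isdecimal():
            value = int(ch)
        else:
            return False
        total += multiplier * value

    # No remainder means the ISBN passes the check.
    return total % 11 == 0
```

For example, `is_valid_isbn10("0-306-40615-2")` returns `True`, while swapping the last two digits to get `0306406125` makes it return `False`, which is exactly the kind of transcription slip the check digit is meant to catch.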