Trusting Your Data: Garbage In, Garbage Out

By Alan Crouch - December 6, 2019


The saying “Garbage in, garbage out” has long been used in software engineering to express the idea that poor-quality input produces faulty output.

Yet many applications still fail to apply the axiom. The consequences range from stored cross-site scripting, SQL injection, and buffer overflows to more benign malformed output that still reduces the application’s quality and usability.

Improper input validation affects more than the security of your application; it can also undermine your ability to make effective business decisions. If you can’t trust the data you receive, you can’t trust the quantitative decisions or reports built on it.

Any time data is sent to or used by an application, it should be treated as coming from an untrusted source. Whether the data comes from the end user or the database, it should be validated before it is used or saved to a trusted data store.
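As a minimal sketch of this idea, the Python example below re-validates rows read back from a database before using them, rather than assuming the stored values are clean. The `orders` table, the `quantity` column, and the accepted range of 1 to 1000 are all hypothetical choices made for illustration.

```python
import sqlite3


def is_valid_quantity(value):
    """Return True only for whole numbers in the assumed range 1 to 1000."""
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        return False
    return 1 <= value <= 1000


def load_valid_quantities(conn):
    """Read rows from the store and re-validate each one before use."""
    accepted, rejected = [], []
    rows = conn.execute("SELECT id, quantity FROM orders ORDER BY id")
    for order_id, quantity in rows:
        if is_valid_quantity(quantity):
            accepted.append(order_id)
        else:
            rejected.append(order_id)
    return accepted, rejected


def demo():
    """Build a small in-memory table that already contains bad data."""
    conn = sqlite3.connect(":memory:")
    # The column has no declared type, so SQLite stores each value as given.
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, quantity)")
    conn.executemany(
        "INSERT INTO orders (id, quantity) VALUES (?, ?)",
        [(1, 5), (2, -3), (3, "ten"), (4, None), (5, 1000)],
    )
    return load_valid_quantities(conn)
```

Calling `demo()` returns the accepted and rejected order ids as two lists. In a real application the rejected rows would more likely be logged or quarantined than silently dropped.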

Developers should not assume that data in the database is correct, because human error in input and improper validation happen all the time. APIs used by the application may have been compromised, and even the most trustworthy users can make simple mistakes that put your application into an error state or leave it open to compromise.

Nor should developers assume that malicious attackers are the only threat to guard against. Anyone and anything, no matter how well-intentioned, can cause problems if you trust all input implicitly.

Consider validating data from a variety of sources, including URL parameters, databases, internal and external APIs, other applications, and end users. Input validation should be applied both syntactically and semantically. Syntactic validation enforces the correct syntax of structured fields (such as Social Security numbers, dates, and currency symbols), while semantic validation enforces the correctness of their values (for example, that a start date comes before an end date). By treating all data as potentially bad, you make your application more resistant to both errors and attacks.
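The difference between the two kinds of validation can be sketched in Python for a date field. The function names, the YYYY-MM-DD format, and the rule that a birth date may not be in the future are assumptions chosen for illustration, not requirements from any particular library.

```python
import re
from datetime import date

# Syntactic rule (assumed format): four digits, dash, two digits, dash, two digits.
DATE_PATTERN = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}")


def is_syntactically_valid(text):
    """Syntactic check: the field has the shape YYYY-MM-DD."""
    return isinstance(text, str) and DATE_PATTERN.fullmatch(text) is not None


def is_semantically_valid(text, today):
    """Semantic check: a real calendar date that is not in the future."""
    try:
        parsed = date.fromisoformat(text)
    except (TypeError, ValueError):
        # Covers non-string input and impossible dates such as February 30.
        return False
    return parsed <= today


def validate_birth_date(text, today):
    """Accept the value only if it passes both checks."""
    return is_syntactically_valid(text) and is_semantically_valid(text, today)
```

A value such as `2019-02-30` passes the syntactic check because it has the right shape, but fails the semantic check because that date does not exist. Passing `today` as a parameter, rather than reading the clock inside the function, keeps the check easy to test.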

When it comes to verifying inputs, there is no theoretical limit to how far a validator can go. However, by leveraging built-in input validation libraries, using boundary testing, and implementing fuzzing with a risk-based approach, you can make sure your inputs are sufficiently validated and tested.
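Boundary testing and fuzzing can be illustrated with a small Python sketch. The `parse_percentage` validator and its 0 to 100 range are hypothetical, and the fuzzer shown here is a naive random-string generator rather than a dedicated fuzzing tool.

```python
import random
import string


def parse_percentage(text):
    """Return an int from 0 to 100, or raise ValueError for anything else."""
    if not isinstance(text, str) or not text.isascii() or not text.isdigit():
        raise ValueError("expected ASCII digits only")
    if len(text) > 3:
        raise ValueError("too many digits")
    value = int(text)
    if value > 100:
        raise ValueError("out of range")
    return value


def check_boundaries():
    """Boundary testing: probe the values on and just beyond each edge."""
    results = {}
    for candidate in ["-1", "0", "1", "99", "100", "101"]:
        try:
            results[candidate] = parse_percentage(candidate)
        except ValueError:
            results[candidate] = None  # None marks a rejected input
    return results


def fuzz(iterations=1000, seed=0):
    """Naive fuzzing: feed random printable strings to the validator.

    Returns the number of accepted values that fall outside 0 to 100.
    Any exception other than ValueError propagates and signals a bug.
    """
    rng = random.Random(seed)  # fixed seed keeps the run reproducible
    unexpected = 0
    for _ in range(iterations):
        length = rng.randint(0, 8)
        candidate = "".join(rng.choice(string.printable) for _ in range(length))
        try:
            value = parse_percentage(candidate)
        except ValueError:
            continue
        if not 0 <= value <= 100:
            unexpected += 1
    return unexpected
```

One way to apply a risk-based approach here is to raise `iterations`, or widen the character set, for the inputs whose failure would be most costly, and to keep the run short for low-risk fields.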

Applying a consistent approach to input validation across your application will not only reduce the security risk to your company and customers, but also increase quality by reducing the chance of unforeseen bugs caused by bad data.


Alan Crouch is a senior software security specialist with Coveros, a Virginia-based firm focused on agile, software quality, and application security. Alan has worked closely with federal agencies and private companies to advise, audit, and support IT security and governance teams. In addition to his cybersecurity experience, he has a strong background in software engineering, test analysis, test automation, and security testing. Alan has focused his career on building secure software and developing better software security practices.
