Abstract Syntax Trees
An abstract syntax tree, or AST, is a tree-structured representation of source code, typically generated by the early parsing stages of a compiler. The tree captures the structure of the code unambiguously, so anomalous syntax can be found with simple searches.
Consider the example of an organization wishing to enforce a set of corporate coding standards. Stated in the standard is the basic requirement for the use of a compound statement block rather than single statements as the body of a loop (e.g. a for-loop). In this case,
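A check of this kind can be sketched as a recursive walk over a toy AST. The following is a minimal illustration in C, not the way any particular tool implements it: the "Node" type, the three node kinds, and the function names are all hypothetical, and a real parser would produce a far richer tree.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical node kinds for a toy AST; real parsers have many more. */
typedef enum { NODE_FOR, NODE_COMPOUND, NODE_EXPR_STMT } NodeKind;

typedef struct Node {
    NodeKind kind;
    int line;                 /* source line, used for reporting */
    struct Node *body;        /* NODE_FOR: the loop body */
    struct Node **children;   /* NODE_COMPOUND: statements in the block */
    size_t child_count;
} Node;

/* Walk the tree and count for-loops whose body is not a compound block. */
static int count_unbraced_loops(const Node *n)
{
    int violations = 0;

    if (n == NULL)
        return 0;

    if (n->kind == NODE_FOR) {
        if (n->body == NULL || n->body->kind != NODE_COMPOUND) {
            printf("line %d: loop body is not a compound statement\n", n->line);
            violations++;
        }
        violations += count_unbraced_loops(n->body);   /* nested loops */
    } else if (n->kind == NODE_COMPOUND) {
        assert(n->children != NULL || n->child_count == 0);
        for (size_t i = 0; i < n->child_count; i++)
            violations += count_unbraced_loops(n->children[i]);
    }
    return violations;
}

/* Hand-builds the tree for two loops, one without braces and one with,
 * then runs the check over it. */
int check_sample_tree(void)
{
    Node stmt1 = { NODE_EXPR_STMT, 2, NULL, NULL, 0 };
    Node bad_loop = { NODE_FOR, 1, &stmt1, NULL, 0 };      /* for (...) stmt; */

    Node stmt2 = { NODE_EXPR_STMT, 5, NULL, NULL, 0 };
    Node *block_items[] = { &stmt2 };
    Node block = { NODE_COMPOUND, 4, NULL, block_items, 1 };
    Node good_loop = { NODE_FOR, 4, &block, NULL, 0 };     /* for (...) { stmt; } */

    Node *top_items[] = { &bad_loop, &good_loop };
    Node root = { NODE_COMPOUND, 1, NULL, top_items, 2 };

    return count_unbraced_loops(&root);
}
```

Calling check_sample_tree() reports the first loop and returns a violation count of one. The point of the sketch is that the rule reduces to a test on a single parent-child relationship in the tree, which is why style checks of this sort suit AST scanning well.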
Code Path Analysis
Consider now a more complex example. This time instead of looking for style violations, we wish to check whether an attempted dereference of a pointer should be expected to succeed or fail:
if( x & 1 )
    ptr = NULL;
*ptr = 1;
In this case it is obvious from manual inspection that the variable “ptr” can assume a NULL value whenever the variable “x” is odd, and that this condition will cause an unavoidable zero-page dereference.
Attempting to find a bug of this type using AST scanning, however, is far from trivial. Consider the (simplified, for clarity) AST that would be created from that snippet of code:
Dereference-pointer – ptr
In this case, there is no obvious tree search or simple node enumeration that could cover the attempted, and at least occasionally illegal, dereferencing of “ptr” in anything like a reasonably generalized form. So for cases such as this, it is necessary to take a step beyond simply searching for patterns of syntax.
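One way to picture that step is to track what is known about “ptr” along each path through the code and merge the results where the paths rejoin. The sketch below is a deliberately small illustration in C under stated assumptions: the three-value "PtrState" lattice, the join rule, and the function names are hypothetical, and it models only the single if-statement from the snippet rather than a general control-flow graph.

```c
#include <assert.h>

/* Abstract values a pointer can hold at a program point (a tiny lattice). */
typedef enum { PTR_NOT_NULL, PTR_NULL, PTR_MAYBE_NULL } PtrState;

/* Merge the states arriving from two paths at a join point:
 * if the paths disagree, the pointer may or may not be NULL. */
static PtrState join(PtrState a, PtrState b)
{
    assert(a <= PTR_MAYBE_NULL && b <= PTR_MAYBE_NULL);
    return (a == b) ? a : PTR_MAYBE_NULL;
}

/*
 * Models the shape of the snippet:
 *     if (x & 1)
 *         ptr = NULL;      (only when branch_sets_null is non-zero)
 *     *ptr = 1;
 *
 * entry_state is what is known about ptr before the if-statement.
 * Returns 1 if the dereference can fault on at least one path, else 0.
 */
int deref_may_fault(PtrState entry_state, int branch_sets_null)
{
    /* Path where the condition holds (x is odd). */
    PtrState taken = branch_sets_null ? PTR_NULL : entry_state;
    /* Path where the condition fails (x is even): ptr is unchanged. */
    PtrState not_taken = entry_state;

    PtrState at_deref = join(taken, not_taken);
    return at_deref != PTR_NOT_NULL;
}
```

With a pointer known to be valid on entry, deref_may_fault(PTR_NOT_NULL, 1) returns 1, because the two paths disagree at the join and the dereference is reachable with a NULL value. The same call with branch_sets_null set to 0 returns 0. The defect is visible only once values are followed along paths, not from the shape of the tree alone.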
What type of issues can be found?
In this section, we will walk through a number of examples of problems that can be identified using modern static source code analysis tools, showing how they occur and what can happen if they are not remedied before shipment. Whilst many more types of weakness can be found using Klocwork’s tools, these examples should give the reader a firm grounding in what a good static analysis suite can do, regardless of the vendor.
Note that the examples given here are shown in a mix of C/C++ and Java. Where appropriate, however, the relevant capabilities within the product are available in all supported languages.
Traditionally of interest to developers working on consumer-facing applications, security is becoming more and more critical to developers in all types of environments, even those that have until recently considered source code security to be a non-issue. Some of the more important classes of security weakness that can be found with automated source code analysis are:
• Denial of service
• SQL injection
• Buffer overflow
• Cross-site scripting (XSS)
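To make one of these concrete, the buffer overflow case can be illustrated with a short C sketch. This is an assumed example rather than one taken from any tool's documentation: the buffer size, the function name copy_name, and its return convention are all hypothetical. The comment shows the unbounded pattern that a checker would typically report, and the function shows one bounded alternative.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NAME_BUF_SIZE 16   /* hypothetical fixed buffer size */

/*
 * Unsafe pattern of the kind a static checker would flag:
 *     char name[NAME_BUF_SIZE];
 *     strcpy(name, input);    // nothing bounds the length of input
 */

/* Copies src into dst (capacity dst_size bytes), always NUL-terminating.
 * Returns 0 on success, or -1 if src does not fit and was rejected. */
int copy_name(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL && dst_size > 0);

    size_t len = strlen(src);
    if (len >= dst_size) {      /* no room for the text plus terminator */
        dst[0] = '\0';
        return -1;
    }
    memcpy(dst, src, len + 1);  /* len + 1 includes the terminator */
    return 0;
}
```

A caller would pass the real capacity of the destination, for example copy_name(buf, sizeof buf, input), and treat a return of -1 as rejected input. Rejecting oversized input rather than silently truncating it is a design choice made for this sketch; either approach removes the out-of-bounds write.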
Notes for Editors
Gwyn Fisher is the CTO of Klocwork, a leading developer of source code analysis software, and an expert in static code analysis. He is responsible for guiding the company’s technical direction and strategy. With nearly 20 years of global technology experience, he brings a valuable combination of vision, experience, and direct insight into the developer perspective.
Klocwork is an enterprise software company providing automated source code analysis products that automate security vulnerability and quality risk assessment, remediation, and measurement for C, C++ and Java software. More than 300 organizations have integrated Klocwork’s automated source code analysis tools into their software development process in order to ensure their code is free of mission-critical flaws while freeing their developers to focus on what they do best: innovate.