Program slicing is a technique to extract parts of a program based on a criterion. Atom (powered by the chen library) is a static, opinionated data-flow slicer optimized for application and dependency analysis use cases on codebases of up to 100K LOC.
- Precise - With static analysis, atom can generate precise slices with verifiable location information from the application source code.
- Non-deterministic - The slicing operation is optimized for constant-time generation performance and therefore non-deterministic. Repeated runs could yield slightly varying results depending on code complexity.
- Secure - It is not possible to reverse-engineer and obtain the application source code from the atom slices alone.
All slices are emitted as machine-readable JSON that can be parsed using the atom proto specification.
Usage slices can help answer two key questions about the usages of external libraries.
- HOW? Are the libraries used as-is or via custom alias or derived type?
- WHERE? File and line number locations of the definitions, imports, usage, calls etc.
The mind map below offers an overview.
- Parse the usages json.
- Iterate over the `objectSlices` array. For each slice, store its `fileName` and `lineNumber`.
{
"objectSlices": [
{
"code": "",
"fullName": "com.example.vulnspring.WebController.jwt:java.lang.String(javax.servlet.http.HttpSession,org.springframework.ui.Model)",
"signature": "java.lang.String(javax.servlet.http.HttpSession,org.springframework.ui.Model)",
"fileName": "src/main/java/com/example/vulnspring/WebController.java",
"lineNumber": 274,
"columnNumber": 2,
"usages": [
{
"targetObj": {
"name": "username",
"typeFullName": "java.lang.String",
"lineNumber": 276,
"columnNumber": 3,
"label": "LOCAL"
},
- Iterate over the `usages` array. The attributes `typeFullName` (found in `targetObj` and `definedBy`) and `resolvedMethod` (found in `invokedCalls` and `argToCalls`) under each category are of interest.
- Iterate over the `userDefinedTypes` array. Note the `fileName` and `lineNumber` for each type. For each `field`, the attribute `typeFullName` indicates the aliased field. For each `procedure`, the `paramTypes` array lists the custom type from index 1 onward.
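The steps above can be sketched in Python. This is a minimal illustration, not part of atom itself: the function name `summarize_usages` and the shape of the returned dictionary are hypothetical, and the code assumes that `invokedCalls` and `argToCalls` are arrays of objects carrying a `resolvedMethod` attribute, as the description suggests. Missing attributes are tolerated rather than treated as errors.

```python
def summarize_usages(usages: dict) -> dict:
    """Collect locations, types, and resolved methods from a parsed usages slice.

    `usages` is the dictionary obtained from json.load() on the usages file.
    The return shape is an assumption made for this sketch.
    """
    locations = []            # (fileName, lineNumber) for every object slice
    type_full_names = set()   # typeFullName from targetObj / definedBy
    resolved_methods = set()  # resolvedMethod from invokedCalls / argToCalls

    for obj_slice in usages.get("objectSlices", []):
        locations.append((obj_slice.get("fileName"), obj_slice.get("lineNumber")))
        for usage in obj_slice.get("usages", []):
            for key in ("targetObj", "definedBy"):
                node = usage.get(key) or {}
                if node.get("typeFullName"):
                    type_full_names.add(node["typeFullName"])
            for key in ("invokedCalls", "argToCalls"):
                for call in usage.get(key) or []:
                    if call.get("resolvedMethod"):
                        resolved_methods.add(call["resolvedMethod"])

    user_defined_types = {}
    for udt in usages.get("userDefinedTypes", []):
        user_defined_types[udt.get("name")] = {
            "fileName": udt.get("fileName"),
            "lineNumber": udt.get("lineNumber"),
            # field name -> type of the aliased field
            "fieldTypes": {
                f.get("name"): f.get("typeFullName") for f in udt.get("fields", [])
            },
            # procedure name -> parameter types, skipping index 0
            "paramTypes": {
                p.get("callName"): (p.get("paramTypes") or [])[1:]
                for p in udt.get("procedures", [])
            },
        }

    return {
        "locations": locations,
        "typeFullNames": type_full_names,
        "resolvedMethods": resolved_methods,
        "userDefinedTypes": user_defined_types,
    }
```

A typical call would load the file first, for example `summarize_usages(json.load(open("usages.json")))` after `import json`.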
"userDefinedTypes": [
{
"name": "com.example.vulnspring.WebController",
"fields": [
{
"name": "jdbcTemplate",
"typeFullName": "org.springframework.jdbc.core.JdbcTemplate",
"lineNumber" : 42,
"columnNumber": 15,
"label": "LOCAL"
},
{
"name": "logger",
"typeFullName": "org.slf4j.Logger",
"lineNumber": 44,
"columnNumber": 30,
"label" : "LOCAL"
}
],
"procedures": [
{
"callName": "home",
"resolvedMethod": "com.example.vulnspring.WebController.home:java.lang.String(org.springframework.ui.Model,javax.servlet.http.HttpSession)",
"paramTypes" : [
"com.example.vulnspring.WebController",
"org.springframework.ui.Model",
"javax.servlet.http.HttpSession"
],
"returnType": "java.lang.String",
"lineNumber": 46,
"columnNumber": 2
},

Data flow slices represent the data-dependency information computed statically from the source code using a reverse-reachability algorithm. The full list of nodes and edges from the Data Dependency Graph (DDG) is also made available for custom visualization and traversal purposes. Up to 50 reachable paths are precomputed by the atom CLI tool and made available via the `paths` attribute in the JSON for convenience.
It is quite common for organizations to keep common libraries and modules in separate repositories, jar files, and other packages. These modules would use external libraries as sinks and might lack any entrypoints (sources). Data flow slicing could work well in these scenarios, where the entrypoints (sources) cannot be identified. Because of its brute-force nature, data-flow slicing would often take a significant amount of time compared to usages or reachables slicing.
- Parse the data flow json.
- Iterate over the `graph.nodes` array and create a map with each node's `id` as the key and the node as the value.
- Iterate over the `paths` array. For each id, look up the node in the map created in the previous step.
- Filter out any operator calls, where the name starts with `<operator`. Note that operator calls could start with either `<operator>` or `<operators>` (with an `s`) due to a known unresolved bug.
- `CALL` nodes with `isExternal=true` indicate external method calls. The `fullName` property is of interest for such external calls, along with the `parent*` attributes such as `parentFileName` and `parentMethodName`.
- Nodes with the label `METHOD_PARAMETER_IN` are method parameters. These could be user-provided input depending on the framework and filename. For instance, method parameters in a controller or service class usually take input from users or another service.
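A minimal Python sketch of this traversal follows. It assumes the parsed JSON has a `graph` object holding the `nodes` array and a top-level `paths` array whose entries are lists of node ids, which is how the steps above read. The function names are hypothetical and chosen for illustration only.

```python
def resolve_paths(data_flow: dict) -> list:
    """Turn each path of node ids into a list of node objects, minus operator calls."""
    # Index every node by its id for constant-time lookup.
    nodes_by_id = {
        node["id"]: node for node in data_flow.get("graph", {}).get("nodes", [])
    }
    resolved = []
    for path in data_flow.get("paths", []):
        nodes = [nodes_by_id[node_id] for node_id in path if node_id in nodes_by_id]
        # The "<operator" prefix covers both <operator> and <operators> names.
        nodes = [n for n in nodes if not (n.get("name") or "").startswith("<operator")]
        resolved.append(nodes)
    return resolved


def external_calls(nodes: list) -> list:
    """Return the CALL nodes that are flagged as external method calls."""
    return [n for n in nodes if n.get("label") == "CALL" and n.get("isExternal") is True]


def method_parameters(nodes: list) -> list:
    """Return the METHOD_PARAMETER_IN nodes, which may carry user-provided input."""
    return [n for n in nodes if n.get("label") == "METHOD_PARAMETER_IN"]
```

For each resolved path, `external_calls(path)` gives the candidate sinks (inspect `fullName` and the `parent*` attributes), while `method_parameters(path)` gives the candidate sources.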
The information in a data-flow slice can be used as component evidence in a CycloneDX 1.5 document.
| Data Flow Slice Attribute | CycloneDX Attribute | Comments |
|---|---|---|
| parentPackageName | package | Will be based on the filename for JavaScript/TypeScript |
| parentClassName | module | Will be based on the filename for JavaScript/TypeScript |
| parentMethodName | function | |
| parentMethodSignature | parameters | Could be customized to ignore return types |
| lineNumber | line | Could be unavailable for certain projects |
| columnNumber | column | Could be unavailable for certain projects |
| parentFileName | fullFilename | |
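The mapping in the table can be applied mechanically. The sketch below is a plain attribute rename under that table, with a hypothetical function name `to_evidence_frame`. It does not attempt to produce a complete or schema-validated CycloneDX document, and values such as `parameters` may need further shaping to match the CycloneDX schema. Attributes that are missing or `null` in the slice are skipped, since line and column numbers can be unavailable.

```python
# Data flow slice attribute -> CycloneDX attribute, as listed in the table above.
SLICE_TO_CYCLONEDX = {
    "parentPackageName": "package",
    "parentClassName": "module",
    "parentMethodName": "function",
    "parentMethodSignature": "parameters",
    "lineNumber": "line",
    "columnNumber": "column",
    "parentFileName": "fullFilename",
}


def to_evidence_frame(node: dict) -> dict:
    """Rename the slice attributes of one node to their CycloneDX counterparts."""
    return {
        cdx_attr: node[slice_attr]
        for slice_attr, cdx_attr in SLICE_TO_CYCLONEDX.items()
        if node.get(slice_attr) is not None  # drop missing or null values
    }
```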
Reachables represent data flows that can originate from an entrypoint (source) and reach an external sink. These potentially represent the paths an adversary could take to reach and exploit a known vulnerability in a third-party library and hence the terms "reachable flows" or "Forward-Reachability". With atom, reachable slices can be generated for Java, Jars, JavaScript, and TypeScript applications.
A prerequisite is the presence of a Software Bill-of-Materials (SBOM) file in the directory containing the source code. At present, only SBOMs generated by CycloneDX Generator have the precision and depth required for computing reachables.
- Parse the reachables json.
- Iterate over the `reachables` array. Each item in this array is an object containing `flows` (reachable data flows) and `purls` (list of Package URLs).
- Each item in the `flows` array is of type `node`, similar to the `nodes` array in the data flow slice above.
- Each item in the `purls` array is of type string.
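As a rough Python sketch of these steps, the function below groups reachable flows by Package URL, so that each third-party package can be matched against the flows that reach it. The function name `flows_by_purl` is hypothetical. Because the exact top-level shape is an assumption here, the code accepts either a bare array of items or an object that holds the array under a `reachables` key.

```python
from collections import defaultdict


def flows_by_purl(reachables_json) -> dict:
    """Group reachable flows by the Package URLs they are associated with."""
    # Accept either {"reachables": [...]} or a bare top-level array (assumption).
    if isinstance(reachables_json, dict):
        items = reachables_json.get("reachables", [])
    else:
        items = reachables_json

    grouped = defaultdict(list)
    for item in items:
        flow = item.get("flows", [])  # list of node objects
        for purl in item.get("purls", []):  # list of purl strings
            grouped[purl].append(flow)
    return dict(grouped)
```

Each value in the returned dictionary is a list of flows, and each flow is a list of nodes that can be inspected in the same way as the nodes of a data flow slice.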
[
{
"flows": [
{
"id": 44,
"label": "METHOD_PARAMETER_IN",
"name": "this",
"fullName": "",
"signature": "",
"isExternal": false,
"code": "this",
"typeFullName": "com.example.SpringKafkaDemo.config.KafkaConsumerConfig",
"parentMethodName": "consumerFactory",
"parentMethodSignature": "org.springframework.kafka.core.ConsumerFactory()",
"parentFileName": "src/main/java/com/example/SpringKafkaDemo/config/KafkaConsumerConfig.java",
"parentPackageName": "com.example.SpringKafkaDemo.config",
"parentClassName": "com.example.SpringKafkaDemo.config.KafkaConsumerConfig",
"lineNumber": 36,
"columnNumber": null,
"tags": "framework-input"
}
],
"purls": [
"pkg:maven/org.springframework.kafka/spring-kafka@2.8.11?type=jar"
]
}
]

Use the atom cli to generate slices.
cdxgen -t java --deep -o bom.json .
atom reachables -o app.atom --slice-outfile df.json -l java .
atom data-flow -o app.atom --slice-outfile df.json -l java .
atom usages -o app.atom --slice-outfile usages.json -l java .

Planned for 2.0.0 release


