What is Soot?
Soot started off as a Java optimization framework. Today, researchers and practitioners from around the world use Soot to analyze, instrument, optimize, and visualize Java and Android applications.
What input formats does Soot provide?
Currently, Soot can process code from the following sources:
- Java (bytecode and source code up to Java 7), including other languages that compile to Java bytecode, e.g. Scala
- Android bytecode
- Jimple intermediate representation (see below)
- Jasmin, a low-level intermediate representation.
What output formats does Soot provide?
Soot can produce (possibly transformed/instrumented/optimized) code in these output formats:
- Java bytecode
- Android bytecode
Soot can go from any input format to any output format; for instance, it allows translation from Android to Java or from Java to Jasmin.
Who develops and maintains Soot?
Soot was originally developed by the Sable Research Group of McGill University. The first publication on Soot appeared at CASCON 1999. Since then, Soot has seen contributions from many people inside and outside the research community. The current maintenance is driven by the Secure Software Engineering Group at Technische Universität Darmstadt.
This publication provides an insight into the first ten years of Soot’s development.
What kind of analyses does Soot provide?
- Call-graph construction
- Points-to analysis
- Def/use chains
- Template-driven intra-procedural data-flow analysis
- Template-driven inter-procedural data-flow analysis, in combination with heros
- Taint analysis in combination with FlowDroid
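To give a feel for what "template-driven" data-flow analysis means, here is a minimal, self-contained sketch. It does not use Soot's classes. The names `ForwardFlowSketch`, `MayAssigned`, `initialFact`, `merge`, and `flowThrough` are hypothetical and chosen for illustration only. The idea is that the template owns the fixed-point iteration over a control-flow graph, while a concrete analysis only supplies the initial fact, the merge operator, and the transfer function.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Toy template for forward data-flow analysis (illustration only, not Soot's API). */
abstract class ForwardFlowSketch<F> {
    /** Fact used at the entry node and as the starting value everywhere. */
    protected abstract F initialFact();
    /** Combines the facts arriving from two predecessors. */
    protected abstract F merge(F a, F b);
    /** Transfer function: the fact after a node, given the fact before it. */
    protected abstract F flowThrough(F in, int node);

    /** succ.get(n) lists the successors of node n. Returns the out-fact of every node. */
    final List<F> solve(List<List<Integer>> succ) {
        int n = succ.size();
        // Derive predecessor lists from the successor lists.
        List<List<Integer>> pred = new ArrayList<>();
        for (int i = 0; i < n; i++) pred.add(new ArrayList<>());
        for (int i = 0; i < n; i++) for (int s : succ.get(i)) pred.get(s).add(i);

        List<F> out = new ArrayList<>();
        Deque<Integer> worklist = new ArrayDeque<>();
        for (int i = 0; i < n; i++) { out.add(initialFact()); worklist.add(i); }

        // Re-process nodes until no out-fact changes any more (fixed point).
        while (!worklist.isEmpty()) {
            int node = worklist.remove();
            F in = initialFact();
            for (int p : pred.get(node)) in = merge(in, out.get(p));
            F newOut = flowThrough(in, node);
            if (!newOut.equals(out.get(node))) {
                out.set(node, newOut);
                for (int s : succ.get(node)) if (!worklist.contains(s)) worklist.add(s);
            }
        }
        return out;
    }

    /** Example instance: which variables may have been assigned by the end of each node. */
    static final class MayAssigned extends ForwardFlowSketch<Set<String>> {
        private final String[] defs; // defs[i] is the variable assigned at node i, or null

        MayAssigned(String[] defs) { this.defs = defs; }

        @Override protected Set<String> initialFact() { return new TreeSet<>(); }

        @Override protected Set<String> merge(Set<String> a, Set<String> b) {
            Set<String> union = new TreeSet<>(a);
            union.addAll(b);
            return union;
        }

        @Override protected Set<String> flowThrough(Set<String> in, int node) {
            Set<String> result = new TreeSet<>(in);
            if (defs[node] != null) result.add(defs[node]);
            return result;
        }
    }

    public static void main(String[] args) {
        // Diamond-shaped graph: 0 -> 1 -> {2, 3} -> 4
        List<List<Integer>> succ = Arrays.asList(
                Arrays.asList(1), Arrays.asList(2, 3), Arrays.asList(4),
                Arrays.asList(4), Collections.<Integer>emptyList());
        String[] defs = {"x", null, "y", "z", null};
        System.out.println(new MayAssigned(defs).solve(succ));
    }
}
```

Writing a different analysis in this sketch means adding another small subclass; the worklist loop stays untouched. Inter-procedural analyses need more machinery than is shown here.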
What extensions exist to Soot?
- We maintain a list of extensions that can be used in combination with Soot. Feel free to add your own!
How does Soot work internally?
Soot transforms programs into an intermediate representation, which can then be analyzed. Soot provides four intermediate representations for analyzing and transforming Java bytecode:
- Baf: a streamlined representation of bytecode which is simple to manipulate.
- Jimple: a typed 3-address intermediate representation suitable for optimization.
- Shimple: an SSA variation of Jimple.
- Grimp: an aggregated version of Jimple suitable for decompilation and code inspection.
Jimple is Soot’s primary IR and most analyses are implemented on the Jimple level. Custom IRs may be added when desired.
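To illustrate what a 3-address representation looks like, the following toy sketch flattens a nested expression into statements that each apply at most one operator. It is not Soot's API and does not emit real Jimple: the class `ThreeAddressSketch`, its helper types, and the `$t` naming of temporaries are assumptions made for this example, and types are omitted for brevity.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy lowering of an expression tree to 3-address statements (illustration only). */
class ThreeAddressSketch {

    /** An expression is either a variable or a binary operation. */
    interface Expr {}

    static final class Var implements Expr {
        final String name;
        Var(String name) { this.name = name; }
    }

    static final class BinOp implements Expr {
        final String op;
        final Expr left;
        final Expr right;
        BinOp(String op, Expr left, Expr right) {
            this.op = op;
            this.left = left;
            this.right = right;
        }
    }

    private final List<String> statements = new ArrayList<>();
    private int nextTemp = 0;

    /** Emits statements for e and returns the name that holds its value. */
    private String lower(Expr e) {
        if (e instanceof Var) {
            return ((Var) e).name; // variables need no statement of their own
        }
        BinOp b = (BinOp) e;
        String l = lower(b.left);
        String r = lower(b.right);
        String temp = "$t" + nextTemp++; // fresh temporary for the intermediate result
        statements.add(temp + " = " + l + " " + b.op + " " + r);
        return temp;
    }

    /** Lowers the assignment "target = e" and returns the resulting statements in order. */
    static List<String> lowerAssignment(String target, Expr e) {
        ThreeAddressSketch sketch = new ThreeAddressSketch();
        String result = sketch.lower(e);
        sketch.statements.add(target + " = " + result);
        return sketch.statements;
    }

    public static void main(String[] args) {
        // x = (a + b) * (c - d)
        Expr e = new BinOp("*",
                new BinOp("+", new Var("a"), new Var("b")),
                new BinOp("-", new Var("c"), new Var("d")));
        for (String statement : lowerAssignment("x", e)) {
            System.out.println(statement);
        }
        // Prints:
        // $t0 = a + b
        // $t1 = c - d
        // $t2 = $t0 * $t1
        // x = $t2
    }
}
```

Because every statement has such a simple shape, an analysis only has to handle a small number of statement forms, which is one reason a 3-address form is convenient for optimization.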
How do I get started with Soot?
How do I obtain the nightly builds?
Nightly builds of Soot can be obtained from the nightly build page. The “soot-trunk.jar” file is an all-in-one file that also contains all the required libraries. The “sootclasses-trunk.jar” file contains only Soot, allowing you to manually pick dependencies as you need them.
About Soot’s source code
Soot follows the git-flow convention. Releases and hotfixes are maintained in the master branch. Development happens in the develop branch. To catch the bleeding edge of Soot, check out the latter. You will also need the projects jasmin and heros. In case of any questions, please consult the Soot mailing list.
We would like to thank the team of YourKit for providing us with free licenses of their profiler to improve the performance of Soot.