A framework for analyzing and transforming Java and Android Applications

Soot

Please help us improve Soot!

If you use Soot and would like to help us keep supporting it in the future, please fill out this short web form.

That way you can help us in two ways:

  • By letting us know how we can improve Soot, you directly help us prioritize newly planned features.
  • By stating your name and affiliation, you help us showcase Soot’s large user base. Thanks!

What is Soot?

Originally, Soot started off as a Java optimization framework. By now, researchers and practitioners from around the world use Soot to analyze, instrument, optimize and visualize Java and Android applications.

What input formats does Soot provide?

Currently, Soot can process code from the following sources:

  • Java (bytecode and source code up to Java 7), including other languages that compile to Java bytecode, e.g. Scala
  • Android bytecode
  • Jimple intermediate representation (see below)
  • Jasmin, a low-level intermediate representation.

What output formats does Soot provide?

Soot can produce (possibly transformed/instrumented/optimized) code in these output formats:

  • Java bytecode
  • Android bytecode
  • Jimple
  • Jasmin

Soot can translate from any input format to any output format. For instance, it can translate Android bytecode to Java bytecode, or Java to Jasmin.

Who develops and maintains Soot?

Soot was originally developed by the Sable Research Group of McGill University. The first publication on Soot appeared at CASCON 1999. Since then, Soot has seen contributions from many people inside and outside the research community. The current maintenance is driven by Eric Bodden’s Software Engineering Group at Heinz Nixdorf Institute of Paderborn University.

This publication provides an insight into the first ten years of Soot’s development.

What kind of analyses does Soot provide?

  • Call-graph construction
  • Points-to analysis
  • Def/use chains
  • Template-driven intra-procedural data-flow analysis
  • Template-driven inter-procedural data-flow analysis, in combination with heros
  • Taint analysis in combination with FlowDroid
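
To give a feel for what "template-driven" intra-procedural data-flow analysis means, here is a minimal, self-contained sketch in plain Java. It does not use Soot's API: the classes `ForwardFlowSketch` and `ReachingDefsSketch`, the integer node numbering, and the successor-list encoding of the control-flow graph are all assumptions made up for this illustration. The idea is that a generic solver owns the fixed-point iteration, while a concrete analysis only fills in three hooks: the initial fact, the merge operator, and the transfer function.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Toy illustration only; this is NOT Soot's API.
// The "template": subclasses supply the three hooks, solve() does the iteration.
abstract class ForwardFlowSketch<F> {
    /** Starting fact; assumed to be the identity element of merge(). */
    abstract F initial();
    /** Combines facts arriving from several predecessors. */
    abstract F merge(F a, F b);
    /** Transfer function: the fact that holds after executing the node. */
    abstract F transfer(int node, F in);

    /** successors.get(n) lists the successors of node n. Returns the out-fact per node. */
    final List<F> solve(List<List<Integer>> successors) {
        int n = successors.size();
        List<List<Integer>> preds = new ArrayList<>();
        for (int i = 0; i < n; i++) preds.add(new ArrayList<>());
        for (int i = 0; i < n; i++)
            for (int s : successors.get(i)) preds.get(s).add(i);

        List<F> out = new ArrayList<>();
        for (int i = 0; i < n; i++) out.add(initial());

        // Worklist iteration: reprocess a node's successors whenever its out-fact changes.
        Deque<Integer> worklist = new ArrayDeque<>();
        for (int i = 0; i < n; i++) worklist.add(i);
        while (!worklist.isEmpty()) {
            int node = worklist.poll();
            F in = initial();
            for (int p : preds.get(node)) in = merge(in, out.get(p));
            F newOut = transfer(node, in);
            if (!newOut.equals(out.get(node))) {
                out.set(node, newOut);
                worklist.addAll(successors.get(node));
            }
        }
        return out;
    }
}

// A concrete analysis: reaching definitions, where a fact is the set of
// node ids whose assignment may still be visible at this point.
final class ReachingDefsSketch extends ForwardFlowSketch<Set<Integer>> {
    private final String[] definedVar; // variable assigned at node i, or null

    ReachingDefsSketch(String[] definedVar) { this.definedVar = definedVar; }

    Set<Integer> initial() { return new TreeSet<>(); }

    Set<Integer> merge(Set<Integer> a, Set<Integer> b) {
        Set<Integer> r = new TreeSet<>(a);
        r.addAll(b);
        return r;
    }

    Set<Integer> transfer(int node, Set<Integer> in) {
        Set<Integer> out = new TreeSet<>(in);
        String v = definedVar[node];
        if (v != null) {
            out.removeIf(d -> v.equals(definedVar[d])); // kill older definitions of v
            out.add(node);                              // generate this definition
        }
        return out;
    }

    public static void main(String[] args) {
        // 0: x = 1;  1: if (...);  2: x = 2;  3: y = x
        String[] defs = {"x", null, "x", "y"};
        List<List<Integer>> succ = Arrays.asList(
            Arrays.asList(1), Arrays.asList(2, 3), Arrays.asList(3),
            Collections.<Integer>emptyList());
        System.out.println(new ReachingDefsSketch(defs).solve(succ));
    }
}
```

In this toy program both definitions of `x` (nodes 0 and 2) reach the final statement `y = x`, because the branch at node 1 can skip node 2. The solver only terminates if the transfer function is monotone over a finite set of facts, which holds for this example but is an assumption of the sketch rather than something it checks.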

What extensions exist to Soot?

How does Soot work internally?

Soot transforms programs into an intermediate representation, which can then be analyzed. Soot provides four intermediate representations for analyzing and transforming Java bytecode:

  • Baf: a streamlined representation of bytecode which is simple to manipulate.
  • Jimple: a typed 3-address intermediate representation suitable for optimization.
  • Shimple: an SSA variation of Jimple.
  • Grimp: an aggregated version of Jimple suitable for decompilation and code inspection.

Jimple is Soot’s primary IR and most analyses are implemented on the Jimple level. Custom IRs may be added when desired.
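
As a rough illustration of what "3-address" means for an intermediate representation such as Jimple, the following plain-Java sketch flattens a nested expression into statements that each apply at most one operator. It is not Soot code and does not produce real Jimple: the class `ThreeAddressSketch`, its `Expr`, `Leaf`, and `BinOp` types, and the `$t0`, `$t1`, ... temporary names are hypothetical and chosen only for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Toy illustration only; none of these types are part of Soot.
final class ThreeAddressSketch {
    /** A tiny expression tree: a leaf (variable or constant) or a binary operation. */
    static abstract class Expr {}

    static final class Leaf extends Expr {
        final String name;
        Leaf(String name) { this.name = name; }
    }

    static final class BinOp extends Expr {
        final String op;
        final Expr left;
        final Expr right;
        BinOp(String op, Expr left, Expr right) {
            this.op = op;
            this.left = left;
            this.right = right;
        }
    }

    private final List<String> statements = new ArrayList<>();
    private int nextTemp = 0;

    /** Emits statements for e and returns the name that holds its value. */
    private String lower(Expr e) {
        if (e instanceof Leaf) {
            return ((Leaf) e).name;
        }
        BinOp b = (BinOp) e;
        String l = lower(b.left);
        String r = lower(b.right);
        String temp = "$t" + nextTemp++;
        // Each emitted statement has one operator and at most three names.
        statements.add(temp + " = " + l + " " + b.op + " " + r);
        return temp;
    }

    /** Flattens "target = e" into a list of three-address statements. */
    static List<String> flatten(String target, Expr e) {
        ThreeAddressSketch s = new ThreeAddressSketch();
        String result = s.lower(e);
        s.statements.add(target + " = " + result);
        return s.statements;
    }

    public static void main(String[] args) {
        // x = (a + b) * (c - d)
        Expr e = new BinOp("*",
            new BinOp("+", new Leaf("a"), new Leaf("b")),
            new BinOp("-", new Leaf("c"), new Leaf("d")));
        flatten("x", e).forEach(System.out::println);
    }
}
```

Because every intermediate value gets its own named temporary, an analysis can reason about one simple statement at a time instead of walking nested expressions or modelling an operand stack. Types are left out of this sketch for brevity, whereas Jimple as described above is typed.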

How do I get started with Soot?

Documentation on Soot is available in the wiki, including a wide range of tutorials. We also provide JavaDoc documentation and a reference for the command-line options.

How do I obtain the nightly builds?

Nightly builds of Soot can be obtained from the nightly build page. The “soot-trunk.jar” file is an all-in-one file that also contains all the required libraries. The “sootclasses-trunk.jar” file contains only Soot, allowing you to manually pick dependencies as you need them. If you want to build your project with Maven, you can use our Nexus repository.

About Soot’s source code

Soot follows the git-flow convention. Releases and hotfixes are maintained in the master branch, and development happens in the develop branch. To work with the bleeding edge of Soot, check out the develop branch. You will also need the projects jasmin and heros. If you have any questions, please consult the Soot mailing list.

Acknowledgements

We would like to thank the team of YourKit for providing us with free licenses of their profiler to improve the performance of Soot.