File Structure of an APK
📌Why should you understand the file structure?
Understanding the file structure of an Android application is essential when reverse engineering. It helps you identify which files are worth analyzing and which ones can be safely ignored.
📌What is APK?
APK stands for “Android Package Kit.” It is the file format used to distribute and install applications on Android devices. An APK file contains all the necessary components of an Android app, including code, resources, assets, and the manifest file. An APK is a ZIP archive, so we can easily unzip it and get its components, but some of them will still be in binary form: the code is compiled, and AndroidManifest.xml is stored as binary XML.
Users can download and install APK files from various sources, such as the Google Play Store or third-party app stores.
📍Unzipping APK
APK files can be unpacked using the command:
unzip file.apk
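Because an APK is a ZIP archive, its contents can also be listed without extracting anything. The following is a minimal sketch using Python's standard zipfile module; the function name top_level_components is made up for this illustration.

```python
import zipfile


def top_level_components(apk):
    """Return the sorted top-level entries (files and directories) of an APK.

    `apk` may be a path or a binary file-like object. An APK is a ZIP
    archive, so zipfile can read it without extracting anything.
    """
    with zipfile.ZipFile(apk) as archive:
        components = set()
        for name in archive.namelist():
            head, sep, _ = name.partition("/")
            # Mark directories with a trailing slash, as `ls -F` would.
            components.add(head + "/" if sep else head)
        return sorted(components)
```

Calling top_level_components("file.apk") returns the top-level names found in the archive, which is a quick way to see which of the components described in this section a given app actually ships.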
Unzipping an APK will reveal its components, which will look something like this:

Components
assets directory ⇒ Contains resources such as pictures, sounds, certificates, or external files.
com directory ⇒ It usually doesn't contain any interesting data for us.
lib directory
Look into this directory to check if the application supports the x86 architecture. If there is an x86 directory containing a library, or if the app does not contain native libraries at all, this is most likely the case.
These shared object (.so) files contain C and C++ code compiled to native machine code, so they are processor-dependent. If your phone has an ARM CPU, the relevant directories are armeabi and armeabi-v7a (or arm64-v8a on 64-bit devices). The app loads shared object files to call the native functions they contain.
📌 When you write an Android application, you use Java, which compiles into a classes.dex file. However, Java isn't always efficient for tasks like rendering or 3D effects, so developers can also include C and C++ code, which gets compiled into these shared object files.
META-INF directory
Contains code signatures related to the app's signing process. Every Android app needs to be signed.
MANIFEST.MF: Contains a list of names/hashes (usually SHA256 in Base64) for all the files of the APK.
CERT.SF: Contains a list of names/hashes of the corresponding lines in the MANIFEST.MF file.
CERT.RSA: This file contains the public key and the signature of CERT.SF.
res directory
Contains predefined application resources, like XML files that define a state list of colors, user interface layout, fonts, values, etc.
AndroidManifest.xml
Contains meta-information about the application: its package name, version, permissions, and components such as activities.
classes.dex file
This is the most important file, as it contains the compiled Java source code in Dalvik executable format, to be executed by the Android Runtime.
resources.arsc file
Contains information about strings or color definitions used in the app (not usually important).
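To make the META-INF description above concrete, here is a rough sketch that recomputes the Base64 SHA-256 digest of each file and compares it with the value listed in MANIFEST.MF. It assumes the manifest uses SHA-256-Digest attributes (older APKs may use SHA1-Digest instead), and the function names are hypothetical. It checks digests only and does not validate the signature in CERT.RSA; APKs signed only with APK Signature Scheme v2 or later keep their signature outside META-INF, so this check does not cover them.

```python
import base64
import hashlib
import zipfile


def parse_manifest(text):
    """Map entry name -> Base64 SHA-256 digest from MANIFEST.MF text."""
    # JAR manifests wrap long lines; a continuation line starts with one space.
    unfolded = text.replace("\r\n", "\n").replace("\n ", "")
    digests = {}
    for section in unfolded.split("\n\n"):
        fields = dict(
            line.split(": ", 1) for line in section.splitlines() if ": " in line
        )
        if "Name" in fields and "SHA-256-Digest" in fields:
            digests[fields["Name"]] = fields["SHA-256-Digest"]
    return digests


def mismatched_entries(apk):
    """Return names whose content does not match the digest in MANIFEST.MF."""
    with zipfile.ZipFile(apk) as archive:
        manifest = archive.read("META-INF/MANIFEST.MF").decode("utf-8")
        bad = []
        for name, expected in parse_manifest(manifest).items():
            digest = hashlib.sha256(archive.read(name)).digest()
            if base64.b64encode(digest).decode("ascii") != expected:
                bad.append(name)
        return bad
```

An empty list from mismatched_entries("file.apk") means every file listed in the manifest still matches its recorded hash.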
In the AndroidManifest.xml, we look for the permissions and exported components. From there, we can find the classes that define the key components and hunt for vulnerabilities in them.
We should also check the assets directory, as it may contain certificates or other data used by the app that isn’t visible in the decompiled Java code. If this directory is absent, we can skip this check. :)
We usually care about assets, lib, META-INF, AndroidManifest.xml, and classes.dex.
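The lib directory check described earlier can be sketched as follows. This reads the heuristic as "an x86 directory exists, or there are no native libraries at all", which is an interpretation rather than a guarantee, and the function names are made up for this example.

```python
import zipfile


def native_abis(apk):
    """Map each ABI directory under lib/ to the shared objects it contains."""
    abis = {}
    with zipfile.ZipFile(apk) as archive:
        for name in archive.namelist():
            parts = name.split("/")
            # Native libraries live at lib/<abi>/<name>.so
            if len(parts) == 3 and parts[0] == "lib" and parts[2].endswith(".so"):
                abis.setdefault(parts[1], []).append(parts[2])
    return abis


def likely_runs_on_x86(apk):
    """Heuristic: an x86/x86_64 directory exists, or there are no native libs."""
    abis = native_abis(apk)
    return not abis or any(abi.startswith("x86") for abi in abis)
```

This is useful when deciding whether an app is likely to run on an x86 emulator or needs an ARM device or image.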
📌Dalvik (.dex)
On Android, applications are written in Java but originally ran on the Dalvik virtual machine, which was designed to work efficiently on battery-powered devices with limited memory. Since Android 5.0, the Android Runtime (ART) has replaced Dalvik while keeping the same executable format. The Java source code is first compiled to Java bytecode and then converted into a different bytecode format called the Dalvik executable format. This format is compact and independent of the CPU architecture, which helps conserve resources and battery life on mobile devices.
Dalvik executable code is stored in a compact binary file format called .dex. A .dex file contains the classes generated from the Java source code. If needed, the .dex file can be disassembled into a human-readable text representation such as smali.
One important limitation of the .dex file is that it can reference at most 65,536 methods (the 64K limit), because method indices are 16 bits wide. If an application exceeds this limit, the build produces multiple .dex files, named classes.dex, classes2.dex, and so on. Methods referenced from libraries, frameworks, and the Android system itself count toward the limit, so large dependencies are a common reason for multiple .dex files.

Overall, the Dalvik virtual machine and the .dex file format are crucial components of the Android platform, enabling efficient execution of Java-based applications on mobile devices.
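The method limit can be checked directly, because the .dex header records how many method identifiers the file defines. The sketch below reads that field from the fixed-size header. The offsets follow the published DEX header layout, the code assumes the usual little-endian byte order, and the function name is hypothetical.

```python
import struct

DEX_HEADER_SIZE = 0x70          # fixed header length in the DEX format
METHOD_IDS_SIZE_OFFSET = 0x58   # header field: number of method identifiers
MAX_METHOD_REFS = 0x10000       # size of the 16-bit method index range (64K)


def dex_method_count(header):
    """Return (version, method_ids_size) from the first 0x70 bytes of a .dex file."""
    if (len(header) < DEX_HEADER_SIZE
            or header[:4] != b"dex\n"
            or header[7:8] != b"\x00"):
        raise ValueError("not a DEX header")
    version = header[4:7].decode("ascii")  # three ASCII digits, e.g. "035"
    # Assumption: standard little-endian layout (endian_tag is not checked).
    (method_ids_size,) = struct.unpack_from("<I", header, METHOD_IDS_SIZE_OFFSET)
    return version, method_ids_size
```

Reading the first 0x70 bytes of classes.dex in binary mode and passing them to dex_method_count shows how close the file is to MAX_METHOD_REFS. A count near that ceiling usually goes together with a classes2.dex in the same APK.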
📌classes.dex
The classes.dex files contain Java source code that has been compiled into Dalvik executable bytecode format. The ghex tool can be used to view the raw hexadecimal data of the classes.dex file, but it cannot disassemble the bytecode into a readable format. You can use the following command to inspect the file:
ghex classes.dex
The output will look like this:
Here, we can see some header information and identify whether the file is compressed or detect certain patterns. To disassemble the Dalvik bytecode into a human-readable format, you should use a tool like dexdump.
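For a quick look at the same raw bytes without a GUI hex editor, a small hex dump can be sketched in Python. This is only an illustration of what a hex viewer shows, not a replacement for ghex or dexdump, and the function name hexdump is made up for this example.

```python
def hexdump(data, width=16):
    """Format bytes as offset, hex columns, and printable ASCII, one row per `width` bytes."""
    pad = width * 3 - 1  # width of a full hex column, used to align short rows
    rows = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{byte:02x}" for byte in chunk)
        # Non-printable bytes are shown as dots, as most hex viewers do.
        text = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        rows.append(f"{offset:08x}  {hex_part:<{pad}}  {text}")
    return "\n".join(rows)
```

Printing hexdump() of the first 0x70 bytes of classes.dex (read in binary mode) shows the header, which begins with the magic bytes "dex" followed by a newline and a version number.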

📌App retrieval
How can we obtain the APK file?
The program (for example, a bug bounty program) may provide the APK directly.
We can find the APK online using APKCombo
If only the package name is provided, here’s how to retrieve the APK:
Extract it from the mobile device after installation from Google Play (using ADB tool or an APK extractor app).
Use a Chrome extension (such as an APK downloader) to download the APK.
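Using the ADB tool, run the following commands to retrieve the APK file. The pm commands run inside the adb shell, while adb pull runs on your computer:
# 1. List third-party packages, or search all packages by name (inside adb shell)
pm list packages -3
pm list packages | grep -i <app_name>
# 2. Get the package path (inside adb shell)
pm path <package_name>
# 3. Pull the app to your computer (on your computer, outside adb shell)
adb pull <app_path>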
After running the above commands, the APK file will be on your computer.
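The retrieval steps above can also be scripted from the computer. The sketch below assumes adb is on the PATH and a device is connected. The helper names are hypothetical, and the only tool behavior it relies on is that pm path prints one line of the form package:<path> per APK.

```python
import subprocess


def parse_pm_path(output):
    """Extract on-device APK paths from `pm path <package>` output.

    Each line has the form `package:<path>`; apps installed as split APKs
    print several such lines.
    """
    prefix = "package:"
    return [
        line.strip()[len(prefix):]
        for line in output.splitlines()
        if line.strip().startswith(prefix)
    ]


def pull_apks(package, dest="."):
    """Run adb on the host: look up the package's APK paths, then pull each one."""
    result = subprocess.run(
        ["adb", "shell", "pm", "path", package],
        capture_output=True, text=True, check=True,
    )
    paths = parse_pm_path(result.stdout)
    for path in paths:
        subprocess.run(["adb", "pull", path, dest], check=True)
    return paths
```

Keeping the parsing separate from the adb calls means split APKs are handled naturally: every path reported by pm path is pulled, not just the first one.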

📌Decompiling with apktool
APKTool is a popular open-source tool used to decode and reverse-engineer Android APK files. It allows developers and researchers to extract the APK’s resources and manifest in readable form and to disassemble its Dalvik bytecode into smali code, a human-readable, assembly-like representation. The decoded output provides insights into how the app functions and what resources it uses, and it can be modified and rebuilt into a new APK.
Advantages:
Provides human-readable code in smali format, making it easier to understand and analyze.
Retains the original file structure and resources, making it suitable for modifying and recompiling the APK.
Usage:
To decompile the app, use apktool with the d option:
apktool d <app.apk>
APKTool will create a directory containing the decompiled resources:
To rebuild the application after making changes, use the b option:
apktool b <directory>
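Once APKTool has decoded the APK, the AndroidManifest.xml in the output directory is plain XML, so the manifest review mentioned earlier (permissions and exported components) can be partly automated. The following is a rough sketch using Python's standard XML parser; the function name is made up, and treating components with an intent filter as implicitly exported reflects the default behavior before Android 12, so treat those hits as candidates for review rather than confirmed findings.

```python
import xml.etree.ElementTree as ET

ANDROID_NS = "{http://schemas.android.com/apk/res/android}"
COMPONENT_TAGS = ("activity", "activity-alias", "service", "receiver", "provider")


def review_manifest(xml_text):
    """Return (permissions, exported components) from decoded AndroidManifest.xml text."""
    root = ET.fromstring(xml_text)
    permissions = [
        elem.get(ANDROID_NS + "name") for elem in root.iter("uses-permission")
    ]
    exported = []
    for tag in COMPONENT_TAGS:
        for elem in root.iter(tag):
            flag = elem.get(ANDROID_NS + "exported")
            # Without an explicit flag, a component with an intent filter was
            # exported by default before Android 12, so flag it for review too.
            implicit = flag is None and elem.find("intent-filter") is not None
            if flag == "true" or implicit:
                exported.append((tag, elem.get(ANDROID_NS + "name")))
    return permissions, exported
```

Reading the decoded AndroidManifest.xml in binary mode and passing its contents to review_manifest gives a short list of entry points, which can then be matched to the corresponding smali classes.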
📍Unzip vs decompile

White highlight: These directories are identical.
Green highlight: These directories are identical but exist in a different structure.
Other directories: These differ between APKTool and unzipping.
📌In summary: Unzipping an APK gives you access to its non-code assets, but the code remains in compiled form, making it hard to understand. On the other hand, decoding with APKTool provides deeper insights into the app's functionality and code by converting the bytecode into smali, which is more human-readable.