suite/auto-sync/RefactorGuide.md
This is a step-by-step overview of how to refactor an architecture.
It can also be used to add a new architecture module, as long as it is supported by LLVM or a fork of it.
Please always contact us in the Auto-Sync tracking issue before working on a module. We can provide support and save you a lot of time.
Don't hesitate to ask questions in our Telegram Community channel,
especially if you feel stuck or struggle to understand where an issue is coming from. The update process, although already simplified, is still relatively complex.
Note:
Whenever we mention C++ files in the steps below, we refer to the files in the LLVM repo.
`PrinterCapstone` is the class defined in `llvm-capstone/llvm/utils/TableGen/PrinterCapstone.cpp`.
Always attempt to make the translated C file behave as closely as possible to the original C++ file! This greatly helps debugging and ensures that Capstone behaves almost exactly the same as the original LLVM.
- `CONTRIBUTING.md`
- `docs/ARCHITECTURE.md`
- `suite/auto-sync/README.md`
- `suite/auto-sync/ARCHITECTURE.md`
- `suite/auto-sync/intro.md`
- `arch/<ARCH>/`, except the `ARCHModule.*` and `ARCHMapping.*`.
- `cd suite/auto-sync/`
- `inc` files
- `pip install -e .`
- `llvm-tblgen` (see docs)
- `ASUpdater -h`
- `Target.py`
- `PrinterCapstone.cpp::decoderEmitterEmitDecodeInstruction()` (add decoder function)
> [!NOTE]
> Architecture-specific code generation.
> There are several oddities of architectures which require slightly different generated code.
> If you search through `PrinterCapstone.cpp` for architecture names like `AArch64`, `ARM`, or `Sparc` you can see how these are handled.
- `ASUpdater -s IncGen -a ARCH`
- `inc` files in build look good.
- `<ARCH>InstPrinter.cpp` and `<ARCH>Disassembler.cpp`
- `arch_config.json` (LoongArch for a minimal example).
- `ARCHInstPrinter.cpp` to the list of the `AddCSDetail` tests!
- `<ARCH>InstPrinter.cpp`, `<ARCH>InstPrinter.h` and `<ARCH>Disassembler.cpp` to the translation list.
- `path_vars.json`
- `Patches/Includes.py`. Copy the code from another architecture as a starting point.
- Main header (`<arch>.h`) for patching:
  - `inc` files. File names like `<ARCH>GenCS<something>Enum.inc` contain enumerations for the header. Those get patched into the main header file of the architecture.
  - `// generated content <...> begin` comments for patching. See `loongarch.h` for an example.
- `arch/` and `include/capstone/<arch>.h` header!
- `ASUpdater -a <ARCH> -w --copy-translated -s IncGen Translate PatchArchHeader`
- `PrinterCapstone::normalizedMnemonic`
- `arch/<ARCH>`
- `Includes.py`. If not, you have to find the LLVM source file where they are defined and add it to `arch_config.json` to translate it.
- `SystemOperands.inc` file. It can also be generated by adding the arch to the list in `inc_gen.json`.
- `-w` flag for the `ASUpdater` and you checked thoroughly that all necessary files got translated!
- `CppTranslator`.
- `unsigned GR32Regs[]` should be `unsigned ARCH_GR32Regs[]`. See the namespace begin/end comments in the code.
- `ARCHLinkage.h` and the functions in the `InstPrinter.c`, `ArchDisassembler.c`.
  - `ARCHLinkage.h` is meant to separate the Capstone and LLVM code, at least loosely, into compile units, so that the LLVM and Capstone code can at some point live in their own object files. This is not yet implemented, but we try to keep them from becoming too entangled.
- `ARCHMapping.c`. Essential is everything not related to details.
- `mattr` and `mcpu` names to the CS identifiers if necessary. -> Edit the `mcupdater.json` config file.
- `ASUpdater -s MCUpdate -a Arch -w`
- It can happen that `MCUpdate` doesn't generate any tests. This means LLVM has no disassembly tests for this architecture.
In that case you can add your arch to `use_assembly_tests` in `mcupdater.json` to generate tests from LLVM's assembly tests instead.
Keep in mind that some of these tests can fail later even though they are correct:
the compiler can assemble an instruction to a semantically equivalent, but syntactically different one,
and this syntactic mismatch makes those tests fail in Capstone.
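
To make the syntactic mismatch concrete, here is a minimal Python sketch of a purely textual comparison. It is only an illustration and not how `cstest` is implemented; the helper names `normalize` and `same_text` are made up for this example. It uses the AArch64 case where `mov x0, x1` is an alias of `orr x0, xzr, x1`: both describe the same operation, yet a text comparison treats them as different.

```python
import re


def normalize(asm: str) -> str:
    """Lower-case and collapse whitespace so only real syntax differences remain."""
    return re.sub(r"\s+", " ", asm.strip().lower())


def same_text(expected: str, actual: str) -> bool:
    """Hypothetical text-based check: True if both strings are equal after normalization."""
    return normalize(expected) == normalize(actual)


# What the assembly test wrote vs. a semantically equivalent spelling.
written = "mov x0, x1"
equivalent = "orr x0, xzr, x1"  # AArch64: `mov Xd, Xm` is an alias of `orr Xd, xzr, Xm`

print(same_text("MOV  x0,  x1", written))  # True: only case and spacing differ
print(same_text(written, equivalent))      # False: same semantics, different syntax
```

A test that fails in this way does not necessarily point to a bug in the translated code, so check the operands and encoding before changing anything.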
- `cstest tests/MC/Arch`
- `LoongArchMapping.c` or `SystemZMapping.c` but change values.
- Changes to the header (`arch.h`) are only allowed if it was wrong before. Otherwise only extensions.
- `ArchMapping.c` should be covered near 100%.
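
The header patching mentioned earlier relies on `// generated content <...> begin` marker comments in the main header. As a rough illustration of how marker-based patching can work in general, here is a minimal Python sketch. It is not the ASUpdater implementation: the function name `patch_between_markers`, the matching `end` marker, and the file name `DemoGenCSInsnEnum.inc` are assumptions made for this example.

```python
import re


def patch_between_markers(header: str, inc_name: str, generated: str) -> str:
    """Replace the text between the begin/end marker comments for `inc_name`.

    Assumption: every begin marker has a matching end marker of the same form.
    """
    begin = f"// generated content <{inc_name}> begin"
    end = f"// generated content <{inc_name}> end"
    pattern = re.compile(
        rf"({re.escape(begin)}\n).*?({re.escape(end)})", re.DOTALL
    )
    if not pattern.search(header):
        raise ValueError(f"markers for {inc_name} not found")
    # A function is used as replacement so backslashes in `generated` stay literal.
    return pattern.sub(
        lambda m: m.group(1) + generated.rstrip("\n") + "\n" + m.group(2),
        header,
        count=1,
    )


# Hypothetical header snippet with stale content between the markers.
old_header = (
    "typedef enum demo_insn {\n"
    "// generated content <DemoGenCSInsnEnum.inc> begin\n"
    "DEMO_INS_OLD,\n"
    "// generated content <DemoGenCSInsnEnum.inc> end\n"
    "} demo_insn;\n"
)

print(patch_between_markers(old_header, "DemoGenCSInsnEnum.inc", "DEMO_INS_ADD,\nDEMO_INS_SUB,"))
```

Everything outside the markers is left untouched, which is why hand-written parts of the header survive a regeneration while the generated enumerations are replaced wholesale.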