I want to be an ARM reverse engineer


Aka, how I went from an Android developer with 5 years of experience to calling myself a semi-good reverse engineer (android_system_knowledge++, cracking++, static_analysis++, brain++, travels_around_the_world++, awesome_people_met++, etc++).

Hello!!! I'm writing this blog article to help everyone in the secRet community start their journey into reverse engineering and cracking, and eventually exploitation, malware analysis, or whatever they like most.

It is meant to highlight all the mistakes I made, so that you can speed up your own learning process.

Let's start from the fact that, when I began:

  • I had never seen a line of ARM
  • I had never attached, used, or even heard of GDB
  • I had never used or heard of Frida (nor did I know the difference between dynamic instrumentation and debugging)
  • I had poor C and Python knowledge but quite good Java and general OOP (if that matters, I'm now ranked 17th among Python developers in Italy (source: git-awards.something))
  • Everything that follows from the above (I didn't know IDA, Binja, etc.)

Some achievements from 2/3/4 years later that I think deserve a mention:

  • people around the world believe I have something good to say, so they invite me for a beer and a talk
  • I've met, or am about to meet, lots of friends and some of the best researchers from all around the world (thanks NowSecure)
  • I've got a stable and awesome job
  • I've coded a debugger (most of the work is thanks to open source projects such as Frida and Capstone), but the core logic of the "tricks" used to achieve breakpoints, watchpoints, emulation with memory automatically mapped from the target, etc. makes me really proud of it
  • I read ASM, hex, binary (WIP... it's damn hard, eh)
  • I can debug with GDB, understand crashes, and eventually exploit them in whatever ring
  • I can use Frida pretty well
  • I can port any ARM 32/64 or x86/x86-64 asm to whatever other language
  • I've cracked Arxan (used in games, banking, gambling, and general apps with money involved), Tencent, Supercell, TikTok, and other less protected or less popular targets
  • I can understand malware logic, break it, debug processes, bla bla, etc.
  • this is a hard one, first blame incoming: until now, I have never failed a crack when it was something I really wanted

Here is the list of tools. If you read what follows, you'll find them useful:

What got me into cracking was the wish to give more help to the Pokemon GO cracking community back in the day (3/4 years ago or so), when I was developing PokeMesh together with @rEDSAMK.

So let's start. Second blame incoming: the first suggestion I'm going to give is to AVOID hard papers, YouTube tutorials, and generic googling on the subject (e.g. "cracking tutorial", "how to reverse engineer arm"). (Please remember, this is still built on my personal experience xD.)

The first stage is to give yourself a target: something easy and not hardcore, with no packers, obfuscation, or sick tricks. I went straight to a semi-hard one, Clash Royale. It was not protected at all back in the day, but the complexity of the Sodium encryption, and of understanding encryption in general (hashes, private/public key encryption, etc.), made me fail and burn 3 months or more. It was not really wasted time, but it could have been optimized. To make up for my poor understanding of debugging and ARM I did a lot of hard self-learning, but this too could be optimized by asking the right questions in secRet 😛 (asking for a suggestion, or for what an instruction means in short, is not like asking for the soup already cooked, so you'll probably get a reply. Push yourself into it first).

Games are good targets. Nowadays mobile games are real business. eSports are a reality. There is betting, there are tournaments around the world (thanks Supercell for Tokyo), and so games are heavily protected to prevent cheating and hacking, sometimes with techniques similar to those used to pack malware or protect IoT devices (believe me... I have seen all kinds of crazy shit meant to protect user space from data leaks and manipulation, and I can't wait to see new ones!). Any application that just exchanges data with a remote server and is multi-OS (Android and iOS, which makes it likely that it ships with shared native libraries) could be a good target, assuming you already know some basic Java reverse engineering. Android is best. Open source code helps a lot to understand, for example, how a process is spawned. The same goes for the kernel: because the code is open, I wrote a kernel module to give myself a little help from the kernel while debugging user space from user space, learnt colorful ways to use shared memory as IPC from kernel to user, and so on.

Do it because you want to do it. Otherwise you are wasting time. If you are driven enough toward the goal, you'll do it and learn whatever it takes (sounds more like a life lesson xD, whatever).

Pure example:

  1. I don't give a damn about the effort. I'll be happy once I see (and my eyes need to see it) 1500/5000/2500000 lines of Python that replicate the encryption logic. (source)
  2. fact

Another good requirement for our first target is a custom TCP network protocol, unencrypted or with common encryption. Once you have selected the target, the next step is to rebuild the network protocol. This is how I learnt, and it was pretty fast.
You can also swap this for something simpler: I push a button in the game and it opens a dialog with a string; break when that dialog opens and change the string at runtime. Whatever.
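
To give an idea of what "rebuilding the network protocol" ends up looking like, here is a minimal Python sketch that splits a captured TCP byte stream into messages. The header layout (2-byte message id, 4-byte payload length, both big-endian) and the names `build_frame` and `split_frames` are made up for illustration. They are not taken from any real game, and a real target will have its own layout that you have to discover.

```python
import struct
from typing import List, Tuple

# Hypothetical header layout, NOT taken from any real target:
# 2-byte big-endian message id followed by a 4-byte big-endian payload length.
HEADER = struct.Struct(">HI")


def build_frame(msg_id: int, payload: bytes) -> bytes:
    """Serialize one message using the assumed header layout."""
    return HEADER.pack(msg_id, len(payload)) + payload


def split_frames(stream: bytes) -> Tuple[List[Tuple[int, bytes]], bytes]:
    """Split a captured byte stream into (msg_id, payload) pairs.

    Returns the complete frames plus the trailing bytes that do not yet
    form a full frame (TCP can deliver one message in several chunks).
    """
    frames = []
    offset = 0
    while len(stream) - offset >= HEADER.size:
        msg_id, length = HEADER.unpack_from(stream, offset)
        end = offset + HEADER.size + length
        if end > len(stream):
            break  # payload is incomplete, wait for more data
        frames.append((msg_id, stream[offset + HEADER.size:end]))
        offset = end
    return frames, stream[offset:]


if __name__ == "__main__":
    # One complete message followed by a truncated second one.
    captured = build_frame(10101, b"login") + build_frame(10108, b"")[:3]
    frames, rest = split_frames(captured)
    print(frames)      # complete frames found so far
    print(rest.hex())  # leftover bytes of the truncated frame
```

Keeping the leftover bytes around matters in practice: a buffer seen at a single recv call is not guaranteed to line up with message boundaries.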

The next steps can easily be learnt by googling with the right keywords: "how to attach GDB", "how to run frida on android", etc. To rebuild the network protocol, start by hooking with Frida, or breakpointing with GDB, the low-level APIs that send and receive messages through sockets (send, recv, sendto, recvfrom, SSL_write, SSL_read, etc.). You'll learn that there are abstractions which exist in the libc of every OS (Android, iOS, Windows). LIBC <- EL IB C: "libc exported plt", "arm syscall table".
If you are dealing with HTTP/S, the most you'll get is some hash in the headers to make it a bit fun; if it's TCP, you'll have to rebuild it. Use backtraces to understand where function calls come from, step through the code, emulate the functions line by line, and understand what each instruction is really doing deep down. Would you tell me it's hard to understand that MOV R0, R1 copies R1 into R0? Or that LDR.W R2, [R4, #4] loads a word from the address in R4 + 4? A ! at the end of the operand string could eventually bring trouble, but it's easy to look up. Two weeks later you can understand arm64 and x86/x86-64 with the same principles. Take arm64, for instance: it has more registers and the assembly looks almost the same. x86 is a bit different, but you simply learn how the shit works down there... wow, there are registers which hold things and instructions which are just bytes... like everything else.
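
The two instructions above, plus the trailing ! form, can be modelled in a few lines of Python. This is only a toy sketch to make the semantics concrete: `ToyCpu` and its methods are invented for this example, memory is assumed to be little-endian (the usual case on Android), and nothing here comes from a real emulator such as the ones used by actual debuggers.

```python
import struct


class ToyCpu:
    """Tiny model of a few ARM registers and a flat little-endian memory."""

    def __init__(self, mem_size: int = 0x100):
        self.regs = {f"R{i}": 0 for i in range(13)}
        self.mem = bytearray(mem_size)

    def mov(self, dst: str, src: str) -> None:
        """MOV dst, src: copy the value of src into dst."""
        self.regs[dst] = self.regs[src]

    def ldr(self, dst: str, base: str, offset: int = 0,
            writeback: bool = False) -> None:
        """LDR dst, [base, #offset]: load a 32-bit word from base + offset.

        With writeback=True this models the pre-indexed form
        LDR dst, [base, #offset]! which also updates base to base + offset.
        """
        address = (self.regs[base] + offset) & 0xFFFFFFFF
        self.regs[dst] = struct.unpack_from("<I", self.mem, address)[0]
        if writeback:
            self.regs[base] = address


if __name__ == "__main__":
    cpu = ToyCpu()
    cpu.regs["R1"] = 0x1234
    cpu.regs["R4"] = 0x10
    struct.pack_into("<I", cpu.mem, 0x14, 0xDEADBEEF)  # word at R4 + 4

    cpu.mov("R0", "R1")                     # MOV R0, R1
    cpu.ldr("R2", "R4", 4)                  # LDR.W R2, [R4, #4]
    cpu.ldr("R3", "R4", 4, writeback=True)  # LDR R3, [R4, #4]!
    print({name: hex(cpu.regs[name]) for name in ("R0", "R2", "R3", "R4")})
```

The only difference the ! makes is that the base register keeps the computed address afterwards, which is exactly the kind of detail that is easy to miss while stepping and easy to look up once you know what to search for.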

The text above on hooking, backtracing, and stepping contains all the keywords needed for successful googling.

Once that is achieved, however long it takes, if you pick the cracking/malware analysis path you'll move on to packed and protected targets. Based on my experience, nowadays most hardcore obfuscation is built on top of LLVM, and most anti-debugging tricks happen in DT_INIT, which is why you need to be able to step through that early initialization code (2 clicks or 1 line of code in the Dwarf debugger). And if it's not like that, there is a good chance it will turn out to be something fun.

Other general suggestions:

  • ask the right questions. Don't ask for the thing itself; ask how you can do it and whether you are doing it right. Show your code, try things, and do some research by yourself using the right keywords
  • the laziest and shittiest ways are the best ways. No one sees your debugging code
  • share your knowledge afterwards, so that whatever I wrote here can be optimized further
  • share your knowledge also because you have the chance to start an awesome white-hat cat-and-mouse game (Supercell <3). Always ask the companies for permission to publish your work, and discuss with them how you f**ked up their stuff
  • open source your tools, it's how most of us got here
  • Dwarf ships with all the features needed for those steps, and it now supports all the OSs and archs, has an awesome UI, and keeps gaining features thanks to the power of the community!

Another addition comes from @rEDSAMK, who pointed out that these are things you won't learn at school or through public trainings. There are almost no books that fit what you want to do. Reverse engineering is about mentality and imagination, and a learn-by-doing approach is one of the best paths to follow.

If this gives you even more motivation: I studied for 8 years to become a cook, got stuck 3 times in the 4th year, and never graduated. I'm not a wizard, but I use Google for what it was built for. See you with part 2 in the coming years (time permitting... I've got a full-time job and some lives to take care of now) with a UEFI rootkit debugged over JTAG.

To conclude, I want to leave my opinion for all the skilled people out there. I'm still looking for someone to take my hand like a little child and guide me through fuzzing and exploitation (there is a big gap between knowing how to do it and actually doing it). I have been in touch with many people who pointed me to documents, papers, and videos, but nowadays it's not like in the '90s: you don't pwn websites with an index.php?page= remote file inclusion anymore. There are mitigations and sick tricks that prevent us from doing what we do, cracking included. You need to change your training approach, because every time I try to study something in depth I end up with hundreds of questions or points where I totally miss the logic behind them, and this multiplies the time needed by 10. And I don't even want to talk about the trainings offered by professionals. Not everyone can afford $5000 or more plus travel costs (families, country taxes, whatever prevents people from taking a plane), and we've got "the cloud". I'm OK with selling knowledge, but also consider organizing some goddamn free trainings. You won't learn these things anymore through papers or by watching time-limited keynotes by the awesome iOS exploit developers. You need to take people, place them in front of a buffer overflow, and watch them do the job while answering questions and filling the gaps, taking the time it needs, because nowadays it's fucking hard to understand every single mitigation in a real, hands-on context (imagine someone wishing to achieve code injection into an Arxan-guarded memory segment without spending 1+ year).

Good luck <3
