libvmi
https://github.com/libvmi/libvmi
This project describes itself as "LibVMI: Simplified Virtual Machine Introspection", which sounds really close to the goal here.
In particular, the project https://github.com/Wenzel/pyvmidbg uses libvmi and features a demo video of debugging a Windows userland application from inside it, without memory conflicts.
As of May 2019, there are two limitations, both of which could be overcome with some work: https://github.com/Wenzel/pyvmidbg/issues/24
- Linux memory parsing is not yet complete
- requires Xen
The developer of that project also answered further at: https://stackoverflow.com/a/56369454/895245
Implementing it with those libraries would be in my opinion the best way to achieve this goal today.
Linaro lkd-python
First, this Linaro page claims to have a working setup: https://wiki.linaro.org/LandingTeams/ST/GDB that allows you to do the usual thread operations such as thread, bt, etc., but it relies on a GDB fork. I will test it out later. In 2016, https://youtu.be/pqn5hIrz3A8 said that the implementation was in C rather than in Python scripts, which is unfortunate, since Python scripts would be better and would avoid the need for a fork. The sketch for lkd-python can be found at: https://git.linaro.org/people/lee.jones/kieran.bingham/binutils-gdb.git/log/?h=lkd-python
Linux kernel in-tree GDB scripts + my brain
I then tried to see what I could do with the kernel in-tree Python scripts at v4.17 + some manual intervention as a prototype, but didn't quite get there yet.
I have tested using this highly automated QEMU + Buildroot setup.
First follow the procedure I described at: How to debug the Linux kernel with GDB and QEMU? to get GDB working.
Then, as described at: How to debug Linux kernel modules with QEMU? run GDB with:
gdb -ex 'add-auto-load-safe-path /full/path/to/linux/kernel'
This loads the in-tree GDB Python scripts from scripts/gdb.
One of those scripts provides:
lx-ps
which lists all threads with format:
0xffff88000ed08000 1 init
0xffff88000ed08ac0 2 kthreadd
The first field is the address of the task_struct struct, so we can see the entire struct with:
p *(struct task_struct *)0xffff88000ed08000
which should in theory allow us to get any information we want about the process.
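To avoid copying addresses by hand, the lx-ps output can be turned into ready-to-paste print commands. The following is only a minimal sketch that runs outside GDB: it assumes the three-column format shown above (address, PID, command name), the helper names parse_lx_ps and print_task_command are hypothetical, and the print expression is written in the standard pointer-cast form.

```python
import re

# Matches one lx-ps line: task_struct address, PID, command name.
# Assumption: columns are whitespace-separated, as in the sample output above.
LX_PS_LINE = re.compile(r"^\s*(0x[0-9a-fA-F]+)\s+(\d+)\s+(\S.*?)\s*$")


def parse_lx_ps(output):
    """Return a list of (address, pid, comm) tuples parsed from lx-ps output."""
    tasks = []
    for line in output.splitlines():
        match = LX_PS_LINE.match(line)
        if match:  # lines that do not start with a hex address are skipped
            address, pid, comm = match.groups()
            tasks.append((int(address, 16), int(pid), comm))
    return tasks


def print_task_command(address):
    """Build the GDB command that dumps the task_struct at the given address."""
    return "p *(struct task_struct *){:#x}".format(address)


if __name__ == "__main__":
    sample = (
        "0xffff88000ed08000 1 init\n"
        "0xffff88000ed08ac0 2 kthreadd\n"
    )
    for address, pid, comm in parse_lx_ps(sample):
        print(pid, comm, print_task_command(address))
```

Each printed command can then be pasted into the GDB session, or the same logic could be moved into a GDB Python script if you prefer to stay inside the debugger.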
Now I wanted to find the PC. For ARM, I've seen: Find program counter of process in kernel and I tried:
task_pt_regs((struct thread_info *)((struct task_struct)*0xffffffc00e8f8000))->uregs[ARM_pc]
but task_pt_regs is a #define, and GDB cannot see defines unless the code is compiled with -ggdb3 (see: How do I print a #defined constant in GDB?), which is apparently not set for this kernel build.