Later I found out that newer kernels provide a really nice virtual file in the /proc file system for getting this information. I tried to dump it with cat /proc/self/pagemap and got terrible binary output in my console.
So working with this file is not such a pleasant experience. It's a binary file, with all that that implies. I found a couple of scripts that read this file and give you a nice text result, but unfortunately they were written in Perl and Ruby, and I needed to run this on a very minimalistic embedded system. I needed something that fits into a single binary.
Long story short, I decided to bite the bullet and write a tool in C. It might be helpful for someone else, so I'm sharing the code.
Now, how to use it. It's very simple. First, of course, you need to compile it. Then you need to find out which mappings your target process has. You can do that by reading the /proc/pid/maps file. Fortunately, that file is human-readable.
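If you would rather pick the address programmatically than by eye, each line of the maps file starts with a "start-end perms" prefix that is easy to parse. The snippet below is only a sketch of that idea, not part of the tool; the helper name parse_maps_line is made up for illustration.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: parse the "start-end perms" prefix of one line
 * from /proc/<pid>/maps. Returns 1 on success, 0 if the line does not
 * have the expected shape. perms must have room for 4 chars plus NUL. */
static int parse_maps_line(const char *line, uint64_t *start,
                           uint64_t *end, char perms[5])
{
    return sscanf(line, "%" SCNx64 "-%" SCNx64 " %4s",
                  start, end, perms) == 3;
}
```

Feeding it the first line of the example output further down would give a start of 0x400000, an end of 0x40b000 and the permissions r-xp.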
Once you know a valid virtual address, you can pass it to the tool to get the actual value from pagemap, including the physical frame number. Here is an example:
$ # let's find the virtual address of a page
$ cat /proc/self/maps
00400000-0040b000 r-xp 00000000 08:02 1177367 /bin/cat
0060a000-0060b000 r--p 0000a000 08:02 1177367 /bin/cat
0060b000-0060c000 rw-p 0000b000 08:02 1177367 /bin/cat
0223a000-0225b000 rw-p 00000000 00:00 0 [heap]
7fe7e15e1000-7fe7e1cc3000 r--p 00000000 08:02 1577390 /usr/lib/locale/locale-archive
7fe7e1cc3000-7fe7e1e80000 r-xp 00000000 08:02 527324 /lib/x86_64-linux-gnu/libc-2.17.so
7fe7e1e80000-7fe7e2080000 ---p 001bd000 08:02 527324 /lib/x86_64-linux-gnu/libc-2.17.so
7fe7e2080000-7fe7e2084000 r--p 001bd000 08:02 527324 /lib/x86_64-linux-gnu/libc-2.17.so
7fe7e2084000-7fe7e2086000 rw-p 001c1000 08:02 527324 /lib/x86_64-linux-gnu/libc-2.17.so
7fe7e2086000-7fe7e208b000 rw-p 00000000 00:00 0
7fe7e208b000-7fe7e20ae000 r-xp 00000000 08:02 527300 /lib/x86_64-linux-gnu/ld-2.17.so
7fe7e228d000-7fe7e2290000 rw-p 00000000 00:00 0
7fe7e22ab000-7fe7e22ad000 rw-p 00000000 00:00 0
7fe7e22ad000-7fe7e22ae000 r--p 00022000 08:02 527300 /lib/x86_64-linux-gnu/ld-2.17.so
7fe7e22ae000-7fe7e22b0000 rw-p 00023000 08:02 527300 /lib/x86_64-linux-gnu/ld-2.17.so
7fffce6b6000-7fffce6d7000 rw-p 00000000 00:00 0 [stack]
7fffce722000-7fffce724000 r-xp 00000000 00:00 0 [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]
$ # don't forget ASLR, normally only /bin/cat will stay at the same address
$ # so let's pick 0x00400000. Now we run our program.
$ # The first argument is the pid ("self" is a legal option too), the second is the virtual address
$ ./pagemap self 0x00400000
Reading /proc/self/pagemap at 0x2000
Result: 0a60000000008c445
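The "at 0x2000" in the output is the byte offset into the pagemap file. The file holds one 64-bit entry per virtual page, so the offset is the virtual page index times 8. Here is a minimal sketch of that arithmetic; the function and macro names are mine, and a 4096-byte page is assumed in the example (in real code you would ask sysconf(_SC_PAGESIZE)).

```c
#include <stdint.h>

/* Each pagemap entry is 64 bits wide, whatever the machine word size. */
#define PAGEMAP_ENTRY_BYTES 8

/* Hypothetical helper: byte offset of the entry that describes the
 * page containing vaddr. */
static uint64_t pagemap_offset(uint64_t vaddr, uint64_t page_size)
{
    return (vaddr / page_size) * PAGEMAP_ENTRY_BYTES;
}
```

For the address used above, 0x00400000 / 4096 = 0x400, and 0x400 * 8 = 0x2000, which matches what the tool reports.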
We got 0x0a60000000008c445 as a result. Some of the upper bits show that the page is valid, along with the size of the page. You can read more in the Linux documentation: https://www.kernel.org/doc/Documentation/vm/pagemap.txt. Basically, the physical page number is 0x8c445.
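To make the decoding concrete, here is a small sketch that picks the result apart. It follows the layout in the kernel document linked above: bits 0-54 are the page frame number, bit 62 means "swapped" and bit 63 means "present". The macro and function names are my own, and the physical address helper assumes the caller passes the right page size.

```c
#include <stdint.h>

#define PM_PRESENT  (UINT64_C(1) << 63)        /* page is in RAM      */
#define PM_SWAPPED  (UINT64_C(1) << 62)        /* page is in swap     */
#define PM_PFN_MASK ((UINT64_C(1) << 55) - 1)  /* bits 0-54 hold PFN  */

static int pm_is_present(uint64_t entry) { return (entry & PM_PRESENT) != 0; }
static int pm_is_swapped(uint64_t entry) { return (entry & PM_SWAPPED) != 0; }

/* Only meaningful when the page is present. */
static uint64_t pm_pfn(uint64_t entry) { return entry & PM_PFN_MASK; }

/* Physical address = frame number * page size + offset inside the page. */
static uint64_t pm_phys_addr(uint64_t entry, uint64_t vaddr, uint64_t page_size)
{
    return pm_pfn(entry) * page_size + vaddr % page_size;
}
```

With the value from the example, the page is present, not swapped, and the frame number is 0x8c445. With 4096-byte pages, virtual address 0x00400000 therefore sits at physical address 0x8c445000.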
Note that the meaning of bits 56-60 differs between kernel versions. In most current versions they are forced to zero, but in kernel version 3.11.0 they are part of the page-shift field, which encodes the page size.
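For kernels that still report it, the page size can be pulled out of the entry as well. This sketch assumes the older layout from the kernel document, where bits 55-60 hold the page shift and the page size is 1 << shift; on kernels that reuse or zero those bits the result is meaningless. The function names are hypothetical.

```c
#include <stdint.h>

/* Assumption: older pagemap layout, bits 55-60 = page shift. */
static unsigned pm_page_shift(uint64_t entry)
{
    return (unsigned)((entry >> 55) & 0x3f);
}

static uint64_t pm_page_size(uint64_t entry)
{
    return UINT64_C(1) << pm_page_shift(entry);
}
```

For the example value the shift comes out as 12, that is, an ordinary 4096-byte page.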
The original version of the code worked only on x86-64 machines. It used to read sizeof(unsigned int) bytes from the binary file, but it must read 64 bits no matter what word size the target machine has. There was another issue: on the OpenRISC simulator the fread() function always failed. That's why I changed the code to use getc() instead. That way it works on all the architectures I have tested so far.
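The byte-by-byte approach can be sketched roughly like this. It is an illustration of the idea and not the exact code of the tool: it always reads eight bytes with getc(), regardless of the machine word size, and copies them into a uint64_t. The copy relies on the assumption that the kernel writes entries in the host's own byte order, so no swapping is done.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read one 64-bit entry from the current file
 * position using getc(). Returns 0 on success, -1 on EOF or error. */
static int read_u64_getc(FILE *f, uint64_t *out)
{
    unsigned char buf[sizeof(uint64_t)];

    for (size_t i = 0; i < sizeof buf; i++) {
        int c = getc(f);
        if (c == EOF)
            return -1;
        buf[i] = (unsigned char)c;
    }
    /* Bytes are assumed to be in host order already, so a plain copy
     * gives the right value on both little- and big-endian machines. */
    memcpy(out, buf, sizeof buf);
    return 0;
}
```

Before calling it on a pagemap file you would seek to the offset of the entry you want.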
I plan to expand the functionality of the program, for example by adding support for range lookups in the /proc/*/pagemap files. If you are interested in functionality like that, leave a comment below.