Tuesday, February 25, 2014

rxvt-unicode + tmux do italic instead of reverse

When using rxvt-unicode with tmux, I found that search matches were shown in italic instead of reverse video.

Italic

Reverse
I followed the setting in the article below:

http://sourceforge.net/mailarchive/forum.php?thread_name=20110812111030.GH13508%40plenz.com&forum_name=tmux-users

Adding the following line to ~/.tmux.conf solved my problem:

set -g terminal-overrides 'rxvt-unicode*:sitm@'

Friday, February 21, 2014

rxvt-unicode - not to select trailing blanks in vim

Every time I select text with the mouse in vim (running in rxvt-unicode) and paste it somewhere else, the selection includes all the trailing blanks on every line. It's a very annoying problem.

Finally I found the solution in

http://www.reddit.com/r/emacs/comments/1ox5pf/why_fill_the_empty_space_with_spaces/

I followed the article and added the following options to my ~/.Xdefaults:

URxvt.perl-ext-common: default,selection-autotransform
URxvt.selection-autotransform.0: s/ +$//gm

This removes the trailing blanks with a Perl substitution, and everything works as I want now. Thanks for this article!
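To make the effect of the s/ +$//gm substitution concrete, here is a minimal C sketch of the same transformation. The function name strip_trailing_blanks is hypothetical and has nothing to do with rxvt-unicode itself; it only mimics what the regex does to the selected text.

```c
/*
 * Remove every run of spaces that sits directly before a newline or
 * before the end of the string, editing the buffer in place.
 * Hypothetical helper that mirrors the Perl substitution s/ +$//gm.
 */
void strip_trailing_blanks(char *s)
{
    char *out = s;   /* next position to write to */
    char *keep = s;  /* position just after the last non-space on this line */

    for (const char *in = s; *in != '\0'; in++) {
        if (*in == '\n') {
            out = keep;      /* discard the spaces copied since 'keep' */
            *out++ = '\n';
            keep = out;      /* a new line starts here */
        } else {
            *out++ = *in;
            if (*in != ' ')
                keep = out;
        }
    }
    *keep = '\0';            /* also trims blanks at the very end */
}
```

Spaces inside a line and leading spaces are kept; only the blanks at the end of each line are dropped.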

Thursday, February 20, 2014

debug with Linux slub allocator


The slub allocator in Linux has useful debug features, such as poisoning, redzone checking, and allocate/free traces with timestamps. They are very useful during the product development stage. Let's create a kernel module and test the debug features.

Make sure slub allocator is built in your kernel.

CONFIG_SLUB_DEBUG=y
CONFIG_SLUB=y

The slub allocator keeps additional metadata to store allocate/free traces and timestamps. Every time the slub allocator allocates or frees an object, it does a poison check (on the data area) and a redzone check (on the boundary).

The module below shows how this works. It allocates 32 bytes from the kernel and then overwrites the redzone by calling memset on 36 bytes.

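#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/slab.h>
#include <linux/string.h>

void try_to_corrupt_redzone(void)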
{
        void *p = kmalloc(32, GFP_KERNEL);
        if (p) {
                pr_alert("p: 0x%p\n", p);
                memset(p, 0x12, 36);    /* write too much */
                print_hex_dump(KERN_ALERT, "mem: ", DUMP_PREFIX_ADDRESS,
                                16, 1, p, 512, 1);
                kfree(p);       /* slub.c should catch this error */
        }
}

static int mymodule_init(void)
{
        pr_alert("%s init\n", __FUNCTION__);
        try_to_corrupt_redzone();
        return 0;
}

static void mymodule_exit(void)
{
        pr_alert("%s exit\n", __FUNCTION__);
}

module_init(mymodule_init);
module_exit(mymodule_exit);

After the object is freed, the kernel checks it, finds that the redzone has been overwritten, and says:

[ 2050.630002] mymodule_init init
[ 2050.630565] p: 0xddc86680
[ 2050.630653] mem: ddc86680: 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12  ................
[ 2050.630779] mem: ddc86690: 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12  ................
[ 2050.630897] mem: ddc866a0: 12 12 12 12 60 6b c8 dd 16 80 99 e0 fa 8e 2a c1  ....`k........*.
[ 2050.631014] mem: ddc866b0: 16 80 99 e0 ce 92 2a c1 16 80 99 e0 f2 c1 1b c1  ......*.........
[ 2050.631130] mem: ddc866c0: 16 80 99 e0 4c 8b 0a c1 4c 8b 0a c1 61 80 99 e0  ....L...L...a...
[ 2050.631248] mem: ddc866d0: 16 80 99 e0 61 80 99 e0 16 80 99 e0 61 80 99 e0  ....a.......a...
[ 2050.631365] mem: ddc866e0: 75 80 99 e0 48 01 00 c1 2b 36 05 c1 00 00 00 00  u...H...+6......
[ 2050.631483] mem: ddc866f0: 4a 0c 00 00 99 ad 06 00 6d 35 05 c1 9e 8b 2a c1  J.......m5....*.
[ 2050.631599] mem: ddc86700: 6d 35 05 c1 48 8c 2a c1 6d 35 05 c1 ee 89 0a c1  m5..H.*.m5......
[ 2050.631716] mem: ddc86710: ee 89 0a c1 e4 0a 14 c1 e4 0a 14 c1 ee 89 0a c1  ................
[ 2050.631832] mem: ddc86720: ee 89 0a c1 6d 35 05 c1 6d 35 05 c1 6d 35 05 c1  ....m5..m5..m5..
[ 2050.631948] mem: ddc86730: a7 39 05 c1 ef b8 2a c1 00 00 00 00 00 00 00 00  .9....*.........
[ 2050.633948] mem: ddc86740: 4a 0c 00 00 97 ad 06 00 5a 5a 5a 5a 5a 5a 5a 5a  J.......ZZZZZZZZ
[ 2050.634095] mem: ddc86750: 14 dc 46 dd 14 dc 46 dd 00 00 00 00 6b 6b 6b 6b  ..F...F.....kkkk
[ 2050.634236] mem: ddc86760: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b a5  kkkkkkkkkkkkkkk.
[ 2050.634378] mem: ddc86770: cc cc cc cc c0 69 c8 dd a0 83 20 c1 fa 8e 2a c1  .....i.... ...*.
[ 2050.634629] =============================================================================
[ 2050.634750] BUG kmalloc-32 (Tainted: P    B      O): Redzone overwritten
[ 2050.634828] -----------------------------------------------------------------------------
[ 2050.634828] 
[ 2050.634967] INFO: 0xddc866a0-0xddc866a3. First byte 0x12 instead of 0xcc
[ 2050.635123] INFO: Allocated in try_to_corrupt_redzone+0x16/0x61 [mymodule] age=1 cpu=0 pid=3146
[ 2050.635255]  alloc_debug_processing+0x63/0xd1
[ 2050.635337]  try_to_corrupt_redzone+0x16/0x61 [mymodule]
[ 2050.635423]  __slab_alloc.constprop.73+0x366/0x384
[ 2050.635506]  try_to_corrupt_redzone+0x16/0x61 [mymodule]
[ 2050.635594]  vt_console_print+0x21e/0x226
[ 2050.635672]  try_to_corrupt_redzone+0x16/0x61 [mymodule]
[ 2050.635758]  kmem_cache_alloc_trace+0x43/0xd7
[ 2050.635832]  kmem_cache_alloc_trace+0x43/0xd7
[ 2050.635909]  mymodule_init+0x0/0x19 [mymodule]
[ 2050.635992]  try_to_corrupt_redzone+0x16/0x61 [mymodule]
[ 2050.636003]  mymodule_init+0x0/0x19 [mymodule]
[ 2050.636092]  try_to_corrupt_redzone+0x16/0x61 [mymodule]
[ 2050.636179]  mymodule_init+0x0/0x19 [mymodule]
[ 2050.636261]  mymodule_init+0x14/0x19 [mymodule]
[ 2050.636343]  do_one_initcall+0x6c/0xf4
[ 2050.636428]  load_module+0x1690/0x199a
[ 2050.636508] INFO: Freed in load_module+0x15d2/0x199a age=3 cpu=0 pid=3146
[ 2050.636598]  free_debug_processing+0xd6/0x142
[ 2050.636676]  load_module+0x15d2/0x199a
[ 2050.636749]  __slab_free+0x3e/0x28d
[ 2050.636819]  load_module+0x15d2/0x199a
[ 2050.636888]  kfree+0xe4/0x102
[ 2050.636953]  kfree+0xe4/0x102
[ 2050.637020]  kobject_uevent_env+0x361/0x39a
[ 2050.637091]  kobject_uevent_env+0x361/0x39a
[ 2050.637163]  kfree+0xe4/0x102
[ 2050.637227]  kfree+0xe4/0x102
[ 2050.637294]  load_module+0x15d2/0x199a
[ 2050.637366]  load_module+0x15d2/0x199a
[ 2050.637438]  load_module+0x15d2/0x199a
[ 2050.637509]  SyS_init_module+0x72/0x8a
[ 2050.637581]  syscall_call+0x7/0xb
[ 2050.637649] INFO: Slab 0xdffa90c0 objects=19 used=8 fp=0xddc86000 flags=0x40000080
[ 2050.637749] INFO: Object 0xddc86680 @offset=1664 fp=0xddc86b60
[ 2050.637749] 
[ 2050.637875] Bytes b4 ddc86670: 14 01 00 00 95 ad 06 00 5a 5a 5a 5a 5a 5a 5a 5a  ........ZZZZZZZZ
[ 2050.637875] Object ddc86680: 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12  ................
[ 2050.637875] Object ddc86690: 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12  ................
[ 2050.637875] Redzone ddc866a0: 12 12 12 12                                      ....
[ 2050.637875] Padding ddc86748: 5a 5a 5a 5a 5a 5a 5a 5a                          ZZZZZZZZ
[ 2050.637875] CPU: 0 PID: 3146 Comm: insmod Tainted: P    B      O 3.10.17 #1
[ 2050.637875] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[ 2050.637875]  00000000 c10a7b59 c10941c5 dffa90c0 ddc86680 de8012cc de801280 ddc86680
[ 2050.637875]  dffa90c0 c10a7bd3 c13689a5 ddc866a0 000000cc 00000004 de801280 ddc86680
[ 2050.637875]  dffa90c0 de800e00 c12a8b2f 000000cc ddc86680 de801280 dffa90c0 dd407e50
[ 2050.637875] Call Trace:
[ 2050.637875]  [<c10a7b59>] ? check_bytes_and_report+0x6d/0xb0
[ 2050.637875]  [<c10941c5>] ? page_address+0x1a/0x79
[ 2050.637875]  [<c10a7bd3>] ? check_object+0x37/0x149
[ 2050.637875]  [<c12a8b2f>] ? free_debug_processing+0x67/0x142
[ 2050.637875]  [<c12a8c48>] ? __slab_free+0x3e/0x28d
[ 2050.637875]  [<e0998075>] ? mymodule_init+0x14/0x19 [mymodule]
[ 2050.637875]  [<c102063d>] ? wake_up_klogd+0x1d/0x1e
[ 2050.637875]  [<c10a89ee>] ? kfree+0xe4/0x102
[ 2050.637875]  [<c10a89ee>] ? kfree+0xe4/0x102
[ 2050.637875]  [<e0998075>] ? mymodule_init+0x14/0x19 [mymodule]
[ 2050.637875]  [<e0998075>] ? mymodule_init+0x14/0x19 [mymodule]
[ 2050.637875]  [<e0998061>] ? try_to_corrupt_redzone+0x61/0x61 [mymodule]
[ 2050.637875]  [<e0998075>] ? mymodule_init+0x14/0x19 [mymodule]
[ 2050.637875]  [<c1000148>] ? do_one_initcall+0x6c/0xf4
[ 2050.637875]  [<c105362b>] ? load_module+0x1690/0x199a
[ 2050.637875]  [<c10539a7>] ? SyS_init_module+0x72/0x8a
[ 2050.637875]  [<c12ab8ef>] ? syscall_call+0x7/0xb
[ 2050.637875] FIX kmalloc-32: Restoring 0xddc866a0-0xddc866a3=0xcc
[ 2050.637875] 
[ 2051.232817] mymodule_exit exit

First, the slub allocator prints the error type, "Redzone overwritten":
[ 2050.634629] =============================================================================
[ 2050.634750] BUG kmalloc-32 (Tainted: P    B      O): Redzone overwritten
[ 2050.634828] -----------------------------------------------------------------------------
[ 2050.634828] 
[ 2050.634967] INFO: 0xddc866a0-0xddc866a3. First byte 0x12 instead of 0xcc

To understand what the redzone is, take a look at the memory content around the object:

[ 2050.637875] Bytes b4 ddc86670: 14 01 00 00 95 ad 06 00 5a 5a 5a 5a 5a 5a 5a 5a  ........ZZZZZZZZ
[ 2050.637875] Object ddc86680: 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12  ................
[ 2050.637875] Object ddc86690: 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12  ................
[ 2050.637875] Redzone ddc866a0: 12 12 12 12                                      ....
[ 2050.637875] Padding ddc86748: 5a 5a 5a 5a 5a 5a 5a 5a                          ZZZZZZZZ

We fill the 32-byte object (0xddc86680 - 0xddc8669f) with 0x12 and write 4 more bytes of 0x12 into the redzone (which normally holds 0xbb or 0xcc). When the object is returned to the kernel, the kernel finds that the redzone no longer holds the expected 0xcc and reports this as a BUG.
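The redzone check can be imitated in user space. The sketch below is only an illustration under simplifying assumptions, not the kernel's implementation: struct slot, slot_init and slot_check_redzone are hypothetical names, and the layout is reduced to a 32-byte object followed directly by a 4-byte redzone filled with 0xcc, as in the log above.

```c
#include <stddef.h>
#include <string.h>

#define RED_ACTIVE 0xcc  /* redzone fill of an allocated object, as seen in the log */
#define OBJ_SIZE   32
#define RED_SIZE   4

/* Hypothetical model of one slab slot: the payload followed by its redzone. */
struct slot {
    unsigned char object[OBJ_SIZE];
    unsigned char redzone[RED_SIZE];
};

/* Prepare a slot the way an allocation would: clear data, arm the redzone. */
void slot_init(struct slot *s)
{
    memset(s->object, 0, OBJ_SIZE);
    memset(s->redzone, RED_ACTIVE, RED_SIZE);
}

/* Return the number of redzone bytes that no longer hold RED_ACTIVE. */
size_t slot_check_redzone(const struct slot *s)
{
    size_t bad = 0;

    for (size_t i = 0; i < RED_SIZE; i++)
        if (s->redzone[i] != RED_ACTIVE)
            bad++;
    return bad;
}
```

Calling memset(&slot, 0x12, 36) on an initialized slot reproduces the overflow of the module: slot_check_redzone then reports 4 damaged bytes, matching the range 0xddc866a0-0xddc866a3 in the report.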

The slub allocator also reports the latest allocate/free history of this object. You can see that the object was just allocated by our kernel module function 'try_to_corrupt_redzone'.

Sometimes the traces of the object are more useful than the function backtrace. Consider a use-after-free case: function A frees an object and then writes to it, and in the meantime the object has been allocated by another function B. Function B now holds a corrupted object. If we have the free trace of this object, we can trace back to its previous owner, function A.

[ 2050.635123] INFO: Allocated in try_to_corrupt_redzone+0x16/0x61 [mymodule] age=1 cpu=0 pid=3146
[ 2050.635255]  alloc_debug_processing+0x63/0xd1
[ 2050.635337]  try_to_corrupt_redzone+0x16/0x61 [mymodule]
[ 2050.635423]  __slab_alloc.constprop.73+0x366/0x384
[ 2050.635506]  try_to_corrupt_redzone+0x16/0x61 [mymodule]
[ 2050.635594]  vt_console_print+0x21e/0x226
[ 2050.635672]  try_to_corrupt_redzone+0x16/0x61 [mymodule]
[ 2050.635758]  kmem_cache_alloc_trace+0x43/0xd7
[ 2050.635832]  kmem_cache_alloc_trace+0x43/0xd7
[ 2050.635909]  mymodule_init+0x0/0x19 [mymodule]
[ 2050.635992]  try_to_corrupt_redzone+0x16/0x61 [mymodule]
[ 2050.636003]  mymodule_init+0x0/0x19 [mymodule]
[ 2050.636092]  try_to_corrupt_redzone+0x16/0x61 [mymodule]
[ 2050.636179]  mymodule_init+0x0/0x19 [mymodule]
[ 2050.636261]  mymodule_init+0x14/0x19 [mymodule]
[ 2050.636343]  do_one_initcall+0x6c/0xf4
[ 2050.636428]  load_module+0x1690/0x199a
[ 2050.636508] INFO: Freed in load_module+0x15d2/0x199a age=3 cpu=0 pid=3146
[ 2050.636598]  free_debug_processing+0xd6/0x142
[ 2050.636676]  load_module+0x15d2/0x199a
[ 2050.636749]  __slab_free+0x3e/0x28d
[ 2050.636819]  load_module+0x15d2/0x199a
[ 2050.636888]  kfree+0xe4/0x102
[ 2050.636953]  kfree+0xe4/0x102
[ 2050.637020]  kobject_uevent_env+0x361/0x39a
[ 2050.637091]  kobject_uevent_env+0x361/0x39a
[ 2050.637163]  kfree+0xe4/0x102
[ 2050.637227]  kfree+0xe4/0x102
[ 2050.637294]  load_module+0x15d2/0x199a
[ 2050.637366]  load_module+0x15d2/0x199a
[ 2050.637438]  load_module+0x15d2/0x199a
[ 2050.637509]  SyS_init_module+0x72/0x8a
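The use-after-free case described above is what the poison check is meant to catch. The following user-space sketch shows the idea under simplifying assumptions: the helper names are hypothetical, the object size must be at least 1, and the fill values 0x6b and 0xa5 are the ones visible in the hex dump earlier (the "kkkk" run ending in a5).

```c
#include <stddef.h>
#include <string.h>

#define POISON_FREE 0x6b  /* fill byte of a freed object ('k' in the dump) */
#define POISON_END  0xa5  /* last byte of a freed object */

/* Fill a freed object with the poison pattern. Assumes size >= 1. */
void poison_object(unsigned char *obj, size_t size)
{
    memset(obj, POISON_FREE, size - 1);
    obj[size - 1] = POISON_END;
}

/*
 * Return the offset of the first byte that lost its poison value,
 * or -1 if the pattern is intact. A damaged pattern means somebody
 * wrote to the object after it was freed.
 */
long find_poison_damage(const unsigned char *obj, size_t size)
{
    for (size_t i = 0; i + 1 < size; i++)
        if (obj[i] != POISON_FREE)
            return (long)i;
    if (obj[size - 1] != POISON_END)
        return (long)(size - 1);
    return -1;
}
```

In this model the allocator would call poison_object when an object is freed and find_poison_damage when the same object is handed out again; a result other than -1 points at the stray write.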

Saturday, February 15, 2014

ARM64 Linux kernel virtual address space


Now let's talk about the Linux kernel virtual address space on 64-bit ARM CPUs. You can find information about ARMv8 on the official ARM website: http://www.arm.com/products/processors/armv8-architecture.php

One big problem on 32-bit CPUs is the 4GB limit on the virtual address space. The problem remains even with PAE support, since PAE extends the physical address space, not the virtual address space. Things changed with the arrival of 64-bit CPUs such as AMD64 and ARMv8: they can support up to 2^64 addresses, which is, uhh, a very big number.
Actually 2^64 is too large, so the Linux kernel implementation uses only part of the 64 bits (42 bits with CONFIG_ARM64_64K_PAGES, 39 bits with 4K pages). This article assumes 4K pages (the VA_BITS = 39 case).

#ifdef CONFIG_ARM64_64K_PAGES
#define VA_BITS                 (42)
#else
#define VA_BITS                 (39)
#endif

One good thing about ARM64 is that, since we have enough virtual address bits, user space and kernel space can each have their own 2^39 = 512GB of virtual addresses!
All user virtual addresses have 25 leading zeros and all kernel addresses have 25 leading ones. The addresses between user space and kernel space are not used, so any access to them is trapped as an illegal access.
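The rule about leading zeros and ones can be written down as a small classification routine. This is a minimal sketch for the VA_BITS = 39 case; is_user_va and is_kernel_va are hypothetical helper names, not kernel functions.

```c
#include <stdbool.h>
#include <stdint.h>

#define VA_BITS 39  /* 4K pages, as assumed in this article */

/* True if the top (64 - VA_BITS) = 25 bits are all zero: a user address. */
bool is_user_va(uint64_t addr)
{
    return (addr >> VA_BITS) == 0;
}

/* True if the top 25 bits are all one: a kernel address. */
bool is_kernel_va(uint64_t addr)
{
    return (addr >> VA_BITS) == (UINT64_MAX >> VA_BITS);
}
```

With these definitions, user space covers 0x0000000000000000 to 0x0000007fffffffff and kernel space covers 0xffffff8000000000 to 0xffffffffffffffff. Any address for which both functions return false falls into the unused gap.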

ARM64 Linux virtual address space layout

kernel space:

Although we have no ARM64 environment now, we can analyze the kernel virtual address space by reading the source code and observing a running AMD64 Linux box.

In arch/arm64/include/asm/memory.h, we can see some differences. First, there is no separate lowmem zone: the virtual address space is so big that we can treat all memory as lowmem and do not have to worry about running out of virtual addresses. (Yes, there is still a limit on the kernel virtual address space.) Second, the order of the different kernel virtual address regions changes:


#ifdef CONFIG_ARM64_64K_PAGES
#define VA_BITS                 (42)
#else                               
#define VA_BITS                 (39)
#endif                              
#define PAGE_OFFSET             (UL(0xffffffffffffffff) << (VA_BITS - 1))
#define MODULES_END             (PAGE_OFFSET)
#define MODULES_VADDR           (MODULES_END - SZ_64M)
#define EARLYCON_IOBASE         (MODULES_VADDR - SZ_4M)
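To see which addresses these macros produce for VA_BITS = 39, the arithmetic can be replayed in user space. This is a sketch only: the function names are made up for illustration, and SZ_64M and SZ_4M are redefined locally with their usual values of 64MB and 4MB.

```c
#include <stdint.h>

#define VA_BITS 39
#define SZ_64M  UINT64_C(0x04000000)
#define SZ_4M   UINT64_C(0x00400000)

/* Same expression as PAGE_OFFSET: all ones shifted left by VA_BITS - 1. */
uint64_t page_offset(void)
{
    return UINT64_MAX << (VA_BITS - 1);
}

/* The module area ends at PAGE_OFFSET and is 64MB long. */
uint64_t modules_vaddr(void)
{
    return page_offset() - SZ_64M;
}

/* The early console I/O window sits 4MB below the module area. */
uint64_t earlycon_iobase(void)
{
    return modules_vaddr() - SZ_4M;
}
```

For the 4K page case this gives PAGE_OFFSET = 0xffffffc000000000, MODULES_VADDR = 0xffffffbffc000000 and EARLYCON_IOBASE = 0xffffffbffbc00000. Note that PAGE_OFFSET has 26 leading ones, so the linear mapping of memory takes the upper half of the 512GB kernel space.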


         pr_notice("Virtual kernel memory layout:\n"                             
                   "    vmalloc : 0x%16lx - 0x%16lx   (%6ld MB)\n"
 #ifdef CONFIG_SPARSEMEM_VMEMMAP
                   "    vmemmap : 0x%16lx - 0x%16lx   (%6ld MB)\n"
 #endif
                   "    modules : 0x%16lx - 0x%16lx   (%6ld MB)\n"
                   "    memory  : 0x%16lx - 0x%16lx   (%6ld MB)\n"
                   "      .init : 0x%p" " - 0x%p" "   (%6ld kB)\n"
                   "      .text : 0x%p" " - 0x%p" "   (%6ld kB)\n"
                   "      .data : 0x%p" " - 0x%p" "   (%6ld kB)\n",
                   MLM(VMALLOC_START, VMALLOC_END),
 #ifdef CONFIG_SPARSEMEM_VMEMMAP
                   MLM((unsigned long)virt_to_page(PAGE_OFFSET),
                       (unsigned long)virt_to_page(high_memory)),
 #endif
                   MLM(MODULES_VADDR, MODULES_END),
                   MLM(PAGE_OFFSET, (unsigned long)high_memory),

                   MLK_ROUNDUP(__init_begin, __init_end),
                   MLK_ROUNDUP(_text, _etext),
                   MLK_ROUNDUP(_sdata, _edata));

see also:
arch/arm64/mm/init.c
arch/arm64/include/asm/pgtable.h

You can see that there is no pkmap or fixmap. The kernel assumes that all memory has a valid kernel virtual address, so there is no need to create pkmap/fixmap mappings.

ARM64 kernel virtual address space layout


User space:
The layout of the user virtual address space looks like it does on ARM32. Since the available user virtual address space grows to 512GB, we can build larger applications on 64-bit CPUs.

One interesting topic is that ARM claims ARMv8 is compatible with 32-bit ARM applications: all 32-bit applications can run on ARMv8 without modification. What does the virtual memory layout of a 32-bit application look like on a 64-bit kernel?
Actually, every process on a 64-bit kernel is a 64-bit process. To run a 32-bit ARM application, the Linux kernel still creates the process from a 64-bit init process, but limits its user address space to 4GB. In this way, we can have both 32-bit and 64-bit applications on a 64-bit Linux kernel.


 #ifdef CONFIG_COMPAT
 #define TASK_SIZE_32            UL(0x100000000)
 #define TASK_SIZE               (test_thread_flag(TIF_32BIT) ? \
                                 TASK_SIZE_32 : TASK_SIZE_64)
 #else
 #define TASK_SIZE               TASK_SIZE_64
 #endif /* CONFIG_COMPAT */
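The selection made by the TASK_SIZE macro can be sketched as a plain function. In this illustration the boolean parameter is_32bit stands in for test_thread_flag(TIF_32BIT), task_size is a hypothetical name, and TASK_SIZE_64 is assumed to be 1 << VA_BITS with VA_BITS = 39.

```c
#include <stdbool.h>
#include <stdint.h>

#define VA_BITS      39
#define TASK_SIZE_32 UINT64_C(0x100000000)     /* 4GB */
#define TASK_SIZE_64 (UINT64_C(1) << VA_BITS)  /* 512GB with 4K pages */

/*
 * Pick the size of the user address space for a task.
 * is_32bit stands in for test_thread_flag(TIF_32BIT).
 */
uint64_t task_size(bool is_32bit)
{
    return is_32bit ? TASK_SIZE_32 : TASK_SIZE_64;
}
```

A 32-bit task therefore sees a 4GB user address space, while a 64-bit task on the same kernel sees 512GB.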

64-bit ARM applications on 64-bit Linux kernel

ARM64 64-bit user space program virtual address space layout


32-bit ARM applications on 64-bit Linux kernel

ARM64 32-bit user space program virtual address space layout

Note that a 32-bit application still has a 512GB kernel virtual address space and does not share its own 4GB of virtual address space with the kernel, so the user application gets a complete 4GB of virtual addresses. On the other hand, 32-bit applications on a 32-bit kernel have only 3GB of virtual address space.


                                          ARM32 Linux    ARM64 Linux
32-bit user virtual address space size    3GB            4GB
64-bit user virtual address space size    N/A            512GB
kernel virtual address space              1GB            512GB

ARM32 Linux kernel virtual address space

The 32-bit ARM CPU can address up to 2^32 = 4GB of address space*. That is not big enough these days, since the amount of DRAM available on computing devices is growing fast and the memory usage of applications is growing as well.

In the Linux kernel implementation, user space and the kernel must coexist in the same 4GB virtual address space, so each of them gets less than 4GB of virtual address space.
The Linux kernel provides 3 different splits of the virtual address space: VMSPLIT_3G, VMSPLIT_2G, and VMSPLIT_1G.


Linux virtual address space options


The default configuration is VMSPLIT_3G. As you can see, kernel space runs from 0xC0000000 to 0xFFFFFFFF and user space runs from 0x00000000 to 0xBFFFFFFF.
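The sizes behind the three splits follow directly from where the kernel starts. The sketch below assumes the usual boundaries of 0xC0000000, 0x80000000 and 0x40000000 for VMSPLIT_3G, VMSPLIT_2G and VMSPLIT_1G; the enum and the function names are hypothetical and exist only for this example.

```c
#include <stdint.h>

/* The three split options, named after the kernel config symbols. */
enum vmsplit { VMSPLIT_3G, VMSPLIT_2G, VMSPLIT_1G };

/* First kernel virtual address for each split (assumed boundaries). */
uint32_t split_boundary(enum vmsplit s)
{
    switch (s) {
    case VMSPLIT_2G:
        return 0x80000000u;
    case VMSPLIT_1G:
        return 0x40000000u;
    default:
        return 0xC0000000u;  /* VMSPLIT_3G, the default */
    }
}

/* Bytes of the 4GB space that lie at or above the boundary. */
uint64_t kernel_space_bytes(enum vmsplit s)
{
    return (UINT64_C(1) << 32) - split_boundary(s);
}

/* Bytes of the 4GB space that lie below the boundary. */
uint64_t user_space_bytes(enum vmsplit s)
{
    return split_boundary(s);
}
```

So VMSPLIT_3G leaves 3GB below the boundary and 1GB for the kernel, VMSPLIT_2G splits the space 2GB/2GB, and VMSPLIT_1G leaves 1GB below the boundary and 3GB for the kernel.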

Let's take a closer look at the VMSPLIT_3G mapping:

kernel space

We can observe the kernel virtual address layout by checking the boot log (dmesg) or by looking at arch/arm/mm/init.c.

lowmem: The memory that has a linear mapping between virtual and physical addresses. Both the virtual and the physical addresses are contiguous, and this property makes virtual-to-physical address translation very easy: given a virtual address in lowmem, we can find its physical address by applying a fixed offset (see __pa() and __va()).

vmalloc: The vmalloc memory is only virtually contiguous.

fixmap/pkmap: Create fast mappings of a single page for the kernel. Mostly used in file systems.

modules: The virtual address range for loading and executing modules. Kernel modules are loaded into this part of virtual memory.
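The lowmem translation mentioned above is simple enough to show in a few lines. This is a sketch of the idea behind __pa() and __va(), not the kernel macros themselves: the function names are hypothetical, PAGE_OFFSET is assumed to be 0xC0000000 (VMSPLIT_3G), and the physical start of RAM is passed in as phys_offset because it differs from board to board.

```c
#include <stdint.h>

#define PAGE_OFFSET 0xC0000000u  /* start of lowmem with VMSPLIT_3G */

/* Translate a lowmem virtual address to its physical address. */
uint32_t lowmem_virt_to_phys(uint32_t virt, uint32_t phys_offset)
{
    return virt - PAGE_OFFSET + phys_offset;
}

/* Translate a physical address of lowmem back to its virtual address. */
uint32_t lowmem_phys_to_virt(uint32_t phys, uint32_t phys_offset)
{
    return phys - phys_offset + PAGE_OFFSET;
}
```

For example, on a hypothetical board whose RAM starts at physical address 0x80000000, the virtual address 0xC0100000 translates to the physical address 0x80100000. No page table walk is needed, which is why lowmem translation is so cheap.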

user space

The code for determining user space virtual addresses is in arch/arm/mm/mmap.c.
User space has two kinds of mmap layout: legacy and non-legacy. The legacy layout sets the mmap base to TASK_UNMAPPED_BASE and mmap grows bottom-up; the non-legacy layout sets the mmap base to TASK_SIZE - 128MB (with some random shift for security reasons) and mmap grows top-down.


void arch_pick_mmap_layout(struct mm_struct *mm)
{
        unsigned long random_factor = 0UL;

        /* 8 bits of randomness in 20 address space bits */
        if ((current->flags & PF_RANDOMIZE) &&
            !(current->personality & ADDR_NO_RANDOMIZE))
                random_factor = (get_random_int() % (1 << 8)) << PAGE_SHIFT;
        if (mmap_is_legacy()) {
                mm->mmap_base = TASK_UNMAPPED_BASE + random_factor;
                mm->get_unmapped_area = arch_get_unmapped_area;
        } else {
                mm->mmap_base = mmap_base(random_factor);
                mm->get_unmapped_area = arch_get_unmapped_area_topdown;
        }
}
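The random shift in the function above is worth a closer look. The sketch below isolates the expression that computes random_factor, assuming PAGE_SHIFT is 12 (4K pages); random_factor_from is a hypothetical name, and its parameter rnd stands in for the value returned by get_random_int().

```c
#include <stdint.h>

#define PAGE_SHIFT 12  /* 4K pages */

/*
 * Keep 8 bits of the random value and scale them to whole pages,
 * as in: (get_random_int() % (1 << 8)) << PAGE_SHIFT.
 * rnd stands in for the result of get_random_int().
 */
uint32_t random_factor_from(uint32_t rnd)
{
    return (rnd % (1u << 8)) << PAGE_SHIFT;
}
```

The result is always a multiple of the page size and is at most 255 pages, that is 0xff000 bytes, which is just under 1MB. This is what the comment "8 bits of randomness in 20 address space bits" refers to.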

The user space virtual address layout looks like:

32-bit user virtual address space layout

*ARM has an LPAE (Large Physical Address Extension) mode that can address up to 1TB of physical memory.