[SOLVED - hardware problem] Bad page map in process uksmd

Hi everybody,

I have some problems with the Calculate kernel on my Dell “server”.

Sometimes I get a kernel panic, and the message is always the same:

Apr 16 22:56:00 oxygen kernel: BUG: Bad page map in process uksmd  pte:80000000da358225 pmd:d912e067
Apr 16 22:56:00 oxygen kernel: addr:00007f16b7e9f000 vm_flags:80100073 anon_vma:ffff8800d97c2b40 mapping:          (null) index:7f16b7e9f
Apr 16 22:56:00 oxygen kernel: CPU: 1 PID: 621 Comm: uksmd Not tainted 3.18.11-calculate #1
Apr 16 22:56:00 oxygen kernel: Hardware name: Dell Inc.                 OptiPlex GX620               /0FH884, BIOS A11 11/30/2006
Apr 16 22:56:00 oxygen kernel: 0000000000000000 ffff8800d9cc4c38 ffffffff8143e3f8 00007f16b7e9f000
Apr 16 22:56:00 oxygen kernel: ffffffff810f6217 80000000c9b34067 ffffffff81102476 00007f16d70bb000
Apr 16 22:56:00 oxygen kernel: ffff8800d9cc4c38 0000000000000004 80000000da358225 ffffea0003644bb0
Apr 16 22:56:00 oxygen kernel: Call Trace:
Apr 16 22:56:00 oxygen kernel: [<ffffffff8143e3f8>] ? dump_stack+0x41/0x51
Apr 16 22:56:00 oxygen kernel: [<ffffffff810f6217>] ? print_bad_pte+0x197/0x250
Apr 16 22:56:00 oxygen kernel: [<ffffffff81102476>] ? __page_check_address+0xa6/0x110
Apr 16 22:56:00 oxygen kernel: [<ffffffff810f6efe>] ? vm_normal_page+0x6e/0x80
Apr 16 22:56:00 oxygen kernel: [<ffffffff810f52b1>] ? follow_page_mask+0x191/0x320
Apr 16 22:56:00 oxygen kernel: [<ffffffff81116b11>] ? scan_vma_one_page+0x2a1/0x12c0
Apr 16 22:56:00 oxygen kernel: [<ffffffff81117c2e>] ? uksm_do_scan+0xfe/0x1910
Apr 16 22:56:00 oxygen kernel: [<ffffffff810972b0>] ? migrate_timer_list+0x60/0x60
Apr 16 22:56:00 oxygen kernel: [<ffffffff811195a5>] ? uksm_scan_thread+0x165/0x190
Apr 16 22:56:00 oxygen kernel: [<ffffffff81119440>] ? uksm_do_scan+0x1910/0x1910
Apr 16 22:56:00 oxygen kernel: [<ffffffff81063ffc>] ? kthread+0xbc/0xe0
Apr 16 22:56:00 oxygen kernel: [<ffffffff81063f40>] ? kthread_create_on_node+0x170/0x170
Apr 16 22:56:00 oxygen kernel: [<ffffffff81444058>] ? ret_from_fork+0x58/0x90
Apr 16 22:56:00 oxygen kernel: [<ffffffff81063f40>] ? kthread_create_on_node+0x170/0x170
Apr 16 22:56:00 oxygen kernel: Disabling lock debugging due to kernel taint

I thought the problem was LXC, but I unmerged LXC and the problem is still there.
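
For anyone who wants to look at the pte value in the first line of the report, here is a minimal sketch that decodes the flag bits defined by the x86-64 architecture. It is not part of the original post: the function name decode_pte is made up for illustration, only a subset of the hardware-defined bits is named, and bit 9 is one of the bits the architecture leaves to the operating system, so its meaning in this kernel is not determined here.

```shell
#!/usr/bin/env bash
# Decode some architecturally defined x86-64 flag bits of a PTE value,
# such as the number printed after "pte:" in the report above.
# decode_pte is a hypothetical helper name, not a kernel or distro tool.
decode_pte() {
    local pte=$(( 0x$1 ))   # argument: hex digits without the 0x prefix
    local out=""
    (( pte & 0x1 ))   && out+="present "
    (( pte & 0x2 ))   && out+="writable "
    (( pte & 0x4 ))   && out+="user "
    (( pte & 0x20 ))  && out+="accessed "
    (( pte & 0x40 ))  && out+="dirty "
    (( pte & 0x200 )) && out+="bit9(software-defined) "
    # Bit 63 is the no-execute bit; bash integers are signed 64-bit,
    # so shift it down and mask instead of comparing against 2^63.
    (( (pte >> 63) & 1 )) && out+="no-execute "
    echo "${out% }"
}

decode_pte 80000000da358225
# prints: present user accessed bit9(software-defined) no-execute
```

This only names the bits; it does not say whether the value is valid for this mapping.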

My config:

oxygen adrien # inxi -F
System:    Host: oxygen.linuxtricks.fr Kernel: 3.18.11-calculate x86_64 (64 bit) Console: tty 0
           Distro: Calculate Scratch Server 14.12.1
Machine:   System: Dell product: OptiPlex GX620 serial: XXXXXXX
           Mobo: Dell model: 0FH884 serial: ..XXXXXXXXX. Bios: Dell v: A11 date: 11/30/2006
CPU:       Dual core Intel Pentium D (-MCP-) cache: 1024 KB 
           clock speeds: max: 2793 MHz 1: 2793 MHz 2: 2793 MHz
Graphics:  Card: Intel 82945G/GZ Integrated Graphics Controller
           Display Server: N/A driver: N/A tty size: 190x51 Advanced Data: N/A for root out of X
Audio:     Card Intel 82801G (ICH7 Family) AC'97 Audio Controller driver: snd_intel8x0
           Sound: Advanced Linux Sound Architecture v: k3.18.11-calculate
Network:   Card: Broadcom NetXtreme BCM5751 Gigabit Ethernet PCI Express driver: tg3
           IF: enp2s0 state: up speed: 1000 Mbps duplex: full mac: 00:19:b9:10:15:15
Drives:    HDD Total Size: 320.1GB (3.7% used) ID-1: /dev/sdb model: WDC_WD1600AAJS size: 160.0GB
           ID-2: /dev/sda model: WDC_WD1600AAJS size: 160.0GB
Partition: ID-1: / size: 28G used: 7.5G (29%) fs: ext4 dev: /dev/md0p1
           ID-2: /home size: 116G used: 60M (1%) fs: ext4 dev: /dev/md0p3
           ID-3: swap-1 size: 4.00GB used: 0.00GB (0%) fs: swap dev: /dev/md0p2
RAID:      Device-1: /dev/md0 - active raid: 1 components: online: 2/2 - sda1 sdb1
Sensors:   None detected - is lm-sensors installed and configured?
Info:      Processes: 110 Uptime: 23:03 Memory: 599.4/3504.8MB Init: SysVinit rc: OpenRC runlevel: default
           Client: Shell (bash) inxi: 2.2.19 

Can you help me?

Hello.
Try to emerge =sys-kernel/calculate-sources-3.19.5
Kernel version 3.19 comes without UKSM, so maybe the problem will be solved.
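
After booting the other kernel, a quick way to confirm whether UKSM is really absent could look like the sketch below. It assumes that UKSM-patched kernels expose their controls under /sys/kernel/mm/uksm/ with a file named run; the function name uksm_status and the optional sysfs root argument are made up for illustration.

```shell
#!/usr/bin/env bash
# Report whether the running kernel exposes UKSM controls in sysfs.
# Assumption: a UKSM-patched kernel provides <sysfs>/kernel/mm/uksm/run.
# uksm_status is a hypothetical helper name.
uksm_status() {
    local sysfs_root="${1:-/sys}"   # overridable, mainly for testing
    local run_file="$sysfs_root/kernel/mm/uksm/run"
    if [[ -r "$run_file" ]]; then
        echo "uksm present, run=$(cat "$run_file")"
    else
        echo "uksm not present"
    fi
}

uksm_status
```

If it reports "uksm not present" and the errors continue, UKSM itself is unlikely to be the cause.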

Hi, thanks for your answer.

Since I posted here, I have had some other errors, but without uksm involved.

I think I have a hardware problem (faulty RAM chips).

I am going to test by removing each RAM module in turn, and afterwards I will install the latest kernel.

Thanks
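
To compare runs while swapping RAM modules, one option is to count the "Bad page map" reports per process in a saved log. This is only a sketch: the function name count_bad_page_maps is made up, the log path /var/log/messages is an assumption about where syslog writes on this system, and other kinds of kernel errors are not counted.

```shell
#!/usr/bin/env bash
# Count "Bad page map" reports per process name in a saved kernel log.
# count_bad_page_maps is a hypothetical helper name.
count_bad_page_maps() {
    local log_file="$1"
    grep -o 'Bad page map in process [^ ]*' "$log_file" \
        | awk '{ print $NF }' \
        | sort | uniq -c \
        | awk '{ print $2 ": " $1 }'   # prints "process: count"
}

# Assumed log location; adjust to wherever the kernel messages are stored.
if [[ -r /var/log/messages ]]; then
    count_bad_page_maps /var/log/messages
fi
```

If the counts spread over several unrelated processes, that points away from one program and towards the hardware.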

I think I have a memory problem…

I tried a newer kernel as you suggested, but I get the same problem:

http://pastebin.calculate-linux.org/en/show/10819

And the dmesg: http://pastebin.calculate-linux.org/en/show/10820

So I think I have a hardware problem (motherboard).

I moved my 2 hard drives (RAID1 with mdadm) into another server and had no problems.

I am going to continue with that one and throw away this Dell.

Hummm…

I installed CLDX on this computer to recycle it and… I had no problems…

Is the problem CSS + RAID1 with mdadm on my Dell?

CSS + RAID1 with mdadm on the HP is OK… Strange problem :)