So, our VM is running. Now let's check what happens when one node in the storage cluster is rebooted. On the NAS's first node I typed "reboot", and the node started rebooting. On the second node, the heartbeat server brought up a new subinterface with 172.16.70.2 and started the iSCSI target service. The NAS still replies to ping.
The VM continues working.
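A ping reply only shows that the address has moved to the other node, not that the iSCSI target is accepting connections again. Here is a minimal sketch of a probe that could run during the reboot to see how long the target was really unreachable. It assumes the target listens on the default iSCSI port 3260 (not confirmed in my setup notes), and the names probe_tcp, longest_outage and watch are made up for this example.

```python
import socket
import time


def probe_tcp(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def longest_outage(samples):
    """Return the longest outage in seconds from (timestamp, ok) samples.

    An outage runs from the first failed probe to the next successful one.
    An outage that is still open when the samples end is not counted.
    """
    longest = 0.0
    down_since = None
    for ts, ok in samples:
        if not ok and down_since is None:
            down_since = ts          # outage starts
        elif ok and down_since is not None:
            longest = max(longest, ts - down_since)
            down_since = None        # outage ends
    return longest


def watch(host="172.16.70.2", port=3260, duration=60.0, interval=0.5):
    """Probe the portal for `duration` seconds and return the samples."""
    samples = []
    end = time.monotonic() + duration
    while time.monotonic() < end:
        samples.append((time.monotonic(), probe_tcp(host, port)))
        time.sleep(interval)
    return samples
```

Starting watch() just before typing "reboot" and passing its result to longest_outage() gives a rough figure for the failover window, which can then be compared with the 10-15 second hang seen on the Proxmox side.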
In the Proxmox GUI, I go to the storage "NAS01-PVELUNS on node cl02-n01". The GUI hangs.
On the console at cl02-n01 I run:
root@cl02-n01:~# pvscan
The command hangs, but after about 10-15 seconds it prints:
/dev/NAS01LUN1VG0/vm-100-disk-1: read failed after 0 of 4096 at 3221159936: Input/output error
/dev/NAS01LUN1VG0/vm-100-disk-1: read failed after 0 of 4096 at 3221217280: Input/output error
/dev/NAS01LUN1VG0/vm-100-disk-1: read failed after 0 of 4096 at 0: Input/output error
/dev/NAS01LUN1VG0/vm-100-disk-1: read failed after 0 of 4096 at 4096: Input/output error
PV /dev/sdd VG NAS01LUN1VG0 lvm2 [4.00 TiB / 4.00 TiB free]
PV /dev/sda2 VG pve lvm2 [297.59 GiB / 16.00 GiB free]
Total: 2 [4.29 TiB] / in use: 2 [4.29 TiB] / in no VG: 0 [0 ]
I repeat the command:
root@cl02-n01:~# pvscan
PV /dev/sdd VG NAS01LUN1VG0 lvm2 [4.00 TiB / 4.00 TiB free]
PV /dev/sda2 VG pve lvm2 [297.59 GiB / 16.00 GiB free]
Total: 2 [4.29 TiB] / in use: 2 [4.29 TiB] / in no VG: 0 [0 ]
Hehe! The previous warning about "Found duplicate PV bla-bla-bla" has disappeared.
Very interesting.
OK, let's restart our VM: I did "shutdown" and "start", and all is fine. The VM started successfully.
The first NAS node has already rebooted and is online again. Now I try rebooting the second NAS node.
I type "reboot", and the node goes down for reboot...
Now I check the GUI: no hang.
I check the console:
root@cl02-n01:~# pvscan
Found duplicate PV MiXXJdMcRElPXQPEtzc6pPFAQLhQn0lC: using /dev/sde not /dev/sdd
PV /dev/sde VG NAS01LUN1VG0 lvm2 [4.00 TiB / 4.00 TiB free]
PV /dev/sda2 VG pve lvm2 [297.59 GiB / 16.00 GiB free]
Total: 2 [4.29 TiB] / in use: 2 [4.29 TiB] / in no VG: 0 [0 ]
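The warning says that the same PV UUID is visible on two block devices, here /dev/sde and /dev/sdd. To keep an eye on this across repeated failover tests, the pvscan output can be checked automatically. The following is only a sketch: it parses the message format exactly as it appears above, and find_duplicate_pvs is a hypothetical helper name, not part of LVM or Proxmox.

```python
import re

# Matches the warning line exactly as printed by pvscan in the output above.
DUPLICATE_RE = re.compile(
    r"Found duplicate PV (?P<uuid>[^\s:]+): "
    r"using (?P<used>\S+) not (?P<ignored>\S+)"
)


def find_duplicate_pvs(pvscan_output):
    """Return one dict per 'Found duplicate PV' warning in the text."""
    return [m.groupdict() for m in DUPLICATE_RE.finditer(pvscan_output)]


SAMPLE = """\
Found duplicate PV MiXXJdMcRElPXQPEtzc6pPFAQLhQn0lC: using /dev/sde not /dev/sdd
PV /dev/sde VG NAS01LUN1VG0 lvm2 [4.00 TiB / 4.00 TiB free]
PV /dev/sda2 VG pve lvm2 [297.59 GiB / 16.00 GiB free]
"""

for dup in find_duplicate_pvs(SAMPLE):
    print(dup["uuid"], "seen on", dup["used"], "and", dup["ignored"])
```

An empty list means no duplicate was reported, which matches what the second pvscan run showed after the first node's reboot.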
OMG! "Duplicate" again. :(
But the worst part is that something happened to the VM:
root@cl02-n01:~# qm stop 100
trying to aquire lock... OK
/dev/NAS01LUN1VG0/vm-100-disk-1: read failed after 0 of 4096 at 3221159936: Input/output error
/dev/NAS01LUN1VG0/vm-100-disk-1: read failed after 0 of 4096 at 3221217280: Input/output error
/dev/NAS01LUN1VG0/vm-100-disk-1: read failed after 0 of 4096 at 0: Input/output error
/dev/NAS01LUN1VG0/vm-100-disk-1: read failed after 0 of 4096 at 4096: Input/output error
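The same four "read failed" lines show up both in the first pvscan run and here. To compare such output between test runs, the failing offsets can be grouped per device. This is a minimal sketch that assumes the message format stays exactly as shown above; summarize_read_failures is a name invented for this example.

```python
import re
from collections import defaultdict

# One match per "read failed" line, in the format seen in the output above.
READ_FAILED_RE = re.compile(
    r"^\s*(?P<device>/[^\s:]+): read failed after (?P<done>\d+) "
    r"of (?P<size>\d+) at (?P<offset>\d+): (?P<reason>.+)$",
    re.MULTILINE,
)


def summarize_read_failures(output):
    """Map each device path to the list of byte offsets that failed."""
    failures = defaultdict(list)
    for m in READ_FAILED_RE.finditer(output):
        failures[m.group("device")].append(int(m.group("offset")))
    return dict(failures)


SAMPLE = """\
/dev/NAS01LUN1VG0/vm-100-disk-1: read failed after 0 of 4096 at 3221159936: Input/output error
/dev/NAS01LUN1VG0/vm-100-disk-1: read failed after 0 of 4096 at 3221217280: Input/output error
/dev/NAS01LUN1VG0/vm-100-disk-1: read failed after 0 of 4096 at 0: Input/output error
/dev/NAS01LUN1VG0/vm-100-disk-1: read failed after 0 of 4096 at 4096: Input/output error
"""

print(summarize_read_failures(SAMPLE))
```

Two of the offsets are at the very start of the volume, and the other two lie 64 KiB and 8 KiB below 3221225472 bytes (3 GiB). If the disk is 3 GiB in size, which is my assumption, the failed reads hit only the two ends of the volume rather than scattered blocks.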
It seems that the data on /dev/NAS01LUN1VG0/vm-100-disk-1 was corrupted.
Result: I need to check my NAS.