Hello list,
i have a problem with btape (bacula-sd too) and writing on a tape. After
a while it times out sporadically and i can't find out the reason. The
problem is reproducable.
Filling the tape with tar works well, I tested all the commands (rewind,
fsr,bsr ...) by hand and there is no problem with that. The mtx-changer
script from bacula works well too.
My environment:
Hardware:
- a linux server running Debian Linux 2.6.32-bpo.4-xen-686 #1 SMP Tue
Apr 13 16:03:38 UTC 2010 i686 GNU/Linux
- a new Tapechanger Quantum Superloader 3 LTO4 with an Ultrium 4 drive
(2190)
- its coupled with an Adaptec 1045 SAS-adapter
Since kernel 2.6.30 the mvsas-module supports the adapter
Software:
- Bacula 5.0.1-1 installed from the Backports
- mt-st 1.1-4
- i recompiled the st driver with debugging: st: Version 20081215, fixed
bufsize 32768, s/g segs 256
backup:~# lsscsi
[1:0:0:0] tape QUANTUM ULTRIUM 4 2190 /dev/st0
[1:0:0:1] mediumx QUANTUM UHDL 0061 /dev/sch0
backup:~# lsmod ...stripped...
scsi_transport_sas 15093 2 mvsas,libsas
libata 114448 3 ata_piix,ata_generic,libsas
scsi_mod 101065 9
sg,st,osst,ch,sd_mod,mvsas,libsas,scsi_transport_sas,libata
backup:~# tapeinfo -f /dev/sg4
Product Type: Tape Drive
Vendor ID: 'QUANTUM '
Product ID: 'ULTRIUM 4 '
Revision: '2190'
Attached Changer API: No
SerialNumber: 'JN0933AMJ50492'
MinBlock: 1
MaxBlock: 16777215
SCSI ID: 0
SCSI LUN: 0
Ready: yes
BufferedMode: yes
Medium Type: Not Loaded
Density Code: 0x46
BlockSize: 0
DataCompEnabled: yes
DataCompCapable: yes
DataDeCompEnabled: yes
CompType: 0x1
DeCompType: 0x1
Block Position: 14111
Partition 0 Remaining Kbytes: 853050
Partition 0 Size in Kbytes: 853060
ActivePartition: 0
EarlyWarningSize: 0
backup:~# cat /etc/stinit.def
# Defaults, i added read-ahead and scsi2logical
{buffer-writes read-ahead can-bsr scsi2logical async-writes}
# Entry from EMC NetWorker Technical Note
manufacturer=QUANTUM model="ULTRIUM 4" {
auto-lock=0
timeout=800
long-timeout=3600
can-partition=0
scsi2logical=1
# This gives the highest performance, but with old versions of st module
it can cause problems with spaning tapes
async-writes=1
# linux does not have full LTO-4 support to set modeline properly
(should be 0x46 for LTO-4), so use device defaults instead
mode1 blocksize=0 density=0x00
}
In the document there was the option "drive-buffering", but it's not
available (and covered with buffer-writes?)
backup:~# mt-st -f /dev/nst0 stshowopt
The options set: buffer-writes async-writes read-ahead can-bsr
scsi2logical
backup:~# cat /proc/sys/kernel/hung_task_timeout_secs
300
Storage-Daemon configuration:
Autochanger {
Name = wechsler
Device = band
Changer Command = "sudo /etc/bacula/scripts/mtx-changer %c %o %S %a %
d"
Changer Device = /dev/sg5
}
Device {
Name = band
Drive Index = 0
Media Type = LTO-4
Archive Device = /dev/nst0
Device Type = Tape
Automatic Mount = yes; # when device opened, read it
AlwaysOpen = yes;
RemovableMedia = yes;
RandomAccess = no;
Maximum File Size = 1GB #2GB #5GB
Changer Command = "sudo /etc/bacula/scripts/mtx-changer %c %o %S %a %
d"
Changer Device = /dev/sg5
AutoChanger = yes
Maximum Changer Wait = 300
}
Okay, I can run btape test and it times out spouradically
command: strace -ftt -ostrace_btape4 btape -c /etc/bacula/bacula-sd.conf
band
dmesg output since loading the recompiled module, maybe interesting:
[59066.912097] st: Version 20081215, fixed bufsize 32768, s/g segs 256
[59066.912504] st 1:0:0:0: Attached scsi tape st0
[59066.912548] st 1:0:0:0: st0: try direct i/o: yes (alignment 4 B)
[59066.973150] st0: Block limits 1 - 16777215 bytes.
[59066.973818] st0: Mode sense. Length 11, medium 0, WBS 10, BLL 8
[59066.973867] st0: Density 46, tape length: 0, drv buffer: 1
[59066.973914] st0: Block size: 0, buffer size: 4096 (1 blocks).
[59066.980191] st0: Block limits 1 - 16777215 bytes.
[59066.980856] st0: Mode sense. Length 11, medium 0, WBS 10, BLL 8
[59066.980905] st0: Density 46, tape length: 0, drv buffer: 1
[59066.980952] st0: Block size: 0, buffer size: 4096 (1 blocks).
[59066.981004] st0: Normal timeout set to 800 seconds.
[59066.981040] st0: Long timeout set to 3600 seconds.
[59066.981079] st0: Mode 0 options: buffer writes: 1, async writes: 1,
read ahead: 1
[59066.981138] st0: can bsr: 1, two FMs: 0, fast mteom: 0, auto lock:
0,
[59066.981178] st0: defs for wr: 0, no block limits: 0, partitions:
0, s2 log: 0
[59066.981237] st0: sysv: 0 nowait: 0 sili: 0
[59066.981271] st0: debugging: 1
[59066.981303] st0: Mode 0 options: buffer writes: 1, async writes: 1,
read ahead: 1
[59066.981363] st0: can bsr: 1, two FMs: 0, fast mteom: 0, auto lock:
0,
[59066.981403] st0: defs for wr: 0, no block limits: 0, partitions:
0, s2 log: 1
[59066.981463] st0: sysv: 0 nowait: 0 sili: 0
[59066.981497] st0: debugging: 1
[59066.981528] st0: Default block size set to 0 bytes.
[59066.981564] st0: Density default set to 0
[59066.981597] st0: Setting block size to 0 bytes.
[59066.981631] st0: Setting density code to 0.
[59111.178879] st0: Block limits 1 - 16777215 bytes.
[59111.179531] st0: Mode sense. Length 11, medium 0, WBS 10, BLL 8
[59111.179580] st0: Density 46, tape length: 0, drv buffer: 1
[59111.179627] st0: Block size: 0, buffer size: 4096 (1 blocks).
[59111.179787] st0: Rewinding tape.
[59111.190845] st0: Block limits 1 - 16777215 bytes.
[59111.191525] st0: Mode sense. Length 11, medium 0, WBS 10, BLL 8
[59111.191574] st0: Density 46, tape length: 0, drv buffer: 1
[59111.191621] st0: Block size: 0, buffer size: 4096 (1 blocks).
[59111.191753] st0: Locking drive door.
[59111.192529] st0: Setting block size to 0 bytes.
[59111.193439] st0: Mode 0 options: buffer writes: 1, async writes: 1,
read ahead: 1
[59111.193510] st0: can bsr: 1, two FMs: 0, fast mteom: 0, auto lock:
0,
[59111.193554] st0: defs for wr: 0, no block limits: 0, partitions:
0, s2 log: 1
[59111.193616] st0: sysv: 0 nowait: 0 sili: 0
[59111.193650] st0: debugging: 1
[59111.195264] st0: Block limits 1 - 16777215 bytes.
[59111.195921] st0: Mode sense. Length 11, medium 0, WBS 10, BLL 8
[59111.195970] st0: Density 46, tape length: 0, drv buffer: 1
[59111.196723] st0: Block size: 0, buffer size: 4096 (1 blocks).
[59111.196857] st0: Rewinding tape.
[59111.207772] st0: Block limits 1 - 16777215 bytes.
[59111.208423] st0: Mode sense. Length 11, medium 0, WBS 10, BLL 8
[59111.208471] st0: Density 46, tape length: 0, drv buffer: 1
[59111.208518] st0: Block size: 0, buffer size: 4096 (1 blocks).
[59111.208652] st0: Locking drive door.
[59111.209338] st0: Setting block size to 0 bytes.
[59111.210314] st0: Mode 0 options: buffer writes: 1, async writes: 1,
read ahead: 1
[59111.210375] st0: can bsr: 1, two FMs: 0, fast mteom: 0, auto lock:
0,
[59111.210415] st0: defs for wr: 0, no block limits: 0, partitions:
0, s2 log: 1
[59111.210474] st0: sysv: 0 nowait: 0 sili: 0
[59111.210510] st0: debugging: 1
[59140.637500] st0: Rewinding tape.
[59140.648091] st0: Setting block size to 0 bytes.
[59140.648129] st0: Setting density code to 0.
[59150.021243] st0: Writing 1 filemarks.
[59159.801495] st0: Writing 1 filemarks.
[59162.264608] st0: Rewinding tape.
[59182.339981] st0: Error: 8000002, cmd: 8 0 0 fc 0 0
[59182.340021] st0: Sense Key : No Sense [current]
[59182.340068] Info fld=0xfc00, FMK
[59182.340099] st0: Add. Sense: Filemark detected
[59182.340143] st0: Sense: f0 0 80 0 0 fc 0 1e
[59182.340177] st0: EOF detected (0 bytes read).
[59182.340764] st0: EOF/EOM flag up (2). Bytes 0
[59182.341791] st0: EOF up (2). Left 64512, needed 64512.
[59191.023694] st0: Rewinding tape.
[59191.025548] st0: Spacing tape forward over 4 blocks.
[59191.033038] st0: Spacing tape forward over 195 blocks.
[59191.094513] st0: Spacing tape forward over 9798 blocks.
[59191.400620] st0: Spacing tape forward over 1 filemarks.
[59191.423898] st0: EOF/EOM flag up (2). Bytes 0
[59191.425424] st0: EOF up (2). Left 64512, needed 64512.
[59191.426159] st0: Spacing tape forward over 599 blocks.
[59191.574577] st0: Spacing tape forward over 9398 blocks.
[59191.962161] st0: Rewinding tape.
[59196.152816] st0: Writing 1 filemarks.
[59200.007350] st0: Writing 1 filemarks.
[59200.050531] st0: Writing 1 filemarks.
[59200.089635] st0: Unlocking drive door.
[59200.090563] st0: Number of r/w requests 40013, dio used in 40013,
pages 680221.
[59200.092058] st0: Block limits 1 - 16777215 bytes.
[59200.092714] st0: Mode sense. Length 11, medium 0, WBS 10, BLL 8
[59200.092765] st0: Density 46, tape length: 0, drv buffer: 1
[59200.092802] st0: Block size: 0, buffer size: 4096 (1 blocks).
[59200.092933] st0: Rewinding tape.
[59204.136462] st0: Block limits 1 - 16777215 bytes.
[59204.137134] st0: Mode sense. Length 11, medium 0, WBS 10, BLL 8
[59204.137187] st0: Density 46, tape length: 0, drv buffer: 1
[59204.137242] st0: Block size: 0, buffer size: 4096 (1 blocks).
[59204.137399] st0: Locking drive door.
[59204.138218] st0: Setting block size to 0 bytes.
[59204.139185] st0: Mode 0 options: buffer writes: 1, async writes: 1,
read ahead: 1
[59204.139247] st0: can bsr: 1, two FMs: 0, fast mteom: 0, auto lock:
0,
[59204.139287] st0: defs for wr: 0, no block limits: 0, partitions:
0, s2 log: 1
[59204.139346] st0: sysv: 0 nowait: 0 sili: 0
[59204.139380] st0: debugging: 1
[59204.139652] st0: Rewinding tape.
[59204.141046] st0: Spacing tape forward over 8388607 filemarks.
[59204.142287] st0: Error: 8000002, cmd: 11 1 7f ff ff 0
[59204.142334] st0: Sense Key : Blank Check [current]
[59204.142403] Info fld=0x7ffffc
[59204.142433] st0: Add. Sense: End-of-data detected
[59204.427601] st0: Writing 1 filemarks.
[59206.061925] st0: Rewinding tape.
[59218.160250] st0: Error: 8000002, cmd: 8 0 0 fc 0 0
[59218.160299] st0: Sense Key : No Sense [current]
[59218.160369] Info fld=0xfc00, FMK
[59218.160399] st0: Add. Sense: Filemark detected
[59218.160445] st0: Sense: f0 0 80 0 0 fc 0 1e
[59218.160479] st0: EOF detected (0 bytes read).
[59218.160960] st0: EOF/EOM flag up (2). Bytes 0
[59218.162043] st0: EOF up (2). Left 64512, needed 64512.
[59218.183776] st0: Error: 8000002, cmd: 8 0 0 fc 0 0
[59218.183824] st0: Sense Key : No Sense [current]
[59218.183892] Info fld=0xfc00, FMK
[59218.183923] st0: Add. Sense: Filemark detected
[59218.183969] st0: Sense: f0 0 80 0 0 fc 0 1e
[59218.184004] st0: EOF detected (0 bytes read).
[59218.184800] st0: EOF/EOM flag up (2). Bytes 0
[59218.185862] st0: EOF up (2). Left 64512, needed 64512.
[59218.236914] st0: Error: 8000002, cmd: 8 0 0 fc 0 0
[59218.236962] st0: Sense Key : No Sense [current]
[59218.237031] Info fld=0xfc00, FMK
[59218.237062] st0: Add. Sense: Filemark detected
[59218.237109] st0: Sense: f0 0 80 0 0 fc 0 1e
[59218.237143] st0: EOF detected (0 bytes read).
[59218.237657] st0: EOF/EOM flag up (2). Bytes 0
[59218.238703] st0: EOF up (2). Left 64512, needed 64512.
[59223.238547] st0: Error: 8000002, cmd: 8 0 0 fc 0 0
[59223.238596] st0: Sense Key : No Sense [current]
[59223.238661] Info fld=0xfc00, FMK
[59223.238692] st0: Add. Sense: Filemark detected
[59223.238741] st0: Sense: f0 0 80 0 0 fc 0 1e
[59223.238775] st0: EOF detected (0 bytes read).
[59223.239319] st0: EOF/EOM flag up (2). Bytes 0
[59223.240723] st0: Error: 8000002, cmd: 8 0 0 fc 0 0
[59223.240760] st0: Sense Key : Blank Check [current]
[59223.240809] Info fld=0xfc00
[59223.240838] st0: Add. Sense: End-of-data detected
[59223.240886] st0: Sense: f0 0 8 0 0 fc 0 1e
[59223.240920] st0: Tape error while reading.
[59223.240953] st0: Zero returned for first BLANK CHECK after EOF.
[59223.241790] st0: Rewinding tape.
[59223.243540] st0: Setting block size to 0 bytes.
[59223.243576] st0: Setting density code to 0.
[59227.330717] st0: Writing 1 filemarks.
[59231.179419] st0: Spacing tape backward over 1 filemarks.
[59235.158520] st0: Spacing tape backward over 1 blocks.
[59235.169386] st0: Rewinding tape.
[59239.568865] st0: Writing 1 filemarks.
[59243.425475] st0: Writing 1 filemarks.
[59243.468538] st0: Writing 1 filemarks.
[59243.510459] st0: Writing 1 filemarks.
[59243.551497] st0: Writing 1 filemarks.
[59243.590430] st0: Rewinding tape.
[59251.730032] st0: Spacing tape forward over 1 filemarks.
[59255.792254] st0: Spacing tape forward over 2 filemarks.
[59255.873336] st0: Rewinding tape.
[59260.470836] st0: Spacing tape forward over 4 filemarks.
[59260.476113] st0: Spacing tape forward over 1 filemarks.
[59287.652420] st0: Rewinding tape.
[59300.778049] st0: Writing 1 filemarks.
[59772.344077] INFO: task btape:15348 blocked for more than 300 seconds.
[59772.344132] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[59772.344206] btape D c23e1668 0 15348 15347 0x00000001
[59772.344254] e901d100 00200286 c23e0eec c23e1668 00008020 c144ed20
c144ed20 00000000
[59772.344348] e901d2bc c45bed20 00000000 c10b6d5d 00000000 e98ba540
ed9f204c c2399079
[59772.344441] c2399000 e901d2bc 00000000 c12880e8 00000000 ed9d47c1
c1005df4 00000000
[59772.344533] Call Trace:
[59772.344569] [<c10b6d5d>] ? kmem_cache_alloc+0x79/0xe5
[59772.344626] [<c12880e8>] ? _spin_lock_irq+0xb/0x21
[59772.344681] [<ed9d47c1>] ? sas_queuecommand+0x1f2/0x211 [libsas]
[59772.344723] [<c1005df4>] ? xen_force_evtchn_callback+0xc/0x10
[59772.344764] [<c1287371>] ? schedule_timeout+0x20/0xb0
[59772.344803] [<c112d1ac>] ? blk_peek_request+0x135/0x143
[59772.344861] [<ed862987>] ? scsi_dispatch_cmd+0x185/0x1e5 [scsi_mod]
[59772.344920] [<ed867387>] ? scsi_request_fn+0x3c1/0x47a [scsi_mod]
[59772.344961] [<c128727a>] ? wait_for_common+0xa5/0x101
[59772.345000] [<c10349b0>] ? default_wake_function+0x0/0x8
[59772.345040] [<edf02984>] ? st_scsi_execute_end+0x0/0x45 [st]
[59772.345080] [<edf03099>] ? st_do_scsi+0x2a2/0x2ca [st]
[59772.345118] [<edf07828>] ? st_write+0x4f1/0x926 [st]
[59772.345157] [<c1044dc3>] ? ptrace_stop+0xc0/0xf2
[59772.345195] [<edf07337>] ? st_write+0x0/0x926 [st]
[59772.345232] [<c10bc156>] ? vfs_write+0x7e/0xd6
[59772.345269] [<c10bc246>] ? sys_write+0x3c/0x63
[59772.345306] [<c1008c3c>] ? syscall_call+0x7/0xb
... repeats several times ...
[60130.146514]
/tmp/buildd/linux-2.6-2.6.32/debian/build/source_i386_xen/drivers/scsi/mvsas/mv_sas.c
1669:mvs_abort_task:rc= -5
[60154.292977]
/tmp/buildd/linux-2.6-2.6.32/debian/build/source_i386_xen/drivers/scsi/mvsas/mv_sas.c
1608:mvs_query_task:rc= -5
[60178.439443]
/tmp/buildd/linux-2.6-2.6.32/debian/build/source_i386_xen/drivers/scsi/mvsas/mv_sas.c
1669:mvs_abort_task:rc= -5
[60202.585918]
/tmp/buildd/linux-2.6-2.6.32/debian/build/source_i386_xen/drivers/scsi/mvsas/mv_sas.c
1608:mvs_query_task:rc= -5
[60226.732387]
/tmp/buildd/linux-2.6-2.6.32/debian/build/source_i386_xen/drivers/scsi/mvsas/mv_sas.c
1669:mvs_abort_task:rc= -5
[60250.878860]
/tmp/buildd/linux-2.6-2.6.32/debian/build/source_i386_xen/drivers/scsi/mvsas/mv_sas.c
1608:mvs_query_task:rc= -5
[60275.025323]
/tmp/buildd/linux-2.6-2.6.32/debian/build/source_i386_xen/drivers/scsi/mvsas/mv_sas.c
1669:mvs_abort_task:rc= -5
[60299.171787]
/tmp/buildd/linux-2.6-2.6.32/debian/build/source_i386_xen/drivers/scsi/mvsas/mv_sas.c
1608:mvs_query_task:rc= -5
[60323.318249]
/tmp/buildd/linux-2.6-2.6.32/debian/build/source_i386_xen/drivers/scsi/mvsas/mv_sas.c
1669:mvs_abort_task:rc= -5
[60347.464715]
/tmp/buildd/linux-2.6-2.6.32/debian/build/source_i386_xen/drivers/scsi/mvsas/mv_sas.c
1608:mvs_query_task:rc= -5
[60372.344064] INFO: task btape:15348 blocked for more than 300 seconds.
[60372.344119] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[60372.344193] btape D c23e1668 0 15348 15347 0x00000001
[60372.344241] e901d100 00200286 c23e0eec c23e1668 00008020 c144ed20
c144ed20 00000000
[60372.344334] e901d2bc c45bed20 00000000 c10b6d5d 00000000 e98ba540
ed9f204c c2399079
[60372.344428] c2399000 e901d2bc 00000000 c12880e8 00000000 ed9d47c1
c1005df4 00000000
[60372.344520] Call Trace:
[60372.344556] [<c10b6d5d>] ? kmem_cache_alloc+0x79/0xe5
[60372.344613] [<c12880e8>] ? _spin_lock_irq+0xb/0x21
[60372.344668] [<ed9d47c1>] ? sas_queuecommand+0x1f2/0x211 [libsas]
[60372.344710] [<c1005df4>] ? xen_force_evtchn_callback+0xc/0x10
[60372.344750] [<c1287371>] ? schedule_timeout+0x20/0xb0
[60372.344790] [<c112d1ac>] ? blk_peek_request+0x135/0x143
[60372.344848] [<ed862987>] ? scsi_dispatch_cmd+0x185/0x1e5 [scsi_mod]
[60372.344907] [<ed867387>] ? scsi_request_fn+0x3c1/0x47a [scsi_mod]
[60372.344948] [<c128727a>] ? wait_for_common+0xa5/0x101
[60372.344987] [<c10349b0>] ? default_wake_function+0x0/0x8
[60372.345026] [<edf02984>] ? st_scsi_execute_end+0x0/0x45 [st]
[60372.345066] [<edf03099>] ? st_do_scsi+0x2a2/0x2ca [st]
[60372.345105] [<edf07828>] ? st_write+0x4f1/0x926 [st]
[60372.345143] [<c1044dc3>] ? ptrace_stop+0xc0/0xf2
[60372.345181] [<edf07337>] ? st_write+0x0/0x926 [st]
[60372.345219] [<c10bc156>] ? vfs_write+0x7e/0xd6
[60372.345255] [<c10bc246>] ? sys_write+0x3c/0x63
[60372.345292] [<c1008c3c>] ? syscall_call+0x7/0xb
the INFO-part just says the process is waiting for something and repeats
every 300 seconds.
i traced the btape process using strace -ftt. Its very big, so i
stripped repeating parts and published it here:
http://pastebin.com/87HdsCRG
I'm thinking of too short ioctl-timeouts, but I could be wrong.
Can you help me or give me some hints?
Thanks for working on this,
Christoph Lehmann
--
Christoph Lehmann
Dr. Ralf Schwedler e.K. Amtsgericht Pinneberg, HRA 5609 PI
Internet / Sicherheit Bankverbindung:
Sparkasse Westholstein
Steinbr�ckstrasse 13b BLZ 222 500 20
25524 Itzehoe, Germany Konto-Nr. 11 00 84 323
T +49 4821 40 87 58 International Account Number
0700-SCHWEDLER IBAN DE05 2225 0020 0110 0843 23
F +49 4821 40 87 59 BIC HSHNDEH1IZH
---- ----
www.schwedler.com Steuer-Nr. 18019 34593
lehmann AT schwedler DOT com USt.-Id. DE 214230181
------------------------------------------------------------------------------
_______________________________________________
Bacula-users mailing list
Bacula-users AT lists.sourceforge DOT net
https://lists.sourceforge.net/lists/listinfo/bacula-users
|