Revision as of 19:02, 10 July 2009
2009/01/18
My email to Western Digital:
I have a Linux box which I use for MythTV, SqueezeCenter, as a file server etc. The primary drive is a WD5000AAKS and performs quite well, so I use a large partition on it for the MythTV recordings. Later I added this WD1000FYPS for longer-term file storage, and I've always been dissatisfied with the performance. It's strange, because hdparm reports good results:
[neutron][01:34:48 PM] hdparm -tT /dev/sda
/dev/sda:
 Timing cached reads:   1974 MB in 2.00 seconds = 986.64 MB/sec
 Timing buffered disk reads:  198 MB in 3.01 seconds = 65.70 MB/sec
[neutron][01:49:00 PM] hdparm -tT /dev/sdb
/dev/sdb:
 Timing cached reads:   2160 MB in 2.00 seconds = 1080.53 MB/sec
 Timing buffered disk reads:  224 MB in 3.02 seconds = 74.22 MB/sec
(it should be even faster than the 500 gig drive) but it seems like in normal usage, the seek time is really bad or something. The drive can go for several minutes at a time making rhythmic seek noises, much louder than the other one. While it is doing this, any files stored on that drive are not very accessible - even doing "ls" can take a very long time, as much as half a minute or so. I have not had any data loss though.
I don't know what to blame the performance on, but I would bet that if I get a different model of terabyte drive and transfer everything to it, I will see a huge difference in performance. I just wondered if there are some known problems along these lines - something caused by the RAID optimization, or the "green" aspects or something that can be corrected with a firmware update, or maybe it's an actual defect (since I'm reading a lot of reviews about problems with this one, people losing all their data after a day, or a month).
If I can still trust this drive not to lose data, maybe I just have to put it in a box and use it for an external backup drive, because it can transfer data reasonably fast as long as there's not too much seeking involved.
Then I tried Bonnie++, and now it's very obvious there is a performance difference! On sdb it was crunching for many minutes; on sda, for much less time. I guess it's telling me this drive has problems with file creation, but I'm not sure how to narrow the problem down further.
/dev/sda:
Version 1.93c       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
neutron          6G   274  98 51301  18 26301   7  1463  96 59167   9 231.7   3
Latency             57729us    4414ms    1220ms   51215us     214ms     282ms
Version 1.93c       ------Sequential Create------ --------Random Create--------
neutron             -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16 15594  74 +++++ +++ 18589  98 18627  87 +++++ +++ 16965  95
Latency             17607us   10673us    3077us   38672us     755us   14265us
/dev/sdb:
Version 1.93c       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
neutron          6G   237  98 36724  17 20134   5  1391  98 63035  10  30.4   1
Latency             51377us    5875ms    3985ms   27709us    1970ms    3581ms
Version 1.93c       ------Sequential Create------ --------Random Create--------
neutron             -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16    45   1 +++++ +++ 13180  72  5579  77 +++++ +++ 12702  73
Latency                293s     313us     212ms   12460us   12142us   11643us
Both drives are formatted with ReiserFS.
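To see where the two runs really differ, it helps to divide one set of numbers by the other. Here is a minimal sketch that does this for a handful of figures copied from the two Bonnie++ listings above. The dictionary keys and the `slowdown` helper are made-up names for illustration, and I'm assuming the first listing is the 500 GB drive and the second is the WD1000FYPS, going by which run took many minutes.

```python
# Figures copied by hand from the two Bonnie++ listings above.
# Assumption: first listing = 500 GB drive, second listing = WD1000FYPS.
results = {
    "first_listing": {
        "block_write_KBps": 51301,
        "block_read_KBps": 59167,
        "random_seeks_per_s": 231.7,
        "seq_create_per_s": 15594,
        "rand_create_per_s": 18627,
    },
    "second_listing": {
        "block_write_KBps": 36724,
        "block_read_KBps": 63035,
        "random_seeks_per_s": 30.4,
        "seq_create_per_s": 45,
        "rand_create_per_s": 5579,
    },
}


def slowdown(first, second):
    """Return first/second for every metric; values above 1 mean the second run was slower."""
    return {name: first[name] / second[name] for name in first}


ratios = slowdown(results["first_listing"], results["second_listing"])

# Print the metrics with the biggest gap first.
for name, ratio in sorted(ratios.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:20s} {ratio:8.1f}x")
```

Sequential block reads come out about even, which fits the hdparm results, while file creation and random seeks show the largest gaps.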
[neutron][02:33:27 PM] fdisk -l /dev/sda
Disk /dev/sda: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x8e9c8e9c
   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1        7296    58605088+   7  HPFS/NTFS
/dev/sda2            7297        7298       16065   83  Linux
/dev/sda3            7299        7664     2939895   82  Linux swap / Solaris
/dev/sda4            7665       60801   426822952+   5  Extended
/dev/sda5            7665       10097    19543041   83  Linux
/dev/sda6           10098       60801   407279848+  83  Linux
[neutron][02:34:12 PM] fdisk -l /dev/sdb
Disk /dev/sdb: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x00000000
   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1               1      121601   976760001   83  Linux
2009/01/23
Now I've gotten a WD10EADS to see if the performance will improve, and I copied the complete filesystem as-is, since the disk has exactly the same geometry:
[neutron][10:47:06 PM] dd if=/dev/sdc of=/dev/sdb
1953525168+0 records in
1953525168+0 records out
1000204886016 bytes (1.0 TB) copied, 40451.8 s, 24.7 MB/s
so it's still ReiserFS. Heeeeere's Bonnie:
Version 1.93c       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
neutron          6G   165  71 60356  30 27076   8  1285  87 58794   9 249.0   7
Latency               107ms    3227ms    1021ms   77159us     305ms     257ms
Version 1.93c       ------Sequential Create------ --------Random Create--------
neutron             -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16    96   7 +++++ +++ 15349  81   902  79 +++++ +++ 12863  74
Latency                132s   10170us    9786us   44270us   20202us   37398us
Yeah, that seems like an improvement: at least some of the numbers are nearer to those of the 500GB drive. And I don't hear the "swishing" noises the older drive was always making. I think that drive is defective.
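Bonnie++ prints its latency figures in mixed units (us, ms and s), which makes them easy to misread side by side. The sketch below converts them all to seconds. `latency_seconds` is a hypothetical helper, not part of Bonnie++, and the three values are the sequential-create latencies copied from the runs on this page.

```python
import re

# Multipliers for the unit suffixes that appear in the listings on this page.
_UNITS = {"us": 1e-6, "ms": 1e-3, "s": 1.0}


def latency_seconds(text):
    """Convert a latency string such as '17607us', '212ms' or '293s' to seconds."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)(us|ms|s)", text.strip())
    if match is None:
        raise ValueError(f"unrecognised latency: {text!r}")
    value, unit = match.groups()
    return float(value) * _UNITS[unit]


# Sequential-create latency, copied from the three Bonnie++ runs.
seq_create_latency = {
    "500 GB drive": "17607us",
    "WD1000FYPS": "293s",
    "WD10EADS": "132s",
}

for drive, raw in seq_create_latency.items():
    print(f"{drive:14s} {latency_seconds(raw):12.6f} s")
```

With everything in seconds, it is easier to see which of the numbers moved toward the 500GB drive and which did not.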
2009/02/01: Benchmarking on Windows
Since my Windows machine doesn't have SATA I had to connect it via USB.
C:\bin>h2benchw.exe -english
H2bench -- by Harald Bögeholz & Lars Bremer / c't Magazin für Computertechnik
Version 3.12/Win32, Copyright (C) 2005 Heise Zeitschriften Verlag GmbH & Co. KG
Dutch translation by F&L Technical Publications B.V.
usage: h2bench [options] [<drive>]
options:
  -a          perform all measurements
  -z          perform zone measurement
  -s          measure seek time
  -c <n>      measure interface speed at n % of total capacity ("core test")
  -p          measure application profiles
  -d <n>      check data integrity (first <n> sectors fully checked)
  -dt <n>     specify duration of third phase of integrity check in seconds
  -tt "<txt>" specify title text (hard drive model)
              similarly: -tb (BIOS version), -tc (CPU), -tm (motherboard),
              -ta (host adapter), -ts (media; for removable drives)
  -w <file>   save results in files <file>.*
  -!          do write benchmarks (default: read-only)
  -deutsch    auf deutsche Version umschalten
  -nederlands switch naar de Nederlandse versie
  <drive>     Nummer of drive to test (0=first physical disk etc.)
C:\bin>h2benchw.exe -english -a -w fyps 2
H2bench -- by Harald Bögeholz & Lars Bremer / c't Magazin für Computertechnik
Version 3.12/Win32, Copyright (C) 2005 Heise Zeitschriften Verlag GmbH & Co. KG
Dutch translation by F&L Technical Publications B.V.
Capacity: 1953525144 sectors=953870 MByte
Checking timer for 10 seconds (Win32) ............. Ok.
timer resolution: 0.279 µs, 3.580 MHz
timer statistics: 6181869 calls, min 0.56 µs, average 0.74 µs, max 118.45 µs
Reading some sectors to warm up... done.
interface speed test with block size 128 sectors (64.0 KByte):
sequential read rate medium (w/out delay): 29.1 MByte/s
sequential transfer rate w/ read-ahead (delay: 2.37 ms): 28.5 MByte/s
Repetitive sequential read ("core test"): 29.2 MByte/s
Zone measurement read: calibrating... 29.1 MByte/s at 50% of total capacity.
reading 999 sample points (15262 blocks of 128 sectors = 953.88 MByte)
estimated runtime: 545 minutes...done.
sustained data rate read: average 29729.5, min 26128.4, max 30651.6 [KByte/s]
Measuring random access time (whole disk):
reading... 66.27 ms  (min. 3.00 ms, max. 3389.75 ms)
random access time in lower 504 MByte
reading... 7.79 ms  (min. 0.49 ms, max. 19.02 ms)
Running application profile `swapping' ...6494.8 KByte/s
Running application profile `installing' ...4764.2 KByte/s
Running application profile `Word' ...16564.5 KByte/s
Running application profile `Photoshop' ...10465.9 KByte/s
Running application profile `copying' ...9337.9 KByte/s
Running application profile `F-Prot' ...3791.1 KByte/s
Result: application index = 7.4
!!! WARNING: application profiles inaccurate since measured read-only
And for comparison, here is an IDE drive:
C:\bin>h2benchw.exe -english -a -w cdrive 0
H2bench -- by Harald Bögeholz & Lars Bremer / c't Magazin für Computertechnik
Version 3.12/Win32, Copyright (C) 2005 Heise Zeitschriften Verlag GmbH & Co. KG
Dutch translation by F&L Technical Publications B.V.
ATA disk: IBM-DJNA-352030
serial #: GQ0GQFJ4958
firmware: J58OA30K
Supported UDMA modes: 0 1 2 3 4
UDMA mode 4 active.
acoustic management not supported.
Capacity: 39873330 sectors=19469 MByte, CHS=(2482/255/63)
Checking timer for 10 seconds (Win32) ............. Ok.
timer resolution: 0.279 µs, 3.580 MHz
timer statistics: 6189109 calls, min 0.56 µs, average 0.74 µs, max 116.77 µs
Reading some sectors to warm up... done.
interface speed test with block size 128 sectors (64.0 KByte):
sequential read rate medium (w/out delay): 14.6 MByte/s
sequential transfer rate w/ read-ahead (delay: 4.71 ms): 54.2 MByte/s
Repetitive sequential read ("core test"): 52.9 MByte/s
Zone measurement read: calibrating... 12.7 MByte/s at 50% of total capacity.
reading 998 sample points (312 blocks of 128 sectors = 19.50 MByte)
estimated runtime: 26 minutes...done.
sustained data rate read: average 12127.6, min  8012.8, max 15159.1 [KByte/s]
Measuring random access time (whole disk):
reading... 14.42 ms  (min. 2.18 ms, max. 25.91 ms)
random access time in lower 504 MByte
reading... 9.23 ms  (min. 0.11 ms, max. 15.58 ms)
Running application profile `swapping' ...4890.3 KByte/s
Running application profile `installing' ...3747.1 KByte/s
Running application profile `Word' ...6071.5 KByte/s
Running application profile `Photoshop' ...4926.8 KByte/s
Running application profile `copying' ...4223.8 KByte/s
Running application profile `F-Prot' ...3538.3 KByte/s
Result: application index = 4.4
!!! WARNING: application profiles inaccurate since measured read-only
!!! WARNING: application profiles inaccurate due to small total capacity
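A couple of the H2bench figures above can be sanity-checked with simple arithmetic. This is a rough sketch, assuming 512-byte sectors and that H2bench's "MByte" means 2**20 bytes. Both are my assumptions, and the function names are invented for the example.

```python
SECTOR_BYTES = 512      # assumed sector size
MBYTE = 1024 * 1024     # assumption: H2bench counts a MByte as 2**20 bytes


def capacity_mbyte(sectors):
    """Capacity in MByte for a given sector count."""
    return sectors * SECTOR_BYTES / MBYTE


def zone_runtime_minutes(sample_points, mbyte_per_point, rate_mbyte_per_s):
    """Time to read all zone-measurement sample points at the calibrated rate."""
    return sample_points * mbyte_per_point / rate_mbyte_per_s / 60


# WD1000FYPS over USB, then the IBM-DJNA-352030, using numbers from the listings.
print(round(capacity_mbyte(1953525144)), "MByte")
print(round(capacity_mbyte(39873330)), "MByte")
print(f"{zone_runtime_minutes(999, 953.88, 29.1):.1f} minutes")
print(f"{zone_runtime_minutes(998, 19.50, 12.7):.1f} minutes")
```

Under those assumptions the capacities match the reported 953870 and 19469 MByte, and the runtimes land close to the reported estimates of 545 and 26 minutes. In other words, the nine-hour zone measurement on the terabyte drive is just what reading that much data at USB speed costs, not a sign of trouble by itself.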
2009/07/10: Benchmarking on Windows again
I got around to buying and installing a PCI SATA controller (the Rosewill RC-212) in my Windows box, and attached just this one disk to it. Then I formatted it with NTFS as one large partition.
Here are the results with IOMeter:
(Image: WD1000FYPS-bad-latency-iometer1.png)
So at least some of the time, the read latency was 21 seconds!  Yep, that at least agrees with my observed behavior on Linux: sometimes it just takes too long for any particular operation.  But not always.  I think we can safely say this drive has something wrong with it, and ReiserFS is not the culprit.
IOMeter created a file filling the entire filesystem (931 GB, or 1,000,104,140,800 bytes), and I guess it did reads and writes within that file to test various parts of the disk. I left it running during a normal workday: I started it before I left for work, then stopped it and examined the results when I got home.
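The same kind of "worst case per operation" number can be approximated without IOMeter. The following is only a sketch of the idea, not how IOMeter works internally: it reads random blocks from an existing file and reports the average and the worst latency. `probe_read_latency` and its parameters are hypothetical names. It only reads, and the OS cache will hide latency for recently read data, so a real test needs a file much larger than RAM on the drive in question.

```python
import os
import random
import tempfile
import time


def probe_read_latency(path, samples=200, block_size=4096, seed=0):
    """Read `samples` random blocks from `path`; return (average, worst) latency in seconds."""
    size = os.path.getsize(path)
    if size < block_size:
        raise ValueError("file is smaller than one block")
    rng = random.Random(seed)
    latencies = []
    # buffering=0 avoids Python's own read buffer; the OS cache still applies.
    with open(path, "rb", buffering=0) as f:
        for _ in range(samples):
            offset = rng.randrange(0, size - block_size + 1)
            start = time.perf_counter()
            f.seek(offset)
            f.read(block_size)
            latencies.append(time.perf_counter() - start)
    return sum(latencies) / len(latencies), max(latencies)


if __name__ == "__main__":
    # Demo on a small temporary file, just to show the call.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(os.urandom(1024 * 1024))
    try:
        avg, worst = probe_read_latency(tmp.name)
        print(f"average {avg * 1000:.3f} ms, worst {worst * 1000:.3f} ms")
    finally:
        os.unlink(tmp.name)
```

The worst-case figure is the interesting one here, since an average can look fine while individual operations stall for seconds.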
The replacement WD10EADS has started developing bad blocks, too. So I'm not too impressed with Western Digital right now. It could just be that drives are too delicate to actually ship, though. Both of these drives were ordered online and who knows what the carrier did with them. (I don't really remember how well they were packed or who delivered them. Probably UPS though.)
So anyway, before this testing, I had re-installed the FYPS, formatted it with ext4, and started copying all my files back to it, because I figured it's better to have a slow drive than one that's in danger of losing everything. Both were attached to the SATA connectors on the motherboard in my MythTV box, so it should have been plenty fast (no USB or network involved). But in a couple of days it had only copied 50GB, so I figured at that rate it's going to take a few weeks to copy the whole drive, and I didn't know which drive to blame for this slowness since both of them are acting up now.
So last night I got a Seagate 1.5TB drive at Fry's (so no shipping involved, except however they ship the pallets to Fry's), put the EADS and the new one in the Myth box, formatted the new one with ext4, and restarted the copying.  Now, in about 18 hours it's gotten 150 gigs copied (but some files are corrupted; I'm logging the output from cp so I will know which ones, and then we'll see which ones are backed up somewhere else).  That's more like it, speed-wise.  So it's yet another bit of evidence the FYPS has just gotten hopelessly slow.
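For a rough sense of scale, the two copy attempts can be turned into rates. This is back-of-the-envelope arithmetic only: I'm taking "a couple of days" as 48 hours and assuming up to a full 1000 GB needs copying, neither of which is stated exactly above. The function names are made up for the example.

```python
GB = 10**9  # decimal gigabytes


def rate_mb_per_s(gigabytes, hours):
    """Average copy rate in MB/s."""
    return gigabytes * GB / (hours * 3600) / 10**6


def days_to_copy(total_gigabytes, gigabytes_done, hours_taken):
    """Days needed for the whole copy if the observed rate holds."""
    return total_gigabytes / gigabytes_done * hours_taken / 24


# Copy onto the FYPS: 50 GB in "a couple of days" (assumed to be 48 hours).
to_fyps = rate_mb_per_s(50, 48)
# Copy onto the new Seagate: 150 GB in about 18 hours.
to_seagate = rate_mb_per_s(150, 18)

print(f"to FYPS:    {to_fyps:.2f} MB/s, {days_to_copy(1000, 50, 48):.0f} days for 1000 GB")
print(f"to Seagate: {to_seagate:.2f} MB/s, {days_to_copy(1000, 150, 18):.0f} days for 1000 GB")
```

Under those assumptions the copy onto the FYPS works out to several weeks for a full drive, and the copy onto the Seagate to several days, roughly an eightfold difference.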
I will have to RMA both of the Western Digitals and I hope they send me something reliable this time.