One of the biggest problems with USB flash drives is slow write speed. This article walks through a process that may increase your flash stick's write speed.
Okay, first I bought a Transcend 8GB USB flash stick. It came formatted with a FAT32 filesystem, so I decided to run a read/write speed test. Mount the filesystem and execute the following:
# hdparm -t /dev/sdb
/dev/sdb:
Timing buffered disk reads: 102 MB in 3.05 seconds = 33.43 MB/sec
$ dd count=100 bs=1M if=/dev/urandom of=/media/disk/test
100+0 records in
100+0 records out
104857600 bytes (105 MB) copied, 29.5112 s, 3.6 MB/s
The read speed is good enough, but the write speed is not. That's because most NAND flash drives (the most common kind of flash stick) have a 128k erase block size, while filesystems usually use a 4k (4096-byte) block size. And here is the problem: if the filesystem blocks are not aligned to the flash erase blocks, every write carries extra overhead. So what we can do is align the filesystem properly. The best way to do this is to use 224 (32*7) heads and 56 (8*7) sectors/track. This produces 12544 (256*49) sectors/cylinder, so every cylinder is 49*128k.
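The arithmetic behind these numbers can be checked with a few lines of shell. This is only a sketch: it assumes 512-byte sectors and a 128k erase block, and the function names are made up for illustration.

```shell
# Sketch: does a heads/sectors geometry give cylinders that are whole
# multiples of the erase block? Assumes 512-byte sectors and a 128 KiB
# erase block (both assumptions; your drive may differ).
cylinder_bytes() {
    heads=$1
    sectors_per_track=$2
    echo $(( heads * sectors_per_track * 512 ))
}

erase_blocks_per_cylinder() {
    bytes=$(cylinder_bytes "$1" "$2")
    erase_block=$(( 128 * 1024 ))
    if [ $(( bytes % erase_block )) -eq 0 ]; then
        echo $(( bytes / erase_block ))
    else
        echo "unaligned"
    fi
}

erase_blocks_per_cylinder 224 56   # geometry used here: prints 49
erase_blocks_per_cylinder 255 63   # common default geometry: prints unaligned
```

With 224 heads and 56 sectors/track each cylinder holds exactly 49 erase blocks, so any partition that starts on a cylinder boundary is also erase-block aligned.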
# fdisk -H 224 -S 56 /dev/sdb
Now turn on expert mode in fdisk and force the partition to begin on a 128k boundary. In my case I set the new beginning of data to 256 (256 sectors of 512 bytes = 128k). Create as many partitions as you need (I created only one, /dev/sdb1).
Do not forget to save the changes and write the new layout to the flash drive (all data on the flash drive will be lost).
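To see whether a given start sector lands on an erase block boundary, a small check like the following can help. It is a sketch that assumes 512-byte sectors; `is_aligned` is a hypothetical helper name, and the erase block size (in KiB) is a parameter because it may not be 128k on every drive.

```shell
# Sketch: is a partition start sector aligned to the erase block?
# Assumes 512-byte sectors; erase block size in KiB defaults to 128.
is_aligned() {
    start_sector=$1
    erase_kib=${2:-128}
    [ $(( start_sector * 512 % (erase_kib * 1024) )) -eq 0 ]
}

# 63 is the traditional DOS-style start, 256 is the value used above,
# 2048 is the default offered by newer fdisk versions.
for s in 63 256 2048; do
    if is_aligned "$s"; then
        echo "$s: aligned"
    else
        echo "$s: not aligned"
    fi
done
```

Under these assumptions both 256 and 2048 are aligned to 128k, while 63 is not.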
Now it's time to create the filesystem. I used ext4 because it lets you specify a stripe width that keeps the filesystem aligned:
# mke2fs -t ext4 -E stripe-width=32 -m 0 /dev/sdb1
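The value 32 is simply the erase block size divided by the filesystem block size. A minimal sketch of that calculation, assuming a 128k erase block and the 4k ext4 block size mentioned earlier (`stripe_width_blocks` is a hypothetical helper name):

```shell
# Sketch: stripe width in filesystem blocks = erase block / filesystem block.
# Arguments: erase block size in KiB, filesystem block size in bytes (default 4096).
stripe_width_blocks() {
    erase_kib=$1
    fs_block_bytes=${2:-4096}
    echo $(( erase_kib * 1024 / fs_block_bytes ))
}

stripe_width_blocks 128   # prints 32
```

If your drive turns out to have a different erase block size, the stripe width would change accordingly.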
Now let's mount the filesystem and test the overall performance:
# hdparm -t /dev/sdb
/dev/sdb:
Timing buffered disk reads: 102 MB in 3.01 seconds = 33.94 MB/sec
$ dd count=100 bs=1M if=/dev/urandom of=/media/disk/test
100+0 records in
100+0 records out
104857600 bytes (105 MB) copied, 17.0403 s, 6.2 MB/s
As we can see, read performance is almost the same, while the write speed went from 3.6 MB/s to 6.2 MB/s, roughly 1.7 times faster.
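The write test can be skewed in two ways that readers raise in the comments: /dev/urandom may itself be a bottleneck, and the OS write cache may hide the real device speed. Below is a minimal sketch of a variant that tries to address both. It assumes GNU dd (for `conv=fdatasync`), and `write_test` is a hypothetical helper name, not part of any tool.

```shell
# Sketch: write COUNT MiB of pre-generated random data to TARGET and flush
# it to the device before dd reports the rate. Assumes GNU dd.
write_test() {
    target=$1
    count=${2:-100}
    src=$(mktemp) || return 1
    # Generate the random data once, so the speed of /dev/urandom
    # is not part of the measurement.
    dd if=/dev/urandom of="$src" bs=1M count="$count" 2>/dev/null
    # conv=fdatasync makes dd flush file data to the device before exiting,
    # so the reported rate is not just the speed of the write cache.
    dd if="$src" of="$target" bs=1M count="$count" conv=fdatasync
    status=$?
    rm -f "$src"
    return $status
}

# Example (writes a test file to the mounted stick):
#   write_test /media/disk/test 100
```

The temporary source file is read from wherever mktemp places it, so it should live on something faster than the stick being tested.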
Hi, thanks for a really interesting article.
What is the algorithm you are using to calculate the heads and sectors?
I have a Sandisk 32GB memory card I would like to use this technique on.
The partitioning geometry doesn't depend on the disk size, so you should be OK with
# fdisk -H 224 -S 56 /dev/sdb
Now if only you could tell me how to do it in WinXP. Much appreciated. At least there is hope that there is a way.
There are some quirks to doing this under Windows XP. Take a look at this nice guide: http://www.ocztechnologyforum.com/forum/showthread.php?48309-Partition-alignment-importance-under-Windows-XP-%2832-bit-and-64-bit%29-why-it-helps-with-stuttering-and-increases-drive-working-life
You are wrong to use /dev/urandom, because this device is very slow. You can measure your tuning with the following command, for example:
# dd count=100 bs=1M if=/dev/zero of=/media/disk/test oflag=sync
It uses /dev/zero, which is extremely fast; just compare the following:
$ dd count=100 bs=1M if=/dev/urandom of=/dev/null
$ dd count=100 bs=1M if=/dev/zero of=/dev/null
Please report what you actually get.
Thank you for your tip! I ran your test and got:
dd count=100 bs=1M if=/dev/urandom of=/dev/null
100+0 records in
100+0 records out
104857600 bytes (105 MB) copied, 11.6463 s, 9.0 MB/s
dd count=100 bs=1M if=/dev/zero of=/dev/null
100+0 records in
100+0 records out
104857600 bytes (105 MB) copied, 0.0862834 s, 1.2 GB/s
Also, I didn't consider that for a precise test we need to add the `sync` mount option, which disables the write cache buffer. Moreover, today's tests show extremely good performance with an NTFS filesystem that has a 64k cluster size and is aligned the same way as described in the article (it is the same flash drive). Take a look at these values:
$ sudo mount -t ntfs-3g -o sync,user,umask=0002 /dev/sdb1 /mnt/flash/
$ dd if=/dev/zero bs=1M count=200 of=/mnt/flash/z
200+0 records in
200+0 records out
209715200 bytes (210 MB) copied, 14.7261 s, 14.2 MB/s
It would be good if someone ran the same tests to confirm these results.
I tried filling the whole flash drive with one NTFS partition and noticed that the speed then dropped to approx. 5.7 MB/s. So there is nothing extraordinary.
What I don't understand is why you haven't made the cylinder size 128K, e.g.:
# fdisk -H 32 -S 8 /dev/sda
If you start with cylinder 1, the partition will begin at the first track (4K); if you start at cylinder 2 or later, it will start at a multiple of 128K. You can leave the first cylinder (erase block) empty as now, or put an unaligned filesystem there as partition 1.
Dear Michael,
Could you please explain what you are doing with the number of heads and sectors? Why are you changing them? Are you trying to get track size to be the same as the filesystem block size?
And what does the number of heads mean? I know it does not mean physical heads, but what does it mean? Is the track size derived from total disk capacity divided by number of heads divided by number of cylinders divided by number of sectors per track?
That does not seem to add up, for example:
# fdisk -l /dev/sda
Disk /dev/sda: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System
/dev/sda1 * 1 13 104391 83 Linux
/dev/sda2 14 60801 488279610 8e Linux LVM
# perl -e 'print 500107862016/255/63/60801,"\n"'
512.002674878606
#
Cheers,
Aleksey
@Alex: Read "heads" as "tracks/ cylinder" in the computer outputs and explanations. Now re-read the third paragraph of the blog above that begins with:
"The disk read speed is good enough . . . "
and ends with:
". . . 49*128k."
Key concept is: "Adjusting disk geometry of flash drive to multiples of the 128k NAND flash erase block." The 4k OS read/ write size IS bound by disk geometry parameters, and the 128k erase block size IS NOT. This way any 4k write operation will be within one 128k erase block.
You have 16065 sectors/ cylinder total, which is
8225280 bytes/ cylinder. 8225280 bytes is not evenly divisible by 128k (= 128*1024bytes): Actually, 8225280/ 128k= 62.753... You can't even begin to "align" the 4k read size to the 128k erase block size in this case.
I want to know the exact difference between a cylinder and an erase block. In the above article the number of heads is (32*7); where does that factor of 7 come from? Please help me at gokhalesushant65@gmail.com, because I am in the final year of engineering and we are doing a project related to USB. Also point me to other articles if possible.
I want to know whether there is any way to change the NAND erase block size. Reply to me at gokhalesushant65@gmail.com
Thank you. That helps a lot.
My actual USB stick is 8 GB, like the author's, and I was able to use the same parameters for -H and -S to get a cylinder size that is evenly divisible by 128k.
(The 500 GB disk was a regular spinning disk, I was just trying to understand the fdisk output as it relates to this issue.)
Big thanks to the author for this post, we tripled our USB flash drive write speed.
Make sure you omit the "sync" option for flash drive mounts. There are two major disadvantages to the "sync" option on flash drives:
1. It causes more erase/write cycles on your flash drive, reducing its lifetime.
2. Writing to the flash filesystem becomes very slow (sometimes more than 100 times slower) because it keeps writing and rewriting sectors.
Use the "async" option when mounting flash drives.
Hello,
Is this applicable to a 16GB USB stick?
Yes, it is.
Interesting, but confusing (for a noob like me...)
1. For the read test you use "/dev/sdb" and for the write test "/media/disk/test". Am I correct that this should also be /dev/sdb?
2. Shouldn't people be warned NOT to run dd on a stick containing data!?!?
3. Can you explain a bit more about the fdisk/expert/begin partition please? Do I first create the partition and then hit 'x' for expert and then change to..... what? My default states 2048, do I change that to 256? Is that the same for every size stick?
4. I tried this and the outcome is exactly the same speed.
Hi Michael,
My case is a Kingston DT 101 2G 32GB using NTFS.
I called fdisk -H 224 -S 56 /dev/sdd and did the instructions getting
Disk /dev/sdd: 31.2 GB, 31221153792 bytes
224 heads, 56 sectors/track, 4861 cylinders, total 60978816 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0xc3072e18
Device Boot Start End Blocks Id System
/dev/sdd1 256 60978815 30489280 7 HPFS/NTFS/exFAT
After (w)riting to save things, I called fdisk /dev/sdd to verify and obtained
Disk /dev/sdd: 31.2 GB, 31221153792 bytes
44 heads, 24 sectors/track, 57745 cylinders, total 60978816 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0xc3072e18
Device Boot Start End Blocks Id System
/dev/sdd1 256 60978815 30489280 7 HPFS/NTFS/exFAT
That is, the heads (44) and sectors (24) have changed !!!
Tried several times and the results were the same.
Is this the expected behaviour? Did this also happen to you?
As I need to use NTFS (ntfs-3g), what are the necessary parameters when making the filesystem
with mkntfs to obtain the correct alignment?
Regards
Fabio
Are you really serious with this "guide"?
It's really impossible to repartition and reformat every USB stick I use ;)
Some of them don't belong to me, and I use them to write data for someone else...
Why can Windows (2k, XP, ++) write 8-10 times faster than Linux?
Linux has a serious problem with USB sticks - yeah! It is unusable...
I had hoped it would be fixed soon, but I have been waiting about 5 years with no progress.
Yes, I tried all these tricks with ehci and others, sync and async options, but still no change.
My write speed on flash disks is about 400kB/s - various HW with Linux - the same poor results.
The same HW works fine with Windows. But any other work in Windows is much worse than in Linux.
Writing to the SSD card in my (Samsung) mobile connected via USB is almost instant. I cannot find any difference in mounting parameters.
It is much faster to burn data to DVD than to copy it to a flash disk. That's really poor and sad...
Great tip. I use Linux to deploy Windows by USB sticks and this is a good tip. Bookmarked!
This is insane! Using if=/dev/zero almost writes in an instant!
I tried testing this on my Transcend JetFlash 32GB, which has an ext4 and an xfs filesystem, and below are the results. Here are the partitions of my flash drive.
1st Partition - 1GB (ext4) Label: boot-os
2nd Partition - 30GB- (xfs) Label: Transcend32G
The way I format the 1st partition is:
mkfs.ext4 -E stripe-width=32 -m 0 -O ^has_journal,large_file -L boot-os -vvv /dev/sdc1
mke2fs 1.42 (29-Nov-2011)
fs_types for mke2fs.conf resolution: 'ext4'
warning: 224 blocks unused.
Filesystem label=boot-os
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=32 blocks
65664 inodes, 262144 blocks
0 blocks (0.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=268435456
8 block groups
32768 blocks per group, 32768 fragments per group
8208 inodes per group
Superblock backups stored on blocks:
32768, 98304, 163840, 229376
Allocating group tables: done
Writing inode tables: done
Writing superblocks and filesystem accounting information: done
The second partition is this:
mkfs.xfs -L Transcend32G /dev/sdc2
meta-data=/dev/sdc2 isize=256 agcount=4, agsize=1915072 blks
= sectsz=512 attr=2, projid32bit=0
data = bsize=4096 blocks=7660288, imaxpct=25
= sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0
log =internal log bsize=4096 blocks=3740, version=2
= sectsz=512 sunit=0 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
Then I tested their read speed and they're almost identical:
N53SV ~ # hdparm -t /dev/sdc1
/dev/sdc1:
Timing buffered disk reads: 62 MB in 3.06 seconds = 20.29 MB/sec
N53SV ~ # hdparm -t /dev/sdc2
/dev/sdc2:
Timing buffered disk reads: 62 MB in 3.06 seconds = 20.24 MB/sec
Now here's the fun part using if=/dev/zero
john@N53SV:~$ dd count=100 bs=1M if=/dev/zero of=/media/Transcend32G/test.xfs;echo;dd count=100 bs=1M if=/dev/zero of=/media/boot-os/test.ext4
100+0 records in
100+0 records out
104857600 bytes (105 MB) copied, 0.0612922 s, 1.7 GB/s
100+0 records in
100+0 records out
104857600 bytes (105 MB) copied, 0.0589694 s, 1.8 GB/s
Using if=/dev/urandom yields the results below:
john@N53SV:~$ dd count=100 bs=1M if=/dev/urandom of=/media/Transcend32G/test.xfs;echo;dd count=100 bs=1M if=/dev/urandom of=/media/boot-os/test.ext4
100+0 records in
100+0 records out
104857600 bytes (105 MB) copied, 7.88737 s, 13.3 MB/s
100+0 records in
100+0 records out
104857600 bytes (105 MB) copied, 7.90414 s, 13.3 MB/s
hey John,
Try using the sync option when mounting to measure the real speed. It seems that in your case you are using the OS write cache.
Hi, can you help me please! How can I test read/write speed in Linux?
I have a Kingston DTSE9 32G USB stick and I've installed a Linux operating system on it (Debian Wheezy). My question is what I can do to improve the write speed, because it is really bad. Simply browsing a menu makes the system hang, and it is unusable as a hard drive.
I tried to format it as this tutorial says, but with no success. I would very much appreciate any help or a "step by step" noob tutorial :)
You should really try to use aufs on top of your rootfs in case you're using usb stick for Linux distribution. Try to read this article https://help.ubuntu.com/community/aufsRootFileSystemOnUsbFlash and see if it helps.
Hi,
Does this apply to external USB hard disks as well (like 1TB and so on)?
Currently I got 3MB/s write speed and it's actually slower than I can download files from Internet.
Best regards
denu
Hey denu,
No, it does not apply to hard disks, since they don't use a similar erase block technique.
Your guide suggests measuring write speed by writing from /dev/urandom. However, on a fast drive (e.g. USB3, or if one is testing an internal drive) the random number generator can become a bottleneck. On my machine it maxes out at 14 MB/s. It's best, therefore, to use /dev/zero as the source, e.g.:
dd if=/dev/zero of=tempfile bs=1M count=1024 conv=fdatasync,notrunc
Using /dev/zero to test is actually a bad idea (usually*), as filesystems such as ext4 will store files full of zero using "sparse blocks", meaning that they don't actually write the data to the disk. (Instead, they will just write a small amount of data saying, effectively, "this is a 2GB file full of zeros".) This is why John above observed such a fast write speed with /dev/zero -- because it wasn't actually writing the data.
I'd suggest running dd count=100 bs=1M if=/dev/urandom of=/dev/null and checking how fast it goes -- if your drive is slower than this, then you should be right with urandom for input. Otherwise, I'd suggest that the next best bet is to create a 100MB file on a fast storage device (another flash drive?) and use that as the input file, so that you're not generating the randomness on-the-fly.
* Of course, if you're using an older/simpler filesystem such as FAT32, that doesn't support sparse blocks, then using /dev/zero is not a problem.
I've tested this out, and found that at least with Ubuntu 12.04, following these steps gives no speed increase compared to just creating and formatting the partition with gparted. I also found that the stripe_width parameter to mke2fs didn't make any noticeable difference to write speed for me.
In fact, I'm not sure that making these changes should even in principle speed up your drive. While it's true that flash drives use large (e.g. 128k) erase-block sizes, my understanding is that they still write data in sector-sized blocks (512 bytes), and therefore aligning your partition to 128k will not make a difference to write speeds, at least on a fresh drive. It should only make a difference when the drive starts deleting blocks of data, which only happens once the drive gets full. (And even then, partition alignment has very little to do with the speed.)
So I don't think that following these steps will actually make your drive any faster. My suspicion is that the only reason the OP found that his drive ran 2x faster than before, is because he changed the filesystem from FAT32 to ext4, and ext4 is a much faster filesystem than FAT. (Or, at a minimum, the linux ext4 driver is faster than the linux VFAT driver.)
Postscript: I think I've confirmed this theory. I reformatted with FAT32 on the exact same disk I was testing on before, and found that my write speed went from ~ 7MB/s down to ~ 4.4MB/s. So, moral of the story: if you want it to work faster (at least under Linux), reformat it with ext4 instead of FAT32.
For testing with random data, just store the data in memory (eg a very small ramdrive.) To keep it noob-friendly, I'd say just run from a live CD distro and you're pretty much starting up essentially with a ramdrive right off. You don't need a huge file anyway. Probably just 100MiB is sufficient for performance testing. This eliminates the CPU bottleneck without having a dependency on the host media being fast enough to properly test.
On the subject of FAT32, let's not forget that the default format parameters are wrong. In particular it seems you generally need 32K cluster sizes from what I understand, but the best thing to do purportedly is to format it with the Panasonic SD formatter (which of course is Windows only.) Before testing its speeds, try that. As far as aligning things properly goes though, if nothing else it might help decrease the actual number of erases and therefore improve the lifetime of the memory card.
BTW, it seems most people feel that turning off journaling is the way to go for the best performance. Though honestly, I think it depends on what you're doing with it. For operations where the data on it is unimportant (say videos thrown onto a card to watch later for instance) performance might be the most important. But for more important stuff the compromise might be necessary. I think this could also be impacting people's results though potentially as it increases the bottleneck somewhat.
Ok, so I'm trying to make ext4 work as well as possible with SDHC cards and I ran across this. There are a few things I'm wondering though. Firstly, it says "Now turn on expert mode with fdisk and force the partition to begin on 128k alignment. In my case I have set new beginning of data to 256." How do you know exactly what number to put in? Do we just put 256 for anything, or does it vary? And I'm assuming you mean the "b" command in expert mode. Most of us don't exactly use the expert mode options in fdisk very much though (in fact, this is the first time ever that I've seen anything using it.) I'm hoping I can use this to optimize for the Raspberry Pi especially because it so foolishly uses multiple partitions with one being FAT32 and another being Linux-based rather than just one single partition (I guess they chose FAT32 for the sake of being friendlier towards Windows users, but that's kind of silly no matter how I look at it since directly booting Windows on a Raspberry Pi is quite beyond impossible and any Windows users are still going to have to ultimately learn how to use Linux a little bit...) This made it impossible for me to find a way to make the actual Linux partition properly optimized. To this end, I'm guessing when doing the second partition you need to align it as well, but I'm not really sure how to do that (yeah, I don't mess with fdisk much as far as actually aligning things. It's usually just creating and deleting partitions really.)
Is this enough though? I originally was trying to figure out how to do what this article describes: http://blogofterje.wordpress.com/2012/01/14/optimizing-fs-on-sd-card/ Unfortunately, the flashbench command just tells me "invalid argument" and I'm beginning to suspect that the -a option was removed within a very short time after that article was written (the last commits say two years ago, so the project is kind of dead I guess), though oddly the readme still shows using the -a option first thing... (But I'm not 100% sure if it's that or maybe something else. For instance, you'll note that they are using /dev/mmcblk0, which means an integrated card reader system. Unfortunately, you can't really do all this stuff on the Raspberry Pi since it runs from its SD card reader, so I have no clue how to do it that way. Perhaps you have to access the device a certain way for it to work and it won't accept stuff like /dev/sdb though?) Anyway, it seems the point of that article is that the erase boundary could be different depending on the card. Is there any other way we can find out and adapt the partitions to reflect this on a standard PC without a memory card reader built into the motherboard directly (which would likely break legacy compatibility quite a lot anyway)?
Thanks for the tutorial. I'm a new user on Ubuntu too, though not for the first time. I'm not sure about trying this because I have made too many mistakes since I started using Ubuntu. I will look for more information about this tutorial elsewhere.
That's interesting. But I've read many times that the pre-installed filesystem on a USB stick usually has the best optimized parameters. Isn't that the case?
Since 2002, the CHS geometry addressing method has been obsolete. I recommend refraining from using it. Besides, it is intuitive only with regard to HDDs; it does not make sense to use it with flash memory. Use LBA (logical block addressing) exclusively, and work with linear sectors and clusters.
Well,
while you are wondering here how a factory-made partition and fs layout may not be optimal, I faced exactly such a problem.
My wife bought a Silicon Power LuxMini 32GB flash drive at a significant discount.
And the write speed is a nightmare: 2 MB/s.
Its default layout: a single FAT32 partition that starts at 1.28 MB!!!! Not 1, 2 or 4 MB, not a multiple of 64k or so!!!!!
512b sector size
4k cluster
After reading a lot of howtos, I tested it with flashbench and found performance peaks at 64k and 4M block sizes, at 7 MB/s. The first should be page size × "number of pages that can be read/written in parallel".
The second is probably the allocation group.
Aligning the partition start to 4M, plus creating FAT32 with a 64k cluster size and aligning the start of its data section to 4M, made a large improvement in copying files.
It is 3.5 MB/s now under Linux, but this flash drive is not seen in Win Vista.
It is seen by an Android tablet and by an ancient laptop with WinXP.
Copying in XP from flash to flash runs at 1.8 MB/s.
Addition: I measured write speed by simply copying 9 GB of large movie files from an external USB HDD to the flash drive.
I also reserved 512 clusters after the FAT for vfat data alignment.
I did the partitioning with gparted and don't remember the drive geometry or whether LBA mode is used.
Can modifying the drive geometry or setting LBA mode help to speed up this flash drive?
Why is this flash drive not seen by Vista?
What more can I try in order to speed it up to the normal 7 MB/s shown by flashbench?
Thanks to all in advance,
Igor
Hello, can you increase the speed when it is formatted as FAT32?