open("/root/dmesg.txt", O_RDONLY) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=23818, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2afbfba25000
read(3, "Linux version 2.6.18-120.el5 (bre"..., 4096) = 4096
read(3, "dle threads.\nCPU: Physical Proces"..., 4096) = 4096
read(3, "device 0000:00:1c.0 to 64\nPCI: Se"..., 4096) = 4096
read(3, "SB hub found\nhub 1-0:1.0: 8 ports"..., 4096) = 4096
read(3, "t 4\nusb-storage: waiting for devi"..., 4096) = 4096
read(3, "20\nACPI: PCI Interrupt 0000:01:04"..., 4096) = 3338
read(3, ""..., 4096) = 0
close(3)
Now, what is buffered IO? It is a buffering layer implemented on top of the unbuffered system calls such as open() and read() (labelled "direct IO" in this post). glibc keeps a buffer per stream, 4096 bytes here (i.e. one 4K page), and every read and write by the application is served from this buffer. That is why the trace above shows read() calls of 4096 bytes even though the program asks for only 256 bytes at a time. If you want to know more about buffered IO, read the books The C Standard Library and Unix File Systems.
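To make the idea concrete, here is a minimal sketch of a buffering layer built on read(). It is not glibc's implementation: struct buf_reader, br_open(), br_read() and br_close() are hypothetical names made up for this illustration, and the 4096-byte size is simply taken from the trace above.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

#define BR_BUFSIZE 4096   /* same size as the buffer seen in the strace output */

/* Hypothetical minimal buffered reader; not glibc's actual FILE layout. */
struct buf_reader {
    int    fd;               /* underlying file descriptor */
    size_t pos;              /* next unread byte in buf */
    size_t len;              /* number of valid bytes in buf */
    long   syscalls;         /* how many read() calls were issued */
    char   buf[BR_BUFSIZE];
};

/* Open path for reading; returns 0 on success, -1 on failure. */
int br_open(struct buf_reader *r, const char *path)
{
    assert(r != NULL);
    r->fd = open(path, O_RDONLY);
    r->pos = r->len = 0;
    r->syscalls = 0;
    return r->fd < 0 ? -1 : 0;
}

/* Copy up to n bytes into dst. read() is called only when the buffer is
   empty. Returns the number of bytes copied; 0 means end of file or error. */
size_t br_read(struct buf_reader *r, char *dst, size_t n)
{
    size_t copied = 0;

    while (copied < n) {
        if (r->pos == r->len) {              /* buffer drained: one system call */
            ssize_t got = read(r->fd, r->buf, BR_BUFSIZE);
            r->syscalls++;
            if (got <= 0)
                break;                       /* end of file or error */
            r->pos = 0;
            r->len = (size_t)got;
        }
        size_t chunk = r->len - r->pos;      /* bytes still available in buf */
        if (chunk > n - copied)
            chunk = n - copied;
        memcpy(dst + copied, r->buf + r->pos, chunk);
        r->pos += chunk;
        copied += chunk;
    }
    return copied;
}

/* Close the underlying descriptor. */
int br_close(struct buf_reader *r)
{
    return close(r->fd);
}
```

Reading a file through br_read() in 256-byte pieces should issue far fewer read() system calls than calling read() with 256 bytes directly, which is the effect the sys times of the buffered and unbuffered programs reflect.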
glibc uses the mmap() call in the trace above to create this buffer: it is clearly mapping one anonymous 4K page. I really don't know why it is done this way. I asked on the glibc mailing list but nobody responded (see the post).
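Why a freshly mapped buffer costs a page fault can be seen in isolation. The sketch below maps an anonymous 4K page the same way the trace does, writes one byte to it, and unmaps it again, while counting minor faults with getrusage(). The function names minor_faults() and faults_for_fresh_buffers() are made up for this illustration, and the assumption that every stream open triggers exactly this map/touch/unmap cycle is only an approximation of what glibc does.

```c
#define _DEFAULT_SOURCE      /* for MAP_ANONYMOUS under strict -std= modes */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/resource.h>

/* Minor page faults taken by this process so far, or -1 on error. */
long minor_faults(void)
{
    struct rusage ru;

    if (getrusage(RUSAGE_SELF, &ru) != 0)
        return -1;
    return ru.ru_minflt;
}

/* Map `rounds` fresh anonymous 4K buffers one after another, write one byte
   into each and unmap it again. Returns the number of minor faults taken
   while doing so, or -1 on failure. */
long faults_for_fresh_buffers(int rounds)
{
    const size_t size = 4096;
    long before, after;

    assert(rounds >= 0);
    before = minor_faults();
    if (before < 0)
        return -1;

    for (int i = 0; i < rounds; i++) {
        char *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED)
            return -1;
        p[0] = 1;            /* first touch: the kernel has to supply a page */
        munmap(p, size);
    }

    after = minor_faults();
    return after < 0 ? -1 : after - before;
}
```

On Linux I would expect faults_for_fresh_buffers(10000) to return a value close to 10000, since each new mapping is only backed by a physical page when it is first touched.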
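So how do we avoid the minor page faults caused by buffered IO calls? Each time a stream is opened glibc maps a fresh page for its buffer, and the first access to a freshly mapped page triggers a minor page fault. Luckily glibc provides another function, setvbuf(), with which the application can supply its own buffer. If we provide our own buffer, glibc does not allocate one with mmap(). In the measurements below, using setvbuf() avoids almost all of these minor page faults and also improves the program's performance.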
10000 loops

[root@mysys testnew]# time ./read_nobuffer      --> direct IO
real 0m0.922s
user 0m0.210s
sys  0m0.711s

[root@mysys testnew]# time ./read_bufmmap       --> buffered IO with library mmap
real 0m0.321s
user 0m0.106s
sys  0m0.215s

[root@mysys testnew]# time ./read_bufsetvbuf    --> buffered IO with user provided buffer, setvbuf()
real 0m0.178s
user 0m0.071s
sys  0m0.106s

Minor Page Faults (see under fault/s)

[root@mysys ~]# sar -B 1 10000
Linux 2.6.18-120.el5 (mysys) 12/16/2009

06:17:43 PM  pgpgin/s  pgpgout/s   fault/s  majflt/s
06:17:44 PM      0.00       0.00     51.00      0.00
06:17:45 PM      0.00       0.00     24.00      0.00
06:17:46 PM      0.00       0.00     12.00      0.00
06:17:47 PM      0.00       0.00     12.00      0.00
06:17:48 PM      0.00      31.68     11.88      0.00
06:17:49 PM      0.00       4.04     12.12      0.00
06:17:50 PM      0.00       0.00     12.00      0.00
06:17:51 PM      0.00       0.00     14.00      0.00
06:17:52 PM      0.00       0.00     12.00      0.00
06:17:53 PM      0.00       0.00    163.00      0.00   ---> direct IO
06:17:54 PM      0.00       0.00     36.00      0.00
06:17:55 PM      0.00       0.00     14.00      0.00
06:17:56 PM      0.00       0.00  10213.00      0.00   ---> buffered IO with library mmap
06:17:57 PM      0.00       0.00     13.27      0.00
06:17:58 PM      0.00      28.57    217.35      0.00   ---> buffered IO with user provided buffer, setvbuf()
06:17:59 PM      0.00       0.00     12.24      0.00
06:18:00 PM      0.00       0.00     12.12      0.00
06:18:01 PM      0.00       0.00     12.24      0.00
The source code for the programs I used is pasted below.

[root@mysys testnew]# cat read_nobuffer.c
#include <fcntl.h>
#include <unistd.h>

int main(void)
{
    char buffer[256];
    int i = 0;
    int fd;

    while (i++ < 10000) {
        /* open() returns -1 on failure, not 0 */
        if ((fd = open("/root/dmesg.txt", O_RDONLY)) < 0)
            return 1;
        /* stop at end of file (0) or on error (-1) */
        while (read(fd, buffer, 256) > 0)
            ;
        close(fd);
    }
    return 0;
}

[root@mysys testnew]# cat read_bufmmap.c
#include <stdio.h>

int main(void)
{
    char buffer[256];
    int i = 0;
    FILE *fp;

    while (i++ < 10000) {
        if ((fp = fopen("/root/dmesg.txt", "r")) == NULL)
            return 1;
        /* loop on fread()'s return value so a read error also ends the loop */
        while (fread(buffer, 1, 256, fp) > 0)
            ;
        fclose(fp);
    }
    return 0;
}

[root@mysys testnew]# cat read_bufsetvbuf.c
#include <stdio.h>

char buffer123[8192];   /* user-provided stdio buffer; only 4096 bytes are used */

int main(void)
{
    char buffer[256];
    int i = 0;
    FILE *fp;

    while (i++ < 10000) {
        if ((fp = fopen("/root/dmesg.txt", "r")) == NULL)
            return 1;
        /* must be called before any other operation on the stream */
        setvbuf(fp, buffer123, _IOFBF, 4096);
        while (fread(buffer, 1, 256, fp) > 0)
            ;
        fclose(fp);
    }
    return 0;
}
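The sar figures above are system-wide, so they include faults from everything else running at the time. A per-process count can be taken with getrusage() around the read loop. The sketch below is one way to do that; faults_reading() and minor_faults_now() are names invented for this illustration and were not part of the programs measured above.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/resource.h>

static char own_buffer[4096];   /* user-supplied stdio buffer */

/* Minor page faults taken by this process so far, or -1 on error. */
static long minor_faults_now(void)
{
    struct rusage ru;

    if (getrusage(RUSAGE_SELF, &ru) != 0)
        return -1;
    return ru.ru_minflt;
}

/* Read `path` `loops` times through stdio in 256-byte pieces. If
   use_own_buffer is nonzero, each stream is given own_buffer via setvbuf()
   instead of letting the library allocate a buffer. Returns the minor
   faults taken during the loops, or -1 on failure. */
long faults_reading(const char *path, int loops, int use_own_buffer)
{
    char chunk[256];
    long before, after;

    assert(path != NULL);
    before = minor_faults_now();
    if (before < 0)
        return -1;

    for (int i = 0; i < loops; i++) {
        FILE *fp = fopen(path, "r");
        if (fp == NULL)
            return -1;
        /* setvbuf() must come before any other operation on the stream */
        if (use_own_buffer &&
            setvbuf(fp, own_buffer, _IOFBF, sizeof own_buffer) != 0) {
            fclose(fp);
            return -1;
        }
        while (fread(chunk, 1, sizeof chunk, fp) > 0)
            ;                    /* discard the data; only the IO matters */
        fclose(fp);
    }

    after = minor_faults_now();
    return after < 0 ? -1 : after - before;
}
```

Comparing faults_reading("/root/dmesg.txt", 10000, 0) with faults_reading("/root/dmesg.txt", 10000, 1) should show the same gap as the sar output on a glibc that mmap()s its stream buffers. Other glibc versions may allocate the buffer differently, in which case the gap could be much smaller.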
The message you linked to had an interesting suggestion: to teach glibc to maintain a (small) cache of buffers (to avoid munmap() and then a further mmap()).
I haven’t investigated or thought this through much, but it might be worth trying that.
I suspect you’d get more response from the glibc maintainers if you sent a patch along with your suggestion. Then people can try it out, and if it’s worth the trouble, point you towards getting copyright assignment in order and so on.
I feel there must be a reason behind using mmap() to allocate the buffer for buffered IO, and that is why they provide the setvbuf() call, for people who want to work around it.