Thursday, September 26, 2013

I have a text file of about 1 GB, one record per line. I want to split it into 1000 files, routing each record to a file by hashCode() % 1000. How can I do this efficiently?

What I do now is create the 1000 new files, then read the records line by line, compute each record's hashCode() % 1000, and write the record to the corresponding file. I tried this, and I estimate it will take several hours to finish. Is there a better way?
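The line-by-line approach can be sketched as follows (a minimal sketch; the input name "records.txt" and output names "part-0" … "part-999" are illustrative, not from the thread). Two details matter: String.hashCode() can be negative, so Math.floorMod rather than % is needed to get a valid bucket index, and keeping one BufferedWriter open per output file avoids reopening a file for every record.

```java
import java.io.*;

public class SplitByHash {
    // Route each line to out[floorMod(line.hashCode(), out.length)].
    static void split(BufferedReader in, Writer[] out) throws IOException {
        String line;
        while ((line = in.readLine()) != null) {
            // hashCode() may be negative; floorMod keeps the bucket in [0, out.length)
            int bucket = Math.floorMod(line.hashCode(), out.length);
            out[bucket].write(line);
            out[bucket].write('\n');
        }
    }

    public static void main(String[] args) throws IOException {
        int parts = 1000;
        Writer[] out = new Writer[parts];
        for (int i = 0; i < parts; i++) {
            // one buffered writer per output file, kept open for the whole run
            out[i] = new BufferedWriter(new FileWriter("part-" + i));
        }
        try (BufferedReader in = new BufferedReader(new FileReader("records.txt"))) {
            split(in, out);
        }
        for (Writer w : out) w.close();
    }
}
```

Note that 1000 simultaneously open file handles may approach the operating system's per-process file-descriptor limit, so that limit may need raising.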
------ Solution ------------------------------------------
1. Set up two large read buffers in memory and read the file into the first one.
2. While a worker thread processes the first buffer, the main thread reads the next part of the file into the other buffer.
3. Additionally create 1000 output caches; after computing each record's destination, append the record to the corresponding cache, and write a cache to its file only when it is full.
4. Finally, flush whatever remains in the 1000 caches to their files.
------ Solution ------------------------------------------
Use two buffers: one receives data currently being read from the file, while the other holds data already read, so reading and processing run in parallel.
The problem is that you process one record per read, so disk seek time dominates. If you read sequentially in large chunks, the disk is much faster.
1. Thread1: read file into buffer1
2. Thread2: process buffer2 if buffer2.remaining() > 0
3. swap buffer1 and buffer2 if buffer1 is full and buffer2 is empty
4. goto 1.
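The swap loop above can be sketched with java.util.concurrent.Exchanger, which blocks until both threads arrive and then swaps their buffers, giving exactly the double-buffering handoff in steps 1–4. To keep the sketch self-contained, the "file" is a list of chunk strings and "processing" just collects them; a real version would do large sequential reads from disk and hash each record into its bucket.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Exchanger;

public class DoubleBuffer {
    static List<String> run(List<String> chunks) throws InterruptedException {
        Exchanger<StringBuilder> ex = new Exchanger<>();
        List<String> processed = new ArrayList<>();

        // Thread1: fill a buffer, then swap it for the empty one.
        Thread reader = new Thread(() -> {
            StringBuilder buf = new StringBuilder();
            try {
                for (String chunk : chunks) {
                    buf.append(chunk);      // stand-in for one large sequential read
                    buf = ex.exchange(buf); // hand over the full buffer, get the empty one back
                }
                ex.exchange(null);          // null signals end of input
            } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        });

        // Thread2: process a full buffer, empty it, then swap it back.
        Thread processor = new Thread(() -> {
            StringBuilder buf = new StringBuilder();
            try {
                while ((buf = ex.exchange(buf)) != null) {
                    processed.add(buf.toString()); // stand-in for hashing records into buckets
                    buf.setLength(0);              // empty the buffer before swapping it back
                }
            } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        });

        reader.start();
        processor.start();
        reader.join();
        processor.join();
        return processed;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run(List.of("chunk-0", "chunk-1", "chunk-2")));
    }
}
```

Exchanger is the standard-library tool for this pattern; it handles the "buffer1 full and buffer2 empty" rendezvous in step 3 without hand-written locking.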
------ For reference only --------------------------------

What are the two large caches for? I'm a bit confused by what you said.
