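Problem: sort an unordered text file that has a single column and 200 million rows, and produce a new, sorted text file.
1. Generate the unordered file. The code for BigFileTest.java is as follows: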
import java.io.BufferedWriter;
import java.io.FileWriter;
import java.util.Random;

public class BigFileTest {
    static Random random = new Random();

    public static void main(String[] args) throws Exception {
        createFile();
    }

    // Writes 200 million random long values, one per line.
    public static void createFile() throws Exception {
        // try-with-resources closes the writer, so the last buffered lines reach the disk
        try (BufferedWriter fw = new BufferedWriter(new FileWriter("D:\\BigFileTest\\bigfile.txt"))) {
            // i <= 200000000 so that exactly 200 million lines are written
            for (int i = 1; i <= 200000000; i++) {
                fw.write(Long.toString(random.nextLong()));
                fw.newLine();
                if (i % 10000 == 0) {
                    fw.flush();
                }
            }
        }
    }
}
javac BigFileTest.java
java BigFileTest
This produces bigfile.txt, a text file with 200 million lines.
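Before generating the full file, it can help to try the remaining steps on a small sample first. The following is a minimal sketch, not part of the original post: the class name SampleFileGenerator and the method writeRandomLongs are hypothetical, and the line count and seed are arbitrary. It writes to any Writer, so the same method can target a small file or an in-memory buffer. Each value is a signed 64-bit integer in decimal, at most 20 characters long, which fits the varchar2(30) column used in step 2.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;
import java.util.Random;

// Hypothetical helper for illustration; not part of the original post.
class SampleFileGenerator {

    // Writes `lines` random long values, one per line, to the given writer.
    static void writeRandomLongs(Writer out, long lines, Random random) throws IOException {
        BufferedWriter bw = new BufferedWriter(out);
        for (long i = 0; i < lines; i++) {
            bw.write(Long.toString(random.nextLong()));
            bw.newLine();
        }
        // Flush without closing, so the caller keeps control of the underlying writer.
        bw.flush();
    }

    public static void main(String[] args) throws IOException {
        StringWriter sw = new StringWriter();
        // A fixed seed (arbitrary choice) makes the sample repeatable between runs.
        writeRandomLongs(sw, 5, new Random(42L));
        System.out.print(sw);
    }
}
```

To write a sample file instead, pass a FileWriter for a path of your choice and close it after the call.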
2. Create the external table
create directory data_dir as 'D:\BigFileTest\';
create table bt_ext_test(a varchar2(30))
organization external
(type oracle_loader
default directory data_dir
access parameters
(records delimited by newline characterset zhs16gbk
badfile data_dir:'bigfile.bad'
discardfile data_dir:'bigfile.dsc'
logfile 'bigfile.log'
fields terminated by 0x'09' ldrtrim
missing field values are null
reject rows with all null fields
)
location ('bigfile.txt')
)
parallel
reject limit unlimited;
3. Use sqlplus spool to generate the new, sorted file
set echo off
set feedback off
set termout off
set arraysize 5000
set heading off
set head off
set trimout on
set pagesize 0
set trimspool on
set linesize 30
spool result.txt
select /*+ parallel(bt_ext_test,8) */ * from bt_ext_test order by a;
spool off
exit;
On a machine with four dual-core CPUs running 64-bit Oracle 11.2, using 8 parallel query processes, generating the sorted file took 32 minutes.
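Because column a is declared as varchar2(30), the order by compares the values as character strings, not as numbers, so "10" sorts before "9". The following is a minimal sketch of a check that the spooled file is in that order, assuming a binary (byte-wise) sort in the database session, which for these ASCII digits and minus signs matches Java's String.compareTo. The class name SortedFileCheck and the method firstOutOfOrderLine are hypothetical and not part of the original post.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical helper for illustration; not part of the original post.
class SortedFileCheck {

    // Returns the 1-based number of the first line that compares lower than
    // the line before it (string comparison), or -1 if no such line exists.
    static long firstOutOfOrderLine(Reader in) throws IOException {
        BufferedReader br = new BufferedReader(in);
        String prev = null;
        String line;
        long lineNo = 0;
        while ((line = br.readLine()) != null) {
            lineNo++;
            if (prev != null && prev.compareTo(line) > 0) {
                return lineNo;
            }
            prev = line;
        }
        return -1;
    }

    public static void main(String[] args) throws IOException {
        // In string order "-5" < "10" < "9", because '-' < '1' < '9'.
        System.out.println(firstOutOfOrderLine(new StringReader("-5\n10\n9\n"))); // prints -1
        System.out.println(firstOutOfOrderLine(new StringReader("10\n9\n-5\n"))); // prints 3
    }
}
```

For the real output, pass a FileReader opened on result.txt. The check streams the file line by line, so memory use stays constant regardless of file size.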
-- Reposted from