Monday, February 21, 2011

2/21

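Spent the day struggling to get Mars running.
In the end, the failure came from downloading Mars from the wrong place, so I had been using an old version.
Thanks to I-kun and U-kun for helping me track down the cause.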

After grabbing a new CUDA SDK and recompiling, it ran correctly this time.
The results are posted at the end for reference.

If I extend Mars in the future, should I rebuild it so that it has no dependencies on other SDKs and the like?

Results: Fermi GTX460

running test suite...
==========Similarity Score=========
PCI-E I/O: 8.954000ms
Map: 34.964000ms
Group: 85.582000ms
all-test: 164.454000ms
==========StringMatch=========
io-test: 555.225000ms
PCI-E I/O: 0.107000ms
Map: 5.742000ms
all-test: 561.273000ms
==========MatrixMul=========
generate two 1024x1024 matrice...
rotate matrix2: 575.593000ms
PCI-E I/O: 20.445000ms
Map: 325.586000ms
all-test: 421.117000ms
==========InvertdIndex=========
generating 28 MB data...
io-test: 591.807000ms
PCI-E I/O: 0.736000ms
Map: 59.946000ms
Group: 145.929000ms
all: 799.424000ms
==========PageViewCount=========
rm Gen
gcc -o Gen main.c
generating data...
preprocess: 696.710000ms
PCI-E I/O: 16.061000ms
Map: 24.795000ms
Group: 251.808000ms
Reduce: 8.789000ms
PCI-E I/O: 15.979000ms
Map: 99.858000ms
Group: 2026.143000ms
all: 3159.068000ms
==========PageViewRank=========
rm Gen
gcc -o Gen main.c
generating data...
io-test: 736.345000ms
PCI-E I/O: 10.029000ms
Map: 6.983000ms
Group: 191.776000ms
count: 2147482603, offset: 25070899
count: 2147480853, offset: 30364534
count: 2147478632, offset: 14685990
count: 2147474733, offset: 34364671
count: 2147473550, offset: 36486720
count: 2147470289, offset: 30271619
count: 2147465831, offset: 9384424
count: 2147460496, offset: 8174314
count: 2147459207, offset: 21390190
count: 2147456172, offset: 23958395
all-test: 983.356000ms
==========WordCount=========
preprocess: 549.211000ms
PCI-E I/O: 0.087000ms
Map: 10.612000ms
Group: 2.005000ms
all: 562.383000ms
# of words:373
ACCESS - size: 7 - count: 12
ACHIEVE - size: 8 - count: 4
ADDITION - size: 9 - count: 16
ADDITIONAL - size: 11 - count: 8
ADVANCEMENT - size: 12 - count: 20
ADVANCES - size: 9 - count: 4
AFFORD - size: 7 - count: 4
AGAINST - size: 8 - count: 20
ALASKA - size: 7 - count: 16
ALWAYS - size: 7 - count: 4
==========Kmeans=========
preprocess: 2.273000ms
PCI-E I/O: 0.988000ms
Map: 1.180000ms
Group: 8.722000ms
Reduce: 9.396000ms
PCI-E I/O: 0.980000ms
Map: 1.106000ms
Group: 8.911000ms
Reduce: 8.368000ms
PCI-E I/O: 0.973000ms
Map: 1.161000ms
Group: 9.208000ms
Reduce: 7.768000ms
PCI-E I/O: 0.969000ms
Map: 1.118000ms
Group: 9.031000ms
Reduce: 7.416000ms
PCI-E I/O: 0.974000ms
Map: 1.175000ms
Group: 8.886000ms
Reduce: 7.159000ms
PCI-E I/O: 0.973000ms
Map: 1.111000ms
Group: 9.001000ms
Reduce: 6.870000ms
PCI-E I/O: 0.977000ms
Map: 1.179000ms
Group: 9.287000ms
Reduce: 6.752000ms
PCI-E I/O: 0.969000ms
Map: 1.116000ms
Group: 9.064000ms
Reduce: 6.587000ms
PCI-E I/O: 0.995000ms
Map: 1.162000ms
Group: 8.911000ms
Reduce: 6.611000ms
PCI-E I/O: 0.982000ms
Map: 1.121000ms
Group: 9.026000ms
Reduce: 6.548000ms
PCI-E I/O: 0.990000ms
Map: 1.182000ms
Group: 9.318000ms
Reduce: 6.556000ms
PCI-E I/O: 0.991000ms
Map: 1.133000ms
Group: 9.066000ms
Reduce: 6.478000ms
PCI-E I/O: 0.986000ms
Map: 1.189000ms
Group: 8.887000ms
Reduce: 6.458000ms
PCI-E I/O: 0.995000ms
Map: 1.141000ms
Group: 9.006000ms
Reduce: 6.486000ms
PCI-E I/O: 0.999000ms
Map: 1.217000ms
Group: 9.309000ms
Reduce: 6.483000ms
PCI-E I/O: 0.997000ms
Map: 1.137000ms
Group: 9.084000ms
Reduce: 6.455000ms
PCI-E I/O: 0.989000ms
Map: 1.184000ms
Group: 8.941000ms
Reduce: 6.402000ms
PCI-E I/O: 0.984000ms
Map: 1.143000ms
Group: 9.072000ms
Reduce: 6.452000ms
PCI-E I/O: 0.987000ms
Map: 1.237000ms
Group: 9.370000ms
Reduce: 6.481000ms
PCI-E I/O: 0.998000ms
Map: 1.161000ms
Group: 9.152000ms
Reduce: 6.492000ms
PCI-E I/O: 1.008000ms
Map: 1.192000ms
Group: 8.946000ms
Reduce: 6.423000ms
PCI-E I/O: 0.979000ms
Map: 1.151000ms
Group: 9.053000ms
Reduce: 6.439000ms
PCI-E I/O: 1.022000ms
Map: 1.224000ms
Group: 9.358000ms
Reduce: 6.478000ms
PCI-E I/O: 1.028000ms
Map: 1.179000ms
Group: 9.129000ms
Reduce: 6.468000ms
PCI-E I/O: 1.005000ms
Map: 1.209000ms
Group: 8.985000ms
Reduce: 6.529000ms
PCI-E I/O: 1.013000ms
Map: 1.184000ms
Group: 9.105000ms
Reduce: 6.489000ms
PCI-E I/O: 0.998000ms
Map: 1.233000ms
Group: 9.440000ms
Reduce: 6.477000ms
PCI-E I/O: 1.006000ms
Map: 1.161000ms
Group: 9.142000ms
Reduce: 6.553000ms
PCI-E I/O: 1.016000ms
Map: 1.228000ms
Group: 8.938000ms
Reduce: 6.560000ms
PCI-E I/O: 1.171000ms
Map: 1.273000ms
Group: 9.129000ms
Reduce: 6.491000ms
PCI-E I/O: 1.053000ms
Map: 1.279000ms
Group: 9.391000ms
Reduce: 6.543000ms
PCI-E I/O: 1.039000ms
Map: 1.210000ms
Group: 9.146000ms
Reduce: 6.536000ms
PCI-E I/O: 1.039000ms
Map: 1.249000ms
Group: 8.966000ms
Reduce: 6.595000ms
PCI-E I/O: 1.042000ms
Map: 1.212000ms
Group: 9.062000ms
Reduce: 6.576000ms
PCI-E I/O: 1.047000ms
Map: 1.292000ms
Group: 9.435000ms
Reduce: 6.504000ms
PCI-E I/O: 1.044000ms
Map: 1.213000ms
Group: 9.207000ms
Reduce: 6.544000ms
PCI-E I/O: 1.051000ms
Map: 1.267000ms
Group: 8.976000ms
Reduce: 6.546000ms
PCI-E I/O: 1.054000ms
Map: 1.218000ms
Group: 9.125000ms
Reduce: 6.600000ms
PCI-E I/O: 1.063000ms
Map: 1.289000ms
Group: 9.451000ms
Reduce: 6.581000ms
PCI-E I/O: 1.058000ms
Map: 1.212000ms
Group: 9.241000ms
Reduce: 6.581000ms
PCI-E I/O: 1.051000ms
Map: 1.263000ms
Group: 8.957000ms
Reduce: 6.583000ms
all: 746.297000ms


Tesla C1060
running test suite...
==========Similarity Score=========
PCI-E I/O: 8.715000ms
Map: 56.532000ms
Cuda error in file 'MarsSort.cu' in line 978 : unspecified launch failure.
==========StringMatch=========
io-test: 50.755000ms
PCI-E I/O: 0.171000ms
Map: 1051.171000ms
all-test: 1102.357000ms
==========MatrixMul=========
generate two 1024x1024 matrice...
rotate matrix2: 57.361000ms
PCI-E I/O: 19.936000ms
Map: 294.171000ms
all-test: 389.633000ms
==========InvertdIndex=========
generating 28 MB data...
io-test: 95.979000ms
PCI-E I/O: 0.813000ms
Map: 48.929000ms
Cuda error in file 'MarsSort.cu' in line 1047 : unspecified launch failure.
==========PageViewCount=========
rm Gen
gcc -o Gen main.c
generating data...
preprocess: 190.078000ms
PCI-E I/O: 15.911000ms
Map: 38.648000ms
Cuda error in file 'MarsSort.cu' in line 978 : unspecified launch failure.
==========PageViewRank=========
rm Gen
gcc -o Gen main.c
generating data...
io-test: 220.167000ms
PCI-E I/O: 9.934000ms
Map: 7.805000ms
Cuda error in file 'MarsSort.cu' in line 1047 : unspecified launch failure.
==========WordCount=========
preprocess: 47.811000ms
PCI-E I/O: 0.134000ms
Map: 12.250000ms
Group: 3.541000ms
all: 64.294000ms
# of words:373
ACCESS - size: 7 - count: 12
ACHIEVE - size: 8 - count: 4
ADDITION - size: 9 - count: 16
ADDITIONAL - size: 11 - count: 8
ADVANCEMENT - size: 12 - count: 20
ADVANCES - size: 9 - count: 4
AFFORD - size: 7 - count: 4
AGAINST - size: 8 - count: 20
ALASKA - size: 7 - count: 16
ALWAYS - size: 7 - count: 4
==========Kmeans=========
preprocess: 2.085000ms
PCI-E I/O: 1.151000ms
Map: 2.921000ms
Group: 10.451000ms
Reduce: 8.029000ms
PCI-E I/O: 1.081000ms
Map: 2.926000ms
Group: 10.566000ms
Reduce: 6.691000ms
PCI-E I/O: 1.094000ms
Map: 2.973000ms
Group: 10.842000ms
Reduce: 6.225000ms
PCI-E I/O: 1.080000ms
Map: 2.946000ms
Group: 10.577000ms
Reduce: 5.944000ms
PCI-E I/O: 1.102000ms
Map: 2.985000ms
Group: 10.468000ms
Reduce: 5.989000ms
PCI-E I/O: 1.092000ms
Map: 2.953000ms
Group: 10.586000ms
Reduce: 5.926000ms
PCI-E I/O: 1.099000ms
Map: 3.031000ms
Group: 10.865000ms
Reduce: 5.898000ms
PCI-E I/O: 1.100000ms
Map: 2.960000ms
Group: 10.574000ms
Reduce: 5.740000ms
PCI-E I/O: 1.115000ms
Map: 2.994000ms
Group: 10.458000ms
Reduce: 5.673000ms
PCI-E I/O: 1.114000ms
Map: 2.959000ms
Group: 10.624000ms
Reduce: 5.615000ms
PCI-E I/O: 1.112000ms
Map: 3.027000ms
Group: 10.904000ms
Reduce: 5.677000ms
PCI-E I/O: 1.119000ms
Map: 2.957000ms
Cuda error in file 'MarsSort.cu' in line 1047 : unspecified launch failure.


The Tesla run seems to be hitting some errors; I will look into it on Wednesday or later.
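
The Kmeans section alone prints dozens of per-iteration lines, so comparing the two cards by eye is tedious. As a minimal sketch, assuming the log format shown in the two runs above ("==========Name=========" banners followed by "label: value ms" lines), the timings could be summed per benchmark and stage before comparing. The function name sumStageTimes and the "Section/Label" key format are invented for this example and are not part of Mars.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>

// Sums "<label>: <number>ms" lines per benchmark section.
// Keys look like "Kmeans/Reduce"; values are total milliseconds.
std::map<std::string, double> sumStageTimes(const std::string& logText) {
    std::map<std::string, double> totals;
    std::istringstream in(logText);
    std::string line;
    std::string section;
    while (std::getline(in, line)) {
        // A banner such as "==========Kmeans=========" starts a new section.
        if (line.rfind("=====", 0) == 0) {
            const std::size_t first = line.find_first_not_of('=');
            const std::size_t last = line.find_last_not_of('=');
            section = (first == std::string::npos)
                          ? std::string()
                          : line.substr(first, last - first + 1);
            continue;
        }
        // Only lines of the form "label: 12.345ms" are timing lines.
        const std::size_t colon = line.find(": ");
        const bool endsWithMs =
            line.size() >= 2 && line.compare(line.size() - 2, 2, "ms") == 0;
        if (colon == std::string::npos || !endsWithMs) continue;
        assert(line.size() >= colon + 4);  // ": " and "ms" cannot overlap
        const std::string number =
            line.substr(colon + 2, line.size() - colon - 4);
        try {
            std::size_t used = 0;
            const double ms = std::stod(number, &used);
            if (used == number.size()) {
                totals[section + "/" + line.substr(0, colon)] += ms;
            }
        } catch (const std::logic_error&) {
            // Not a number (for example part of an error message); skip it.
        }
    }
    return totals;
}
```

Feeding each card's log through this function gives one total per stage, which makes it easier to see where the two runs differ. Sections that abort with a CUDA error simply contribute the stages that completed before the failure.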

Monday, February 7, 2011

2/7 Plans going forward

It is doubtful whether the current topic can really be sped up, so I will wrap it up once it reaches a reasonable state.

To do by March
・Implement XSLT
・Make string comparison and string copy run on 32 threads
・Measure the speed of Xerces and Xalan (C++ versions)
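
The 32-thread item above presumably means spreading one string operation across a 32-thread CUDA warp. As a rough illustration only, the following C++ sketch emulates that access pattern on the CPU: each of 32 hypothetical lanes handles every 32nd byte, and the per-lane results are combined by taking the smallest mismatch index. The names kLanes, firstMismatchForLane, compareStrided, and copyStrided are made up for this example; they do not come from Mars or from the actual implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical lane count, chosen to match a 32-thread CUDA warp.
constexpr std::size_t kLanes = 32;

// Work of a single lane: look at bytes lane, lane+32, lane+64, ... and
// return the first index where a and b differ, or the common length n
// if this lane sees no difference.
std::size_t firstMismatchForLane(const std::string& a, const std::string& b,
                                 std::size_t lane) {
    assert(lane < kLanes);
    const std::size_t n = std::min(a.size(), b.size());
    for (std::size_t i = lane; i < n; i += kLanes) {
        if (a[i] != b[i]) return i;
    }
    return n;
}

// Emulates the whole warp sequentially: run every lane, then reduce the
// per-lane results with min. Returns a negative, zero, or positive value
// in the style of strcmp.
int compareStrided(const std::string& a, const std::string& b) {
    const std::size_t n = std::min(a.size(), b.size());
    std::size_t first = n;
    for (std::size_t lane = 0; lane < kLanes; ++lane) {
        first = std::min(first, firstMismatchForLane(a, b, lane));
    }
    if (first < n) {
        const unsigned char ca = static_cast<unsigned char>(a[first]);
        const unsigned char cb = static_cast<unsigned char>(b[first]);
        return ca < cb ? -1 : 1;
    }
    if (a.size() == b.size()) return 0;
    return a.size() < b.size() ? -1 : 1;  // the shorter string sorts first
}

// Strided copy: lane k writes bytes k, k+32, k+64, ... of src into dst.
void copyStrided(const std::string& src, std::string& dst) {
    dst.resize(src.size());
    for (std::size_t lane = 0; lane < kLanes; ++lane) {
        for (std::size_t i = lane; i < src.size(); i += kLanes) {
            dst[i] = src[i];
        }
    }
}
```

On a GPU the outer lane loop would disappear: each thread would run the per-lane body with its own index, and the min step would become a warp-level reduction. Neighbouring lanes touch neighbouring bytes, which is the access pattern that usually coalesces well in global memory.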

To do by mid-March
・Write up the above in a document

The write-up will be done in parallel with the implementation and evaluation.
I will also work on graph-related GPU ideas and read papers.