Ryuz's tech blog

A tech blog about FPGAs and related topics

Benchmarking the Ultra96V2

Environment

Board: AVNET Ultra96V2
OS: Debian boot image by ikwzm

STREAM

Install & run

wget http://www.cs.virginia.edu/stream/FTP/Code/stream.c
gcc -O3 stream.c -o stream
./stream

Results on the Ultra96V2

-------------------------------------------------------------
Function    Best Rate MB/s  Avg time     Min time     Max time
Copy:            2164.1     0.074573     0.073932     0.075202
Scale:           2105.2     0.078765     0.076002     0.085328
Add:             1769.1     0.136118     0.135661     0.138638
Triad:           1691.1     0.142036     0.141922     0.142198
-------------------------------------------------------------

Reference (Core i7-4770 3.4GHz, Win10 + WSL2)

-------------------------------------------------------------
Function    Best Rate MB/s  Avg time     Min time     Max time
Copy:           16925.7     0.009898     0.009453     0.010830
Scale:          11061.1     0.015303     0.014465     0.016600
Add:            11894.2     0.021037     0.020178     0.022866
Triad:          11668.0     0.021247     0.020569     0.022306
-------------------------------------------------------------
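
When reading these tables, it helps to know how the "Best Rate" column relates to "Min time": it is the number of bytes each kernel moves divided by the shortest run. The sketch below recomputes the Ultra96V2 column under one assumption that the post does not state: that stream.c was built with its default array size of 10,000,000 double-precision elements. The names `ARRAY_SIZE`, `ARRAYS_TOUCHED`, and `best_rate_mb_s` are illustrative only.

```python
# Assumption: stream.c default of 10,000,000 elements per array (not stated in the post).
ARRAY_SIZE = 10_000_000
BYTES_PER_ELEMENT = 8  # size of a C double

# Number of arrays each kernel reads or writes per element.
ARRAYS_TOUCHED = {"Copy": 2, "Scale": 2, "Add": 3, "Triad": 3}

# "Min time" column (seconds) from the Ultra96V2 table above.
MIN_TIME = {
    "Copy": 0.073932,
    "Scale": 0.076002,
    "Add": 0.135661,
    "Triad": 0.141922,
}


def best_rate_mb_s(kernel: str) -> float:
    """Bytes moved by one pass of the kernel, divided by its best time, in MB/s."""
    bytes_moved = ARRAYS_TOUCHED[kernel] * BYTES_PER_ELEMENT * ARRAY_SIZE
    return bytes_moved / MIN_TIME[kernel] / 1e6


for kernel in MIN_TIME:
    print(f"{kernel:6s} {best_rate_mb_s(kernel):8.1f} MB/s")
```

Under that assumption the recomputed rates agree with the reported column to within rounding, which suggests the default array size was indeed used.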

Himeno Benchmark

Install & run

wget http://i.riken.jp/wp-content/uploads/2015/07/himenobmt.c.zip
unzip himenobmt.c.zip
lha e himenobmt.c.lzh
make
./bmt

Results on the Ultra96V2

mimax = 129 mjmax = 65 mkmax = 65
imax = 128 jmax = 64 kmax =64
cpu : 13.717483 sec.
Loop executed for 200 times
Gosa : 1.688752e-03
MFLOPS measured : 240.097925
Score based on MMX Pentium 200MHz : 7.440283

Reference (Core i7-4770 3.4GHz, Win10 + WSL2)

mimax = 129 mjmax = 65 mkmax = 65
imax = 128 jmax = 64 kmax =64
cpu : 0.875033 sec.
Loop executed for 200 times
Gosa : 1.688699e-03
MFLOPS measured : 3763.902847
Score based on MMX Pentium 200MHz : 116.637832
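
The MFLOPS and score lines can be cross-checked from the grid size, loop count, and CPU time printed above. The sketch below assumes two constants that do not appear in the post: 34 floating-point operations per interior grid point per iteration, and a reference value of 32.27 MFLOPS for the MMX Pentium 200MHz. Both are taken on the understanding that this is how the benchmark computes its figures, and they reproduce the reported numbers. The function name `himeno_mflops` is illustrative.

```python
# Assumptions (not stated in the post): operation count per grid point
# and the reference machine's MFLOPS used to normalize the score.
FLOP_PER_POINT = 34.0
PENTIUM_MMX_200_MFLOPS = 32.27


def himeno_mflops(imax: int, jmax: int, kmax: int, loops: int, cpu_sec: float) -> float:
    """Estimate MFLOPS from the interior grid points updated on each loop."""
    interior_points = (imax - 2) * (jmax - 2) * (kmax - 2)
    total_flop = interior_points * FLOP_PER_POINT * loops
    return total_flop / cpu_sec / 1e6


# Values from the Ultra96V2 output above.
mflops = himeno_mflops(imax=128, jmax=64, kmax=64, loops=200, cpu_sec=13.717483)
score = mflops / PENTIUM_MMX_200_MFLOPS

print(f"MFLOPS: {mflops:.3f}")
print(f"Score : {score:.3f}")
```

Substituting the Core i7-4770 CPU time (0.875033 sec.) into the same function gives that machine's figure.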

UnixBench

Install & run

git clone https://github.com/kdlucas/byte-unixbench
cd byte-unixbench/UnixBench
./Run

Results on the Ultra96V2

------------------------------------------------------------------------
Benchmark Run: Wed Nov 04 2020 22:10:27 - 22:38:40
4 CPUs in system; running 1 parallel copy of tests

Dhrystone 2 using register variables        6101883.0 lps   (10.0 s, 7 samples)
Double-Precision Whetstone                     1413.3 MWIPS (9.8 s, 7 samples)
Execl Throughput                               1484.1 lps   (29.9 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks        148614.8 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks           50324.8 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks        345562.0 KBps  (30.0 s, 2 samples)
Pipe Throughput                              400640.3 lps   (10.0 s, 7 samples)
Pipe-based Context Switching                  73303.1 lps   (10.0 s, 7 samples)
Process Creation                               3580.6 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                   2695.9 lpm   (60.0 s, 2 samples)
Shell Scripts (8 concurrent)                    837.4 lpm   (60.0 s, 2 samples)
System Call Overhead                         533879.9 lps   (10.0 s, 7 samples)

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0    6101883.0    522.9
Double-Precision Whetstone                       55.0       1413.3    257.0
Execl Throughput                                 43.0       1484.1    345.1
File Copy 1024 bufsize 2000 maxblocks          3960.0     148614.8    375.3
File Copy 256 bufsize 500 maxblocks            1655.0      50324.8    304.1
File Copy 4096 bufsize 8000 maxblocks          5800.0     345562.0    595.8
Pipe Throughput                               12440.0     400640.3    322.1
Pipe-based Context Switching                   4000.0      73303.1    183.3
Process Creation                                126.0       3580.6    284.2
Shell Scripts (1 concurrent)                     42.4       2695.9    635.8
Shell Scripts (8 concurrent)                      6.0        837.4   1395.6
System Call Overhead                          15000.0     533879.9    355.9
                                                                   ========
System Benchmarks Index Score                                         399.8

------------------------------------------------------------------------
Benchmark Run: Wed Nov 04 2020 22:38:40 - 23:06:57
4 CPUs in system; running 4 parallel copies of tests

Dhrystone 2 using register variables       24404805.8 lps   (10.0 s, 7 samples)
Double-Precision Whetstone                     5652.6 MWIPS (9.8 s, 7 samples)
Execl Throughput                               5086.0 lps   (29.9 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks        301630.2 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks           85465.1 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks        822063.7 KBps  (30.0 s, 2 samples)
Pipe Throughput                             1599503.0 lps   (10.0 s, 7 samples)
Pipe-based Context Switching                 298430.7 lps   (10.0 s, 7 samples)
Process Creation                               9903.6 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                   6816.7 lpm   (60.0 s, 2 samples)
Shell Scripts (8 concurrent)                    880.4 lpm   (60.2 s, 2 samples)
System Call Overhead                        2058190.0 lps   (10.0 s, 7 samples)

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0   24404805.8   2091.2
Double-Precision Whetstone                       55.0       5652.6   1027.7
Execl Throughput                                 43.0       5086.0   1182.8
File Copy 1024 bufsize 2000 maxblocks          3960.0     301630.2    761.7
File Copy 256 bufsize 500 maxblocks            1655.0      85465.1    516.4
File Copy 4096 bufsize 8000 maxblocks          5800.0     822063.7   1417.4
Pipe Throughput                               12440.0    1599503.0   1285.8
Pipe-based Context Switching                   4000.0     298430.7    746.1
Process Creation                                126.0       9903.6    786.0
Shell Scripts (1 concurrent)                     42.4       6816.7   1607.7
Shell Scripts (8 concurrent)                      6.0        880.4   1467.3
System Call Overhead                          15000.0    2058190.0   1372.1
                                                                   ========
System Benchmarks Index Score                                        1108.9
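
For readers unfamiliar with the UnixBench tables, each INDEX value is the result divided by the baseline and multiplied by 10, and the overall Index Score behaves like the geometric mean of the individual indices. The sketch below recomputes the single-copy Ultra96V2 score from the BASELINE and RESULT columns above. The helper names `index` and `overall_score` are illustrative, and the recomputed score should only be expected to match the reported 399.8 approximately, because of rounding.

```python
import math

# (baseline, result) pairs from the single-copy Ultra96V2 run above.
SINGLE_COPY = [
    (116700.0, 6101883.0),  # Dhrystone 2 using register variables
    (55.0, 1413.3),         # Double-Precision Whetstone
    (43.0, 1484.1),         # Execl Throughput
    (3960.0, 148614.8),     # File Copy 1024 bufsize 2000 maxblocks
    (1655.0, 50324.8),      # File Copy 256 bufsize 500 maxblocks
    (5800.0, 345562.0),     # File Copy 4096 bufsize 8000 maxblocks
    (12440.0, 400640.3),    # Pipe Throughput
    (4000.0, 73303.1),      # Pipe-based Context Switching
    (126.0, 3580.6),        # Process Creation
    (42.4, 2695.9),         # Shell Scripts (1 concurrent)
    (6.0, 837.4),           # Shell Scripts (8 concurrent)
    (15000.0, 533879.9),    # System Call Overhead
]


def index(baseline: float, result: float) -> float:
    """Per-test index: result relative to the baseline machine, scaled by 10."""
    return result / baseline * 10.0


def overall_score(rows) -> float:
    """Geometric mean of the per-test indices."""
    logs = [math.log(index(b, r)) for b, r in rows]
    return math.exp(sum(logs) / len(logs))


print(f"Dhrystone index: {index(*SINGLE_COPY[0]):.1f}")
print(f"Overall score  : {overall_score(SINGLE_COPY):.1f}")
```

Because the score is a geometric mean, a single outlier such as the Shell Scripts (8 concurrent) index moves the total less than it would in an arithmetic mean.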

Reference (Core i7-4770 3.4GHz, Win10 + WSL2)

------------------------------------------------------------------------
Benchmark Run: Sun Nov 15 2020 17:49:12 - 18:18:26
8 CPUs in system; running 1 parallel copy of tests

Dhrystone 2 using register variables       40985759.9 lps   (10.0 s, 7 samples)
Double-Precision Whetstone                     3485.8 MWIPS (16.5 s, 7 samples)
Execl Throughput                               2913.9 lps   (30.0 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks        638309.5 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks          172178.1 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks       1896399.7 KBps  (30.0 s, 2 samples)
Pipe Throughput                              917174.1 lps   (10.0 s, 7 samples)
Pipe-based Context Switching                  18669.1 lps   (10.0 s, 7 samples)
Process Creation                               6325.2 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                   9387.6 lpm   (60.0 s, 2 samples)
Shell Scripts (8 concurrent)                   2797.5 lpm   (60.0 s, 2 samples)
System Call Overhead                         514957.4 lps   (10.0 s, 7 samples)

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0   40985759.9   3512.1
Double-Precision Whetstone                       55.0       3485.8    633.8
Execl Throughput                                 43.0       2913.9    677.7
File Copy 1024 bufsize 2000 maxblocks          3960.0     638309.5   1611.9
File Copy 256 bufsize 500 maxblocks            1655.0     172178.1   1040.4
File Copy 4096 bufsize 8000 maxblocks          5800.0    1896399.7   3269.7
Pipe Throughput                               12440.0     917174.1    737.3
Pipe-based Context Switching                   4000.0      18669.1     46.7
Process Creation                                126.0       6325.2    502.0
Shell Scripts (1 concurrent)                     42.4       9387.6   2214.1
Shell Scripts (8 concurrent)                      6.0       2797.5   4662.6
System Call Overhead                          15000.0     514957.4    343.3
                                                                   ========
System Benchmarks Index Score                                         944.9

------------------------------------------------------------------------
Benchmark Run: Sun Nov 15 2020 18:18:26 - 18:46:43
8 CPUs in system; running 8 parallel copies of tests

Dhrystone 2 using register variables      179500633.3 lps   (10.0 s, 7 samples)
Double-Precision Whetstone                    32425.3 MWIPS (9.9 s, 7 samples)
Execl Throughput                              11816.5 lps   (30.0 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks        529507.7 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks          141669.7 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks       1588413.2 KBps  (30.0 s, 2 samples)
Pipe Throughput                             3805383.7 lps   (10.0 s, 7 samples)
Pipe-based Context Switching                1032506.3 lps   (10.0 s, 7 samples)
Process Creation                              25412.9 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                  26313.6 lpm   (60.0 s, 2 samples)
Shell Scripts (8 concurrent)                   3617.2 lpm   (60.0 s, 2 samples)
System Call Overhead                        2092523.2 lps   (10.0 s, 7 samples)

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0  179500633.3  15381.4
Double-Precision Whetstone                       55.0      32425.3   5895.5
Execl Throughput                                 43.0      11816.5   2748.0
File Copy 1024 bufsize 2000 maxblocks          3960.0     529507.7   1337.1
File Copy 256 bufsize 500 maxblocks            1655.0     141669.7    856.0
File Copy 4096 bufsize 8000 maxblocks          5800.0    1588413.2   2738.6
Pipe Throughput                               12440.0    3805383.7   3059.0
Pipe-based Context Switching                   4000.0    1032506.3   2581.3
Process Creation                                126.0      25412.9   2016.9
Shell Scripts (1 concurrent)                     42.4      26313.6   6206.0
Shell Scripts (8 concurrent)                      6.0       3617.2   6028.7
System Call Overhead                          15000.0    2092523.2   1395.0
                                                                   ========
System Benchmarks Index Score                                        3050.5
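
One way to read the four UnixBench runs together is to compare each machine's multi-copy Index Score with its single-copy score. The sketch below does only that arithmetic on the scores reported above. The dictionary layout and the `scaling` helper are illustrative, and the ratio is a rough indicator rather than a rigorous measure of multi-core scaling.

```python
# Index Scores copied from the UnixBench results above.
SCORES = {
    "Ultra96V2": {"copies": 4, "single": 399.8, "parallel": 1108.9},
    "Core i7-4770": {"copies": 8, "single": 944.9, "parallel": 3050.5},
}


def scaling(entry: dict) -> float:
    """Ratio of the multi-copy Index Score to the single-copy Index Score."""
    return entry["parallel"] / entry["single"]


for name, entry in SCORES.items():
    print(f"{name:13s} {entry['copies']} copies: x{scaling(entry):.2f} over 1 copy")

# Single-copy comparison between the two machines.
ratio = SCORES["Core i7-4770"]["single"] / SCORES["Ultra96V2"]["single"]
print(f"Core i7-4770 vs Ultra96V2 (1 copy): x{ratio:.2f}")
```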