allocates twice the specified amount of memory, zeros it, and then times
the copying of the first half to the second half. Results are reported
in megabytes moved per second.
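The measurement described above can be sketched in C as follows. This is a minimal illustration, not the benchmark's own source: the function name copy_bandwidth_mb_per_sec, the reps parameter, the use of clock() for timing, and the choice of 1 megabyte = 1024 * 1024 bytes are all assumptions made for the example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/*
 * Hypothetical helper (not taken from the benchmark): allocate 2 * nbytes,
 * zero the buffer, then time `reps` copies of the first half onto the
 * second half. Returns megabytes copied per second of CPU time, or -1.0
 * if the allocation fails. Assumes 2 * nbytes does not overflow size_t.
 */
double copy_bandwidth_mb_per_sec(size_t nbytes, int reps)
{
    assert(nbytes > 0 && reps > 0);

    char *buf = malloc(2 * nbytes);
    if (buf == NULL)
        return -1.0;
    memset(buf, 0, 2 * nbytes);              /* zero both halves */

    volatile char sink = 0;
    clock_t start = clock();
    for (int i = 0; i < reps; i++) {
        memcpy(buf + nbytes, buf, nbytes);   /* first half -> second half */
        /* Read the destination so the copy is not optimized away. */
        sink = ((volatile char *)buf)[nbytes];
    }
    clock_t stop = clock();
    (void)sink;

    double seconds = (double)(stop - start) / CLOCKS_PER_SEC;
    if (seconds <= 0.0)                      /* below timer resolution */
        seconds = 1.0 / CLOCKS_PER_SEC;      /* fall back to one tick */

    double megabytes = (double)nbytes * reps / (1024.0 * 1024.0);
    free(buf);
    return megabytes / seconds;
}
```

A caller might invoke copy_bandwidth_mb_per_sec(8 * 1024 * 1024, 16) and print the size and the returned rate. Repeating the copy is a choice made here only to keep the elapsed time above the timer resolution.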
The size specification may end with ``k'' or ``m'' to mean
kilobytes (* 1024) or megabytes (* 1024 * 1024).
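The suffix rule can be illustrated with a short C sketch. The helper name parse_size is hypothetical, and accepting upper-case suffixes is an assumption of this example rather than something the text states.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/*
 * Hypothetical helper: convert a size specification such as "4096",
 * "512k", or "8m" into a byte count. A trailing k multiplies by 1024,
 * a trailing m by 1024 * 1024. Upper-case suffixes are also accepted
 * here, which is an assumption of this sketch.
 */
size_t parse_size(const char *spec)
{
    assert(spec != NULL);

    char *end;
    unsigned long long n = strtoull(spec, &end, 10);

    switch (tolower((unsigned char)*end)) {
    case 'k':
        n *= 1024ULL;
        break;
    case 'm':
        n *= 1024ULL * 1024ULL;
        break;
    default:                /* no suffix: plain byte count */
        break;
    }
    return (size_t)n;
}
```

With this sketch, "8m" yields 8388608 bytes and "512k" yields 524288 bytes.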
Output format is "%0.2f %.2f\n", megabytes, megabytes_per_second.
This benchmark can move up to three times the requested memory.
Bcopy will use 2-3 times as much memory bandwidth:
there is one read from the source and a write to the destination. The
write usually results in a cache line read and then a write back of
the cache line at some later point. Memory utilization might be reduced
by 1/3 if the processor architecture implemented ``load cache line''
and ``store cache line'' instructions (as well as ``getcachelinesize'').
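The 2-3 times figure and the 1/3 reduction follow from simple accounting, sketched below in C. The function name copy_traffic_bytes and the write_allocate flag are hypothetical. The model assumes every destination cache line is either read in before being written (three transfers per byte copied) or not read at all (two transfers), which is a simplification.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical model of memory traffic for copying nbytes.
 * write_allocate = 1: the destination cache line is read before it is
 *                     written, as the text says usually happens.
 * write_allocate = 0: the destination line is stored without being read
 *                     first (the ``store cache line'' case).
 */
size_t copy_traffic_bytes(size_t nbytes, int write_allocate)
{
    assert(write_allocate == 0 || write_allocate == 1);

    size_t source_read    = nbytes;                      /* read the source */
    size_t dest_line_read = write_allocate ? nbytes : 0; /* line fill on write */
    size_t dest_writeback = nbytes;                      /* later write back */

    return source_read + dest_line_read + dest_writeback;
}
```

Under this model an 8 megabyte copy moves 24 megabytes with the line fill and 16 megabytes without it, which is the 1/3 reduction mentioned above.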