Every now and then, we need to perform mundane operations that are very
simple but don't have a built-in function in the language. So we write some
ad-hoc code, maybe even copy something from StackOverflow and are done with it.
What we sometimes fail to notice, however, is the affect this has on
performance.
This entry will focus on splitting an array, but this is relevant for
other operations as well.
The time it takes to perform an operation is only important if the code
is in a time-critical section and/or is performed a large amount of times. If
this is not the case, whatever implementation you choose will probably be OK.
Having said that, let's look at some of the ways to split an array.
We'll use a medium sized byte array for this purpose (256), splitting it to a
short array (16 bytes) and the rest.
Assuming you are using .Net 3.5+, the most natural way to go about this
is to use LINQ:
private static void SplitArayUsingLinq(byte[] data) { byte[] first = data.Take(16).ToArray(); byte[] second = data.Skip(16).ToArray(); }
As you can see, the method is very elegant, only 1 line of code required to create each part. In addition, it seems like using Take and Skip on a small number of runs can't cause a performance issue.
However, running the above code a 1,000,000 times takes around 10
seconds, which is a considerable amount of time.
Let's compare the LINQ version to a good old for loop:
private static void SplitUsingForLoop(byte[] data) { byte[] first = new byte[16]; for (int i = 0; i < first.Length; i++) { first[i] = data[i]; } byte[] second = new byte[data.Length - first.Length]; for (int i = 0; i < second.Length; i++) { second[i] = data[i + first.Length]; } }
This yields a run time of less than 2 seconds – more than x5 improvement! The looping method seems to be much more efficient. Let's try to improve this some more.
If we do our Googling right, we find that copying arrays actually is a
library function – Array.Copy. Let's test this:
private static void SplitArrayUsingArrayCopy(byte[] data) { byte[] first = new byte[16]; Array.Copy(data, first, first.Length); byte[] second = new byte[data.Length - first.Length]; Array.Copy(data, first.Length, second, 0, second.Length); }
We get a result of 250ms – another x8 improvement, a total of x40
compared to the LINQ version!
Digging even dipper and only in the case of byte[], we can use another
method called Buffer.BlockCopy that actually performs a low level byte
copy:
private static void SplitArrayUsingBlockCopy(byte[] data) { byte[] first = new byte[16]; Buffer.BlockCopy(data, 0, first, 0, first.Length); byte[] second = new byte[data.Length - first.Length]; Buffer.BlockCopy(data, first.Length, second, 0, second.Length); }
Now the results are 180ms, which is yet another improvement, albeit not
as dramatic as the previous ones.
In conclusion:
Method
|
Time
for 1,000,000 splits (s)
|
Improvement
factor
|
LINQ
|
10.2
|
-
|
for loop
|
1.93
|
x5
|
Array.Copy
|
0.25
|
x40
|
Buffer.BlockCopy
|
0.18
|
x56
|
Kids, don't trust LINQ blindly for what really matters J
.GetRange is quite fast, which is a LINQ extension.
ReplyDelete