Sunday, May 11, 2008

Linq: Composite keys don't work + Beware of ElementAt ...

I was trying to join two lists (one being a linq-to-sql result and the other being a List<> in memory) using a composite index. I tried doing it the right way, but it just didn't work. (By the way, the "right" way is really awkward. it means you must define a new anonymous type in both query, having the same fields. The best resource I found is here). So after the "right" way didn't work, I tried the more time-consuming way, which involves a Where inside another Where and turned out to be completely irrelevant performance-wise (~20K rows).

In the end, I had to do the join by myself. By chance, the two lists I needed to join had the exact same number of records, and the only thing I had to do was to make sure both lists are sorted in the same manner. Then I could just join each element in one list with the element at the same position in the second list. So the code looked something like that:

for (int i = 0; i < sWeights.Count(); i++)
{
double val = 0.0;
DateTime date = sWeights.ElementAt(i).Date;
while (i < sWeights.Count() && sWeights.ElementAt(i).Date.Equals(date))
{
val += sWeights.ElementAt(i).Weight * sChanges.ElementAt(i).Change;
i++;
}
// Do something with date and val
}




Now here's the deal - this code sucks! It takes AGES to complete. I searched MSDN for an indication about the running time of ElementAt, because I had a feeling this could be the problem - but it doesn't say anything about it. So I made a test - turned the two lists into arrays and ran using an array selector ([i]) and ... voila - the code completes in no time.



So now the code looks like this:


for (int i = 0; i < sWeights.Length; i++)
{
double val = 0.0;
DateTime date = sWeights[i].Date;
while (i < sWeights.Length && sWeights[i].Date.Equals(date))
{
val += sWeights[i].Weight * sChanges[i].Change;
i++;
}

// Do something with val
}




CONCLUSION: BEWARE - ElementAt DOES NOT guarantee anything about its running time, so if you need to run through the whole list, it's better to create an array with the list's elements and run over the array.

No comments: