Wednesday, October 22, 2008

Configurable connection string with Linq to SQL

When using Linq-to-SQL with the dbml designer, it generates an App.config file by default and puts the connection string in it. This is fine (well, not really, but at least it works) for applications, but it's a problem for class libraries: DLLs don't load their own app.config, so changing the connection string there has no effect (the application will still try to use the connection string that was used at design time).

I found a pretty neat solution by David Klein ( http://ddkonline.blogspot.com/2008/02/set-connection-string-in-linq-dbml-file.html ), which is based on Jon Gallant's solution ( http://blogs.msdn.com/jongallant/archive/2007/11/25/linq-and-web-application-connection-strings.aspx ).

P.S.: The problem is known to MSFT: http://msdn.microsoft.com/en-us/library/bb386996.aspx . It doesn't seem to bother them though...

Wednesday, July 16, 2008

Undoubtedly the most amazing technology I have ever seen

No matter what you had planned for the next 2 minutes - change your plan. Check out BigDog - The Most Advanced Quadruped Robot on Earth. It's worth it!

Sunday, May 11, 2008

Linq: Composite keys don't work + Beware of ElementAt ...

I was trying to join two lists (one being a linq-to-sql result and the other being a List<> in memory) using a composite key. I tried doing it the right way, but it just didn't work. (By the way, the "right" way is really awkward. It means you must define a new anonymous type in both queries, with the same fields. The best resource I found is here). So after the "right" way didn't work, I tried the more time-consuming way, which involves a Where inside another Where and turned out to be completely impractical performance-wise (~20K rows).

In the end, I had to do the join myself. By chance, the two lists I needed to join had the exact same number of records, so the only thing I had to do was make sure both lists were sorted in the same manner. Then I could just pair each element in one list with the element at the same position in the second list. So the code looked something like this:

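// No i++ in the for header: the inner loop already advances i past the current date group.
for (int i = 0; i < sWeights.Count(); )
{
    double val = 0.0;
    DateTime date = sWeights.ElementAt(i).Date;
    // Accumulate all consecutive rows that share the same date.
    while (i < sWeights.Count() && sWeights.ElementAt(i).Date.Equals(date))
    {
        val += sWeights.ElementAt(i).Weight * sChanges.ElementAt(i).Change;
        i++;
    }
    // Do something with date and val
}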




Now here's the deal - this code sucks! It takes AGES to complete. I searched MSDN for an indication of the running time of ElementAt, because I had a feeling this could be the problem - but it doesn't say anything about it. So I ran a test - turned the two lists into arrays and used the array indexer ([i]) and ... voila - the code completes in no time.



So now the code looks like this:


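// No i++ in the for header: the inner loop already advances i past the current date group.
for (int i = 0; i < sWeights.Length; )
{
    double val = 0.0;
    DateTime date = sWeights[i].Date;
    // Accumulate all consecutive rows that share the same date.
    while (i < sWeights.Length && sWeights[i].Date.Equals(date))
    {
        val += sWeights[i].Weight * sChanges[i].Change;
        i++;
    }

    // Do something with date and val
}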




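CONCLUSION: BEWARE - ElementAt DOES NOT guarantee anything about its running time. It is only a direct index lookup when the source implements IList<T>; for anything else it walks the sequence from the start on every call, which makes a loop over i quadratic (and a deferred query may get re-evaluated each time). So if you need to run through the whole list, it's better to create an array with the list's elements and run over the array.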
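
To make the cost visible, here is a minimal sketch that counts how many items a lazy sequence has to produce in each case. The Lazy helper and the Yielded counter are made up for illustration, and the sequence only stands in for a query result that is not an IList<T>; it is not the original linq-to-sql query.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ElementAtDemo
{
    // Hypothetical counter: how many items the lazy sequence has produced.
    public static int Yielded;

    // A lazy sequence, standing in for a result that is not an IList<T>.
    public static IEnumerable<int> Lazy(int n)
    {
        for (int i = 0; i < n; i++)
        {
            Yielded++;
            yield return i;
        }
    }

    // Index the lazy sequence directly: every call walks it from the start.
    public static int CostWithElementAt(int n)
    {
        Yielded = 0;
        IEnumerable<int> seq = Lazy(n);
        for (int i = 0; i < n; i++)
        {
            seq.ElementAt(i);
        }
        return Yielded;
    }

    // Materialize once with ToArray, then index the array.
    public static int CostWithArray(int n)
    {
        Yielded = 0;
        int[] arr = Lazy(n).ToArray();
        long sum = 0;
        for (int i = 0; i < arr.Length; i++)
        {
            sum += arr[i];
        }
        return Yielded;
    }

    public static void Main()
    {
        Console.WriteLine(CostWithElementAt(100)); // 5050 items produced
        Console.WriteLine(CostWithArray(100));     // 100 items produced
    }
}
```

With ~20K rows the first variant would produce on the order of 200 million items per list, which matches the "takes AGES" experience.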

Monday, May 05, 2008

Can't this be simpler?

I'm trying to run a Linq query which, in SQL, would look like this:

select V.Date, SUM(F.Factor/V.Change) AS Denom
from AllVols V JOIN Factors F on F.Key = V.FId
group by V.Date



The only way I found looks like this:


var denoms = from v in allVols
             join f in factors on v.FId equals f.Key
             group new {v.Date, Factor = f.Value, v.Change} by v.Date
             into g
             orderby g.Key.Date
             select new {g.Key.Date, Denom = g.Sum(d => d.Factor/d.Change)};




Is there no better way ?!?!
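
One possibly shorter form, as a sketch only: group the already-divided value instead of an anonymous type, so the Sum needs no selector. The Vol class and the Dictionary of factors below are assumptions made to get a self-contained example (the original types aren't shown), and this runs over in-memory collections rather than linq-to-sql.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DenomDemo
{
    // Hypothetical row type standing in for AllVols.
    public class Vol
    {
        public DateTime Date;
        public int FId;
        public double Change;
    }

    // SUM(F.Factor / V.Change) per date, ordered by date.
    public static List<KeyValuePair<DateTime, double>> Denoms(
        IEnumerable<Vol> allVols, IDictionary<int, double> factors)
    {
        var query =
            from v in allVols
            join f in factors on v.FId equals f.Key
            group f.Value / v.Change by v.Date into g   // group the ratio itself
            orderby g.Key
            select new KeyValuePair<DateTime, double>(g.Key, g.Sum());
        return query.ToList();
    }

    public static void Main()
    {
        DateTime d1 = new DateTime(2008, 5, 1);
        DateTime d2 = new DateTime(2008, 5, 2);
        var vols = new List<Vol>
        {
            new Vol { Date = d1, FId = 1, Change = 2.0 },
            new Vol { Date = d1, FId = 2, Change = 4.0 },
            new Vol { Date = d2, FId = 1, Change = 5.0 }
        };
        var factors = new Dictionary<int, double> { { 1, 10.0 }, { 2, 8.0 } };

        // Denom is 10/2 + 8/4 = 7 for d1 and 10/5 = 2 for d2.
        foreach (var row in Denoms(vols, factors))
        {
            Console.WriteLine(row.Key.ToShortDateString() + " " + row.Value);
        }
    }
}
```

It's still a join plus a group, so not dramatically simpler than the SQL, but the anonymous type and the lambda inside Sum go away.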

Wednesday, April 30, 2008

Linq Goodies 2 - Calculating Standard Deviation

Check out the following function, which calculates the Standard Deviation of a given list of values:

private static double calcStdev(IEnumerable<double> values)
{
    double avg = values.Average();
    return Math.Sqrt( (values.Sum(d => Math.Pow(d - avg, 2))) / (values.Count() - 1) );
}



Extra sweet...



Note that I could have replaced (d - avg) with (d - values.Average()), resulting in a single line instead of 2, but then the average would be recomputed for every element, and the performance hit isn't worth it.



It may not look very readable to a programmer, but if you look at the mathematical formula of standard deviation, the above code is much closer to it than anything I've previously seen in C*.
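
As a quick sanity check, here is a sketch of the same calculation with a worked example. The SampleStdDev name, the ToList call (to avoid enumerating the source several times) and the guard for fewer than two values are my additions, not part of the function above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class StdDevDemo
{
    // Sample standard deviation: sqrt( sum((x - mean)^2) / (n - 1) ).
    public static double SampleStdDev(IEnumerable<double> values)
    {
        // Materialize once so the source is not enumerated three times.
        List<double> data = values.ToList();
        if (data.Count < 2)
        {
            throw new ArgumentException("At least two values are required.", "values");
        }

        double avg = data.Average();
        double sumSq = data.Sum(d => (d - avg) * (d - avg));
        return Math.Sqrt(sumSq / (data.Count - 1));
    }

    public static void Main()
    {
        // Mean is 5, squared deviations sum to 32, so the result is sqrt(32 / 7).
        double[] sample = { 2, 4, 4, 4, 5, 5, 7, 9 };
        Console.WriteLine(SampleStdDev(sample)); // roughly 2.138
    }
}
```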

Linq Goodies 1 - Extracting a range from an array

Slowly but surely I'm starting to get the huge benefits Linq is bringing into our lives. Take a look at the following code snippet, which retrieves values from an array in a specified range:

var range = cData.Where((d, index) => index >= (i - 40) && index < (i));
Sweet!
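
For comparison, a sketch of the same range using Skip and Take, which stop enumerating once the range has been read instead of testing every index. The Window helper and its parameter names are hypothetical; the clamping is there so that it returns the same elements as the Where version even when i is smaller than 40.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RangeDemo
{
    // Elements with index in [end - size, end), clamped at the start of the source.
    public static IEnumerable<T> Window<T>(IEnumerable<T> source, int end, int size)
    {
        int start = Math.Max(0, end - size);
        return source.Skip(start).Take(Math.Max(0, end - start));
    }

    // The index-based Where form from the post, generalized for comparison.
    public static IEnumerable<T> WindowWithWhere<T>(IEnumerable<T> source, int end, int size)
    {
        return source.Where((d, index) => index >= (end - size) && index < end);
    }

    public static void Main()
    {
        int[] cData = Enumerable.Range(0, 100).ToArray();
        int i = 50;

        // Both forms return the elements at indexes 10..49.
        bool same = Window(cData, i, 40).SequenceEqual(WindowWithWhere(cData, i, 40));
        Console.WriteLine(same); // True
    }
}
```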

No support for static Extension Methods - bummer!

I wanted to add an extension method to Debug that would automatically write a given set of parameters separated by commas (to generate CSV files). However, extension methods can only extend instances, and Debug is a static class whose Debug.WriteXXX methods are all static - so it's not possible. Bummer!

Yet another missing feature in C#/CLR ...
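
The obvious workaround is a plain static helper class, sketched below. The DebugCsv name and its methods are made up for illustration, and the formatting is deliberately naive (no quoting or escaping of commas inside values).

```csharp
using System;
using System.Diagnostics;
using System.Globalization;
using System.Linq;

public static class DebugCsv
{
    // Joins the values with commas, using culture-independent formatting.
    public static string Format(params object[] values)
    {
        string[] parts = values
            .Select(v => Convert.ToString(v, CultureInfo.InvariantCulture))
            .ToArray();
        return string.Join(",", parts);
    }

    // Writes one CSV line to the debug listeners (compiled out of non-DEBUG builds).
    public static void WriteLine(params object[] values)
    {
        Debug.WriteLine(Format(values));
    }

    public static void Main()
    {
        DebugCsv.WriteLine(1, "a", 2.5);
        Console.WriteLine(DebugCsv.Format(1, "a", 2.5)); // 1,a,2.5
    }
}
```

It reads as DebugCsv.WriteLine(...) rather than Debug.WriteCsv(...), which is exactly the part an extension method would have made nicer.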

Monday, April 21, 2008

Excel WTF

This is an old one, but I'm always stunned by the fact that a major application such as Excel still has issues like this. I'm trying to view two copies of the same file, located in different folders (I want to check a specific cell to see if it was changed). For some obscure reason, Excel can't handle two simultaneously opened files with the same name, even if they reside in different folders (not that they could reside in the same folder, but that's beside the point).

"A document with the name 'blablabla.xls' is already open. You cannot open two documents with the same name, even if the documents are in different folders. To open the second document, either close the document that's currently open, or rename one of the documents."

Thursday, March 20, 2008

Connecting to a remote console

Say you want to connect to a remote machine with Remote Desktop (RDP) but want to get hold of the actual machine's console. That is - you want to get the session that you would have were you standing in front of the machine physically.

To do so, run the following command:

mstsc /console

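Then connect to the machine as you would with a regular RDP session. What you will get is the actual console session. Note that in newer Remote Desktop clients (version 6.1 and later) this switch was renamed to /admin.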

Thanks to Chen Avnery for this little (but helpful) info.

Tuesday, February 26, 2008

SQL Server 2005 rantings - User Defined Aggregate Functions are nice, but not there yet...

1. Why can't there be UDA's in T-SQL? Granted, it's easy to write one in CLR, but sometimes it would be simpler (and more appropriate) to write it in SQL. It also took me a while to figure out that there is indeed such a limitation...

2. UDA's must be serializable. Why? I don't know yet (still need to figure that one out), although I have some ideas, but anyway it's beside the point - it's a must and I assume there are good reasons for that. The problem is that whenever you're doing something slightly more complicated than just an average or a product, you need to accumulate all the values until you get to Terminate() (e.g. a variation on STDEV). This means that the list you've accumulated could grow significantly. Now to the pitfall - when you use user-defined serialization (which you would have to in this case), you must specify the maximum size that the UDA structure could grow to. This maximum size is limited to 8000 bytes (sounds familiar...). So in my case, I'm using a UDA over double values (8 bytes each), and thus I'm limited to aggregating a little below 1000 records. IMHO this reduces the practical usage of UDA's to about 50%...

3. I tried to write a UDA for decimal data. No matter what I did, it constantly produced a function defined to return decimal(18,0). In other words - no digits to the right of the decimal point. In the end I didn't have the time to find the KB article talking about it, but I suppose there is one - I pretty much tried everything. In my particular case using double values was an acceptable compromise - it won't always be that way...
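
To put a number on the "little below 1000" in point 2, here is a small sketch of the arithmetic. It assumes a hypothetical layout of one int count followed by the raw doubles, written with a BinaryWriter the way an IBinarySerialize.Write implementation might do it; the real layout of my UDA isn't shown here, and the class and method names are made up.

```csharp
using System;
using System.IO;

public static class UdaSizeDemo
{
    // The 8000-byte limit for user-defined serialization described above.
    public const int MaxByteSize = 8000;

    // Hypothetical layout: an int count followed by that many doubles.
    public static int SerializedSize(int valueCount)
    {
        using (var stream = new MemoryStream())
        using (var writer = new BinaryWriter(stream))
        {
            writer.Write(valueCount);
            for (int i = 0; i < valueCount; i++)
            {
                writer.Write((double)i);
            }
            writer.Flush();
            return (int)stream.Length;
        }
    }

    // How many doubles fit next to the 4-byte count.
    public static int MaxValues()
    {
        return (MaxByteSize - sizeof(int)) / sizeof(double);
    }

    public static void Main()
    {
        Console.WriteLine(MaxValues());         // 999
        Console.WriteLine(SerializedSize(999)); // 7996, fits
        Console.WriteLine(SerializedSize(1000)); // 8004, over the limit
    }
}
```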

Thursday, February 14, 2008

Learning Machine Learning - The WEKA Way

If you're interested in working with or learning about Machine Learning, you really MUST check out WEKA. When I first saw WEKA, a few years ago, it looked like a cute tool to start learning ML, with a very small set of implemented algorithms and only available for Java developers. Now, it has become a very rich research platform, in which one can easily test a very wide variety of ML algorithms with endless tuning parameters and analysis tools. You can read data directly from a database and you can now even run WEKA directly from within your .NET code (check also this) !!!!!

I'm a complete newbie with WEKA, but it seems that it's going to be a lot of fun and much faster to work with than anything I did before. I just hope it will live up to the expectations that are building up in me now...

One more thing - notice that there is a "book version" and a "developer version". The former is the one on which their book is based and is not expanded (only bug fixes). The latter is the version that is under constant development, has more features, and significantly more implemented algorithms.

Tuesday, February 12, 2008

WLW - Didn't they hear about 64-bit ???

When opening WLW it says that the Beta has expired and forwards me to download the new version. When I do that - I get a message that it is not supported on 64-bit Windows (I'm using XP 64-bit).

Hum, what?

1. 64-bit is alive and kicking and getting more and more users. It's time that software companies (MS being one, IMHO) get used to providing support for 64-bit platforms by default.

2. If the new version does not support my platform - why send me to download it and waste my time and nerves?

Grrrrr...