Tuesday, October 31, 2006

"The District" - WTF

I was taking a break, eating a salad in front of the TV, watching half an episode of "The District". There was a whole fuss going on about a kidnapped little girl. The parents got the ransom email, the ultra-smart police-guys analyzing everything on the spot, etc.
Suddenly one of the geeks says: "We've got a partial trace - it's been sent from an anonymous server using a 256-bit key". Then the father's associate jumps up, saying: "256... Hey, that's the software we've developed!".
And then of course they concluded that the kidnapper must be one of their past employees... (I didn't see whether they were right - it's commercial time now)

Need I say more?

Networked-RNG

True random number generators are notoriously hard to create. Most computer programs use pseudo-random number generators (PRNGs), which are, in fact, not random at all. They simply apply a mathematical function (most commonly an LCG) to a seed based on the system's clock.
For security applications, these simple PRNGs are bad bad bad - they create a giant hole in the security mechanism through which hackers can fairly easily penetrate. So entropy-based RNGs are usually used for cryptographic purposes (for example Windows' CryptGenRandom). These RNGs try to collect entropy from as many sources as possible: keyboard events, mouse events, hard-disk events, network events, etc.
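To see why a clock-seeded LCG offers no security, here is a minimal sketch in Python (the constants are the classic Park-Miller "minimal standard" generator, chosen for illustration - not any particular product's values):

```python
# Minimal linear congruential generator (LCG): x_{n+1} = (a * x_n) mod m.
# Constants are Park-Miller's "minimal standard" generator.
class Lcg:
    def __init__(self, seed):
        self.state = seed

    def next(self):
        self.state = (16807 * self.state) % 2147483647
        return self.state

# Seeding from the clock means an attacker who guesses the start time
# can regenerate the exact same "random" sequence:
victim = Lcg(seed=42)        # imagine 42 came from the system clock
attacker = Lcg(seed=42)      # the attacker guessed the seed
assert [victim.next() for _ in range(5)] == [attacker.next() for _ in range(5)]
```

The whole generator fits in a handful of lines, and the entire future stream follows deterministically from one leaked or guessed state value.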
Linux also has such an entropy-based RNG, but it has been shown to be weak in some circumstances.
To my knowledge, Windows' counterpart (CryptGenRandom) hasn't yet suffered such a fate...
BTW - an important problem with cryptographically strong RNGs on a computer is that sometimes they must wait for enough entropy to accumulate before they can return a new random number, causing the application to stall (and making it difficult to generate a large amount of random numbers at short intervals).
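In practice, an application usually just asks the operating system's entropy-based generator for bytes rather than harvesting entropy itself. In Python, for instance, os.urandom is the portable doorway to that pool (on Windows it has historically been backed by CryptGenRandom; on Linux by /dev/urandom, which, unlike /dev/random, does not block waiting for fresh entropy):

```python
import os

# Ask the operating system's entropy-based CSPRNG for raw bytes.
key = os.urandom(32)     # 256 bits of cryptographic randomness
print(len(key))          # -> 32
# Two independent draws will, for all practical purposes, never collide.
print(key == os.urandom(32))
```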
Would it be possible to have a really random RNG?
Well, there are various physical phenomena (such as some nuclear processes) that could be used, but it's complicated to incorporate this into every PC :-)
An idea came to my mind - imagine we had a web service generating simple random numbers, based on previous calls to the service (and maybe a few other entropy sources). Then each user would get a random number that is influenced by some other user, completely unknown to her. The main idea behind this is that the randomness is acquired from the various independent requests, making it essentially random.
A simulation of such a generator could easily be created using Blogspot's "Next Blog" link, where you are randomly forwarded to another blog. Take the link you got (or some information inside the blog), hash it somehow (MD5, say - its collision resistance has actually been broken, but that hardly matters for this purpose), and what you get is really random (I checked - a blog doesn't refer twice to the same blog).
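To make the sketch concrete: fetch whatever URL "Next Blog" redirected you to and hash it. In the toy Python version below, the fetching is faked with a hard-coded list of made-up URLs, since the point is only the hashing step:

```python
import hashlib

# Hypothetical redirect targets, standing in for Blogspot's "Next Blog" link.
visited = ["http://example-blog-1.example", "http://example-blog-2.example"]

def random_bytes_from_navigation(urls, n=16):
    """Hash externally-observed navigation data into n pseudo-random bytes."""
    h = hashlib.md5()   # the post suggests MD5; any hash function would serve
    for url in urls:
        h.update(url.encode("utf-8"))
    return h.digest()[:n]

bits = random_bytes_from_navigation(visited)
print(len(bits))  # -> 16
```

The quality of the output stands or falls with the unpredictability of the navigation data, which is exactly the fuzzy part of the idea.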
This is all very fuzzy, and I'm sure it's flawed in many ways. Yet I think that the principle could be interesting - using user navigation information to provide an RNG service over the web.
And what about performance - having to execute a web request to get one random number isn't very nice!? Well, I didn't say it's perfect, did I? Yet it could (maybe) be extended to return pools of random numbers, thus requiring far fewer web calls. Also, if you're already running a web site, you may be able to use your own users' information to build a local RNG just for yourself.

OK, enough babbling, ciao!

Afraid of moving to IE7

I know that using IE is not very "geeky", but I don't care - I'm an IE user. I've tried Firefox several times (including the latest version 2), and even installed the latest Opera browser. Both had real difficulties coping with my banks' sites (Opera remained blank, and Firefox often got completely stuck, sometimes taking my whole machine down with it!). I want to use one single browser - without needing to remember which browser works best with which site. I know that the culprits are (probably) the web developers and not the browsers, but I don't care - the bottom line is that I can't use either of them for all my browsing tasks.
Conclusion - I have to keep using IE.
Now IE7 is out, and it looks pretty promising - enhanced security, tabs (finally), RSS, etc. Not as rich as Firefox or Opera, but still - much better than IE6. The problem is that I don't know whether the sites I usually visit properly support IE7, and I really don't feel like counting on being able to uninstall it in case of problems.
Dilemma....

Things I'm missing in C#/CLR

Although C# is a great language, I'm still missing some features. I know it's mostly a matter of CLR limitations, but I miss them nonetheless.

  1. Signature-free Delegate - imagine you have a system that works intensively with configuration/settings files, loading stuff at runtime. Now you want this system to be able to dynamically receive a delegate from one source and parameters from another source, and run that delegate with those parameters. At compile time, the only thing you know is that you have to run some delegate - you don't know its signature. I'd like a way to define a delegate that accepts any signature, and a technique to dynamically call it with whatever parameters I get. Of course it means that important stuff cannot be checked at compile time, but sometimes it's the best way to provide a generic solution (if you're interested, I could post an example of what I mean in some later post).
  2. Multiple Inheritance - I know, this is almost a theological issue (much like the Good/Bad Agile debate going on lately). I also admit that it doesn't happen a lot that I really need multiple inheritance. Yet, sometimes I do, and at those times, I'm really p'd off for not having the option (and don't give me the brainwashed crap fed by Microsoft that you never need it and can always use some pattern to work around it!). BTW - Eiffel.NET has full multiple inheritance support.
  3. Method Signatures - as in C++, methods can't be overloaded by return type alone. That sucks, plain and simple.
  4. Multiple Return Values - who said a method must return only one value? If I want to divide two numbers and get both the integer result and the remainder - why do I need to get one in the return value and the other as an "out" argument?
  5. Interactive Scripting - I've written enough about that.
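For contrast, items 1 and 4 are native in dynamic languages. Here is a rough Python sketch of both wishes (in .NET terms, the closest existing tools would be Delegate.DynamicInvoke for item 1 and "out" parameters for item 4):

```python
# Item 1: a "signature-free delegate" - store any callable plus whatever
# arguments arrived from configuration, and invoke it dynamically.
def run_dynamically(func, args):
    return func(*args)

# Item 4: multiple return values - divmod returns the quotient AND the
# remainder, exactly the division example from the post.
quotient, remainder = run_dynamically(divmod, (17, 5))
print(quotient, remainder)  # -> 3 2
```

The appeal is that run_dynamically never declares a signature: the configuration layer can hand it any callable and any argument list, at the price of losing all compile-time checking.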

Monday, October 30, 2006

.NET 2.0 Tip: Strongly Typing Configuration Settings

Everybody say thanks to Sahil Malik for his cool tip!

Sunday, October 29, 2006

An Amazon WTF

(From Amazon.com)

"Monte Carlo and Quasi-Monte Carlo Methods 2004 by Harald Niederreiter and Denis Talay (Paperback - Dec 31, 1899)"

Unless the writers were able to travel in time, I really don't see how they could publish a book in 1899 about stuff they worked on in the 21st century...

P.S: I would have posted an image, but it's such a pain with Blogger that I try to avoid it as much as possible :-)

Joke of the day

A lawyer dies and arrives at the gates of heaven. He immediately starts making trouble, claiming he wasn't supposed to die. Eventually, he manages to speak to one of the angels in charge and says: "I don't understand what happened. I was happily sitting in my office, working on a very important case. Then, out of the blue, I just died". The angel, not very impressed, opens his laptop, looks up the lawyer's file and reads in a monotone: "I'm sorry mister, it says here that you died of old age". "But I'm only 35 years old!!!" - replies the crying lawyer.
"Well, that may be right, but if you count the hours you've billed your clients, you've reached the incredible age of 187..."

Wednesday, October 25, 2006

Scripting in C# - take 3

I've been ranting about the lack of scripting capability with C# (here and here).

Another example of what I meant could be M# (via Larkware) and F# - both having an interactive interface.

Why, oh why is there no such thing for plain old C#?

[UPDATE] After inquiring with Extreme Optimization, it turns out that I misunderstood their site. They DON'T have an interactive scripting capability (yet), so only F# has it, not M#...

Sunday, October 22, 2006

Could we benefit from two mice?

We've been using a mouse and a keyboard for ages now. It's an axiom with contemporary computers: computer, monitor, keyboard, mouse.

Did you ever consider using two pointing devices instead of one? With one mouse, you can point with one hand and keep the other on the keyboard. But in most cases you're so used to typing with two hands that you're mostly incapable of using your keyboard with one hand anyway.

I'm used to using both my hands simultaneously, sometimes doing different things with each. Could there be a way to put that into action with two pointing devices?

Friday, October 20, 2006

Scripting in C# - take 2

Yesterday I complained that I want a way to run scripts in C#. Funnily enough, at approximately the same time, Leon Bambrick (The Secret Geek) published an add-in to VS (for VB) that does part of what I'm searching for. This add-in lets you mark some VB code and execute it on the fly. That's great, and it's part of what I am looking for (although his add-in is VB-only, and I'm looking for a solution for C#). What I want in addition to that is a command-like UI where every expression you type gets executed once it's completely written.

Here's an example of a scripting session as I perceive it (Matlab users will probably feel at home):

>> int number = 5;
ans:
     number = 5
>> for (int i=0; i < number; i++)
     {
         Console.WriteLine(i.ToString());
     }
ans:
     0
     1
     2
     3
     4
>> $Clear(number);
ans:
     number cleared from memory

>> class MyClass
     {
          public void MyMethod(string input) 
          {
                    Console.WriteLine(input);
          }
     }
ans:
     MyClass declared successfully

>> MyClass myObject = new MyClass();
ans:
     myObject created successfully

>> myObject.MyMethod(1234);
ans:
     Type error - MyClass.MyMethod(string input) does not accept input (1234)

>> myObject.MyMethod("Hello World!");
ans:
     Hello World!

A few notes:

  1. As you can see above, expressions can span more than a single line. The engine simply waits for the expression to be completed (a terminating semicolon or closing brace), so the for loop or the class declaration are executed only when the closing brace is typed.
  2. Although I wouldn't use this for very large and complicated classes, there should be support for classes and any other construct of the language.
  3. Each expression ends with a feedback from the engine about its execution (the "ans" regions - taken from Matlab)
  4. Errors don't throw exceptions, but rather give you as meaningful an explanation as possible
  5. Since the engine must keep some data in memory, we need to be able to free it at will. That's what the $Clear command would do. I suppose we might need support for a few more such scripting commands (the fewer the better).
  6. Once you have such a scripting engine, evaluating portions of some code (like Leon's add-in) becomes trivial.
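The "wait until the expression is complete" behavior from note 1 is exactly what existing REPLs already do. Python, for one, exposes that logic directly in its standard library, so a tiny sketch of the mechanism costs only a few lines:

```python
import codeop

# codeop.compile_command returns None while the input is still incomplete,
# and a code object once it can be executed - the same "wait for the
# closing brace" behavior the hypothetical C# engine would need.
print(codeop.compile_command("for i in range(3):") is None)  # -> True (incomplete)

code = codeop.compile_command("for i in range(3):\n    print(i)\n\n")
exec(code)  # -> prints 0, 1, 2
```

A C# engine would need the analogous check (balanced braces, terminating semicolon) before handing the buffered text to the compiler.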

P.S: For all those who say - "Dude, with PowerShell you can do that and much more", I say: "Dude, as long as it's in another language it's not the same. It means I can't take it for granted that any C# programmer will know the scripting language as well as he knows C#. It also means I can't easily copy code from my scripting console into my code - I have to first translate it into C#. So no - unfortunately PowerShell is still not there in terms of ease-of-use."
Still about PowerShell - I have nothing against PowerShell. Actually, it was about time decent scripting became available for Windows users (and sure enough PowerShell is way beyond decent!!!). What I don't understand, though, is why there was a need for a new language. Wouldn't it have been nicer to have an additional set of libraries supporting the various functionality available in PowerShell (WMI, IIS, file system, etc.) - add these libraries to the .NET Framework and add a scripting engine as described above? Everyone would be able to pick his favorite language and use it for scripting. Granted, PowerShell provides some functionality with just a single command that would require many more lines of code in any other language. But I think that's not a good excuse - you can always wrap those commonly-used functionalities inside some static function, providing the same final result (a single function call doing a rather complicated task).

RTF to HTML converter - for posting code in blogs (via Mike Stall)

Mike Stall wrote a very nice and simple tool to convert RTF to HTML and easily post code in blogs.

It's a really cool tool, much simpler than anything else I could find. I had to make a few changes:

1. To escape ampersands as well, you must add the following line at the beginning of the Escape method (note that the entity needs a trailing semicolon, and that this replacement must come before the others):

	st = st.Replace("&", "&amp;");

2. I personally don't like having to go through a concrete file (Mike's code writes the result to an "out.html" file). I want the RTF data on the clipboard to be converted to HTML inside the clipboard. So instead of writing the converted data to a StreamWriter, I used a StringWriter and set the result to the clipboard:


using (StringWriter sw = new StringWriter())
{
    Format(sw, data);
    sw.Close();
    Clipboard.Clear();
    Clipboard.SetText(sw.ToString());
}
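Incidentally, the reason the ampersand replacement belongs at the *beginning* of the Escape method is easy to demonstrate: if "<" is escaped before "&", the second pass mangles the "&lt;" just produced. A quick Python illustration of the two orderings:

```python
def escape_wrong_order(s):
    s = s.replace("<", "&lt;")
    return s.replace("&", "&amp;")   # too late: corrupts the &lt; above

def escape_right_order(s):
    s = s.replace("&", "&amp;")      # ampersands first
    return s.replace("<", "&lt;")

print(escape_wrong_order("a < b"))   # -> a &amp;lt; b   (broken)
print(escape_right_order("a < b"))   # -> a &lt; b       (correct)
```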

Et voilà - I defined a SlickRun shortcut to run this little utility easily, and I can now post formatted code in no time.

Thanks Mike!!!

Thursday, October 19, 2006

I want an Immediate Window now!

You know the Immediate Window in Visual Studio? If you don't, I hope you'll start working with it right after reading this post - it's a MUST for debugging. Simply put (and it's simple, really), it allows you to evaluate expressions against the currently running application. These expressions are written in plain (say C#) code syntax and can be much more elaborate than just querying a property (like the Watch window). It's almost like a scripting tool, available while you're debugging, that compiles on the fly.

That's exactly where my problem is - it's only available at run-time. What I would like is a scripting engine that would receive C# code, compile it on the fly, and give me the output. And I want it from the command line (with intellisense and all)!!!

I know, I know, there is Powershell and a whole bunch of other scripting tools. There is also SnippetCompiler, which is as close as you can get if you want to stick to C# syntax.

I am lazy, and I don't like having to learn yet another scripting tool to get things done. I have been using Matlab a lot lately, which made me realize how much simpler life is when you can use the same syntax for scripting and for writing applications. Matlab is a scripting language. With the Matlab command line you can do anything you can also do in m-files (files holding Matlab code). So when you just want to test a few things, you can check them at the command line, and once you get results you like, you can copy-paste them into some persistent files (which you can then embellish and structure). When you have a lot of testing going on, it's a really cool feature. It's also nice if you just want to see the result of something, without needing to persist it into a file at all.

I want this, and I want it now!!!

Matlab dual core 100% or 50% CPU

[UPDATE: If you're searching for ways to better use your multiple cores, the R2007a release seems to support multithreaded computations. More on this on my recent post.]

I recently discovered that A LOT of my blog visitors came by searching things like "matlab dual core 100% CPU" (sometimes it's 50%). They are redirected to a post I wrote half a year ago. Apparently, people are having a lot of problems using Matlab properly on a dual-core machine. I've been contacted by email by a few people, all complaining of the exact same problem: they recently purchased a dual-core machine. When the machine starts up, a matlab.exe process takes up 50% CPU and doesn't release it (before Matlab is even opened). They then open Matlab, which creates another instance of matlab.exe, for which only one free core remains. And that's not the end of it - the second matlab.exe process is as CPU-hungry as the first one. Even after the initialization stage is over, it keeps taking up 50% CPU.

Bottom line - it takes a long time for the machine to start up, and once they open an instance of Matlab, the computer starts crawling like an old 486 machine, stuck with 100% CPU usage.

I don't know whether the problem is with Windows XP or with Matlab. Windows has a known issue with dual-core machines, described in Microsoft's knowledge base, but for some reason the hotfix isn't publicly available.

If I find out the reason or a workaround, I'll post it. If any of the readers who reach this post find a solution - please let me know or write it in the comments of this post, so others won't have to go through the same ordeal.

Sunday, October 15, 2006

Here are some helpful ways to get along at the workplace

(Just got this from a friend by email. I don't know the real source, so I can't give credits. Anyway, it sounded so much like things I know that I had to post it. Have fun!)

Don’t be irreplaceable, if you can’t be replaced, you can’t be promoted…
The more crap you put up with, the more crap you are going to get…
Everything can be filed under “pending.”…
If it wasn’t for the last minute, nothing would get done…
When you don’t know what to do, walk fast and look worried…
The last person that quit or was fired will be held responsible for everything that goes wrong…
Never criticize or boast, call it ‘information sharing’…
Never call something a failure or mistake, it's a 'positive learning experience'…
Never argue, have an ‘adult conversation’…
If you can’t get your work done in the first 24 hours, work nights…
A pat on the back is only a few centimeters from a kick in the butt…
It doesn’t matter what you do, it only matters what you say you’ve done and what you’re going to do…
After any salary raise, you will have less money at the end of the month than you did before…
You can go anywhere you want if you look serious and wear a lab coat…
Eat one live toad the first thing in the morning and nothing worse will happen to you the rest of the day…
When the bosses talk about improving productivity, they are never talking about themselves…
If at first you don’t succeed, try again. Then quit. No use being a damn fool about it…
There will always be beer cans rolling on the floor of your car when the boss asks for a ride home from the office…
Keep your boss’s boss off your boss’s back…
Never delay the ending of a meeting or the beginning of a cocktail hour…
To err is human, to forgive is not our policy…
Anyone can do any amount of work provided it isn’t the work he/she is supposed to be doing…
Important letters that contain no errors will develop errors in the mail…
If you are good, you will be assigned all the work. If you are really good, you will get out of it…
You are always doing something marginal when the boss drops by your desk…
People who go to conferences are the ones who shouldn’t…
At work, the authority of a person is inversely proportional to the number of pens that person is carrying…
Following the rules will not get the job done…
Getting the job done is no excuse for not following the rules…
When confronted by a difficult problem you can solve it more easily by reducing it to the question, “How would the Lone Ranger handle this?”…
No matter how much you do, you never do enough…

If you don’t know what it is, call it an ‘issue’…
If you don’t know how it works, call it a ‘process’…
If you don’t know whether it's worth doing, call it an ‘option’…
If you don’t know how it could possibly be done call it a ‘challenge’ or an ‘exciting opportunity’…
If you want to confuse people, ask them about ‘customers’…
If you don’t know how to do something, ‘empower’ someone else to do it for you…
If you can’t take decisions, ‘create space’ for others to operate…
If you need a decision, call a ‘workshop’ to ‘network’ and ‘ground the issue’, followed by an ‘awayday’ to ‘position the elephant in the room’ and achieve ‘buy-in’…

Saturday, October 14, 2006

Reflexology to help with pooping

Here comes a somewhat weird post about babies and pooping. Don't complain I didn't warn you!

Our baby daughter, Tamar, is now 9 days old. She is feeding on breast milk only and things are generally going pretty well. In the past 2 days she has started to be restless, complaining even while sleeping, evidently suffering from gas. She has also stopped pooping completely. This wouldn't be surprising, since breast-fed babies poop less, but it was obvious she needed to relieve herself and couldn't. Also, breast-fed babies rarely suffer from constipation, so this is not a very common situation. This morning, after she had passed a restless, poop-less night, we decided to search for a way to help her. Obviously, the stomach massages we had been giving her over the last days didn't help - a more drastic measure was necessary...

The desperate measure for such situations is using a rectal thermometer. The physical touch on the anal region is a very strong poop-generator. And, sure enough, after having taken this measure, before I had the chance to put her diaper on, she started pooping and pooping like there's no tomorrow. 5 diapers and 20 minutes later, she was done.

Yet that was in the morning, and as the day went on, she became restless again. What next? We can't keep pushing thermometers into her anus - it could easily lead to some sort of dependency, where her system can't function without a physical trigger. But she's suffering - what can we do?

After calling friends, family and just about anyone we thought might be able to help, and searching throughout the Internet, the most promising finding was reflexology. Apparently, massaging the middle inner part of the foot clockwise stimulates the whole digestive system and helps pooping. We didn't put much hope in it, but what did we have to lose? It took less than 10 seconds of massage for the crap to get loose!!! It's like unlocking a door. A M A Z I N G !!! (By the way - note that doing the massage counterclockwise has exactly the opposite effect; it's apparently excellent for diarrhea problems.)

Friday, October 13, 2006

QA Day

I recently posted why I think that excelling in QA and excelling in programming are mutually exclusive. Today I'd like to introduce a tool I find very useful for a software project's evolution, even if it may seem to contradict my previous claims (it doesn't really contradict them, it just seems to)...

On QA Day all programmers become QA engineers for one day. Typically, the QA team/department leader is responsible for dividing the work between her temporary "employees". She's the actual boss for the day, and all programmers (including team leaders) must follow her orders. With large teams, some of the QA engineers move between the programmers to help them out with things they don't know; the rest participate in the testing alongside the programmers. A primary requirement is that a programmer tests the parts of the application she hasn't written. I know that the previous statement was in bold, but I want to emphasize it again: putting a programmer to test a feature she has written would be a complete waste of time and miss the whole point of QA Day.

Which brings us to the inevitable question - what's the point? Why would you waste the programmers' valuable time doing stuff I claim myself they cannot be really good at (assuming you have really good programmers)?

  1. By putting a lot of effort into QA for a short period of time, you can often produce a rather good overview of the real fitness of your application. When done at the right time, QA Day is an invaluable tool to find out where you stand and how you are going to meet your deadlines. Do it too early, and large parts of your release will be left uncovered. Do it too late, and you might find yourself with many more bugs than time allows you to handle. Do it on time, and you'll know where you stand and have the time to act accordingly - in one day!!!
  2. All programmers know the features they work on perfectly (hopefully). They are usually also well acquainted with some other portions of the application. We all have, however, "black zones" - parts of the application we don't know at all (even at a user level). By forcing programmers to test their "black zones" you gain twice:
    1. The programmers get to know the whole application to a much better degree, which will open their minds and let them see the "whole picture". This in itself could be reason enough to do a QA Day once in a while.
    2. Since they are not familiar with the features they are testing, they are a pretty good simulation of a "dumb user". In most cases you will see that programmers testing something in their "black zones" tend to behave like the most stupid of users. I don't know why it is, but it's so - I tend to ask the most stupid of questions in such situations, and so does every programmer I know. This level of "innocence" exists with QA engineers the first time they work on a feature. However, it evaporates extremely fast, leaving you with no way to really see how a new user would react. The programmer is a pretty good tabula rasa for this.
  3. For weeks/months/years your application is tested by the same people over and over again. QA Day is an excellent way to refresh that - programmers will find errors in the test procedures, spot missing parts, propose improvements, etc. Good programmers are lazy people. As such, they will always try to find ways to work less. The idea is to leverage this characteristic to improve the test procedures.
  4. Programmers tend to underestimate how hard QA is. After one QA Day, even the most boastful programmer will start to see things differently. It is also an invaluable way of improving communication between QA engineers and programmers, and it yields better cooperation in the future.

There are, of course, several issues you must consider prior to QA Day to ensure its success:

  1. Organization - people should know in advance when the QA Day is, all test procedures must be ready and assigned beforehand, etc. A thorough explanation of why we're doing it and what it's good for is a must.
  2. Timing, as I mentioned above.
  3. Short day - try to organize it such that people would finish their work relatively early. This has two reasons:
    1. It's nice to finish early, and all other things being equal - people will remember it favorably.
    2. There are always unexpected problems, which can cause serious schedule slips.
  4. Choose the QA engineers responsible for assisting the "testers" wisely - they should know the application thoroughly, explain things simply and quickly, etc. They must constantly move among the people and make sure everybody can do his job and nobody is stuck with some stupid configuration problem.
  5. Use a simple bug-report mechanism. You don't want to waste the whole day only because the programmers don't know how to report a bug.
  6. The Achilles heel of QA Day is that it might meet antagonism from both sides (QA and programming). Deal with it up front by explaining its importance and turning it into a fun day. A nice add-on is to make it a tournament - the programmer who finds the most bugs wins.

I've had the opportunity to work in a company that had QA Days at some point before each important release. It's one of the best organizations I've ever worked for. The relationship between testers and programmers was amazing. So was the level of knowledge of the whole system by everyone.

I truly believe it's an excellent tool for improving the overall performance of the development effort.

Thursday, October 12, 2006

Working in competition

When is it good to create a competitive environment for a project? When would you assign several persons/teams/departments with the same task and select the best solution presented to you?

I think it is sometimes a good idea, but there are a few conditions:

  1. All participating entities (persons/teams/departments/whatever) must be mutually independent. If you create competition between entities that need each other's help to succeed, you'll just harm their chances of getting good results and increase the time it takes to get even those sub-optimal results.
  2. There must be a different general approach for each entity. For example, one team might provide a web-based solution while the other provides a smart-client solution. If two entities use the same approach, you stand a much better chance of succeeding if you put them together.
  3. (Optional) The task should have an intrinsic reward. If working on the task is a reward in itself, or if the results are a reward even if you don't win the competition - then everybody wins, no matter what. An excellent example is a competition of "the best-looking room" in the office. The more you invest, the better your room will look and the nicer it will be for you to work there, regardless of your ranking in the competition.

Can you think of anything else?

Friday, October 06, 2006

Father of two

My wife (a super-hero if you ask me) gave birth yesterday to our second daughter - Tamar. Everything went fine - it went lightning fast and both feel great (well - as great as possible).

I'll be taking a blog-holiday for a while, until we get settled and my adrenaline gets back to normal.

Wednesday, October 04, 2006

Bloggers ARE rockstars

Roy recently posted his opinion about the decision of U2U from Belgium to avoid working with Israeli companies for political reasons (you can read my own opinion here). Being a famous technical blogger, Roy managed to attract a fair number of responses, which can be roughly categorized as follows:

  1. People who agree with him
  2. People who don't
  3. People who think he should keep his blog technical and shut up about political issues

I'd like to talk about the third kind of assertion, but let's first start with rockstars...

I think it's great that rockstars use their influence to make the world a better place. Bono is probably the best example - he's been doing a tremendous amount of work to help those in need, the highlight being his work to make the Group of Eight Summit write off Africa's huge debt. Had he not been the rockstar he is - he would never have been able to accomplish that (and a lot of other things). The thing is that in this work he had to get deeply involved in global politics, and there are many people out there who completely disagree with his opinions.

I agree with Phil Haack when he says that "Joel is the closest thing the software community has to a bona fide rockstar". As such, I think it's completely appropriate for him (or any other famous technical blogger, for that matter) to talk about non-technical things, yes - even politics!

When Mike Stall compared his baby to a finite state machine or told how his dog taught him about race conditions, I didn't see anyone complaining. Neither did anyone complain when Scott Hanselman tiled his kitchen or talked about his diabetes.

Bloggers are not just technical - they are people first. They have families, friends, hopes, regrets, ups, downs, the whole deal. I personally like reading slightly more personal stuff from bloggers who interest me. I wouldn't want all their posts to be about their personal lives, but once in a while, getting a glimpse of it, a small reminder that there is a person behind the words, is nice.

As for the question whether politics is in-bounds or out of it - I don't see any reason why it should be excluded.

So Roy, do me a favor, disregard these single-minded people who can't accept you have things to say other than "regular expressions", "agile", "test driven", etc - SAY IT, loud and clear!!!

I'll be there reading it :-)

Tuesday, October 03, 2006

About Windows Live Writer, Blogger and Images

I've been using Windows Live Writer for the past week or so. It's really great - I'm able to post in a much simpler and faster way than before, allowing me to post more frequently.

There is one problem, though - posting images to Blogger! For some reason, Live Writer doesn't know how to post images to Blogger. This is particularly annoying, since I've always had trouble with that, and I was hoping it would become easier with Live Writer. So for my previous post, I had to publish without the images and add them manually from the browser. Not only that, but I'm an IE user, and it took me some time to remember that I've always had trouble uploading images from IE. So I had to dig out Firefox from the programs-that-are-candidates-to-be-eliminated-from-my-computer-on-the-next-clean-up and do it from there. Not nice - really, really not nice!

Distance Measures

Definition

Say you have a set of vectors, and you need to define how similar/different they are from each other.

There are many different approaches to measuring the distance between vectors. All of these approaches must obey (at least) the following basic rules in order to be called a metric:

Given a real-valued function d : X × X → ℝ

  1. Positivity: d(X, Y) ≥ 0
  2. Identity: d(X, Y) = 0 <=> X = Y
  3. Symmetry: d(X, Y) = d(Y, X)
  4. Triangle inequality: d(X, Z) ≤ d(X, Y) + d(Y, Z)
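
As a quick sanity check, here's a small Python sketch (the function name is mine, not from any particular library) that numerically verifies all four axioms for the Manhattan distance on a set of random vectors:

```python
import random

def manhattan(x, y):
    # Manhattan (1-norm) distance between two equal-length vectors
    return sum(abs(a - b) for a, b in zip(x, y))

random.seed(42)
vectors = [[random.uniform(-10, 10) for _ in range(5)] for _ in range(15)]

for x in vectors:
    for y in vectors:
        d = manhattan(x, y)
        assert d >= 0                      # positivity
        assert (d == 0) == (x == y)        # identity
        assert d == manhattan(y, x)        # symmetry (|a-b| == |b-a| exactly)
        for z in vectors:
            # triangle inequality (tiny epsilon for float rounding)
            assert manhattan(x, z) <= manhattan(x, y) + manhattan(y, z) + 1e-9

print("all four metric axioms hold on the sample")
```

Of course, passing on random samples is not a proof - but it's a cheap way to catch a distance measure that is not actually a metric.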

The most commonly used metrics

Euclidean distance (a.k.a. 2-norm)

d(x, y) = sqrt( Σᵢ (xᵢ - yᵢ)² )

Manhattan distance (a.k.a. City-Block, 1-norm)

d(x, y) = Σᵢ |xᵢ - yᵢ|

Cosine correlation coefficient

cos θ = (x · y) / (‖x‖ ‖y‖)

(where θ is the angle between the vectors)
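
These three are simple enough to write down directly - a minimal Python sketch, no external libraries assumed:

```python
import math

def euclidean(x, y):
    # 2-norm: square root of the sum of squared differences
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def manhattan(x, y):
    # 1-norm: sum of absolute differences
    return sum(abs(a - b) for a, b in zip(x, y))

def cosine_similarity(x, y):
    # cos(theta) between the vectors: 1 = same direction, 0 = orthogonal
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)

x, y = [1.0, 2.0, 3.0], [4.0, 5.0, 6.0]
print(euclidean(x, y))          # sqrt(27) ~ 5.196
print(manhattan(x, y))          # 9.0
print(cosine_similarity(x, y))  # ~0.975
```

Note that the cosine measure is a similarity, not a distance - 1 - cos θ is commonly used as the corresponding dissimilarity (and it does not satisfy the triangle inequality, so strictly speaking it's not a metric).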

The thing is that there are many, many more distance measures, each with specific qualities. For each problem you may need a different metric, usually chosen based on some empirical tests.

Additional metrics

Following is a list of additional metrics, for more details about each, please refer to the links at the bottom:

  • p-norm (a.k.a. Minkowski distance of order p): same as the 2-norm, just replace the 2 with p, where p is a real number ≥ 1.
  • infinity-norm: like p-norm with p → infinity.
  • Pearson correlation coefficient, Uncentered Pearson correlation coefficient, Squared Pearson correlation coefficient: all are very similar to the Cosine correlation coefficient.
  • Averaged dot product: the dot product of the two vectors, divided by the number of elements in the vector. Very simple, probably too simple in most cases.
  • Rank correlation methods: non-parametric methods that look at the rank of the values instead of the values themselves.
  • Canberra Distance: often used to detect abnormalities, since it has a bias for distances around the origin.
  • Chi-square: often used in statistics, to determine how well an observation fits the theory.
  • Mahalanobis distance: similar to the Euclidean distance, except that it also takes into account the correlations of the data set and is scale-invariant.
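
To make a couple of these concrete, here is a short Python sketch of the p-norm (Minkowski), the infinity-norm and the Canberra distance (function names are mine, for illustration only):

```python
def minkowski(x, y, p):
    # p-norm distance: p=1 gives Manhattan, p=2 gives Euclidean
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)

def chebyshev(x, y):
    # infinity-norm: the limit of the p-norm as p -> infinity,
    # i.e. the single largest coordinate difference
    return max(abs(a - b) for a, b in zip(x, y))

def canberra(x, y):
    # each term is normalized by the magnitude of its coordinates,
    # which gives more weight to differences near the origin
    return sum(abs(a - b) / (abs(a) + abs(b))
               for a, b in zip(x, y) if abs(a) + abs(b) > 0)

x, y = [1.0, 2.0, 3.0], [2.0, 4.0, 6.0]
print(minkowski(x, y, 3))  # 36 ** (1/3) ~ 3.302
print(chebyshev(x, y))     # 3.0
print(canberra(x, y))      # 1.0 (each of the three terms is 1/3)
```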

Sources and additional references

http://axon.cs.byu.edu/~randy/jair/wilson2.html

http://en.wikipedia.org/wiki/Distance

http://en.wikipedia.org/wiki/Metric_(mathematics)

http://en.wikipedia.org/wiki/Kendall

http://en.wikipedia.org/wiki/Spearman

http://en.wikipedia.org/wiki/Mahalanobis_distance

http://genome.tugraz.at/Theses/Sturn2001.pdf

http://www.ucl.ac.uk/oncology/MicroCore/HTML_resource/distances_popup.htm

http://www.ucl.ac.uk/oncology/MicroCore/HTML_resource/Distances_detailed_popup.htm


http://149.170.199.144/multivar/dist.htm

http://en.wikipedia.org/wiki/Norm_(mathematics)

http://fconyx.ncifcrf.gov/~lukeb/clusdis.html

http://en.wikipedia.org/wiki/Chi-square_distribution

Monday, October 02, 2006

Laptop (short) review - LG LS70 3JJE 1.8GHz

I bought my laptop 16 months ago. It's the LG LS70 3JJE to which I increased the memory.

Specification

CPU: Intel® Pentium® M 1.86GHz (750), 533MHz FSB, 2MB L2

Memory: 1024MB DDR2 533MHz (dual channel), increased from the standard 512MB (max 2048MB)

Screen: 15" XGA (1024x768), high brightness: 200nit

Graphics card: ATI Radeon X600 64MB

Hard drive: Fujitsu 60GB (SATA) 5400 RPM

DVD-R/RW: DVD-R/RW Super Multi support most burning formats

Wireless: Intel® PRO/Wireless 2200BG (802.11b/g)

Modem: 56Kbps

Network card: 10/100/1000 on board

Card reader: MMC, Secure Digital

PCMCIA: Express Card/54, PCMCIA Type II

Audio: High Definition Audio (24 bit)

Connectors: S-Video, RJ-45, RJ-11, VGA, USB 2.0 x4, MIC-IN, IrDA, S/PDIF, IEEE1394, Parallel

General impression

This laptop is a charm. Its specification is comparable to the IBM T42, and at a significantly lower price it puts up a pretty good fight. It's fast, the screen is excellent, all the necessary add-ons are built-in, and the installation was smooth. I'm really happy with this buy. The only thing I would have done differently, in hindsight, is buy it with 2GB of memory instead of 1GB.

Pros

  • Fast - the CPU is one of the fastest in the pre-dual-core era, fast memory access, etc.
  • Excellent graphics (card and monitor)
  • Smooth installation - the CD that comes with it includes Windows XP Pro and has everything you need in it. I installed it from scratch with it, and it was really easy - no need for any other external disk.
  • Every possible connector, including FireWire, 4xUSB 2 ports and SD card reader, all of which have been very useful to me.
  • Not too heavy, if you consider the size of the screen (and the alternatives)
  • In Israel it's sold with a 3-year warranty (1 year international). I didn't need it, but it's important to know you have it (and it's worth another $100-200)
  • If it matters to you (it doesn't to me) - it has a cool look. Many people at the university started asking me questions about it only because it looks neat.

Cons

  • Expensive - although you're given excellent value for your money, it's still not a cheap deal, even today. So if you don't need all these specs, you can certainly find something that suits your needs for much cheaper.
  • Heat - it heats up pretty fast, and when it does, it's really hot. It does cool down very fast once you turn it off, but still - working with it on your knees, especially on a hot summer day, is totally out of the question.
  • Hard drive is too small. A laptop like this is aimed at enthusiasts, who certainly need much more disk space than a measly 60GB.
  • Graphics card does not provide 24-bit colors (16 or 32, but no 24). My home monitor is a Samsung SyncMaster 910v, which has the best output when given 24-bit colors. I know this sounds weird, but that's the way it is - when I checked with my old PC, the output was better with the graphics configured at 24 bits than at 32 bits. Not the end of the world, just annoying.
  • Sound is rather weak and not very rich. Buy some good external speakers and you'll be fine.

All in all, it's a really good machine that even managed to surpass my already high expectations. I'm pretty sure that next time I buy a laptop, LG will be at the top of my list of potential brands!

Testers are from Venus, Programmers are from Mars

I don't know how this works in other parts of the world, but here in Israel it is very common for wannabe programmers to start with a job as testers. Usually it's some part-time job they do in their last year of studies, in the hope of being offered a permanent position as a programmer when they graduate. I didn't go through this path, but many programmers I know have.

My impression from this process, as an outsider who has worked with experienced QA Engineers as well as Software Engineers from various backgrounds is simple - it's WRONG WRONG WRONG!!!

Why?

Well, first, it reinforces a common misconception that QA Engineering doesn't require specific qualities - that it's something any person with enough brains to eat with a fork and knife would be able to do. I think it degrades one of the most important parts of software development. Being a good QA Engineer is hard - really hard, even. It requires specific qualities that very few people actually have. When you take inadequate people to do a difficult job, don't be surprised that the results are sub-optimal (to say the least).

Second, I think there are fundamental differences between the character traits required of good QA Engineers and of good Software Engineers. Actually, the qualities required are so contradictory that I don't know a single person who would be able to be good at both.

Let me elaborate:

  • Patience - programmers are usually rather impatient people. When they feel their computer isn't working well, they'll go poke around the registry and whatever, sometimes making things even worse, just because they don't have the patience to wait for an IT person to help them out. A good tester needs a tremendous amount of patience. They may find themselves doing hundreds of tests, each slightly different from the other, for several days in a row.
  • Feedback - programmers like to get immediate feedback, to see results quickly. That's why add-ons like ReSharper give you the little green light on the side - so you know up-front that your code will compile. Before that, most programmers used to hit the 'compile' button every few minutes, because they want to know - now! Testers, on the other hand, can find themselves working on a good release and finding very few bugs. So there is no real feedback for a long time - the fact that they didn't find a bug, is it because the software is indeed so good, or because they missed something?
  • Ego - programmers are very fond of their ego. We're not like sales-persons, but still, we like being appreciated (hope I didn't offend anyone here...). Good testers are rarely adequately appreciated - if they find many bugs, programmers won't like them (which is a mistake, of course, but that's life). If they don't find bugs, then the bugs will inevitably show up in production, which is even worse. So testers need to be able to swallow their pride often, very often - something most programmers I know are not capable of.
  • Order - finding bugs is not enough. Being a good QA Engineer means you should be able to work in a very orderly fashion, documenting each step in order to be able to reproduce the bug (and give the developer as much information about it as possible). I know very few programmers who are good at this - it's sad, but it's a fact of life we must accept and learn to live with.
  • Creativity - both tasks need a healthy amount of creativity. Yet, there is an important difference in the type of creativity required. Programmers need a constructive creativity - finding better ways to build software. Testers, on the other hand, need to find new ways to break the application. The difference is subtle, but significant.

I'm sure I'm still missing many other characteristics, but I think the idea comes across. When I think of the really good QA Engineers I know - none of them would make a good Programmer. And the same goes the other way - I know of no good Programmer who would be able to be a good QA Engineer.

I strongly believe QA is one of the most important aspects of Software Development. In some sense, it's the most important one - it's the last phase the application goes through before it gets to the clients, the last chance to improve it before you make a fool of yourself. To my great disappointment, many software companies I know fail to understand this. With companies that are not 100% software (ISPs, networking, appliances, etc.) the situation is even worse - some don't see why they need a QA team at all, if they hire good enough programmers...

As I said, QA Engineering is hard. Doing it well is really hard. Finding good QA Engineers is hard. It's important to understand this, and accept it, if you really want your software to be good!

In this spirit I intend to post a couple more posts in the next few days (unless my wife gives birth before I get the chance). I intend to talk about "QA Day" and "Sanity Tests". If you don't know what these are - stay tuned, these are very effective tools for improving the whole development process.

Sunday, October 01, 2006

SQL Injection Attacks

Scott Guthrie posts about Guarding Against SQL Injection Attacks. He also points to a great post by Bertrand Le Roy on the exact same subject. Oren Eini tried to create an HQL injection, with no results so far.

I think that of them all, what fascinated me the most was Rocky Heckman's webcast, where he demonstrates a step-by-step SQL injection attack that took my breath away the first time I saw it. Be sure you don't miss it!
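
The standard defense these posts describe is to use parameterized queries instead of building SQL by string concatenation. A minimal sketch with Python's built-in sqlite3 module (the table and data are made up for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 's3cret')")

user_input = "' OR '1'='1"  # a classic injection payload

# VULNERABLE: the payload becomes part of the SQL text itself,
# turning the WHERE clause into a tautology that matches every row
vulnerable = "SELECT * FROM users WHERE name = '%s'" % user_input
print(len(conn.execute(vulnerable).fetchall()))  # 1 - the row leaks

# SAFE: the driver passes the payload as a plain value, never as SQL,
# so it's compared literally against the name column
safe = "SELECT * FROM users WHERE name = ?"
print(len(conn.execute(safe, (user_input,)).fetchall()))  # 0 - nothing matches
```

The same idea applies in any language/driver (SqlParameter in ADO.NET, bind variables in JDBC, etc.) - the query text and the data travel separately, so the data can never change the query's structure.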