Why not? (Ilan Assayag's blog): December 2006

Wednesday, December 27, 2006

I may not be cool, but I think this sucks!!!

A new Israeli site lets users retrieve people's names and addresses from their phone numbers.
Now imagine this scenario: your 16-year-old daughter sits down for a nice cup of coffee with a friend at some coffee shop and forgets her mobile phone there. A guy who noticed her in that coffee shop and found her good-looking takes the phone and starts going through the numbers in its memory. He finds the "Home" number and, through this despicable site, knows exactly where to find her...
This site should go down - the sooner the better! I know that in many cases the database is outdated, but I managed to find the names and addresses of many people very close to me.

(through YNET)

Article updated

I've updated my API for Google image search article. It now compiles with VS2005 and loads the regular expressions from an external text file instead of hard-coding them.

Tuesday, December 26, 2006

The Circle of Life in the eyes of a 2.5-year-old

Gal: They are going home.
Dad: Who's going home?
Gal: The rain! (pointing at the raindrops on the car window)
Dad: Where is their home sweetie?
Gal: Underneath the trees! (said in a what-kind-of-stupid-question-is-that tone)
Dad: Who told you that's their home?
Gal: I did.
Dad: Hmm, if that's their home, then where are they coming from?
Gal: From underneath the trees! (same tone)
Dad: So the raindrops originate from underneath the trees, and then go back home underneath the trees?
Gal: Yes!

Monday, December 25, 2006

A goose laying golden eggs?

While reviewing my blog statistics I discovered that, obviously, the highest hit rates occur during periods when I post more frequently.

>> No surprise here - where's the gold?

The more interesting bit is that almost every post is followed by one or more clicks originating from Blogger's "Next Blog" button (look at the upper right corner of this page). I always thought this button sent the reader to some random blog in Blogger's blogosphere. Well, that's not exactly the case. I don't know exactly how it works, but basically, you have a chance of being the target of this button if you posted something a short while ago.

>> OK, OK, now didn't you promise gold?

Here it comes - AdSense, and every other kind of advertising that pays per click or per view (CPM, or Cost Per Mille, i.e. per 1000 impressions). With AdSense, they work with batches of 1000 views (I don't know how much you get paid for these - I'm not advertising on my blog in any way, as you see). So, with a little program that would write a new post every second (you could split this across several blogs, of course), you get 86400 views per day (assuming each post results in one single random viewer, through the "Next Blog" button), which I suppose should be beneficial...

>> Nice... But aren't they using CAPTCHAs to avoid misuse of the blogs?

Yes and no - you need to prove, for a while, that you're not up to anything bad. Then you can send Blogger an email requesting to be exempted from the annoying CAPTCHA. Once they've done that, you're free to go. Also, with the advent of Windows Live Writer and other similar tools, I guess they are getting looser and looser on this issue.

>> Now really - how much could I earn this way?

Well, it depends on a lot of factors. Let's assume you get $5 for each 1000 views. With one post/second (86400 views a day), you get $432/day or $12,960 per 30-day month. And once you have the infrastructure up and running, there is very little work to do, and you may be able to increase it by adding more and more blogs to it. The main problems you may encounter, IMO, are:

How to get CPM deals? I suppose in many cases the advertisers will want to really know the sites on which they will be publishing. If that's the case for all CPM programs, you'll have to rely on clicks, in which case the revenues will be substantially lower (if any).
How to remain below the radar? Obviously, the more blogs and posts, the more chances you will stand out.
Generating random posts - it wouldn't hurt if you managed to make those posts look genuine. Somehow...
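The figures above are plain CPM arithmetic, and can be checked with a few lines of C#. This is only a sketch: the $5 rate, the one-view-per-post assumption and the 30-day month are the guesses made in the text, and the CpmEstimate class and its Revenue method are names invented for this illustration.

```csharp
using System;

static class CpmEstimate
{
    // Revenue for a number of views at a given rate per 1000 views (CPM).
    public static decimal Revenue(long views, decimal ratePerThousand)
    {
        return views * ratePerThousand / 1000m;
    }

    static void Main()
    {
        // One post per second, each assumed to bring exactly one viewer.
        const long viewsPerDay = 24 * 60 * 60;

        decimal perDay = Revenue(viewsPerDay, 5m); // assumed $5 CPM
        decimal perMonth = perDay * 30;            // assumed 30-day month

        Console.WriteLine("Views per day: {0}", viewsPerDay);
        Console.WriteLine("Revenue per day: {0}", perDay);
        Console.WriteLine("Revenue per month: {0}", perMonth);
    }
}
```

Changing the rate passed to Revenue shows how sensitive the whole idea is to the CPM you actually manage to get.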

Summary:

Create several blogs on Blogger. I would even go for several thousand, in order to have an average of only a few posts a day per blog, to remain below any radar.
For a while, post reasonably valid stuff on all these blogs. It's a lot of work, but it's going to be worth it.
For each of these blogs, ask Blogger to be exempted from CAPTCHA.
Create an AdSense account for each of these blogs. You may want to try other advertisement tools as well, why not?
Write a program that automatically generates random posts. I would make them using concrete English phrases, maybe citing some scanned book or some news website (beware of legal pitfalls). Automatically generate a post to one of these blogs every second.
Start counting the money...

DISCLAIMER: What I've written above is the result of some random thoughts, while I was trying to solve some completely unrelated problem. I have no idea whether it would indeed work or not - I didn't try it and don't intend to. This should also not be viewed as a recommendation to act this way - I have no idea whether it is legal or not, and am not responsible for anyone who tries it and finds himself behind bars because of that. If you have any information that could shed some light as to whether or not this is feasible, drop a line, to fulfill my academic curiosity ;-)

Sunday, December 24, 2006

You call this medical treatment???

My wife's grandpa, over 80, got hospitalized a couple of weeks ago due to hyponatremia. As is typical with this condition, it was probably a result of the various other illnesses he has been suffering from lately. While at the hospital, he caught pneumonia and, on top of that, a very aggressive bacterial infection, which reached his heart and even harmed his artificial pacemaker. The only way to save him is surgery to remove the pacemaker, but his condition is so bad that they're currently trying to figure out how to do it, and in any case the chances of success are very slim.
Had he stayed at home, eaten better and drunk a lot, he'd probably have gotten rid of the problem in a few days.

And that, ladies and gentlemen, happened in one of Israel's finest hospitals (Rambam).

Thursday, December 21, 2006

VS2005 SP1 - I'm so glad I'm not an early adopter

Because my heart wants me to be an early adopter, I'm always extra careful not to be one (hope this made sense...).

I haven't installed Vista yet, nor Office 2007. Now that Visual Studio 2005 SP1 is out, I almost installed it immediately, but I managed to restrain myself and force myself to wait for a few days.

Well, patience pays well, they say. Apparently there are serious problems with the running time, disk space used and disk I/O in general during the SP1 installation. Thankfully, Jon Galloway posted a way to overcome it.

I think I'll wait a few more days to see if anything else shows up and then I'll take my chances :-)

(through The Daily Grind)

Update: Oren Eini went through a similar ordeal when he installed the service pack. Here are some other useful links from his blog:

http://blogs.msdn.com/astebner/archive/2004/11/10/255346.aspx

http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=1042560&SiteID=1

( http://connect.microsoft.com/VisualStudio/feedback/Vote.aspx?FeedbackID=226009 )

Tuesday, December 19, 2006

The telephone evolution

Once upon a time, you only had a simple phone, with a single 10-digit rotary dial. Period. That was a huge leap from the Telex and from those phone calls that had to go through a central switchboard. It worked, and everybody was happy.

Then came the fancier, digitized phones. Still 10 digits, but push-buttons instead of a rotary dial. Some even came with no wire (and usually very short-lived batteries).

Then they added all kinds of features, from memory to coffee making. Well, it never made coffee, and it usually did save the 10 numbers you recorded upon purchase correctly, but you never remembered whether number 3 was Mom or Mother-in-law. Bummer!

Then came the cellular phones - like the latter, just mobile. You felt safe, that no matter where you went, you'd always be able to ask for help. You ended up being reachable even at the least convenient moment, and getting low connectivity or no more battery juice when stuck with a flat tire in the middle of the Nevada desert at 3AM. Murphy rules!

Now look at this review in YNet about the Nokia 6131 (Hebrew). I won't get into the details, but instead I'll just translate the subheadings of the article, in the order in which they appear:

Design
Screens (that's no typo - this little baby has 2 teeth, er..., screens)
Media (referring to the camera - video and stills -, music and flash storage)
Connectivity
Conversation quality and battery
Summary

I wonder what Darwin (or Alexander Graham Bell) would have to say about this...

Monday, December 18, 2006

Fighting spam mail collectively

(Note: this post may describe an already existing technology. If it doesn't exist, then it's high time it did...)

As far as I know, there are two major techniques used by spam filters:

List - the filter has a list of indicators that allow the classification of a message as spam. This list is typically updated periodically, and usually the users can add items to it when they mark a mail as spam. A good example is Outlook 2003's Junk Mail Filter, which, as far as I know, is primarily based on domains and email addresses.
Adaptive - the system starts with an initial classification mechanism, sometimes as simple as allowing all emails. Then, whenever a mail is marked by the user as spam, the classifier is updated so that it blocks similar mails in the future. These kinds of techniques are usually based on some machine learning algorithm, often some variant of the Bayes classifier. The advantage of these techniques is that they are more likely to discover spam mail of a completely new format. On the other hand, they may result in weird cases of false negatives and even false positives.

The limitations of these methods are:

The first one depends on how well your provider keeps up with new types of spam mail.
The second one requires, for each new type of spam, a set of samples it can learn from. So you will inevitably need to mark some mails as spam until your classifier is correctly tuned.

What I'm proposing is that whenever you manually mark a mail as spam, the whole mail would be sent to the provider. This will give the provider a huge amount of recent spam mails, making it possible to update the filters in a matter of minutes. Then, each client that reads mails thereafter will already have a filter that knows how to handle these new types of spam mails (assuming the clients update frequently, such as once a day).

Problems:

This could, in itself, generate a huge amount of traffic, only due to the constant data sent to the provider.

Solution: Data could be sent to the provider in bulk (once every few hours), and not necessarily in email format.
Solution: If the clients are frequently updated, this shouldn't be a problem, because soon enough all clients will have an updated filter and stop sending data to the provider.

A spammer could buy several licenses and start flooding the provider with bogus reports, so as to make it generate completely useless filters.

Solution: Data sent to the provider will be encrypted and include recognizable information from the user (based on the license information).
Solution: It should be fairly easy to detect such spamming clients and rule them out.

The bottom line is that using the strength of collectivity, it should be possible to create much more accurate spam filters, much faster.
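To make the idea a bit more concrete, here is a minimal sketch of what the provider side could look like. Everything in it is an assumption made for illustration: the SpamReportAggregator class, the notion of a message "fingerprint" (say, a hash of the normalized mail body, computed elsewhere) and the threshold value are all invented, not part of any existing product. A fingerprint only enters the next filter update once enough distinct licenses have reported it, which is one simple way to blunt the flooding problem described above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical provider-side collector of "this is spam" reports.
class SpamReportAggregator
{
    private readonly int minDistinctReporters;

    // fingerprint -> set of license ids that reported it
    private readonly Dictionary<string, HashSet<string>> reporters =
        new Dictionary<string, HashSet<string>>();

    public SpamReportAggregator(int minDistinctReporters)
    {
        this.minDistinctReporters = minDistinctReporters;
    }

    // Called whenever a client marks a mail as spam.
    public void Report(string licenseId, string fingerprint)
    {
        HashSet<string> licenses;
        if (!reporters.TryGetValue(fingerprint, out licenses))
        {
            licenses = new HashSet<string>();
            reporters[fingerprint] = licenses;
        }
        // A set ignores repeats, so one license counts once per fingerprint.
        licenses.Add(licenseId);
    }

    // Fingerprints reported by enough distinct licenses, in sorted order.
    public List<string> BuildFilterUpdate()
    {
        List<string> blocked = new List<string>();
        foreach (KeyValuePair<string, HashSet<string>> entry in reporters)
        {
            if (entry.Value.Count >= minDistinctReporters)
            {
                blocked.Add(entry.Key);
            }
        }
        blocked.Sort(StringComparer.Ordinal);
        return blocked;
    }
}

static class SpamDemo
{
    static void Main()
    {
        SpamReportAggregator aggregator = new SpamReportAggregator(3);

        // Three different users report the same mail.
        aggregator.Report("license-1", "fp-pills");
        aggregator.Report("license-2", "fp-pills");
        aggregator.Report("license-3", "fp-pills");

        // One license hammers the provider with a bogus report.
        for (int i = 0; i < 100; i++)
        {
            aggregator.Report("license-9", "fp-newsletter");
        }

        foreach (string fingerprint in aggregator.BuildFilterUpdate())
        {
            Console.WriteLine(fingerprint);
        }
    }
}
```

In this run only "fp-pills" reaches the threshold; the hundred reports from a single license count as one. A real system would of course need a fingerprint that survives small variations between spam mails, which is the hard part and is not addressed here.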

Sunday, December 17, 2006

Martial Arts Against Rapists

Some time ago I advocated that we should include some kind of martial arts in the school curriculum, to help prevent violent crimes. Well, here is concrete proof of this idea (Hebrew). A 16-year-old girl was attacked by a would-be rapist. Fortunately, the girl is proficient in Karate and with a couple of strikes managed to break the bastard's nose and make him run for his life. As simple as that - she knew how to defend herself, didn't freeze out of panic, and managed to avoid one of the most awful experiences a girl could endure.
I'm just sorry she didn't manage to break his legs, so he could get caught, or maybe his neck, which would have solved the matter completely...

Tuesday, December 12, 2006

On Hanukkah we eat candies and bread

Gal: Daddy, which holiday is coming up?
Dad: Well, Gal, which holiday?
Gal: Hanukkah!!!
Dad: And what do we eat on Hanukkah?
Gal: What?
Dad: What do we eat on Hanukkah?
Gal: What?
Dad: What do we eat on Hanukkah? ... su ... su
Gal: Sukariyot (candies)!!!
Dad: No, we eat sufganiyot (doughnuts). What else do we eat on Hanukkah? ... l ... l
Gal: Lechem (bread)!!!

Wednesday, December 06, 2006

Rob's done it again...

Check this out - An Open Letter to the Software Managers of the World - oh, so true...

Tuesday, December 05, 2006

Buffering hacks

Check this out: http://blogs.microsoft.co.il/blogs/applisec/archive/2006/12/04/Buffer-Overflow-_2F00_-Overrun-examples.aspx

Monday, December 04, 2006

A truly amazing pantomime...

Just got this by email - really amazing: http://www.dailymotion.com/visited/search/jerome%20murat/video/xf9oo_jerome-murat=

More missing things in C#

Something I'd like to add to my C# wish-list:

Why is it that when you use "using", you can only declare resources of a single type inside it? Often enough you need to use multiple resources at once, and want to get hold of them and release them at the same time. In such a case you need to write something like this:


using (StreamReader reader = new StreamReader(inputFileName))
{
    using (StreamWriter writer = new StreamWriter(outputFileName))
    {
        
    }
}

What I would really have liked better would be something like this:

using (StreamReader reader = new StreamReader(inputFileName),
       StreamWriter writer = new StreamWriter(outputFileName))
{
    
}

Wouldn't you agree?
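For what it's worth, C# as it stands gets fairly close to this. The sketch below shows two things the language does allow: stacking several using statements on top of a single block (no extra nesting level), and declaring several resources in one using as long as they share the same type. The StackedUsingDemo class, its method names and the temporary files are made up for the illustration.

```csharp
using System;
using System.IO;

static class StackedUsingDemo
{
    // Copies a text file line by line, upper-casing each line.
    public static void CopyUpperCase(string inputFileName, string outputFileName)
    {
        // Two stacked using statements sharing one block. The writer is
        // disposed first, then the reader, exactly as in the nested form.
        using (StreamReader reader = new StreamReader(inputFileName))
        using (StreamWriter writer = new StreamWriter(outputFileName))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                writer.WriteLine(line.ToUpperInvariant());
            }
        }
    }

    // Returns the first line of each of two files.
    public static string[] FirstLines(string fileA, string fileB)
    {
        // Resources of the SAME type can share a single using declaration.
        using (StreamReader a = new StreamReader(fileA),
                            b = new StreamReader(fileB))
        {
            return new string[] { a.ReadLine(), b.ReadLine() };
        }
    }

    static void Main()
    {
        string input = Path.GetTempFileName();
        string output = Path.GetTempFileName();
        try
        {
            File.WriteAllText(input, "hello" + Environment.NewLine + "world" + Environment.NewLine);
            CopyUpperCase(input, output);

            string[] first = FirstLines(input, output);
            Console.WriteLine(first[0] + " / " + first[1]);
        }
        finally
        {
            File.Delete(input);
            File.Delete(output);
        }
    }
}
```

The stacked form is just the nested form with the outer braces omitted, so it is a formatting convention rather than a new feature; mixing different types inside a single pair of parentheses, as wished for above, is still not possible.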

More ISP ranting...

[Update: Upon request, I've added the names of the companies I was referring to as well as some updates at the end.]

I'm not happy with my Internet connection. There are, at least for a few more months, 5 major ISPs in Israel (which will unfortunately drop to 3 soon enough). There are more, but I'm only considering the largest ones.
Anyway, of these 5 ISPs, one (Internet Gold) I don't like for personal reasons (I'd rather not elaborate), another (012 Golden Lines, the smallest of the lot) has a reputation for being technically inferior, and yet another (Bezeq International) has a very bad reputation for charging customers without prior notice and making it very hard to get the money back (experienced by people close to me).
This leaves me with 2 choices. I had been a customer of one of them (Barak) for several years and was pretty happy with it (service, support and quality). Yet, during the past few months I needed to work more intensively over a transatlantic VPN connection. This connection kept suffering short disconnections due to packet loss somewhere along the way. After trying to fix this problem together with my ISP for a couple of weeks, I came to the conclusion that this was the best they could offer, and decided to move to the second of my options (Netvision).
Yet, after a while I started to feel that download speeds were significantly lower than with the previous ISP. I'm talking about HTTP downloads (I don't use any P2P) from major US-based websites. I talked to their support, and after some junior support person finished proposing stupid solutions (workarounds that would barely make my grandma happy, if she were alive), I finally got to talk to a senior support guy. This was the worst support conversation I've ever had - he was rude, didn't let me finish my sentences, constantly made me feel like a bonehead, and made me decide to move back to my previous ISP - at least they were polite!
Customer service tried to convince me to stay, and since I have technical problems with both, I'm now paying for 2 ISPs, checking the differences between them every day or so, hoping I'll be able to decide in the next couple of weeks.

I am now several weeks into this test. In general, there is no significant difference in download time between the two providers (Barak and Netvision). When one of them is slow, the other seems to be just as slow (sometimes with an advantage for Barak). One MAJOR difference is the download time of emails (I think it's most prominent with my 2 GMail accounts). There Netvision can get stuck for several minutes (especially on mails with large attachments), whereas Barak works like a charm.
In the next few days I am supposed to receive a router that will do the VPN work instead of the software installed on my machine. I'll then check the VPN connection more thoroughly, and make my final decision. Given the quality of the support team and the email speed, I really hope I will be able to go back to Barak and get this over with.
In any case, I assume the decision between these two will have only a short-term impact, since the two companies are due to be merged during the year of 2007.

Sunday, December 03, 2006

Where is the regulator?!?!

Until recently, the Internet market in Israel was a pretty good example of the free market. You had various suppliers, each with their pros and cons, and as a customer you could move relatively freely between them. Also, due to the harsh competition, they were eager to please - to our (the customers') benefit: prices kept going down, service quality kept increasing, etc.

A couple of months ago, Netvision and Barak announced a merger that will take place by the beginning of 2007.

Today, Internet Gold and 012 Golden Lines announced a similar merger.

Thus, from 5 medium-sized competitive companies (Netvision, Barak, Internet Gold, 012 Golden Lines and Bezeq International), we're left with 3 strong behemoths (Bezeq Int'l was pretty large to begin with), significantly harming the service and prices for the customers.

BTW - it's not only Internet we're talking about - the exact same companies also provide international communication services, and some are also involved in either cable or satellite television.

REGULATOR - WHERE ARE YOU ?!?!?!

A project I'd like to know more about

Sam Gentile speaks about a successful 14-month Agile/XP project w/ Wilson OR/M and other goodies. He promises to provide us with more info - and I'm looking forward to reading it.

Friday, December 01, 2006

Sanity Tests

Continuing my QA posts (Testers are from Venus, Programmers are from Mars, QA Day), I'd like to talk about Sanity Tests.
Regardless of your development methodology (Waterfall, Agile, Scrum, WhatEver...), at some point or another you have to hand releases over to the QA team for inspection. Much too often, the QA team receives a version of the software, starts testing it, and after a while (one day, 3 days, sometimes a week), they realize that some basic functionality is too badly broken for them to continue their tests. So the release goes back to the programmers to fix the problem, after which the testers must start everything all over again.
An excellent way to avoid this is to use a layered testing methodology. The idea is to create testing procedures that, instead of investigating a specific feature down to its smallest details, investigate the whole system up to some level of detail. Then each test procedure is actually a "drill-down" into more specific parts of the whole system. The idea behind this methodology is to be able to provide as profound a test as time permits, without ever missing a single feature.
Yet, many organizations and QA teams are not built for this kind of methodology, or simply don't believe in it. Also, it has some inherent overlapping of work, which could make the whole thing completely impractical in many cases. Imagine a feature which, to be tested, requires a whole day of preparations. With layered testing you would need to set up this test several times - once for each "layer".
So if we can't use layered testing, what can we do to avoid the problem stated above?

Well, here comes the Sanity Test...

The purpose of a Sanity Test is to test all the basic functionality of the system, without getting into the specifics of any of it. For example, if you're working on a customer support system, it is completely unacceptable to get thrown out of the application whenever you try to open a new support case. On the other hand, you might want to continue testing the system even if some of the reports don't open properly. So the idea is to test the system end-to-end by making sure that, at the basic level, all features work, without drilling further into any of them. From my experience, for a team of approximately 4 developers, the sanity test should take around 1-2 working days tops (that's for development cycles of ~1 month - Agilists would get other figures). The most important thing here is that a build that does not pass the Sanity Test is not fit to be transferred to QA!
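Where parts of the Sanity Test can be automated, the rule "all basic features must work, or the build doesn't go to QA" could be sketched as below. This is just an illustration under assumptions: the SanityCheck and SanitySuite types and the check names are invented, and the lambdas stand in for real calls into the system under test.

```csharp
using System;
using System.Collections.Generic;

// One basic end-to-end check: a name and a function that returns true on success.
class SanityCheck
{
    public readonly string Name;
    public readonly Func<bool> Run;

    public SanityCheck(string name, Func<bool> run)
    {
        Name = name;
        Run = run;
    }
}

static class SanitySuite
{
    // Runs every check and returns the names of those that failed.
    // A check that throws (e.g. the application crashes) counts as failed.
    public static List<string> FailedChecks(IEnumerable<SanityCheck> checks)
    {
        List<string> failed = new List<string>();
        foreach (SanityCheck check in checks)
        {
            bool passed;
            try
            {
                passed = check.Run();
            }
            catch (Exception)
            {
                passed = false;
            }
            if (!passed)
            {
                failed.Add(check.Name);
            }
        }
        return failed;
    }

    // The build may go to QA only if no basic check failed.
    public static bool IsFitForQa(IEnumerable<SanityCheck> checks)
    {
        return FailedChecks(checks).Count == 0;
    }
}

static class SanityDemo
{
    static void Main()
    {
        // Placeholder checks for a hypothetical customer support system.
        List<SanityCheck> checks = new List<SanityCheck>();
        checks.Add(new SanityCheck("log in", delegate { return true; }));
        checks.Add(new SanityCheck("open new support case",
            delegate { throw new InvalidOperationException("application crashed"); }));
        checks.Add(new SanityCheck("close support case", delegate { return true; }));

        foreach (string name in SanitySuite.FailedChecks(checks))
        {
            Console.WriteLine("FAILED: " + name);
        }
        Console.WriteLine(SanitySuite.IsFitForQa(checks)
            ? "Build is sane - transfer to QA"
            : "Build is NOT sane - back to development");
    }
}
```

Note that the suite keeps going after a failure instead of stopping at the first one, so the developers get the full list of broken basics in one pass.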

I've worked with Sanity Tests at 3 separate organizations, and at two of them they were my initiative. It's not easy to make everybody understand how important it is. Most of the time, the developers are the ones responsible for doing the testing (after all, although it's acceptable to have bugs, the developers are still responsible for providing a workable release), and they don't like it one bit. Project managers get p--d off when a release is not transferred on time to QA because it didn't pass the test. Some QA teams don't know how to enforce the rule that they should not accept non-sane builds. And, of course, creating the Sanity Test Procedure can be quite tricky - you want it to be such that it completes fast, but tests all features well enough to know they are more or less OK.

Despite the problems they bring, Sanity Tests have proven themselves to be very useful in creating more reliable software on time. The developers become more responsible about their code, testers are more productive and the whole development process runs more smoothly.
