Most search engines limit the total number of results they return per search string. With Google it's 1000. With Yahoo!, if it hasn't changed lately, it's 5000.
I was asked a few days ago whether I knew of any workaround that makes it possible to get more results per search string. My first answer was - "sorry, no can do". I did find a way to circumvent the limit the Google API imposes on the number of queries that can be executed per day (which, coincidentally, is also 1000). This workaround is a side-effect of my Google Image Search API. Yet, this does not provide a means to get more than 1000 results per search string.
After giving it some thought, I figured out at least one way to increase the number of results per query. It's not a very accurate solution, but it's better than nothing. The idea is to use the various search engines that perform query refinement (some call it clustering of results). A good example is Ask Jeeves. When you perform a search on these engines, they also give you a list of suggestions to narrow or expand your search. That is, if you search for "apple", the narrowing suggestions are things like "Apple the fruit", "Facts about Apples", "Apple Tree", "Macintosh", etc.
When you work with this kind of engine (or with a "simple" search engine and one with narrowing capabilities together), you can start out by running the original search (apple) and retrieving all the results available for it. Then you can iteratively retrieve the results for all the narrowing queries as well (up to 1000 for each), and keep drilling down as much as you like. Of course, there will most probably be a substantial number of duplicates in the results, which you will have to handle. Also, the more you drill down, the farther you'll get from the original query (i.e. query drift). Another problem is that of ranking - say your original query was "apple", how do you rank the results for "apple tree" against those for "Macintosh"? So this still raises quite a few questions. Yet, in the end, you can end up with a much larger number of results that are to some extent related to the original query.
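The drill-down described above can be sketched roughly as follows. This is only an illustration of the idea, not a working client for any real engine: fetch_results and fetch_suggestions are hypothetical callables that you would have to implement yourself on top of whatever engine you use, and max_depth is an assumed knob for limiting query drift.

```python
from collections import deque


def drill_down(seed, fetch_results, fetch_suggestions, max_depth=2):
    """Breadth-first expansion of a search string through narrowing suggestions.

    fetch_results(query)     -> iterable of result URLs (hypothetical callable)
    fetch_suggestions(query) -> iterable of narrowing queries (hypothetical callable)
    Returns a dict mapping each URL to (depth, query) of its first appearance.
    """
    seen_queries = {seed}
    results = {}
    queue = deque([(seed, 0)])
    while queue:
        query, depth = queue.popleft()
        for url in fetch_results(query):
            # Duplicates are dropped: only the first (shallowest) hit is kept.
            results.setdefault(url, (depth, query))
        if depth < max_depth:
            for suggestion in fetch_suggestions(query):
                if suggestion not in seen_queries:
                    seen_queries.add(suggestion)
                    queue.append((suggestion, depth + 1))
    return results
```

Keeping the depth at which each result was first seen does not solve the ranking problem, but it gives at least a crude handle on it: results found closer to the original search string can be listed first.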
You may ask - why would someone need more than 1000 results per search string? Besides, the further you go down the ranking, the less relevant the results are to the original search string. In most cases - you're right. Yet, for some research purposes, not only would you need more than 1000 results - you might even prefer these to the "good" results returned in the first few pages.
Can anyone come up with some other (better?) idea to work around this limitation??? If you have an idea - please drop me a line!
(Note: I'm using the term "search string" to indicate a complete search, regardless of the number of results pages you get. The term "query" refers to what retrieves one single results page, since the query also includes the result index at which the results page should start. In other words, all "search results" for a single "search string" are achieved by sending multiple "queries" - if you have 100 results in each results page you need to execute approximately 10 "queries" to retrieve all the results Google provides for that "search string")
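To make the "search string" versus "query" arithmetic concrete, here is a small sketch. The function names are made up for illustration, and the zero-based start index is an assumption; engines differ on whether the first result is numbered 0 or 1.

```python
import math


def queries_needed(result_cap, page_size):
    """Number of single-page queries needed to exhaust one search string."""
    return math.ceil(result_cap / page_size)


def start_indices(result_cap, page_size):
    """Start index of each results page (assumed zero-based)."""
    return list(range(0, result_cap, page_size))


# With a cap of 1000 results and 100 results per page:
print(queries_needed(1000, 100))     # 10
print(start_indices(1000, 100)[:3])  # [0, 100, 200]
```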
Sunday, August 27, 2006
Getting more results from your search engine
Posted by Ilan Assayag at 12:14 PM 0 comments
Monday, August 21, 2006
Setting default share permissions
If you use shared folders often, you probably know that Windows XP defaults share permissions to "Everyone" (with full access). If you don't know that - shame on you!
I don't know exactly where this default is stored, but if you want to change it, you can do it with the Tweak UI PowerToy.
In the tree on the left, select "Access Control". Then choose the "Default share permissions" in the combobox and click "Change".
Enjoy :-)
Posted by Ilan Assayag at 11:52 AM 0 comments
Working with Source Safe over the web
I am working (from Israel) for a company in the US (I haven't finished my thesis yet, but my grant has run dry and I still need to feed my family...). Since I'm working on applications related to trading, there is a great emphasis on security, so every file transfer is done over a VPN.
Lately I needed to work directly with their VSS database. I agree with most people that Source Safe is really something that should be left in the past, but it will take some time before I can convince them to switch, and in the meantime work must continue. Everybody who has tried to use VSS over an Internet connection knows it's just impossible to work this way. Add to that the cross-Atlantic delay and a VPN link, and you get to wait 10-20 minutes just to open a tree in the viewer (that is, if you're lucky enough not to get link errors, which I'm currently investigating with my ISP). So I've been looking for applications that help with accessing an existing VSS database over the Internet.
The major tools I found were as follows:
Posted by Ilan Assayag at 11:36 AM 0 comments
Friday, August 18, 2006
The War
On July 16th I had to flee my house – and came back after having wandered around the country with my family for 30 days. If you want to get a glimpse of what the war has done to my personal life, here you go.
I live in a small village in the Jezreel Valley. It’s in the northern part of Israel, but quite far from the border with Lebanon (about the same distance as Haifa). I have a magnificent little girl, a beautiful wife 7 months pregnant and a crazy dog. Close to our village is an important air-force base. On normal days, the sound of the planes is not very pleasant, but you learn to live with it. We knew, before moving to that village, that if a war were to happen, our little heaven would be troubled, especially due to the proximity of the base, which is an obvious target. We never figured how much …
To keep things simple, I’ve decided to summarize the impact of this war on our day to day life as follows:
- We don’t have any shelter in our house – so from the beginning we had to flee in order to be safe. We slept for 30 days at the homes of people who were kind enough to open their houses to us (6 different places).
- Rockets landed a couple of hundred meters from where I was, and more importantly from my girl’s kindergarten (in the nearby town).
- The first time the rockets landed close by, all phone lines in the neighborhood of the kindergarten crashed. From the road we took to the kindergarten it took a while to understand that the smoke came from behind the kindergarten and not from there exactly…
- At some point my wife decided to go back to work (we were staying at my in-laws’ at the time, who unlike us have a shelter). The sirens caught her when she was in the parking lot, ready to go back home. It’s an open lot, with no cover at all. At first she simply dropped to the ground, in order to try and avoid the deadly bullets. When the rockets started to fall she tried to find something she could shelter under. She found a place with a 3 millimeter roof where a few other people were taking shelter. The funny thing was that the roof was the least of their concerns – there were many gas tanks piled up right next to them … When the attack was over, she literally flew home. Two hours later there was another attack in the same area. 3 of the rockets landed right on the route my wife takes back home.
- One specific attack was particularly scary – we were at my in-laws’ and when the siren started my mother-in-law was in the shower. She didn’t make it to the shelter in time. Suddenly the rockets started falling – REALLY close. My wife, who’s 7 months pregnant, sat next to me in the shelter. At that moment, while we felt the whole house tremble and her mother wasn’t answering our calls – I thought we were about to lose our baby.
- Before that same attack, my 2-year-old girl was watching a DVD of the Teletubbies. For those who don’t know what this is – they are little creatures who are happy with everything and laugh at anything. You could rip their heads off and they would still find a reason to laugh and be happy and nice. Anyway – from that moment on, my little girl doesn’t stop telling me that “the Teletubbies scared her”. She has become moody and whiny, and can’t stay more than a few minutes without seeing us both (don’t even think of leaving her with somebody else). She often wakes up screaming shortly after having fallen asleep, probably due to nightmares.
- Even now, when I hear an ambulance my heart misses several beats – my first reaction is that it’s a siren again. Any loud noise (even a door being slammed) makes me fear a rocket has fallen. When I’m with others, we usually exchange looks and it’s clear to all that we all experience the same thing.
- The thing is – we are among the lucky ones. None of our friends and relatives got killed or seriously injured. Just to show you how lucky we are indeed - a friend of ours lives in another village next to us. Two of her nephews who live in the same village got hit by those horrible bullets that the rocket propels when it explodes. They each got 2 bullets in the arm. In the past two weeks they have endured, between them, more than 10 surgeries and they are not done yet. Imagine if the bullets had hit the abdominal region or the head…
Posted by Ilan Assayag at 10:22 PM 3 comments
Thursday, August 03, 2006
SQL Server 2005 - Frustrated by a good feature
I guess that most people who have been using SQL 2005 for some time already know about this. I, for my part, have worked a lot with SQL 2000 in the past, but never had the chance to really work with SQL 2005 until recently. Now I needed to connect to a server on a remote machine and had this frustrating experience...
1. Scenario: trying to connect to a SQL 2005 server from a remote machine fails. It gives the following message: "Sqlcmd: Error: Microsoft SQL Native Client: An error has occurred while establishing a connection to the server. When connecting to SQL Server 2005, this failure may be caused by the fact that under the default settings SQL Server does not allow remote connections."
2. After getting this message once or twice you take the time to actually *read* it, and start checking the settings for the SQL Server.
3. Like anyone with experience with SQL 2000 would do, I opened the SQL Server Management Studio. There I looked at the properties for the server instance and saw that the configuration seems fine and the server is configured to accept remote connections.
4. Back to square one. From this point I started looking for the culprit...
a. Maybe it's the firewall? I configured the FW to trust my local network, but maybe it fucked up somehow? Disabling the FW quickly showed me that's not the problem.
b. Maybe there is some problem with the SQL version and I should upgrade? Seemed unlikely, yet I checked it out. Turned out to be irrelevant since I was already using the latest version (SP1).
c. Maybe I did something wrong with the connection string I used? Tried all possible variations - nothing worked...
5. In the back of my mind I started to think again about the error message I got. It says that by default, SQL 2005 is configured not to allow remote connections. I don't remember having changed that - so how come it's configured to accept remote connections? Could there possibly be some other configuration parameter that has some impact on remote connections?
6. I went over all the configurable parameters for the server instance and for the database (from SQL Server Management Studio) - nada.
7. Well then, I'm on the verge of throwing my computer out of the window. I'll give Google a last try. Then I found this: http://support.microsoft.com/?kbid=914277&SD=tech
8. It turns out that there is a much more elaborate way of configuring remote connections in SQL 2005. This is done through the "SQL Server 2005 Surface Area Configuration". It is, of course, a good thing they have added this wealth of configuration options - but why couldn't it be accessible from the Management Studio? Couldn't they add an "Advanced" button on the remote server connections options that opens this Surface Area Configuration??? And if they didn't want to put too many things in the Management Studio - why did they give an option to configure remote connections there, when it can't work on its own anyway???
If you ever run into something similar - please remember this, it will save you some valuable time ...
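In hindsight, a quick check from the client machine might have narrowed things down earlier. Below is a minimal sketch that just tries to open a TCP connection to the server; it is not something I ran at the time. The function name is made up, and port 1433 is only the usual default for a default instance (named instances typically listen on other ports), so treat both as assumptions.

```python
import socket


def can_reach(host, port=1433, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False
```

If this returns False with the firewall disabled, that is at least consistent with the server not listening for remote TCP connections at all, which is what the Surface Area Configuration setting controls. It can't tell you more than that, though, and a True result does not guarantee that a login will succeed.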
Posted by Ilan Assayag at 11:53 AM 0 comments