- Feature Articles
- CodeSOD
- Error'd
- Forums
-
Other Articles
- Random Article
- Other Series
- Alex's Soapbox
- Announcements
- Best of…
- Best of Email
- Best of the Sidebar
- Bring Your Own Code
- Coded Smorgasbord
- Mandatory Fun Day
- Off Topic
- Representative Line
- News Roundup
- Editor's Soapbox
- Software on the Rocks
- Souvenir Potpourri
- Sponsor Post
- Tales from the Interview
- The Daily WTF: Live
- Virtudyne
Admin
Its possible to solve it with something else than A*, though, I'd say A* i easy to implement and gets the job done in a decent time. ;)
Admin
I know, I'm aware of other solutions I was just using A* because it tends to be the best one for most situations.
Admin
Why not? Assuming that, if they are using PHP they are probably using an OSS RDBMS, both MySQL and Postgres have spatial extensions allowing you to query points (latitude and longitude?) and the distances between. I also understand that Oracle and DB2 have it. MSSQL Server added it in 2008.
Even if you can't use these (not likely) or their geospatial extensions (possible), then adding double precision latitude and longitude fields to the hotels tables and doing your calculations doesn't seem out of the question. You can buy databases with zip code to latitude/longitude conversions.
They were, after all, doing the calculations of distance somewhere--it doesn't seem like it couldn't be done in the database. Since we don't have details that's not to say that it would have been the best way.
Admin
Admin
About distances:
I think it depends on your purpose. If you are using the hotel for business and have your own car, you might be interested in driving distances.
When I'm on vacation in some city (London, Rome) I mostly walk or take public transportation, so I'm more interested in the direct distance.
Admin
you still can't use crow-flies distances for that. you still have to use the transit network graph for that - just in this case you're searching the edges labeled "walk" and/or "mass transit"
Admin
Admin
There are so many other real WTF's in this story: why was there no single method for producing the correct, filtered results? Why were they not using Python/Django? What the hell. PHP is the new ASP
Admin
Bah,
For those who say sorting by distance can't be done in the database, you cant be more wrong. Sure you can do it in PHP, as I have done before (with complex text ranking etc), but its muuch more efficient to do in a DB server considering you usually want to limit the amount of RAM your php scripts are using (because a webserver can dish out thousands of php processes at once).
Simply SELECT * FROM coords ORDER BY (len(usercoord - coord)) ASC
coord is a 2d GPS vector DUH! Of course this is as the crow flies, but who is using google maps api to map walking directions PER RESULT?
Admin
Kazan. Your pathfinding exercise is useless to me.
I don't want to know the distance my hotel of choice is to some place I want to visit, I want to know its proximity. Any real commuter will tell you that distance is not measured in kilometers, its measured in time and your pathfinding elements cannot calculate the difference.
For instance if I ride my bike home from work its 11km, if I take transit its 9km if I drive its 14km and yet they all take 35 minutes so each is viable depending on financial and other convenience factors. The same thing applies to searching hotels and most of the time when I'm simply localizing the hotel, not actually looking for a route to and from a destination - I have maps and the brain to use them.
BTW
I'm a user and I don't care.
Admin
If you use that equation to generate that data that you present to the user, that is a different story. But for getting the order it doesn't matter
Admin
google maps traffic data proves you wrong on traffic travel time.
having a complete transportation network map - bike trails, pedestrian bridges, etc - also could solve that problem.
just because you refuse to acknowledge a problem is solvable by computers doesn't make it so.
just because you're fine with "as the crow flies" doesn't mean most users are these days. they've been spoiled by google maps. they want travel time.
Admin
Your search found ZERO results!
END SEARCH RESULTS.
BEGIN SEARCH RESULTS:
Admin
However, "most users" in this thread have said the opposite. I have never booked a hotel based on the driving, walking or other non-"as the crow flies" distance. I really don't care. I know pretty much where I want to go, and I want something that's kinda in that general area. I'm not going to say "Oh, that's 50 feet out of the way - can't stay there..." The brand and price are actually more important to me than distance - I'm willing to go a little further if I have warm fuzzies about the brand, and/or if it's not going to break the bank.
Admin
I have to second Kazan here. There is no reason why a pathfinder AI(or just about any graph traversal AI really) cannot take walking, driving bike or teleporting for that matter into account. You just have to setup the appropriate graph to traverse. Or however the AI is designed to adress the problem.
That being said, I'd think that most people are pretty happy with rough estimates as far as distance goes as a first vetting method. ;) apart from class of hotel and stuff like that.
Admin
Kazan's protestations aside, nobody is saying it can't be done. Very clearly it can be done, and done very well. However, is it really the necessary solution to this problem?
Kazan's points seem to be:
OTOH, Kazan's persistent trolling has managed to keep a bunch of us going for quite a while now. A golf clap for you, sir! Very nicely done, indeed!
Admin
Yeah I usually only decide between one hotel and the next based on the prices at the bar.
These comment threads are always interesting. What I'm thinking after reading these is "this is why programmers don't get invited to meetings"
I find it funny that this article was about someone who was given the algorithm for solving his problem, and decided to complicate it beyond what was given, and the comment thread shows a strong desire to do just that.
Admin
NOTED
Admin
Admin
Admin
Admin
Too clever is dumb.
Admin
It's like (real estate) searches that assume you know the name of every possible suburb in a city, and the only UI is sorted alphabetically in a huge dropdown. Most people only want to point at a map and say 'look here in a 10 block grid'.
With too few results, a smart search will also search a larger grid and say 'I know you wanted x, but I also found y a little outside of what you were asking'
Admin
This isn't that hard to do for small data sets:
steps 2 & 3 can be implemented in a single SQL select statement.
For larger data sets, it makes sense to cull the results of those points not even remotely close without having to calculate the precise distance (or even look at all the points). To do that efficiently, you'd have to use some sort of external quad-tree like structure. (in other words, not SQL, unless your database can handle it, like MS Sql 2008)
Admin
Um, Manitowok, WI is about 60 miles From Ludington, Michigan as the crow files. But most people without a yacht have to go the long way around Lake Michigan's shoreline, making it more like 420 miles.
So you can't use Pythagoras for this.
Admin
In short, complicator's gloves.
Kazan may be agitated, but that doesn't mean he's any less correct. Maintaining complex code written in a different language from the primary language used in the organisation and which is stored in the database (not the best debugging environment) is a maintenance nightmare. I've seen what should be simple applications developed in this way that are bloated bug-beasts after three years of effort. Their specs did not contain anything as complicated as route calculation.
Jamming data processing into the data storage layer like this is a perfect example of doing something stupid to conform with a model that wasn't a good idea in the first place. It has to be remembered that database functions like this often represent a "Schlemiel the Painter's Algorithm", and that it's much quicker to do this in the (in this case) PHP layer. On bigger applications you can be looking at the difference between seconds and hours. As far as I can tell, the main attraction of this approach is as ammunition in interdepartmental turf-wars between the "development" and "database administration" teams.
Bear in mind we're not economists, and the "model" needs to evaluated against it's real world performance if you believe the "science" in "computer science" means anything at all. You should not attempt to develop all your CGI in C because of 15 year old notations of efficiency, but you shouldn't jump straight to uncritical model-worship either.
On a different note, PHP is a very effective web development tool, especially if you observe the software engineering basics and use some of the frameworks out there for it (in moderation, some of them suffer from the old "every array and if-statement should be an object" mania). Unfortunately, there are a lot of mouthbreaters at the bottom end giving it a bad name.
Admin
No. If I'm wanting within 5 miles from somewhere, I mean crows distance. It generally doesn't matter to me about the driving distance (unless there's obviously something geological like a mountain or a river). Hell, I might be walking, so your driving distance calculation wouldn't do me any good.
Admin
I apologize, I do think that the problem is solvable by computers, I just don't think the problem is solvable by programmers.
This is like handing a physicist an n-Body problem where n>3. Giving a programmer real-world real-time geographical, traffic, route, weather and path-finding information isn't going to make a workable system that can be used remotely.
And I think that the issue with your "they've been spoiled by google maps. they want travel time" argument is that no reservation site I've been to offers any kind of adjustment to proximity surfing based on time.
I've not seen a reservation site that says "Okay, you want to stay somewhere within 1 hour of Fort Henry in Kingston, which method of travel would you like to estimate by: Foot, Bicycle, Automobile, Bus, Train or Boat? Do you care to estimate with travel times for Foot, Bicycle, or Automobile adjusted for Ferry usage? If so will your travel times be outside of the Ferry service availability? Please wait while we adjust the travel times based on local weather conditions and announced construction and local emergency services activity."
Cumbersome is the word I would use.
Instead they say - "Hey how close to you want to be?" And give you a drop down list of distances to select from, a distance not measured by a clock.
Simple and it allows the user of the site to make assumptions about proximity based on their own knowledge.
Sure it might be nice to say you want to stay within 30 minutes of Old Montreal by car, but you'd only have choices of hotels in Old Montreal - if you were able to select the option of walking then you could actually stay near the train station. (I've done both - admitedly on a rainy Saturday night in the late spring on either side of a Habs game - not the time to drive in downtown Montreal!)
Has anyone complained to a concierge or hotel manager that the distance predicted by the web site is different than the distance they walk/rode/drove? Likely, but pedantic fools like that should be ignored (maybe even shot) as God intended.
Admin
not nash check out the Canadian Multiple Listings Service, they come really close to capturing that usage. (and I just happened to confirm that house prices have been stable in my area over the past 12 months - whew)
Admin
Or they can take the Manitowok-Ludington ferry which takes the route Pythagoras describes (mostly).
A better example would be comparing the distances between Roseau Minnesota, Baudette Minnesota and Northwest Angle Minnesota. In "great circle" terms Buadette and Roseau are almost equidistant from Northwest Angle but by route map Roseau is an hour (30 miles) by car.
Oddly its 61 miles by car between Roseau and Northwest Angle but only 60 miles walking.
Admin
Sure I can. Crow flies are obviously lower bounds. I'd just gamble whatever actual distance is hopefully within 141% of the driving distance (obviously, circuitous routes and routes taking you out of the circle, would take a lot more)
And I'm also guarantying you I'd be walking in places where you wouldn't nor couldn't legally call walking space (and yes, I can do this all legally, too). You'd be giving me an upper bound I could use, sure, and that's useful, but I doubt your graph would have all the shortcuts I'd take, and potentially has a chance to be quite far off the actual walking distance I will take for a route as compared to the crow's distance.
The point being, if I'm asking somebody about interesting items 5 miles from here, I'm not going to require them to perform an A* algorithm to do it. If I want a SPECIFIC route from A to B, well, then yes, I do want a quick route, and I'm going to expect that route to be longer.
Admin
Neither of these two programmers would last very long at my company. In this scenario, it would always we wise to filter and sort at the database level by way of a parametrized stored procedure and no need to do filtering on the web server. At least they weren't trying to filter, sort and paginate client-side with javascript. :O
Admin
In the real world, different situations call for different responses.
Admin
You know, yeah, this is pretty bad, but I can easily imagine myself getting into a mess like this during one of my college jobs, or even my first couple years out of college. Yeah, if "Greg" has had a long career, there isn't really any excuse for this, but I see a whole lot of programmers overestimating their own brilliance, and there are still a whole lot of death march projects out there.
I've been working full time for about three years now and I'm probably not much better of a "coder" than my senior year of college. But I've definitely learned a lot about how to read the "real" requirements between the lines, how to get help, and how to push back. Sounds like Greg should have swallowed his pride and asked for some more hands-on mentoring.
Admin
Gloves anyone?
Admin
I worked with a Greg, he used to get all records from the warehouse DB using an SP then do all the aggregating in the reporting tool.
eg. Getting 13000 records from the DB and grouping them into months in the report Vs grouping them into months straight from the DB and dumping the rows to the report.
He's gone now, but I still have to fix them whenever I come across them.
Admin
having implemented such a feature in java - i can tell you that the problem described is not the real wtf. the problem, implemented correctly is not simple and not implemented in one day. user request page 5 of X for hotels surrounding point A which are available. steps to make:
there are so many fallacies here, even my captcha:facilisis
Admin
Post processing of results....
In a hotel app, there are many reasons for this. Just two are
remove those with no available inventory for the dates in question [not held on your database - that's a lookup per eligible hotel on external DBs]
prioritize listing order by complex criteria [star ratings, closeness to city center, likely profit margin for you, whether you got a clear result on the availability lookup]
You could theoretically shoehorn the some of the prioritizing into an Order by clause, but such a clause is not really designed for embedding complex business logic. Prioritizing the listing to to "best according to complex criteria, including several user-selected ones" is a separate business subsystem in its own right.
Don't even think about going into the hotel availability business if you under-estimate the complexity of that subsystem
Admin
The only reason it would be cumbersome is that you envisage a horrible user interface. And there is no reason that the graph traversal will be measurably slower when taking weather, traffic and geographic data into consideration as long as you have, as you put it, in real time. If you have craptons of access times for that information then it will of course be slow as a dev just having been woken by a frantic boss support call in the middle of the night after a binge at the local pub. That is, unless you cache it in your graphs and update it either on a rolling schedule or just however you feel like it. ;)
As far as "Is it necessary?" Nope. I'd rather think that my "What hotel will I pick" score system has a very low weight on travel distances from places. I rate quality of service and price (especially where the two have an optimal converging point) a lot higher.
Admin
OK...
The REAL WTF here is fetching the WHOLE database, and filtering out the results you want to display in PHP code, instead of doing the proper query in the database to get only those results you need.
Admin
No reason for all these complicated arithmetics. Break your map into square 5x5 tiles, keep tile IDs for each hotel record. Each hotel record will have 9 fields for tile IDs - its own and 8 neighbor tiles. Then just select all hotels where tile ID=my tile ID. Dump the result list onto the screen. Forget lats and longs.
Admin
Admin
TRWTF is pagination because of a few hundred entries ... Or pagination EVER.
It's such an user unfriendly approach especially when it doesn't work correctly with table sorting. Maybe if I select ALL hotels in New York I really want them all displayed? Or say, I need to extract a thousand userids but the STUPID FUCKING UI won't display more then 500 entries at a time and paginates them in batches of 10.
Admin
"The obvious, simple solution? Split up the results into multiple pages."
I hate, hate, hate Web sites that do this.
Web browser already allow for pagination since they only show the part of the page that is actually visible in the window.
The only thing pagination does for me, like the "summary" view in The Daily WTF, is create a need to load the page (or the next page) once more.
This is particularly annoying on mobile devices when you wait two minutes for the page to load only to find six summaries of stories rather than six stories or ten hotel listings rather than hundreds that can easily be scrolled through.
Yes, loading one very long page takes longer than loading a short version, but it also means that after it has loaded one is DONE with loading and can start reading.
Especially with hotels pagination only causes the user to decide between ten visible hotels or clicking another link not knowing if the next ten results are better or worse than the current ten results.
So my vote is that the real WTF in this story is the idea that the results should appear on more than one page. (Whereas I think that pages with random numbers of hotels being displayed are at least more entertaining than unnecessary ten-hotel lists.)
Luctus?
Admin
Admin
Admin
The first 10 pages of results are irrelevant anyway. All the good stuff is hidden away on the last page because Google doesn't want us to find it.
luctus indeed
Admin
Yebbut, it doesn't matter, as the distances are calculated as if they are flat (if you look at a grid of long/lat, you'll see that the lines are already curved) so it takes into account the curvature in the positions.
Admin
http://dev.mysql.com/doc/refman/5.0/en/select.html
SELECT CALC_FOUND_ROWS * FROM table LIMIT items, 10; SELECT FOUND_ROWS();
Admin
The other issue that comes up is that you can't just precompute for landmarks, because people have this nasty habit of wanting to go to other things. For example, a number of times when I've been on business trips I've optimized for location based on proximity to the office I'm visiting. Most office blocks, even nice ones, aren't tourist landmarks. To complete the picture, it's the business travel market that these hotel finder sites really prioritize anyway because people doing that sort of thing are much more likely to want to find somewhere good enough NOW.
(These days, the good booking sites will show you hotels mashed up with Google Maps and manage the amount of info displayed intelligently as you zoom in and out. Slick.)