- Feature Articles
- CodeSOD
- Error'd
- Forums
-
Other Articles
- Random Article
- Other Series
- Alex's Soapbox
- Announcements
- Best of…
- Best of Email
- Best of the Sidebar
- Bring Your Own Code
- Coded Smorgasbord
- Mandatory Fun Day
- Off Topic
- Representative Line
- News Roundup
- Editor's Soapbox
- Software on the Rocks
- Souvenir Potpourri
- Sponsor Post
- Tales from the Interview
- The Daily WTF: Live
- Virtudyne
Admin
I second that. TRWTF is filtering data out AFTER having returned it from the database. Databases are here to return either what you want, or nothing.
Admin
And now imagine for a moment that my city is somewhere far to the North. Say, Alaska, or Siberia.
Admin
Holy hell, did Greg go into hardware support?
We do hardware support, not software development... But I've got a technician just like Greg. We'll give him an easy task, outline how it should be done, and he manages to turn it into the biggest mess imaginable.
I'm currently in the process of cleaning up after one of his disasters...
Admin
I was sort of expecting a nonsensical comment like this. Locally, the difference between flat and spherical (I know, I know, the Earth is not a sphere either) topology is negligible. Even a simple parallel projection will work well.
Note that I intentionally left out taking the square root to determine the distance. Yes, distances will be inaccurate, but the relative magnitude of them will be quite accurate.
You mean when I search for hotels in NYC, my logic isn't able to accurately tell which of
A) a hotel in Oregon B) a hotel in SoCal
is closer to the target? F***ing big deal. Even an international hotel reservation system can easily be implemented using a flat map if the borders and projection method are chosen wisely because all relevant distances will be on the order of a few miles anyway.
Futhermore, observe GalacticCowboy's comment. You can waste as much (computation) time as you want on some super accurate geoidal surface distance estimator (taking terrain heights into account...), but it will still be a proxy for what the user really wants to know: How long it will take to get from the target location to the hotel, and how much it will cost. A full-blown route calculation is, however, computationally expensive and there may also be substantial license fees involved.
Admin
Admin
If he is not an idiot, some parking lot therapy may help. Sometimes it's better than costs of seeking a new guy which might be worse.
Admin
fair point :D
Admin
most peolpe want driving distances. and driving distance can be MUCH LARGER than crow-flies distance.
Admin
no you can't you bleeding moron. yuo don't have infinite freaking storage space and infinite CPU time to write EVERY POSSIBLE source location mapped to every hotel in your database.
STOP ACTING LIKE EVERYTHING IS LIKE YOUR COMPUTER SCIENCE 100 COURSE HOMEWORK!
Fark i hated morons who had no outside world experience when I was in university, i hate them even more now. i have to clean up their messes now.
Admin
Admin
With this little function, you can calculate geodesic distances with accuracy within 0.5mm. For free. Still better than unprojected cartesian. Still giving you something comparable no matter how close to a pole you are.
Admin
A* on a road shapefile.
Hint: those shape files can be acquired free/very cheap from the USGS. I implemented a road conditions map using them.
Hint 2: Implement A* in a DB? pfftt
Admin
and you're still, in your ignorance of actual user desires, still performing crow-flies calculations.
The User wants driving/walking/bussing directions. That involves using the A* algorithm on a network[see note] graph.
Note: the 'network' in this case is road/sidewalk/bus/subway watever - transit network.
Admin
When booking a hotel from another city I for one am much more interested in knowing the hotel's distance from some clear landmarks in the city than the driving distance from point A. The driving distance doesn't give me any idea about the hotel's location within the city whereas the distance from say Times Square tells a lot more.
Admin
Darn, I'm starting to like this kind of behavior: take a random comment and make it look like a personal insult. Start screaming and naming names.
Admin
In general I agree however, this appears to be a case of not following directions. There was nothing to design. Bruno explicitly says, "Just remember to run the front-end filters before sending the data to PEAR::Pager." Run a query, filter the data and send to a plugin. Done.
Admin
still driving distance.
Admin
Here's a fundamental difference in the way I read the stories and the way a lot of other people read them. If the story states that they need to get the list from the database and THEN filter the results, I take that at face value. We're not in some textbook perfect world and NONE OF YOU have ANY idea how their system really works. You have NO idea what you're talking about -- it's all wild speculation. If the story states that the results must be filtered after the query, I take that as fact because I don't have enough REAL knowledge to refute it.
If you want to come up with interesting ways that they could organize the system to avoid the front-end processing, then go for it, but don't argue with others using your speculation...
Oh, and the other guy was talking about availability changing in real time -- not some RV-cluster-hotel that roams around the country...
Admin
you can be annoyed with someone's idiocy without being personally insulted by them you know.
but then there I go again expected you, a poster on TDWTF, to have enough intelligence to realize things.
if you don't know the first thing about real world programing GTFO already.
If you do know the first thing about real world programming stop acting like an idiot and thinking that EVERYTHING can be done in data storage*.
Admin
Admin
you're not finding WTFs though, you're creating new ones with your "everything should be done in the database"
NO. IT SHOULD NOT.
A database is a highly efficient data storage and retrieval system. Full stop. it is NOT supposed to do anything but data storage and retrieval. Just because you can abuse stored procedures, etc to do things that should be done in code doesn't mean you should.
Almost every real world "distance between two points"[see note] is done in distances on the transportation network. That means performing A* on a network graph. I could imagine that you could abused stored procedures on a DB into being able to do that but I would fire you for it.
The Clever solution is never better than the Correct Solution.
note: on travel websites.
Admin
Specifying a constraint in a retrieval phase is a bit of programming already.
You know, people have invented views and stored procedures for a reason. If some theoretical drone comes up to me and says, you cannot do that, because it's a <CONCEPT> and this <CONCEPT> was not meant to do that — I slap him in the face. Because I don't care about high matters and academic works on the <CONCEPT>, if bending a rule in one place will simplify a hundred more places, make me twice as efficient and contribute to my wallet.
This is what the real world acts like if you wanted to hear from me.
Admin
If I go to Hotels.com and search for a hotel close the the convention center in Anywhere, USA, I seriously doubt the hotel distances are going to be listed using driving directions...
A simple as the crow flies radius is the only thing that's important. Also, the curvature of the Earth is irrelevant because if I live in LA and I want a hotel in Chicago, I really don't care how far away the hotel is from LA!! I want to know how far it is from landmarks in the city. And then, as long as all the hotels are using the same calculation (and they would) the relative differences between competing hotels will be valid even if the real differences from the landmarks are slightly underestimated because of any curvature... and I mean very slight.
Admin
So you once used an A* algorithm. That means... wowie zowie... you can really program!!!1111!! Gosh...
Admin
The php code is able to "massage" the list (presumably without more input), so that must mean that the app has all the data it needs to populate the list correctly. If that's the case, then why didn't he build the query to just return the results he wanted, rather than the whole list so that he could then iterate through in code?
Admin
did i say retrieval was completely a-programatic?
nope.
however you should never make a database system do anything that isn't related to data retrieval. A* isn't data retrieval. Calculating a Geodeisic distance isn't data retrieval.
you're going to spend more time creating a Clever Solution to do those things in SPs than it is worth in $ and execution time.
You know what the best thing about stored proceedures is?
You only have to run the query optimizer once, when you save the proceedure. That over a lot of queries will save a LOT of compute time.
The fact remains that if you have to retrieve data that you cannot make assumptions on the contents of [can change from moment to momenet] and you're not a "subscribed listener" to those changes you're going to have to do processing in code not the just the DB.
The fact remains that if you have to filter your results based on things that require some execution time - such as A* searches, etc - you're going to have to do processing in code not just the DB.
Yes use the most efficient solution, however massaging (abusing) the SP system to do something that would be coded faster in code [and would probably execute faster there too] is NOT the most efficient solution.
[edit] yes.. i know the second best thing about stored procedures is security uses of them.
Admin
actually my wife and I had plenty of fun last night before bed thank you very much.
I keep bringing up A* because it's the most commonly used solution to solving path finding. it's the de facto. Games use it, road directions use it, etc etc.
I also keep bringing it up because it's something that shouldn't be massaged into a stinking stored procedure on a database.
oh don't forget that it's relevant to the discussion at hand.
Admin
Dear people on your high horse about returning all the rows from the DB:
You are not taking into consideration the domain of this application. A list of hotels (large, physical structures) is probably just not all that long, and really not all that much data to pull back. Its not like there is the potential for the table to have 50mm rows, and the list of hotels is pretty unlikely to change during a user's session.
Admin
The way I see it, the users probably want to see all results for a given area. They then want to play around with filtering the results based on price/distance/quality of hookers/pictures of rooms etc.
I expect it is faster and less resource intensive based on historical user behaviour to cache the initial search results and quickly iterate through the smaller resultset than to requery the database every time they want to refine by something else.
Admin
Admin
Choosing an appropriate coordinate system is important here. Lat/Long won't cut it. The distance describe by one degree of latitude is constant; one degree of longitude, however, is not.
You'd probably, if you only need crow flies measures, want to use lat long and apply the Haversine formula which is not terribly difficult. Doesn't include anything more difficult than trig functions.
Admin
I can only speculate as to how unpleasant it must be to work with/for Kazan - with his obnoxious and bellicose attitude - I certainly wouldn't wish to employ him.
His blinkered religious zealotry regarding the location of logic in an architecture is absurd. If experience teaches us anything, it must surely teach us that 'hard rules' often run into real-world situations that highlight their inapplicability.
Frex....we have multiple platforms accessing the same database. Do we reimplement the A* algorithm in every platform, or do we push that logic down into the surface of the DBMS, thereby extending its query functionality? Or do we inject a common logic layer?
The real WTF is your attitude, Kazan.
Admin
MySQL does GIS: http://dev.mysql.com/tech-resources/articles/4.1/gis-with-mysql.html
Admin
WTF: Paginating a result set that is subject to live changes in hotel availability - the size of the result set can change between page views, leaving pagination page number/start index meaningless.
Rubbish.
There are many compelling reasons to put "business logic" into the database:
If more than one application, built on more than one technology, accesses the data (which is almost inevitable in a large system) then consistent logic is applied relating to absolute business rules as data is only manipulated by a transactional API which is defined in the database
Security. No need to grant DELETE on APP_USERS as an API based on stored procedures does the work for you and perform authentication and security logic - no risk of a trashed database because of SQL injection
Efficiency (potentially an enormous gain). Not returning any more data than nescessary over the network. Processing data "close to" the data and taking advantages of the RDBMS's set-based operations etc.
Data integrity, consistency and proper transaction handling
Unified and guarenteed logging and auditing of business transactions
"things that should be done in code" - and a stored procedure isn't code? They can even be natively compiled nowadays.
Personally, I'd advocate performing any logic that is universally applicable to all applications accessing the data IN THE DATABASE.
Calling a web-service to check hotel availability (optionally caching in the database)? Sure, why not?
A*? It's not a terribly complicated algorithm, could be fairly trivially implemented in a SP. Not the best idea if this is front-end "application" logic and not fundamental to the way the data is accessed.
RDBMSs have had GIS/spatial processing capabilities for a while now...
This isn't even true - the execution plan isn't hard-wired at SP compile time. This would be a bad thing as the optimizer would never use a newly created index, for example.
SPs can use variable binding to prevent re-parsing, but this would be true of a well-developed PHP application issuing direct SQL too.
Admin
Nonsense, if you need a hotel, then you are probably already driving a distance that is much greater than the differences between different hotels in your target city. Or else you are flying, in which case you not driving anyway. Crow-flies distances give you a quick estimation which is adequate for most purposes.
Admin
Holy Crap! I could have done that from scratch in less than an hour, it's not that difficult!
Admin
Listening to the bickering here, I've just come up with a great plan for deciding the best way to code a project when I have no idea where to start.
I explain the problem in great detail, decide on the worst possible implementation, and post it to TDWTF. Let the one-upsmanship begin, and steal the best code posted.
Admin
Solder your own motherboard together first and we'll talk fightin' spirit. Pussy...
;)
Admin
Wow, talk about taking a simple problem and making it complicated.
When I search for a hotel, I want to know which hotels are reasonably close to the airport that I will be arriving at, or to the place I will be visiting. A close approximation of this distance is fine. I don't need it accurate to the millimeter. Any hotel website I've ever visited shows it rounded off to the nearest tenth of a mile anyway. The difference between the approximation you get by pretending the world is flat to the closer approximation you get by pretending the world is a perfect sphere is going to be less than the rounding error anyway. And you are aware, are you not, that the world is not a perfect sphere? There's that bulge at the equater, and probably more significant on the scale we're talking about, there are hills and valleys all sorts of topographical features.
I suppose there are cases when hotel A is slightly closer than hotel B as the crow files, but when you come to actually drive it, A is on the far side of a lake that you have to go around so it is actually farther away. But then, even if you had exact driving directions, it's possible that the hotel that is closer -- in real driving distance -- takes longer to get to because you have to travel on busier streets. Without knowing the exact time of day that the person plans to arrive so that you can anticipate commuter traffic, the exact timing of all traffic lights, knowing what the weather will be like that day and where there will be accidents or other traffic tie ups, there's no way to calculate all of this precisely. And no one cares.
Do you do this with all your problems? Developing computer systems is tough enough without inventing all sorts of unnecessary requirements to add to what the user really wants.
Admin
But note that that's a MySQL extension to the SQL standard. It doesn't work in most other SQL dialects. Some provide equivalent features, others don't.
Admin
This article reminds me of when I was house hunting online. You'd select "within 5 miles" and get somewhere halfway across the state. "within 1 mile" barely kept you in the city.
Admin
...and calculated in SQL so you can sort backend-side?
Admin
Vincenty distance? Sure thing. This is the reason for me to come up with that.
If you, however, don't need the distance from an arbitrary point on Earth to the front doorbell of a hotel, but the distance from some common landmarks in the hotel's city, it gets fairly simple to store them all in the DB. Even if storage is scarce (hey, what is scarce? 120GB? 250GB?). And you even don't even need the formula. And, yes, all in SQL.
Admin
"...why would I want a hotel in another state?"
Because my conference is in western Washington DC, and I could stay at a hotel in Virginia or Maryland.
Admin
Wow you guys are crazy. My original comment was that this guy shouldn't be mixing presentation with business logic. That's it. Who gives a shit if everything is done in the DB vs. some extra server-side processing? As long as the finalized result set gets delivered to the page responsible for rendering, I'm happy.
I'm approaching this from the point of view where different pages do something similar and need some of the same filtering done. Just push this crap down into the one layer and be done with it.
Admin
Yup, in the DC area a place that's 5 miles away can take over an hour to reach because it happens to be on the other side of the Potomac.
Admin
Admin
;)
Admin
Admin
nothing i've said should be interpreted to contradict anything you said in points 1 through 5. If they did then you misinterpreted my statements rather egregiously.
you could trigger SP recompile on index change.... :D my understanding is that is how a certain major open source database was doing it.
As for the fact that some database programs do GIS? that's stupid. that's not their job.
==================================
to the other posts who think that because I'm exercising my "bite people i consider idiots heads off" side on the internet means i must not be "very nice to work with" you're just being intentionally dense. just because i CAN bite your head off doesn't mean i'm GOING to bite your head off.
In person to people I like I'm polite even if they're totally braindead. some random idiot on the net talking out their arse because their face knows better... not so much.
and yes my 'attitude' that 'the appropriate tool should be used for the job' is COMPLETELY a "what the fuck" i mean.. who would EVER think to use the right tool for the job.
Got a nail that needs pounded into place? let's use a drill!
Got a hole that needs to be made? let's use a piece of sandpaper!