Database design question: Indexing

Posted by phoqoo, 01-15-2008, 12:38 PM
From what I've read, indexing columns helps optimize DB performance by speeding up WHERE clauses. However, I had a question about speeding up a WHERE clause in which a calculation is involved. Currently I am using MS SQL 2005, but plan on migrating my app to a MySQL DB. The purpose of this query is to find all users with an empire_size between half and double size of the user. However, empire_size is not currently a column in the DB but rather a computation that equals planets * 2.5 + moons. Here is a brief summary of my query: strSQL = "Select * from table where ". $userSize/2 . " >= Planets*2.5 + Moons AND ". $userSize*2. " <= Planets*2.5 + Moons" This query is one of the most executed queries in my application, and there will be anywhere from 50,000 to 200,000 records in the table, so optimizing this query is very important. My question is, is the query efficient enough as it is? Or would it be more efficient to add a column called "empire_size" and index it, and have it re-compute every time a record is updated (MS SQL already has computed columns, but I dont think mySQL does)? Or is there a another faster option that I haven't thought of? Any help would be appreciated.

Posted by Dolbz, 01-15-2008, 03:58 PM
Well it'd definately be more efficient for the select if you have an empire_size column that is indexed. The question is how often will that column get updated? Maybe it would be worth updating empire_size for the whole table periodically, say every 15 minutes to reduce server load?

Posted by phoqoo, 01-15-2008, 04:32 PM
Its a column that would get updated a lot. On average (from the beta test) moons/planets gets updated about every 10 minutes per record, but at random intervals / bursts. On a DB of 50,000 rows there will be 300,000 updates to empire_size per hour. Keeping empire size 100% up do date will reduce server load in other ways, however, by preventing the User from clicking on an empire's profile that has recently moved out of their range. Without 15 minute script, each update statement will look like: "Update Planets=$P, Moons=$M, EmpireSize=$P*2.5+$M WHERE UserID=$UserID" Would the 15 minute script, which removes the need for the bolded clause in the update statement, help in this scenario?

Posted by phoqoo, 01-15-2008, 04:50 PM
Also I don't know if this affects indexing or not, but the Query will only return 10 records maximum, and there is one other field looked at in the query. The full query is summarized to be something like this: strSQL = "Select TOP 10 * from table where ". $userSize/2 . " >= Planets*2.5 + Moons AND ". $userSize*2. " <= Planets*2.5 + Moons and Alert<=100 order by checksum(new_identity())" where checksum(new_identity()) is used to randomize the 10 records retrieved. Will indexing empire_size matter at all if Alert isn't indexed? Will indexing even help at all for very dynamic content?

Posted by Czaries, 01-15-2008, 11:40 PM
Just out of curiosity, why are you switching from MS-SQL to MySQL? MS-SQL is a more robust database with many more advanced features that may be very useful in your situation like computed columns, stored procedures, views, etc. Stored procedures and views are available in MySQL 5, but are still a little clumsy to use in practice. On a side note, you will have to change this query: ... to THIS for MySQL: Randomization is also a bit clumsy in MySQL currently... You generally have to use two queries to do it right (since ORDER BY RAND() is *very* slow for large datasets).

Posted by phoqoo, 01-16-2008, 12:12 AM
I'm moving my application onto a LAMP server due to budget constraints. This is a facebook app but I am paying for the server with my own $$$ and decided to go the LAMP route. MS SQL was available when I used a shared host, but a shared host wasn't sufficient for my needs. Stored procedures are important to my application, but I've already learned the stupid mySQL syntax . But I agree - computed columns and SPs made life very easy on MS SQL. Yea I changed that - and I've done some research, I'll probably do the select a random index and then LIMIT from that. I've also read that LIMIT will perform an indexed search anyway, and since I have another more static column in the WHERE clause that I can index (Lvl), my hope is that it will avoid a full table scan and will only have to do ~50-100 computations (Planets *2.5 + Moons) per query within the LIMIT 10 and Lvl constraints. Of course if my assumption is wrong please let me know. Last edited by phoqoo; 01-16-2008 at 12:23 AM.

Esta resposta lhe foi útil?

 Imprimir este Artigo

Veja também

antivirus.exim

Posted by its_joy, 03-29-2008, 07:20 AMhi, i have installed antivirus.exim on server but if...

VMWare IIS and SSL mystry

Posted by WebGuru72, 07-18-2007, 12:58 AMHello All, I have a very weird problem, I have a...

.htaccess problem

Posted by londonhogfan, 09-23-2008, 02:05 AMhi, I'm having quite a problem with my site on...

What is SAP?

Posted by TajByte, 03-22-2007, 05:02 AMWhat is SAP? Posted by 24x7os, 03-22-2007, 05:25...

MySQLDumper Perl Command - It This Correct?

Posted by Greedisgood, 03-26-2007, 10:28 AMHi Guys, I was reading this tutorial:...