Efficient SQL Databases
Don't be fooled by seeming simplicity. Many developers get so comfortable with one way of designing a database for their web applications that they miss out on techniques that would make things run faster and more efficiently. Many also don't bear in mind that the small site they are creating now might grow into something large and complex, at which point the database they designed has become bloated and doesn't scale to meet the demands of the increased traffic. This article offers web developers a few techniques to help make their databases and queries faster and more efficient.

1. Avoid Character Types

When you are designing a database, it is easy to set every column to the VARCHAR type, since it can then contain any data you want, numbers or text. But character data is among the least efficient data types to store and compare. If a field is only going to contain numbers, make it one of the appropriate numeric types (INT, DOUBLE, etc). Also, wherever possible in your web development code, use numeric data types rather than characters.

One of the most common things a script has to store is a flag, such as whether someone answered yes or no to a question. You could of course store it as 'Y' or 'N', but why not store it as 1 or 0? The difference shows when a table has, for example, over 500 000 entries and you run a SELECT on that field: comparisons are generally processed faster for numeric data types than for character types. Numeric data returned to the calling script also takes less memory than character data, and your web development language (PHP, ASP, etc) can usually process numeric data more cheaply than character data.

I am not trying to convince you never to use character data types.
Sometimes it is a necessity, but the more you can reduce the amount of character data processed by your SQL database, the better your server will cope.

2. Normalization

Normalizing a database is quite a complex process. It describes a way of designing a database structure that avoids repetition of data, and it can lead to significant performance benefits if employed correctly. The full process is beyond the scope of this article, as it can fill books on its own, but any developer designing a database should seriously consider becoming knowledgeable about normalization and employing it in their own designs. For a good tutorial on this process: http://www.keithjbrown.co.uk/vworks/mysql/mysql_p7.php

3. DateTime vs Timestamp fields

This relates to point 1 a bit. The big difference to bear in mind here is how the two types are stored. In MySQL, a field of type TIMESTAMP is stored as a 4-byte integer counting the seconds since 1 January 1970 (UTC), while a field of type DATETIME uses a larger packed representation of the calendar date and time. So a timestamp can be the more compact and efficient way of storing dates.

The timestamp has its drawbacks, however. For one, you cannot store a date earlier than 1 January 1970 (or, with MySQL's TIMESTAMP type, later than January 2038). Also, timestamps will need converting in your script to get to a readable character format. Because of this conversion, storing dates as timestamps may not always be the better choice. It really is a case of testing which format works better for your needs.

4. Use LIMIT where possible

In your queries, if you are doing a SELECT and you only expect a certain number of results, using the LIMIT clause can speed the query up considerably. For example, if you have a table of users and need to find one user's record, you can use a query like:

    SELECT user_name FROM users WHERE user_id = 453;

This query is perfectly valid and will return the right result. But you also know there will only be ONE result.
The query above will search the table and find what you want, but then, unless user_id has a unique index, it will still continue searching after that. It would run a lot faster if you could tell the query to stop searching once it has found what you are looking for. LIMIT can do this, as this query shows:

    SELECT user_name FROM users WHERE user_id = 453 LIMIT 1;

Imagine this scenario. You have a table called logins that records every login from a user. It currently contains over 2 000 000 records, and you want to find the first time a user logged in. Bear in mind that because rows are inserted into this table over time, they are already stored in date order. You could run the following query:

    SELECT MIN(login_date) FROM logins WHERE user_id = 4876;

This will return the value you want, but without a suitable index the database has to examine every login date for that user before it can return the lowest one. Our table is already in date order simply because of the way it records data for us. So using LIMIT can be more effective:

    SELECT login_date FROM logins WHERE user_id = 4876 LIMIT 1;

Because the rows were inserted in date order, the first match will normally be the user's first login. Be aware that SQL only guarantees the order of results when the query has an ORDER BY clause, so this shortcut relies on the database happening to return rows in insertion order; adding ORDER BY login_date before LIMIT 1 makes the result reliable.

5. Avoid using LIKE

If you have tried to employ point 1 above, then hopefully you will not need to use LIKE all that much. LIKE is one of the most inefficient ways of searching a table. LIKE performs a text comparison on a field, and with no wildcards it is as efficient as a direct comparison; i.e. WHERE name = 'Jane' is equivalent to WHERE name LIKE 'Jane'. It is when you start introducing wildcard characters that things get really hairy. If you do have to use LIKE, at least try to make efficient use of the wildcards. These are '_' (underscore), which matches exactly one character, and '%', which matches any sequence of zero or more characters.

Let me explain all this with a real world example. In a project I was involved in, we had a SQL database storing logs generated automatically from a mail server.
Unfortunately, the mail server pretty much just dumped a very long string of text into the field that contained the data we wanted. A script had to be written to find all logs that referred to a login by a user into the POP server. The only way we could do this was to search every record for the text "User logged in" in the msg field. The first query developed was something like this:

    SELECT msg FROM logs WHERE msg LIKE '%User logged in%';

This query took about 35 minutes on average to process. Obviously not an ideal situation. The LIKE had to scan through every portion of the msg field in each and every record, looking for "User logged in" anywhere in the text. We eventually determined that "User logged in" always occurred at the end of the text in the msg field, and so we altered the query:

    SELECT msg FROM logs WHERE msg LIKE '%User logged in';

The '%' at the end was removed because there is no text after the string to worry about. The query now only compares our string against the end of the msg field and no longer scans through the entire piece of text stored in msg. The query now ran in under 2 minutes. (This was actually still too long, but how we optimised from there is a little beyond the scope of this article.)

Hopefully, with all these elements put into practice on your next web development project, you can have a database that runs quickly and efficiently, uses as few resources as possible, and won't grind to a halt when the load suddenly increases.
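
As a rough illustration of the first-login lookup from section 4, here is a minimal sketch using Python's built-in sqlite3 module rather than the MySQL setup the article has in mind. The column definitions, the sample rows, the index name idx_logins_user_date and the two helper functions are all assumptions made for the example. The rows are deliberately inserted out of date order to show why the sketch adds ORDER BY before LIMIT 1 instead of relying on insertion order.

```python
import sqlite3

# Hypothetical in-memory stand-in for the article's logins table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE logins (user_id INTEGER NOT NULL, login_date INTEGER NOT NULL)"
)

# login_date holds integer Unix timestamps (see section 3).
# The sample rows are invented and intentionally not in date order.
rows = [
    (4876, 1_200_000_300),  # inserted first, but not the earliest login
    (4876, 1_200_000_100),  # the earliest login for user 4876
    (1234, 1_200_000_050),
    (4876, 1_200_000_200),
]
conn.executemany("INSERT INTO logins (user_id, login_date) VALUES (?, ?)", rows)

# Assumed composite index so either query can find a user's earliest
# login without reading every row for that user.
conn.execute("CREATE INDEX idx_logins_user_date ON logins (user_id, login_date)")


def first_login_min(user_id):
    """Earliest login via the aggregate form, or None if the user has no logins."""
    row = conn.execute(
        "SELECT MIN(login_date) FROM logins WHERE user_id = ?", (user_id,)
    ).fetchone()
    return row[0]


def first_login_limit(user_id):
    """Earliest login via ORDER BY ... LIMIT 1, or None if the user has no logins."""
    row = conn.execute(
        "SELECT login_date FROM logins WHERE user_id = ? "
        "ORDER BY login_date LIMIT 1",
        (user_id,),
    ).fetchone()
    return row[0] if row else None


print(first_login_min(4876))
print(first_login_limit(4876))
```

Both helpers return the same value for the same user. Which form is faster depends on the database engine and the indexes available, so it is worth timing both against your own data.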
Web Development and Automated Dating
Your business can benefit from an online presence because of the flexible nature of online presentation. It is possible to set a cutoff date for the sale of certain products and then use the same date to launch a new line. This can be accomplished through the automated dating system in web builder technology. You can even schedule a product to show up in the clearance section of your site months ahead of time if you wish.

In a brick and mortar business there is a lot of work to be done whenever sales promotions change or items need to be repriced and set apart for clearance sales. Staff members are required to stay late or come in early to make sure the new prices and displays are ready for the consumer. In the world of online business you can prepare all the changes you want whenever it is most convenient. Because content is tied to a dating system, the web builder software will apply those changes down to the minute.

If you've promised your customers a special sale beginning at a specific time, it is possible to manage this change hands-free. When the time comes for the sale to begin, the new information is assigned to the location you predetermined for it and customers gain immediate access to the sale prices. Imagine the day after Thanksgiving, but without the hassle of actually having to show up for work on time.

Many businesses enjoy the flexibility of making changes to their business site well in advance of the date the changes are to take effect. This idea may stand in stark contrast to the way you run your website. In many cases the only changes that can be made are those carried out by the web designer, and a web designer often cannot make immediate or timely changes to your website. Businesses that use a web designer often need to make multiple requests before work is completed on their site.
The growth of self-directed site design is due in large part to the need businesses have to make immediate changes to their site when necessary. Given the immediacy of the web, it has also become important for businesses to learn to manage those changes quickly. In a feat that might rival the North Pole and its inhabitants, web builder technology seems to operate on a magical plane where things change on schedule without the need for direct human intervention.

If you're a website owner, have you ever had trouble getting even the simplest changes made when you need them? It can be frustrating to know that alterations to your website depend on someone else treating your needs as a priority. If your web designer doesn't view you as an immediate concern, you simply have to wait until they are good and ready to help. Orchestrate your own business symphony by making sure every instrument comes in on time.
Server vs Client Sides of Web
Things which exist on one's personal computer are referred to as "client side", and things on the web host as "server side". The average internet user might have first heard "client" in the context of applications installed on the personal computer, such as "email client". Mail systems which can be used from anywhere are "web mail", and exist on the server side. In practical terms, all your office suite programs, media players, image editing programs, most games, and so forth, are probably client side, although "utility" type functions are evolving on the server side. For example, users can now share data in server side spreadsheets and word processors.

Most browser function is defined on the client side, perhaps with some JavaScript add-ons for interaction, calendars, multi-level menus, animated graphics, et cetera. Business enterprise level content management, databases, store systems, and much more are on the server side. Server side programming can range from simple CGI ("Common Gateway Interface") scripts written in a variety of languages, such as Perl, to large databases built in the popular open-source MySQL and accessed through interfaces programmed in PHP. The first implementations of such CGI functions started a new copy of the executing module for each request. To avoid server shutdown from excessive workload, host programmers have evolved better ways, but these need not concern us ordinary mortals.

Fortunately for this author, a web site builder does not need to be an expert in all those server side tools in order to use them. Most hosting companies now offer access to pre-installed modules. Persons wanting better features can purchase modules from third parties to upload and install, such as shopping carts, which are backed by support staff, other users, or similar. If the site builder lacks a very fast connection to the server, s/he can install client side copies of the server software for SQL, PHP, and so on, to emulate behavior on the host.
Sometimes the emulation is less than perfect, such as when the release versions differ, so adjustments may be needed after upload. Why would anyone bother to do this? One reason is that PHP can take over parts of HTML coding, such as with "include files" which hold often used sections of header, footer, or body, and it can serve more robustly than JavaScript for interactivity and utility functions.

If the connection is fast, however, working directly on the host is a reasonable option: present day "shared hosting" and "virtual private/dedicated servers" make it very difficult for one domain owner to break the system for other users. And only privileged employees have access to the power switch. A VPS lets power users get further behind the scenes than a shared hosting customer can.

Caveat: Whether your HTML writing is done directly on the host account or on a personal computer for upload, keep an off-site copy against the day your hosting company drops your content or rolls it back to an earlier version. It will happen.

Whatever approach a person uses for working on the internet, all these elements are examples of "distributed processing", a concept which some large mainframe computer manufacturers had hoped would never be realized. Now that the small guys and gals have forced the issue, by using ever more powerful personal computers in place of dumb terminals, the big dogs have learned to love and profit from it.