Best way to log details. Database or file log?

Unlike many articles in Hungred Dot Com where i share valuable web development information with my readers, this article is something that required everyone to debate on. Every system will require a logging system (unless it is a crappy system). Regardless is transaction log, result log, database log, error log and etc., there is always a need have a quick, secure and reliable log to store these information for any further investigation. And logging details usually fall into file or database category. We need to look at three important thing to consider a media to log our details. There are performance, security and reliability. Let me elaborate the importance of each point.

Log Performance

Performance, performance performance! This is something we all want to know about. Whether file base log or database log is better? We will be looking at long run where our log gets really huge! Delay and performance problem might arise and which media will be more resistance against such problem. Another good thing to consider between these two media is the extra cost of HTTP request comparing to a read and write and the problem of delay arise from huge size. We won't want to consider the alternative media only after the problem appear don't we?

Log Security

Another thing that every hacker will be interested with is the log file. Valuable information is being stored in our log file and it is necessary to consider how secure can either media gives us. Log file may even carry sensitive details of our customers which was log by our program. Hence, considering the security risk of having plain text and a database is important to prevent security hole in our system environment. Each media will have its own way to further secure its media but which is better?

Log Reliability

Why we bother to have a log file if it is unreliable. This is necessary for a system that is required to keep track of a system that handle important transaction. An unreliable log might miss a log due to various reason such as manual query termination, file lock, database down during logging and etc. It is necessary to have all our log in order to capture important incidents.

Other log criteria

Scalability and flexibility is another thing some of you might want to mention. Migration of server and ease of searching etc. is also points that is important for us to consider as a log that cannot find its detail is consider a useless log.

Database Logging

Performance wise, database might be slower when log amount is small. But once the log amount became a huge amount, database based logging might really be much faster. The problem i can see is that it will fight with other urgent query which has higher priority to be executed and table locking. This is usually resolve by using MySQL Insert Delay operation. Another issue will be latency which cause the delay o of the logging operation. In term of searching database logging surely have the upper hand. Security of the log depends solely on the security of the server and database. There might be risk of SQL injection but usually this should be taken care of by the developers.

In term of reliability, using insert delay will risk the chances of our log getting lost especially if the system is a very active one. In a very busy system every few millisecond time interval there will be additional query that makes the database super busy until the insert delay log are pile up and have to wait till the database is quiet to be active. Hence, any accident such my sql die or forcefully terminated, the log query are gone. Furthermore, additional overhead to delay such insert will degrade MySQL performance by a little.

Log file

Log file is the simplest way to achieve a logging system. Its basically just a few lines of code (depend how paranoid logger are you).  While the greatest advantage is its simplicity, the worst problem of file based logging is searching. Most developers who move to file based logging end up not relaying on logs. But usually this can be overcome with some formatting and regular expression. Performance wise, it should be directly opposite a database logging where smaller size will be better and larger it gets worst. Nonetheless,  theoretically both should be the same in term of opening and closing of file regardless of size. It should be solve easily by utilizing buffer. In term of security, file based logging usually uses plain text file. Knowing the name of the log file is equivalence to exposing to the public (especially open source apps). But this is usually resolve using file permission setting.

Unlike database logging, file based logging doesn't required a call to the database. Hence, everything is done by the server scripting language you are using and operation is complete regardless of whether the connection is down(as long as the request pass from client to server is complete).

The other more critical part to choose file based logging is the problem of file locking where only one person is allowed to open the log file at one time. Hence, in a active system this might really post a big problem where logging is done intensively. The most expensive part in file based logging should be searching. Hence, regular expression can be really handy (or pain in the ass).

Summary

Some uses both file based logging and database logging with a little help from a external batch program. But it really depends on the need and required of your logging system. But my job here is done; I have started the fire. Now its time to heat it up. 😀

25 PHP Form Validation Snippets

Recently i have involve myself in another application development. Regular Hungred Dot Com visitors will notice that the site currently offer advertisement space through this form. But really, we as a developers are always looking for such snippets or writing them out from scratch every single time regardless of how many time we know we have store it somewhere in our laptop! Man, its really frustrating searching on Google and find all sort of solution and trying to figure out whether the regular expression implemented is expensive or complete. So i came out with an idea to ease my life a bit and other developers by putting up an article such as this for my/our references. (This can be made into a class if you like to)

Validate Email

We can perform an email validation through this function.

	function isValidEmail($email){
		return eregi('^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,3})$', $email);
	}

After fainted for a few seconds when i saw unreal4u finding, i decided to throw up preg_match solution instead.

	function isValidEmail($email){
		return preg_match('/^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,3})$/i', $email);
	}

PHP 5.2 and above.

function fnValidateEmail($email)
{
  return filter_var($email, FILTER_VALIDATE_EMAIL);
}

Sanitize Email

We can further sanitize our email to ensure that everything is alright.

function fnSanitizeEmaill($string) {
     return  preg_replace( '((?:\n|\r|\t|%0A|%0D|%08|%09)+)i' , '', $string );
}

PHP 5.2 and above.

function fnSanitizeEmaill($url)
{
  return filter_var($url, FILTER_SANITIZE_EMAIL);
}

Validate Email Exist

This is not possible but certain validation can be use to validate email existence.

function check_email($email)
{
	$email_error = false;
	$Email = htmlspecialchars(stripslashes(strip_tags(trim($email)))); //parse unnecessary characters to prevent exploits
	if ($Email == '') { email_error = true; }
	elseif (!eregi('^([a-zA-Z0-9._-])+@([a-zA-Z0-9._-])+\.([a-zA-Z0-9._-])([a-zA-Z0-9._-])+', $Email)) { email_error = true; }
	else {
	list($Email, $domain) = split('@', $Email, 2);
		if (! checkdnsrr($domain, 'MX')) { email_error = true; }
		else {
		$array = array($Email, $domain);
		$Email = implode('@', $array);
		}
	}

	if (email_error) { return false; } else{return true;}
}

Validate Number Only

We can use PHP built-in function to validate whether a given value is a number.

function fnValidateNumber($value)
{
	#is_ double($value);
	#is_ float($value);
	#is_ int($value);
	#is_ integer($value);
	return is_numeric($value);
}

PHP 5.2 and above.

function fnValidateNumber($value)
{
	#return filter_var($value, FILTER_VALIDATE_FLOAT); // float
	return filter_var($value, FILTER_VALIDATE_INT); # int
}

Sanitize Number

We can force all value to be only numeric by sanitize them.

function fnSanitizeNumber($str)
{
	#letters and space only
	return preg_match('/[^0-9]/', '', $str);
}

PHP 5.2 and above.

function fnSanitizeNumber($value)
{
	#return filter_var($value, FILTER_SANITIZE_NUMBER_FLOAT); // float
	return filter_var($value, FILTER_SANITIZE_NUMBER_INT); # int
}

Validate String Only

Sometimes to validate name we can use this function to restrict only letters and spaces.

function fnValidateStringr($str)
{
	#letters and space only
	return preg_match('/^[A-Za-z\s ]+$/', $str);
}

Sanitize String

We can sanitize it instead of validate user input.

function fnSanitizeStringr($str)
{
	#letters and space only
	return preg_replace('/[^A-Za-z\s ]/', '', $str);
}

PHP 5.2 and above. built-in function by PHP provides a much more powerful sanitize capability.

function fnSanitizeStringr($str)
{
	return filter_var($str, FILTER_SANITIZE_STRIPPED); # only 'String' is allowed eg. '<br>HELLO</br>' => 'HELLO'
}

Validate Alphanumeric Characters

This validates alphanumeric characters.

function fnValidateAlphanumeric($string)
{
	return ctype_alnum ($string);
}

Sanitize Alphanumeric Characters

This sanitize alphanumeric characters. eg. "HELLO! Do we have 90 idiots running around here?" => "HELLO Do we have 90 idiots running around here"

function fnSanitizeAlphanumeric($string)
{
	return preg_replace('/[^a-zA-Z0-9]/', '', $string);
}

Validate URL Exist

This function will check whether a given URL exist and not only validate it.

	function url_exist($url)
	{
		$url = @parse_url($url);

		if (!$url)
		{
			return false;
		}

		$url = array_map('trim', $url);
		$url['port'] = (!isset($url['port'])) ? 80 : (int)$url['port'];
		$path = (isset($url['path'])) ? $url['path'] : '';

		if ($path == '')
		{
			$path = '/';
		}

		$path .= (isset($url['query'])) ? '?$url[query]' : '';

		if (isset($url['host']) AND $url['host'] != @gethostbyname($url['host']))
		{
			if (PHP_VERSION >= 5)
			{
				$headers = @get_headers('$url[scheme]://$url[host]:$url[port]$path');
			}
			else
			{
				$fp = fsockopen($url['host'], $url['port'], $errno, $errstr, 30);

				if (!$fp)
				{
					return false;
				}
				fputs($fp, 'HEAD $path HTTP/1.1\r\nHost: $url[host]\r\n\r\n');
				$headers = fread($fp, 4096);
				fclose($fp);
			}
			$headers = (is_array($headers)) ? implode('\n', $headers) : $headers;
			return (bool)preg_match('#^HTTP/.*\s+[(200|301|302)]+\s#i', $headers);
		}
		return false;
	}

Validate URL Format

This function will validate a given url to ensure the format is correct.

function fnValidateUrl($url){
return preg_match('/^(http(s?):\/\/|ftp:\/\/{1})((\w+\.){1,})\w{2,}$/i', $url);
}

PHP 5.2 and above.

function fnValidateUrl($url)
{
  return filter_var($url, FILTER_VALIDATE_URL);
}

Sanitize URL

PHP 5.2 and above.

function fnSanitizeUrl($url)
{
  return filter_var($url, FILTER_SANITIZE_URL);
}

Validate Image Exist

This function will check whether a given image link exist and not only validate it.

	function image_exist($url) {
	if(@file_get_contents($url,0,NULL,0,1)){return 1;}else{ return 0;}
	}

Validate IP Address

This function will validate an IP address.

function fnValidateIP($IP){
	return preg_match('/^(([1-9]?[0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5]).){3}([1-9]?[0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])$/',$IP)
}

PHP 5 and above. This can also specific validation for IPV4 or IPV6.

function fnValidateIP($ip)
{
  return filter_var($ip, FILTER_VALIDATE_IP);
}

Validate Proxy

This function will let us detect proxy visitors even those that are behind anonymous proxy.

function fnValidateProxy(){
	if ($_SERVER['HTTP_X_FORWARDED_FOR']
	   || $_SERVER['HTTP_X_FORWARDED']
	   || $_SERVER['HTTP_FORWARDED_FOR']
	   || $_SERVER['HTTP_VIA']
	   || in_array($_SERVER['REMOTE_PORT'], array(8080,80,6588,8000,3128,553,554))
	   || @fsockopen($_SERVER['REMOTE_ADDR'], 80, $errno, $errstr, 30))
	{
		exit('Proxy detected');
	}
}

Validate Username

Before we validate whether a given username is matches the one in our database, we can perform a validation check first to prevent any unnecessary SQL call.

function fnValidateUsername($username){
	#alphabet, digit, @, _ and . are allow. Minimum 6 character. Maximum 50 characters (email address may be more)
	return preg_match('/^[a-zA-Z\d_@.]{6,50}$/i', $username);
}

Validate Strong Password

Another good thing is to validate whether a particular password given by the user is strong enough. You can do that using this function which required the password to have a minimum of 8 characters, at least 1 uppercase, 1 lowercase and 1 number.

function fnValidatePassword($password){
	#must contain 8 characters, 1 uppercase, 1 lowercase and 1 number
	return preg_match('/^(?=^.{8,}$)((?=.*[A-Za-z0-9])(?=.*[A-Z])(?=.*[a-z]))^.*$/', $password);
}

Validate US Phone Number

This function will validate US phone number for US users.

function fnValidateUSPhone($phoneNo){
	return preg_match('/\(?\d{3}\)?[-\s.]?\d{3}[-\s.]\d{4}/x', $phoneNo);
}

Validate US Postal Code

This function validate US postal code.

function fnValidateUSPostal($postalcode){
	#eg. 92345-3214
	return preg_match('/^([0-9]{5})(-[0-9]{4})?$/i',$postalcode);
}

Validate US Social Security Numbers

This function validate US Social Security Numbers.

function fnValidateUSSocialSecurityCode($ssb){
	#eg. 531-63-5334
	return preg_match('/^[\d]{3}-[\d]{2}-[\d]{4}$/',$ssn);
}

Validate Credit Card

This function validate credit card format.

function fnValidateCreditCard($cc){
	#eg. 718486746312031
	return preg_match('/^(?:4[0-9]{12}(?:[0-9]{3})?|5[1-5][0-9]{14}|6011[0-9]{12}|3(?:0[0-5]|[68][0-9])[0-9]{11}|3[47][0-9]{13})$/', $cc);
}

Validate Date

This is a date format MM-DD-YYYY or MM-DD-YY validation which validate from year 0000-9999.

function fnValidateDate($date){
	#05/12/2109
	#05-12-0009
	#05.12.9909
	#05.12.99
	return preg_match('/^((0?[1-9]|1[012])[- /.](0?[1-9]|[12][0-9]|3[01])[- /.][0-9]?[0-9]?[0-9]{2})*$/', $date);
}

This is a date format YYYY-DD-MM or YY-MM-DD validation which validate from year 0000-9999.

function fnValidateDate($date){
	#2009/12/11
	#2009-12-11
	#2009.12.11
	#09.12.11
	return preg_match('#^([0-9]?[0-9]?[0-9]{2}[- /.](0?[1-9]|1[012])[- /.](0?[1-9]|[12][0-9]|3[01]))*$#'', $date);
}

Validate Hexadecimal Colors

This is a good validation for people who allows their user to change color in their system.

function fnValidateColor($color){
	#CCC
	#CCCCC
	#FFFFF
	return preg_match('/^#(?:(?:[a-f0-9]{3}){1,2})$/i', $color);
}

Make Query Safe

This function help sanitize our data to be SQL injection safe.

function _clean($str){
return is_array($str) ? array_map('_clean', $str) : str_replace('\\', '\\\\', htmlspecialchars((get_magic_quotes_gpc() ? stripslashes($str) : $str), ENT_QUOTES));
}

//usage call it somewhere in beginning of your script
_clean($_POST);
_clean($_GET);
_clean($_REQUEST);// and so on..

Make Data Safe

This function help to keep us protected against XSS, JS and SQL injection by removing tags.

function _clean($str){
return is_array($str) ? array_map('_clean', $str) : str_replace('\\', '\\\\', strip_tags(trim(htmlspecialchars((get_magic_quotes_gpc() ? stripslashes($str) : $str), ENT_QUOTES))));
}

//usage call it somewhere in beginning of your script
_clean($_POST);
_clean($_GET);
_clean($_REQUEST);// and so on..

Summary

A paranoid way to perform a form validation would be to validate first then sanitize your values for precautions. If you think the above snippets were suck or you have any good or awesome snippets to share. Please throw your comment and share with us!

8 jQuery Optimization Tips and Tricks

Its been quite a while since i write anything about jQuery. Furthermore, i have been writing many different optimization articles and jQuery is something that i have not write for quite sometime. Since many uses jQuery quite intensively in either design or application, why don't i give some tips on jQuery too. I think it will be beneficial that such optimization tips can be share and utilize in future application. Therefore, in this article you will see 8 jQuery optimization tips and tricks that might just benefit you during your jQuery coding.

1. Profiling

Once of the most important thing in every optimization, profiling. We can easily perform profiling using firebug. Everyone who is a developer or even designer should be well aware of firebug that could help assist in this task. Profiling allow us to see exactly how many times each function has run and how long does it take. With the help of profiling you will be able to see which function is the bottleneck in your jQuery functions.

2. Benchmarking

Another thing that every optimization will have to come across eventually is benchmarking. In programming, it is very common to use different ways to achieve the same result when we are writing our codes. In order to know which ways give us the best result, benchmarking will be required in order to justify correctly that certain way of achieving the same result is better than the other. Benchmarking across different JavaScript frameworks is the only way we can see which framework run faster than the other in certain ways. Hence, benchmarking is really required in order to justify the efficiency of two or more methods or script.

3. Specific Selector

In jQuery v1.3 below, specific a selector is necessary to optimize jQuery selector speed. A selector that are more specific will result in a faster result found. Hence, an id such as this:

jQuery('#someid').attr('value');

id is always the fastest compare to class or tag where classes is the worst (we do not have document.getElementByClasses don't we). Nonetheless, we can improve the efficiency of the selector by being more specific. Hence, instead of just calling out classes like this:

jQuery('.classes').html();

we can improve it by telling it where it exactly located at

jQuery('div ul li.classes').html();

By being specific in jQuery, we can improve the selector speed. However, this is what happen before jQuery v1.3. After the introduction of 'Sizzling' selector engine, this is no longer the case. We can see that by doing a selector test that the selector has speed up than before. This means that being specific with classes will not have significant differences after v1.3 (because it required more call to be specific) unless you are doing a rather odd selector such as this

jQuery('#id .class div').html();

It will be rather faster with just

jQuery(' .class').html();

Try to avoid id, classes and attribute combination that might just hit your selector badly. The point is to specific a selector but not over specific that might cause a degrade in selector performance.

4. Avoid Unnecessary Selector

Common subexpression elimination is a common way to optimize any programming code. Doing unnecessary selector is expensive. Doing something like this:

jQuery('.class').each(function(){
	jQuery(this).html();
	jQuery(this).find('div').each(function(){
		//etc.
	});
});

which required many selector it is best to use 1 instead since we are repeating ourselves and doing some redundant selector.

jQuery('.class').each(function(){
	var obj = jQuery(this);
	obj.html();
	obj.find('div').each(function(){
		//etc.
	});
});

This is something that we often see in many plugin. Many plugin contains unnecessary selector which degrade the perform of the plugin little by little.

5. Avoid Unnecessary Styling

This is another common mistake that many jQuery developer made. jQuery provides us with the ability to manipulate styling through styling methods. Although it is very convenient to do that but it also contribute to the size of the script and makes maintenance difficult. Furthermore, some of the selector is made solely for styling purposes which makes it even undesirable. To make thing worst, the styles made with jQuery are all inline style which doesn't help caching. Hence, instead of styling everything with jQuery. It is much more advisable to give classes to your tag and style it on a external stylesheet. Instead of writing jQuery styling such as these,

jQuery('div').each(function(){
	var obj = jQuery(this);
	obj.css({'background-color' : 'yellow', 'font-weight' : 'bolder', 'color': '#000', 'font-size': '15px'})
});

Where styling is being applied through jQuery, we can just add a class or even better do it on the HTML file itself!

jQuery('div').each(function(){
	var obj = jQuery(this);
	obj.addClass("mystyling");
});

This way we can easily eliminate a lot of extra code in our script. Unless, absolutely necessary to manipulate styling through jQuery, it is better to leave styling to the stylesheet.

6. Optimize Selector

We can also optimize our selector other than being a bit specific on our selector. The key to optimize our selector is simple. Naive JavaScript provides two method to get id and tag (getElementById and getElementByTagName). Hence, it is always faster to use tag or id on your selector. No matter what algorithm used for classes and attributes selection, the naive built-in JavaScript method will always have the upper hand. Nonetheless, classes and attribute selector have also been improved dramatically through the introduction of 'sizzling' selector engine. However, it is always advisable to reduce the number of attribute selector as they are the slowest in many cases. On the other hand, making your selector as simple as possible is good for pure tag and id selector but may not be true for attribute and selector. Hence,

jQuery('div')
jQuery('#id')
jQuery('.classes')
jQuery('p:last')

is considered as good selectors while the below might not give you very good selector result.

jQuery('body div div')
jQuery('div#id')
jQuery('#id div.classes')
jQuery('.classes p:last')

It is important to treat jQuery as a helper of JavaScript instead of a new language.

7. Avoid DOM Manipulation

Since we are using jQuery as a framework, making life difficult for ourselves is the most silly thing anyone can make. Example, using jQuery to write a table in DOM style~

var table = $('<table></table>');
for (var i = 0; i < 6; i++) {
  var tr = $('<tr></tr>');
  for (var j = 0; j < 7; j++) {
    tr.append('<td></td>');
  }
  table.append($tr);
}

Leaving aside the additional operation on append() each time we call, we are also doing some expensive DOM creation each step when we can really help ourselves by doing a one line query and leave the rest to jQuery. For customize table you might do this.

var table = '<table>'
for (var i = 0; i < 6; i++) {
  var tr = '<tr>';
  for (var j = 0; j < 7; j++) {
    tr +='<td></td>';
  }
  tr +='</tr>';
  table += tr;
}
table += '</table>';
var table = $(table);

For a static table, its really one line.

var table = $('<table><tr><td><td>...</tr><tr>...</tr>...</table>');

But if you really want a table, do it in HTML. You don't need to create everything using jQuery. Treat jQuery as an expensive action and use it when absolute needed. (same goes with styling)

8. Balance between JavaScript and jQuery

jQuery is excellent to speed up ANY front-end developer job. However, there are times when jQuery may take more time to perform certain action than using JavaScript. For example,

	css: function( key, value ) {
		// ignore negative width and height values
		if ( (key == 'width' || key == 'height') && parseFloat(value) < 0 )
			value = undefined;
		return this.attr( key, value, "curCSS" );
	}

The above is a CSS method in jQuery that takes a key and a value which we are familiar using in this way

jQuery('#id').css('display', 'block');

We can see that jQuery present certain overhead for doing something simple where JavaScript can also achieve.

document.getElementById('id').style.display = 'block';

Other methods such as show(), hide(), hasClass() and etc. also present some overhead which you should balance between complexity and efficiency. Hence, we must understand some of the cost of jQuery action and balance between JavaScript and jQuery. jQuery is build for general public and making a general public framework might means additional operation to meet everyone needs. Hence, if you truly going all out on optimization, you should consider the trade-off of some action between JavaScript and jQuery.

Summary

jQuery is a great framework that ease all our development life. Operation that gets complicated can easily become simple with jQuery. However, like i stress the important of keeping out unnecessary jQuery code will certainly help generating a more efficiency and optimize jQuery script.

JavaScript Framework Selector Benchmark Tool

Many front-end developers nowadays are all using javascript framework such as jQuery, Dojo, MooTools, Prototype, etc. Most of us will be wondering which framework is more efficient than the other. Mootools brings us a great tool called Slickspeed to benchmark different framework selector performances against each other. You can even download the source code and test it on your own local drive or server. However, i notice that the result was outdated and a new test should be performed. Ever since jQuery new 'Sizzle' selector engine was released on its version 1.3, many people has been asking how much of a performance it has gain. Furthermore, Resig has been inviting other JavaScript frameworks to implementing Sizzle in its codebase and i was left wondering who did implement Sizzle in its codebase. Out of curiosity, i decided to find out.

Looking at how JavaScript frameworks change as time goes, i think it will be good to have a more dynamic benchmark tool. Therefore, using Slickspeed, i modify the source a little to bring you a more dynamic Slickspeed to benchmark your curiosity.

The tool is fairly easy to use. It is exactly the same as Slickspeed with the additional function of 'add' and 'remove' to benchmark future frameworks. The criteria is simple,

  • 'Add' will required the three textbox to be filled
  • 'Remove' will only required the name to be filled.
  • No name should be the same
  • All path can be take from Google Ajax Framework CDN
  • Function is the symbol used for the framework.

Currently, additional selectors can't be added at the moment but might be in the future so that we can perform benchmark on real selector situation with real document. Below is the updated program but the page must be fully loaded before it is ready to use.

You can also access the full screen version of JavaScript framework selector benchmark tool.

15 Ways to Optimize Your SQL Queries

Previous article was on 10 Ways To Destroy A SQL Database that sort of teaches you what mistakes many company might make on their database that will eventually lead to a database destroy. In this article,  you will get to know 15 ways to optimize your SQL queries. Many ways are common to optimize a query while others are less obvious.

Indexes

Index your column is a common way to optimize your search result. Nonetheless, one must fully understand how does indexing work in each database in order to fully utilize indexes. On the other hand, useless and simply indexing without understanding how it work might just do the opposite.

Symbol Operator

Symbol operator such as >,<,=,!=, etc. are very helpful in our query. We can optimize some of our query with symbol operator provided the column is indexed. For example,

SELECT * FROM TABLE WHERE COLUMN > 16

Now, the above query is not optimized due to the fact that the DBMS will have to look for the value 16 THEN scan forward to value 16 and below. On the other hand, a optimized value will be

SELECT * FROM TABLE WHERE COLUMN >= 15

This way the DBMS might jump straight away to value 15 instead. It's pretty much the same way how we find a value 15 (we scan through and target ONLY 15) compare to a value smaller than 16 (we have to determine whether the value is smaller than 16; additional operation).

Wildcard

In SQL, wildcard is provided for us with '%' symbol. Using wildcard will definitely slow down your query especially for table that are really huge. We can optimize our query with wildcard by doing a postfix wildcard instead of pre or full wildcard.

#Full wildcard
SELECT * FROM TABLE WHERE COLUMN LIKE '%hello%';
#Postfix wildcard
SELECT * FROM TABLE WHERE COLUMN LIKE  'hello%';
#Prefix wildcard
SELECT * FROM TABLE WHERE COLUMN LIKE  '%hello';

That column must be indexed for such optimize to be applied.

P.S: Doing a full wildcard in a few million records table is equivalence to killing the database.

NOT Operator

Try to avoid NOT operator in SQL. It is much faster to search for an exact match (positive operator) such as using the LIKE, IN, EXIST or = symbol operator instead of a negative operator such as NOT LIKE, NOT IN, NOT EXIST or != symbol. Using a negative operator will cause the search to find every single row to identify that they are ALL not belong or exist within the table. On the other hand, using a positive operator just stop immediately once the result has been found. Imagine you have 1 million record in a table. That's bad.

COUNT VS EXIST

Some of us might use COUNT operator to determine whether a particular data exist

SELECT COLUMN FROM TABLE WHERE COUNT(COLUMN) > 0

Similarly, this is very bad query since count will search for all record exist on the table to determine the numeric value of field 'COLUMN'. The better alternative will be to use the EXIST operator where it will stop once it found the first record. Hence, it exist.

Wildcard VS Substr

Most developer practiced Indexing. Hence, if a particular COLUMN has been indexed, it is best to use wildcard instead of substr.

#BAD
SELECT * FROM TABLE WHERE  substr ( COLUMN, 1, 1 ) = 'value'.

The above will substr every single row in order to seek for the single character 'value'. On the other hand,

#BETTER
SELECT * FROM TABLE WHERE  COLUMN = 'value%'.

Wildcard query will run faster if the above query is searching for all rows that contain 'value' as the first character. Example,

#SEARCH FOR ALL ROWS WITH THE FIRST CHARACTER AS 'E'
SELECT * FROM TABLE WHERE  COLUMN = 'E%'.

Index Unique Column

Some database such as MySQL search better with column that are unique and indexed. Hence, it is best to remember to index those columns that are unique. And if the column is truly unique, declare them as one. However, if that particular column was never used for searching purposes, it gives no reason to index that particular column although it is given unique.

Max and Min Operators

Max and Min operators look for the maximum or minimum value in a column. We can further optimize this by placing a indexing on that particular columnMisleading We can use Max or Min on columns that already established such Indexes. But if that particular column is frequently use, having an index should help speed up such searching and at the same time speed max and min operators. This makes searching for maximum or minimum value faster. Deliberate having an index just to speed up Max and Min is always not advisable. Its like sacrifice the whole forest for a merely a tree.

Data Types

Use the most efficient (smallest) data types possible. It is unnecessary and sometimes dangerous to provide a huge data type when a smaller one will be more than sufficient to optimize your structure. Example, using the smaller integer types if possible to get smaller tables. MEDIUMINT is often a better choice than INT because a MEDIUMINT column uses 25% less space. On the other hand, VARCHAR will be better than longtext to store an email or small details.

Primary Index

The primary column that is used for indexing should be made as short as possible. This makes identification of each row easy and efficient by the DBMS.

String indexing

It is unnecessary to index the whole string when a prefix or postfix of the string can be indexed instead. Especially if the prefix or postfix of the string provides a unique identifier for the string, it is advisable to perform such indexing. Shorter indexes are faster, not only because they require less disk space, but because they also give you more hits in the index cache, and thus fewer disk seeks.

Limit The Result

Another common way of optimizing your query is to minimize the number of row return. If a table have a few billion records and a search query without limitation will just break the database with a simple SQL query such as this.

SELECT * FROM TABLE

Hence, don't be lazy and try to limit the result turn which is both efficient and can help minimize the damage of an SQL injection attack.

SELECT * FROM TABLE WHERE 1 LIMIT 10

Use Default Value

If you are using MySQL, take advantage of the fact that columns have default values. Insert values explicitly only when the value to be inserted differs from the default. This reduces the parsing that MySQL must do and improves the insert speed.

In Subquery

Some of us will use a subquery within the IN operator such as this.

SELECT * FROM TABLE WHERE COLUMN IN (SELECT COLUMN FROM TABLE)

Doing this is very expensive because SQL query will evaluate the outer query first before proceed with the inner query. Instead we can use this instead.

SELECT * FROM TABLE, (SELECT COLUMN FROM TABLE) as dummytable WHERE dummytable.COLUMN = TABLE.COLUMN;

Using dummy table is better than using an IN operator to do a subquery. Alternative, an exist operator is also better.

Utilize Union instead of OR

Indexes lose their speed advantage when using them in OR-situations in MySQL at least. Hence, this will not be useful although indexes is being applied

SELECT * FROM TABLE WHERE COLUMN_A = 'value' OR COLUMN_B = 'value'

On the other hand, using Union such as this will utilize Indexes.

SELECT * FROM TABLE WHERE COLUMN_A = 'value'
UNION
SELECT * FROM TABLE WHERE COLUMN_B = 'value'

Hence, run faster.

Summary

Definitely, these optimization tips doesn't guarantee that your queries won't become your system bottleneck. It will require much more benchmarking and profiling to further optimize your SQL queries. However, the above simple optimization can be utilize by anyone that might just help save some colleague rich bowl while you learn to write good queries. (its either you or your team leader/manager)