Unlike many articles in Hungred Dot Com where i share valuable web development information with my readers, this article is something that required everyone to debate on. Every system will require a logging system (unless it is a crappy system). Regardless is transaction log, result log, database log, error log and etc., there is always a need have a quick, secure and reliable log to store these information for any further investigation. And logging details usually fall into file or database category. We need to look at three important thing to consider a media to log our details. There are performance, security and reliability. Let me elaborate the importance of each point.
Performance, performance performance! This is something we all want to know about. Whether file base log or database log is better? We will be looking at long run where our log gets really huge! Delay and performance problem might arise and which media will be more resistance against such problem. Another good thing to consider between these two media is the extra cost of HTTP request comparing to a read and write and the problem of delay arise from huge size. We won't want to consider the alternative media only after the problem appear don't we?
Another thing that every hacker will be interested with is the log file. Valuable information is being stored in our log file and it is necessary to consider how secure can either media gives us. Log file may even carry sensitive details of our customers which was log by our program. Hence, considering the security risk of having plain text and a database is important to prevent security hole in our system environment. Each media will have its own way to further secure its media but which is better?
Why we bother to have a log file if it is unreliable. This is necessary for a system that is required to keep track of a system that handle important transaction. An unreliable log might miss a log due to various reason such as manual query termination, file lock, database down during logging and etc. It is necessary to have all our log in order to capture important incidents.
Other log criteria
Scalability and flexibility is another thing some of you might want to mention. Migration of server and ease of searching etc. is also points that is important for us to consider as a log that cannot find its detail is consider a useless log.
Performance wise, database might be slower when log amount is small. But once the log amount became a huge amount, database based logging might really be much faster. The problem i can see is that it will fight with other urgent query which has higher priority to be executed and table locking. This is usually resolve by using MySQL Insert Delay operation. Another issue will be latency which cause the delay o of the logging operation. In term of searching database logging surely have the upper hand. Security of the log depends solely on the security of the server and database. There might be risk of SQL injection but usually this should be taken care of by the developers.
In term of reliability, using insert delay will risk the chances of our log getting lost especially if the system is a very active one. In a very busy system every few millisecond time interval there will be additional query that makes the database super busy until the insert delay log are pile up and have to wait till the database is quiet to be active. Hence, any accident such my sql die or forcefully terminated, the log query are gone. Furthermore, additional overhead to delay such insert will degrade MySQL performance by a little.
Log file is the simplest way to achieve a logging system. Its basically just a few lines of code (depend how paranoid logger are you). While the greatest advantage is its simplicity, the worst problem of file based logging is searching. Most developers who move to file based logging end up not relaying on logs. But usually this can be overcome with some formatting and regular expression. Performance wise, it should be directly opposite a database logging where smaller size will be better and larger it gets worst. Nonetheless, theoretically both should be the same in term of opening and closing of file regardless of size. It should be solve easily by utilizing buffer. In term of security, file based logging usually uses plain text file. Knowing the name of the log file is equivalence to exposing to the public (especially open source apps). But this is usually resolve using file permission setting.
Unlike database logging, file based logging doesn't required a call to the database. Hence, everything is done by the server scripting language you are using and operation is complete regardless of whether the connection is down(as long as the request pass from client to server is complete).
The other more critical part to choose file based logging is the problem of file locking where only one person is allowed to open the log file at one time. Hence, in a active system this might really post a big problem where logging is done intensively. The most expensive part in file based logging should be searching. Hence, regular expression can be really handy (or pain in the ass).
Some uses both file based logging and database logging with a little help from a external batch program. But it really depends on the need and required of your logging system. But my job here is done; I have started the fire. Now its time to heat it up. 😀