Installing a database
The database we will use on this book, as already explained, is MySQL.
Now, considering that you have a Red Hat server, the procedure to install MySQL from the RPM repository is quite easy:
$ yum install mysql mysql-server
Now, you need to set up the MySQL service to start automatically when the system boots:
$ chkconfig --levels 235 mysqld on $ /etc/init.d/mysqld start
Tip
Remember to set a password for the MySQL root user
To set a password for the root, you can run these two commands:
/usr/bin/mysqladmin -u root password 'new-password' /usr/bin/mysqladmin -u root -h hostname-of-your.zabbix.db password 'new-password'
Alternatively, you can run:
/usr/bin/mysql_secure_installation
This will also help you to remove the test databases and anonymous user data that was created by default. This is strongly recommended for production servers.
Now, it's time to create the Zabbix database. For this, we can use the following commands:
$ mysql -u root -p $ mysql> CREATE DATABASE zabbix CHARACTER SET UTF8; Query OK, 1 row affected (0.00 sec) $ mysql> GRANT ALL PRIVILEGES on zabbix.* to 'zabbixuser'@'localhost' IDENTIFIED BY 'zabbixpassword'; Query OK, 0 rows affected (0.00 sec) $ mysql> FLUSH PRIVILEGES; $ mysql> quit
Next, we need to restore the default Zabbix MySQL database files:
$ mysql -u zabbixuser -pzabbixpassword zabbix< /usr/share/doc/zabbix-server-mysql-2.4.0/create/schema.sql $ mysql -u zabbixuser -pzabbixpassword zabbix < /usr/share/doc/zabbix-server-mysql-2.4.0/create/images.sql $ mysql -u zabbixuser -pzabbixpassword zabbix < /usr/share/doc/zabbix-server-mysql-2.4.0/create/data.sql
Now, our database is ready. Before we begin to play with the database, it's important to do some consideration about database size and heavy tasks against it.
Considering the database size
Zabbix uses two main groups of tables to store its data:
- History
- Trends
Now, the space consumed by these tables is influenced by:
- Items: This is the number of items you're going to acquire
- Refresh rate: This is the mean average refresh rate of our items
- Space to store values: This depends on RDBMS
The space used to store data can vary due to the database, but we can resume the space used by these tables in the following table:
Type of measure |
Retention in days |
Space required |
---|---|---|
History |
30 |
10.8 G |
Events |
1825 (5 years) |
15.7 GB |
Trends |
1825 (5 years) |
26.7 GB |
Total |
NA |
53.2 GB |
This calculation is, of course, done considering the environment after 5 years of retention. Anyway, we need to have an environment ready to survive this period of time and retain the same shape that it had when it was installed. We can easily change the history and trends retention policy per item. This means that we can create a template with items that have a different history retention by default. Normally, the history is set to 30 days, but for some kind of measure (such as in web scenarios) or other particular measures, we need to keep all the values for more than a week. This permits us to change this value on each item.
MySQL partitioning
Now that we are aware of how big our database will be, it's easy to imagine that housekeeping will be a heavy task and the time, CPU, and resource consumed by this one will grow together with the database size.
Housekeeping is in charge to remove the outdated metrics from the database and the information deleted by a user, and as we've seen the history, trends, and events tables are, after some time, huge tables. This explains why the process is so heavy to manage.
The only way we can improve performances once we have reached this volume of data is by using partitioning and disabling the housekeeper altogether.
Partitioning the history and trend tables will provide us with many major benefits:
- All history data in a table for a particular defined window time are self-contained in its own partition. This allows you to easily delete old data without impacting the database performance.
- When you use MySQL with InnoDB, and if you delete data contained in a table, the space is not released. The space freed is marked as free, but the disk space consumed will not change. When you use partition, and if you drop a partition, the space is immediately freed.
- Query performance can be improved dramatically in some situations, in particular, when there is heavy access to the table's rows in a single partition.
- When a query updates a huge amount of data or needs access to a large percentage of the partition, the sequential scan is often more efficient than the index usage with a random access or scattered reads against this index.
Unfortunately, Zabbix is not able to manage the partitions. So, we need to disable housekeeping, and use an external process to accomplish housekeeping.
What we need to have is a stored procedure that does all the work for us.
The following is the stored procedure:
DELIMITER $$ CREATE PROCEDURE `partition_maintenance`(SCHEMA_NAME VARCHAR(32), TABLE_NAME VARCHAR(32), KEEP_DATA_DAYS INT, HOURLY_INTERVAL INT, CREATE_NEXT_INTERVALS INT) BEGIN DECLARE OLDER_THAN_PARTITION_DATE VARCHAR(16); DECLARE PARTITION_NAME VARCHAR(16); DECLARE LESS_THAN_TIMESTAMP INT; DECLARE CUR_TIME INT;
Until here, we have declared the variable we need after. Now, on the next line, we will call the stored procedure responsible to check whether a partition is already present and if not, we will create them:
CALL partition_verify(SCHEMA_NAME, TABLE_NAME, HOURLY_INTERVAL); SET CUR_TIME = UNIX_TIMESTAMP(DATE_FORMAT(NOW(), '%Y-%m-%d 00:00:00')); IF DATE(NOW()) = '2014-04-01' THEN SET CUR_TIME = UNIX_TIMESTAMP(DATE_FORMAT(DATE_ADD(NOW(), INTERVAL 1 DAY), '%Y-%m-%d 00:00:00')); END IF; SET @__interval = 1; create_loop: LOOP IF @__interval > CREATE_NEXT_INTERVALS THEN LEAVE create_loop; END IF; SET LESS_THAN_TIMESTAMP = CUR_TIME + (HOURLY_INTERVAL * @__interval * 3600); SET PARTITION_NAME = FROM_UNIXTIME(CUR_TIME + HOURLY_INTERVAL * (@__interval - 1) * 3600, 'p%Y%m%d%H00');
Now that we have calculated all the parameters needed by the create_partition
procedure, we can run it. This stored procedure will create the new partition on the defined schema:
CALL partition_create(SCHEMA_NAME, TABLE_NAME, PARTITION_NAME, LESS_THAN_TIMESTAMP); SET @__interval=@__interval+1; END LOOP; SET OLDER_THAN_PARTITION_DATE=DATE_FORMAT(DATE_SUB(NOW(), INTERVAL KEEP_DATA_DAYS DAY), '%Y%m%d0000');
The section that follows is responsible to remove the older partitions, using the OLDER_TAN_PARTITION_DATE
procedure, which we have calculated on the lines before:
CALL partition_drop(SCHEMA_NAME, TABLE_NAME, OLDER_THAN_PARTITION_DATE); END$$ DELIMITER ;
This stored procedure will be the core of our housekeeping. It will be called with the following syntax:
CALL partition_maintenance('<zabbix_db_name>', '<table_name>', <days_to_keep_data>, <hourly_interval>, <num_future_intervals_to_create>)
The procedure works based on 1 hour intervals. Next, if you want to partition on a daily basis, the interval will be 24 hours. Instead, if you want 1 hour partitioning, the interval will be 1.
You need to specify the number of intervals that you want created in advance. For example, if you want 2 weeks interval of future partitions, use 14. If your interval is 1 (for hourly partitioning), then the number of intervals to create is 336 (24*14).
This stored procedure uses some other stores procedures:
partition_create
: This creates the partition for the specified tablepartition_verify
: This checks whether the partition is enabled on a table, if not, then create a single partitionpartition_drop
: This drops partitions older than a timestamp
For all the details about these stored procedures, see Appendix A, Partitioning the Zabbix Database.
Once you've created all the required stored procedures, you need to change two indexes to enable them in order to be ready for a partitioned table:
mysql> Alter table history_text drop primary key, add index (id), drop index history_text_2, add index history_text_2 (itemid, id); Query OK, 0 rows affected (0.49 sec) Records: 0 Duplicates: 0 Warnings: 0 mysql> Alter table history_log drop primary key, add index (id), drop index history_log_2, add index history_log_2 (itemid, id); Query OK, 0 rows affected (2.71 sec) Records: 0 Duplicates: 0 Warnings: 0
Once this is done, you need to schedule the partition_maintenance_all
stored procedure with a cron job. For more details about the partition_maintenance_all
procedure, please check the instructions contained in Appendix A, Partitioning the Zabbix Database. The cron job needs to execute the following command:
mysql -h <zabbix_db_host> -u<zabbixuser> -p<zabbixpassword> zabbixdatabase -e "CALL partition_maintenance_all('zabbix');"
Once this has been set, you need to bear in mind to disable the housekeeping for history and trends. Verify that the Override item <trend/history> period Zabbix configuration is checked for both history and trends. Here, you need to set the Data storage period (in days) box for history and trends to the value you've defined in your procedure, our example in Appendix A, Partitioning the Zabbix Database is of 28 and 730.