FBQ: GridPane monit killing MySQL / MariaDB and Tuning MySQL / MariaDB


Tags
mysql, Tutorial, facebook, mariadb
Created Date
November 29, 2021


What are Facebook Questions (FBQ)

When I see a Facebook post that I think deserves a write-up, or I respond to one with a huge paragraph, I usually end up posting it somewhere because I think the content is useful for everyone.

GridPane monit killing MySQL / MariaDB and Tuning MySQL / MariaDB


What is monit?

If you don't know about monit, here's a quote from their website:

Monit is a small Open Source utility for managing and monitoring Unix systems. Monit conducts automatic maintenance and repair and can execute meaningful causal actions in error situations.

You can read more about monit on their website: https://mmonit.com/monit/

GridPane monit Killing MySQL / MariaDB

GridPane uses monit to monitor system processes and to alert or take an action based on specific rules. For MySQL / MariaDB specifically, monit is configured to restart the service if it reaches a specific memory usage threshold on the server.

MySQL / MariaDB GridPane monit Configuration

Here's the GridPane configuration for MySQL / MariaDB on GridPane servers.

if mem > 1513 MB for 10 cycles then exec "/usr/local/bin/gpmonitor MYSQL MEM_HIGH warning" AND repeat every 10 cycles
if mem > 1513 MB for 20 cycles then exec "/usr/local/bin/gpmonitor MYSQL MEM_RESTART error"

As you can see in this example, if the MySQL / MariaDB process stays above 1513MB for 10 cycles (the cycle interval is configured in /etc/monitrc), monit fires a warning, and if it stays above that threshold for 20 cycles, monit restarts MySQL / MariaDB.
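For context, these two rules live inside a monit service check block. Here's a rough sketch of what such a check might look like; the pidfile path and start/stop commands are my assumptions and GridPane's actual file will differ.

check process mysql with pidfile /var/run/mysqld/mysqld.pid
  start program = "/usr/sbin/service mysql start"
  stop program  = "/usr/sbin/service mysql stop"
  # warn after 10 cycles above the threshold, trigger the restart handler after 20
  if mem > 1513 MB for 10 cycles then exec "/usr/local/bin/gpmonitor MYSQL MEM_HIGH warning" AND repeat every 10 cycles
  if mem > 1513 MB for 20 cycles then exec "/usr/local/bin/gpmonitor MYSQL MEM_RESTART error"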

Why did GridPane set it up this way?

I've talked to Jeff about why this is in place. He's stated that it's to ensure a server doesn't literally fall to its knees and require a cold reset (i.e. it can't reboot gracefully).

I've argued that restarting MySQL / MariaDB induces an outage, as non-cached requests will return an error while the database is down.

Will I be notified of MySQL / MariaDB Restarts by monit?

You will only be notified if you log in to the GridPane interface and see the alert, or if you have Slack notifications set up.

Modifying GridPane Monit Configuration

You can change the monit configuration that GridPane manages using the following article on their knowledge base.

You don't want to modify the files directly.

What happens if I upgrade my server's memory?

As per Jeff from GridPane, if you upgrade your server's memory, there is a script that runs every 15 minutes to adjust monit appropriately. You can in fact run it yourself, as per the article above.

gpmonit mysql \
-cpu-warning-percent 120 -cpu-warning-cycles 10 \
-cpu-restart-percent 180 -cpu-restart-cycles 5 \
-mem-high-mb 2692 -mem-high-cycles 10 \
-mem-restart-mb 3365 -mem-restart-cycles 10

You can also simply run gpmonit mysql and the above settings will be configured automatically. This is what runs every 15 minutes on GridPane servers.
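If you're curious whether that scheduled run is present on your server, you can look for it in cron; this is just a quick check, and GridPane may schedule it differently.

crontab -l | grep -i gpmonit
grep -ri gpmonit /etc/cron* 2>/dev/null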

However, I've observed servers with 15GB of memory that still have a monit configuration to restart MySQL / MariaDB when it uses 1500MB of memory, which of course is problematic.

All of GridPane's settings are stored in /root/gridenv, which is write-protected using chattr. There is a file called "monit.env" that contains the following configuration settings:

mysql-mem-warning-threshold-mb:2790
mysql-mem-warning-cycles:10
mysql-mem-restart-threshold-mb:2790
mysql-mem-restart-cycles:20

The above file is just an example, yours might be different.
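To see what your own server is set to, you can simply read the file and check its attributes (inspection only, no modification needed):

cat /root/gridenv/monit.env
lsattr -d /root/gridenv    # the 'i' flag is the immutable attribute set by chattr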

Restarting MySQL / MariaDB with monit, bad idea?

If your server does see an increase in traffic, monit may start restarting MySQL / MariaDB prematurely if it isn't configured properly, and in some cases even when it is. The formula GridPane provides for setting the monit restart threshold for MySQL / MariaDB is as follows (see the small sketch after the table):

System Memory      monit Config
<= 1500 MB         500 MB
> 1500 MB          70% of system memory
> 7000 MB          90% of system memory
> 16000 MB         50% of system memory
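As a rough illustration of how those tiers turn into an actual number, here's a small shell sketch of my own (not GridPane's gpmonit script) that reads total system memory and applies the table above.

#!/bin/bash
# Sketch only: derive a MySQL / MariaDB monit restart threshold from the tiers above.
total_mb=$(awk '/MemTotal/ {printf "%d", $2/1024}' /proc/meminfo)

if [ "$total_mb" -le 1500 ]; then
    threshold_mb=500
elif [ "$total_mb" -le 7000 ]; then
    threshold_mb=$(( total_mb * 70 / 100 ))
elif [ "$total_mb" -le 16000 ]; then
    threshold_mb=$(( total_mb * 90 / 100 ))
else
    threshold_mb=$(( total_mb * 50 / 100 ))
fi

echo "Suggested monit restart threshold: ${threshold_mb} MB"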

Diagnosing MySQL / MariaDB High Memory Issues

If you do find that MySQL / MariaDB is taking up a considerable amount of memory, you can run mysqltuner.pl, which may or may not already be on your server. You can also download it from GitHub.
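If it isn't already on the server, one common way to fetch and run it is straight from the GitHub repo (major/MySQLTuner-perl); verify the URL yourself before running anything you download.

wget https://raw.githubusercontent.com/major/MySQLTuner-perl/master/mysqltuner.pl
perl mysqltuner.pl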

Reading mysqltuner.pl output

Let's look at the mysqltuner.pl output that the user provided in this Facebook post:

[OK] Maximum reached memory usage: 179.5M (4.55% of installed RAM)
[OK] Maximum possible memory usage: 341.4M (8.66% of installed RAM)
[OK] Overall possible memory usage with other process is compatible with memory available
[OK] Slow queries: 0% (0/41K)
[OK] Highest usage of available connections: 6% (10/151)
[!!] Aborted connections: 3.98%  (184/4625)
[OK] InnoDB File per table is activated
[!!] InnoDB buffer pool / data size: 128.0M/144.5M
[!!] Ratio InnoDB log file size / InnoDB Buffer pool size (156.25 %): 100.0M * 2/128.0M should be equal 25%
[--] Number of InnoDB Buffer Pool Chunk : 1 for 1 Buffer Pool Instance(s)

InnoDB Buffer Pool

I saw that the InnoDB buffer pool (128.0M) was smaller than the data stored in MySQL / MariaDB (144.5M). This means some data will be read from disk rather than served from memory, which can cause performance issues since memory is far faster than disk. Increasing your InnoDB Buffer Pool to 256MB or 512MB in this instance would be a good idea. You can do this using the GridPane CLI commands located here: https://gridpane.com/kb/configure-mysql/
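If you want to check or test this by hand rather than through the GridPane CLI, here's a quick sketch (run as root; innodb_buffer_pool_size is resizable at runtime on MySQL 5.7+ / MariaDB 10.2+, otherwise it needs a restart):

mysql -e "SHOW VARIABLES LIKE 'innodb_buffer_pool_size';"
mysql -e "SET GLOBAL innodb_buffer_pool_size = 512 * 1024 * 1024;"

To make it persistent, set innodb_buffer_pool_size = 512M under the [mysqld] section of your configuration, ideally via the GridPane CLI linked above.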

InnoDB Log File Size

Also, change innodb_log_file_size to 128MB, as it should be roughly the InnoDB Buffer Pool (512MB) divided by 4. This one is pretty straightforward.
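Unlike the buffer pool, innodb_log_file_size is not a runtime variable, so it has to go in the configuration file and requires a MySQL / MariaDB restart to take effect. A minimal sketch, assuming the 512MB buffer pool from above:

[mysqld]
innodb_log_file_size = 128M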

Thread Pool Size

You can change thread_pool_size to match the number of cores you have on your server. Right now it's set to 1, which means that if multiple queries hit SQL at once, they'll effectively be handled one at a time. You might want to set this to 2 or 4, depending on your core count.
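A quick way to check your core count and the current value (thread_pool_size is a MariaDB server variable; set it persistently in your configuration, ideally via the GridPane CLI):

nproc                                            # number of CPU cores
mysql -e "SHOW VARIABLES LIKE 'thread_pool_size';"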

Aborted Connections and max_connections

I'd also change max_connections down to 100 from 151, as your aborted connections might be related to thread_pool_size being 1 and a surge of SQL queries backing up. Even then, I doubt you're getting 151 concurrent connections on the number of brochure sites you have, unless of course something is bypassing cache. This is where New Relic and Netdata could help track down your connection count issues. Perhaps it's the backup concurrency you've set for GridPane backups.
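max_connections is dynamic, so you can check how connections are behaving and test the lower ceiling before persisting it in your configuration. A sketch, run as root:

mysql -e "SHOW GLOBAL STATUS LIKE 'Aborted_connects';"
mysql -e "SHOW GLOBAL STATUS LIKE 'Max_used_connections';"
mysql -e "SET GLOBAL max_connections = 100;"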

Restart MySQL / MariaDB

Once you've made the above changes, restart MySQL / MariaDB and then run mysqltuner.pl again to figure out your maximum possible MySQL memory usage, and set the monit threshold just above it.
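On a typical Ubuntu-based server that could look like the following; the service name may be mysql or mariadb depending on your stack, so adjust accordingly.

systemctl restart mysql
perl mysqltuner.pl | grep -i "memory usage"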

Outro

Make sure you set up the appropriate Slack notifications to catch MySQL / MariaDB restarts. Otherwise, if a restart goes unnoticed, you might have clients notifying you that they had problems with their site. Or worse, you end up troubleshooting the issue and never realize it's monit causing it.