In this blog, I will provide answers to the Q & A for the Identifying and Solving Database Performance Issues with PMM webinar.
First, I want to thank everybody for attending the September 15 webinar. The recording and slides for the webinar are available here. Thanks for so many good questions. Below is the list of your questions that I wasn’t able to answer during the webinar, with responses:
Q: PMM has some restrictions working with metrics from RDS instances (AWS)? Aurora for example?
Query analytics can be done only using performance_schema as a query source for RDS/Aurora. No slow log is possible at this moment, as it’s not in the file but the mysql.slow_log table. As for metrics, only MySQL-related ones can be collected, thus only MySQL graphs are available. No system metrics. However, we can look at what we can fetch from CloudWatch to close this gap.
Q: How many ports are needed for each MySQL client? This is related to how many ports need to be open on the firewall.
One metric service per instance requires one port (e.g., mysql:metrics, linux:metrics, mongodb:metrics). The MySQL query service (mysql:queries) does not require a port to open, as the agent connects to the server — unlike the server to client connection in the case of metric services. Usually, the diagram looks like this.
Q: Is it possible to add a customized item for additional monitoring on the client OS?
It is possible to enable additional options on the linux:metrics service (pmm-linux-metrics-42000, node_exporter process). However, it requires “hacking” the service file.
Q: Does PMM have any alerting feature built-in? Or, is there any way to feed alerts to other monitoring framework such as Nagios?
Currently, it doesn’t. But we have plans to add alerting functionality.
Q: Can pmm-client be delivered as an RPM instead of a tar ball?
It is delivered as packages, see the instructions. We recommend using system packages to simplify the upgrade process.
Q: You said it does SSL and Auth. I have 1.0.4 installed, but I do not know a way to do SSL or Auth with the install. My solution is to run it behind an nginx proxy.
Yes, 1.0.4 supports server SSL and HTTP basic authentication. The corresponding documentation is published now.
Q: If you change the Grafana username and password will it be persistent across reboots and upgrades of PMM?
Currently no, but we are working to make it possible very soon.
Q: In Percona cloud tools – we can sort the queries by the sum of the number of queries, or the max time taken by the queries. Is there a way to do that in PMM?
Unfortunately, there is no such possibility, but I have just filled an internal issue so it’s not forgotten.
Q: Can you show us how the explain query works in PMM?
Query Analytics web app calls the API, which asks the corresponding agent connected to the server from the client side to run EXPLAIN on MySQL in real time.
Q: Does PMM track deadlocks?
We can capture metric such as mysql_global_status_innodb_deadlocks but we do not currently graph it. Thanks for pointing out, we will consider adding it.
Q: Is it possible to set up alarm thresholds and notification?
Q: Is there any plan to add sending alerts by e-mail or SMS? This would be an excellent feature.
Currently, there is no alerts functionality. But we have plans for that.
Q: If we do not have a service manager installed like systemv or systemd, upstart , I am using Gentoo Linux on my database server , it has no service manager
Unfortunately, we didn’t test PMM client on Gentoo Linux. PMM is an open source project, so any person is welcome to implement support for Gentoo Linux and propose the patch.
Q: What kind of resources are needed by the “server” if we have a moderately active “client” it will be monitoring? (db with millions of rows, thousands of queries per minute)
Regarding metrics, it should not be a problem. If there are thousands of schemas or tables, then per table stats are disabled automatically (10000+ tables). You can also disable them explicitly. In terms of query analytics, it depends on the volume of slow queries ( long_query_time settings ) and how many of them are safe to write and later process. I would say MySQL options should be tuned to help in monitoring if possible (as long as they don’t impact performance).
Q: What is the ETA for the Amazon EC2 image for the PMM server?
We have no ETA, but we acknowledge that not everyone can afford using Docker. We try to do our best to have other shipping methods available.
Q: Does the client have to be installed on the mySQL or Mongo target?
No. However, for best results we recommend running the PMM client on the same host that is being monitored. You can setup remote monitoring when it is not possible.
Q: What are the pros and cons of running the PMM client on the same system as the DB?
Running PMM client locally, you can monitor system metrics (linux:metrics) and MySQL queries from the slow log, which is not possible remotely due to not having access to the filesystem. When the client runs remotely, say MySQL metric service connects to the db host to get metrics, then the server scrapes from the client. So you may get some network latency issues as the metric has to make two hops. Running locally, the metric exporter may pull metrics using MySQL socket.
Q: And if you have any plans for the PMM work in MySQL cluster, not in Galera, but in the same MySQL Cluster.
There is a limited support now, you can monitor MySQL Cluster SQL nodes. However, we currently do not do anything with data nodes or management nodes.
Q: I’m aware that Percona focuses on MySQL and MongoDB, but do you have any plans for supporting plugins capable of monitoring other databases beyond ones supported now? Is this feasible now in the current state of PMM?
There are plans to monitor backups, to support HA tools (e.g., ProxySQL), alerting functionality, etc. We will also think about plugins in some future version, so it is easier for the community to expand PMM for a complete monitoring of their database infrastructure.
Q: Can the query monitor display query execution plan?
Yes, usually you can run EXPLAIN on the query. However, sometimes it requires typing the database name. We plan to improve this behavior.
Q: For metrics monitor, what is the main difference between PMM and Observium?
I don’t know much about Observium, but it looks to me it’s mainly a network device monitoring tool. PMM is focused on database performance.
Q: With the Query Analytics tool, is there any way to capture the notes about an individual query, along with a query review status, such as whether it’s “marked as reviewed”, or “needs attention”?
Unfortunately no, but it would be a great feature and it is on our roadmap.
Q: If possible to monitor the replication MySQL (Slave status ..)?
Yes, there is MySQL Replication dashboard.
Q: Is this possible to store metrics for a long time period to a custom time series database such us openTSBD?
We acknowledge that long-term storage of metrics is very important. However, we can’t offer a complete solution at this moment. Something is in progress. It is possible to point Prometheus to remote storage such as OpenTSBD, but we can’t guarantee how reliable it would be. The main problem is you will need to downsample data and create another dashboard to query from OpenTSBD (which is not flexible).
Q: can PMM have any impact on MySQL server performance?
Q: What is the additional overhead to the “clients” with PMM?
We try to make PMM as safe as possible. In particular, pmm-admin has options to disable the most critical things that might cause performance issues. For example, per table stats when there are thousands of tables, processlist when there is an abnormal count of process states, etc.
Q: Are there graphs for TokuDB storage engine as well?
Yes, there is a TokuDB Metrics dashboard.
Q: Is it recommended not to run PMM server on Production MySQL server, but instead run it on a separate server?
We do not recommend running PMM server on the database server, or on the production one. It should be run on a separate machine, acting a monitor server. It does not require MySQL.
Q: Can you install PMM Server w/out docker?
Currently, no. But we have plans to distribute PMM server as VM, Amazon EC2 image, etc.
Q: Is there a way to integrate with a pre-existing Grafana server?
Yes, you can add Prometheus data source pointing to “http://server/prometheus/: in your Grafana configuration, but you will have to import the dashboards separately.
Q: The graphs plotted using perfromance_schema tables, you have chosen not to display all the events from performance_schema.file_summary_by_event_name. Like there are only Read, Write and Misc events plotted, but Wait is never plotted. So, is there a criterion on which you decide which values are important and which are not for plotting. Also, what is the significance of each of them?
Looks like the amount of events plotted would depend on the performance schema instrumentation settings.
Q: I am trying to figure out how big of a container/VM/machine I need to create for the PMM server and/or client…
Q: What do you recommend to consider as the databases being monitored scales up? (i.e., going from 10 instances monitored to 500 instances)
Q: What are the hardware recommendations to monitor 1000 DB server?
Depending on the PMM settings and MySQL workload, the load on the PMM server might vary a lot. Fast CPUs, plenty of memory and SSD storage are what provide the best performance. If you have a single PMM instance overloaded, you can use multiple PMM instances (i.e., one in each data center). We will be working on a solution for very large scales in the future versions.