content/en/docs/troubleshooting/performance.md
First and foremost, if something goes seriously wrong during Falco deployment, it's usually noticeable immediately. On a longer time scale, continuous performance monitoring and quality assurance, driven by the right metrics, ensure Falco functions as expected 24/7.
As a security tool, Falco requires checking for a healthy deployment on multiple dimensions:
The Falco Project provides guidance on some of these aspects, and there are ongoing long-term efforts, including a partnership with the CNCF TAG Environmental Sustainability, aimed at enhancing Falco's performance and assessing its impact on the system. These efforts are intended to make it easier for adopters to assess the actual impact on their systems, enabling them to make informed decisions about the cost-security monitoring tradeoffs.
The Falco Project provides native support for measuring resource utilization and statistics, including event drop counters, kernel tracepoint invocation counters, timeouts, and internal state handling. More detailed information is given in the Falco Metrics Guide.
On top of the mind for SREs or system admins is how much CPU and memory Falco will utilize on their hosts. They need to assess whether the cost is justified. To maintain excellent relationships with infrastructure teams, setting resource limits for your Falco deployment is crucial. This can be done through systemd or daemonset limits in a Kubernetes environment.
This is an essential consideration because running a kernel tool always comes with specific challenges and considerations. For example, it could slow down the kernel or the request rates of applications.
Top metrics:
memory_used metric natively exposed in Falco metrics.Beyond monitoring the tool's utilization, check if your applications perform as before. This evaluation could include overall network, I/O, or general contention metrics.
Read Falco Metrics next.
A common misconception is to think that Falco has constant resource utilization. However, that is not accurate. Falco's utilization is directly dependent on the current workload on the host. The more system calls the applications make, the more work Falco has to handle. You can read our Kernel Testing Framework Proposal for more insights into this topic.
Read Falco Is Dropping Syscalls Events next.
Top metrics: