Scalability, compression, encryption, high availability, and complex event processing?

Asked by Pat LeSmithe

Disclaimer: I'm not a real developer, so I've undoubtedly missed major points, overlooked other projects, and mangled terminology, to say the least. Anyway...

Given enough physical resources, what are the fundamental or structural limitations to using Graphite to do the following:

- Scale up to handle many high-frequency data streams, e.g., 100+ megabytes per second (sustained), and correspondingly, very large databases, not necessarily of fixed size. Examples include real-time sensors and financial market data.

- Compress and/or encrypt everything transparently, including incoming, stored, and webapp-selected data. Of course, this depends to a large extent on the capabilities of the database.

- Ensure high availability, by guarding, in a distributed fashion, against the loss of both incoming streams and data already received.

- Process events and trigger actions in [near] real-time, especially with the ability to add/drop/tweak easily scripted, live parallel/serial "decision rules" on the fly. Here, ERMA (http://erma.wikidot.com/) and Esper (http://esper.codehaus.org/) seem to be relevant open-source prospects, although both are Java-based.

These possibilities may depart significantly from the original "niche application". However, I thought I'd ask, since there doesn't appear to be a powerful, yet simple, light-weight, modular, and extensible open source framework to accomplish these tasks, and Graphite is quite appealing.

Question information

Language:
English
Status:
Solved
For:
Graphite
Assignee:
No assignee
Solved by:
Pat LeSmithe
Pat LeSmithe (qed777-deactivatedaccount) said :
#1

Oops! The "Of course, this does depend to a large extent of the capabilities of the database" sentence should go [also] under the high availability point.

chrismd (chrismd) said :
#2

Scalability - Both Graphite's frontend and backend can scale horizontally, meaning if you already have 5 Graphite servers and you buy 5 more you should get roughly twice the throughput. However, one of the things the Graphite backend is not designed to do itself is multiplex a large number of high-volume client connections. For this, we use another tool I wrote called PypeD, which is designed exactly for the purpose of multiplexing and aggregating a large number of high-volume streams into a single stream for each backend Graphite server. PypeD is not currently open sourced, but releasing it to the public is one of my next projects. It will definitely be available by the end of the year, hopefully much sooner. Another consideration with the scalability of Graphite, though, is the scalability of your underlying storage. We use a very large and very fast RAID-5 SAN with a Veritas clustered VxFS filesystem. I am not a storage engineer myself, but I do know that our SAN is quite sophisticated and expensive. If you have more detailed questions on this I would be happy to discuss them with our storage engineers for you.
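Since PypeD itself isn't released yet, here is a minimal sketch of the multiplexing idea it describes: drain pending metric lines from many per-client queues into a single backend-bound queue. All names here (`multiplex`, `max_batch`) are hypothetical illustrations, not PypeD's actual API.

```python
from queue import Queue, Empty

def multiplex(client_queues, backend_queue, max_batch=100):
    """Drain pending metric lines from many client queues into one
    backend queue, preserving per-client ordering. Returns the number
    of lines forwarded in this pass."""
    forwarded = 0
    for q in client_queues:
        while forwarded < max_batch:
            try:
                line = q.get_nowait()  # non-blocking; skip empty clients
            except Empty:
                break
            backend_queue.put(line)
            forwarded += 1
    return forwarded
```

In a real multiplexer each client queue would be fed by a socket reader, and the backend queue flushed over one persistent connection per Graphite server, so the backend sees a handful of fat streams instead of thousands of thin ones.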

Compression / Encryption - Graphite uses a specialized custom database I wrote called Whisper (http://graphite.wikidot.com/whisper). It is very similar in design to RRD and it is inherently fixed-size, just like RRD. It is not feasible to have the database files directly compressed or encrypted, for performance reasons. Basically, when Graphite's backend, Carbon, is writing data into the database, its throughput is highly dependent on latency (how long it takes to do the write() operations), and this is in turn highly dependent on the kernel's I/O caching. If the database files were compressed or encrypted, a single write() would change the content of a significant portion of the file, invalidating the kernel's cache and hence killing the throughput of the system. One possible solution (this is just an idea) would be to use filesystem-level compression or encryption; that may avoid the problem I just described, but again you'd need to confirm it with a systems or storage engineer.
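The fixed-size property is what makes uncompressed writes cheap: each datapoint lives at a predictable offset, so a write touches only a few bytes. A simplified sketch of that offset arithmetic (the real Whisper header layout differs; see its documentation):

```python
import struct

# One datapoint: 32-bit timestamp + 64-bit float value, big-endian
POINT_FORMAT = "!Ld"
POINT_SIZE = struct.calcsize(POINT_FORMAT)  # 12 bytes

def point_offset(header_size, archive_points, base_timestamp, step, timestamp):
    """File offset of a datapoint in a fixed-size round-robin archive.
    A write here touches exactly POINT_SIZE bytes at a stable position,
    which is what keeps the kernel's page cache effective."""
    slot = ((timestamp - base_timestamp) // step) % archive_points
    return header_size + slot * POINT_SIZE
```

With whole-file compression, by contrast, the byte position of every point downstream of a change shifts, which is exactly the cache-invalidation problem described above.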

High Availability - If you have multiple servers running Graphite then it is very fault tolerant, in that the loss of one machine will instantly cause the traffic that was going to it to be distributed evenly among the remaining machines; traffic returns to normal once the lost machine is restored. However, Graphite's backend is not transactional, and does not duplicate data for redundancy, so when a machine goes down, whatever it had stored in memory is lost. I have not put much thought into solving this yet since I imagine it would add a great deal of complexity to the system, but I am definitely open to the possibility if anyone has some good ideas.

Process events and trigger actions in near real-time - Actually, here's how our system works. We have a lot of java applications that run our website, each of which uses ERMA for instrumentation (that is, ERMA gives us all of the relevant metrics from our applications) and ERMA is configured to send these metrics into a CEP engine we wrote using Streambase (http://www.streambase.com). The CEP engine does our alarming / triggered actions, etc. Once the data is processed by the CEP engine, it is forwarded on to Graphite where the data is stored for later visualization. So when an error occurs in one of our applications, ERMA captures the details of the error and sends it into our CEP engine which would then send an alarm to our operations center and forward the error statistics to Graphite. When our ops center receives the alarm, they click a button and they're presented with a Graphite graph of the error statistics. We use Graphite for visualizing a lot of other things as well, but this is the way we do complex event processing.
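The final hop of that pipeline, forwarding processed metrics to Graphite, uses Carbon's plaintext line protocol: one "path value timestamp" line per datapoint. A toy stand-in for a CEP threshold rule feeding that protocol might look like this (the rule logic and function names are illustrative, not Streambase's or ERMA's API):

```python
def process_event(path, value, threshold, alarm_fn, timestamp):
    """Tiny stand-in for a CEP rule: fire alarm_fn when a metric
    crosses its threshold, and always return a Carbon-formatted
    plaintext line for forwarding to Graphite."""
    if value >= threshold:
        alarm_fn(path, value)  # e.g. page the ops center
    return f"{path} {value} {int(timestamp)}\n"
```

In production the returned line would be written to a socket connected to Carbon's plaintext listener, and `alarm_fn` would be whatever notifies the operations center.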

I hope that answers your questions.

Pat LeSmithe (qed777-deactivatedaccount) said :
#3

Thanks! That helps quite a lot. A possibly useful but not necessarily high priority suggestion: A nested-boxes-and-arrows diagram showing the essential physical and logical units --- clients, multiplexer, *ends, agents, caches, storage, webapp, bagels --- and the directed lines of communication among them.

Graphite: The Evolution of Blinkenlights.

Sorry.