Summary
Do you know what is happening in your production systems right now? If you have a comprehensive metrics platform, then the answer is yes. If not, then this episode is for you. Jason Dixon and Dan Cech, core maintainers of the Graphite project, talk about how Graphite is architected to capture your time series data and give you the ability to use it for answering questions. They cover the challenges they have faced in evolving the project, the strengths that have let it stand the test of time, and the features that will be coming in future releases.
Preface
- Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
- I would like to thank everyone who supports us on Patreon. Your contributions help to make the show sustainable.
- When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at podcastinit.com/linode and get a $20 credit to try out their fast and reliable Linux virtual servers for running your awesome app. And now you can deliver your work to your users even faster with the newly upgraded 200 Gbit network in all of their datacenters.
- If you’re tired of cobbling together your deployment pipeline then it’s time to try out GoCD, the open source continuous delivery platform built by the people at ThoughtWorks who wrote the book about it. With GoCD you get complete visibility into the life-cycle of your software from one location. To download it now go to podcastinit.com/gocd. Professional support and enterprise plugins are available for added peace of mind.
- Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email hosts@podcastinit.com
- To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media.
- Now is a good time to start planning your conference schedule for 2018. To help you out with that, guest Jason Dixon is offering a $100 discount for Monitorama in Portland, OR on June 4th – 6th and guest Dan Cech is offering a €50 discount to GrafanaCon in Amsterdam, Netherlands on March 1st and 2nd. There is also still time to get your tickets to PyCascades in Vancouver, BC, Canada on January 22nd and 23rd. All of the details are in the show notes.
- Your host as usual is Tobias Macey and today I’m interviewing Jason Dixon and Dan Cech about Graphite.
Interview
- Introductions
- How did you get introduced to Python?
- What is Graphite and how did you each get involved in the project?
- Why should developers be thinking about collecting and reporting on metrics from their software and systems?
- How do you think the Graphite project has contributed to or influenced the overall state of the art in systems monitoring?
- There are a number of different projects that comprise a fully working Graphite deployment. Can you list each of them and describe how they fit together?
- What are some of the early design choices that have proven to be problematic while trying to evolve the project?
- What are some of the challenges that you have been faced with while maintaining and improving the various Graphite projects?
- What will be involved in porting Graphite to run on Python 3?
- If you were to start the project over would you still use Python?
- What are the options for scaling Graphite and making it highly available?
- Given its importance to a company’s visibility into their systems, what development practices do you use to ensure that Graphite can operate reliably and fail gracefully?
- What are some of the biggest competitors to Graphite?
- When is Graphite not the right choice for tracking your system metrics?
- What are some of the most interesting or unusual uses of Graphite that you are aware of?
- What are some of the new features and enhancements that are planned for the future of Graphite?
Keep In Touch
- Jason
- @obfuscurity on Twitter
- Website
- obfuscurity on GitHub
- Dan
Picks
- Tobias
- Jason
- Dan
- Home Assistant
- GrafanaCon €50 discount with PODCASTINIT2018
Links
- Graphite
- Sensu
- Monitorama
- RainTank
- Grafana Labs
- Librato
- GitHub
- Dyn
- Telemetry
- Perl
- PHP
- React
- O’Reilly Graphite Book
- Time Series
- RRDTool
- InfluxDB
- Adrian Cockcroft
- NVMe
- Prometheus
- CNCF
- ASAP Smoothing
- PyCascades
The intro and outro music is from Requiem for a Fish by The Freak Fandango Orchestra / CC BY-SA
Hello, and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who supports us on Patreon. Your contributions help to make the show sustainable. When you're ready to launch your next project, you'll need somewhere to deploy it, so you should check out Linode at www.podcastinit.com/linode and get a $20 credit to try out their fast and reliable Linux virtual servers for running your app or experimenting with something that you hear about on the show. You can visit the site at www.podcastinit.com to subscribe to the show, sign up for the newsletter, read the show notes, and get in touch. To help other people find the show, please leave a review on iTunes or Google Play Music, tell your friends and coworkers, and share it on social media.
Now is a good time to start planning your conference schedule for 2018. To help you out with that, guest Jason Dixon is offering a $100 discount for Monitorama in Portland, Oregon that's happening on June 4th to 6th, and guest Dan Cech is offering a €50 discount to GrafanaCon in Amsterdam, Netherlands happening March 1st and 2nd. There's also still time to get your tickets to PyCascades in Vancouver, British Columbia happening January 22nd and 23rd. All of the details are available in the show notes. Your host as usual is Tobias Macey, and today I'm interviewing Jason Dixon and Dan Cech about Graphite. So, Jason, could you start by introducing yourself?
[00:01:29] Unknown:
Sure. My name is Jason Dixon. I currently work for Sensu. We do monitoring solutions, and I also run a monitoring conference called Monitorama. I'm starting up another conference for remote teams and distributed workers called Scatter. And I've worked for a lot of companies I think probably a lot of folks are familiar with. Besides Sensu, I've worked for Raintank, Grafana Labs. I've also worked for Librato, GitHub, Heroku, Dyn, and probably, like a lot of folks in our industry, quite a few others. So, yeah, that's a little bit about me. I guess I'm a little bit of an engineer, a little bit of operations, and currently I'm actually the VP of business development for Sensu.
[00:02:14] Unknown:
And, Dan, how about yourself?
[00:02:17] Unknown:
I'm Dan Cech. I definitely don't have Jason's depth of work history. I've spent about 10 years working in the web hosting industry, developing billing, account management, and network monitoring type systems there. These days I'm working for Grafana Labs, and as part of that, doing development work on Graphite. It's a large part of our stack, but obviously our main project at Grafana Labs is Grafana, which is a dashboarding and alerting system that works with Graphite as well as a huge number of other time series databases and data sources, and works to bring those together and help people visualize the data that they're storing in Graphite or similar systems.
[00:03:13] Unknown:
Yeah. Grafana is another project that's very high on my radar of things to start working with as soon as possible, and I'm sure that we'll be talking some more about it as we go through the conversation, because I know that it's become sort of the recommended front end for Graphite. But before we go too much farther, Jason, could you just briefly talk about how you first got introduced to Python?
[00:03:35] Unknown:
I want to say it was probably Graphite, of all things. You know, going back to my early career as an operations person, I had done a lot of Perl. And then, as time went on and Ruby became more popular and you saw more folks building operations tools out of Ruby, I started doing more of that. But then inevitably I kept running into this Graphite project. And monitoring software and logging and just telemetry in general had always kind of been an interest of mine, kind of a hobby. And I had built a lot of, you know, kind of hacky, open source, very sharp, purpose-driven tools using Perl.
And then I encountered this Graphite thing, which allowed you to really easily collect measurement data from applications, services, and hosts, and graph them really quickly, whether it's for prototyping or for trending and historical data. It just allowed anybody to really quickly gather arbitrary data, visualize it, and share it with others. And it was a hugely powerful tool. And inevitably, like a lot of open source projects, you start to realize its limitations, and you realize that this is something that I need to extend if I want to do other really interesting things at my job or whatever. And so that led me to start looking in the code, getting more familiar with Python, and just trying to hack little things here and there, whether it's adding a render function, and I guess we can hit on that later, or whatnot.
I guess that was kind of how I got into it. I mean, unlike a lot of people, I would never say I jumped into it because of a natural affinity for it. It was just that I ran into this thing that I enjoyed working with a lot, and really the only way to level up my expertise was to get into the source code, understand it, and hack it up a little more.
[00:05:30] Unknown:
And, Dan, do you remember how you first got introduced to Python?
[00:05:33] Unknown:
Yeah, I do, actually. I first ran into Python in about 2001, at university. And at the time, all of the cool web development kids were using PHP, so I kind of poo-pooed it a little bit. And then after 10 years or so of professional PHP development, I changed jobs and it was time for something new. So I went back to Python and spent 2 years or so building APIs in Python and found out I really enjoyed it. And now that I'm at Grafana Labs, I have really been enjoying the opportunity to do more Python in working on Graphite, and it's definitely the largest Python project that I've been involved with. But, yeah, it's been really interesting to do that alongside also trying to learn Golang and doing a lot of modern web front end development with React and things like that. So I'm definitely enjoying that mix of languages right now and appreciating the differences between them.
[00:06:51] Unknown:
Yeah. It's always interesting working across different languages and trying to think of how they play off of each other, the different ideas that you can bring back and forth between them, and maybe trying to gain a better fundamental understanding of how it all fits together at the computational layer. Absolutely. And we've briefly mentioned Graphite and a little bit about what it is, and you both touched on how you got involved with the project. So I'm wondering if you can give a bit of a deeper description of what the Graphite project is and what its purpose is within a computing stack, and then also briefly what your role within the Graphite community is or has been?
[00:07:37] Unknown:
Sure. I'll take that one, I guess.
[00:07:39] Unknown:
You did write the book on it, Jason.
[00:07:42] Unknown:
How did I give an introduction and fail to mention that I actually wrote the O'Reilly Graphite book? So Graphite, I guess I touched on this: it's a time series storage engine. And what I mean by that is it's a way of storing measurements of data, recording the performance of something over time. So let's say you want to see how something is behaving every second. It'll record those measurements and then allow you to retrieve them later on and visualize them as graphs or export them as data or whatnot. And that's basically what time series is. It's taking these measurements over precise, regular intervals.
You know, why do we do that? I think there are a lot of good answers to why you collect metrics. And when I refer to a metric, I'm referring to the actual thing you're measuring and the value that it represents. Why should we do that? I think the easiest way to say it is: how do you know whether something is wrong unless you know what it looks like when it's healthy? I think there's a better way of saying that, but that's the basic gist of it. We need to monitor what these things look like so we understand when we're either hitting a regression or being overloaded or whatever the case. And there are lots of ways of doing monitoring. I think that's probably outside of the scope of this general podcast. But I think Graphite is one important part of that, which is gathering that time series data, allowing you to do graphs and dashboards and just visualize what the overall health of your services and your applications and your systems is. And, Dan, can you talk a bit about what your role has been since you started working with the Graphite project?
[00:09:20] Unknown:
Sure. My role has mostly been making sure that we need a new revision of the Graphite book, by kicking things over and generally causing trouble. So when I came into Graphite, at the time there was a big push to get the Graphite 1.0 release out the door. And for those who aren't really familiar with the project, it's been around for a very long time but has been, I guess the unkind way to say it would be stalled, but the kinder way would be to say that it was at a 0.9 release for a long time. There was a lot of work being done,
[00:10:01] Unknown:
and that work... Would it be safe to say it was kind of a sideways progression?
[00:10:06] Unknown:
Yeah, yeah. So there was work going on towards 1.0 at the same time as there was a lot of maintenance being done on the 0.9 release, and that caused issues with being able to get 1.0 out the door because the two code bases were diverging. So Jason was in the middle of a big push to try and make sure that the 1.0 code base would have all the features that were in 0.9 and would be able to be a drop in replacement. And he called on me to help with that effort and to try and untangle some things that were a little bit on the messy side in the main 1.0 development branch.
So that was kind of how I got into it. I had a little bit of experience with time series databases. I'd spent many years working with RRDtool, which is the other kind of granddaddy of time series databases that's been around for a very long time at this point, and that was definitely helpful. And since then, I have been doing a lot of work on Graphite just because it does form a very important part of our stack. Internally at Grafana Labs, we use it as part of our monitoring stack to keep an eye on all of our production systems, and it also forms a part of the hosted time series database offering that we make available to our clients. So as part of that, I've been doing a lot of work on the different functions that are available within Graphite that are used for analyzing and massaging and generally making sense of all of the raw data that's coming in from the different collectors.
And then the biggest work that I've been doing recently has been to add support for tagging into Graphite. That's been one thing that a lot of the more, quote unquote, modern time series databases have added, that wasn't available in Graphite 0.9, and it will be coming out in the next Graphite release. So the ability to tag metrics and not have to come up with a rigid naming hierarchy makes things a lot more flexible and makes it a lot easier to drill down into the data that's coming out of systems. So I'm very excited about that.
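For readers following along, the tag support Dan describes extends the metric naming scheme: `;key=value` pairs are appended to the series name. A minimal sketch of building such names (the metric and tag names here are illustrative, not from the episode):

```python
def tagged_metric(name, **tags):
    """Build a Graphite 1.1-style tagged series name by appending
    ;key=value pairs to the base metric name (sorted for stability)."""
    parts = [name] + [f"{k}={v}" for k, v in sorted(tags.items())]
    return ";".join(parts)

# A rigid hierarchy encodes every dimension positionally...
hierarchical = "dc1.web01.disk.used"
# ...while tags make each dimension independently queryable:
print(tagged_metric("disk.used", datacenter="dc1", host="web01"))
# disk.used;datacenter=dc1;host=web01
```

Queries can then match on any tag (for example, every series with `datacenter=dc1`) without needing to know the rest of the hierarchy.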
[00:12:41] Unknown:
Yeah. The tagging capability is definitely something that I've heard a lot of people asking for who have been using Graphite for a while, and talking about some of the ways that they've managed to do various back flips with the hierarchical organization scheme that the Graphite system has had for a while, in order to be able to embed certain bits of information in the metrics as they're being collected. And I was actually reading through the docs and saw the tagging capability listed there, so I thought that was interesting, but it makes sense now that you're saying it's a new feature that's going to be available. So I'm sure there are a lot of people who'll be happy to hear that.
[00:13:20] Unknown:
Yeah. We're very excited to see how that is received by the community once it's officially available. I do know there are a number of people that are experimenting with it already, and we are internally at Grafana Labs. And, yeah, it's very exciting.
[00:13:40] Unknown:
You mentioned briefly RRDtool and the round robin databases that it has relied on for a long time. And I know that for quite a while, that was sort of the de facto standard if you wanted to be able to track any sort of time series metrics for your servers. And I know that Graphite used that as its storage engine when it first got started, but has since switched to using the Whisper database. And I'm wondering what you think the Graphite project has contributed to, or how it has influenced, the overall state of the art in systems monitoring, particularly given how long it's been around and how broadly it's been used?
[00:14:24] Unknown:
I think it's influenced pretty much everything in its wake. I mean, you'd be hard pressed to find any monitoring or systems related project now that doesn't support Graphite's line level metric format. I think that's fantastic. I think it's enabled a lot of projects, and I think it's helped to forward this idea of composable monitoring systems, which is something I was begging for many years back and advocated strongly for at places like Heroku and GitHub. And so to see newer systems like Influx and Prometheus and others take this approach, you know, Prometheus is a great example now that they're actually advocating in a lot of cases for people to use Influx as Prometheus' own time series back end. So I think that's fantastic. Yeah. Absolutely.
[00:15:15] Unknown:
I would say the same thing, and I have an entry on my to do list here to add support for Prometheus to use Graphite as a back end time series database. But, yeah, I would certainly say that the carbon line protocol is effectively the lingua franca for sending metrics around. And there is a huge ecosystem that's grown up around being able to work with data passed in a very simple protocol. The ease with which you can implement a tool that can consume or produce the carbon line protocol has made it very, very common for people to use it, and made it very easy to compose different components together, which allows you to build some really powerful tools by piecing together different pieces and building workflows around that.
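The carbon plaintext protocol Dan mentions is just newline-delimited `<path> <value> <timestamp>` records sent to a carbon listener (port 2003 by default). A sketch of a minimal client (the host and metric names are illustrative):

```python
import socket
import time

def carbon_line(path, value, timestamp=None):
    """Format one metric in the carbon plaintext line protocol:
    '<metric path> <value> <unix timestamp>\\n'."""
    if timestamp is None:
        timestamp = int(time.time())
    return f"{path} {value} {timestamp}\n"

def send_metric(path, value, host="localhost", port=2003):
    """Ship a single metric to a carbon-cache or carbon-relay listener."""
    with socket.create_connection((host, port)) as sock:
        sock.sendall(carbon_line(path, value).encode("ascii"))

print(carbon_line("web.requests.count", 42, 1514764800), end="")
# web.requests.count 42 1514764800
```

Because the wire format is this simple, almost any tool can emit it, which is exactly the composability being described.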
[00:16:19] Unknown:
Yeah. And the composability of the overall project is definitely an interesting piece of it, and I'm sure one that has allowed it to persist as long as it has, because of the fact that you can swap different components in and out and fit them together in order to fit your particular installation. So I'm curious if you can briefly run down what all the different layers of the project are and how they fit together to form the working whole.
[00:16:46] Unknown:
Dan, do you want to take that one or should I? You go for it, Jay. Okay. I guess historically, as you mentioned, Graphite originally used RRDtool as the storage engine. That was eventually replaced by Whisper, which is still the on disk time series file format. Carbon is both a listener for metrics coming in and a client that writes out the database files. It also has a cache service that allows the web service, which we'll touch on in a second, to query metrics from it. Basically, it's a hot cache for metric storage. Then you have the web application, which includes the composer, a UI for prototyping your own graphs using metrics, as well as a dashboard for collecting a bunch of those composed graphs together into a single view.
Later on, you started to see projects like Ceres, which was kind of a next generation, or really an in between generation, time series format that was intended to replace Whisper. It never really got to the point where it was finished enough that we could push it over the edge. Originally, it was intended to be the 1.0 database, but because it doesn't actually do roll ups natively, the tooling that we needed to make it a first class citizen never really came to be. That's not to say that it's completely deprecated, it's just that you don't see a lot of advancement. At least I haven't, and maybe Dan can speak to that, but I haven't seen a lot of development focused on Ceres. I think people are really taking advantage of what came next, which was pluggable back ends. They added in this ability to plug in storage back ends, so you could use, say, a Cassandra based back end, or basically anything else, by adding a shim and then whatever the thing is on the back end. And the cool thing about it is that the render API, which actually composes the graphs and normally pulls the data from Whisper or from the hot carbon cache, can just talk to this other thing instead. It doesn't care.
And then it gets the metrics back and renders a graph or returns JSON. So those are, I think, most of the components off the top of my head. Did I miss anything, Dan? That's most of it. You know, in 1.1,
[00:19:12] Unknown:
we also have the tag database. That's a new component that's been added, and it's responsible for storing tags and allowing you to look up series by tag. So that's the new thing. But, yeah, definitely, today the pluggable finders are really the killer feature for Graphite. At Grafana Labs, we use that ourselves with a Cassandra backed metric store. There is also a project which allows you to use the Graphite render API on top of InfluxDB. So you can get access to the richness of the Graphite rendering API and all of the different functions that are available there without being tied to a particular storage back end.
And, you know, Whisper is definitely getting long in the tooth, but it does its job, and it does it very well. So there are definitely a lot of people out there still using Whisper, and it's going to be interesting to see how Whisper develops if we can get some more people to continue contributing. And, certainly, there's more activity happening within the Graphite community. The release of 1.0 has definitely helped, and I'm hopeful that the release of 1.1 is going to bring some more interest and, hopefully, some more people to help out.
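A key point here is that whatever finder and storage sit behind Graphite web, clients see the same render API. A sketch of building a render query (the base URL and target are hypothetical):

```python
from urllib.parse import urlencode

def render_url(base, target, frm="-1h", fmt="json"):
    """Build a Graphite render API request URL. The same query works
    whether the backend is Whisper, a Cassandra-backed store, or
    InfluxDB behind a Graphite-compatible shim."""
    query = urlencode({"target": target, "from": frm, "format": fmt})
    return f"{base}/render?{query}"

print(render_url("http://graphite.example.com",
                 "sumSeries(web.*.requests.count)"))
```

Swapping the storage back end changes nothing about how dashboards and tools query the data, which is what keeps the ecosystem composable.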
[00:21:02] Unknown:
Yeah. It's funny. We've touched on all the different components and we've discussed some of the functionality, and I know we've mentioned the render API a couple of times now, but I find it humorous that we failed to really give it its due, considering that's probably the one thing above anything else that people really get a lot of value out of when it comes to Graphite. And what I mean by that is, I mentioned this web application where you can go in and compose your graphs. One of the things that attracts probably pretty much any user, that really gets them hooked on Graphite, is this API, I don't know what you want to call it, that's kind of an input output stream, which allows you to say, hey, here's a metric, apply this statistical or mathematical function to it.
You can take that output, and then you can plug it into another function. It's almost like chaining shell commands in UNIX, and it makes it just immensely flexible and hugely powerful. And you see a lot of these newer systems, and people really like a lot of the more modern takes on tagging and some of the NoSQL type database back ends, but at the end of the day, if it's got a really crappy query API, they're just not happy using it. And so that's why I think you tend to see a lot of this composability and these pluggable interactions, these bridges between services, so that people can have their cake and eat it too. They can have their render API for Graphite, and then maybe have something that's a little bit operationally easier to maintain on the back end.
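The inside-out composition Jason describes can be sketched as simple string building: each step wraps the expression built so far as its first argument. (`movingAverage`, `highestMax`, and `aliasByNode` are real render functions; the metric path is made up.)

```python
def apply_chain(seed, *steps):
    """Compose render API calls the classic (pre-1.1) way, by nesting:
    each (name, *args) step wraps the expression built so far."""
    expr = seed
    for name, *args in steps:
        expr = f"{name}({','.join([expr] + [str(a) for a in args])})"
    return expr

target = apply_chain(
    "web.*.requests.count",
    ("movingAverage", 10),  # smooth each series
    ("highestMax", 5),      # keep the 5 busiest series
    ("aliasByNode", 1),     # label each by its hostname segment
)
print(target)
# aliasByNode(highestMax(movingAverage(web.*.requests.count,10),5),1)
```

The output of each function feeds the next, exactly like a shell pipeline, except that you read the finished expression from the inside out.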
[00:22:42] Unknown:
Yeah, absolutely. And again, at Grafana Labs, that's what we use Graphite for. And it's funny, the mention of chaining functions like you would bash commands. Something that's coming in 1.1 is a new syntax for Graphite function calls that allows you to write them as a set of chains.
[00:23:07] Unknown:
Now I did not get that memo, Dan.
[00:23:10] Unknown:
You need to keep up on your pull requests, Jason.
[00:23:15] Unknown:
And is that syntax going to be something akin to some of the functional languages, where you're creating a function pipeline for processing the metrics so they can be rendered to the UI? Yeah. Absolutely.
[00:23:29] Unknown:
So in Graphite 0.9 and 1.0, you would build up a chain of function calls by essentially wrapping the functions within each other, which is perfectly serviceable, but it makes it a little bit difficult to follow when you have a deep set of nested functions. And Grafana has, for a long time, had a query editor built into it that would unwrap those and turn them into more of a pipeline. So you select this metric, and then you take a moving average on it, and then you take the top x number of series, and then you say, okay, I'm going to alias them by this particular node, or something like that.
And Grafana would unwrap that out into a chain so you could read it that way, where in Graphite it would be a bunch of nested function calls. So this syntax allows you to say: select a set of series, pipe it into a function, pipe the output of that into another function, and you can continue the chain arbitrarily long until you end up with a final set of series. That gets output either onto a graph within the Graphite web front end, or to a tool like Grafana that then takes it and renders it onto a graph in a dashboard, or to another tool that's just taking the data and doing some processing on it.
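As a rough illustration, a chain like the one Dan walks through can be written in the piped style, reading left to right like a shell pipeline. This helper only sketches the idea; see the 1.1 release notes for the exact syntax:

```python
def to_piped(seed, *steps):
    """Render a chain of functions in a 1.1-style piped form:
    seed|func1(args)|func2(args), read left to right instead of
    inside out."""
    pieces = [seed]
    for name, *args in steps:
        pieces.append(f"{name}({','.join(str(a) for a in args)})")
    return "|".join(pieces)

chain = (("movingAverage", 10), ("highestMax", 5), ("aliasByNode", 1))
print(to_piped("web.*.requests.count", *chain))
# web.*.requests.count|movingAverage(10)|highestMax(5)|aliasByNode(1)
```

The piped form states each transformation in the order it is applied, which is what makes deeply chained queries easier to read and edit.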
[00:25:12] Unknown:
Now hold on a second, Dan. Are you implying that a nested set of parentheses 8 levels deep is a bad thing? I'm not implying. I'm saying.
[00:25:26] Unknown:
Let's just hope there aren't too many listeners listening to this episode.
[00:25:31] Unknown:
No, I found it really exciting to be able to write queries that way, and I'm looking forward to it. I'm working with some of the guys on the Grafana team to update Grafana to support writing its queries that way internally, so that under the covers it's going to be sending the queries in that new piped syntax if you're using Graphite 1.1. And that's going to make it a lot easier when you get into a scenario where you want to do something that's even too complex for the Grafana UI and you want to drop back into the raw edit mode.
It'll be, a lot easier to follow what's going on.
[00:26:18] Unknown:
One of the early design choices that has obviously proven to be quite beneficial to its staying power is the composable nature and the ability to swap components in and out. What I'm curious about are some of the early design choices that have proven to be more problematic while you're trying to evolve the project, and things that have raised an inordinate number of issues.
[00:26:45] Unknown:
Django? I wasn't gonna say it. But
[00:26:49] Unknown:
I will. Gosh. You know what? The funny thing is, we talk about early design choices as if we were the ones making the early design choices. I probably was the third or fourth lead maintainer. I mean, the project's been active for many, many years, and I've largely handed off to Dan and a small group of core developers now. I don't know, if we went back and asked Chris Davis, whether there are any early design choices that he'd say have proved to be problematic. I think in hindsight, again, this was years before we had anything resembling NoSQL or time series optimized databases.
I think if someone from the past came in and saw what we had now, they'd probably kick themselves for doing a file based, RRD style database. But, again, that was state of the art. Even what Chris did with Whisper compared to what they could do with RRD: at the time, you had to have sequential measurements coming in. If something came in out of sequence, it would just drop it on the floor, and you'd never have that measurement. The fact that Whisper could accept out of order, out of sequence data was a huge step forward. You know, again, not to pick on Django, but locking ourselves to that train for as long as we have, that was a challenge. I mean, it's a full blown web stack, lots of components, lots of functionality.
And with that goes a lot of security updates, a lot of breaking API changes, things that you really have to stay on top of. And with a project as big and as complex as Graphite, that can be challenging, especially when folks step away. You have turnover within companies and within the project itself. It's definitely challenging. I mean, in hindsight now, we've seen Graphite itself evolve, and I still think of the Graphite project as Graphite web, Carbon, you know, the carbon caches, the carbon daemons, and Whisper.
I still think of those as the canonical version of Graphite. There are a lot of knock offs, well, they're not forks, they've been complete rewrites emulating the Graphite stack, and I think it's fantastic. You see them written, parts of them in C, a lot of them in Go, and other languages. I think that's fantastic, and I think it's important to still have a canonical specification, the one that Dan and others are still maintaining.
[00:29:33] Unknown:
Alright. I can jump in there and defend Django a little bit. Yeah. Go for it. So, yeah, the decision to use Django is definitely something that has caused pain. At the same time, it has made it possible to compose Graphite even more, in terms of the middleware and things like that, which you wouldn't get if you had a simpler web stack. That said, I think sooner or later the API side of Graphite web is likely to make a move away from Django, just to something a little more lightweight, because it does bring a fair bit of heft with it. There is an existing project called Graphite-API, which is a Python project that has tried to separate things out. Unfortunately, that has drifted away to the point where it's very difficult to keep it up to date with all of the changes that are going on in the render API, which I think is unfortunate. I'd like to see that come closer to mainline Graphite, and I think ideally it would be great if there was an option to run the API side of Graphite web without having to use Django, but there's definitely a lot of work to be done there.
So I don't think that was necessarily a bad decision going in, because it definitely provided a lot of structure and framework that we use heavily. Our whole test suite is built on the Django test suite, things like that. One thing that I would say is definitely a pain point, not so much a criticism of Whisper itself (like Jason said, it was definitely state of the art at the time it was developed), is that there's no provision within the Whisper file format for a version indicator, which makes it very difficult to iterate on the file format and add features to it.
So that's something we're definitely wrestling with today, in terms of trying to keep Whisper relevant and to give users an upgrade path so they can take advantage of some of the updates and new techniques that have come along in the meantime, without having to do a forklift upgrade on their systems. That's probably the single biggest thing: not having that upgrade path for Whisper. Just having a version indicator within the files would make it possible to evolve the format, and its absence is definitely causing us pain.
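To make the versioning gap concrete: Whisper's on-disk header today carries retention metadata but no magic bytes or format version, so a reader can't tell format revisions apart. Below is a purely hypothetical sketch of what a versioned header could look like; the `WSP2` magic, the field layout, and the function names are illustrative, not part of Whisper.

```python
import struct

# Hypothetical sketch: whisper's real on-disk header has no magic or
# version field. One way to make the format versionable would be to
# prepend a small identifying header like this (layout is illustrative).
MAGIC = b"WSP2"
HEADER = struct.Struct("!4sH")  # 4 magic bytes + 16-bit format version

def write_header(version):
    """Pack a hypothetical versioned file header."""
    return HEADER.pack(MAGIC, version)

def read_header(blob):
    """Return the format version, or raise if the file predates versioning."""
    magic, version = HEADER.unpack_from(blob)
    if magic != MAGIC:
        raise ValueError("unversioned (legacy) whisper file")
    return version

print(read_header(write_header(2)))  # → 2
```

With a header like this, a reader could dispatch on the version number and keep accepting older layouts, which is exactly the upgrade path being discussed.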
But as with all things, I think there are definitely ways around that, and that's a challenge I'm interested in taking on. I think there's a lot we can do with Whisper to evolve it, and it works very, very well for hobbyists and for people who use Graphite in all sorts of interesting ways. We see a lot of people doing things with home automation, with small home labs, all sorts of relatively low-volume monitoring, and the Whisper file format works great for that. It's very simple and very easy to handle.
And if you mess up your hierarchy, it's very easy to go in and tweak the files and fix things up. It's much more accessible than when all the metrics are going into a set of much more opaque data structures like you see in a modern TSDB. So, you win some, you lose some.
[00:34:04] Unknown:
And do you see any issues with the expressiveness that you're able to get out of the line protocol
[00:34:13] Unknown:
for being able to submit metrics? I wouldn't say there's an issue with expressiveness. We can get into conversations over normalization and tagging, but within the scope of this conversation and Graphite, I think its biggest problem in terms of namespacing and the general life cycle of metrics comes down to the state of the art, where we are in terms of systems and cloud, in terms of ephemeral systems and containerization. We're so used to the pets-versus-cattle framing now, and we're still talking about cattle: we're not concerned so much about individual systems because they don't live very long.
So in some companies you see them starting to talk about structured logging and events and observability, and that's obviously important stuff to be tracking for your applications and hosts. In terms of time series, I think it's still very useful to be tracking the overall health and the trending of these systems and services. But we have to think of this data in new ways, because again, these systems don't last long. I remember a few years back I was watching one of Adrian Cockcroft's talks at re:Invent, and in some examples the average life cycle of these containers was a few seconds or less. How do you maintain that? The traditional Graphite cluster that I would run was, I don't know, ten million metrics.
And that was back when we had a static number of systems, when you're talking about a few hundred systems. What do you do now when you have clusters of many, many thousands of containers and they rotate on the hour? That's a very difficult challenge. That's definitely a next-generation challenge.
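For readers following along, the carbon plaintext protocol the question refers to is about as small as a wire format gets: one metric per line, `path value timestamp`, traditionally sent to a carbon-cache or carbon-relay listening on TCP port 2003. A minimal sketch; the host, port, and metric name are assumptions for the example.

```python
import socket
import time

def format_line(path, value, timestamp=None):
    """Render one metric in carbon's plaintext line format."""
    ts = int(time.time()) if timestamp is None else timestamp
    return f"{path} {value} {ts}\n"

def send_metrics(lines, host="127.0.0.1", port=2003):
    """Push a batch of formatted lines to a carbon-cache or carbon-relay."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall("".join(lines).encode("utf-8"))

line = format_line("servers.web01.cpu.load", 0.42, 1510000000)
print(line, end="")  # → servers.web01.cpu.load 0.42 1510000000
```

Anything that can open a TCP socket can emit metrics this way, which is a big part of why so many tools interoperate with the Graphite stack.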
[00:36:13] Unknown:
Yeah, and I think the answer to that challenge becomes doing aggregation at the ingest layer. No matter how you tackle the problem, as we scale out and you get to the point where you have metrics blinking in and out of existence on a time frame of seconds, it just isn't realistic to think about storing each of them individually. And at that point there are, again, a lot of tools coming up around the carbon line protocol that allow for taking metric streams like that, aggregating them on the fly as part of processing them as a stream, and then emitting aggregated streams that can go into a storage system.
And I think that's definitely going in the direction of the observability aspect of things. I think we'll see more of that, where you have stream processing going on as the metrics come in, and indeed as log data comes in, and then pulling useful metrics out of those and storing those useful metrics, rather than storing everything that's coming in raw on the assumption that you're maybe going to want to look at it later.
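A toy sketch of the ingest-time aggregation Dan describes: collapse short-lived per-container series into one stable per-service series before anything hits storage. The `service.container.metric` naming scheme here is an assumption for illustration.

```python
from collections import defaultdict

# Toy sketch of ingest-time aggregation: collapse short-lived
# per-container series into one stable per-service series by dropping
# the container-id path component and summing the values.
def aggregate(samples):
    totals = defaultdict(float)
    for path, value in samples:
        service, _container, metric = path.split(".", 2)
        totals[f"{service}.{metric}"] += value
    return dict(totals)

stream = [
    ("web.c1f9.requests", 12),
    ("web.a77b.requests", 30),
    ("web.c1f9.errors", 1),
]
print(aggregate(stream))  # → {'web.requests': 42.0, 'web.errors': 1.0}
```

Tools in the carbon ecosystem (statsd-style aggregators, stream processors) do essentially this continuously, emitting the rolled-up series on a fixed interval so the containers' churn never reaches the database.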
[00:37:50] Unknown:
Are there any other challenges or difficulties that you've faced while maintaining and improving the various Graphite projects that you'd like to talk about, that we haven't already covered?
[00:38:02] Unknown:
Oh, I have a very special pain in my heart as it pertains to Graphite and Python. Dan, would you care to take a stab at that?
[00:38:12] Unknown:
You wanna swallow that pill, or should I? Go for it, Jason.
[00:38:16] Unknown:
You know, again, you have to keep in context that I'm not an expert Python developer. I'm a guy who struggled mightily for some years to maintain a project, to grow a project, to juggle multiple repositories within this stack of a project. Let me just go ahead and pull the band-aid off: Python packaging. That's always been a very painful issue for me, whether it's the packaging tools themselves or, I guess, PyPI. I should also add documentation, and part of that is the decisions you make as a maintainer, how you choose to do your documentation and your versioning.
But in our case, if you made a mistake in a doc, you had to cut a new version. And I know those are philosophical issues that the Python community at large is generally in agreement on. Like, there's a right way to do this and a wrong way to do this. Well, there's a right way to do it, and then everything else is the wrong one. I'm not gonna touch that one. Again, correct me if I'm wrong, but if we made a mistake in packaging one thing, you pretty much have to start over. Or not even start over, because you can't roll it back; you have to cut a complete new release. And as someone who's done packaging in other languages, let's just say that's a bit of a confusing state of things.
I don't know. I guess that was my biggest thing, just the general pain around Python packaging, and I don't think I'm the only person to have experienced that. I'm sure if I were more experienced, I probably would have either avoided those problems or found better ways around them. But if any user who's used a tool for some period of time can't avoid shooting their foot off, that probably speaks to the user experience at large. Yeah. Packaging is definitely one of the pain points that gets brought up often
[00:40:27] Unknown:
when talking about Python as a language and an ecosystem. I think that's still one of the areas that the community at large is working to improve as the tooling develops over time and the state of the art evolves. Yeah. Definitely. Managing dependencies
[00:40:48] Unknown:
without a tool like virtualenv, or the new super-virtualenv, Docker containers, definitely gets exciting. And certainly that's been one of the big hurdles for people to get over in terms of being able to use Graphite: just getting all of the different components installed with all the right dependencies and talking to each other. So Docker has helped a lot there in allowing people to really quickly get Graphite up and running; it's very helpful for people to be able to get over that hump, feed in some data, make some queries, and get that data back again. I think there are still definitely a lot of rough edges in the way we do the install.
Django definitely complicates things a little bit. I'd say there's a lot we can do within the Graphite project to make that packaging and install process more streamlined and easier for people to tackle. It's certainly something I'm interested in helping with. Again, it's a matter of triage: there are only so many hours to dedicate to the project, and I'm trying to contribute where I can get the most bang for my buck, as it were.
[00:42:18] Unknown:
I think what frustrates me, as an ex-maintainer (let's just say ex-maintainer) of Graphite, with Python in general, is that installing Graphite and upgrading Graphite can, in certain scenarios, go very seamlessly. Upgrading Django, upgrading dependencies, it can be relatively painless if you do it certain ways. The problem, I feel, is that the language doesn't make those experiences universally consistent. I wrote a tool specifically to demonstrate how easy it can be to install Graphite, a project called Synthesize, which has turned out to be pretty popular, because there's a lot of acute pain around these tasks, with Graphite and with Python.
But again, if you're very opinionated about how you do it, it's not that bad. Unfortunately, as a successful and ubiquitous open source project, you can't afford to be that opinionated, because people have a wide variety of different environments and there's so much flexibility in the way you can deploy: Python with virtualenv, or system Python, or any number of different things. So it's kind of your responsibility as a successful project maintainer to support all of those roughly equally.
[00:43:41] Unknown:
And that's where a lot of the pain is for me. I'd say that's fair. At the same time, I think there are definitely some things within Graphite that make it more difficult than it needs to be, and streamlining the process doesn't mean we need tools that don't exist in Python, or that there's something wrong with Python itself. I think there are a few things that Graphite does that are unusual, and there's definitely a lot of potential to make things more streamlined. There's an upper limit to how streamlined things can get within Python, but I don't think we're near that limit yet.
[00:44:21] Unknown:
Hopefully, we will be. Yeah. Python doesn't deserve all the blame for that. It definitely has its share of warts, but as Graphite maintainers go, you can definitely point at people like me who made really questionable decisions along the way. So I'll fall on that sword.
[00:44:38] Unknown:
And speaking of challenges presented by the Python runtime itself, I was noticing in the documentation that Graphite is still Python 2 only. So I'm curious what's going to be involved in the process of porting Graphite to run on Python 3. That's a good question. And yes, you're correct, it is currently 2.7
[00:44:59] Unknown:
only. We've made some strides toward compatibility with 3. One of the big things I was involved with was updating the internal render function library to work with 3, so we're chipping away at it. Again, not enough hours in the day, unfortunately. But I don't think there are any major structural issues that would prevent the move. I've certainly had my fair share of frustration with some of the decisions that were made in the change within Python going from 2 to 3, some things that I think everyone who's worked with Python for a while has had to deal with, but they're all things we can overcome, and I look forward to the day when we can be fully compatible with 3. I believe Carbon is already compatible with both 2.7 and 3.
And, yeah, anyone who wants to lend a hand on Graphite-Web is more than welcome.
[00:46:17] Unknown:
And if the Graphite project were starting over from scratch with greenfield development today, do you think you would still use Python as the language for implementing it? The biggest problem
[00:46:28] Unknown:
that we have in Graphite today that can be pegged on Python is unfortunately also one of Python's strengths. The performance of a lot of the render functions, I think, could be better in a different language. The flavor of the month seems to be Golang, and certainly we've seen a couple of different Graphite render library clones written in Go, none of which are 100% complete, and it's definitely a moving target with the amount of activity there's been in the project right now. But if we were going to go greenfield, the biggest factor for me would be the performance we could expect to get out of the rendering engine. Python gives you a huge amount of help in terms of how easy it is to write analytical functions, but the performance is definitely not as good as you could get with some other languages.
[00:47:44] Unknown:
So I'd actually like to reframe the question a bit for Dan, because I have my own answer, but it's a little off track from where he went with that. It sounds like your primary concern is with the rendering engine. But given that Grafana handles so much of the rendering now, are you referring to basically the read and formatting path before the data hits Grafana? When I say the rendering
[00:48:09] Unknown:
function library, I'm talking about the data processing functions. Okay, so you're not talking about graph rendering? No, not at all.
[00:48:18] Unknown:
Okay, more the mathematical side and how it works. Absolutely. Okay, that makes sense. So, would you still use Python? I would say that you could still use Python. If you redesigned it knowing what we know now: let's avoid using heavy web stacks; we know which parts to optimize in terms of Carbon, the metrics bus, so to speak. There are a lot of design decisions you could make now where I think Python would still be suitable. I think the big advantage that Golang has over Python for a project like this is the portability and the ease of deployment.
And there's no question about that. Look at any successful competing time series database these days, and one of the large differentiators is that ability to quickly and easily deploy Influx or Prometheus without having to worry about all these dependencies and packaging and whatnot. So I would say that's probably the main differentiator, competition-wise, between Graphite in Python and a similar project in a different language like Golang.
[00:49:34] Unknown:
And recognizing that there are a lot of dimensions along which scaling operates, and a lot of nuance to the question, what are the options for scaling Graphite and making it highly available, particularly given how important it is, in an operations context, to be able to track and view metrics?
[00:49:57] Unknown:
I was thinking about that particular thing earlier. That's still one of the things that impresses me most of all. When you look back at how long Graphite's been around and how long the general architecture has existed, it's impressive that it can continue to scale to the degree it can today using the same basic architectural primitives: the fact that you can horizontally scale out the inbound metrics traffic with the carbon relays, the carbon caches, and the carbon aggregators, and that you can scale out the rendering to a degree. I mean, there are still limitations.
You can scale out the rendering back end, and if you're using the pluggable storage back ends, you can scale those out too. Ironically, probably one of the biggest limiters in these highly clustered Graphite stacks is your statsd tier, where you might do aggregation before the metrics come in, because you have to have affinity with those metrics: when you're adding these things up, you have to make sure they're all added up in the same place before they get passed along. So that actually tends to be the hardest part. I've found that with even a single modern NVMe-backed VPS you can do, I don't remember the exact numbers, but I want to say roughly 400,000 metrics per second, and with any reasonably sized cluster you can do many millions of metrics per second. Again, that's not to say you necessarily want to; you can have all kinds of discussions around tagging and namespace issues and the type of ephemeral hosts that I alluded to earlier. But I think by and large you can scale it out, and I think it's amazing that the architecture has aged as gracefully as it has. Yep, I would definitely agree.
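The horizontal scaling and the affinity problem Jason describes both come down to routing: carbon-relay's consistent-hashing mode pins each metric name to one carbon-cache, so a given series always lands on the same back end. A much-simplified sketch of the idea follows; carbon's real ring differs in detail (instance names, replica placement), so treat this as illustration only.

```python
import hashlib

# Simplified sketch of relay-style routing: a hash ring pins each
# metric name to one destination, so the same series always lands on
# the same carbon-cache. Carbon's real consistent-hash ring differs
# in detail; this just shows the idea.
class HashRing:
    def __init__(self, destinations, points=100):
        # Place each destination on the ring many times for smoother balance.
        self.ring = sorted(
            (self._hash(f"{dest}:{i}"), dest)
            for dest in destinations
            for i in range(points)
        )

    @staticmethod
    def _hash(key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def get(self, metric):
        # Walk clockwise to the first ring point at or after the metric's hash.
        h = self._hash(metric)
        for point, dest in self.ring:
            if h <= point:
                return dest
        return self.ring[0][1]  # wrap around

ring = HashRing(["cache-a:2003", "cache-b:2003", "cache-c:2003"])
assert ring.get("web.cpu.load") == ring.get("web.cpu.load")
```

Because the mapping is stable, adding a cache only remaps the slices of the ring adjacent to its points rather than reshuffling every series.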
[00:51:48] Unknown:
And it scales even better in the new releases than it did in the past. We've spent a fair bit of time optimizing some of the paths for dealing with clustered deployments, and making the front-end Graphite-Web nodes able to communicate efficiently with a large number of back-end storage Graphite-Web nodes. So I think that's going to continue to give Graphite the ability to scale out by sharding and by load balancing. And, yeah, like Jason said, the core architecture there scales really quite well.
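As a concrete reference point, the relay-side sharding and replication discussed here is configured in `carbon.conf`; a representative fragment is below (the addresses and values are illustrative, not a recommended deployment).

```ini
# carbon.conf — representative relay settings (values are illustrative)
[relay]
LINE_RECEIVER_INTERFACE = 0.0.0.0
LINE_RECEIVER_PORT = 2003
RELAY_METHOD = consistent-hashing
# Store every metric on two members of the destination pool:
REPLICATION_FACTOR = 2
# host:port:instance entries for the downstream carbon-caches:
DESTINATIONS = 10.0.0.1:2004:a, 10.0.0.2:2004:b, 10.0.0.3:2004:c
```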
[00:52:40] Unknown:
And given how important it is to be able to track the health of the overall system across a company's infrastructure, what are some of the design choices or development practices that you use to ensure that Graphite can operate reliably and fail gracefully?
[00:53:01] Unknown:
The way that Graphite scales to handle load is closely related to the way it can be deployed to handle failure. You can absolutely have a load-balanced pool of Graphite-Web front ends, and at the same time the carbon pipeline gives you the ability to have the same metric stored on all of the relevant hosts. So with a little bit of HAProxy, or similar technology, in between the tiers, you can quite easily put together a very resilient system. Yeah, one of the nice things, and I forget exactly which version this was, it might have been back around 0.9.12:
[00:53:48] Unknown:
one of the contributors added the ability to basically coalesce metrics from different back ends. At some point in the past, prior to this particular change, if you had mismatched metrics on different back ends and the Graphite render API went out and fetched the data, and one back end had missing data while the other had everything, then if the series with missing data just happened to come back first, that's what you'd get. And not surprisingly, less data tends to return faster than more data.
So at times, if you had an outage and your data wasn't synchronized properly across two back ends, you'd tend to get the one with gaps in your data. After that particular change, it really didn't matter. You still want to keep things in sync manually, but it was smart enough to take both sets, coalesce them, merge them together, and give you a complete set. That speaks to Graphite's ability to scale and fan out using the tools that are native to it, like the carbon relays, but also to the fact that even when you don't have a chance to recover or repair your clusters, you can still get accurate data, which is nice. It's one of the things I think speaks to the maintainer philosophy we've carried over the years, which is: be stable, avoid surprises.
And, you know, I forget the gentleman's name that used to work at Etsy, but it's like he said: boring technology is good. Don't surprise the user. Don't frustrate the user. And, gosh, try not to just drop data on the floor, which, unfortunately, not all time series engines adhere to.
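The back-end merge behaviour Jason describes can be sketched in a few lines: when the same series comes back from multiple back ends, prefer whichever copy has a real value at each point. This illustrates the idea, not graphite-web's actual implementation.

```python
# Sketch of coalescing the same series fetched from several back ends:
# at each timestamp, take the first non-None value available, so a
# back end with gaps is filled in by the others.
def coalesce(*series):
    merged = []
    for points in zip(*series):
        merged.append(next((p for p in points if p is not None), None))
    return merged

backend_a = [1.0, None, 3.0, None]  # back end that missed some writes
backend_b = [1.0, 2.0, None, 4.0]   # back end with the complementary gaps
print(coalesce(backend_a, backend_b))  # → [1.0, 2.0, 3.0, 4.0]
```

Even with both copies partially degraded, the reader gets a complete series, which is exactly the "accurate data without repairing the cluster first" property described above.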
[00:55:36] Unknown:
Yeah. Nobody wants too much excitement when they're dealing with the availability of, and visibility into, their infrastructure. Graphite has been around for a long time, and the ecosystem for visibility and observability into your infrastructure has grown in terms of new tools and new services. So I'm wondering, what are some of the other open source projects or software-as-a-service platforms that you view as being the biggest competitors to Graphite?
[00:56:10] Unknown:
Oh, I don't think there are any. No, I'm kidding, of course. I would say, and this is probably not going to surprise many folks, that Prometheus is probably the biggest competitor. It's very popular, and it's taking a cloud-centric approach to time series. That said, I think both projects still have their own pros and cons. They take very different approaches to how the metrics are actually collected. Prometheus takes more of a traditional polling approach, where it has to reach out and pull the metrics from those endpoints, from those services, and then store them locally. Whereas Graphite, I tend to think of as more of an elegant architecture, in that it's kind of a metric stream: you've got emitters that are emitting metrics onto some metrics bus, and then you've got consumers that are pulling them off and storing them somewhere.
And that push model tends to work better naturally within different types of networks or firewalls, things of that nature. It's just an easier thing to maintain operationally, in my experience. I know the folks that use Prometheus are very excited about it. It came out of SoundCloud, and it's heavily supported by the Cloud Native Computing Foundation, so it's definitely got its strengths. I think probably the biggest thing it has going for it right now is the patterns around data collection.
Even more than the actual Prometheus service, I see a lot of people using their endpoints, their libraries, for actually emitting these metrics. And in a lot of cases, you'll have them using other tools, but with this Prometheus model and these Prometheus libraries. So I think that's interesting. I think there's definitely a lot of uptake in that project, and I'm glad to see them doing well. But me, personally, I still like the push model.
[00:58:17] Unknown:
Interesting. I see Prometheus not really as a direct competitor to Graphite-Web. It solves some of the same problems, but it's definitely more focused on the collection side and on kind of real time. Prometheus 2.0 has a little more in the way of longer-term storage, but it's still definitely not a direct competitor to Graphite. Hopefully, hours in the day permitting, once we can get the Prometheus external storage connector for Graphite up to scratch with the tag support in Graphite 1.1, I think we're going to see a lot more people using Prometheus and Graphite side by side, especially with Grafana layered over the top to give you a unified view into both systems.
I definitely agree with Jason when it comes to the push versus pull model. We deal with that at Grafana Labs all the time, where we're operating a time series database as a service, and the reality is that there are, very commonly, network issues on the Internet. Who would have thought it? With the push model, it's very easy to buffer the data on the producer side, and then, once you're able to connect to the ingest endpoint, go ahead and catch back up again. So you might see a delay, but you don't see any lost data. With a pull model, you'll tend to have a lot more complexity in dealing with those kinds of outages. What happens when the server can't pull the data for a given amount of time? Does the producer then have to buffer it? If so, how? It adds a lot of complexity. With a system where you're going over the Internet, it also opens up a whole can of worms around security and firewalls, where your collector has to be able to reach into systems and needs to be allowed to, versus a system being able to push data back out again. So I think Prometheus is a super interesting project.
I think it has a lot of really interesting applications. I see it as more of a complement to Graphite, handling the collection part and the real-time part, where Graphite is more on the storage and analytics side. As far as a direct competitor, I would say the biggest direct competitor for Graphite today is probably InfluxDB. It solves a lot of the same sorts of problems that Graphite solves, and it's a really great project. We know a lot of people through the Grafana community that use it. Influx and Graphite are definitely the two time series databases that we see the most.
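Dan's point about producer-side buffering in a push model can be sketched like this: metrics queue up while the ingest endpoint is unreachable and drain once it returns, so an outage shows up as delay rather than loss. The transport here is stubbed out and the class name is illustrative; a real client would write to a socket.

```python
from collections import deque

# Sketch of producer-side buffering for a push model: lines queue up
# while the endpoint is down and drain on the next successful send.
class BufferedPusher:
    def __init__(self, transport, max_buffer=10000):
        self.transport = transport               # callable; may raise
        self.buffer = deque(maxlen=max_buffer)   # oldest dropped if full

    def push(self, line):
        self.buffer.append(line)
        self.flush()

    def flush(self):
        while self.buffer:
            try:
                self.transport(self.buffer[0])
            except ConnectionError:
                return  # endpoint down; keep buffering
            self.buffer.popleft()

sent = []
down = True
def flaky_transport(line):
    if down:
        raise ConnectionError
    sent.append(line)

pusher = BufferedPusher(flaky_transport)
pusher.push("a 1 1510000000")   # buffered while the "network" is down
down = False
pusher.push("b 2 1510000001")   # reconnects and drains the backlog
print(sent)  # → ['a 1 1510000000', 'b 2 1510000001']
```

The bounded deque is a deliberate trade-off: under a very long outage the producer eventually sheds its oldest points rather than running out of memory.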
[01:01:37] Unknown:
I have to admit I'm scratching my head a little bit. You describe Prometheus as largely being focused on collection, and I don't disagree with that. But if you were to ask me, if you're building a new monitoring system, what's the one thing you don't need to worry about, the one part that's probably handled pretty well out there already, I would say collection is probably it. Yes and no. I think, for me personally, Prometheus's glaring weakness right now is that, well, we talk about Graphite's high availability and scalability, and it doesn't have that. Prometheus doesn't do federation.
You can run duplicate clusters, but there's no failover.
[01:02:28] Unknown:
Yeah. That's definitely been one of the things holding me back from spending a lot of time looking into it as a potential service that I would want to run: that lack of high availability
[01:02:40] Unknown:
being built into the system. Yeah. And that's where coupling it with a tool like Graphite can kind of give you the best of both worlds: you can use it as a collector, but then push the data back into a longer-term storage system. And at that point, absolutely, you can run multiple Prometheus instances, have them all push the data into a centralized, highly available Graphite cluster, and be able to query the data from there. That's certainly a very viable architecture.
[01:03:10] Unknown:
And when is Graphite not the right choice for tracking your system metrics?
[01:03:14] Unknown:
The biggest thing that Graphite has issues with today is the ability to deal with ephemeral series. That's absolutely something Graphite just does not do well today. That said, the pluggable nature of Graphite that we've talked about throughout the podcast gives people the ability to deal with that problem, and we're seeing more and more projects popping up that solve it while still using Graphite, just replacing the parts of Graphite that don't work so well for that use case. So the answer, I think, is that there's always some value to be found in Graphite; it's really a matter of how you use it and of finding the right tooling around the parts of Graphite that are useful. And I haven't come across another time series database that has a richer querying language.
That's definitely the biggest thing Graphite brings, and the reason we see tooling being built like a connector that lets you use InfluxDB as the TSDB behind Graphite while keeping the Graphite query language on top.
[01:04:42] Unknown:
that's really the strength of the Graphite project: the ability to use it in that composable way. Yeah. I would add that, as you talk about ephemeral systems being probably its weakest point, I would say that's probably the case for any time series system. You hear vendors talk about, oh, we can support cardinality up to hundreds of nodes, or so many values per tag. Everything has its limit. We're not talking about magic; it's math, it's computing.
There are always trade-offs. I think if you're going to be doing any type of high-volume containerized system with high turnover, you're going to hit limitations, and I think you're going to see every system doing some type of aggregation, or accepting some loss of fidelity, to support those environments. So again, I don't think Graphite is unique in that regard. It's the same problem we've had with data storage for time series, for logging, for most monitoring data, as far back as you can recall. I don't see that changing; I think it's just that we think of the data in different ways now. We're less focused on capturing system metrics at scale than we are on monitoring the load and the success of our work transactions, our workflows. Is this thing doing the thing it's supposed to be doing? Not, oh, it has a load of 45.24 and it's increasing slightly.
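To ground the earlier point about the query language's richness: render-API targets compose processing functions around a seed series, so a fairly involved transformation is a one-liner. The host and metric names below are assumptions; `movingAverage` and `aliasByNode` are standard graphite-web render functions.

```python
from urllib.parse import urlencode

# Illustration of composing render-API targets: processing functions
# nest around a seed series expression. The host and metric names are
# made up for the example.
target = "aliasByNode(movingAverage(servers.*.cpu.load, '5min'), 1)"
query = urlencode({"target": target, "from": "-1h", "format": "json"})
url = f"http://graphite.example.com/render?{query}"
print(url.split("?", 1)[0])  # → http://graphite.example.com/render
```

Here the wildcard fans out to one series per host, `movingAverage` smooths each over five minutes, and `aliasByNode(..., 1)` relabels each result by its hostname path component.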
[01:06:21] Unknown:
What are some of the most interesting or unusual or unexpected uses of Graphite that you're aware of?
[01:06:28] Unknown:
I imagine somebody's used Whisper to store something that only resembles time series, but I can't think of anything off the top of my head. I've certainly seen people do dashboards of their kegerators, but, heck, they did that at Librato, and I know I've seen other vendors do that too. So I wouldn't say that's
[01:06:47] Unknown:
unusually unique. Yeah. In the end, when you have a system that's designed to take anything you can measure and store it over time, it's really hard to point to a particular use case and say, this is a really weird way of measuring something and storing it over time. Anything you can measure is a valid use case, and whether you think it's a strange use of Graphite or not really comes down to whether you think the thing being measured is odd. Graphite doesn't presume to say, okay, this is a time series database focused on system metrics, or focused on this particular type of metric.
It's really that Graphite doesn't judge where the metrics are coming from. It just takes them and stores them.
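[Editor's note] As an illustration of how little Graphite demands of whatever you throw at it, here is a minimal sketch of feeding it a measurement over Carbon's plaintext protocol, which is one line per data point: metric path, value, Unix timestamp. The metric name and the kegerator reading are made up for the example, and the default Carbon listener port 2003 is assumed.

```python
import socket
import time


def format_metric(path, value, timestamp=None):
    """Build one line of Carbon's plaintext protocol:
    '<metric path> <value> <unix timestamp>\\n'."""
    if timestamp is None:
        timestamp = int(time.time())
    return f"{path} {value} {timestamp}\n"


def send_metric(path, value, host="localhost", port=2003):
    """Open a TCP connection to a Carbon listener and send one measurement."""
    with socket.create_connection((host, port)) as sock:
        sock.sendall(format_metric(path, value).encode("utf-8"))


# Anything you can measure is fair game -- this metric name is invented.
line = format_metric("home.kegerator.temperature_f", 38.2, timestamp=1510000000)
print(line)  # home.kegerator.temperature_f 38.2 1510000000
# send_metric("home.kegerator.temperature_f", 38.2)  # needs a live Carbon on :2003
```

Graphite doesn't validate or interpret the path; the dots simply become the browse hierarchy when you query it back out.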
[01:07:34] Unknown:
Yeah, I think it doesn't help that Dan and I are so close to both the problem and the solution space. As Dan mentioned, Graphite doesn't care; it's just going to take whatever you throw at it, and it's going to give it right back to you when you ask for it. I think there are a lot of cool uses of Graphite. I can't name specifics, but certain space agencies, you know? Gosh, what else? Companies that I'm proud to have had on my resume using it for their status pages, things like that that maybe people didn't realize were behind the scenes. If they knew, they'd probably be like, oh, that's kind of a cool use. But to me, it's just another place where Graphite has excelled,
[01:08:14] Unknown:
over the years. And we've talked about a lot of the upcoming features in some of the next releases. I'm wondering if there are any other new features or enhancements that you have planned for the future of Graphite that you'd like to share with the audience.
[01:08:27] Unknown:
There are always new things coming down the pipe. At the moment, the big stuff is really the tagging support and the associated updates to the Graphite query language: syntax updates to make it easier to query Graphite, and easier to read a query and understand what it's doing without having to jump around through nested function calls, and then the tag support and all that it implies as far as the ease with which you can get going and start sending metrics, and the different ways you can slice and dice the data after the fact. Those are really the big things that we're focused on right now. Hopefully, once Graphite 1.1 is out the door, it's going to be time to start looking forward at what comes next. I have a laundry list of more analytics functions that I'd like to see built in, and I've seen some really interesting work presented at Monitorama in terms of ASAP smoothing, which is a way of isolating signal from noise in the data that's stored.
So that's one thing that I'm really interested to take a shot at implementing within Graphite. There's always work to be done, and it's going to be really interesting to see how the road map comes together for the 1.2 release. I think a lot of that is going to get driven by what happens with 1.1 and what the users bring back to us in terms of what they're looking to see.
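[Editor's note] For a sense of what the tag support discussed above looks like on the wire, here is a small sketch of the Graphite 1.1 tagged-series naming scheme, where tags are appended to the metric name with semicolons. The helper function and the example tag values are hypothetical; `seriesByTag` is the query function that the 1.1 tag support introduces.

```python
def tagged_path(name, **tags):
    """Build a Graphite 1.1 tagged series name: 'name;tag1=val1;tag2=val2'.
    Tags are sorted so the same tag set always maps to the same series."""
    parts = [name] + [f"{k}={v}" for k, v in sorted(tags.items())]
    return ";".join(parts)


# Stored this way, the series can later be sliced by any tag with the
# seriesByTag query function, e.g. seriesByTag('name=disk.used', 'server=web01').
print(tagged_path("disk.used", server="web01", datacenter="dc1"))
# disk.used;datacenter=dc1;server=web01
```

The point of the scheme is exactly the "slice and dice after the fact" Dan mentions: instead of encoding every dimension as a fixed position in a dotted path, each dimension becomes an independently queryable tag.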
[01:10:12] Unknown:
Are there any other topics or questions that you think we should cover before we start to close out the show? I think we've covered a lot of ground. Okay. Alright. Well, I really appreciate you both taking the time to talk about Graphite. For anybody who wants to follow what you're up to and keep up to date with yourselves and the projects that you're involved with, I'll have you add your preferred contact information to the show notes. And with that, I'll move us to the picks. For my pick this week, I'm going to choose archery, because I just recently got my bow and arrows out of storage and I've been doing some target practice with my son in the backyard. It's been good fun, and a good way to get some exercise and hand-eye coordination training and step away from the computer for a bit. So I definitely recommend going out and shooting some bow and arrows if you have the space and the inclination. And with that, I'll pass it to you, Jason. Do you have any picks for us this week?
[01:11:08] Unknown:
I guess, and folks that follow me on Twitter might get a laugh out of this, but probably my big guilty pleasure outside of working on Graphite is Rocket League. It's a game with cars and soccer balls and turbo engines, and it's a lot of fun. I've been playing it for a couple of years now on PS4 and PC. It's actually coming out on the Nintendo Switch next week, so if you play the game or have any interest in playing against me, check it out, pick it up,
[01:11:42] Unknown:
and look for me at Obfuscurity. Alright. And, Dan, do you have any picks? Yeah, absolutely. I thought of something
[01:11:48] Unknown:
that is applicable both to Python and to Graphite, and my pick for the week is a Python project called Home Assistant. It's an open source home automation hub designed and built using Python, and a really neat project that I've been playing with for a little while. It gathers an awful lot of data from things like smart thermostats, light switches, stuff like that. I have not yet had time, but hopefully soon I will write a little script to start taking all of that data from around my house and stashing it in Graphite, so that I can graph it all in Grafana and see when my thermostats are calling for the heat or the AC to run and all that kind of stuff. So, a really neat little project and open source
[01:12:45] Unknown:
Python. Yeah, it's definitely an interesting project, and one that I had on the show a little while ago; that's actually proven to be one of the more popular episodes. So it's definitely fun to hear it keep being mentioned by people as I talk to them.
[01:12:58] Unknown:
Yeah, and the project seems to be really active, with a lot of good work going on over there.
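[Editor's note] The script Dan describes, pulling sensor states out of Home Assistant and stashing them in Graphite, might look roughly like the following sketch. The URL, token, and Carbon address are placeholders, and it assumes Home Assistant's REST states endpoint (`/api/states`) with a bearer token, which may differ by version; treat it as an outline rather than a working integration.

```python
import json
import socket
import time
from urllib.request import Request, urlopen

# Placeholders -- point these at your own instances.
HASS_URL = "http://localhost:8123"
HASS_TOKEN = "your-long-lived-access-token"
CARBON_HOST, CARBON_PORT = "localhost", 2003


def numeric_states(states):
    """Keep only entities whose state parses as a number
    (temperatures, humidity, power draw, ...)."""
    out = []
    for s in states:
        try:
            out.append((s["entity_id"], float(s["state"])))
        except (KeyError, ValueError):
            continue
    return out


def poll_and_forward():
    """Fetch every entity state from Home Assistant's REST API and
    forward the numeric ones to Carbon as plaintext-protocol lines."""
    req = Request(f"{HASS_URL}/api/states",
                  headers={"Authorization": f"Bearer {HASS_TOKEN}"})
    states = json.load(urlopen(req))
    now = int(time.time())
    lines = [f"homeassistant.{eid} {val} {now}\n"
             for eid, val in numeric_states(states)]
    with socket.create_connection((CARBON_HOST, CARBON_PORT)) as sock:
        sock.sendall("".join(lines).encode("utf-8"))


# poll_and_forward()  # uncomment to run against live Home Assistant and Carbon instances
```

Run on a cron or in a loop, each poll lands one data point per numeric sensor in Graphite, ready to be graphed in Grafana.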
[01:13:06] Unknown:
So actually, if anybody is interested in joining us for Monitorama 2018 next year, I believe it's in June in Portland, Oregon.
[01:13:15] Unknown:
We'll drop a discount code in the show notes, so look for that. Alright. Well, I really appreciate the both of you taking the time out of your day to join me and talk about the storied past of Graphite and its bright future. It's definitely a project that's been on my radar for a while, and after speaking with the both of you, I think it's gone back up to the top of my list of systems to choose when I deploy my own metrics stack at my job. So I appreciate that. It's been very educational, and I hope you enjoy the rest of your evenings. Thanks a lot. Thank you. Thanks for having us. It's been great fun.
Introduction and Conference Announcements
Guest Introductions: Jason Dixon and Dan Cech
Getting Started with Python
Overview of the Graphite Project
Graphite's Influence on Systems Monitoring
Components of the Graphite Stack
The Power of the Render API
Challenges with Early Design Choices
Maintaining and Improving Graphite
Porting Graphite to Python 3
Scaling and High Availability
Ensuring Reliability and Graceful Failure
Competitors to Graphite
When Not to Use Graphite
Interesting Uses of Graphite
Future Features and Enhancements
Closing Remarks and Picks