Summary
The Redis database recently celebrated its 10th birthday. In that time it has earned a reputation for speed, reliability, and ease of use. Python developers are fortunate to have a well-built client in the form of redis-py to leverage it in their projects. In this episode Andy McCurdy and Dr. Christoph Zimmerman explain the ways that Redis can be used in your application architecture, how the Python client is built and maintained, and how to use it in your projects.
Announcements
- Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
- When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show!
- And to keep track of how your team is progressing on building new features and squashing bugs, you need a project management system designed by software engineers, for software engineers. Clubhouse lets you craft a workflow that fits your style, including per-team tasks, cross-project epics, a large suite of pre-built integrations, and a simple API for crafting your own. Podcast.__init__ listeners get 2 months free on any plan by going to pythonpodcast.com/clubhouse today and signing up for a trial.
- Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email hosts@podcastinit.com
- To help other people find the show please leave a review on iTunes and tell your friends and co-workers
- Join the community in the new Zulip chat workspace at pythonpodcast.com/chat
- You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with O’Reilly Media for the Strata conference in San Francisco on March 25th and the Artificial Intelligence conference in NYC on April 15th. Here in Boston, starting on May 17th, you still have time to grab a ticket to the Enterprise Data World, and from April 30th to May 3rd is the Open Data Science Conference. Go to pythonpodcast.com/conferences to learn more and take advantage of our partner discounts when you register.
- Your host as usual is Tobias Macey and today I’m interviewing Andy McCurdy and Christoph Zimmerman about the Redis database, and some of the various ways that it is used by Python developers
Interview
- Introductions
- How did you get introduced to Python?
- Can you start by explaining what Redis is and how you got involved in the project?
- How does the redis-py project relate to the Redis database and what motivated you to create the Python client?
- What are some of the main use cases that Redis enables?
- Can you describe how Redis-py is implemented and some of the primitives that it provides for building applications on top of?
- How do the release cycles of redis-py and the Redis database relate to each other?
- How closely does redis-py match the features of the Redis database?
- What are some of the convenience methods or features that you have added to make the client more Pythonic?
- Redis is often used as a key/value cache for web applications, in some cases replacing Memcached. What are the characteristics of Redis that lend themselves well to this purpose?
- What are some edge cases or gotchas that users should be aware of?
- What are some of the common points of confusion or difficulties when storing and retrieving values in Redis?
- What have been some of the most challenging aspects of building and maintaining the Redis Python client?
- What are some of the anti-patterns that you have seen around how developers build on top of Redis?
- What are some of the most interesting or unexpected ways that you have seen Redis used?
- What are some of the least used or most misunderstood features of Redis that you think developers should know about?
- What are some of the recent and near-future improvements or features in Redis that you are most excited by?
Keep In Touch
- Andy
- @andymccurdy on Twitter
- andymccurdy on GitHub
- Christoph
- chrisAtRedis on GitHub
Picks
- Tobias
- Andy
- Christoph
Links
- redis-py
- Redis DB
- Redis Labs
- PHP
- Django
- Reflective Operating System Architectures
- TCL
- Perl
- Linux
- Memcached
- NextCloud
- C programming language
- uWSGI
- Flask
- Gevent
- PyPy
- re-json
- redis-graph
- Redis-search
- MongoDB
- Bloom Filter
- hiredis
- Redis Sentinel HA plugin
- Lua programming language
- OpenWRT
- LuCI
- MicroPython
The intro and outro music is from Requiem for a Fish by The Freak Fandango Orchestra / CC BY-SA
Today, I'm interviewing Andy McCurdy and Christoph Zimmermann about the Redis database, redis-py, and some of the various ways that it is used by Python developers. Hello, and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you're ready to launch your next app or want to try a project you hear about on the show, you'll need somewhere to deploy it. So say hi to our friends over at Linode. With 200 gigabit private networking, scalable shared block storage, node balancers, and a 40 gigabit public network, all controlled by a brand new API, you've got everything you need to scale up. Go to pythonpodcast.com/linode, that's L-I-N-O-D-E, to get a $20 credit and launch a new server in under a minute. And don't forget to thank them for their continued support of this show.
And if you're like me, then you need a simple and easy to use tool to keep track of all of your projects. Some project management platforms are too flexible, leading to confusion of workflows and days' worth of setup, and others are so minimal that they aren't worth the cost of admission. After using Clubhouse for a few days, I was impressed by the intuitive flow. Going from adding the various projects that I work on to defining the high level epics that I need to stay on top of and creating the various tasks that need to happen only took a few minutes. I was also pleased by the presence of subtasks, seamless navigation, and the ability to create issue and bug templates to ensure that you never miss capturing essential details.
Listeners of this show will get a full 2 months for free on any plan when you sign up at pythonpodcast.com/clubhouse. So help support the show and help yourself get organized today. And don't forget to visit the site at pythonpodcast.com to subscribe to the show, sign up for the mailing list, and read the show notes. And don't forget to keep the conversation going at pythonpodcast.com/chat. Registration for PyCon US, the largest annual gathering across the community, is open now. So don't forget to get your ticket, and I'll see you there.
[00:02:03] Unknown:
You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers, you don't want to miss out on this year's conference season. We have partnered with O'Reilly Media for the Strata Conference in San Francisco on March 25th and the Artificial Intelligence Conference in New York City on April 15th. In Boston, starting on March 17th, you still have time to grab a ticket to Enterprise Data World. And from April 30th to May 3rd is the Open Data Science Conference. Go to pythonpodcast.com/conferences to learn more and to take advantage of our partner discounts when you register.
[00:02:41] Unknown:
Your host as usual is Tobias Macey. And today, I'm interviewing Andy McCurdy and Christoph Zimmerman about the Redis database, redis-py, and some of the various ways that it is used by Python developers.
[00:02:56] Unknown:
So Andy, could you start by introducing yourself?
Sure. I'm Andy McCurdy. I'm a software engineer. I've been doing this for 20 years, and have worked with Redis and redis-py for the last 10 or so.
And Christoph, could you introduce yourself?
Yeah. Hi. My name is Chris Zimmerman. I've been using Redis since the second commit, I think. I've been using open source for the last 30 plus years, and I also took up a job at Redis Labs for the last half year. My main functions here are looking after the community in Central Europe as well as sales support, as in presales.
[00:03:32] Unknown:
And going back to you, Andy, do you remember how you first got introduced to Python?
I do. I actually had 2 different introductions. The first was when I was working in the video game industry in the early and mid 2000s. We were embedding a Python interpreter in our game engines so that level designers could script behavior much more easily than writing it in C++. And so I got a cursory knowledge of Python through that. But I'd say my real deep dive into Python was in 2007 and 2008, when we decided to abandon our homegrown PHP framework and adopt Django. And so I took the deep dive there and haven't looked back.
[00:04:11] Unknown:
And, Christoph, do you remember how you first got introduced to Python?
Oh, I absolutely do. It was in the early days when I did my PhD on reflective operating system architectures, and we were looking for a scripting language to drive mostly the QA effort that we were doing on the experimental microkernel that we implemented about 25 years back. We were looking at Tcl and Perl, and they didn't really fit the bill. And then this new language called Python came up. It was kind of love at first sight, and I've never really looked back. That was also the first touchpoint when I got introduced to Linux, which we used as the development and QA environment for the operating system kernel that we were doing.
And over the last, well, 25 years, I've grown to love Python more and more, given the fact that I'm mostly doing system administration tasks, and I love it. And I use it wherever I can, essentially. So full marks to Guido van Rossum for coming up with it in the first place.
[00:05:20] Unknown:
And so the conversation today, we're gonna be centering around the Redis Python client and some of the ways that it is used in conjunction with the Redis database. So can you start by giving a bit of an overview about what the Redis database is and how you each got involved in the project?
[00:05:37] Unknown:
Sure. If that's okay, I would like to start. Salvatore Sanfilippo did the first commit in 2009. So we're just looking at the 10th anniversary of the project on GitHub. When Salvatore was looking for something that he could use as part of a real-time, well, reporting basis, if you will, he took a look at MySQL. He took a look at other SQL databases which were around at the time, but none of them really fit the bill. So he sat down and started to implement a Memcached alternative initially.
Over the years, this has grown into a fully functional multi-model database, not just using keys and values, but rather offering many, many more data types. I'm sure we will go into more detail later on. About 2 years later, 1 and a half years later, I myself was looking for a PHP opcode cache for some FPM-based application servers that I was running at home. And these things were mostly running on embedded systems. So I chose NGINX as the main driver, FPM as the main PHP interpreter.
And I came across Memcached, of course, and also across something called Redis. And I've been using Redis as a PHP opcode cache ever since. Just to give you some impression, 1 of the production systems, for want of a better word, runs on a 6L 325, that is, a machine with 512 megabytes of main memory. It hosts about 3 NGINX instances, 3 FPMs, and 1 Redis instance driving all of these FPM instances from an opcode cache perspective. And this thing now drives about 4 ownCloud, or, I just upgraded the code base, Nextcloud instances. And Redis is the main driver there with regards to speed and performance.
[00:07:29] Unknown:
Yeah. It's an impressive amount of efficiency that you can get out of it. I know that a lot of the sort of more, quote, unquote, production grade databases that people will use outside of the relational engines are typically written in Java, which is much more heavyweight. So it's always nice being able to have something that's lightweight and efficient and easy to get up and running with for facilitating those types of use cases.
[00:07:49] Unknown:
I mean, when you take a look at the code on GitHub, this is a highly compact, highly efficient implementation. I took a look at the code base when I first decided to go for it. And over the years, Salvatore and the contributors, of course, have done an excellent job at maintaining the conciseness, the compactness of that code base. Needless to say, every now and then, I run it through certain code analysis tools like SonarQube and all the rest, and the code checks out. There are no hidden surprises.
There are no gotchas in there. And if you take a look at the issues list, yes, like every other code base, there are, of course, bugs and so forth. But in comparison to other projects, it's manageable. Let's put it this way.
[00:08:39] Unknown:
And, Andy, how did you first get involved in the Redis project?
[00:08:45] Unknown:
It was probably around late alpha or early beta, so shortly after Christoph got involved. We were heavy users of Memcached at the time, and we were having some issues around scaling 1 specific section of our site. I was looking for something that was like Memcached, but actually had native list types and atomic list operations. And I was all set to write this myself, foolishly, of course, because I couldn't have done nearly as good a job as Salvatore did. But in doing a little bit of research, fortunately, I stumbled upon Redis, and this was back when I think there were 4 or 5 client libraries bundled within the main server repository. 1 of those was a very minimalist Python client. It wasn't maintained. It was basically a proof of concept.
So I picked that up and started sending Salvatore patches. I think some of the other folks that were working with the client libraries in other languages were doing the same, and he probably got inundated with patch emails at that point and said, hey, why don't you guys just split these off and run them yourselves, and I'll link to you? So that's how redis-py was actually born.
[00:10:14] Unknown:
And so can you talk a bit more about what the redis-py project is and how it relates to the Redis database in terms of the features that it exposes and the way that it can be used within Python applications?
[00:10:19] Unknown:
Certainly. So, at the core, redis-py is the main Python client for Redis. It's not terribly difficult to build a minimalistic Redis client in any language, and there are a number of them in Python specifically, but redis-py definitely is the most popular and has the highest install base. I'd say that the primary mission is to build a canonical, Pythonic implementation that makes it natural for experienced Python developers to communicate with Redis.
[00:10:55] Unknown:
And so can you go over some of the main use cases that Python developers have used the Redis database for and some of the features that are unique specifically to Redis in terms of the application architectures that it enables?
[00:11:12] Unknown:
Sure. So I would say, far and away, the biggest use case that we see is people using it as a key value store for caching data in a similar way that Memcached would. There are a lot of interesting other ways to use it as well. Like Christoph said, Redis exposes a lot of other data types besides just plain byte strings. So sorted sets, for instance, are something very unique to Redis and something that can be used to highly optimize specific access patterns.
[00:11:55] Unknown:
And so in terms of the redis-py project itself, can you talk a bit about how it's implemented and some of the difficulties or challenges associated with tracking the features of the upstream Redis project?
[00:12:12] Unknown:
Sure. So redis-py tries to follow the Redis database's lead as closely as possible. What I aim for out of the box is a library that's easy to get started with and has sane defaults. So by default, it is thread safe in typical Python workloads. Whether you're using Gunicorn or uWSGI, serving a Django or Flask application, you simply create a singleton instance of the Redis client, and you're good to go. I also try to provide functionality to make it easy to work with less typical architectures, such as gevent or eventlet, or running on PyPy or other Python interpreters that are less mainstream.
[00:13:03] Unknown:
And what are some of the feature primitives that are available in the Redis database and the redis-py library that application developers can build on top of?
[00:13:11] Unknown:
So you start off with basic key value strings. That's where everyone, I think, starts off. And that's very similar to the functionality that someone might be used to with Memcached. But there are other data types that Redis exposes as well, and there are atomic operations around each of those primitives. So Redis has a list data type. Similar to a Python list, you can insert a value at a particular location, or you can append something to the end of the list. You can pop things off either the head or the tail of the list. There are even concepts like blocking pop, where the client can choose to block on pulling an item off of the list. And if there are no items on the list, it will block until another process puts an item on the list. So with that, you can build something like a message queue. Taking it a step further, there are sorted sets, which are like lists, but each element has an accompanying score value. Redis keeps the sorted set ordered by those scores in memory, so you can do very quick operations like, for instance, pagination.
If you have millions of items in a sorted set, you can very efficiently get page 2,395, similar to a slice in Python. Redis provides hashes, which are similar to Python dictionaries. So it's a single key that can then have other keys and associated values. Redis also provides more advanced systems like publish/subscribe, where you can have a number of clients listening for events on a particular channel, and you can have publishers that publish events to those channels, and those events are replicated to all of the consumers. This would allow someone to build a chat system. In fact, I've built several chat systems with this, using WebSockets and consumers that listen to chatroom messages.
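The sorted-set pagination mentioned above can be sketched roughly as follows. This assumes `client` is a redis-py `redis.Redis` instance and that pages are numbered from 1; `page_bounds`, `fetch_page`, and the default page size are hypothetical names and values chosen for illustration.

```python
def page_bounds(page, page_size):
    """Translate a 1-based page number into inclusive ZRANGE start/end indexes."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    start = (page - 1) * page_size
    end = start + page_size - 1  # unlike a Python slice, the ZRANGE end index is inclusive
    return start, end


def fetch_page(client, key, page, page_size=25):
    """Return one page of (member, score) pairs, ordered by ascending score."""
    start, end = page_bounds(page, page_size)
    return client.zrange(key, start, end, withscores=True)
```

For the "page 2,395" case with 25 items per page, `page_bounds(2395, 25)` yields the index range 59850 to 59874, which is what gets passed to `ZRANGE`.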
A newer concept that Redis just launched with version 5 is streams. Streams are somewhat similar to publish/subscribe in that you do have publishers and you have consumers, but the messages are more durable. The problem with typical publish/subscribe is that it's fire and forget. So if a publisher publishes a message and a client happens to not be connected at that time, the client just doesn't get that message. With streams, messages are published with key value pairs, and they're added to a queue, and a client can connect and consume those messages. And if the client becomes disconnected, the server remembers where the client left off, so the client can reconnect and then pick up where it left off.
[00:16:13] Unknown:
Maybe I can chime in here. Streams is 1 of the recent additions on the server side where the gap between application requirements and what the server has to offer narrows more and more. And the implementation in redis-py is similar to what you see in any client library that talks to a Redis instance, because the programming languages or the client-side libraries would mirror the data types that are offered by the server. So streams would be the latest addition. I actually started to extend the code base of redis-py with streams; that work was then taken over, and the pull request was accepted, I think, in October or November. And then the official launch, I think, was in November, Andy, of streams in redis-py?
Yeah. That sounds right.
So this is what you see when you take a look at C# support, at Perl support, at Rust support, at Go support. If the data types are not supported natively by the client-side library, as in the programming language, you will more often than not have a compound data type reflecting the structure on the server side. So most programming languages, for example, would offer lists. But when it comes down to sorted lists, more often than not you will find a compound data type where a list item is amended with the score, reflecting the server-side implementation.
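To make the "pick up where it left off" behavior of streams a little more concrete, here is a minimal sketch using the `xadd` and `xread` methods of a redis-py client. It is a simplification: in this version the client keeps track of the last entry ID it has seen and passes it back on the next read, whereas consumer groups (`XREADGROUP`) would let the server do that bookkeeping. The function names and the default starting ID are assumptions made for the example.

```python
def publish_event(client, stream, fields):
    """Append an entry of field/value pairs to a stream; Redis assigns and returns its ID."""
    return client.xadd(stream, fields)


def read_since(client, stream, last_id="0-0", count=100):
    """Return (entries, new_last_id) for entries added after `last_id`.

    Each entry is an (entry_id, fields) pair. Passing the returned ID back in
    on the next call resumes reading exactly where the previous call stopped.
    """
    reply = client.xread({stream: last_id}, count=count)
    entries = []
    for _stream_name, stream_entries in reply:
        entries.extend(stream_entries)
    if entries:
        last_id = entries[-1][0]
    return entries, last_id
```

A consumer that was disconnected for a while only needs to have stored its last ID somewhere durable; entries published in the meantime are still in the stream and are returned by the next `read_since` call, which is the difference from fire-and-forget publish/subscribe.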
And there is even another concept that we haven't touched upon yet called modules. Modules allow you to extend the functionality of a Redis server by writing custom code in C, or any other language that allows C bindings, and then inserting that module into the code base of the server, so to speak, when you start up the server. So in Linux parlance, you would have something called a shared object that you load at server startup time. And this shared object then typically, as part of its module implementation, would implement a specific application data type. There are quite a few modules around these days.
Many of them are actually implemented by Redis Labs. So for example, there's something called ReJSON that implements a JSON-based document type in C, which allows you to store documents in a Redis server instance, index them to some extent using JS query, and finally retrieve them. There are other modules in the area of graph databases. So there's something called RedisGraph that turns a Redis instance into a graph database where you simply create a couple of nodes, create a couple of edges, attach some properties to these items, and then finally connect the nodes through the edges, forming a graph similar to Neo4j or other graph-based databases.
All of these modules typically have Python client or library implementations that are based on redis-py. So the idea is, again, for these modules to provide application-specific data types that are reflected in the client-side libraries. So RedisGraph, as in the Python client-side implementation, provides data types like edges, nodes, and finally graphs. So creating a graph in RedisGraph Python is quite straightforward. You define nodes. You define edges. And when you build them into a graph, as in when you connect nodes with edges, you just have to say commit, and then the graph is instantiated, as in committed, on the server side for any client with the right credentials to traverse. And this is what you see in all of the module implementations on the client side, where the structure of the data types on the server is simply reflected by similar compound or native scalar data types in the client-side implementation. And this is something, I think, that differentiates Redis from the rest of the pack, because the module SDK specification is also, of course, published, because Redis is open source. It's under the BSD license, so everybody can take a look at the code. The SDK is specified on redis.io. And there are, as I said, a couple of module implementations on GitHub where you can simply take a look at how Redis is extended using modules. And, of course, you're free to create your own module and then slot this into your server implementation.
This makes, I think, Redis pretty flexible when it comes down to narrowing the gap between application requirements and what the server has to offer more and more.
[00:21:09] Unknown:
Particularly in terms of the JSON module plugin, that definitely simplifies a lot of the application requirements, because I know that in my own experience using Redis as an object cache, you would often need to serialize and deserialize complex Python dictionaries to and from JSON in order to be able to store and retrieve them properly. So being able to offload that responsibility to the database layer, as well as provide some indexing into those objects, adds a lot more power than trying to do string matching of the value in Redis in order to determine which key values you need to retrieve. So the module capability is something that I wasn't previously very familiar with, so it's nice to see that it has that plug-in capability so that you can, as you said, extend the server layer with some of these application concerns and thereby simplify the code that each developer needs to write on their own.
[00:22:05] Unknown:
And it gets even better. Come RedisConf, which takes place in April, we're going to launch an open source infrastructure called RedisGears that allows you to connect modules to each other, so you can combine, for example, RediSearch, which is a full-blown real-time text search engine, as an indexing engine with ReJSON, giving you something very similar in functionality to ordinary document-based databases like MongoDB or Couchbase.
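For context, the client-side serialize and deserialize step that the host describes, which a JSON module would take over, looks roughly like the sketch below when done by hand. It assumes `client` is a redis-py `redis.Redis` instance; `cache_set`, `cache_get`, and the five-minute default expiry are hypothetical choices for this example.

```python
import json


def cache_set(client, key, value, ttl_seconds=300):
    """Serialize a Python object to JSON and store it with an expiry (SET ... EX)."""
    return client.set(key, json.dumps(value), ex=ttl_seconds)


def cache_get(client, key):
    """Fetch a cached value and deserialize it; returns None on a cache miss."""
    raw = client.get(key)
    if raw is None:
        return None
    return json.loads(raw)
```

Because Redis only sees an opaque string here, any lookup by a field inside the document has to happen in application code after `cache_get`, which is the limitation the module approach is meant to address.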
[00:22:32] Unknown:
And in terms of the release cycles for redis-py, how does that compare to the upstream releases, and what are the challenges associated with maintaining feature parity between the Python client and the upstream database?
[00:22:51] Unknown:
I try to be vigilant in reading the mailing list and finding changes. It was a lot more complex years ago, when it seemed like every month or every few weeks new commands were coming out or existing commands were being changed to add new functionality or behavior. I think 1 of the biggest challenges is that I've really been impressed by a number of packages in the Python community, such as Django, that adhere to really strict backwards compatibility requirements. And I've tried to make that promise to users of redis-py: that upgrading to a new version isn't going to break them, or will at least warn them for a few versions before it does break them. Unfortunately, some of the changes that are made on the database side make it hard to keep that promise. And I think that's 1 of the bigger challenges, where folks in the Python community really do appreciate that backwards compatibility guarantee. And, you know, sometimes we just have to break it.
[00:24:06] Unknown:
You mentioned earlier that you tried to keep the client fairly Pythonic. So I'm wondering what are some of the cases where you break from the syntax that the upstream Redis project itself supports and some of the convenience methods that you provide to simplify developers' efforts in interfacing with the Redis database?
[00:24:31] Unknown:
So there are a number of subtle things. I try to make the client have the same argument order and argument types. But oftentimes there are things like attempting to put an expiry on a key. You can tell Redis, save this key and mark it to be expired in, say, 10 seconds or an hour from now. Redis doesn't have any concept of a datetime primitive, but I know a lot of people in Python are oftentimes working with dates and times. And so I'll provide convenience methods that will accept either an integer to specify that time as an offset in seconds, or you could supply a datetime, and it would mark the expiration precisely at that time. I'll accept *args for a list of items to, say, the delete command or the MGET command, where you can get the values of a lot of keys in 1 request, or **kwargs to implement different options. Say there's a command called SET, where by default you give it a key and a value, and that's persisted to the database. But there are a number of options that you can also pass to SET. Those are expressed as keyword arguments instead of the user having to remember what the precise incantation of the Redis command is.
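The conveniences described above can be sketched as a few thin wrappers. This assumes `client` is a redis-py `redis.Redis` instance from the 3.x series or later; the wrapper names are hypothetical and exist only to show which redis-py call each idea maps to.

```python
from datetime import datetime


def save_with_ttl(client, key, value, seconds):
    """SET key value EX seconds: the expiry is passed as a keyword argument."""
    return client.set(key, value, ex=seconds)


def save_if_absent(client, key, value, seconds):
    """SET key value EX seconds NX: options combine as keyword arguments."""
    return client.set(key, value, ex=seconds, nx=True)


def expire_at(client, key, when: datetime):
    """EXPIREAT: redis-py accepts a datetime here as well as an integer Unix timestamp."""
    return client.expireat(key, when)


def fetch_many(client, *keys):
    """MGET: several keys are fetched in one request via positional arguments."""
    return client.mget(*keys)


def remove(client, *keys):
    """DEL: any number of keys can be passed as positional arguments."""
    return client.delete(*keys)
```

For example, `expire_at(client, "session:42", datetime(2030, 1, 1))` marks the key to expire at that moment without the caller converting the datetime to a Unix timestamp by hand.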
[00:26:16] Unknown:
And that goes for the streams interface as well. I can remember the discussions on Slack that we had internally with Andy when he took a look at the pull request and verified that the parameters actually reflect the documentation on redis.io when it comes down to streams. So, again, this is a very Pythonic approach, I think, to modeling the client side and maintaining the spirit of Python: its simplicity, its elegance, is that the word I'm looking for? And its conciseness, first and foremost.
So if you take a look at redis.io, which is the website where all the server-side commands are documented, and the options and all the rest of it, you'll see exactly the same keywords in the corresponding parameters when you invoke redis-py on the client side. And I think that philosophy makes it very coherent for somebody reading up on the server-side documentation, then sitting down implementing not only his first, but maybe his second or third client using redis-py, because he simply takes a look at the documentation, sees the parameter that he has to use, maybe spends some time initially reviewing the redis-py implementation or documentation on GitHub. And I think the documentation is on Read the Docs, right?
It is. Yes.
Yes. So then he immediately notices the similarities between the client-side implementation and the server primitives. And I think that's what makes redis-py a very powerful and very easy to use client, because you can literally take the server-side primitives and directly slot them into Python on the client side, cutting down the implementation times and QA, I think.
[00:28:05] Unknown:
Yeah. And it definitely helps too in terms of being able to translate your understanding of how to work with Redis across multiple languages because, while we all might like to stick with Python for everything, there are occasions where we need to go into JavaScript or Ruby or PHP or whatever it might be, and Redis is equally useful in all of those contexts, but it increases the barrier to entry if you have to relearn a new API every time you want to do it even though you're interacting with the same back end.
Is anybody still using PHP? I'm only kidding. I'm only joking.
It still powers a significant portion of the web, but
[00:28:47] Unknown:
This is what I heard. Yes. Unfortunately, all the people have switched over to Django or Flask or whatever. Yes. That's correct, unfortunately.
[00:28:57] Unknown:
And you were mentioning that one of the first reasons that you came across Redis was to use it in a caching context, similar to what memcached provides. So can you give a bit of a compare and contrast in terms of the caching context, as far as the feature set that Redis has that might influence someone to choose it instead of memcached, and some of the edge cases that users should be aware of when using it as a caching layer? Sure. Well, originally, as I said,
[00:29:28] Unknown:
Redis was very similar in functionality to memcached, as in you have a key and you have a value associated with that key. But now I see this both in sales situations and when I look at the community. People have moved on big time from simple caching to much, much more sophisticated caching, where the variety of the server data types that Redis has to offer these days comes into play. For example, there's a company, unfortunately I cannot mention the name yet. It's a betting company that uses Redis as a user session cache for price modeling.
If you go to the website, you can place bets. These bets, of course, have a price, and these price calculations are time sensitive. So imagine your typical punter watching a soccer game on TV and then seeing that, for example, Gomez, who is a popular soccer star here in Europe, has just scored a goal. And he goes to the website and says, now look, what's the probability of Gomez scoring a second goal within the remainder of the game? So this betting company then takes your source IP, where you come from, essentially, when you want to place the bet, takes data from maybe your connected social profile, but also takes into account the opponent that Gomez is playing against down on the field. And all these factors contribute to the overall modeling of the price. And when I took a look at the price models, these were pretty sophisticated, and these were pretty complex. Needless to say, if you want to do this for each and every bet that a user is trying to get a price for, you rapidly run into performance problems. So this is what the company noticed when they were actively looking for an intelligent caching layer. So they chose Redis, and now Redis drives this in terms of caching some of the factors and then aggregating the remainder of the model in real time in memory.
So the user doesn't have to wait a couple of seconds to get a price for a bet, but rather gets it instantly because the caching layer sits directly on top of the web server. So turnaround times, and thus latency, are very, very short. And funny enough, the company is more than happy with the increased conversion rates because more and more people are actually able to put down money for bets on that website. And I think this is a very good example of how caching in Redis has evolved from a simple key value store to a much more sophisticated
[00:32:02] Unknown:
caching layer that drives business and keeps customers happy. I'd also say that, in a more general sense, I remember personally using Memcached in a way where I wanted to persist things more than just simple strings. And oftentimes, you lean on some sort of serialization library, whether that be JSON or Pickle, if you're using Python, or MessagePack, or something similar. You know, it's very common to see folks that use Memcached relying on that stuff. And then you run into these issues where you want to modify some specific item, you know, within that data structure. You have to do the whole, you know, get the entire data structure out, deserialize it, update the value, reserialize it, send it back to the database, and there's no guarantee that this is happening in any sort of atomic sense. The richness of the Redis data types really allows for a lot of that kind of behavior to be written in an atomic way and a much more efficient way, where you don't have to do the get, process, and set routine. You can just say, hey, update this specific attribute of this hash or this specific list.
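The contrast described here can be sketched in Python. This is a minimal illustration, not code from the episode: `client` is assumed to be a `redis.Redis` instance (or anything exposing the same `get`, `set`, and `hincrby` methods), and the key layout and the `visits` field are hypothetical.

```python
import json


def bump_visits_blob(client, key):
    """Memcached-style update: fetch the whole blob, edit it, write it back.

    Another writer can change the value between get() and set(), so this
    read-modify-write cycle is not atomic.
    """
    raw = client.get(key)
    profile = json.loads(raw) if raw is not None else {"visits": 0}
    profile["visits"] += 1
    client.set(key, json.dumps(profile))
    return profile["visits"]


def bump_visits_hash(client, key):
    """Redis hash update: a single HINCRBY, applied atomically by the server."""
    return client.hincrby(key, "visits", 1)
```

The second function touches only the one field it needs, so no serialization library is involved and there is no window in which a concurrent writer's change can be overwritten.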
The other kind of odd edge case that I think not very many people know about or run into: Memcached actually has a hard limit on the size that any value can be. And it's somewhere around a megabyte or maybe 2 megabytes, but you cannot change this value based on any sort of configuration. The only way to change it is to actually alter the constant in the code and rebuild Memcached from source. And so if you're trying to store any larger kind of payload, Memcached will just silently refuse to store it, and you're left wondering where your data went. You mentioned edge cases. Edge cases are a pretty important point because,
[00:34:07] Unknown:
needless to say, you cannot implement everything in Redis. Well, you can try, but Redis is only good for quite a few use cases. For example, typically, you wouldn't use a Redis instance to store petabytes of data in a typical traditional data warehouse scenario. But the caveat is, if you want to do something fancy, take a look at the ecosystem of things that are already out there. For example, again, a company wanted to build their own content delivery network. And as part of the implementation, they started to implement something like a Bloom filter. A Bloom filter is essentially a probabilistic data structure that is able to tell you, with a certain probability, whether a certain element is a member of a given set without having to store the complete set of members in main memory. There's a beautiful Wikipedia page on it. So if you're interested in the mathematical background, feel free to check this out.
But the point, essentially, is that this Bloom filter already had a module-based implementation. So when the company got around to doing their homework, they found out, oh, yeah, we don't have to do this ourselves. Bloom filters are very handy if you want to check whether something is in your local cache, in that you don't have to go upstream as a content delivery network provider, but can serve this particular piece of data right from your local cache if you can establish that, with a certain probability, it's already part of this memory. So Redis is more powerful than you think, especially if you take a look beyond what is obvious in terms of what is on the GitHub page and what the server provides. Chances are that somebody has done some work already. And if it's down to a module, that probably comes very near to what you're looking for. This is kind of a hint. And this is the beauty of open source, right, standing on the shoulders of giants. Chances are somebody has done the work for you, given the right license. You simply take the code base, extend it, and then give it back to the community.
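For readers who have not met the data structure before, here is a minimal pure-Python sketch of a Bloom filter. It is an illustration of the idea only, not the Redis module's implementation; the class name, the default sizes, and the choice of SHA-256 with a one-byte seed are all assumptions made for the example.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: may report false positives, never false negatives."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size_bits = size_bits
        self.num_hashes = num_hashes  # must stay below 256 for the seed byte
        self.bits = bytearray((size_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions by hashing the item with a seed byte.
        data = item.encode("utf-8")
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(bytes([seed]) + data).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item):
        # True means "probably present"; False means "definitely absent".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )
```

In the content delivery scenario, a `False` from `might_contain` means the object is definitely not in the local cache, so the request has to go upstream; a `True` means it is probably cached and worth a local lookup first.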
[00:36:14] Unknown:
And another one of the edge cases that I've come across as well, particularly using the key value capabilities, is that if you're trying to delete a large number of keys using the wildcard capabilities, then it can end up being a significant performance hit if you have a large number of keys because of needing to do a full scan. And so if you namespace things using the hashes that are implemented in Redis, it can speed things up a lot and give you the ability to get a lot more utility out of the key value capabilities.
[00:36:47] Unknown:
Absolutely. Actually, when we start Slack internally, there's a warning saying, don't use KEYS * in production. Very important.
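One common way to avoid a blocking `KEYS` call is to walk the keyspace incrementally with `SCAN` and delete in batches. The sketch below assumes `client` is a `redis.Redis` instance (only its `scan_iter` and `delete` methods are used); the function name and the batch size are hypothetical choices for the example.

```python
def delete_matching(client, pattern, batch_size=500):
    """Delete keys matching a glob pattern without issuing KEYS.

    scan_iter() walks the keyspace with SCAN in small steps, so the server
    is never blocked for one long full scan.
    """
    deleted = 0
    batch = []
    for key in client.scan_iter(match=pattern, count=batch_size):
        batch.append(key)
        if len(batch) >= batch_size:
            deleted += client.delete(*batch)
            batch = []
    if batch:
        deleted += client.delete(*batch)
    return deleted
```

The namespacing approach mentioned above sidesteps the scan entirely: if related entries are stored as fields of one hash, deleting that single hash key removes the whole namespace in one command.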
[00:36:59] Unknown:
And when you're talking to users of Redis and redis-py, what are some of the common points of confusion or difficulties that they encounter when they're first getting started with using Redis, whether it's as a cache or as a queue or for some of these more complex and feature-rich capabilities?
[00:37:16] Unknown:
I think the biggest issue is probably the idea around data types and what you can store in Redis kind of implicitly and what you need to mangle on the client side, or on the application side, before sending it. So prior to redis-py 3.0, the client took a very liberal stance in terms of what kind of data values it would accept when it was encoding your data prior to sending it to Redis. If it was a data type that it knew about, so like a string or a byte string or an integer or a floating point number, all those are fine. It just knows how to, you know, convert those to a string and send that off to Redis. For anything else, it very liberally just called, you know, the Python string function on the value and sent that off. And that confuses a lot of people because now, all of a sudden, if I, say, pass a Boolean type, well, the string representation in Python of a Boolean type is either the string "True" or the string "False". Likewise, the string representation of a None value in Python, or a null value in other languages, is the string "None". And when you then refetch that value, you actually have, you know, that string representation, rather than, you know, something that is a little more intuitive to interact with.
And so redis-py 3.0 introduced a backwards-incompatible change that prevents users from doing that. You can now only persist values of types that the client knows about, specifically strings, byte strings, integers, and floating points. Everything else you are responsible for converting in application space prior to passing the value to the library. And the goal of this is to make it very clear to users that the values that you're sending to Redis are stored as strings and are going to be retrieved as strings. And there's no kind of type hinting or anything that Redis provides that would allow the client to automagically, you know, convert or coerce that value back into the type that you're expecting.
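What "converting in application space" might look like is sketched below. The helper names and the conventions chosen here ("1"/"0" for Booleans, JSON for containers, refusing `None`) are assumptions for illustration; redis-py itself does not prescribe any of them.

```python
import json


def encode_value(value):
    """Convert a Python value into a type redis-py 3.0+ accepts."""
    # bool must be checked before int, because isinstance(True, int) is True.
    if isinstance(value, bool):
        return "1" if value else "0"
    if isinstance(value, (str, bytes, int, float)):
        return value  # already one of the accepted types
    if value is None:
        raise ValueError("None has no Redis representation; delete the key instead")
    return json.dumps(value)  # lists, dicts, and so on


def decode_bool(raw):
    """Inverse of the Boolean convention above; replies may be bytes or str."""
    return raw in (b"1", "1")
```

Because Redis hands everything back as a string, the reading side has to know which convention was used when writing, which is why the decoder is explicit rather than automatic.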
[00:39:51] Unknown:
Also, something that novice users sometimes stumble upon is that Redis, similar to Python, requires a certain mindset, a philosophy. Is that what I'm looking for? And it can be used in multiple ways. I mean, the old Python adage still applies. There's at least one way of doing things, if not more, in Python. The same goes for Redis. Take sorted sets, for example. If you take a look at the documentation, you'll see, yes, each and every item of a sorted set has an associated score. To the innocent reader, the score looks numerical. But if you take a look at the implementation, it can be anything. It can also be a plain string. So the thing is that, unless your use case demands it, you don't have to convert values into floating points before you add them to a sorted set. If the semantics of the use case allow it, you can simply store strings as score values in a sorted set, meaning that you'd get around the conversion on the client side of things. So, again, the advice is take a look at the documentation and spend a little bit of time thinking. Spend a few cycles before you start coding.
[00:41:07] Unknown:
And what have been some of the most interesting or unexpected ways that you've seen Redis and redis-py used?
[00:41:14] Unknown:
One of the more interesting ones for me, I had an issue pop up on the issue tracker, and the short version is the user was attempting to store very large binaries as values in Redis. I believe that the user was trying to serve torrents or, you know, some sort of large downloadable file, like an ISO or something, Linux images or something, directly out of Redis. And so they were running into issues where, you know, the client wasn't necessarily optimized for sending multiple gigabytes of data as a single value.
And I found that, like, a very peculiar edge case.
[00:42:02] Unknown:
Especially if you're running on constrained memory on the client side, because, yes, Python does something called garbage collection. And then you just might run into problems if you don't have enough main memory, depending on how your data is structured. But, I mean, suffice it to say, touching on this very particular point, I did a use case recently internally where I essentially forwarded binary data to RediSearch. And maybe now is the time to shed some more light on the internal architecture of Redis and the client-side implementation, because you already kind of touched upon the overall architecture. We have the database server.
And most of the time, the client side would use a wrapper called hiredis. This wrapper is a thin code layer around sockets or other communication mechanisms offered by the operating system, like ports and IP addresses. And hiredis then uses a binary-safe representation of strings in order to implement the wire protocol. The wire protocol is called RESP. Essentially, it's something that each and every client has to implement when it wants to talk to a server. And building upon this hiredis layer, which is essentially just a wrapper taking strings and then building the RESP protocol, the client side would then define primitives that reflect the data types offered by the server. And, Andy, correct me if I'm wrong, if I understood the Python implementation correctly. That's exactly the way we do it in Python. Essentially, we have something called hiredis-py, which is just a thin wrapper around hiredis, essentially representing hiredis as a Python object, like a CPython object in the reference implementation of Python.
And then redis-py essentially talks to hiredis-py, implementing the server-side API, including connections, the data types the server has to offer, and so forth. And built on top of that are, for example, the client-side implementations for all of the modules. So internally, redisearch-py, redisgraph-py, rejson-py, you name it. They all use redis-py internally, which again is built upon hiredis-py, which talks to the server. So this is your typical open source ecosystem where many people have collaborated in order to come up with something very powerful, namely, for example, in the case of RedisGraph, a full-blown graph interface into Redis with the benefits that Redis has to offer out of the box, like performance and scalability.
So this is how it works, essentially, on the client side, and redis-py is no exception here. Just to add on to that a little bit,
[00:45:18] Unknown:
the hiredis-py library does not manage sockets or connections at all. It's simply a protocol parser, so it can read and write the RESP, R-E-S-P, protocol, and redis-py integrates with that in such a way that if hiredis-py is installed on the system, it will choose to use that, as it is much faster. But if hiredis-py is not installed on the system, then there's a fallback, pure Python implementation. And this is useful for, you know, other architectures that don't support C modules, like, say, Jython
[00:46:00] Unknown:
or something similar. But at the very bottom, I think, hiredis is doing the work before it talks to the operating system. If hiredis-py is installed. But you don't have to have hiredis-py installed,
[00:46:13] Unknown:
or hiredis for that matter, in order to use redis-py.
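To make the protocol layer concrete, here is a small sketch of how a command is framed in RESP, together with an optional-import guard in the spirit of the fallback just described. The function name and the `HIREDIS_AVAILABLE` flag are hypothetical; this is not redis-py's internal code, only an illustration of the wire format, in which a request is an array of bulk strings.

```python
# Optional accelerator: use the C parser if present, otherwise stay pure Python.
try:
    import hiredis  # noqa: F401
    HIREDIS_AVAILABLE = True
except ImportError:
    HIREDIS_AVAILABLE = False


def encode_command(*args):
    """Encode a command as a RESP array of bulk strings."""
    parts = [b"*%d\r\n" % len(args)]  # array header: number of arguments
    for arg in args:
        data = arg if isinstance(arg, bytes) else str(arg).encode("utf-8")
        # Each argument is a bulk string: byte length, CRLF, payload, CRLF.
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(parts)
```

Because every argument carries its own byte length, the payload can contain any bytes at all, which is what makes the protocol binary safe.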
[00:46:18] Unknown:
Oh, okay. I thought you had to, but maybe I should read the documentation once again. And what are some of the design anti patterns that you've seen where people are abusing Redis in ways that it wasn't necessarily meant to be used or that aren't very sustainable or scalable.
[00:46:37] Unknown:
If you're coming from your ordinary SQL-based background and you don't want to leave that trodden path, I reckon NoSQL, and especially Redis, is not for you. I've seen quite a few implementations where people try to replicate ordinary SQL-based behavior, i.e. fixed table structures, indexes for performance, and all the rest of it, and sometimes even strange things like foreign key constraints and so forth, in their code talking to Redis. And this is not something that Redis was built for, because Redis, like any other NoSQL database, really shines at, for example, semi-structured or unstructured data. I already mentioned the fact that the interface is binary safe, so you can pump any value into, say, a key, whether it's binary data or whatever. The server doesn't mind.
And, also, if your data doesn't have real structure, you just want to ensure that it's representable as a string or something like a hash, and off you go. Redis will take care of the rest. And I think it's that sort of mindset that you have to rethink, that you have to ponder when you want to make that switch from your traditional SQL-based environments to something that is unpredictable, has less structure, but has much higher scalability requirements than your ordinary databases, because this is what the current state of things is. Right? I mean, we still have general ledgers that are driven by mainframes.
But if you take a look at the new world, if you want to call it that, you have street lights giving you sensor data in terms of what is the current visibility. Can I turn myself off? Is it bright enough outside, or do I still have to power a couple of light bulbs in order to keep the street safe and sound? So I was just talking to an automobile manufacturer recently, and he told me that your ordinary car has about 20 computers built into it, not counting the entertainment system. Code base, we're looking at 10,000,000 lines of code, again, without the entertainment system. And here in Europe, there's new legislation where each and every car actually has at least 2 SIMs built into it so that, if you have a car accident, the emergency and first responders can act immediately based upon the crash data that the computer sends to some sort of cloud. Needless to say, this is something that insurers are also quite interested in, because they can actually track drivers' behavior and come up with much more competitive policy prices.
But that's not the point. The point is that all of these new shiny things send you data that is way beyond what you have been used to for the last, what, 20 or 30 years when you're coming from a traditional SQL-based background. Because, as I said, most of the time this data is not structured, has different formats, if any at all, and comes in gigabytes or terabytes and not kilobytes or megabytes or something. So if you want to build an architecture for that sort of use case, you want to take a close look at NoSQL databases.
And especially if you're talking about real-time performance, Redis is probably something worth taking into account, because this is where Redis comes from. And also, if I take a look at the use cases where people use us, this is basically where we excel with that sort of technology, because it's in memory. Hence the caching start.
[00:50:27] Unknown:
And what are some of the least used or most often misunderstood features of Redis that you think developers should know about and consider using in their applications?
[00:50:37] Unknown:
Excellent question. Again, I talk to quite a few people using Redis, and the perception is there: oh, yeah, you have that key value store. That perception changes once they take a look at redis.io and see what Redis really has to offer in terms of very application-oriented data types. The beauty is that you can set up your cluster with open source components, meaning that you can construct a scalable and highly available solution pretty much by sticking together a couple of Redis instances. There's something called Sentinel that will take care, for example, of the high availability requirements you may have. You don't have to buy fancy licenses for this. You can do this pretty much out of the box. And, again, that's something that not that many people know unless they take a look at what's out there in terms of the code base already existing.
Of course, there are now also cluster-aware libraries. I know that redis-py is one of them. So right from the start, you have the foundation for a distributed system where an application can talk to a cluster instead of a single database instance.
[00:51:52] Unknown:
And are there any other aspects of the redis-py project itself or use cases for the Redis database that we didn't discuss yet that you think we should cover before we close out the show? You know, one thing we did not touch on at all is, Redis has this concept. It has an embedded Lua interpreter in it. So,
[00:52:12] Unknown:
similar to a relational database, you can actually write Lua scripts that will execute atomically, you know, via a command, and you can pass arguments, similar to being able to pass those arguments to any other Redis command. So it's akin to, like, a stored procedure in that you can group together a number of these different Redis primitives and, you know, make them all execute as a single transaction.
[00:52:39] Unknown:
So essentially, that allows you to dynamically create scripts that you can then simply take, put on the server, and execute. Why was Lua chosen? Lua actually drives the web interface in my OpenWrt-based router. It's a very compact scripting language, and I think this is the main reason why Salvatore chose it when he was looking for an easy way to extend the server functionality before modules were invented. The interpreter has a footprint of about 160k. And as I said, it powers something called LuCI, which is the web interface in my router. And it's pretty performant because, essentially, we're looking at a 32-bit MIPS processor driving the router, as in the system-on-a-chip thing. The idea is, of course, because it's all open source, one could take a look at minimalistic Python implementations like MicroPython or something else. And it would be an interesting experiment to see whether you could actually replace Lua with Python on the server side, though not giving you the full-blown Python functionality, because MicroPython has certain restrictions and limitations when you compare it to a typical standard implementation like CPython. But what you could do then, essentially, is take portions of your code base and execute them on the server side. So if anybody is out there listening and has some spare cycles to play around with that concept, the code is on GitHub. And when you take a look at the Lua implementation on the server side, it's not complicated.
Essentially, Lua offers a certain interface that a C-based implementation can talk to. And there are only a handful of primitives, namely, upload a script to a server and then evaluate that script, as in execute it. You can also do this predefined, i.e. you ship off a string to the server, and the server gives you back an ID, and then you subsequently execute on that ID with parameters. So if somebody would take a fitting Python implementation with a similar minimalistic footprint, I think replacing this shouldn't be a big deal because, as I said, the interface is quite small, and you could then actually take portions of your Python code from the client side and ship it off to the server.
That has a beauty of its own because you don't have to expand your code base with another programming language, namely Lua. So you could confine yourself to Python. So, anybody out there, it would be a great experiment. Call to action.
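The upload-then-execute-by-ID flow described above maps onto the `SCRIPT LOAD` and `EVALSHA` commands. The sketch below assumes `client` is a `redis.Redis` instance (only `script_load` and `evalsha` are used); the Lua script, the function names, and the capped-counter use case are hypothetical and chosen only to show several primitives grouped into one atomic unit.

```python
# Lua source: increment a counter, but never past a limit passed as ARGV[1].
INCR_CAPPED = """
local current = tonumber(redis.call('GET', KEYS[1]) or '0')
local limit = tonumber(ARGV[1])
if current >= limit then
    return current
end
return redis.call('INCR', KEYS[1])
"""


def upload_script(client):
    """Ship the script text to the server; the returned ID is its SHA1 digest."""
    return client.script_load(INCR_CAPPED)


def incr_capped(client, sha, key, limit):
    """Execute the uploaded script by ID with one key and one argument."""
    return client.evalsha(sha, 1, key, limit)
```

The read and the conditional increment happen inside a single script execution on the server, so no other client can slip in between them, which is the stored-procedure behavior mentioned earlier.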
[00:55:10] Unknown:
So for anybody who wants to get in touch with either of you and follow along with the work that you're doing, I'll have you add your preferred contact information to the show notes. And so with that, I'll move us into the picks. And this week, I'm going to choose the actor Rowan Atkinson. I recently watched his latest movie, the third Johnny English, and as always, he is hilarious and very expressive. I've enjoyed his work for a long time. So for anybody who hasn't watched anything with Rowan Atkinson in it, it's definitely worth checking out. And so with that, I'll pass it to you, Andy. Do you have any picks this week?
[00:55:45] Unknown:
I do. I have 2, actually. Over the last 4 or 5 years, I've really gotten into cooking and trying to understand the science behind cooking. And one of the most invaluable books that I found, as an engineer approaching cooking and wanting to really understand the science behind it, is a book called The Food Lab: Better Home Cooking Through Science, and it's by a chef here in the Bay Area named J. Kenji López-Alt. And the second pick that I have, I do like video games, and there is a new community mod for Dota 2 called Auto Chess, and it is just exploding. I think it's been out for a month or 2. I think it was just announced it has 4,000,000 plus installs at this point, and it's a fantastic kind of strategy puzzle mod that completely changes the way that you might play Dota 2, and I'm obsessed with it. Alright. And, Christoph, do you have any picks this week? 3, actually.
[00:56:54] Unknown:
One nontechnical pick, and this is something that I've discovered recently: India Pale Ales, IPAs, infused with grapefruit juice. For those of the listeners who don't know IPAs, IPAs are strongly hopped handcrafted, well, craft beers. Let's put it this way. Most of the time, hopefully, handcrafted too. IPAs go back to the point in time when the English were still actively involved in India, hence the name, because that beer had to suffer a long journey from England to India. So they put a lot of alcohol into these beers and normally put a lot of hops into them too, with the idea, of course, that once that batch arrived in India, it would be watered down to normal drinking strength. Of course, that never happened. So people literally took the IPAs and consumed them as they were. You're looking at beers with an ABV of typically more than 6%. So recently, well, for the last decade, with the craft beer movement, if you will, some of the more experimental types basically got into the old tradition of infusing beers with fruit juice, and that's exactly what this particular type of IPA is. If you like your IPAs with a certain flavor of bitterness, this is your best shot. The technical pick, actually, again, 2 things come to mind. Of course, redis-py goes without saying, because anything I do in Python with Redis, I do through redis-py, and also the module stuff. Because it's based on redis-py, I normally use the client-side Python libraries. The second technical pick, again, is a Python module called Selenium.
For those of the dear listeners who don't know Selenium, Selenium is a web scraping framework that you can simply use to download websites and then extract information. And this is something that comes in handy if you want to do front-end, as in client-side, quality assurance on your website, because drafting a Selenium-based test script is as simple as importing Selenium, opening up a driver, looking for an element, extracting the value behind this element, and then seeing whether this is the one you want or not. It's straightforward. Rebooting, for example, my router through a Selenium-based Python script is about 20 lines, give or take. So it saves a lot of effort when you're into that sort of thing. And the last pick that I have is somewhere between technical and nontechnical.
It's actually an author called Daniel Suarez. I don't know if you know him, Tobias. He's kind of famous for his very realistic, well-researched novels. Influx is probably one of the better known, or I think the second one was called Darknet or something. He has a particular style of writing that is very well researched. You can actually see that he puts a lot of effort into writing, and the stories that he portrays in his books are very well researched and based on facts. So it's rather science than fiction. And the beauty of this particular author is that he manages to keep up the pace. So it's not only pretty realistic what he writes about, but his style of writing is also pretty close to something that you would expect from your recent James Bond movie in terms of pace and action. And these are my picks. Alright. Well, thank you both for taking the time today to join me and discuss the work that you've been doing with the Redis database and the Redis Python client. I have used them both fairly
[01:00:34] Unknown:
extensively in my own work, so I appreciate all the effort you've put into it, and I hope you enjoy the rest of your day. Thank you very much for having us.
[01:00:41] Unknown:
Wonderful. Thanks, Tobias.
Introduction to Redis and Guests
Guest Introductions
Overview of Redis Database
Redis Py Project and Its Features
Modules and Extensions in Redis
Redis as a Caching Layer
Common Issues and Edge Cases
Design Anti-patterns and Misunderstood Features
Lua Scripting and Future Directions
Closing Remarks and Picks