How To Include Redis In Your Application Architecture

Today, I'm interviewing Andy McCurdy and Christoph Zimmermann about the Redis database, redis-py, and some of the various ways that it is used by Python developers.

Hello, and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you're ready to launch your next app or want to try a project you hear about on the show, you'll need somewhere to deploy it. So say hi to our friends over at Linode.

With 200 gigabit private networking, scalable shared block storage, NodeBalancers, and a 40 gigabit public network, all controlled by a brand new API, you've got everything you need to scale up. Go to pythonpodcast.com/linode, that's L-I-N-O-D-E, to get a $20 credit and launch a new server in under a minute. And don't forget to thank them for their continued support of this show.

And if you're like me, then you need a simple and easy to use tool to keep track of all of your projects.

Some project management platforms are too flexible, leading to confusion of workflows and days' worth of setup, and others are so minimal that they aren't worth the cost of admission.

After using Clubhouse for a few days, I was impressed by the intuitive flow.

Going from adding the various projects that I work on to defining the high level epics that I need to stay on top of and creating the various tasks that need to happen only took a few minutes.

I was also pleased by the presence of subtasks,

seamless navigation, and the ability to create issue and bug templates to ensure that you never miss capturing essential details.

Listeners of this show will get a full 2 months for free on any plan when you sign up at pythonpodcast.com/clubhouse.

So help support the show and help yourself get organized today.

And don't forget to visit the site at pythonpodcast.com to subscribe to the show, sign up for the mailing list, and read the show notes. And don't forget to keep the conversation going at pythonpodcast.com/chat.

Registration for PyCon US, the largest annual gathering across the community, is open now. So don't forget to get your ticket, and I'll see you there.

You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis.

For even more opportunities to meet, listen, and learn from your peers, you don't want to miss out on this year's conference season.

We have partnered with O'Reilly Media for the Strata Conference in San Francisco on March 25th

and the Artificial Intelligence Conference in New York City on April 15th.

In Boston, starting on March 17th, you still have time to grab a ticket to the enterprise data world.

And from April 30th to May 3rd is the open data science conference.

Go to pythonpodcast.com/conferences to learn more and to take advantage of our partner discounts when you register. Your host as usual is Tobias Macey. And today, I'm interviewing Andy McCurdy and Christoph Zimmermann about the Redis database, redis-py, and some of the various ways that it is used by Python developers.

So Andy, could you start by introducing yourself? Sure. I'm Andy McCurdy. I'm a software engineer. I've been doing this for 20 years, and have worked with Redis and redis-py for the last 10 or so. And Christoph, could you introduce yourself? Yeah. Hi. My name is Chris Zimmermann. I've been using Redis since the second commit, I think. I've been using open source for the last 30 plus years, and I also took up a job at Redis Labs for the last half year. And the main functions here are looking after the community in Central Europe as well as plain old sales support, as in presales.

And going back to you, Andy, do you remember how you first got introduced to Python? I do. I actually had 2 different introductions. The first was when I was working in the video game industry in the early and mid 2000s.

We were embedding

a Python interpreter in our in our game engines

so that level designers could,

script behavior much easier than kinda writing it in C++. And so I kinda got a cursory knowledge of Python through that. But I'd say my real deep dive into Python was in 2007 and 2008, when we decided to abandon our homegrown PHP framework and adopt Django.

And so I kinda took the deep dive there and haven't looked back. And, Christoph, do you remember how you first got introduced to Python? Oh, I absolutely do. It was in the early days when I did my PhD on reflective operating system architectures, and we were looking for a scripting language to drive mostly the QA effort that we were doing on the experimental microkernel that we implemented about 25 years back. We were looking at Tcl and Perl, and they didn't really fit the bill. And then this new language called Python came up. It was kind of love at first sight, and I've never really looked back.

That was the first kind of touchpoint when I also got introduced to Linux, which we used as the development and QA environment for the operating system kernel that we were doing.

And over the last, well, 25 years, I've grown to love Python more and more, given the fact that I'm mostly doing system administration tasks, and I love it. And I use it wherever I can, essentially. So full marks to Guido van Rossum for coming up with it in the first place. And so the conversation today, we're gonna be centering around the Redis Python client and some of the ways that it is used in conjunction with the Redis database. So can you start by giving a bit of an overview about what the Redis database is and how you each got involved in the project?

Sure. If that's okay, I would like to start. Salvatore Sanfilippo did the first commit in 2009. So we're just looking at the 10th anniversary of the project on GitHub. When Salvatore was looking for something

that he could use as part of real time,

well, reporting basis, if you will, he took a look at at MySQL. He took a look at other SQL databases

which were around at the time, but none of them really fit the bill. So he sat down and started to implement a memcached alternative

initially.

Over the years, this has grown into a fully functional

multi-model database, not just using keys and values, but rather offering many, many more data types. I'm sure we will go into more detail later on. About 2 years later, 1 and a half years later, I myself was looking for a PHP opcode cache for some

FPM based application servers that I was running at home. And

these things were mostly running on embedded systems. So I chose NGINX as the main driver, FPM,

as the main PHP interpreter.

And I came across Memcached,

of course, and also across something called Redis.

And I've been using

Redis as a PHP opcode cache ever since. Just to give you some impression, 1 of the production systems for want of a better word runs on a 6L

325

that is machined with 512 megabytes

of main memory, hosts about 3 NGINX instances,

3 FPMs,

and 1 Redis instance driving all of these FPM instances from an opcode

cache perspective.

And this thing now drives

about

4 ownCloud, or rather, I just upgraded the code base, Nextcloud instances. And Redis is the main driver there with regards to speed and

performance.

Yeah. It's an impressive amount of efficiency that you can get out of it. I know that a lot of the sort of more, quote, unquote, production grade databases that people will use outside of the relational engines are typically written in Java, which is much more heavyweight. So it's always nice being able to have, something that's lightweight and efficient

and easy to get up and running with for facilitating those types of use cases. I mean, when you take a look at the code on GitHub, this is a highly compact, highly efficient implementation.

I took a look at the code base when I first decided to go for it. And over the years, Salvatore and the contributors, of course, have done an excellent job at maintaining the conciseness, the compactness of that code base. Needless to say, every now and then, I run it through certain code analysis tools like SonarQube and all the rest, and the code checks out. There are kind of no hidden surprises. There's no kind of gotchas in there. And if you take a look at the issues list, yes, like every other code base, there are, of course, bugs and so forth. But in comparison to other projects, it's manageable. Let's put it this way. And, Andy, how did you first get involved in the Redis project?

It was probably around late alpha or early beta, so shortly after Christoph got involved. And we were heavy users of Memcached at the time. And we were having some issues around scaling 1 specific section of our site. And I was looking for something that was like Memcached, but actually had, like, native list types and atomic list operations.

And I was all set to write this myself, you know, foolishly, of course, because I couldn't have done nearly as good a job as Salvatore did. But in kind of doing a little bit of research,

fortunately, I stumbled upon Redis,

and this was back when I think there were 4 or 5 client libraries bundled within the main server

repository. 1 of those was a a very minimalist Python client. It wasn't maintained. It was

basically a proof of concept.

So I picked that up and started sending Salvatore patches. And I think some of the other folks that were working with the client libraries in other languages were doing the same, and he probably got inundated with patch emails at that point and said, hey, why don't you guys just all, you know, split these off and run them yourselves, and I'll link to you? So that's how redis-py was actually born. And so can you talk a bit more about what the redis-py project is and how it relates to the Redis database in terms of the features that it exposes

and the way that it can be used within Python applications? Certainly. So, you know, at the core, redis-py is the main Python client for Redis. It's not terribly difficult to build a minimalistic Redis client in any language, and, you know, there's a number of them in Python specifically, but redis-py definitely is the most popular and has the highest install base.

I I'd say that, you know, the the the primary mission is

to build kind of a canonical

Pythonic

implementation

that that makes it natural for experienced Python developers,

to communicate with Redis. And so can you go over some of the main use cases that Python developers

have used the Redis database for and some of the features that are unique

specifically to Redis in terms of the application architectures that it enables?

Sure. So I would say by far and away, the biggest use case that we see is people using it as a key value store for caching data in a similar way that memcached would.

There's a lot of, you know, interesting

other ways to use it as well.

Like Christoph said, you know, Redis exposes a lot of other data types besides just plain byte strings.

So

sorted sets, for instance, are,

something very unique to Redis

and something that can, be used to highly optimize,

specific access patterns.

And so in terms of the redis-py project itself, can you talk a bit about how it's implemented

and some of

the

difficulties

or challenges associated with tracking the features of the upstream Redis project? Sure.

So redis-py tries to follow the Redis database lead as closely as possible. What I aim for out of the box is a library that's easy to get started with and has sane defaults. So by default, it is thread safe in typical

Python workloads,

you know, whether you you're using, like,

gunicorn

or uWSGI,

serving a Django or Flask application, you simply

create a a singleton instance,

of the Redis client, and you're good to go. I also try to provide

functionality

to make it easy to work with less typical architectures,

such as gevent or eventlet

or, you know, running on PyPy or other Python interpreters that are, you know, less mainstream.

And what are some of the

feature primitives that are available in the Redis database and the redis-py library

that application developers can build on top of? So you start off with with basic, you know, key value strings. That that's kinda where everyone, I think, starts off with. And and that's very similar to the functionality that that someone might be used to with memcached. But there are other data types that Redis exposes as well, and there are atomic operations around each of those primitives. So Redis has a a list data type. So you can,

similar to a Python list, you can

insert a value at a particular location or you can, append something to the end of the list. You can pop things off either the head or the tail of the list. There's,

even concepts like blocking pop where the client can choose to block on, pulling an item off of of the list. And if there are no items on the list, it will block until another process puts

an item on on the list. So with that, you can build something like a, a message queue. Taking it a step further, there's,

sorted sets, which are like lists, but each element has an accompanying score value. And Redis keeps the sorted set sorted in memory by those scores, so you can do very quick operations,

like,

for instance, pagination.

If you have millions of items in a sorted set, you can very efficiently get page 2,395, similar to a slice in Python. Redis provides

hashes, which are similar to Python dictionaries.

So it's a single key that can have then other keys and associated values. Redis also provides more

advanced

systems like publish subscribe,

where

you can have a number of clients listening

for events on a particular channel, and you can have, publishers

that publish events to those channels, and and those events are replicated to all of the consumers. This would allow someone to build a, like, a chat

system. In fact, I've I've built several chat systems,

with this, using WebSockets

and, you know, consumers that listen to chatroom messages.
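Two of the messaging patterns described above, a list-backed queue with a blocking pop and publish/subscribe fan-out for a chat room, might be sketched with redis-py roughly like this. It is a minimal sketch, assuming `client` is a `redis.Redis` instance that the caller has already created; the helper names and the `chat:<room>` channel naming are hypothetical and not something described in the conversation.

```python
def enqueue(client, queue, payload):
    """Producer side: append a job to the tail of a Redis list (RPUSH)."""
    return client.rpush(queue, payload)


def dequeue(client, queue, timeout=5):
    """Consumer side: pop from the head of the list (BLPOP).

    Blocks until another process pushes an item or `timeout` seconds pass.
    redis-py returns a (key, value) tuple, or None when the wait times out.
    """
    item = client.blpop(queue, timeout=timeout)
    if item is None:
        return None
    _key, payload = item
    return payload


def say(client, room, text):
    """Publish a chat message; returns how many subscribers received it."""
    return client.publish(f"chat:{room}", text)


def hear(client, room):
    """Yield chat messages for a room as they arrive.

    Publish/subscribe is fire and forget: anything published while this
    generator is not connected is simply never seen.
    """
    pubsub = client.pubsub()
    pubsub.subscribe(f"chat:{room}")
    for message in pubsub.listen():
        # Skip the subscribe confirmation and keep only real messages.
        if message["type"] == "message":
            yield message["data"]
```

A worker process would call `dequeue` in a loop, while each WebSocket connection in a chat server would iterate over `hear` for its room.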

A newer concept that Redis just launched with

version 5 is streams. Streams are

somewhat similar to to publish subscribe

in in that you do have publishers and you have consumers, but the messages are more more durable. So the the problem with,

typical publish subscribe is it it's fire and forget. So if a publisher publishes a message and a client happens to not be connected at that time, the client just doesn't get that message. With streams,

messages are published with key value pairs, and they're added to a queue, and a client can connect and consume those messages. And if the client becomes disconnected,

the server remembers where the client left off so the client can reconnect

and then pick up where it left off. Maybe I can chime in here. Streams is 1 of the recent additions on the server side where the gap between application requirements and what the server has to offer narrows down more and more. And the implementation in redis-py is similar to what you see in any client library that talks to a Redis instance, because the programming languages or the client side libraries would mirror the data types that are offered by the server. So streams would be the latest addition, and I actually started to extend the code base of redis-py with streams. Lab took it over, and he accepted the pull request, I think, in October or November. And then the official launch, I think, was in November, Andy, of streams in redis-py. Yeah. That sounds right. So this is what you see when you take a look at C# support, at

Perl support, at Rust support, at Go support. If the data types are not supported by the client side library, as in the programming language, you will more often than not have a compound data type reflecting the architecture or the structure on the server side. So most programming languages, for example, would offer lists. But when it comes down to sorted lists, more often than not, you will find a compound data type where a list item is amended with the score reflecting

the server side implementation.
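The pagination use case and the compound "item plus score" type mentioned above could look roughly like this in redis-py. This is a minimal sketch, assuming `client` is a `redis.Redis` instance and the sorted set already exists; `page_bounds`, `leaderboard_page`, and the default page size are hypothetical names and values chosen for illustration.

```python
def page_bounds(page, page_size):
    """Translate a 1-based page number into inclusive ZRANGE indexes."""
    start = (page - 1) * page_size
    return start, start + page_size - 1


def leaderboard_page(client, key, page, page_size=25):
    """Fetch one page of a sorted set, ordered by ascending score.

    With withscores=True, redis-py returns a list of (member, score)
    tuples, which is the compound type standing in for a sorted set entry.
    """
    start, stop = page_bounds(page, page_size)
    return client.zrange(key, start, stop, withscores=True)
```

Because ZRANGE takes index bounds much like a Python slice (except that the stop index is inclusive), asking for page 2,395 only changes the two integers sent to the server.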

And

there is even another concept that we haven't touched upon yet called modules.

Modules allow you to extend the functionality

of a Redis server by writing custom code in C or any other language that allows C bindings, and then to insert that module when you start up the server into the code base of the server, so to speak. So in Linux parlance, you would have something called a shared object that you load at server startup time. And this shared object then typically,

as part of its module implementation, would implement a specific

application

data type. There are

quite a few modules around these days.

Many of them actually

implemented by Redis Labs. So for example, there's something around called ReJSON that implements a JSON based document type in C, which allows you to store documents in a Redis server instance and index them to some extent

using JS query,

and to retrieve them finally. There are other modules in the area of graph databases. So there's something called RedisGraph that turns a Redis instance into a graph database where you simply create a couple of nodes, create a couple of edges, stack some properties on these items, and then finally connect the nodes through the edges, forming a graph similar to Neo4j or other graph based databases.

And all of these modules typically have Python clients or library implementations that are based on redis-py. So the idea is, again, for these modules to provide application specific data types that are reflected in the client side libraries by a typical implementation. So RedisGraph, as in the Python client side implementation, provides data types like edges, nodes, and finally graphs. So creating a graph in RedisGraph Python is quite straightforward. You define nodes. You define edges. And when you build them into a graph, as in when you connect nodes with edges, you just have to say commit, and then the graph is instantiated, as in committed, on the server side for any client with the right credentials to traverse. And this is what you see in all of the module implementations on the client side, where simply the structure of the data types on the server is reflected by similar compound or native scalar data types in the client side implementation. And this is something I think that differentiates

Redis from the rest of the pack because the module SDK specification is also, of course, published, because Redis is open source. It's under BSD license, so everybody can take a look at the code. The SDK is specified on redis.io. And there are a couple of, as I said, module implementations on GitHub where you can simply take a look at how Redis is

extended using modules. And, of course, you're free to use you're free to create your own module and then slot this into your server implementation.

This makes, I think, Redis pretty flexible when it comes down to even narrowing down the gap between application requirements and something that the server has to offer more and more. Particularly in terms of the,

JSON module plugin, that definitely simplifies a lot of the application

requirements because I know that in my own experience using Redis as an object cache,

you would often need to serialize and deserialize

complex Python dictionaries

to and from JSON in order to be able to

store and retrieve it properly. So being able to add the ability

to offload that responsibility

to the database layer as well as provide some indexing into those objects

adds a lot more power than trying to do

string matching of the value in Redis in order to determine which key values you need to retrieve. So the module capability is is something that I wasn't previously very familiar with, so it's nice to see that it has that plug in capability so that you can, as you said, extend the server layer with some of these application concerns and thereby

simplify the code that each developer needs to write on their own. And it gets even better. Come RedisConf that takes place in April, we're going to launch an open source infrastructure called RedisGears that allows you to connect modules to each other so you can combine, for example, RediSearch, which is a full blown real time text search engine, as an indexing engine with ReJSON, giving you something very similar in functionality to ordinary document based databases like Mongo or Couchbase.

And in terms of the release cycles for redis-py, how does that compare to the upstream, and what are the challenges associated with maintaining feature parity between the Python client releases and the upstream database? I try to

be vigilant

in in, you know, reading the mailing list and and finding, you know, changes.

It it was a lot more complex,

years ago when

it it seemed like every month or or every few weeks, new commands were coming out or existing commands were being changed

to, you know, add a new functionality or behavior. I I think 1 of the the biggest challenges is that I I've really been impressed by,

a number of packages in the Python community such as Django that adhere to

really strict backwards compatibility

requirements.

And

I tried to make that promise to users of redis-py that, you know, upgrading to a new version isn't going to break them, or will at least warn them first for, you know, a few versions before it does break them. Unfortunately, some of the changes that are made to the database side make it hard to keep that promise. And,

I I think that's 1 of the the the bigger challenges where, you know, folks in the Python community really do appreciate that backwards compatibility

guarantee.

And,

you know, it sometimes we just have to break it. You mentioned earlier that you tried to keep the client fairly Pythonic. So I'm wondering what are some of the cases where you break from the syntax that the,

upstream Redis project itself supports and some of the convenience methods that you provide to simplify

developers'

efforts

in interfacing with the Redis database?

So there's a there's a number of of

subtle things. I I I try to

make the client,

you know, have the same argument order and argument types.

But oftentimes

things like if you're

attempting to put, an expiry on a key. So you can tell Redis,

save this key and mark it, to be expired in, say, 10 seconds

or

an hour from now. Redis doesn't have any concept of, like, a datetime primitive, but I know a lot of people in Python are oftentimes working with dates and times. And so I'll provide, you know, convenience methods that will accept either an integer to specify that time as, you know, a second offset, or you could supply a datetime type, and it would mark the expiration date, you know, precisely at that time. I'll accept *args for, you know, a list of items to, say, the delete command or the MGET command, where you can get the values of a lot of keys in 1 request, or **kwargs to implement

different options, like

options in, say,

there's a command called

set, which is, you know, by default, it it you just save,

you you give it a key and a value, and and that's persisted to the database. But,

there are a number of options that you can also pass to set. Those are expressed as keyword arguments instead of the user having to remember what the precise incantation is of the Redis command.
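The conveniences described above (expiry given as an offset or as a point in time, variadic key arguments, and SET options as keyword arguments) could be exercised roughly as follows. This is a minimal sketch, assuming `client` is a `redis.Redis` instance from redis-py 3.x; the wrapper function names and the five-minute default are hypothetical and exist only to show the call shapes.

```python
from datetime import timedelta


def cache_once(client, key, value, ttl=timedelta(minutes=5)):
    """SET key value EX <ttl> NX, spelled with keyword arguments.

    `ttl` may be an integer number of seconds or a timedelta. With nx=True
    the value is stored only when the key does not exist yet; redis-py
    returns True when it was stored and None when it was not.
    """
    return client.set(key, value, ex=ttl, nx=True)


def expire_tomorrow(client, key, now):
    """Expire `key` exactly 24 hours after `now`, passed as a datetime.

    expireat accepts either an integer Unix timestamp or a datetime.
    """
    return client.expireat(key, now + timedelta(days=1))


def fetch_many(client, *keys):
    """MGET several keys in one round trip; missing keys map to None."""
    return dict(zip(keys, client.mget(*keys)))


def forget(client, *keys):
    """DEL several keys at once; returns how many actually existed."""
    return client.delete(*keys)
```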

And that goes for the streams interface as well. I can remember the discussions on Slack that we had internally with Andy when he took a look at the pull request and then, yeah, verified that the parameters actually reflect the documentation on redis.io when it comes down to streams. So, again, this is a very Pythonic approach, I think, to modeling the client side and kind of maintaining the spirit of Python. Its simplicity, its elegance. Is that the word I'm looking for? And it's concise, first and foremost.

So if you take a look at redis.io,

which is the website basically where all the

server side commands are documented and the options and all the rest of it, you'll see the exactly the same keywords in the corresponding

parameters when you invoke

redis-py on the client side. And I think that philosophy makes it very coherent for somebody reading up on the server side documentation, then sitting down implementing not only his first, but maybe his second or third client using redis-py, because he simply takes a look at the documentation, sees the parameter that he has to use, maybe spends some time initially to review the redis-py implementation

or documentation

on GitHub, and I think the documentation is is on read the docs. Right? It is. Yes.

Yes. So, and then he immediately notices the similarities between the client side implementation and the server primitives. And I think that's what makes redis-py a very powerful and very easy to use client, because you can literally take the server side primitives and directly slot them into Python on the client side, cutting down the implementation times and QA, I think.

think. Yeah. And it definitely helps too in terms of

being able to translate your understanding of how to work with Redis across multiple languages because, while we all might like to stick with Python for everything, there are occasions where we need to either go into JavaScript or Ruby or PHP or whatever it might be, and Redis is equally useful in all of those contexts, but it increases the barrier to entry if you have to relearn a new API every time you want to do it even though you're interacting with the same back end. Is anybody still using PHP? I'm only kidding.

I'm only joking. It still powers a significant portion of the web, but

This is what I this is what I heard. Yes.

Unfortunately, all the people have switched over to Django or Flask or whatever. Yes. That's correct,

unfortunately.

And

you were mentioning that 1 of the first reasons that you came across Redis was

to use it in a caching context, similar to what memcached provides. So can you give a bit of a compare and contrast in terms of the caching context as far as the feature set that Redis has that might influence someone to choose that instead of memcached and some of the edge cases that users should be aware of when using it as a caching layer? Sure. Well, originally, as I said,

Redis was very similar in functionality to memcached, as in you have a key and you have a value associated with that key. But now I see this both in sales situations and also when I look at the community.

People have moved on big time from the simple caching to much, much more

sophisticated

caching where actually the variety

of the server data types that Redis has to offer these days comes into play. For example, there's a company

unfortunately, I cannot mention the name yet. It's a betting company that uses Redis as a user session cache for price modeling.

If you go to the website,

you can place bets. These bets, of course, have a price. And these price calculations are time sensitive. So imagine your typical punter watching a soccer game on TV and then seeing that, for example, Gomez, who is a popular soccer star here in Europe, has just scored a goal. And he goes to the website and says, now look, what's the probability of Gomez scoring a second goal within the remainder of the game? So this betting company then takes your source IP, where you come from, essentially, when you wanna place the bet, takes data from your basic, maybe connected, social profile, but also takes into account the opponent that Gomez is playing against down on the field. And all these factors actually contribute to the overall modeling of the price. And when I took a look at the price models, these were pretty sophisticated, and these were pretty complex. Needless to say, if you wanna do this for each and every bet that a user is trying to get a price for, you rapidly run into performance problems. So this is what the company noticed when they were actively looking for an intelligent caching layer. So they chose Redis, and now Redis drives this in terms

of caching

some of the factors

and then aggregating

the remainder of the model in real time in memory.

So the user doesn't have to wait for a couple of seconds to get a price for a bet but rather gets this instantly, because the caching layer directly sits on top of the web server. So turnaround times and thus latency are very, very short. And funny enough, the company is more than happy with the increased conversion rates because more and more people are actually able to put down money for bets on that website. And I think this is a very good example for how caching in Redis has evolved from a simple key value store to a much more sophisticated caching layer that drives business and keeps customers happy. I'd also say that in a more general sense, I remember

personally using,

Memcached

in a way where I wanted to persist things more than just simple strings. And oftentimes,

you lean on some sort of serialization

library, whether that be JSON

or Pickle, if you're using Python, or MessagePack, or something similar. You know, it's very common to see folks that use Memcached relying on that stuff. And then you run into these issues where you wanna modify some specific item, you know, within that data structure. You have to do the whole, you know, get the entire data structure out, deserialize it, update the value, reserialize it, send it back to the database, and there's no guarantee that this is happening in any sort of atomic sense. The richness of the Redis data types really

allow for a lot of that

kind of behavior

to be written in an atomic way and a much more efficient way, where you don't have to do the get, process, and set routine. You can just say, hey, update this specific

attribute of this hash

or this specific list.
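The contrast drawn above, between rewriting a serialized blob and updating one field of a hash in place, might look like this with redis-py. This is a minimal sketch, assuming `client` is a `redis.Redis` instance; the key layout and the `views` field are hypothetical examples.

```python
import json


def bump_views_blob(client, key):
    """Read-modify-write on a serialized JSON string.

    Three steps and two round trips. Another client can change the value
    between the GET and the SET, so one of the updates can be lost.
    """
    doc = json.loads(client.get(key))
    doc["views"] += 1
    client.set(key, json.dumps(doc))
    return doc["views"]


def bump_views_hash(client, key):
    """The same update against a Redis hash.

    HINCRBY runs as a single atomic command on the server, so concurrent
    callers each see their own increment applied. Returns the new count.
    """
    return client.hincrby(key, "views", 1)
```

The blob version could be made safe with WATCH/MULTI or a Lua script, but the hash version needs none of that because the server performs the whole update in one command.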

The other kind of odd edge case that I think not very many people know about or run into: Memcached actually has a hard limit on the size that any value can be. And it's somewhere around a megabyte or maybe 2 megabytes, but you cannot change this value based on any sort of configuration. The only way to change it is to actually alter the constant in the code and rebuild memcached from source. And so if you're trying to actually store any larger kind of payload, Memcached will just silently refuse to store it, and you're left wondering where your data went. You mentioned edge cases. Edge cases are a pretty important point because

needless to say, you cannot implement everything

in Redis. Well, you can try, but Redis is only good for quite a few use cases. For example, typically, you wouldn't use a Redis instance to store petabytes of data in a typical traditional data warehouse scenario. But the caveat is actually, if you wanna do something fancy, take a look at the ecosystem of things that are already out there. Example,

again, a company,

wanted to you wanted to do their own content delivery network. And as part of of the implementation,

they started to implement something like a bloom filter. A bloom filter is essentially a probabilistic data structure that is able to tell you with a certain probability of a or if a certain element is member of a given set without having to store the complete

members in main memory. There's a beautiful Wikipedia page on it. So if you're interested in the mathematical background, feel free to check this out.
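
As a rough illustration of the data structure just described, here is a small pure-Python Bloom filter. It is only a sketch: the sizing defaults and the use of salted SHA-256 to derive bit positions are assumptions made for this example, not how any particular Redis module implements it.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: may report false positives, never false negatives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)  # packed bit array

    def _positions(self, item):
        # Derive num_hashes bit positions by salting SHA-256 with an index.
        data = item.encode("utf-8")
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + data).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item):
        # True means "probably present"; False means "definitely absent".
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))
```

Only the bit array is kept in memory, never the items themselves, which is why the answer is probabilistic.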

But the point is that a Bloom filter already existed as a module-based implementation. So when the company got around to doing their homework, they found out, oh yeah, we don't have to do this ourselves. Bloom filters are very handy for a content delivery network provider that wants to check whether something is in its local cache: you don't have to go upstream, but can serve that particular piece of data right from your local cache if you can establish, with a certain probability, that it's already there. So Redis is more powerful than you think. If you look beyond the obvious, beyond what is on the GitHub page and what the server itself provides, chances are that somebody has done some work already. And if it comes down to a module, that probably comes very close to what you're looking for. That's kind of a hint. And this is the beauty of open source, right, standing on the shoulders of giants. Chances are somebody has done the work for you. Given the right license, you simply take the code base, extend it, and then give it back to the community.

And another one of the edge cases that I've come across as well, particularly using the key-value capabilities, is that if you're trying to delete a large number of keys using the wildcard capabilities, it can end up being a significant performance hit if you have a large number of keys, because of needing to do a full scan. And so if you namespace things using the hashes that are implemented in Redis, it can speed things up a lot and give you the ability to get a lot more utility out of the key-value capabilities.

Absolutely. Actually, when we start Slack internally, there's a warning saying: don't use KEYS * in production. Very important.
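
One common alternative to `KEYS *` is to walk the keyspace incrementally with `SCAN`. The sketch below assumes `client` is a redis-py style client providing `scan_iter` and `delete`; the helper name `delete_by_pattern` and the batch size are made up for this example.

```python
def delete_by_pattern(client, pattern, batch_size=500):
    """Delete keys matching `pattern` incrementally using SCAN, not KEYS."""
    deleted = 0
    batch = []
    # scan_iter pages through the keyspace in small steps, so the server is
    # never blocked on one huge KEYS call.
    for key in client.scan_iter(match=pattern, count=batch_size):
        batch.append(key)
        if len(batch) >= batch_size:
            deleted += client.delete(*batch)
            batch = []
    if batch:  # flush the final partial batch
        deleted += client.delete(*batch)
    return deleted
```

If related entries are stored as fields of a single hash instead, removing the whole group is one `DEL` of that hash key and no scan is needed.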

And when you're talking to users of Redis and redis-py, what are some of the common points of confusion or difficulties that they encounter when they're first getting started with using Redis, whether it's as a cache or as a queue or for some of these more complex and feature-rich capabilities?

I think the biggest issue is probably the idea around data types: what you can store in Redis implicitly, and what you need to mangle on the client side, or on the application side, before sending it. Prior to redis-py 3.0, the client took a very liberal stance in terms of what kinds of values it would accept when it was encoding your data prior to sending it to Redis. If it was a data type that it knew about, so a string or a byte string or an integer or a floating point number, all of those are fine. It just knows how to convert those to a string and send that off to Redis. For anything else, it very liberally just called the Python str function on the value and sent that off. And that confuses a lot of people, because now all of a sudden, if I pass a Boolean, well, the string representation of a Boolean in Python is either the string "True" or the string "False". Likewise, the string representation of a None value in Python, or a null value in other languages, is the string "None". And when you then refetch that value, you actually have that string representation rather than something that is a little more intuitive to interact with.

And so redis-py 3 introduced a backwards-incompatible change that prevents users from doing that. You can now only persist types that the client knows about, specifically strings, byte strings, integers, and floating point numbers. Everything else you are responsible for converting in application space prior to passing the value to the library.
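
The sketch below shows why the old behavior was surprising and one way an application might convert values itself. The helpers `encode_value` and `decode_value` are hypothetical, and choosing JSON as the encoding is an assumption for this example; any explicit serialization would serve.

```python
import json

# What a liberal client effectively stored for unsupported types:
assert str(True) == "True"    # comes back as a string, not a Boolean
assert str(None) == "None"    # comes back as a string, not None


def encode_value(value):
    """Serialize a JSON-compatible value to a string before storing it."""
    return json.dumps(value)


def decode_value(raw):
    """Reverse encode_value; accepts the bytes or str a client returns."""
    if isinstance(raw, bytes):
        raw = raw.decode("utf-8")
    return json.loads(raw)
```

With this in place the application would store `encode_value(flag)` and read back `decode_value(...)`, so a Boolean round-trips as a Boolean rather than as the text "True".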

And the goal of this is to make it very clear to users that the values you're sending to Redis are stored as strings and are going to be retrieved as strings. There's no kind of type hinting or anything that Redis provides that would allow the client to automagically convert or coerce that value back into the type that you're expecting.

Also, something that novice users sometimes stumble upon is that Redis, similar to Python, requires a certain mindset, a certain philosophy, if that's the word I'm looking for. And it can be used in multiple ways. I mean, the old Python adage still applies: there's at least one way of doing things in Python, if not more. The same goes for Redis. Take sorted sets, for example. If you take a look at the documentation, you'll see that each and every item of a sorted set has an associated score. To the innocent reader, the score looks numerical. But if you take a closer look, it can also be passed as a plain string. So even if your use case demands floating point scores, you don't have to convert the values into floats before you add them to a sorted set. If the semantics of the use case allow it, you can simply pass strings as score values to a sorted set, meaning that you get around a conversion on the client side. So, again, the advice is: take a look at the documentation and spend a little bit of time thinking. Spend a few cycles before you start coding.

And what have been some of the most interesting or unexpected ways that you've seen Redis and redis-py used?

One of the more interesting ones for me: I had an issue pop up on the issue tracker, and the short version is that the user was attempting to store very large binaries as values in Redis. I believe the user was trying to serve torrents or some sort of large downloadable file, like an ISO, Linux images or something, directly out of Redis. And so they were running into issues where the client wasn't necessarily optimized for sending multiple gigabytes of data as a single value. I found that a very peculiar edge case.

Especially if you're running on constrained memory on the client side, because, yes, Python does something called garbage collection, and then you just might run into problems if you don't have enough main memory, depending on how your data is structured. But suffice it to say, touching on this very particular point, I did a use case recently internally where I essentially forwarded binary data to RediSearch.

And maybe now is the time to shed some more light on the internal architecture of Redis and the client-side implementation, because you already touched upon the overall architecture. We have the database server. And most of the time, the client side would use a wrapper called hiredis. This wrapper is a thin code layer around sockets or other communication mechanisms offered by the operating system, like ports and IP addresses. And hiredis then uses a binary-safe representation of strings in order to implement the wire protocol. The wire protocol is called RESP. Essentially, it's something that each and every client has to implement when it wants to talk to a server. And building upon this hiredis layer, which is essentially just a wrapper taking strings and then building the RESP protocol, the client side would then define primitives that reflect the data types offered by the server.

And, Andy, correct me if I'm wrong, but if I understood the Python implementation correctly, that's exactly the way we do it in Python. Essentially, we have something called hiredis-py, which is just a thin wrapper around hiredis, essentially representing hiredis as a Python object, like a CPython object in the reference implementation of Python. And then redis-py essentially talks to hiredis-py, implementing the server-side API, including connections, the data types the server has to offer, and so forth.

And built on top of that are, for example, the client-side implementations for all of the modules. So internally, redisearch-py, redisgraph-py, rejson-py, you name it, they all use redis-py internally, which again is built upon hiredis-py, which talks to the server. So this is your typical open source ecosystem where many people have collaborated in order to come up with something very powerful, namely, for example, in the case of RedisGraph, a full-blown graph interface into Redis with the benefits that Redis has to offer out of the box, like performance and scalability. So this is how it essentially works on the client side, and redis-py is no exception here.

Just to add on to that a little bit, the hiredis-py library does not manage sockets or connections at all. It's simply a protocol parser, so it can read and write the RESP, R-E-S-P, protocol, and redis-py integrates with it in such a way that if hiredis-py is installed on the system, it will choose to use that, as it is much faster. But if hiredis-py is not installed on the system, then there's a fallback, pure Python implementation. And this is useful for other architectures that don't support C modules, like, say, Jython or something similar.

But at the very bottom, I think, hiredis is doing the work before it talks to the operating system.

If hiredis-py is installed. But you don't have to have hiredis-py installed, or hiredis for that matter, in order to use redis-py.
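
For a feel of what "speaking RESP" involves, here is a small sketch of the request side of the protocol: a command is sent as an array of bulk strings. The function `encode_command` is a hypothetical helper written for this example and is not redis-py's internal API; the optional import at the end only illustrates the kind of "use the C parser if present, otherwise fall back" check described above.

```python
def encode_command(*args):
    """Encode a command as a RESP array of bulk strings (request format)."""
    parts = [b"*%d\r\n" % len(args)]           # array header: element count
    for arg in args:
        if isinstance(arg, str):
            arg = arg.encode("utf-8")
        elif not isinstance(arg, bytes):
            # Assumption for this sketch: remaining values are ints or floats.
            arg = str(arg).encode("utf-8")
        # Bulk string: byte length, CRLF, raw bytes, CRLF (binary safe).
        parts.append(b"$%d\r\n%s\r\n" % (len(arg), arg))
    return b"".join(parts)


try:
    import hiredis  # optional C-accelerated reply parser
    HIREDIS_AVAILABLE = True
except ImportError:
    HIREDIS_AVAILABLE = False  # a pure-Python parser would be used instead
```

Because each bulk string carries its own length prefix, the payload can contain any bytes, which is what makes the protocol binary safe.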

Oh, okay. I thought you had to, but maybe I should read the documentation once again.

And what are some of the design anti-patterns that you've seen, where people are abusing Redis in ways that it wasn't necessarily meant to be used, or that aren't very sustainable or scalable?

If you're coming from your ordinary SQL-based background and you don't want to leave that trodden path, I reckon NoSQL, and especially Redis, is not for you. I've seen quite a few implementations where people try to replicate ordinary SQL-based behavior, i.e. fixed table structures, indexes for performance, and all the rest of it, and sometimes even strange things like foreign key constraints and so forth, in their code talking to Redis. And this is not something that Redis was built for, because Redis, like any other NoSQL database, really shines at, for example, semi-structured and unstructured data. I already mentioned the fact that the interface is binary safe, so you can pump any value into, say, a key; whether it's binary data or whatever, the server doesn't mind.

And also, if your data doesn't have any real structure, you just want to ensure that it's representable as a string or something like a hash, and off you go. Redis will take care of the rest. And I think it's that sort of mindset that you have to rethink, that you have to ponder when you want to make that switch from your traditional SQL-based environment to something that is unpredictable, has less structure, but has much higher scalability requirements than your ordinary databases, because this is what the current state of things is. Right? I mean, we still have general ledgers that are driven by mainframes. But if you take a look at the new world, if you want to call it that, you have street lights giving you sensor data about the current visibility: Can I turn myself off? Is it bright enough outside, or do I still have to power a couple of light bulbs in order to keep the street safe and sound?

So I was talking to an automobile manufacturer recently, and he told me that your ordinary car has about 20 computers built into it, not counting the entertainment system. As for the code base, we're looking at 10,000,000 lines of code, again, without the entertainment system. And here in Europe, there's new legislation under which each and every car has at least 2 SIMs built into it, so that if you have a car accident, the emergency services and first responders can act immediately based upon the crash data that the computer sends to some sort of cloud. Needless to say, this is also something that insurers find quite interesting, because they can actually track drivers' behavior and come up with much more competitive policy prices.

But that's not the point. The point is that all of these new shiny things send you data that is way beyond what you have been used to for the last, what, 20 or 30 years when you're coming from a traditional SQL-based background. Because, as I said, most of the time this data is not structured, has different formats, if any at all, and comes in gigabytes or terabytes, not kilobytes or megabytes. So if you want to build an architecture for that sort of use case, you want to take a close look at NoSQL databases. And especially if you're talking about real-time performance, Redis is probably something worth taking into account, because this is where Redis comes from. And if I take a look at the use cases where people use us, this is basically where we excel with that sort of technology, because it's in memory. Hence caching as the starting point.

And what are some of the least used or most often misunderstood features of Redis that you think developers should know about and consider using in their applications?

Excellent question. Again, I talk to quite a few people using Redis, and the perception is: oh yeah, you have that key-value store. That perception changes once they take a look at redis.io and see what Redis really has to offer in terms of very application-oriented data types. The beauty is that you can set up your cluster with open source components, meaning that you can construct a scalable and highly available solution pretty much by sticking together a couple of Redis instances. There's something called Sentinel that will take care, for example, of the high availability requirements you may have. You don't have to buy fancy licenses for this. You can do this pretty much out of the box. And again, that's something that not many people know unless they take a look at what's out there in terms of the code base that already exists. Of course, there are now also cluster-aware libraries, and I know that redis-py is one of them. So right from the start, you have the foundation for a distributed system where an application can talk to a cluster instead of a single database instance.

And are there any other aspects of the redis-py project itself, or use cases for the Redis database, that we didn't discuss yet that you think we should cover before we close out the show?

You know, one thing we did not touch on at all is that Redis has an embedded Lua interpreter in it. So, similar to a relational database, you can actually write Lua scripts that will execute atomically via a command, and you can pass arguments to them, similar to being able to pass arguments to any other Redis command. So it's akin to a stored procedure, in that you can group together a number of these different Redis primitives and make them all execute as a single transaction.

So essentially, that allows you to dynamically create scripts that you can then simply take, put on the server, and execute.
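
As an illustration of grouping primitives into one atomic unit, the sketch below sends a small Lua script with `EVAL`. It assumes `client` is a redis-py style client whose `eval` method takes the script, the number of keys, and then the keys and arguments; the script itself, the name `capped_increment`, and the example key and field are invented for this example.

```python
# Lua script: increment a hash field, but never beyond a given limit.
# KEYS[1] is the hash key; ARGV[1] is the field; ARGV[2] is the limit.
CAPPED_INCR = """
local current = tonumber(redis.call('HGET', KEYS[1], ARGV[1]) or '0')
local limit = tonumber(ARGV[2])
if current < limit then
    return redis.call('HINCRBY', KEYS[1], ARGV[1], 1)
end
return current
"""


def capped_increment(client, key, field, limit):
    """Run the read-check-increment sequence as one atomic server-side step."""
    # The "1" tells Redis that exactly one of the following values is a key.
    return client.eval(CAPPED_INCR, 1, key, field, limit)
```

Done from the client with separate `HGET` and `HINCRBY` calls, the same logic could race with other writers; inside the script the two commands run without interleaving.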

Why was Lua chosen? Lua actually drives the web interface in my OpenWrt-based router. It's a very compact scripting language, and I think this is the main reason why Salvatore chose it when he was looking for an easy way to extend the server functionality before modules were invented. The interpreter has a footprint of about 160 KB. And as I said, it powers something called LuCI, which is the web interface in my router. And it's pretty performant, because essentially we're looking at a 32-bit MIPS processor driving the router, as in a system on a chip. The idea is, of course, because it's all open source, one could take a look at minimalistic Python implementations like MicroPython or something else. It would be an interesting experiment to see whether you could actually replace Lua with Python on the server side, though not giving you the full-blown Python functionality, because MicroPython has certain restrictions and limitations when you compare it to a typical standard implementation like CPython. But what you could do then, essentially, is take portions of your code base and execute them on the server side. So if anybody out there is listening and has some spare cycles to play around with that concept, the code is on GitHub. And when you take a look at the Lua implementation on the server side, it's not complicated. Essentially, Lua offers a certain interface that a C-based implementation can talk to. And there are only a handful of primitives, namely: upload a script to the server and then evaluate that script, as in execute it. You can also do this predefined, i.e. you ship off a string to the server, the server gives you back an ID, and then you subsequently execute on that ID with parameters.
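
The "ship a string, get an ID back, execute by ID" flow corresponds to the `SCRIPT LOAD` and `EVALSHA` commands, where the ID is the SHA1 hex digest of the script text. The sketch below assumes `client` is a redis-py style client exposing `script_load` and `evalsha`; `SCRIPT`, `script_id`, and `load_and_run` are names invented for this example.

```python
import hashlib

# A one-line script: add ARGV[1] to the counter stored at KEYS[1].
SCRIPT = "return redis.call('INCRBY', KEYS[1], ARGV[1])"


def script_id(script):
    """Compute the ID locally: the SHA1 hex digest of the script source."""
    return hashlib.sha1(script.encode("utf-8")).hexdigest()


def load_and_run(client, key, amount):
    """Upload the script once, then execute it by ID with parameters."""
    sha = client.script_load(SCRIPT)            # server returns the script ID
    return client.evalsha(sha, 1, key, amount)  # later calls send only the ID
```

Sending the 40-character ID instead of the full source on every call saves bandwidth when a script is executed many times.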

So if somebody would take a fitting Python implementation with a similarly minimalistic footprint, I think replacing this shouldn't be a big deal because, as I said, the interface is quite small, and you could then actually take portions of your Python code from the client side and ship them off to the server. That has a beauty of its own, because you don't have to expand your code base with another programming language, namely Lua, so you could confine yourself to Python. So, anybody out there, this would be a great experiment.

Call to action.

So for anybody who wants to get in touch with either of you and follow along with the work that you're doing, I'll have you add your preferred contact information to the show notes. And so with that, I'll move us into the picks. This week, I'm going to choose the actor Rowan Atkinson. I recently watched his latest movie, the third Johnny English, and as always, he is hilarious and very expressive. I've enjoyed his work for a long time, so for anybody who hasn't watched anything with Rowan Atkinson in it, it's definitely worth checking out. And so with that, I'll pass it to you, Andy. Do you have any picks this week?

I do. I have 2, actually. Over the last 4 or 5 years, I've really gotten into cooking and trying to understand the science behind cooking. And one of the most invaluable books that I've found, as an engineer approaching cooking and wanting to really understand the science behind it, is a book called The Food Lab: Better Home Cooking Through Science, by a chef here in the Bay Area named J. Kenji López-Alt.

And the second pick that I have: I do like video games, and there is a new community mod for Dota 2 called Auto Chess, and it is just exploding. I think it's been out for a month or 2. I think it was just announced that it has 4,000,000-plus installs at this point, and it's a fantastic kind of strategy puzzle mod that completely changes the way that you might play Dota 2, and I'm obsessed with it.

Alright. And, Christoph, do you have any picks this week?

3, actually.

One nontechnical pick, and this is something that I've discovered recently: India Pale Ales, IPAs, infused with grapefruit juice. For those listeners who don't know IPAs, IPAs are strongly hopped craft beers, let's put it this way, and most of the time, hopefully, handcrafted too. IPAs go back to the point in time when the English were still actively involved in India, hence the name, because that beer had to suffer a long journey from England to India. So they put a lot of alcohol into these beers and normally a lot of hops too, with the idea, of course, that once that batch arrived in India, it would be watered down to normal drinking strength. Of course, that never happened. So people literally took the IPAs and consumed them as they were. You're looking at beers with an ABV of typically more than 6%. So, over the last decade, with the craft beer movement, if you will, some of the more experimental types got back into the old tradition of infusing beers with fruit juice, and that's exactly what this particular type of IPA is. If you like your IPAs with a certain flavor of bitterness, this is your best shot.

As for the technical picks, 2 things come to mind. Of course, redis-py goes without saying, because anything I do in Python with Redis, I do through redis-py, and also the module stuff. Because it's based on redis-py, I normally use the client-side Python libraries. The second technical pick is a Python module called Selenium. For those listeners who don't know Selenium: Selenium is a web scraping framework that you can simply use to download websites and then extract information. And this is something that comes in handy if you want to do front-end, as in client-side, quality assurance on your website, because drafting a Selenium-based test script is as simple as importing Selenium, opening up a driver, looking for an element, extracting the value behind this element, and then seeing whether this is the one you want or not. It's straightforward. Rebooting, for example, my router through a Selenium-based Python script is about 20 lines, give or take. So it saves a lot of effort when you're into that sort of thing. And the last pick that I have is somewhere between technical and nontechnical.

It's actually an author called Daniel Suarez. I don't know if you know him, Tobias. He's kind of famous for his very realistic, well-researched novels. Influx is probably one of the better known, and I think the second one was called Darknet or something. He has a particular style of writing that is very well researched. You can actually see that he puts a lot of effort into his writing, and the stories that he portrays in his books are very well researched and based on facts. So it's more science than fiction. And the beauty of this particular author is that he manages to keep up the pace. So it's not only pretty realistic what he writes about, but his style of writing is also pretty close to what you would expect from a recent James Bond movie in terms of pace and action. And these are my picks.

Alright. Well, thank you both for taking the time today to join me and discuss the work that you've been doing with the Redis database and the Redis Python client. I have used them both fairly extensively in my own work, so I appreciate all the effort you've put into it, and I hope you enjoy the rest of your day.

Thank you very much for having us.

Wonderful. Thanks, Tobias.

The Python Podcast.init
