Summary
SQL has gone through many cycles of popularity and disfavor. Despite its longevity it is objectively challenging to work with in a collaborative and composable manner. In order to address these shortcomings and build a new interface for your database oriented workloads Erez Shinan created Preql. It is based on the same relational algebra that inspired SQL, but brings in more robust computer science principles to make it more manageable as you scale in complexity. In this episode he shares his motivation for creating the Preql project, how he has used Python to develop a new language for interacting with database engines, and the challenges of taking on the legacy of SQL as an individual.
Announcements
- Hello and welcome to Podcast.__init__, the podcast about Python’s role in data and science.
- When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!
- Your host as usual is Tobias Macey and today I’m interviewing Erez Shinan about Preql, an interpreted, relational programming language, that specializes in database queries
Interview
- Introductions
- How did you get introduced to Python?
- Can you describe what Preql is and the story behind it?
- What are goals and target use cases for the project?
- There have been numerous projects that aim to make SQL more maintainable and composable. What is it about the language and syntax that makes it so challenging?
- How does Preql approach this problem that is different from other efforts? (e.g. ORMs, dbt-style Jinja, PyPika)
- How did you approach the design of the syntax to make it familiar to people who know SQL?
- Can you describe how Preql is implemented?
- How has the design and architecture changed or evolved since you began working on it?
- What is a typical workflow for someone using Preql to build a library of analytical queries?
- Beyond strict compilation to SQL, what are some of the other features that you have incorporated into Preql?
- How does a Preql program get executed against a target database, particularly when using capabilities that can’t be directly translated to SQL?
- ** What are the main difficulties / challenges of compiling to SQL ?
- What are some of the features or use cases that are not immediately obvious or prone to be overlooked that you think are worth mentioning?
- What are the most interesting, innovative, or unexpected ways that you have seen Preql used?
- What are the most interesting, unexpected, or challenging lessons that you have learned while working on Preql?
- When is Preql the wrong choice?
- What do you have planned for the future of Preql?
Keep In Touch
Picks
- Tobias
- Erez
Closing Announcements
- Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management.
- Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
- If you’ve learned something or tried out a project from the show then tell us about it! Email hosts@podcastinit.com) with your story.
- To help other people find the show please leave a review on iTunes and tell your friends and co-workers
- Join the community in the new Zulip chat workspace at pythonpodcast.com/chat
Links
- Preql
- Lark
- Postgres
- MySQL
- Relational Algebra
- Pandas
- ORM == Object Relational Mapper
- dbt
- PyPika
- GraphQL
- Julia
- runtype
- Rich terminal UI library
- prompt-toolkit
- DuckDB
- Askgit
- BigQuery
- Snowflake
The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA
Hello, and welcome to podcast dot in it, the podcast about Python and the people who make it great. When you're ready to launch your next app or want to try a project you hear about on the show, you'll need somewhere to deploy it. So take a look at our friends over at Linode. With the launch of their managed Kubernetes platform, it's easy to get started with the next generation of deployment and scaling powered by the battle tested Linode platform, including simple pricing, node balancers, 40 gigabit networking, dedicated CPU and GPU instances, and worldwide data centers.
Go to python podcast.com/linode, that's l I n o d e, today and get a $100 credit to try out a Kubernetes cluster of your own. And don't forget to thank them for their continued support of this show. And do you want to get better at Python? Now is an excellent time to take an online course. Whether you're just learning Python or you're looking for deep dives on topics like APIs, memory management, async and await, and more, our friends at the Talk Python training have a top notch course for you. If you're just getting started, be sure to check out the Python for absolute beginners course. It's like the 1st year of computer science that you never took compressed into 10 fun hours of Python coding and problem solving.
Go to python podcast.com/talkpython today and get 10% off the course that will help you find your next level. That's python podcast.com/talkpython. And don't forget to thank them for supporting the show. Your host as usual is Tobias Macy. And today, I'm interviewing Erez Shinhan about Prequel, an interpreted relational programming language that specializes in database queries. So Erez, can you start by introducing yourself?
[00:01:49] Unknown:
Hello. My name is Erez Sinan. I was born in Israel. I've been working in programming and software research since 2009, mostly in start ups and as a freelancer. I also contribute often to open source. My most popular project to date is Lark, parsing toolkit that's written in Python.
[00:02:07] Unknown:
I've actually spent some time looking at Lark because I was dealing with building a parser a little while ago for translating from SRT to VTT for transcripts of the The first
[00:02:23] Unknown:
The first time I saw Python was when I joined the Israeli army back in 2003. I was intended for a programming position, and it was part of the course material material. Until then, most of the code I wrote was in c and assembly. So Python was a little bit different at first. But at the same time, it made a lot of sense. And within a few weeks, I was able to in a few minutes, something that used to take me over an hour to write. And,
[00:02:48] Unknown:
so Python won me over, and it's been my favorite language ever since. Taught me to appreciate the power of conciseness and of delegating the unimportant details to the interpreter instead of having to handle them myself. In terms of the prequel project, as I mentioned, it's something that specializes in database queries. But can you give a bit of an overview about what it is that you've built there and some of the story behind why you felt that this was a useful or interesting project?
[00:03:15] Unknown:
So the idea for QuickWell formed in my head over many years. I did all sorts of jobs over the years, including web developer, data engineer, app developer, etcetera. And as I gained experience working with data, I came to recognize the the benefits of SQL's relational model and its use of, you know, relational algebra to organize linked data. And I was also impressed by the implementation of database engines such as Postgres and MySQL. You know, they provided performance, reliability, and the long list of features, and yet I would often feel trepidation about using those databases. And sometimes opt for less sophisticated solutions, like doing it myself in memory or doing it in using Pandas. And I realized that the main reason for that was that I didn't want to use SQL.
So I tried every ORM that Python had to offer. And while they do the job for very simple interactions, I found them too limiting. And I knew I wasn't alone. The NoSQL movement gained a lot of traction, but, unfortunately, the solutions it produced aren't clear winners either. So, you know, I wish databases didn't force SQL on us. I mean, instead, they give us, like, database assembly language that we could write our code in and use whatever language we want and compile it to that assembly language. And then it's, oh, wait. They they already do, and it's called SQL. So that's how the idea came about that I can write the equivalent of, like, the first c compiler, but compiled to SQL and provide a much better interface for that database without requiring any changes to it.
I started it about 2 years ago, and, well, there's a lot to be done. It's already at a stage where, in my opinion, it's quite useful.
[00:04:52] Unknown:
Can, like, pose a real alternative to SQL. As you mentioned, SQL is fairly ubiquitous, but it's also not uniform across the different database engines where there is a standard for it that covers a subset of the language specification, but it's not sort of strictly defined or enforced in any way. So different database engines will have different ways that they implement it and different extensions to it, and that definitely poses a challenge in terms of how you address it. Although your point of it being, you know, similar to assembly is, I I guess, apt because even in assembly language, there are different instruction sets for different CPU architectures, and so you need to be able to handle all of those different special cases. So I guess that's a good way to think about it where SQL is, you know, the the intermediate representation for the database, and prequel is sort of the syntactic sugar on top of that. And so I'm curious what the sort of overall goals and use cases are for prequel if it's just for, you know, transactional use cases or primarily for analytical workloads or just some of your overall features and directions that you're driving towards with the project?
[00:06:00] Unknown:
Well, I think that anything that you would write in SQL, you you could write in prequel and and just do it, like, a little bit shorter. If there are things that you can't like features that are specific to the database, you can just like in the c compiler example, you can use inline assembly. Right? You can use inline SQL to complete the things that are not supported. So I think for any job that involves data analysis, data engineering, I think it would be a very good fit. Maybe currently a little bit less for web development just because of performance issues, but I hope to solve that soon. There have been a number of different attempts to
[00:06:40] Unknown:
replace SQL or add different layers on top of SQL to make it more maintainable and more composable. And I'm wondering from your perspective of somebody who is working so closely with it and trying to build a new replacement, what it is about the SQL language and the way that it is used and managed that makes it such a difficult target to actually replace or turn into a composable and reusable structure?
[00:07:08] Unknown:
You have to keep in mind that SQL is a very old language. It was designed for non programmers and with business logic in mind. So all sorts of concepts that a programmer today would find elementary, they just don't exist in SQL. For example, you won't find variables in most implementations. To store the result of a query in the working memory, you have to write create temp table and when you really just wanna use an assignment operator. And then you have to manually drop it when you're done. There is no scope. There there's no garbage collection. There are no modules, no way to define structures. There is very limited control flow, if at all. I mean, Post UA and MySQL have them, but it's definitely not part of the standard.
The type system is also confusing, and casting between types is unpredictable. I think the biggest issue, though, in my opinion, is the lack of functions, specifically of higher order functions, which basically means being able to pass functions as arguments to other functions. Every paradigm, be it functional, procedural, object oriented, everyone agrees that it's important to have that. Every language in the top 20 most popular has higher order functions. Most even have closures, but most SQL implementations don't even let you define functions, and none of them allow you to pass them around as values. So the result is that it's really hard to reuse code in SQL, let alone build a standard library. And usually, it's impossible to work at the level of abstraction that we're used to as software engineers.
As for the syntax, it's not exactly challenging, but it's very long. I think there's a lot of value in having concise code. And to add to the confusion, each dialect of SQL has its own syntax and peculiarities and its own semantics, its own type system. So whenever you use a different SQL database, you essentially have to unlearn and relearn the whole set.
[00:08:52] Unknown:
There are a few different projects, particularly ones that are recent that work to address some of the challenges of working with SQL, particularly at scale in the organizational sense. You know, the obvious 1 from web development is the object relational mapper or ORMs that you mentioned. DBT is 1 of the current standouts for the analytical use case where they've been relying on Jinja to be able to template different substructures together and use, you know, Jinja macros. And then there's another library called Pypico, which is an attempt to add a Python interface that closely maps to the underlying SQL structure. And I'm wondering if you can just give a bit of an overview about where you see prequel in relation to some of those projects and just the overall space of tools that attempt to increase the utility and accessibility of SQL for programmers.
[00:09:48] Unknown:
All these tools that you mentioned, they have their merits. They all have the same, I would say, paradigm. All the ORMs, templating end engines, and query builders, which is that they try to fix SQL's shortcomings by embedding it in a different language. It wasn't designed for it. And that creates a lot of friction. So prequel is taking a unique approach of being a self contained programming language, and that comes with a lot of advantages. The syntax is much more concise, which I think is very important. I mean, imagine if instead of writing Python code, you'd had to write the Python AST directly and that, you know, the 1 that the code would pass to. And, functionally, it would be exactly the same, but I don't think anyone would get verified doing it this way.
Another advantage for prequel is that it has its own type system with type validation and introspection, so you can ask it for every type of object. If you have a table, you can ask which columns are in the table, and the result would be in itself a prequel table that lists all the columns. So there's a lot of things that you wouldn't be able to do otherwise. There's a lot less limitations to what you can write. For example, if you write an if else statement inside a function and use it in a query, Prequel will compile it to case when. And that's something that doesn't work with any other approach because they don't have the code that compiles the the language that they are hosted in. Prequel also comes with its own interactive shell. So you can write Prequel code, use the help of the autocomplete.
The syntax is designed for interactivity. Just as an example, if you're updating rows, you first query the rows that you want to update, and then you update keyword with the new values afterwards. In that way, the shell can preview which rows you're about to overwrite as you're writing the code. At least for me, it was always an issue. When deleting or updating rows in the database, it's always scary to do it.
[00:11:38] Unknown:
So there's a lot of small things that are designed to help you instead of getting your way. So at at the beginning, you mentioned that prequel is based on the relational algebra, which is the fundamental concept that underlies SQL. And I'm curious what you saw as being the benefit of using the relational algebra as the foundation to build prequel upon rather than just trying to remap programming concepts on top of the SQL language.
[00:12:06] Unknown:
So I think SQL tried to package the relational algebra as of something that's very human friendly by making these select statements where you can just do all the operators 1 after the other. But it actually came out very awkward because you have to do it in a certain order, and there's all kinds of rules about where you can put what. And actually, once you get back to the relational algebra, which is very cleverly made in a way that makes sense both from a mathematical point of view and from a programmer point of view. It feels a lot more natural. You know, you have operators that multiply, that intersect, that order.
It makes a lot more sense to me. In terms
[00:12:46] Unknown:
of the work that you're doing with prequel, I'm wondering how much effort you made to make it approachable to people who know SQL versus just treating it as its own entity and designing it with a syntax and structure that makes sense for your own purposes, and then just let people who are interested do the work of actually onboarding onto it and figuring
[00:13:10] Unknown:
out how it relates to the SQL that they might be familiar with. Prequel is based on the same model as SQL, which is relational algebra coupled with lazy evaluation. So if you already know SQL, then Prequel semantics will be very familiar. But, actually, prequel syntax was designed to resemble JavaScript or Go, which I think are much more concise and more suitable for programming. There are a few reminders of SQL, like the like operator and the use of its terminology, like join, left join, outer join, and all that. But overall, it looks quite different. Also borrowed ideas from Python, GraphQL, Julia. I tried to make it familiar and intuitive when possible, but my main goal was to make it a good language.
[00:13:51] Unknown:
And in terms of the actual implementation, I'm wondering if you can dig into how you built the language and some of the design and architectural considerations that went into it. So I wrote it in Python,
[00:14:05] Unknown:
and I actually rewrote it twice before landing on the final architecture. At first, I wrote a very basic proof of concept in Python, which showed me that it can work. I knew I could make a much better design, so I rewrote it. And this time, I picked Julia, which is a wonderful language, and it forced me to think a little differently about things. And it gave me a lot of great ideas. But, unfortunately, its runtime wasn't suitable for prequel. So finally, I went back to Python. I did a 3rd rewrite and a final 1. And this time, I got it mostly right. And I would say the main 4 components are the parser, the interpreter, the compiler, and the database interface.
So I'll just go over them shortly 1 by 1. The parser uses lock to convert from prequel code to AST. Lock is a parsing library that I wrote a few years ago, and I don't know what I would do without it. Really made it very helpful. And things like autocomplete come automatically using that so I don't have to, like, implement it separately. The compiler converts prequel AST into SQL AST and the SQL AST into SQL code. There's also, like, a small templating language that I added in between, but that's a minor detail. The DB interface runs the SQL code and imports the result into prequel.
And then there's the interpreter, which evaluates the prequel AST on the spot, but it also runs the show. It decides when to use the compiler, when to query the database, and so on. It's kind of like left brain and right brain. Right? The the interpreter is like the slow left brain that's capable of doing all the logical computations. And then the right brain is only good for specific kind of things, but it's very efficient. So you want to offload anything that you can onto it. I I'm also using a few external libraries for it. I'm using run type, which is another library that I wrote, which gives me multiple dispatch and type safety in the data classes. I'm using rich for, outputting tables and colorful text. It's pretty nifty.
And I'm using prompt toolkit for the, shell for the REPL, although that might change soon. In terms of
[00:16:09] Unknown:
the the actual workflow of using prequel, because of the fact that it has the database connections, it sounds like you would likely use it as the replacement to the database client shell where instead of using the Postgres CLI to execute some queries directly written in SQL and see the results, you would just use prequel to connect up to the Postgres instance instead and use the prequel code to interact with the database to query it or add rows or whatever the case might be. Right. That's the idea. Yeah. In terms of using it as a library, what are some of the other sort of considerations of how to actually use prequel for building up a set of maybe analytical queries that you want to use as part of a data engineering pipeline and just some of the overall workflows from the programmatic sense, but also in the sort of team and organizational sense of how to structure it in a way that makes it scalable conceptually?
[00:17:05] Unknown:
So I think the typical use case, especially in this early stage of the language, would be to use prequel as a way to interface between Python and the database. So in the exploration stage, users would do what you described. They could open the shell or Jupyter notebook and play around with different queries, interactively build it up, build up the functional functionality that they need for that database. And then in the next step, they can take all these functions and move them into a prequel module, you know, with a dot PQL extension where they'll refactor it, add documentation, add tests.
It could be several modules. And then finally, you can import prequel into Python. Just do import prequel, and you instantiate the object with the prequel module that you wrote. That object, let's call it p. So it has all the functions that you define in the module. So if you've declared a function called foobar, you can call it using p.foobar, and it accepts Python arguments and it returns a Python value. So the interface is very seamless.
[00:18:04] Unknown:
You mentioned that in the process of getting to where you are right now with prequel, you ended up rewriting it a couple of different times. And so I'm wondering what were some of the initial ideas or assumptions that you had about what a useful solution to the problem was and how you how to go about getting there that you ended up having to reconsider or reevaluate
[00:18:23] Unknown:
as you iterated on the problem? I would just say that writing a compiler is a very difficult task, and the design is not trivial. You know, you can read all the compiler books you like, but in the end, especially when you're doing something that's a little bit out of the ordinary. Right? It's not just like another c compiler where it's all already mapped out, but it's a new kind of target, new kind of language. It takes a while to get it right. And, for example, 1 thing that took me a while to sort out in my head is that objects can go through several lifetimes while still being the same object. So, for example, an order by operator, right, is just an AST node that says order by, but then it's also an object in the language that represents it as something that you can interact with. You can add more operators.
And then in the SQL section, it compiles into the order by op an AST node for SQL. But, basically, it's all the same object. It's just in different lifetimes in its course. The reason that this distinction became important is because sometimes different objects of different lifetimes have to interact with each with each other. In a normal compilers, that wouldn't happen. But because I'm mixing an interpreter and a compiler and there's lazy evaluation, then all these different objects at different stages of their lives have to be able to interact and be aware of which stage they are they are in and which way they're going. As far as the actual design of the syntax, I'm wondering what your
[00:19:58] Unknown:
overarching philosophy was about how best to structure it and how to make the overall space of working with databases more ergonomic and more flexible.
[00:20:11] Unknown:
I was heavily influenced by Python and also by the zen of Python, which is a little bit ironic because there's always the joke that Python doesn't follow the zen of Python. But I do like the idea of always writing the minimum amount of characters that you need. Never repeat yourself if you don't have to. Try to find an idea that is intuitive in order to express what you want to express. It's just design. It's intuitive. It's hard to really explain it. But when you see it, you think, oh, yeah. That makes sense.
[00:20:41] Unknown:
Beyond just being able to compile the SQL as I was going through the documentation, you mentioned that because it's housed in a full programming language, it also gives you the ability to interact with other systems, create side effects, pull in other data sources for being able to simplify the workflow of, you know, loading a CSV into a database, for instance. And so I'm curious, what are some of the other features or capabilities that you've incorporated into the prequel language beyond just strictly being able to compile a DSL down to SQL dialect?
[00:21:16] Unknown:
Well, there are a lot of features in the language. There's a type system, strong typing. There are modules, tracks, high order functions, exceptions. In terms of tooling, you can use it direct in the shell, but you can also write it in Jupyter Notebook. You can load it from Python. You can export and import to pandas tables. You can also even serve it over HTTP. And in the future, I'm gonna add support for GraphQL as well. Yeah. Sorry. That's my answer.
[00:21:45] Unknown:
And in terms of being able to actually run some of those side effects, So for instance, if you have a step that downloads a file from the Internet and then uses that to drive the SQL execution in the mode where you are executing prequel against a database connection. I'm just curious if you can just talk through sort of the overall life cycle of the code as it interacts with the database.
[00:22:12] Unknown:
So anything that can't be done in SQL, SQL runs locally with the interpreter. And that usually includes control flow, input, output, introspection, all of the things that SQL either doesn't have the feature or it's not prevalent enough to warrant, you know, compiling it. I hope that in time, most programs will be able to run purely in the database, at least in databases that have a rich implementation to allow it, like like Postgres, which allows a lot of these things. Well, you can think about it as a polyfill. It just does the operations, maybe not as efficiently, but ensures that the flow of the program continues.
[00:22:49] Unknown:
In terms of your experience of actually building prequel, what have you found to be some of the main difficulties and challenges of actually compiling down to SQL and being able to handle the varying dialects across the different engines?
[00:23:03] Unknown:
So compiling to SQL is fairly challenging because there are a lot of many different dialects, each with different syntax, semantics, features. There's also different types, different operators, even different rules about where you can and can't put parentheses. So there's a lot of small and very crucial differences. For example, division in MySQL returns a float, but in post gray, it returns an int. And MySQL doesn't support infinite series or set substraction, but it does support union and intersection. So it's a little confusing. Currently, I think the biggest obstacle for me is that SQL doesn't have a standard way to report errors.
For example, in postgre, casting the string hello to an int will throw an error, obviously, like in Python. In MySQL, it just returns null, and there's no hint that anything was wrong. And, of course, I can test the string my if the string is valid myself, but there's no way in MySQL, to report an error. And it's the same is true for SQLite. So this is just 1 example, and I often have to do more workarounds than I would like to just to get all the dialect to behave similarly.
[00:24:08] Unknown:
Now that you have sort of the core structure of the project, what is actually involved in being able to add support for a new SQL dialect and new database engine?
[00:24:18] Unknown:
So that's very dependent on the dialect. For example, adding DuckDB, which is clone for SQLite, is fairly easy. Adding support for Ask Git, which is using the SQLite, is fairly easy. But for example, adding BigQuery, it was a little bit challenging because it does do things differently. For example, for 1 thing, the storage engine is different. And also, I have to be more careful about doing certain queries because they charge you money for it. I'm not just getting a preview of a table. Getting the first 10 rows would make BigQuery query the entire table, which would charge you a lot of money when all you want is just 10 rows.
So there's extra stuff to do whenever there's a new dialect. And I'm sure whenever replication comes into play or time series, they would all require a bit more effort. But in general, I think now that I have the base basically covered, it's relatively like a small addition and not a huge difference.
[00:25:19] Unknown:
In terms of the features or use cases or sort of combinations of systems that it might be used with, what are some of the non obvious or potentially overlooked aspects of what you're building with prequel that you think are worth mentioning or
[00:25:35] Unknown:
ways that you have seen it used that aren't something that somebody might necessarily think of off the bat? I can't say I've seen it used in unexpected ways yet. I will say that nowadays when I need to write SQL, I often prefer to write it in prequel, generate the SQL, and clean up the result. And the surprising part that it's often faster and less or prone than writing SQL, even though I'm already pretty proficient in it. But as for a fun feature, I mentioned that I added support for Ask Git, which I don't know if you know it. It allows you to query your Git repository using SQL. So now it has a back end, and you can use prequel to query your Git repository. I would say that since I'm basically coming to replace an existing solution, then it's fairly clear the scope of what it is that is possible and impossible with it. So I'm not sure if there's going to be a lot of surprises. I hope the surprise is just how easy it is.
[00:26:31] Unknown:
And so for somebody who's maybe using dbt, which is an engine for building pipelines in an analytics engineering workflow where you start with SQL files that are Jinja templated that can then be used for building transformations in a data warehouse to create downstream tables and views. Do you have any thoughts on what would be involved in using prequel in place of some of those SQL files to let DBT then execute those overall pipelines?
[00:26:56] Unknown:
It's possible to use prequel to programmatically generate SQL code and not just query the database, but just give you the code. And then you can put it in whatever file you want and maybe add templating around it. So it might be possible. I'm not sure if that necessarily be the best use case, but it might it's interesting.
[00:27:13] Unknown:
You also mentioned its utility and focus on data engineering use cases. So I'm wondering what you have seen or used it for in terms of being able to use it in the context of an Airflow or a Daxter for executing these different SQL operations as part of an overall data workflow.
[00:27:32] Unknown:
I have not combined it with these tools. So far, I've mostly used it for my own workflow.
[00:27:37] Unknown:
And then in terms of your experience of building prequel and iterating on it and using it for your own work, what are some of the most interesting or unexpected or challenging lessons that you've learned in the process?
[00:27:48] Unknown:
Well, I started out thinking that the biggest challenge would be the technical implementation, and it definitely was and is very challenging. But I found that explaining prequel to people, marketing it, writing good documentation, all of these things are just as, if not more important, for its adoption. And because these aspects, as a programmer interest me the least, they are the ones where I end up spending most of the efforts.
[00:28:11] Unknown:
As far as it being a project that you are building and that you have, you know, use cases for, I'm wondering what your overall hopes are in terms of community adoption and community growth and your motivation for releasing it as open source and putting in the effort to help promote it? I hope it will be widely used. I hope everyone
[00:28:33] Unknown:
will take advantage of it. I hope that people will contribute back, and I hope that there's going to be, like, a standard library built around it. I mean, I'm planning to 1 day build something that's a little bit like PIP. So imagine if you wanted a message queue in a SQL database, you could just do, like, something like a pip install message queue and then get it. All these things that are missing from the SQL ecosystem, even though it's such a popular language and every other language already has that. That also brings up an interesting point
[00:29:04] Unknown:
of SQL is an integral part of database engines where it's part of the actual engine. It has a whole query planner. And I'm wondering if you think that there's a potential future for prequel to be more closely related to the different database engines or being an extension that can be installed maybe using something like the Python support and Postgres or, you know, being able to bring it into a closer coupling with the actual database engines versus being this surface layer that uses SQL as the means of interaction?
[00:29:34] Unknown:
Thinking some distance feature, it would actually be very beneficial to have Prequel as a plug in to the dat database because then it would be able to use the entire features of the database and use them correctly instead of being limited by the SQL interface, which is a little bit stunted sometimes. But I do think that the current approach that SQL does, which is, like, storing all the functions in the database and all of that, I think it's very wrong from a modern perspective. We want our code in GitHub. We want it version controlled. It shouldn't be in the database itself. But in terms of having the plug in, having the interpreter in the database, that would be great. Definitely would make prequel better. But I think there's a long time until that happens.
[00:30:17] Unknown:
So for people who are interested in what you're building and they are tired of handwriting a bunch of SQL code and they're thinking about using prequel, what are some of cases where it's the wrong choice so they might be better suited with just, you know, using the native SQL or using a library like PyPika or going with an ORM?
[00:30:35] Unknown:
Well, for 1 thing, if you don't have the patience to learn yet another technology, then it's probably not for you. But actually, I can think of many reasons not to use it. You know, if you need a lot of really quick and simple queries, prequal might slow you down. Although, as I said, I hope to fix that. And, you know, it's still a new language. It take it needs time to grow, to build a community, to polish the hard angles. If you're looking for a drop in solution that has an answer for every use case, then then you should probably wait a little, you know, come back in a year or 2. Definitely, as much as I try to make the documentation and better and improve all the bugs, Other projects are currently more mature. So I think I have some advantage in my approach, but there's still time for me to be, like, a serious competitor.
[00:31:17] Unknown:
And as you continue to work on prequel, what are some of the things that you have planned for the near to medium term, either in terms of technical additions and new features or work that you're doing to help raise awareness and grow the community or just anything that you have planned to help grow the language and grow the project?
[00:31:36] Unknown:
So in the near future, I plan to add syntax for JSON operations. Like, all the big databases are adding now, and I think it's something that people are really missing on having unstructured data but being able to do queries on it. So that's going, pretty soon. I plan to add the auto joins for attribute access, and other ORMs already have that. I waited until I had, like, a solid base to implement it. So that's gonna happen soon as well. I'm working on well, right now, the functions are compiled at runtime, just in time as they say, but they are not cached, which makes the completion a tiny bit slower when you do a query. Usually, it's not noticeable.
But I do plan to add the caching and, you know, do it like a real JIT compilation like Julia does it. And then I think it would be also very useful for web developers. And already, like, in my preliminary benchmarks, I'm performing better than SQLAlchemy. So I have good hopes for that. Plan to add control flow compilation so that more of the code can run-in the database. And, of course, adding more that more targets, like snowflake. Maybe I'll even add Mongo. I haven't tested if it's possible, but it might be. As for the far future, that's not as certain. Like I said, I wanna add a peep like feature where you can have a marketplace of modules and packages that will be ready made and you can use.
I'm thinking of adding it as a plug in to all kinds of dashboards where you already can use SQL. So maybe you could use prequel. Could make those dashboards a little bit nicer. And perhaps I'll even write my own, like, dashboard framework because I feel that by starting with the shell command and building the add into a UI instead of doing it the other way around, you can get a very effective workflow. But, really, I'm still not entirely sure, and it really depends a lot of user inputs. So if anyone has any idea, if anyone wants me to take it somewhere, I'd be very happy to hear it. And are there any other aspects of the work that you're doing on prequel or the
[00:33:46] Unknown:
overall use cases that you are driving towards with it or any specific areas of contribution that you're looking for help with that we didn't discuss yet that you'd like to cover before we close out the show?
[00:33:57] Unknown:
Nothing in particular. I'm always willing to accept help and ideas and tips. If anyone's, like, an expert in databases and he thinks that I'm doing something wrong, I'll be happy to hear it. If anyone has a really interesting idea. If anyone wants to help me test for bugs, all the better. But really, just, like, engaging. If you want me to add any feature, I'll add it. Just participating, that would make QuickWear grow the fastest.
[00:34:22] Unknown:
Alright. Well, for anybody who wants to get in touch with you and follow along with the work that you're doing, I'll have you add your preferred contact information to the show notes. And so with that, I'll move us into the picks. This week, I'm going to choose a new show that I started watching called Counterpart. That's an interesting sort of sci fi drama where it's set in Berlin, Germany, and the idea is that there was a an event where our reality branched into 2 parallel realities, but there is a conduit by which you can traverse between the 2. And so there's this whole sort of diplomatic apparatus that has arisen to
[00:34:57] Unknown:
control crossings and manage the interactions between them. So just really interesting premise and interesting show and great acting. So definitely recommend checking that out if you're looking for something to watch. And so with that, I'll pass it to you, Erez. Do you have any picks this week? I don't know if it counts, but I've spent the last month in, Bansko, Bulgaria. It's a small ski town that in the winter, it's for ski. But now that it's summer, it's mostly empty. But there is a lot of co working spaces here. It's like a digital nomad hub. And it's a very nice place. There's a lot of nature, salaried from Piel. If anyone's, like, traveling and working on the road, it's a nice place to visit.
[00:35:33] Unknown:
Alright. We'll definitely have to try and make it there someday. So thank you again for taking the time today to join me and share the work that you're doing on Prequel. It's definitely very interesting project and 1 that I plan to experiment with and possibly integrate into my own data engineering work. So thank you for all of the time and effort you've put into that project, and I hope you enjoy the rest of your day. Thank you very much. You too. Thank you for listening. Don't forget to check out our other show, the Data Engineering Podcast at data engineering podcast dot com for the latest on modern data management. And visit the site of pythonpodcast.com to subscribe to the show, sign up for the mailing list, and read the show notes.
And if you've learned something or tried out a project from the show, then tell us about it. Email host at podcastinit.com with your story. To help other people find the show, please leave a review on Itunes and tell your friends and coworkers.
Introduction and Sponsor Messages
Interview with Erez Shinhan Begins
Overview of Prequel
Goals and Use Cases for Prequel
Comparison with Other SQL Tools
Design and Implementation of Prequel
Using Prequel in Data Workflows
Syntax and Features of Prequel
Challenges in Compiling to SQL
Future Plans for Prequel
Community Contributions and Closing Remarks