Summary
Our thought patterns are rarely linear or hierarchical, instead following threads of related topics in unpredictable directions. Topic modeling is an approach to knowledge management which allows for forming a graph of associations to make capturing and organizing your thoughts more natural. In this episode Brett Kromkamp shares his work on the Contextualize project and how you can use it for building your own topic models. He explains why he wrote a new topic modeling engine, how it is architected, and how it compares to other systems for organizing information. Once you are done listening you can take Contextualize for a test run for free with his hosted instance.
Announcements
- Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
- When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $60 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!
- Your host as usual is Tobias Macey and today I’m interviewing Brett Kromkamp about Contextualise, a topic modeling application that helps you build a mind map for information-heavy projects.
Interview
- Introductions
- How did you get introduced to Python?
- Can you start by describing what Contextualize is and some of the types of projects that it can be used for?
- What was your motivation for creating it?
- How do you use topic maps in your own work and creative endeavors?
- The space of personal note-taking and knowledge management is vast and varied. What does Contextualize do well that you have been unable to find or implement in other tools?
- For someone using Contextualize, what does that workflow look like?
- How are you approaching integration with different creative contexts (e.g. text editors, graphics editors, word processing, etc.)?
- Can you describe how Contextualize is implemented?
- How has the design evolved since you first began working on it?
- In the documentation for Contextualize it mentions that this is the latest in a string of topic mapping platforms that you have built. What are some of the lessons that you have learned from previous efforts that have influenced the design of this one?
- One of the challenges with many knowledge management tools is that they are prescriptive in how to work with them. In what ways has your own preference for how to interact with information influenced the direction of Contextualize?
- Being an open source application, how has its exposure to the public directed your software and user design?
- How do you approach the challenge of reducing friction in adding content and relations while allowing for flexibility and context management?
- What are some of the projects that you are using Contextualize for?
- What are your thoughts on the utility of something like Contextualize for capturing and organizing the collective knowledge of a team of collaborators, whether in a work or casual context?
- What have you found to be the most interesting, complex, or complicated aspects of building a topic mapping platform?
- When is Contextualize the wrong choice?
- What do you have planned for the future of the project?
Keep In Touch
- Website
- @brettkromkamp on Twitter
- brettkromkamp on GitHub
Picks
- Tobias
- Brett
Closing Announcements
- Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management.
- Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
- If you’ve learned something or tried out a project from the show then tell us about it! Email hosts@podcastinit.com with your story.
- To help other people find the show please leave a review on iTunes and tell your friends and co-workers.
- Join the community in the new Zulip chat workspace at pythonpodcast.com/chat
Links
- Contextualise
- Norway
- IBM Rexx
- Java
- Semantic Web
- Topic Map
- ISO standard for topic maps
- RDF
- Spain
- Knowledge Management
- Graph Database
- Worldbuilding
- Roam Research
- TopicDB
- Twitter Bootstrap
- Hypergraph
- Digital Gardening
- Notion
- TiddlyWiki
The intro and outro music is from Requiem for a Fish by The Freak Fandango Orchestra / CC BY-SA
Hello, and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you're ready to launch your next app or want to try a project you hear about on the show, you'll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform, it's easy to get started with the next generation of deployment and scaling, powered by the battle-tested Linode platform, including simple pricing, node balancers, 40 gigabit networking, dedicated CPU and GPU instances, S3-compatible object storage, and worldwide data centers.
Go to pythonpodcast.com/linode, that's L-I-N-O-D-E, today and get a $60 credit to try out a Kubernetes cluster of your own. And don't forget to thank them for their continued support of this show. Your host as usual is Tobias Macey, and today I'm interviewing Brett Kromkamp about Contextualize, a topic modeling application that helps you build a mind map for information-heavy projects. So Brett, can you start by introducing yourself?
[00:01:12] Unknown:
Yes. As mentioned, my name is Brett. I'm living in Northern Norway, and I'm working for a local software company in the educational sector. I've been building topic-map-based systems for over 15 years, so much so that I'm now building them in my private life as well.
[00:01:33] Unknown:
And do you remember how you first got introduced to Python?
[00:01:35] Unknown:
Yes, I was thinking about that. If I remember correctly, at some point, and this must have been 18 or 20 years ago, I bought a CD-ROM, back when you would still buy things like a CD-ROM. It contained a lot of niche languages. I think it included things like Rexx, a language from IBM, and XLisp, some kind of object-oriented Lisp. And one of the languages was also Python, a very, very early version of Python. So I took a look at the different languages on the CD-ROM, and I took a look at Python, and it was, yep, this is quite interesting. But I did actually forget it at that point, and I moved on to other things. I think I went into Java programming at the time. But at some point, in one of my jobs, I had to give programming lessons to people. So then Python came back to me as this incredibly simple-looking language that would allow people who are learning to program to really focus on learning to program and not on all of the ceremony around programming languages. So I picked up Python and started teaching people how to program with it.
And slowly but surely, I personally started using Python more and more from that point onwards.
[00:02:52] Unknown:
And you mentioned that you've been working on topic maps for quite some time now. I'm wondering if you can give a bit of context about what a topic map is and a bit about how you first got involved in working on them.
[00:03:04] Unknown:
Yeah, sure. I think this was around 2006, 2007. The whole Semantic Web, and Semantic Web technologies, became a real thing at that point, and topic maps were one of those technologies within the Semantic Web space. Topic maps, as the name implies, are really about creating and connecting topics. A topic can be anything: any abstract concept, anything can be a topic. So you create your set of topics, and then you start relating those topics. Topic maps were, or rather are, also an ISO standard. But to a degree, topic maps were never able to, in entrepreneur speak, jump the chasm, so they fell somewhat out of favor. Now if you talk about semantic technologies, you're normally talking about RDF, which is an alternative semantic technology. But topic maps in themselves are such a powerful paradigm, such a powerful way of thinking about so many things, and they allow you to model so many things. I always say topic maps are a meta model: they allow you to model other models.
So I started using topic maps in the company I was working for at the time, which was a quite large company in Spain in the tourism industry. We started using topic maps to manage the numerous websites we had. We had a lot of products targeting a lot of very specific niche holiday markets, so we built a topic map CMS to manage all of these different, smaller microsites, and topic maps worked absolutely brilliantly in that respect. That's how I came to start using topic maps professionally. Then I came to Norway about 8 years ago. In Norway, they've always used topic maps in the educational sector, and they use them literally at a state level to manage and organize the national curriculum.
So that's what we're currently doing. We build systems that make educational services available at a national level using topic maps. And personally, I just fell in love with the topic maps model, the topic maps paradigm. I think in topic maps. I model in topic maps. So I use them for almost all of my own personal projects.
[00:05:45] Unknown:
And that brings us to Contextualize, which is your latest incarnation of a platform for being able to actually create and manage and organize these topic maps. So can you give a bit of a description about what it is and some of the types of projects that you're using it for and your motivation for creating this iteration of it?
[00:06:12] Unknown:
Okay. So, yes, I have built several topic map systems. I personally built the first one in 2007. The problem with that one, and with topic map and knowledge management applications in general, is the complexity. Especially the complexity of the underlying model: it leaks through into the user interface. To a degree that is inevitable, but it makes the application very difficult to use and, by extension, also very niche. People just don't have the time, and they don't have the interest, to try to learn extremely complex applications. So what I've been trying to do, and with Contextualize I'm on that road, is to use something like topic maps and the topic maps paradigm but, at the same time, have the simplest possible user interface.
And Contextualize is an iteration on that road, trying to make it very simple. A user logs into Contextualize and creates a topic map. The creation of a topic map is a very simple thing: you provide a title, a description, maybe an image. That's it. Now you have a topic map. The topic map is prepopulated with some core topics, some critical topics. After that, one of the first things you would do is create a topic, and the creation of a topic is again as simple as possible. You could leave it at just adding a title. You add a title, and the application will automatically generate the topic identifier for you. You could potentially add some topic text, and then you just click on save.
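To make the "add a title and the identifier is generated for you" step concrete, here is a minimal sketch of one way such an identifier could be derived. The function name `make_identifier` and the slug rule itself are assumptions for illustration; the conversation does not say how Contextualize actually builds its identifiers.

```python
import re
import unicodedata


def make_identifier(title: str) -> str:
    """Derive a URL-friendly identifier from a topic title (hypothetical rule)."""
    # Decompose accented characters and drop the accents, so "Café" becomes "Cafe".
    ascii_title = (
        unicodedata.normalize("NFKD", title).encode("ascii", "ignore").decode("ascii")
    )
    # Lowercase, then collapse every run of non-alphanumerics into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", ascii_title.lower())
    # Trim hyphens left over from leading or trailing punctuation.
    return slug.strip("-")


print(make_identifier("My World-Building Project"))
```

A real system would presumably also need to handle collisions between titles that reduce to the same identifier, which this sketch leaves out.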
And now you have a topic. And, again, a topic can be anything. I'm a topic. My project is a topic. The company I work for is a topic. Any concept you can think of can be a topic. Once you've set up a topic, you start thinking about the relationships between your topics. This is an interesting thing about topics because, in many respects, when it comes to graph databases, it's not so much the entities, although they obviously are valuable; it's the relationships, the edges between the vertices, where the power of a graph database comes into play. In a topic map, the relationships, called associations in topic map terms, are, like everything in topic maps, semantically meaningful. So you're not only asserting a relationship between two topics, for example between me and the company I work for. I'm also asserting the type of relationship, so I can say this is of type employment or work. And on top of that, each topic plays a role in any relationship. So I play the role of employee.
The company I work for plays the role of employer. All of these nouns that I've just mentioned, employer, employee, work, are topics as well. So you eventually end up with a complete data model that is self-contained and self-referential, and everything you're doing there has a semantic meaning. What else could I say about topic maps? Again, topics in themselves have a type. If you create a topic, it is of type topic, and that type is a topic. When you create an association, by default the type of the association is association.
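The employment example can be written down as a small data structure. This is only a sketch of the idea that types and roles are themselves topics; the class names `Member` and `Association`, the helper `is_self_contained`, and the placeholder topics `alice` and `acme` are all invented here and are not taken from Contextualize or TopicDB.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Member:
    topic: str  # identifier of the topic taking part in the association
    role: str   # identifier of the topic describing the role it plays


@dataclass(frozen=True)
class Association:
    assoc_type: str              # identifier of the topic naming the relationship
    members: tuple[Member, ...]


# Every identifier below is itself a topic, including the types and the roles.
topics = {"topic", "association", "alice", "acme", "employment", "employee", "employer"}

works_for = Association(
    assoc_type="employment",
    members=(Member("alice", "employee"), Member("acme", "employer")),
)


def is_self_contained(assoc: Association, known: set[str]) -> bool:
    """True when the type, every member, and every role are known topics."""
    used = {assoc.assoc_type}
    for member in assoc.members:
        used.update((member.topic, member.role))
    return used <= known


print(is_self_contained(works_for, topics))
```

Removing `employer` from the set of topics makes the check fail, which is the self-referential property described above: nothing in an association may point outside the topic map.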
Then there are the roles and the occurrences. Occurrences are what allow you to connect information resources to a topic. Information resources can be absolutely anything, from a link to a page, to a video, to a PDF, to an Excel spreadsheet. Anything you want, you can connect to a topic, and the occurrence is what establishes the connection between a topic and the actual information resource itself. These occurrences are also typed and, again, like everything in topic maps, the type is a topic. So it is a very powerful way to structure and organize what is potentially a complex domain, or maybe not such a complex domain, but you have this very powerful model at your disposal to do that organizing and structuring of information, and to turn information into something that becomes more knowledge as opposed to just information.
[00:10:52] Unknown:
From a reading of the documentation, it looks like one of the uses of the platform is world building, in terms of creating a fictional world and some of the backstory, context, and characters in it. I'm wondering if you can give a bit of background on some of the types of activities that go into that and the ways that a topic map helps you there, and some of the motivation for building out Contextualize in terms of what the limitations were in other similar platforms that didn't fit your particular use case or mental model for being able to actually pursue this creative endeavor?
[00:11:23] Unknown:
Okay. So, yes, I do use Contextualize for my own projects. I also use Contextualize to help me organize and structure the Contextualize project itself. For me, obviously, it's a personal project, and there's a lot going into that project: features, ongoing issues. Of course, the actual issues I can manage on something like GitHub, but conceptually, when I'm thinking about Contextualize, I've organized that in a topic map. Like you mentioned, another hobby of mine is world building, specifically for games and for books. World building is, I believe, a fantastic example of where these kinds of systems, not necessarily just Contextualize but these kinds of systems in general, are really, really powerful. When you start focusing on building a world, that is potentially vast. You have countries, you have races, you have characters, you have the economy to think about, you have the geography to think about, and you are trying to keep this all straight. An application like Contextualize suits that kind of project very well because every character in your world is a topic, and you can establish the relationships between these characters.
The quests that they need to go on, those are topics. The events that take place in these quests, those are topics. And you can establish temporal or spatial relationships between all of this. So a topic-map-based application works well for world building. But in general, over the last couple of years, there has been a huge growth in applications for personal knowledge management, the majority of which are based on graph data models. These kinds of applications are extremely suitable for the quite complex, large sets of information that you need to manage. Another thing with something like topic maps is that even the relationships that you are asserting between two topics, the associations, are topics in themselves.
So if, for example, we go back to the relationship between me as an employee and my company, the employer, there's a relationship there, one of employment. But there's a lot to say about that relationship in itself. There's an employment contract between us. There are dates. There are lots of things that are relevant with regard to that very specific relationship. With topic maps, you can express all of that directly on the relationship. Associations are top-level addressable items just like everything else. So, to summarize, the topic map way of thinking and the topic map way of modeling is ideal for making sense of these kinds of large sets of information. And on top of that, once you have created the topics and you have asserted your relationships between these topics, you can then obviously navigate these topics. You can either do that in a very textual way by means of links, or you can do that with graph visualizations, so that you can visually see, from the current topic in your topic map, what is related one, two, three, four, five steps away. So it's a very powerful way of dealing with large amounts of information. What was the other question?
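The point that an association is itself addressable, and can carry facts about the relationship, can be illustrated with a short sketch. The field names, the `registry` lookup, and all values (including the date and file name) are placeholders invented for this example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Association:
    identifier: str                 # addressable, like any other topic
    assoc_type: str
    members: dict[str, str]         # role -> topic identifier
    properties: dict[str, str] = field(default_factory=dict)


employment = Association(
    identifier="alice-employment",
    assoc_type="employment",
    members={"employee": "alice", "employer": "acme"},
)

# Facts about the relationship itself live on the association,
# not on either of the two topics it connects.
employment.properties["start-date"] = "2020-01-01"    # placeholder value
employment.properties["contract"] = "contract.pdf"    # placeholder value

# Because it has its own identifier, the association can be looked up directly.
registry = {employment.identifier: employment}
print(registry["alice-employment"].properties["start-date"])
```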
[00:15:30] Unknown:
The other question was just what are some of the things that you found lacking in the other tools that are available for knowledge management and topic mapping that led you down the path of actually creating your own system, and some of the specifics of Contextualize that might lead someone to choose it over the other options?
[00:15:47] Unknown:
Okay. So that is a good question, actually, because currently the knowledge management space is going through quite a lot of changes, and there's an application that is becoming very popular, and I understand why. I think the application is called Roam, as in R-O-A-M, and it's from a company called Roam Research. One of the big features in this application is what they are calling backlinks. Say you have a document A and you create a link to document B. Currently, on the web, links go in one direction. Once you've navigated to the other page, you don't see the link back to where you came from. There is no link backwards. So they've introduced this concept of backlinks where, if I connect document A to document B, then when I'm in document B, I can see a summary of all of the links, the backlinks, to the current document. This is a very powerful way of, again, managing and organizing and navigating your information. And this kind of feature is something quite recent in personal knowledge management systems.
But topic maps have had that from day one. They've had associations, and associations have always been two-way. When I create an association, I have to do a couple of things. First of all, I have to say what type of relationship I am establishing, so this relationship has now become semantically meaningful. But just as importantly, I also have to say: the topic I'm in now, the current topic, I'm going to connect it to another topic, so what role does this other topic play within the context of this relationship?
So, again, if we go back to the employee and employer relationship, I play the role of employee, and the company I'm working for is playing the role of employer. I have to define those roles on that relationship. So when I'm in a current topic, I can see what is directly related to it. Not only can I see what other topics are related to this current topic, I can also see the type of relationship that has been established, and I can also see the role of that other topic. And, again, this just gives you so much context.
When you're in a specific topic, you have all of that context of what is around this topic. And that context is something that I was missing in quite a few knowledge management applications up until quite recently. So that's one of the reasons why I built, and have always liked building, topic map applications: they've had this kind of two-way, backlinking thinking in place since day one, basically.
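One way to picture why two-way associations give you backlinks for free is to index each association under every topic it involves. The sketch below assumes a toy representation, a list of `(type, {role: topic})` pairs with placeholder names, and a hypothetical `build_index` helper; it is not how any particular engine stores its data.

```python
from collections import defaultdict

# (association type, {role: topic}) pairs; all names are illustrative placeholders.
associations = [
    ("employment", {"employee": "alice", "employer": "acme"}),
    ("authorship", {"author": "alice", "work": "field-notes"}),
]


def build_index(assocs):
    """Map each topic to (association type, other topic, other topic's role)."""
    index = defaultdict(list)
    for assoc_type, members in assocs:
        for role, topic in members.items():
            for other_role, other in members.items():
                if other != topic:
                    index[topic].append((assoc_type, other, other_role))
    return index


index = build_index(associations)
print(index["acme"])   # the "backlink" view, seen from the employer's side
print(index["alice"])
```

From either end of an association the same three pieces of context are visible: the related topic, the type of relationship, and the role the related topic plays.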
[00:19:04] Unknown:
And so digging deeper into Contextualize itself, can you describe a bit about how it's implemented and some of the ways that it has evolved in terms of the system architecture and design since you began working on it?
[00:19:19] Unknown:
Okay. So, yes, broadly speaking, Contextualize is made up of two components. The back end, or what I call the back end of the back end, is what's called a topic map engine. I've implemented that engine myself, and it's also available as open source on GitHub. That engine is called TopicDB, and it is the engine that is really providing all of the magic to Contextualize. When Contextualize starts up, it makes a connection to the back-end topic map engine. From that point onwards, you can start creating the actual topic maps. The engine has an API that allows you to create topics, to create associations, to create occurrences (occurrences are what connect information resources to topics), and to retrieve them. So for a specific topic, I can say, get me all of the related topics for this topic, and then I pass in a topic identifier. I can also say, get me the network of topics. That's not just the directly related topics: I can span out and get the topics up to four or five hops, that is, associations, away.
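The operations listed here (create topics, associations, and occurrences; fetch directly related topics; fetch the network a few hops out) can be sketched with a small in-memory stand-in. This is explicitly not TopicDB's API: the class `TinyTopicStore`, its method names, and the sample topics are all made up, and the network query is implemented as a plain breadth-first search.

```python
from collections import defaultdict, deque


class TinyTopicStore:
    """In-memory stand-in for a topic map engine; not TopicDB's real API."""

    def __init__(self):
        self.topics = set()
        self.occurrences = defaultdict(list)   # topic -> [(type, resource)]
        self.neighbors = defaultdict(set)      # topic -> directly related topics

    def create_topic(self, identifier):
        self.topics.add(identifier)

    def create_association(self, a, b):
        # Associations are two-way, so record both directions.
        self.neighbors[a].add(b)
        self.neighbors[b].add(a)

    def create_occurrence(self, topic, occ_type, resource):
        self.occurrences[topic].append((occ_type, resource))

    def related_topics(self, identifier):
        return sorted(self.neighbors[identifier])

    def network(self, identifier, max_hops=2):
        """Breadth-first walk returning {topic: hops away} within max_hops."""
        distance = {identifier: 0}
        queue = deque([identifier])
        while queue:
            current = queue.popleft()
            if distance[current] == max_hops:
                continue  # do not expand beyond the hop limit
            for nxt in sorted(self.neighbors[current]):
                if nxt not in distance:
                    distance[nxt] = distance[current] + 1
                    queue.append(nxt)
        return distance


store = TinyTopicStore()
for name in ("hero", "quest", "dragon", "mountain"):
    store.create_topic(name)
store.create_association("hero", "quest")
store.create_association("quest", "dragon")
store.create_association("dragon", "mountain")
store.create_occurrence("dragon", "image", "https://example.com/dragon.png")

print(store.related_topics("quest"))
print(store.network("hero", max_hops=2))
```

With `max_hops=2` the walk from `hero` stops at `dragon`, so `mountain`, three associations away, is left out. Typed associations and roles are omitted to keep the traversal readable.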
So this engine provides all of that functionality. Then the other main part of Contextualize is the actual Flask application itself, which in many respects is a very normal, straightforward Flask application, and it's providing the web front end that talks to the back-end topic map engine. So, again, there are just these two parts: the topic map engine on one hand and the actual web application on the other. There's nothing web about the topic map engine in itself. It could be a desktop application talking to it. It could just be an API with a set of endpoints talking to it. There's nothing webbish about the actual topic map engine. Flask is what is really putting this topic map engine onto the web.
[00:21:28] Unknown:
And one of the challenges that I've often found in terms of being able to get into the habit of using a particular application for tracking notes and ideas is the availability of it across multiple different platforms and contexts. For instance, I've used Org Mode for a little while in Emacs, and that's been great because it's very flexible and powerful. But then as soon as you try to use that on mobile, it all falls apart because there are not very many good clients for it. Even the ones that are decent in their own right lack a lot of the explicit power of Org Mode or they're just cumbersome to use. And then there are proprietary systems that are pleasant to use, but then you're locked into their particular platform. One of the ones that stands out in terms of recent memory is Notion because of its flexibility.
But then if you really want to be able to use it in other contexts, it's difficult. And so I'm wondering what your thoughts are on why it's so hard to make a tool accessible in those different contexts, and some of your thoughts on providing integration points in Contextualize for being able to work across those different environments?
[00:22:32] Unknown:
So obviously, Contextualize is a web application, and I've used a front-end framework, Twitter Bootstrap, which supports and enables this responsive nature of web applications. So using something like Contextualize on, for example, an iPad is totally doable. In that respect, Contextualize is something that you can use on your desktop machine or on a tablet. Where it would become more difficult to use something like Contextualize is obviously on a smaller screen. Although you can use Contextualize on a smaller screen, and it works quite well there, I think it's also a different mode.
When you've got a smaller screen, you probably are not going to be in the mode of creating a complex taxonomy or ontology or establishing the relationships between topics. So Contextualize has basically two modes, two ways of working. One is the normal mode, where you are working with your topics, creating the relationships between your topics, and attaching information to your topics. The other one is purely a note-taking mode. You can switch over to this note-taking mode, and then you just start writing your notes, but you are not putting them into the context of anything.
You are just recording notes. These notes are what I call unattached. They are just floating in space. They're not connected to anything. But that's the point. You would, for example, be at a conference and you just want to take a note on something that's been said, or on a person or a certain subject that you're interested in. Just record it. Just get it into Contextualize. Then, at a later stage, you would see the list of notes that you made at that point in time, and that's when you can do one of two things. You could convert the note into a topic, which you would then start relating and connecting to your other topics, or you could attach that note to an already existing topic. So, again, these are just two different ways of using Contextualize depending on the circumstances. And I think on a small device, although you can, you're not really going to want to use something like Contextualize to do complex organizing of your topics.
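The two things you can do with an unattached note, promote it to a topic or file it under an existing topic, can be sketched as two small functions. The `Topic` and `Note` classes, the `inbox` list, and the note contents are invented for illustration and say nothing about how Contextualize stores notes internally.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    identifier: str
    name: str
    text: str = ""
    notes: list = field(default_factory=list)


@dataclass
class Note:
    title: str
    text: str


# Notes captured in note-taking mode: unattached, "floating in space".
inbox = [
    Note("Talk idea", "Something said in the keynote"),
    Note("Follow up", "Person to contact after the session"),
]
topics = {"conference": Topic("conference", "Conference")}


def convert_to_topic(note, inbox, topics):
    """Promote an unattached note to a topic of its own."""
    identifier = note.title.lower().replace(" ", "-")
    topics[identifier] = Topic(identifier, note.title, note.text)
    inbox.remove(note)
    return topics[identifier]


def attach_to_topic(note, inbox, topic):
    """File an unattached note under an already existing topic."""
    topic.notes.append(note)
    inbox.remove(note)


convert_to_topic(inbox[0], inbox, topics)                # "Talk idea" becomes a topic
attach_to_topic(inbox[0], inbox, topics["conference"])   # "Follow up" is attached
print(sorted(topics), len(inbox))
```

Either way the note leaves the inbox, which matches the idea that capture comes first and organizing happens at a later stage.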
I don't think it would be a very good experience. And then, obviously, I've been thinking that at some point maybe there should be an actual native application. Again, like I said, there's nothing webby about the topic map engine in itself. It is just a topic map engine. There is the beginnings of a RESTful web API to talk to this topic map engine, and I would then have a native application talking to that REST API. But I'm somewhat reluctant to go down that road, I must admit, because a lot of time would be spent on building a native application.
So, yeah, I'm reluctant to go down that road. I'm trying to ensure that Contextualize is usable across as many screen sizes as possible. And then with these different modes, on a small device you are probably going to be doing more note taking as opposed to really organizing and structuring your knowledge or a specific knowledge domain. At least that's the thinking.
[00:26:20] Unknown:
Yeah. It's definitely easy to spread yourself too thin on a particular project and then end up just losing steam on it and sidelining it in favor of something else. So I can appreciate your reticence to go down the path of building a dedicated mobile app just for this particular use case when you already have something that suits the context and the device form factor well enough for the time being.
[00:26:48] Unknown:
Yes, really. With a personal project like Contextualize, there are only so many hours in the day. So you really have to ensure that you are focusing on, in this case, what I consider to be where I'll get more bang for my buck and where users will actually benefit the most. So, yes, a native mobile application. Who knows? Maybe one day.
[00:27:14] Unknown:
Yeah. And given that Contextualize is a successor to other topic management systems that you've built in the past, what are some of the useful lessons that you've been able to draw from those previous experiences and apply to Contextualize, either things that you've done right or things that needed improvement that you're using this as the opportunity to get correct in this incarnation?
[00:27:42] Unknown:
2 things specifically. 1 1 was simply, simplify. Simplify as much as possible and specifically in the actual UX and the UX UI side of things. Again, I mean, there are quite a few really, really powerful, knowledge management systems out there. But to be able to use them, you really do need to be an expert, with regards to these kind of applications. And and I want this application to be useful to as many people as possible. I built it for myself, but it would still be a very nice thing if other people found it useful. So in order to to accomplish that, I I really did see that I needed to, simplify as much as possible.
I mentioned that we started or at least I professionally started using topic maps, building topic maps within the context of a company. We built a CMS around topic maps. And and I saw the difficulties that people had with topic map systems, where where if you were using complex terms or if you had a GUI or UX that was just a bit too complex, I mean, well, eventually, the system fails. We we had to do quite a lot of refactoring between what we initially built and which we thought was very straightforward and what we actually ended up with about 2 to 3 late years later with the users with something that they finally said, this is good enough for us. There was there was a lot of refactoring that we had to do. We also had to introduce things like tagging. So, tagging within the topic map system, you actually what you're doing is you're creating you're creating a relationship between multiple topics.
But from the user's point of view, they don't see that. They are not aware of that. They are just tagging. So you create a tag topic, and you create an association between that tag topic and the topic that's actually being tagged. Under the hood, it's dealing with all of the complexity. But for the user, it's just tagging. So with those kinds of things we saw, hey, we can be successful. That lesson is something that I've really tried to always take into account when building later and later versions of the topic map systems: simplify, simplify, simplify as much as possible until you can't. At some point, things are as complex as they are, but still try to simplify as much as possible on the UI/UX side of things. The second lesson is about practicality.
Being practical as much as possible, because sometimes, literally, practicality beats purity. So at some point, I was saying to myself when I started building Contextualize, hey, I know a lot of people have struggled with this concept of hypergraphs. A hypergraph is when you use one association to establish a relationship between more than 2 topics. People naturally understand a relationship between 2 topics. So, yes, again, between a mother and her child, between an employee and an employer. Those kinds of relationships, and asserting these relationships, people naturally understand.
But topic maps enable hypergraphs. That is, you can use one association to connect multiple topics together. A lot of people struggle with this. So I was thinking to myself, okay, I had learned this lesson. This is way too complex. Don't include it. And then there's another lesson on that. Yes, but there are some actual advanced users using Contextualize, and they do find this a very interesting and useful feature. So you make the feature available, but you put it behind an advanced user interface. So people actually have to go out of their way to create hypergraph-based associations. So, yes, I want to be quite pure in my implementation and simplify and keep things as easy as possible.
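The tagging and hypergraph ideas in this answer can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Contextualize's actual code: the `TopicMap`, `Association`, and `Member` names, the `tagging`, `tag`, and `tagged` identifiers, and the meeting example are all hypothetical. It shows that a tag is just a two-member association, while a hypergraph-style association simply has more than two members.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Member:
    """One participant in an association: a topic playing a role."""
    topic_id: str
    role: str


@dataclass
class Association:
    """A typed relationship; with more than two members it is a hypergraph edge."""
    type_id: str
    members: list[Member] = field(default_factory=list)


class TopicMap:
    """Hypothetical in-memory topic map, for illustration only."""

    def __init__(self) -> None:
        self.topics: set[str] = set()
        self.associations: list[Association] = []

    def associate(self, type_id: str, *members: Member) -> Association:
        if len(members) < 2:
            raise ValueError("an association needs at least two members")
        # Types and roles are themselves topics, so register them as well.
        self.topics.add(type_id)
        for member in members:
            self.topics.add(member.topic_id)
            self.topics.add(member.role)
        association = Association(type_id, list(members))
        self.associations.append(association)
        return association

    def tag(self, topic_id: str, tag: str) -> Association:
        """What the user sees as 'tagging' is an ordinary association."""
        return self.associate(
            "tagging", Member(tag, "tag"), Member(topic_id, "tagged")
        )

    def tags_of(self, topic_id: str) -> list[str]:
        found = []
        for association in self.associations:
            if association.type_id != "tagging":
                continue
            by_role = {m.role: m.topic_id for m in association.members}
            if by_role.get("tagged") == topic_id:
                found.append(by_role["tag"])
        return found


topic_map = TopicMap()
topic_map.tag("outer-space", "science")

# One association connecting three topics at once.
meeting = topic_map.associate(
    "meeting",
    Member("alice", "attendee"),
    Member("bob", "attendee"),
    Member("room-1", "location"),
)
```

Keeping `tag()` as a thin wrapper mirrors the point made in the interview: the user only ever sees "tagging", while the engine stores a regular association.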
But from a practical point of view, so many people were asking for this specific feature. I said, okay. I will add it, but I will add it in a way that it doesn't compromise the user experience for less advanced users or users that just don't have that need.
[00:32:10] Unknown:
Yeah. One of the experiences that I've had with different knowledge management systems is that they can be very prescriptive in terms of how you have to interact with them and require that there are multiple metadata fields that you have to enter in order to be able to record anything or gain any real utility from the system. And so eventually, you end up having to become an expert at data entry, and you spend more time on the actual organizational aspects of it than on the aspect of just recording information and then being able to be flexible in terms of how you interact with it. And I'm wondering what your particular preferences are in terms of interacting with the information that you're gathering and being able to structure it and how that has influenced the direction that you've taken with Contextualize?
[00:33:00] Unknown:
Contextualize is quite an opinionated piece of software. In order to get the most out of Contextualize, you do need to have at least a basic understanding of topic maps. If you have that understanding, then I believe, based on feedback from my users, that Contextualize is a relatively straightforward application to use. Also, Contextualize relies a lot on defaults. So if you go and create an association, an association is probably one of the more complex entities that you are going to be dealing with within the context of Contextualize. In an association, there are at least 7 fields. So think of it this way. I'm always within the context of the current topic. Now I want to assert a relationship between the current topic and another topic. So what do I need to provide? Well, if you just use the defaults, you only have to provide one piece of information, and that is the topic identifier of the other topic. So if I'm in the Brett topic as an employee, and I want to connect to my employer, my company, all I need to do to establish that relationship is provide the ID of that employer topic. That's it. And then all of the defaults will apply to that relationship. So you will get an association of type association, and both topics will play a role of related in that association. So it's a very generic association, but it's good enough. You've established a relationship between 2 topics.
Only when you see that, no, I need to go a bit beyond that, I want to establish a more semantically meaningful relationship between these 2 topics, do you need to start thinking about overriding the defaults. So potentially, you wouldn't have an association of type association. No. You would want an association of type employment. And, no, I don't want Brett to be playing the role of related and the company to be playing the role of related. I want Brett to play the role of employee and the company to play the role of employer. You can do all of that. And what if that topic doesn't exist? This is something that is really based on feedback from the users. Before, if you went to create an association, because, again, everything in Contextualize and in topic maps is always referring to other topics, those roles and those types are other topics. So you would have this very unfortunate UX where somebody would try to create an association between 2 topics, and they are straightaway impeded in doing so because, oh, wait a minute. I don't have a topic for the concept of an employee.
I don't have a topic for the concept of employer. I don't have a topic for the concept of employment. So it became this, oh, I have to set this up first and then create this topic and then create this topic and then create this topic, and only then can I create the association? That's a very bad user experience. So what we have now in Contextualize is, well, first, you can use the defaults. Every single creation form has defaults. So if that's good enough for you, then you're fine. If you need to override it, then in line, in place, in the actual form itself, you can create a topic. So you would type in a topic ID, for example, employment.
The system detects, hey, this topic doesn't exist. So instead of you having to actually cancel the association creation and go and create that topic, no, in place, you can create the employment topic and then carry on to the next field and then carry on to the next field. So those are definitely changes that have been made to Contextualize based on feedback. It's also about trying to make it as easy as possible to allow people to build what is a knowledge domain, a model, to organize the information, to structure the information, but at the same time, not make them have to go too much out of their way to do it.
And stopping them from being able to do all of this because they have to do all of this pre-setting up of topics just to satisfy the application, that is a very bad user experience, and that's really based on feedback. I've been trying to improve on that aspect a lot.
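The "defaults first, override when needed, create missing topics in place" behaviour described in this answer might look roughly like the following in Python. This is a sketch under assumptions rather than the real Contextualize API: the class and field names are hypothetical, only five of the "at least 7" association fields are modelled, and `acme` is a made-up employer topic. The default type `association` and default role `related` are the ones named in the interview.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Association:
    src_topic: str                 # the current topic
    dest_topic: str                # the only value the user must supply
    type_id: str = "association"   # default association type
    src_role: str = "related"      # default role of the current topic
    dest_role: str = "related"     # default role of the other topic


class TopicMap:
    """Hypothetical in-memory store, for illustration only."""

    def __init__(self) -> None:
        self.topics: set[str] = {"association", "related"}
        self.associations: list[Association] = []

    def ensure_topic(self, topic_id: str) -> bool:
        """Create the topic if it is missing; return True if it was created."""
        if topic_id in self.topics:
            return False
        self.topics.add(topic_id)
        return True

    def associate(self, src_topic: str, dest_topic: str, **overrides: str) -> Association:
        association = Association(src_topic, dest_topic, **overrides)
        # Types and roles are topics too, so create any that do not exist yet
        # instead of forcing the user to set them up beforehand.
        for topic_id in (
            association.src_topic,
            association.dest_topic,
            association.type_id,
            association.src_role,
            association.dest_role,
        ):
            self.ensure_topic(topic_id)
        self.associations.append(association)
        return association


topic_map = TopicMap()

# Defaults only: a generic "related" link between two topics.
generic = topic_map.associate("brett", "acme")

# Overriding the defaults for a semantically richer relationship.
employment = topic_map.associate(
    "brett", "acme",
    type_id="employment", src_role="employee", dest_role="employer",
)
```

The design choice worth noting is that `associate()` never fails because a type or role topic is missing, which is the user experience problem the answer describes.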
[00:37:33] Unknown:
Given the fact that this is an open source application, and as you mentioned, some of that feedback has directed some of the user experience, what are some of the other ways that its exposure to the public and the fact that you're doing this development in the open have influenced your overall approach to the design and implementation of the software and the user experience?
[00:37:55] Unknown:
So, as mentioned, quite a lot of people, and this is fantastic, do just directly go to GitHub and create an issue and ask for a certain feature, or they ask for the priority of a given feature to be increased. So that's a very direct advantage that you get from having an open source project. But apart from that, just because you have put something out there and it has attracted a certain amount of people's interest, it sets you somewhat apart, and people will take you more seriously, and they will engage with you. Over the last couple of months, a lot of people have started conversations with me that have eventually become email threads, or they've asked me to join a Telegram group. So you're basically making yourself available to a lot of people getting in touch with you. And then, I mean, there are so many like-minded people and people that have so much experience and good ideas and insights.
More than anything, I think that is absolutely the biggest advantage I'm seeing of making something like this available on GitHub, having it as an open source project: just the amount of people that start engaging with you. It's absolutely fantastic, the amount of good input you get from people. And they're not asking anything in return. They are just interested. They have their projects. They have their insights. They have their perspectives, and people are sharing that. There is a huge movement now. It was something that I wasn't even aware of, and it's called digital gardening.
So these are people that are actually now in the open. They are creating, using all kinds of applications like Notion, which you mentioned. There's another application called TiddlyWiki, and, I think, Roam Research. So these are people that are putting their thoughts, their notes, directly in the open, and they call it digital gardening. And there's a huge amount of people. I was completely unaware of this movement, this trend going on. Because you would never think to search in Google for something like digital gardening when you're talking about personal knowledge management systems. So I just was not aware. Now I'm aware of it. And why am I aware of it? Because people somehow have found my project and started to talk with me. We've taken that conversation further, and it's just this very enriching experience I'm getting out of having my application available as open source.
[00:41:07] Unknown:
And so one of the other use cases for knowledge management systems is in the context of a team, whether that's a group of people who are collaborating on a creative endeavor or a team of engineers or just a group of people who work in the same company. And I'm wondering what your thoughts are on the utility and potential benefits of Contextualize in that context and being used in a group setting?
[00:41:24] Unknown:
So, again, this was something that I've added later. For me, all of the systems I've built up until now were really personal knowledge management systems. So it was me or a specific person mapping out a domain, a specific knowledge domain, and they themselves and only themselves were interacting with this documented knowledge domain. Obviously, one of the first things that happened when I made this application available as an open source project was, yes, but I would like to be able to collaborate with my friend on this world building project that we have. Yes.
I think that makes a lot of sense as well. So I've I've added, collaboration features to, contextualize very much modeled on the Google Docs approach. So you can you can comment on a topic map. You can edit a topic map, or you can view a topic map. And you you can invite people to your topic map and give them 1 of those roles. But I'm I'm not sure. I'm I'm a bit torn on this because, collective knowledge management, you know, how can I say it? Real learning and managing one's own knowledge is such a highly personal thing. What works for 1 person doesn't necessarily work for another person. So there's a bit of tension here between between, collective knowledge management and personal knowledge management. There's obviously a lot of room and a lot of value for being able to collaborate on a a topic map between 2 or more people.
And hence, that's why I added the feature. But I think also, something like Contextualize is equally valid as a personal knowledge management application. But, yeah, I'm a bit torn, because a lot of this is about learning. When you're using a knowledge management application, it's about documenting information. It's about documenting a certain learning process, and that is just a very personal thing. And what works for me won't work for you and vice versa. Nonetheless, it makes sense that people can collaborate on common topic maps or a common knowledge domain. So, yes, what can I say?
[00:43:41] Unknown:
In terms of the experiences that you've had building Contextualize and some of the other topic modeling platforms that you've worked on, what have you found to be some of the most interesting or complex or complicated aspects of that and some of the most interesting or unexpected lessons that you've learned in the process?
[00:44:08] Unknown:
Yeah. I must admit, I think for me, more than the actual web application itself, it's the underlying topic map engine which is the most interesting part. I mean, that's where the magic happens. Graph theory. There's a saying that everything is a graph. Graphs are so powerful. Imagine for learning systems that you create an application that not only allows you to map knowledge, but also allows you to determine the best path from one point in that map to another point in the map based on a certain amount of metadata that you record either on the topic itself or on the associations. So there are so many things you can do with graphs. And that, beyond doubt, is probably the most interesting part for me. What I think is the most challenging is the front end. Not so much the front end as in the JavaScript side of things. I mean, actually, the UX/UI of this world. Because, again, the challenge here, for me at least, is to make these kinds of applications as usable as possible to as many people as possible, and trying to translate what is potentially quite a dry, abstract subject into something that is really useful for a person, so that they have an application that doesn't make it more difficult than it might be. That really is a challenge. It's probably what takes up most of my time with Contextualize: trying to implement well thought out GUIs, in the sense that, well, this has to work for a lot of people. They really have to understand it. Yes, I am putting on the table that they really also need to understand topic maps before they will get the most out of this application.
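The "best path from one point in the map to another" idea mentioned earlier in this answer can be illustrated with a standard shortest-path search. The sketch below uses Dijkstra's algorithm over associations that carry a numeric cost; treating that cost as, say, a learning-effort score stored as association metadata is purely an assumption for illustration, as are the function name and the example topics. It is not a description of how the Contextualize engine works.

```python
import heapq


def best_path(edges, start, goal):
    """Return (total_cost, [topic, ...]) for the cheapest route, or None.

    `edges` is an iterable of (topic_a, topic_b, cost) tuples, where `cost`
    stands in for metadata recorded on an association. Associations are
    treated as undirected and costs must be non-negative.
    """
    graph = {}
    for a, b, cost in edges:
        graph.setdefault(a, []).append((b, cost))
        graph.setdefault(b, []).append((a, cost))

    queue = [(0, start, [start])]
    visited = set()
    while queue:
        cost, topic, path = heapq.heappop(queue)
        if topic == goal:
            return cost, path
        if topic in visited:
            continue
        visited.add(topic)
        for neighbour, step in graph.get(topic, []):
            if neighbour not in visited:
                heapq.heappush(queue, (cost + step, neighbour, path + [neighbour]))
    return None


# Hypothetical learning map: the direct link to "decorators" is more costly
# than going through "closures" first.
associations = [
    ("python-basics", "functions", 1),
    ("functions", "decorators", 3),
    ("functions", "closures", 1),
    ("closures", "decorators", 1),
]

route = best_path(associations, "python-basics", "decorators")
# route == (3, ["python-basics", "functions", "closures", "decorators"])
```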
But once they've gone through that effort, Brett, then you need to make sure that you stick to your side of the bargain and that they can actually use this application. So, yeah, that's the most challenging thing for me: the GUI, the UX of it all.
[00:46:07] Unknown:
So for somebody who does want to get started with using Contextualize, what are their options for being able to actually set it up and get things up and running and start documenting their knowledge and building these topic maps?
[00:46:17] Unknown:
Okay. So obviously, with it being an open source project, first of all, Contextualize is available online. You can go to contextualize.dev, and you can sign up, and you can start using it. It's free. I'm not going to start even thinking about charging for it. I want people to use it. It is available on GitHub. So you can obviously clone the repository and try to set it up. It's not the most straightforward application to set up, specifically if you're not on a Linux machine, because for one of the dependencies, specifically for PostgreSQL, you actually have to have a C compiler and the appropriate header files.
So specifically for people that are not on Linux, it's quite a difficult thing to set up. I'm very thankful for people that have contributed to this, because my knowledge of Docker is quite limited: there's also a Docker image for Contextualize. So you can actually set it up and have a Docker container up and running and use Contextualize just like that. Once you actually have the application up and running, then it's about, how do I go about actually getting some value out of this application? There are lots of ways. I mean, there really are lots of ways. It depends on the audience. It depends on what you want from this. For example, with Contextualize or topic map systems in general, I could set up a topic map where the people, the audience that is going to be consuming or interacting with that topic map, are doing that not necessarily from a creation side of things, but more from a consumption side of things. They're going to be navigating this map a lot. So Contextualize also has this concept of what I'm calling a knowledge path. You can create specific associations of type navigation between topics, and Contextualize will render out specific navigational GUIs to easily allow you to then traverse this kind of topic map. So if the topic map is more for navigation, then you will set it up one way. If the topic map is more for you as a person to say, okay, I have this knowledge domain, I want to actually document this knowledge domain, and I'm going to be reusing and extending this knowledge domain, then you would potentially take another approach.
Also, there are at least 2 different approaches to setting up knowledge domains. 1 is a more top down approach where you are being quite formal in in how you're going to be setting up the relationships between the topics, the topics themselves, what kind of information you're gonna be connecting to the topics. And the other is a more bottom up approach, iterative, incremental approach. You create the topics as you as you need them. You link them up as you need them. And, obviously, there's a combination of these as well. For a lot of things, it actually makes sense that you bootstrap a quite formal topic map with a a a set of topics and a set of predefined associations between those topics. And then from there, you start modifying, extending the topic map and creating your own relationship. So it really is, it depends on how you want to use it. Yes.
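The knowledge path concept from this answer, associations of type navigation that the application turns into navigational controls, can be approximated as follows. This is a loose sketch: the interview does not say how navigation associations are structured, so the `(type, source, destination)` tuples, the single-successor assumption, and the function names are all hypothetical.

```python
def knowledge_path(associations, start):
    """Follow 'navigation' associations from `start` and return the topic order.

    `associations` is a list of (type, source, destination) tuples. This sketch
    assumes each topic has at most one outgoing navigation link.
    """
    next_topic = {}
    for kind, source, destination in associations:
        if kind == "navigation":
            next_topic[source] = destination

    path = [start]
    # Stop at the end of the chain, or if the chain loops back on itself.
    while path[-1] in next_topic and next_topic[path[-1]] not in path:
        path.append(next_topic[path[-1]])
    return path


def render_links(path, current):
    """Return the previous/next topics a navigation GUI could show for `current`."""
    index = path.index(current)
    previous = path[index - 1] if index > 0 else None
    following = path[index + 1] if index + 1 < len(path) else None
    return {"previous": previous, "next": following}


associations = [
    ("navigation", "intro", "setup"),
    ("navigation", "setup", "first-topic"),
    ("related", "intro", "faq"),  # ignored: not a navigation association
]

path = knowledge_path(associations, "intro")
links = render_links(path, "setup")
```

A bootstrapped, top-down topic map as described above could ship with a chain like this predefined, while a bottom-up map would add navigation links only once a reading order emerges.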
[00:49:46] Unknown:
And for people who are considering Contextualize, what are the cases where it's the wrong choice and they might be better off with a simple note taking application or some other means of knowledge management such as a wiki?
[00:50:02] Unknown:
There is a certain amount of formality to using an application like Contextualize. And again, you should have a basic understanding of topic maps. Topic maps are seemingly quite simple, and conceptually, I suppose they are. But there are some darker aspects, some nooks and crannies with regards to topic maps, where if you don't really understand them, there are some things that just won't make sense. Topic maps, for example, have this concept of scope. And scope you could think of as a synonym for context, or as a synonym for perspective or point of view. So it's possible in something like Contextualize to create different perspectives of what is basically the same underlying data.
But you need to understand that. You need to understand that, and you need to be able to to set up your topics, but specifically the associations and the resources that you're connecting to those, topics. You need to understand the concept of scope. Otherwise, you won't get this benefit out of, something like contextualize. So again, there's some formality and some prerequired or basic knowledge that you need to have to use something like contextualize to get value from it. So if you are not willing or if it's not your need, it's not about just willingness or lack of willingness. If it's not something that you need, if it's if you are just putting a couple of, notes together on a specific topic, and and you want to yeah. And and if your needs don't go beyond that, then then don't use something like contextualize.
Absolutely don't. It's it's it's overkill. So, yeah.
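The notion of scope discussed in this answer, different perspectives over the same underlying data, can be shown with a small filter. The sketch assumes that each note attached to a topic carries a scope label and that `"*"` marks content visible in every scope; both conventions, along with the world-building example, are assumptions made for illustration.

```python
from dataclasses import dataclass

UNIVERSAL = "*"  # assumed marker for "visible in every scope"


@dataclass(frozen=True)
class Note:
    topic_id: str
    text: str
    scope: str = UNIVERSAL


def view(notes, topic_id, scope):
    """Return the note texts for a topic as seen from one scope (perspective)."""
    return [
        note.text
        for note in notes
        if note.topic_id == topic_id and note.scope in (UNIVERSAL, scope)
    ]


# Hypothetical world-building topic with player and game-master perspectives.
notes = [
    Note("dragon", "A large, winged reptile."),
    Note("dragon", "Rumoured to live somewhere in the north.", scope="player"),
    Note("dragon", "Actually lives under the old mill.", scope="game-master"),
]

player_view = view(notes, "dragon", "player")
game_master_view = view(notes, "dragon", "game-master")
```

The data is stored once; only the scope passed to `view()` changes what a reader sees, which is the benefit the answer says you miss if you do not set scopes deliberately.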
[00:51:40] Unknown:
Probably something along those lines. And looking forward, as you continue to work on the project and build it out, what are some of the things that you have planned for the future that you're excited about?
[00:51:50] Unknown:
Probably, well, talking about Python. We were talking before the podcast about Python and how Python has grown in so many directions, and the amount of machine learning libraries that are now available for Python. So I could quite easily make it possible in Contextualize to do this: you create a topic, you put in your topic text, a document of sorts, and then you could have Contextualize actually extract entities from that document. So topic extraction. And it could automatically create those topics and then automatically, obviously, link those topics to the current topic, the one it has used to extract those topics. So that is something I've been thinking about: programmatic generation of topics based on things like topic extraction. Also, text summarization.
So if you've got a complex piece or a very long piece of text in a specific topic, you could say, okay, just give me the summary. Well, you now have the libraries available to give you a summary. So those kinds of things. I'm really thinking about, okay, what can I do? Because using Contextualize now, it's quite a manual process. You have to create the topics. You have to create the relationships between those topics. You have to upload or attach resources to those topics, so it is quite a manual thing. I think that's quite a good thing, but it's still quite a manual thing. It would be nice, and I think potentially beyond nice, it would be useful if there could be a bit more intelligence in Contextualize, like automatic creation of topics and the relationships between those topics using machine learning to do so. So, yes, that's probably one of the next things I have in mind for 2021.
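To make the topic extraction idea concrete, here is a deliberately naive sketch of the pipeline described above: pull candidate entities out of a topic's text, then link each one back to the source topic. A regular expression over capitalised words stands in for a real machine learning or NLP library, and the function names, the `mentions` association type, and the stopword list are all hypothetical. Nothing here reflects a planned or existing Contextualize implementation.

```python
from __future__ import annotations

import re

# Single capitalised words that are usually not entities (tiny illustrative list).
STOPWORDS = {"The", "This", "That", "She", "He", "It", "They", "We"}

# One or more capitalised words in a row, e.g. "Analytical Engine".
CAPITALISED_RUN = re.compile(r"\b[A-Z][a-z]+(?: [A-Z][a-z]+)*\b")


def slugify(name: str) -> str:
    """Turn a display name into a topic identifier."""
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")


def extract_candidate_topics(text: str) -> list[str]:
    """Naive stand-in for entity extraction; keeps first-seen order, no duplicates."""
    seen: set[str] = set()
    candidates: list[str] = []
    for match in CAPITALISED_RUN.finditer(text):
        name = match.group(0)
        if name in STOPWORDS or name in seen:
            continue
        seen.add(name)
        candidates.append(name)
    return candidates


def link_to_source(source_topic: str, names: list[str]) -> list[tuple[str, str, str]]:
    """Build (type, source, destination) associations for the extracted topics."""
    return [("mentions", source_topic, slugify(name)) for name in names]


text = (
    "Ada Lovelace worked with Charles Babbage on the Analytical Engine. "
    "She published her notes in 1843."
)
names = extract_candidate_topics(text)
associations = link_to_source("history-of-computing", names)
```

In practice the extraction step is where a proper library would be swapped in; the surrounding steps of creating topics and linking them to the current topic would stay much the same.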
[00:53:44] Unknown:
Are there any other aspects of the work that you're doing on Contextualize or the overall space of topic modeling and knowledge management that we didn't discuss that you'd like to cover before we close out the show?
[00:53:56] Unknown:
No. I think we've discussed the majority of things. I think a lot of people could benefit from, not necessarily Contextualize per se, but these kinds of applications. So I really encourage people, especially, I mean, and who isn't nowadays? So many of us are so-called knowledge workers. We have so much information at our disposal. We are in what's called this info glut. We have too much information potentially available to us. So having an approach and the tools to be able to manage that and to be on top of that, I really just recommend it to people. Not necessarily Contextualize, but use something like Notion. Use something like TiddlyWiki.
Use something like Roam Research. Use something like Contextualize. I think a lot of people, more than they would expect, can benefit from these kinds of applications.
[00:54:43] Unknown:
Well, for anybody who wants to get in touch with you or follow along with the work that you're doing, I'll have you add your preferred contact information to the show notes. And with that, I'll move us into the picks. And this week, I'm going to choose the tool Pydantic. I started using that recently on one of my projects, and I've been enjoying using that for being able to constrain the space of available options for inputs to different functions. So it's definitely a great library for just building out specifically typed objects so that you can pass them through your program. And then in conjunction with that, I've been using mypy just to handle the simple errors that I make in terms of, you know, this function doesn't accept this type of thing. So Pydantic and mypy have been great for some of the recent development I've been doing. Definitely recommend checking those out. And with that, I'll pass it to you, Brett. Do you have any picks this week?
[00:55:36] Unknown:
A difficult one. But I think a subject that is touching all of us, and I mean globally, is what is happening now all over the world. The killing of George Floyd, the Black Lives Matter movement. This is hugely relevant for all of us. As a white heterosexual guy, I'm very aware that I'm playing the game of life in easy mode. My daughters, just because they will be women, will be playing the same game in a more difficult setting. People of color are playing the game of life in an even more difficult setting. And this is not how things should be. And we need to progress in our societies and in our communities and we need to treat people fairly. So this is something that has to be hugely important for all of us. I just wanted to say that. It is happening. We're seeing it happening specifically in the US now, but people are protesting all over the world. Even in the small little town that I'm living in, in the north of Norway, people are very aware of this. They are protesting or making themselves heard, and I think it's a good thing.
We we have to make sure that this is important for us. We have to make progress on this. People have to be treated fairly. Everyone has to be treated fairly. That's it.
[00:56:52] Unknown:
Definitely something worth calling out.
[00:56:55] Unknown:
I hope so. I hope so. Sorry, I don't want to be political. I really don't. But I think this is something that really is important for all of us.
[00:57:02] Unknown:
Yes. Absolutely. Thank you. Well, thank you very much for taking the time today to join me and discuss the work that you've been doing on Contextualize and in the broader space of topic modeling. It's definitely an interesting project and an interesting problem domain. So I appreciate all the work that you put in there, and I hope you enjoy the rest of your day. Thank you very much, Tobias. Thank you very much. Thank you for listening. Don't forget to check out our other show, the Data Engineering Podcast at dataengineeringpodcast.com for the latest on modern data management.
And visit the site at pythonpodcast.com to subscribe to the show, sign up for the mailing list, and read the show notes. And if you've learned something or tried out a project from the show, then tell us about it. Email host at podcastinit.com with your story. To help other people find the show, please leave a review on iTunes and tell your friends and coworkers.
Introduction and Guest Introduction
Brett Kromkamp's Background and Introduction to Python
Introduction to Topic Maps
Contextualize: Overview and Motivation
World Building and Knowledge Management
Implementation and System Architecture
Lessons Learned and Simplification
User Interaction and Flexibility
Team Collaboration and Knowledge Management
Getting Started with Contextualize
Future Plans and Enhancements
Closing Thoughts