Summary
Adding translations to our projects makes them usable in more places by more people which, ultimately, makes them more valuable. Managing the localization process can be difficult if you don’t have the right tools, so this week Michal Čihař tells us about the Weblate project and how it simplifies the process of integrating your translations with your source code.
Brief Introduction
- Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
- I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable.
- When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for running your awesome app.
- You’ll want to make sure that your users don’t have to put up with bugs, so you should use Rollbar for tracking and aggregating your application errors to find and fix the bugs in your application before your users notice they exist. Use the link rollbar.com/podcastinit to get 90 days and 300,000 errors for free on their bootstrap plan.
- Visit our site to subscribe to our show, sign up for our newsletter, read the show notes, and get in touch.
- To help other people find the show you can leave a review on iTunes or Google Play Music, and tell your friends and co-workers.
- Join our community! Visit discourse.pythonpodcast.com for your opportunity to find out about upcoming guests, suggest questions, and propose show ideas.
- Your host as usual is Tobias Macey and today I’m interviewing Michal Čihař about Weblate.
Interview with Michal Čihař
- Introductions
- How did you get introduced to Python?
- Can you explain what Weblate is and the problem that you were trying to solve by creating it?
- What are the benefits of using Weblate over other tools for localization and internationalization?
- One of the advertised features of Weblate is integration with git and mercurial. Can you explain how that works and what a typical translation workflow looks like both for a developer and a translator?
- Given that part of the focus for the tool is to allow for community translation, how do you simplify the experience for first time contributors?
- I understand that Weblate is written as a Django application. Is it possible to use Weblate with other Web frameworks or non-web projects?
- Can this be used with projects implemented in other programming languages? Are there any capabilities that are lost in this scenario?
- Why should developers and product managers be concerned with localizing an application? How does Weblate help to reduce the level of investment necessary for such an undertaking?
- What are some of the biggest difficulties that you have encountered while building and maintaining Weblate?
- What are the most common problems that you see people encounter on both the translator and developer side when dealing with internationalization and localization?
Keep In Touch
Picks
- Tobias
- Michal
Links
The intro and outro music is from Requiem for a Fish by The Freak Fandango Orchestra / CC BY-SA
Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. When you're ready to launch your next project, you'll need somewhere to deploy it, so you should check out Linode at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for running your awesome app. You'll want to make sure that your users don't have to put up with any bugs, so you should use Rollbar for tracking and aggregating your application errors to find and fix the bugs in your application before your users notice they exist. Use the link rollbar.com/podcastinit to get 90 days and 300,000 errors tracked for free on their bootstrap plan. You can also visit our site at www.podcastinit.com to subscribe to our show, sign up for the newsletter, read the show notes, and get in touch. And to help other people find the show, you can leave a review on iTunes or Google Play Music and tell your friends and coworkers. You can also join the community at discourse.pythonpodcast.com for your opportunity to find out about upcoming guests, suggest questions, and propose show ideas. Your host as usual is Tobias Macey, and today I'm interviewing Michal Čihař about Weblate. So, Michal, could you please introduce yourself?
[00:01:16] Unknown:
Hello. I'm Michal Čihař, and I'm currently a freelancer working on free software. So that means that I basically work on what I like, and one of the things is Weblate, which we'll be talking about today.
[00:01:30] Unknown:
It's always nice when you can get paid to work on the things that you enjoy. So how did you first get introduced to Python?
[00:01:37] Unknown:
I don't really recall the details, but I think it was something like 15 years ago when I worked for one company which developed websites, and they were using Kuzzle at that time for it. And that's basically where I started with Python. The system was really a nightmare from all points of view. It had really a lot of historical stuff, and that was not really a pleasant experience. So I basically didn't use Python much for 5 years after that. And, yeah, somehow then I started to like it because it seemed like a language which is easy to use. And, yeah, then I really started to use it.
[00:02:20] Unknown:
Yep. So can you share a little bit about what Weblate is and the problem that you were trying to solve when you created it?
[00:02:29] Unknown:
Weblate is basically a tool for adding translations online. And the main motivation behind it was to make this process as seamless as possible. In some other open source projects I'm involved with, we used different tools for translating, and it always involved quite a lot of manual work. And, yeah, I hate to do that. If there's something you can automate, you should automate that. And this was certainly one of the things which sucked and where we had problems with merging our translations from the translation service back to our code base, as somebody always forgot to do that when making changes in the code, and then we had diverging files, which had to be merged manually. And, yeah, this was really the first main motivation to start with Weblate. Over time, some new features were added, and we now have a quite comprehensive code base which covers most of what people need when translating, but this was the little thing which started it.
[00:03:36] Unknown:
Yeah. It's definitely a pretty impressive list of features that you've got on the website. So you mentioned that one of the primary motivating factors was wanting to have the integration with version control because it definitely seems like that's one of the aspects of the project that makes it stand out from some of the other offerings.
[00:03:53] Unknown:
Yeah. That's really what we lacked, and I really started with that, like, let's see if I can write something like that in a few days. At that time, I worked at SUSE, and one nice thing which they have is that there are one or two weeks in a year which you can spend on anything. And this was where I started hacking on Weblate. I had one week of time and I wanted to try if I could implement something which would work for us. And, yeah, in 5 days, I had working code, which for most cases worked better than all the solutions we had tried before. And that's how it started.
[00:04:31] Unknown:
And are there any other projects that have started up in the time since you first created Weblate that might have sort of precluded the need for you to build the Weblate project?
[00:04:41] Unknown:
I don't think there is anything which has a similar version control integration as Weblate. You can script it with some of the tools, but I don't think any of them is built around version control as Weblate is.
What are some of the other features that you've added in after the initial version that you think are worth calling out and that really make it stand out from some of the other options that people might still be using?
Well, that really depends on point of view. The initial version didn't have many of the things which are really necessary for bigger services, like the current Hosted Weblate, where we offer hosting for free software projects and for commercial ones as well. So we really needed to add some access control and stuff like that, which was not there in the initial version, but that's not really what would differentiate Weblate from other tools. One of the things which is probably nice is that we really try hard to display as much context as possible to translators. This depends on the translation format which is used. For example, gettext PO files have quite a lot of context in them, and it's important to present it to the translators.
But if you look at, for example, Android translations, there is really nothing extra in there. You have just an ID and the text, and there's not much to display. So we have added options to add context within Weblate as well, so that people can upload screenshots and see where the translation is being used. That helps translators quite a lot, because they see the whole window or page or whatever where the string is being used, and they can get a better feeling for how the translation should be done.
[00:06:28] Unknown:
And for somebody who wants to use Weblate, can you describe what the integration with the version control systems looks like and how that works with a typical translation workflow, both on the side of the developer and the translator?
[00:06:42] Unknown:
There are two sides of the integration. One is that Weblate needs to know what's going on in the development repository. So we need to set up some hook that will let Weblate know that there's something going on, and the other side is allowing Weblate to push to the repository. If you use some well-known hosting service like GitHub, Bitbucket, or GitLab, we already have integrations for that. So it's just a matter of adding a URL to webhooks or whatever it's called in the given service, and Weblate will get all the information out of that. For pushing, it's just a matter of adding the user, or we also have support for creating GitHub pull requests from the translations. So if you want to review them, you can get them as a pull request and merge them when you are ready for that.
[00:07:34] Unknown:
So you can have Weblate automatically open a pull request with the translations that are involved?
[00:07:39] Unknown:
Yeah. All translation changes done in Weblate will appear in the pull request.
[00:07:47] Unknown:
And when I was reading through the documentation, I noticed that it has support for being able to add translations across multiple branches. So I'm wondering what that looks like in terms of the interface from the translator side and any sort of difficulties that you encountered while you were trying to work on that functionality.
[00:08:05] Unknown:
This works in a way that both branches are added to Weblate as separate translation components. And there's an option in Weblate to make it keep the translation of the same string consistent across the components within one project. So this can be used for translating branches, or it can also be used for translating different parts of the project and ensuring that you really use the same translation for the same string. It can be that there is the documentation and the project or the program itself. And this way you can keep them in sync and use the same translation in both parts.
So it really can be used for multiple situations, not just the branches. But the branches were the original use case we needed to solve.
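The consistency option described in this answer can be pictured with a small sketch. This is not Weblate's implementation: the `project` dictionary, the component names, and the `propagate_translation` helper are all hypothetical, chosen only to illustrate reusing one translation for the same source string across the components of a project.

```python
from typing import Dict, List

# Hypothetical model: each component maps source strings to translations
# (an empty string means "not translated yet").
Components = Dict[str, Dict[str, str]]


def propagate_translation(
    components: Components, source: str, translation: str
) -> List[str]:
    """Store `translation` for `source` in every component that contains it.

    Returns the names of the components that were updated.
    """
    updated = []
    for name, units in components.items():
        if source in units:
            units[source] = translation
            updated.append(name)
    return updated


# Two branches and a documentation component of one imaginary project.
project = {
    "master": {"Save": "", "Cancel": ""},
    "stable-1.0": {"Save": "", "Open": ""},
    "docs": {"Cancel": ""},
}

# Translating "Save" once updates both branches that contain that string.
changed = propagate_translation(project, "Save", "Uložit")
```

Components that do not contain the source string (here `docs`) are left untouched, which matches the idea of keeping identical strings in sync rather than copying whole files.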
[00:08:56] Unknown:
And for the developers who want to add translation support to their projects, what does the workflow look like in terms of how they structure their code or how they identify the strings that are eligible for translation?
[00:09:12] Unknown:
This really depends on the language or environment you choose. For example, most of the environments for building mobile applications already have something built in, so you use whatever is native on the platform. For both Android and iOS, it's already built into the default stack, so you will just stick to what's there. And if you build an application from scratch, usually it's a good idea to stick to something well known like gettext, or to use the native thing which the libraries you use have, like the Qt translation system for Qt. Usually, the worst thing you can do is invent your own way of doing translations, because you will just repeat mistakes which were already made many times before, and it will not bring you any benefits unless you are a really strong player in the market and you can push the format to be more widely spread, like Mozilla is doing with something, but generally this really doesn't work, and you just create a barrier for the translators, because they cannot use the tools they are used to, and it makes things unnecessarily complex.
[00:10:26] Unknown:
And for being able to add additional context, because I know that you mentioned in the documentation that you can actually add things like source code comments to explain what the purpose of the translation is and to give some more feedback in terms of any sort of additional context that the translator might need to be able to make sure that what they're providing is as accurate as possible. So does that still fit in with the same sort of standard tooling for translation?
[00:10:57] Unknown:
This depends quite a lot on what translation framework you use. I think this is supported only in a few of them. It's definitely supported in gettext. It's supported in XLIFF, but that really depends on how you generate the XLIFF files. And I think the Qt translation format does support this as well, but this definitely doesn't work on things like Android or iOS, because the file format itself doesn't have a way to store this information at all.
[00:11:30] Unknown:
And how much technical knowledge is required on the part of the people providing the translations? Does Weblate abstract any of the more complicated aspects of the workflow that they would otherwise need to be aware of if they were using different tooling?
[00:11:47] Unknown:
Weblate is like any editor for doing translations. So anybody who has experience with doing software translation should be just fine with that. A person who has not done translations before should still feel familiar, because you just have a page where you have the source string and a text area where you enter the translation, and that's the most important thing. And you have just some surrounding information where you can dig into the technical details, but you really don't need to. It's just a simple workflow where you enter the translation in the end.
[00:12:25] Unknown:
And is the primary focus for Weblate on projects that are looking to crowdsource translations? Or is it just as widely used by, for instance, companies that have in-house translators who will be working on the code base?
[00:12:39] Unknown:
It's used by both of them. Of course, for the free software projects, it's mostly the crowdsourcing. And the companies are usually using it for in-house translators. Most of the time, they have their own installation of Weblate and use that in house. And I really don't know about most of the users. So that's it.
[00:13:04] Unknown:
And for the open source projects where you're looking for contributions from the community, what's the setup necessary for somebody to actually start contributing? And is it simple enough and fast enough for somebody to be able to do sort of a drive-by contribution where they just translate one string and don't necessarily engage long term with the project?
[00:13:25] Unknown:
Yeah. In case we are talking about Hosted Weblate, which is the service I provide for free software projects, it's just a matter of registering there, and you can start translating. The registration can be done with a GitHub, Google, or Facebook login. I think pretty much everybody has at least one of these, so it's just a matter of a few clicks, and you can contribute. As a not logged-in user, you can also give suggestions, which are not directly committed to the version control, but somebody has to review them. You really don't need to authenticate to do that. So that works even without the login, but somebody has to review your work then.
[00:14:10] Unknown:
And for the hosted version, at least, if somebody does register and create a profile on that service, are they able to then use that same profile across multiple different projects?
[00:14:20] Unknown:
Yeah. It's shared for the whole service. So you can translate any project, which is there, even the commercial ones, if you get access to them.
[00:14:29] Unknown:
I'm sure that that definitely simplifies things for people who are interested in providing translations, but aren't necessarily committed to a given project, where they might be browsing through to see what translations are needed. And they may have a suggestion or a solution for one particular passage of text, but don't necessarily want to create an account just for the purpose of the one project.
[00:14:53] Unknown:
I have seen quite a lot of people who started to contribute because they wanted to translate one project, and later they started to contribute to others. It quite often works in a way that they first contribute to the project they originally wanted, and then they start to translate Weblate as well. So that's why we have something like 20 translations right now. So that's one of the side effects of the hosting, that we get the translations as well.
[00:15:20] Unknown:
So moving over to the more technical aspect of it, I know that you wrote this as a Django application. So I'm wondering what contributed to that decision and what sort of benefits that provides.
[00:15:32] Unknown:
By the time I started to write Weblate, it was one of the things I knew, and it seemed to be a good fit. It still helps in quite a lot of things. It's a complete ecosystem, so you can easily integrate other things like authentication from third-party services, which is just a matter of installing or adding some module, and you really don't have to look into the details of that. If we implemented the complete solution ourselves, it would be much more tricky to do all these things. Also, the ORM is really helpful in dealing with the database. There are probably some other frameworks which would work well for Weblate too. In the end, it really doesn't do anything very special. It's more or less just working with the database.
We have some things which really don't fit into the relational database, so we are doing some abstraction layer on that, but that's still something which would happen in any framework. So I don't think that's something to blame Django for. It's the way we had to implement some features. I still think it's one of the best approaches we could have taken. The only thing which is not that pleasant is the installation procedure, which is not as straightforward as it could be. And maybe we could do something with that, but we are really sticking with the Django way of doing things for now, and it's too complex for somebody who has not seen Django before.
[00:17:11] Unknown:
Yeah. I noticed that you provide a Docker container for being able to get people up and running with it more quickly so that they don't necessarily have to understand how to actually do the full setup of a Django application.
[00:17:23] Unknown:
Yeah. That's exactly the reason we started with Docker, because that way you can just get an image which is working, and most of the things should work out of the box. It's still not perfect. It could use some improvements, but most of the things are working out of the box and you can play with Weblate and try if it works well for you or not. And if it works, you can tweak things later. But still the basic setup works really easily.
[00:17:52] Unknown:
And what are some of the complications that you've had to overcome in terms of being able to integrate with the different types of translation systems, like gettext and Android translation strings and things like that?
[00:18:03] Unknown:
Yeah. In the beginning, I decided that I'm not going to solve this at all, and I'm going to use the Translate Toolkit library, which does this. Well, that was just the beginning, and I didn't really look at how Translate Toolkit does things. And it turned out that its API is totally inconsistent across different translation formats. So we had to build an additional layer on top of that to hide these details. And it works quite well now, but it's not really something I would be happy about. I'm contributing to Translate Toolkit to make it better or more consistent, but in some cases it's a little tricky, because they have already built quite a lot of tooling around the existing libraries, and they don't want to change the API. So it's like, yeah, the API is not consistent, but we want it this way.
So some layer will still be needed in Weblate which hides these things and makes it behave consistently. At least from our point of view; it might be different for other guys. So I'm not really blaming them for that. It's just that it's another layer of complexity for us.
[00:19:21] Unknown:
And speaking as someone who doesn't have much experience working with localization and internationalization, are there any particular feature trade-offs between the different available underlying systems that would encourage somebody to use one over the other? I mean, obviously, for things like an Android project, you would want to use the native functionality built in. But for other sorts of situations, does GNU gettext have any advantages over some of the other options? And what are some of the considerations that people should be thinking about when they are making that choice?
[00:19:56] Unknown:
Yeah. I think the most visible feature which is not always present is handling plural forms. In English, you just add s to the end of the noun and it's done. But in other languages, it's more complex. In Czech, we have 3 plural forms. In Arabic, they have, I think, 7. So it's something that the translation format should understand. And it's done properly in gettext. It's done properly in the Qt translation system. It's done properly in Android, but I'm not sure about iOS. Right now, I don't really know. And if you're using something like JSON, it's simply not there. You will really get wrong plurals for people not speaking English, because there's really no way to do it with just simple key-value storage formats.
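To make the plural-forms point concrete, here is a minimal sketch in Python. The two selector functions mirror the conventional gettext plural rules for English (`n != 1`) and Czech (`(n==1) ? 0 : (n>=2 && n<=4) ? 1 : 2`); the small `CZECH_FILES` catalog and the helper names are hypothetical and exist only for illustration.

```python
def english_plural_index(n: int) -> int:
    """English has two forms: singular for exactly 1, plural otherwise."""
    return 0 if n == 1 else 1


def czech_plural_index(n: int) -> int:
    """Czech has three forms: 1, then 2 to 4, then everything else."""
    if n == 1:
        return 0
    if 2 <= n <= 4:
        return 1
    return 2


# Hypothetical catalog entries for the message "%d file(s)".
ENGLISH_FILES = ("%d file", "%d files")
CZECH_FILES = ("%d soubor", "%d soubory", "%d souborů")


def file_count_en(n: int) -> str:
    return ENGLISH_FILES[english_plural_index(n)] % n


def file_count_cs(n: int) -> str:
    return CZECH_FILES[czech_plural_index(n)] % n
```

A flat key-value format with a single translated string per key has nowhere to store the second and third Czech forms, which is the limitation described for plain JSON files.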
[00:20:52] Unknown:
And is there any functionality that you aren't able to easily expose in a generic way through your adapter over the translate toolkit?
[00:21:03] Unknown:
Yeah. There are a few features we don't support right now, but these can be worked around in the translation format. One of such things which just came to my mind is translating lists. I think it's called string array in Android, because it's just an array of strings, and it's not really exposed by Translate Toolkit, and that way we don't get these things at all. And there are probably some more features like this, which are not that widely used and which we don't implement, but really the basic or the most widely used functionality should work for all the formats.
[00:21:41] Unknown:
And for somebody who's working on building out an application, are there any reasons why they would not want to consider adding translation support, particularly given the availability of tooling to help simplify the process?
[00:21:58] Unknown:
It really depends on where you target your application. If you are, for example, a British bank and you are developing an application for customers, you really don't need to consider localization because, you know, all your customers will speak English and it really doesn't matter. So it really depends on the target audience. But if you are developing something which should be generally used worldwide, I don't think there is much reason to skip the localization, because if you skip it in the beginning, you will get some feature requests later to add it, and it will be much more work to do it later than if you start with it from the beginning of the project.
[00:22:34] Unknown:
Yeah. And what does the initial startup cost for adding translation to a brand new project look like for somebody who's never done it before?
[00:22:43] Unknown:
They need to learn how to work with the system, and the cost really depends on the tooling and platform you are using. I will get back to the Android example. With that, you get translation out of the box, and you actually have no way to develop an application without translations, because it's already there, and you need to use the same system even if you don't translate the application. So it's simply there and you cannot avoid it. If you are writing some program in Python, you need to put some effort into calling gettext and building the translation files when doing the release. So that's definitely something you need to learn: look at projects which already have it integrated and base your code on that.
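As a rough illustration of what "calling gettext" looks like in a Python program, here is a minimal sketch using the standard library `gettext` module. The domain name `myapp` and the `locale` directory are assumptions made up for this example; in a real project the compiled catalogs would typically be built from the PO files (for instance with the GNU `msgfmt` tool) as part of the release.

```python
import gettext

# "myapp" and the "locale" directory are hypothetical names. With
# fallback=True, a missing catalog yields NullTranslations, which simply
# returns the original English strings instead of raising an error.
translations = gettext.translation(
    "myapp", localedir="locale", languages=["cs"], fallback=True
)
_ = translations.gettext


def greeting() -> str:
    # Strings wrapped in _() are looked up in the catalog at run time.
    return _("Hello, world!")


def unread_messages(count: int) -> str:
    # ngettext selects the plural form appropriate for the active catalog.
    template = translations.ngettext(
        "%d unread message", "%d unread messages", count
    )
    return template % count
```

Because of the fallback, the program keeps working in English until a translation catalog is actually shipped, so the translation calls can be added from the start of a project at little cost.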
[00:23:36] Unknown:
What are some of the common issues that people run up against when they are first starting to work with translations? One of the things that I'm thinking of off the top of my head is the right-to-left language support. So what are some ways that people can sort of educate themselves about things to be aware of?
[00:23:52] Unknown:
Yeah. I think the first thing is don't make any assumptions based on your language, because there are many things which don't work worldwide. One of the things is right to left versus left to right, but there are other things, like the plurals I've already mentioned. Don't assume that everybody has a first name and a surname, because it doesn't work like that in many countries. It's not really about doing the translations, but you really should look at quite a lot of assumptions which are usually there, but are not always valid.
[00:24:25] Unknown:
And what are some of the biggest issues that you've come across while building and maintaining Weblate and how did you overcome them?
[00:24:33] Unknown:
I would say it worked quite well, so I don't really recall any big hurdles. One thing which we currently face is that the solution we have chosen for full-text search and machine translation handling is not really working well in some cases, and it's not really easy to figure out in which cases. So that's why the full-text search is currently not working as well as we hoped, and we'll need to look at it and find some other solution, or fix the existing one if it's a bug in the way we have implemented it. But that's something which is broken right now and will not be easy to fix.
[00:25:18] Unknown:
And when you say machine translation, are you referring to things like the Google translate API? Does that mean that you could, for instance, expose the strings that need to be translated and then automatically use something like the Google translate API to get an initial pass of them and then offer up those for review and correction by native speakers?
[00:25:37] Unknown:
Oh, well, let me add something to the previous answer first. It was about the Weblate internal machine translation. So that's when we do suggestions based on the things which are already translated within Weblate. But, yeah, we also have support for using Google or Microsoft Translate. I think there are a few more services supported. It's currently not automated. When translating, you will get suggestions from these services at the bottom and you can use them. We have done it this way because usually the machine translations are not really nice and it's always good to review them. So somebody really needs to go through them and either accept them or do some editing to make them look like they came from a native speaker and not like a random collection of words, which together might give the same sense but are not really ordered properly, like Google Translate sometimes does.
[00:26:41] Unknown:
Yeah. I think that anybody who has experimented with any of the translation services has come across the case where you feed your sentence through and it comes out on the other side looking barely recognizable. And, you know, you can kind of get the idea of it, but it definitely doesn't feel native. For anybody who hasn't done this, I recommend taking a sentence in your native language, translating it to a different language, and then back, and then seeing what it looks like.
[00:27:09] Unknown:
Yeah. That really helps. And in some languages, you even get a completely opposite meaning than the original. So that's something which you shouldn't rely on, and I still don't think it's a good idea to do something like that automatically and give it to your users as a translation because it really looks weird then.
[00:27:27] Unknown:
Yeah. It can help you when you're trying to figure out how to communicate with somebody in person, and you can use body language to help provide additional context. But, yeah, when you're trying to incorporate it into a project that's going to be used by other people, most likely in a professional context, it definitely would look fairly amateur to include translations that are automatically generated. So is there anything that developers can do to make it easier for translators to make sure that they have all the information that they need to provide accurate translations, or even to simplify the overall process of contributing the translations?
[00:28:02] Unknown:
Yeah. There are several mistakes which developers often make because they are not obvious in English and cause problems in other languages. One of the things is when you don't translate a whole sentence, but you concatenate the strings, so that you translate the first half of the sentence, then you insert, for example, a name, and then the second part of the sentence. It usually doesn't work well, especially if you use the same half of the sentence in different contexts, because many languages have different genders, and the translation should simply be different based on what you are translating.
This is one of the things which Mozilla is trying to solve with L20n, which is their new localization format which covers all these details. But on the other side, it covers so many details that it's almost a programming language. So it's something which is not really easy to handle for somebody who has not seen it before, but still it's a generic solution for the problem. In most cases, you are good with just translating a complete sentence or paragraph where you talk about a single thing, and not trying to concatenate parts of sentences or words in the code. That's one of the problems I have seen several times, and it really causes headaches for translators, just because they have no way to translate it properly then.
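The concatenation problem is easier to see in code. The sketch below is only an illustration: the `CATALOG_DE` dictionary, the `translate` helper, and the example sentence are hypothetical stand-ins for a real translation catalog. It contrasts gluing translated fragments together with translating one complete sentence that contains named placeholders.

```python
from typing import Dict

# Hypothetical German catalog. The whole sentence is one entry, so the
# translator is free to reorder the placeholders.
CATALOG_DE: Dict[str, str] = {
    "File {filename} was uploaded by {user}.":
        "{user} hat die Datei {filename} hochgeladen.",
    "File ": "Datei ",
    " was uploaded by ": " wurde hochgeladen von ",
}


def translate(message: str, catalog: Dict[str, str]) -> str:
    """Return the translation, or the original message if there is none."""
    return catalog.get(message, message)


def upload_notice_concatenated(filename: str, user: str, catalog: Dict[str, str]) -> str:
    # Problematic: the word order of the source language is baked into the code.
    return (
        translate("File ", catalog)
        + filename
        + translate(" was uploaded by ", catalog)
        + user
        + "."
    )


def upload_notice(filename: str, user: str, catalog: Dict[str, str]) -> str:
    # Preferred: translate the complete sentence, then fill in the placeholders.
    template = translate("File {filename} was uploaded by {user}.", catalog)
    return template.format(filename=filename, user=user)
```

With the complete-sentence version, the German translation can put the user first, while the concatenated version forces every language into the English word order and hides the surrounding context from the translator.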
[00:29:34] Unknown:
K. Are there any particular areas where you're looking for help and contributions in the Weblate project?
[00:29:41] Unknown:
Yeah. We always welcome code contributions, translations, documentation, whatever. We also have some crowdfunding efforts. So if you like our service, you can donate to us. The free software hosting is really something which costs us some money, but we want to do it. So that's basically where all the money we get goes, so that we will be able to host more projects and do more translations.
[00:30:10] Unknown:
So for anybody who wants to follow what you're up to and get in touch, what would be the best way for them to do that?
[00:30:19] Unknown:
Yeah. Our website is weblate.org, and we have Facebook, we have Twitter, all the usual stuff. Our code is hosted on GitHub, so it's pretty easy to reach us there as well. And, yeah, that's the most important thing.
[00:30:31] Unknown:
So with that, I'll move us into the picks. For my pick today, I'm going to choose the movie War Dogs, which I watched recently. It's a movie based on the true story of a couple of young guys who end up providing arms to the US military through their online bidding system, and just sort of the story of what happens. So it's pretty interesting. It's kind of surprising to see some of the realities behind war. I definitely enjoyed the movie. It's fairly thought provoking. So for anybody who's interested, I definitely recommend taking a look. With that, I will pass it to you. Do you have any picks for us today, Michal?
[00:31:14] Unknown:
Okay. So my recent pick would be Jordi's chocolate. I was at a chocolate tasting with them recently, and I really enjoyed it, because they are one of the few in Europe who make chocolate directly out of the cocoa beans rather than from something produced far away, and that's really one of the nicest chocolates I've had.
[00:31:37] Unknown:
Great. Well, I really appreciate you taking the time out of your day to join me and talk about Weblate. It's definitely an interesting project and one that I hope sees continued use. It's definitely something that I will try to push for at my work, to add internationalization and localization to the projects that we build. So I appreciate it, and I hope you enjoy the rest of your day.
[00:32:01] Unknown:
Thank you for interviewing me, and I hope we will get some new users thanks to this interview.
Introduction and Sponsor Messages
Interview with Michal Čihař: Introduction and Background
Introduction to Weblate and Its Motivation
Key Features and Integration with Version Control
Translation Workflow and Contextual Features
Adding Translation Support to Projects
Technical Aspects and Challenges of Weblate
Feature Trade-offs and Considerations for Translation Systems
Best Practices for Developers to Support Translators
Contributions and Community Support for Weblate
Closing Remarks and Picks