Best free podcast transcript generator

If you need to transcribe fewer than 100 hours of audio, I think the best option is AssemblyAI, because the first 100 hours are free.

And even if you have more than 100 hours of audio, for about 60c per hour (yes, per hour and not per minute), you can get a lot of features out of the box:

  • Very high speech-to-text accuracy
  • The ability to improve accuracy on very technical topics by passing a custom vocabulary in your API call
  • Speaker diarization (who speaks during which time segment)
  • Decent quality auto chapters (in other words, subheadings generated for each topic)
  • Decent quality keyword detection
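
To put the pricing above in concrete terms, here is a rough cost estimate. It is only a sketch: the function name is made up for illustration, the rate is the approximate 60c per hour quoted above, and it assumes that only the hours beyond the free 100 are billed, which you should check against the current pricing page.

```python
def estimate_cost_usd(audio_hours: float,
                      free_hours: float = 100.0,
                      rate_per_hour: float = 0.60) -> float:
    """Rough transcription cost in USD.

    Assumption (not confirmed by the article): only the hours beyond the
    free allowance are billed, at a flat hourly rate.
    """
    if audio_hours < 0:
        raise ValueError("audio_hours must not be negative")
    billable_hours = max(0.0, audio_hours - free_hours)
    return round(billable_hours * rate_per_hour, 2)


for hours in (50, 100, 250):
    print(f"{hours} hours of audio: about ${estimate_cost_usd(hours):.2f}")
```

Under these assumptions, a back catalogue of 250 hours would cost roughly as much as a few cups of coffee.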

I will use an example RealPython podcast episode to show how each of these features improves the podcast transcript.

How to generate a podcast transcript using AssemblyAI

The first step is to generate a basic podcast transcript without any of the additional features.

I chose “Episode 184: PyCoder’s Weekly 2023 Wrap Up” because it has several properties that make it a good test case:

  • there are multiple topics already covered in the episode
  • most topics are different from each other (that is, there is no overarching topic)
  • the topics do not overlap with each other – that is, they are very clearly delineated in the audio itself, which is quite unusual for a podcast episode
  • there is plenty of technical jargon

Here is how you get the transcript from the URL of the MP3 file:

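import json
import os

import assemblyai as aai
from dotenv import load_dotenv

# Read ASSEMBLYAI_API_KEY from a local .env file
load_dotenv()
aai.settings.api_key = os.getenv('ASSEMBLYAI_API_KEY')

audio_url = 'https://files.realpython.com/podcasts/RPP_E184_02_Cx2.01c66af0a50e.mp3'

transcriber = aai.Transcriber()
transcript = transcriber.transcribe(audio_url)

# Stop early if the transcription failed
if transcript.status == aai.TranscriptStatus.error:
    raise RuntimeError(transcript.error)

# Save the full JSON response so it can be inspected later
transcript_json = transcript.json_response
os.makedirs('json', exist_ok=True)
with open('json/response.json', 'w') as f:
    json.dump(transcript_json, f, indent=2)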
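
Once the response is saved, you can inspect it without calling the API again. The sketch below is one way to do that, not part of the original script. It assumes the saved JSON has a `status` field, a `text` field, and a `words` list whose entries carry `end` timestamps in milliseconds; the helper names `summarize_response` and `wrap_text` are made up for this example.

```python
import json
import textwrap
from pathlib import Path


def summarize_response(response: dict) -> dict:
    """Pull a few fields out of a saved transcript response.

    Assumes the response has 'status', 'text' and a 'words' list whose
    items have an 'end' timestamp in milliseconds.
    """
    words = response.get("words") or []
    return {
        "status": response.get("status"),
        "word_count": len(words),
        "last_word_end_s": words[-1]["end"] / 1000 if words else 0.0,
        "text": response.get("text") or "",
    }


def wrap_text(text: str, width: int = 88) -> str:
    """Wrap the transcript to a fixed line width for easier reading."""
    return textwrap.fill(text, width=width)


# Only runs if the file written by the script above is present.
response_path = Path("json/response.json")
if response_path.exists():
    summary = summarize_response(json.loads(response_path.read_text()))
    print(summary["status"], summary["word_count"], "words")
    print(wrap_text(summary["text"])[:500])
```

Wrapping the text is purely cosmetic, but it makes a long single-paragraph transcript much easier to skim in a terminal.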

This is the transcript text that AssemblyAI returns for the episode:

Basic Transcript
Welcome to the Real Python podcast. This is episode 184. It's been a fascinating year for the Python language and community. Pycoders Weekly included over 1500 links to articles, blog posts, tutorials, and projects. In 2023, Christopher Trudeau is back on the show this week to help wrap up everything by sharing some highlights and python trends from across the year. Christopher shares the top five links explored by Pycoders readers. We also dig into trends we noticed across all the articles and stories this year, including removing dead batteries from the standard library, ongoing work to speed up Python using rust code in libraries, and moving away from microservices. We hope you enjoy this review. We look forward to bringing you an upcoming year full of great python news, articles, topics, and projects. All right, let's get started. The real Python podcast is a weekly conversation about using Python in the real world. My name is Christopher Bailey, your host. Each week we feature interviews with experts in the community and discussions about the topics, articles, and courses found@realpython.com. After the podcast, join us and learn real world python skills with a community of experts@realpython.com. Hey Christopher, welcome back. The last episode of 2023. Yeah, I'm excited to do a picoders wrap up. We've never done it before, and I think this will be interesting and let us know what you guys think. I'm interested in covering things like this, like kind of trends and so forth. And it was weird looking at the numbers of how many links that you guys publish every year. When we got into the thousands, I was like, okay. Or at least 1500 or something like that. I was like, wow, okay. A lot of information goes out every week. Yeah, well, we usually do about somewhere around 20 links a week is pretty typical, and 52 weeks a year. So, yeah, it adds up. Cool. So we are going to start with some news, though some things have been happening still here at the end of the year. 
Yeah, the march towards Python 313 steadily carries on. Feels like only a few weeks ago we were announcing 312, but that's how it goes. Python 313 Alpha two is now out. This is the second in seven planned alpha releases. So, yeah, progress happens. Another chunk of news, also from the Python Software foundation, is they've hired a supporting developer in residence. This was almost accidental. You might recall there was an announcement a while back about the position for the deputy developer in residence. Well, they got a bit more funding than they expected for the role, so they were able to hire their second choice for that job as well. So Serhi Storchaka is now getting paid to write code for the PSF. So good. Welcome. And although it sounds like he's been a longtime contributor, so that's nice that people are getting paid to do this. Yeah, Wukus has a little team now. That's pretty cool. That's right. And then a couple of quick release announcements. Django Five has gone gold, so it's time to upgrade there. Get off your release candidate if you're playing around there. And one of my favorite Django third party libraries, which is Ninja, it's a fast API like tool for writing APIs. They just released their 1st 1.0 version, so congrats to both groups for finishing their big releases. Is there much change from the versions or just kind of solidifying and making it version one? I didn't see anything massively changed. They've been on .9 for quite some time, so I think it's probably just been bug fixes and things. But yeah, it's progress. Maybe I'll include a link to your course that you did on. That's right, yeah, we can go learn more about ninja. One of the things that we thought of featuring is out of all those links that were shared of the year, we keep metrics of how many people click on the different things and kind of can get an idea of what is the most popular of all the different links from pycoders across the year. 
And so it looks like you compiled the top five here, and we can kind of go through them and kind of just cover them quickly. I have the top 1500, but we'll stick with the top five. Our illustrious editor in chief at real Python ran set up a query for me and it almost knocked the Heroku server over. Took a little bit, but that's fine. Yeah, so I figure folks might be curious about it. So with apologies to David Letterman, here is the Pycoders top five. All right, number five was a real Python article by Gerard Hilla called Python 312. Cool new features for you to try. This isn't terribly surprising. Lots of our coverage was on Python 312, and there was a fair amount of interest. And if you somehow missed it, there is both a tutorial article and a video course version of this content. So if you want to go back and learn more about 312, that's there. Yes, and a podcast episode about it. That's right. Number four, frequent link inamar Turner Trowaring's article speeding up your code when multiple cores aren't an option. This article covered ways of improving your code speed without touching parallelism. We actually talked about this in episode 176. Yeah. Number three was actually a fairly recent article. This one was by Mike Driscoll called Learning about code metrics in Python with radon. This article covered different ways of measuring your code for quality using the radon library. Interestingly enough, this one was also covered in episode 176. So we must have done something right that day. I don't know how that worked out. Number two is another Python 312 article. This one is Python 312. More intuitive and consistent f strings. And it's by Leodanos, Potzo, Ramos. And as the title implies, it's all about the changes introduced with the new f string parser. Yeah. Popularity of f strings there. That's right. And finally, I don't have a drum roll sound effect, so you'll just have to imagine it. Okay. Design and guidance. Object oriented programming in Python. 
This was actually one of my courses based on a tutorial also by Leodonis. It was the third part in a three part course all about oocoding, with part three focusing on the solid principles. It's always nice to know that some of the content I'm creating is getting viewed. So where we go, I don't know. I got to get my little gold star for number one, I guess. Yeah, good job. Yeah, that whole series was really great, and it was fun working with you to turn it into a set of video courses. And, yeah, it's been very popular. We joked around in the real Python meetings and so forth, that it was sort of the autumn of oop at real Python. There was so many object oriented kind of diving into that area of python. And so that was neat kind of coverage this year. So great. All right, I think that takes us into. What I'm going to talk about are trends across a handful of different articles and include a bunch of links. Some of these we might have touched on, and some of them we haven't, and then also mention sort of trends in some new people posting that we featured a lot this year. And the first one, the trend I wanted to talk about is about batteries. I know that may sound strange, but batteries are definitely one of those things that we talk a lot about in Python. This idea that Python has batteries included, the trends I kind of noticed is this trend toward deprecating these batteries and idea that there are dead batteries. And then there's a couple of articles that talk about there's sort of missing batteries or they would like more batteries. And so it's kind of an interesting idea of these libraries that are additions to Python that can kind of help out in ways that people would love if it was even included by default as part of a built in thing. So the first article I want to feature is actually an article about Python 312, and it's by our newish friend Bytecode from October 3. 
It's Python 312, what didn't make the headlines, and I'll just kind of COVID what's in the article. Briefly, they wanted to feature a handful of things that didn't quite get the high points that were covered in maybe some of real Python's coverage or other people's coverage. And the first one is there was a lot of hay being made about increasing the performance of Python over the last couple of versions. And it was noted that there was a 5% gain in general as far as what was officially sort of shared, and that maybe that was a bit of a letdown. They also ran tests that showed that between eleven and twelve, 311 and 312. It showed only 14 of the tests that they ran being faster and 79 running slower. So in this test suite that they ran, it was not really much of a gain and maybe in some cases might have been slower. So kind of a mixed story on performance, which is kind of interesting. They also talked about pathlib improvements, which I think we talked a little bit about, but Pathlib added a walk feature which is very handy. Better debugging experiences. Not only were the better error messages that we talked about in detail, but there's some new ways that PDB functions, and the article kind of digs into that a little bit. Some new command line interfaces, things that you might not. Many people are aware that there's this HTTP server built into Python, or you maybe aren't aware of it, but you can just type python mhtttp server and boom, you got an HTTP server set up which is really handy for doing different things. 312 added a SQL lite shell that does something similar. And then it also added a UUId. Is it universal? I forget what the extra U is in UUID, but you can just generate one. It stands for very long. Very long. Id. Yeah, exactly. And unique. And then at the very end of it is where I wanted to get to, which was headline, which was deprecations, deprecation, deprecations. 
And things that they call out is Telnet lib two to three being deprecated, date time UTC now. So these are things that are noted as deprecations. Their joke is maybe don't update to the new version on a Friday. So there's lots of these little things that are kind of happening and they link to the Docs section of the 312 release notes which digs into this whole deprecated section. What's interesting is just kind of going through the list and kind of seeing okay, is that going to affect your code and these things that are being removed, these older things that maybe there's new ways that they're being approached. And the other thing that I thought was interesting is the next section kind of goes into this pending removal for 313. And that leads me into the next article which Python 313 removes 20 standard library modules. So again, coming up here in 313. Hence why you may want to pay attention to the upcoming alpha releases. Core developers are busy working on pep five nine four, which is sort of known as the removing dead batteries pep. It was approved. Kind of a long post that goes into the discussion forums. There's also this link is the first post kicking off this thread from Victor Stenner. He's a cordev and it runs through the highlights. But I thought this section on advice was kind of interesting. If your project is affected, you have some different solutions. First one is do nothing for now, only remain compatible with 311 and older. In his opinion, it's dangerous long term choice that the technical debt only becomes more expensive over time. But maybe someone will come with a solution for you in the meanwhile. Next one is attempt to propose recipes and alternatives in the what's new documents and create a group of volunteers and give a new life to the remove module by maintaining them on Pipi. You will be able to use Pip install, then Linux distributions should package it and add it as a new dependency to your projects. 
And the last advice is copy the remove module inside your project and maintain it there. Usually it's a single short py file. If you choose to skip documentation and tests, just be careful about the license and copyright. You're now on your own to maintain it. But this solution is quick and simple. So a handful of approaches, I think all of them have their own baggage potentially there that you need to pay attention to. His next post is thanking Christian heims and Bret cannon for being stubborn and getting this pep five nine four accepted and implemented. So yeah, digging into the batteries being removed and then what I thought was interesting is this trend across picoders over the year though of including links that talked about either missing batteries or maybe the desire for other things that are like this. The first one is missing batteries, essential libraries you're missing out on. Even though Python standard library comes with batteries included, it's still missing some essentials. This article is from Martin Hines, who we feature pretty often. It's from May 1. Again, kind of digging to that idea that batteries are included thanks to its extensive library in Python. Many modules and functions that you would not expect to be there are there, which is really kind of nice. However, there are more that he feels are essential, and he just goes through and lists a whole bunch of them and kind of digs into them in detail. One is a set called bolt ons, which includes json utils, time utils, and iter utils. The json utils adds a set of tools for working with JSON. One that's really unique that I haven't seen before is called json lines, which deals with the JSON JSONL format. There's a date range in the time utils that he highlights that creates date ranges that can be iterated over, which is really kind of a neat functionality. Iter utils, not to be confused with iter tools, which we'll talk about. 
The iterutils has a bunch of really interesting remapping sort of functions for iterators. He talks about a library sh that looks for shell commands and binary packages in your path. It can use pseudo commands, has a bunch of stuff under data validation, a validators library that instead of creating your own reg x formats for things like email addresses or credit cards or IP addresses, that has that stuff kind of built into it, which is nice. There's a library that originally was called fuzzy wuzzy. That's how I had heard of it originally. A coworker that I had back in Hawaii shout out to TJ. He loved that package. He would use it all the time. We worked in the marketing department. And so finding close or close enough matches in text strings and things like that was really important. They've renamed it to the fuzz. It has functions and modules, things like a ratio function, get close matches, sequence matchers, really handy for again, that kind of fuzzy, close enough matching in text. A couple of others that are nice, freeze gun which can freeze time when you're doing testing. And then kind of similar to the fuzz, there's a thing called dirty underscore equals that has like is approx is now is JSON is positive int has all these kind of interesting little different sort of functionality in the library. He also links out to an earlier post that he talks about in 2020. He had one about iter tools and the more iter tools library. And then the other one I want to mention is a post from Carlton Gibson that's called more batteries, please. He's kind of in the hashtag more batteries camp, as opposed to the remove the dead batteries camp, which I think is kind of interesting. More than just including the link, I want to mention that he refers to the series by Doug Hellman that started in the Python two era that was called the module of the week, and it got updated to the Python three module of the week, which I'll include a link to that also. 
And it's just a great way to sort of study what is all inside of python as far as looking at the batteries and the things that are there and all the different modules. And I can't think of a better way to kind of learn some of the deeper areas of python than to kind of look at each know maybe one a week as way of practicing and learning a little bit more about it. And then Christopher and I, we covered in some detail. We had a discussion in episode 171. The title of the episode was making each line of code efficient. And we discussed iter tools. We discussed more iter tools. We talked about enumerate and functions like all and any and stuff like that, ways that you can kind of write code, how there's lots of built in kind of functionality that you're not having to recreate this functionality yourself, that a lot of that's built into Python, and you can write efficient python with very few lines of code. So, yeah, so kind of a trend across a variety of articles there. I don't know if you'd heard of any of those other libraries. I've come across a couple of them, but none of them are common in my toolkit, for sure. Yeah. I think it kind of depends on the way that you work. It sounds like a lot of them would be handy in the testing environment. Those things like the dirty equals and that freeze gun, I thought was interesting as far as freezing time. Yeah. And I think that's sort of always the debate with the, should it be included as a battery or not? Right. I've seen a couple conversations, particularly about the requests library. Yeah. Everyone's like, everyone uses it. Why isn't it just part of the standard library? And the counterargument is because then we can control the release path and it isn't tied to a Python release. And so even if everybody is using it, there's an argument to keep it out. Right. So, yeah, I think that's anytime you have these kinds of things, it's always a trade off. 
That one's a hybrid in a way that I don't want to call it abandoned, but it kind of got taken over, and I think the Python packaging authority is in charge of request now. Oh, really? I hadn't heard that it shifted ownership in some way. I'd have to maybe do the research again, but it's definitely under a different GitHub than originally as far as the path of it and its development. But, yeah, it's crucial. But should it be built in? Yeah. And then that gets into a whole other debate that we talked about with Waziwasm and even these other embedded tools. Where can't we just get Python down to its core as opposed to everything else added? What was your trend or area that you wanted to talk about? So I've got an article actually on trends. It's called three Python trends in 2023, and it's from the site Jerry codes. I assume that's not his real name. Okay, Jerry might be, but I'm suspicious when the last name is codes for a programmer. This article is highlighted way back in issue 563, which is at the beginning of the year. Hadn't highlighted it in one of the podcasts before, but I thought it would be kind of interesting. He talks about three different things, and they're sort of predictions, or you could kind of claim that the trends have continued, because he was talking about this in February, and they're still valid. So the first one that Jerry covers is he subtitled it Python handshake emoji rust. And he mentions the PI three library that allows Rust to call Python and Python to call Rust. And this gives several examples of libraries where this inter language friendship is sort of blossoming. Pedantic polars and rough are sort of the three of the more popular components out there that have rust chunks in them. Rust is far safer than C and almost as fast, so it works as a nice companion when you need that whole get down closer to the metal thing that some Python libraries do. 
So this is something that he was saying is a trend in 2023, and I think it's very much continued and will keep happening in 2024. And as rough keeps getting new features, it's starting to win the battle as to which linter you should be using as well. We talked a couple of episodes back about the fact that they've added formatting to it, and the fact that it allows single quotes and doesn't scream about them is enough to convert from black, in my opinion. But anyways, I'll get off my soapbox. If anything was a trend, people talking about rough was like one of the most common things I heard the second trend Jerry mentions is web apps. There were several announcements, pyodide, PyScript, those things sort of the year built on WASM, and all the things that are happening in that space. He also mentions a couple of framework libraries that are Python libraries for building web things, streamlit, nice GuI, and Pine cone spelt with a y. So there seems to be almost a resurgence in this space. And I do wonder, particularly with the increase in popularity of HTMX, there's been a bit of a push back away from single page applications, because HTMX lets you get a lot closer to that without all of the overhead. So I think these simpler libraries in combination with something like HTMX gives you a fair amount of power. So it's almost a resurgence of Web 1.0 there without the restrictions and handcuffs that Web 1.0 had. And the third trend is typing and type safety. Each new release of Python adds more typing features, and we've talked about it many times in the podcast, and it essentially just keeps having a deep effect on the language, and I suspect that one's going to keep happening. As I mentioned, Jerry wrote this article in February. The trends he selected were pretty prognostic. They're still happening, and I suspect we're still going to get new additions of a lot of this into 2024 as well. 
Yeah, I think the other two things that you dug into there, the HTMX is one of the other phrases, or the other libraries, or whatever you want to call it. What is HTMX? No, it's a library. It's definitely a library. It's a JavaScript library. That means you don't have to write JavaScript, which is the best kind. The beauty of it, right? Just add it and you're good. I've heard so much about it. That's probably just before Pycon I started to hear about it, and then I sat down at a table and that was like all these people wanted to talk about around me. And then I kept seeing it in other places, and I never quite got somebody on the show about it this year. But next year I do plan to dig into it and talk a little bit about it. I know that you did a course kind of digging into it, and so you've had your hands on it a little bit more. Do you continue to see it growing? I think it's a little bit like your favorite band. That is a low overnight success. That's on their fifth album. Okay. It's been around for a couple of years. It used to be called something else. So I think it's just kind of hit the peak where enough people have figured out that this is helpful. Okay. There's some popularity going there. It's also got a really strong social media presence, for whatever reason. A few months back, somebody on the HTMX hashtag and Twitter discovered that there's a chapter in my book on it and posted, hey, this is coming. Isn't this great? And I couldn't believe the traffic. And it was a lot of memes with horses and laser beams coming in their eyes. And the fact that I had said the word HTMX, that's funny, it just sort of caused everything to blow up. So there's definitely a group of folks who are very excited about pushing it. Yeah, this week I want to shine a spotlight on another real Python video course. 
This year we've talked a lot about object oriented programming, or oop, in Python, the method of structuring a program by bundling related properties and behaviors into individual objects. The course I'm featuring this week builds on a previous Python basics course, object oriented programming, and both courses are built from the section of the real Python book, Python Basics a practical introduction to Python three. This one's titled Python Basics building systems with classes. In the course, Ian Curry takes you through how to compose classes together by creating layers of functionality, how to inherit and override behavior from other classes to create variations, extending a parent class, how to use the super function, and how to creatively mix and match these approaches. Object oriented programming can be a bit intimidating for someone starting on their Python journey, and this course continues with a steady hand to lead you deeper into the topic. And now you can check out the companion course with exercises and a challenge to practice these techniques and hone your skills. This companion course is titled Python Basics Exercises, building systems with classes and the instructor is Martin Royce. And like all the video courses on real Python, the course is broken into easily consumable sections, and you get additional resources and code examples for the techniques shown. All the course lessons have a transcript, including closed captions. And for all the basics courses, you'll see that we've created quizzes for you to test your knowledge as you work through them. Check out the video course. You can find a link in the show notes, or you can find it using the enhanced search tool on realpython.com. One of the other trends in articles that I wanted to talk about briefly is this sort of desire for speed. We've had a handful of different things kind of digging into it. 
We talked a lot about how what's going to be happening in Python 313 with the gillectomy, or whatever you want to call it, the idea of trying to remove the gill and the plans for how to do that and ways that might approach speed. We've talked about the rustification, we talked about a lot of these other things, but one of the more interesting ones that I felt like kind of came out of left field a little bit was the introduction of Mojo. There was a very flashy launch of it without being able to play in it. You had to sign up to be in their playground. This is back in May. So the first article was Mojo, a superset of Python. And this was from Jeremy Howard from Fast AI, where he was talking about it. And Mojo is a new programming language which is a superset or builds on top of Python. So it's very pythonic in its design, but it is its own language. And the goal is to fix bottlenecks for Python's performance in specifically machine learning and AI areas where you're dealing with these massive data processing situations where, like, okay, not only that, but also how would you potentially deploy these models? It was very interesting to kind of watch it in the background. There's a follow up link that I'll include that's just titled why Mojo. This one's from Chris Latner, who's from Modular. We talked a lot about Chris and his background and how he's always been a compiler person. The subtitle of that is a backstory and rationale for why we created the Mojo language. And then a little later in the year, we got the SDK release for Linux. So I have that link. And that allowed another article to kind of come out, which was a mojo head to head with Python. And the person who did this comparison, Maxim Saplin. His article covers this Mandelbrot based benchmark of Python, variations of Numba, comparing it to the newly available Mojo. And although Mojo was fast, it takes up a lot more work than this author expected to translate Python to it. 
And with the right parameters, Numb was still beating it. So kind of an interesting comparison. I think there's going to be continuing trends of that. I think there was another article that I didn't find it in the list there that, where we talked about it did finally get released on the Mac platform, and luckily it's working on Apple Silicon. So mojo looks like it's kind of across all the platforms now, which is nice. So if you're interested in diving into this new language that is sort of this subset of python. Here's a set of articles that can kind of give you the background at it in a way that you can kind of dig in. What's the last topic you wanted to dig into? So I don't know if I want to fully declare 2023 the year microservices died. Okay. But there definitely was a lot of counter coverage throughout the year. I saw a whole bunch of articles basically saying, you shouldn't do this, or you shouldn't do this first, or we did it and we regret it. Learn from us. The one I want to highlight is by David Sudden of Kraken Technologies, who wrote a pretty good article on how they structure their python monolith. And he didn't really get into the microservices thing, but it was essentially the hey, at scale, you can still do this even with Python. It was linked in issue 586, and the code at Kraken has over 25,000 modules, not including their tests, and over 400 developers actively working on it. So this is not a small undertaking. And very few articles out there talk about structure at this size. And those that do are almost always, this is how you cloudify, or this is why you should break it down into microservices. So it was kind of great to see somebody with practical experience saying, no, you can still do this with a monolith. 
And these are the design decisions we made and why David spends a fair amount of time talking about how they layer their modules in their architecture and specifically what kind of approach they took at Kraken in order to stop it from being unmaintainable. One of the key decisions they made was to enforce that dependencies are always downward facing in their stack, so the module on bottom can't have dependencies on the module above it in the layer. And then to enforce this, they actually have a linting tool, which screams if the rules are violated. So the linter goes through and checks what modules are being called. And they essentially have groupings of modules where they say, hey, this is a layer a thing, and layer B, it checks. Oh, okay. Layer B is allowed to call A, but isn't allowed to call C, for example. The article then goes on to talk about how they recognize and address their technical debt, which isn't so much about monoliths as it is about having large chunks of software that you're maintaining. And if you're going to keep adding features as you go, you have to think about this and how to grow it. And if you don't clean up as you go along, you're going to end up with a mess. So this becomes part of managing larger software projects. And then finally the article finishes up by recognizing reality, which is kind of what I just finished saying a minute ago. All design decisions are trade offs. So David then kind of touches know where there are rough spots and how they try to work around those rough spots at Kraken. So I love when larger projects reveal how they think about things. This is good content for developers to learn from. You're not on a large project. There isn't a lot of content like this. And if you are on a large project, there might be some takeaway ideas in here that you can beg, borrow, and steal for your own painful journey. I'm trying to think of when the microservice trend was at its peak. 
I feel like it was the year that I kind of started in Python because I heard so much about it that year, which was 2017, 2018. And then I feel like ever since then, it's been on the wane. My gut is it's been about a decade, but I don't have any evidence to say that could be an old man. After a certain point in time, it just becomes, well, we talked about it before COVID and that's the new line for everything. The gray area. Yeah, it must be old technology. It was before COVID It's interesting that the trend, I'm guessing a lot of companies just stayed with what they had. This is why it's working. I agree. The idea of showing how they're doing things and their design philosophies. I really do enjoy those kind of articles, too. I feel like that helps. Well, and I think particularly because it's Python as well. Right. This was specific to a Python conversation because this is one of the things, not that Python is a toy language, but because so many people do use it for like 50 line scripts, I think folks think, oh, well, you can't really engineer in that. And a lot of the type hinting and type safety kind of conversations are almost always around that kind of space. Well, because Python isn't strongly typed, you can't use it to do something of a certain size. So it's always nice to see a counterexample where it's like, oh, yeah, we've got hundreds of thousands of lines of code here, pure python, and it's working fine. Right. Yeah. We don't have a discussion for you this week. I have a single project that I wanted to kind of at least cover. It's sort of an article and a project at the same time, I feel like it's something I can kind of leave you with to play with. This is from Lucas Krimpov. It's on Hacker noon, and the title of it is Python and folium to visualize my outdoor activities. 
Lucas is a hiker, kind of an outdoors person, and he has a dream of hiking from Munich to Venice and tracking across the beautiful Alps, which I think would be quite the intense hike if you were to take that on. He likes to track his hikes and his exercise and other outdoor activities. He's found that a lot of these outdoor or sport apps, things like adidas running, comout, I'm not familiar with that one. But Strava I'm familiar with and a few others, allow you to export the activities that you do as a unique file format, a thing called a GPX file. GPX stands for GPS exchange format. Takes you into how to work with these GPX files and walks you through creating up a Jupyter notebook. And he's using Folium, which we've talked about a few times across the last couple of years. It's a really great library for doing interactive maps, and a very popular way to do that is inside of a Jupyter notebook. He goes into plotting these GPX trails onto this interactive map. He uses a library called GPX PI, which makes sense. And the GPX stuff, I guess, includes not only the GPS data, but elevation also through this you can kind of practice Foleum's feature groups, their polyline feature, and then also this nice feature of adding different types of markers across the maps for the interactivity, like what you want to turn on and turn off and kind of check between different things. He mentions that to stay tuned because he's got much more to come. Things like deploying a website using an AWS, plotting elevation and speed profiles, using Python, and plotly enhancing trails of pictures, and much more. He provides the link to his GitHub, which has the Jupyter notebook in it. The notebook that he shares on this hacker noon article is kind of formatted a little weird, but you can kind of see the documentation he's done inside there that gives a lot of detail of not only the sets of specific data and GPS areas and elevations and things like that, but it has. 
Why is it organized like this and so forth? You can dig in and learn a little bit more about working with it if you're interested. For additional resources, there's a couple of resources on real Python about folium. One is a tutorial by Martin Royce, and then Kimberly Fessel created a video course based on that that actually dug in a little bit deeper and covered even more standardized features of working with folium to create webmaps for your data. Well, Chris, thanks for helping me cover all these picoders, articles and projects across 2023. It's been an interesting year. We'll see what 2024 has in store for us. Yes. All right, well, see you next year. Cheers. I want to thank Christopher Trudeau for coming on the show and helping me wrap up all the pycoders news and articles and projects. And I want to thank you for listening to the Real Python podcast. Make sure that you click that follow button in your podcast player, and if you see a subscribe button somewhere, remember that the real Python podcast is free. If you like the show, please leave us a review. You can find show notes with links to all the topics we spoke about inside your podcast player or@realpython.com Slash podcast. And while you're there, you can leave us a question or a topic idea. I've been your host, Christopher Bailey, and look forward to talking to you soon.

While the transcript is pretty decent, you can see that it misses some of the technical jargon. PyO3, for example, comes out as "PI three".
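One quick way to see which jargon was missed is to check the transcript text against a list of terms you expected to hear. The snippet below is only a minimal sketch: the helper name `find_missing_terms` and the list of expected terms are my own assumptions for illustration (they are my reading of what the speakers meant), and the sample text is shortened from the transcript. Any term it reports is a good candidate for the custom vocabulary mentioned earlier.

```python
import re


def find_missing_terms(transcript_text, expected_terms):
    """Return the expected terms that never appear in the transcript text."""
    missing = []
    for term in expected_terms:
        # Case-insensitive match on whole words only
        pattern = r'\b' + re.escape(term) + r'\b'
        if not re.search(pattern, transcript_text, flags=re.IGNORECASE):
            missing.append(term)
    return missing


# Two sentences taken from the transcript (the second one shortened)
sample_text = (
    "And he mentions the PI three library that allows Rust to call Python "
    "and Python to call Rust. Pedantic polars and rough are sort of the "
    "three of the more popular components out there."
)

# Hypothetical list of terms I expected to find in this episode
expected = ['PyO3', 'Pydantic', 'Polars', 'Ruff']

print(find_missing_terms(sample_text, expected))
```

In this sample only "polars" survives the transcription, so the other three terms are reported as missing.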

The next thing we will do is add speaker diarization.

How to add speaker diarization using AssemblyAI

To add diarization, we define a config with speaker_labels=True and pass that config to the transcribe call, as you can see in the code below.

from dotenv import load_dotenv
import assemblyai as aai
import os
import json
import datetime

# Read ASSEMBLYAI_API_KEY from a local .env file
load_dotenv()

audio_url = 'https://files.realpython.com/podcasts/RPP_E184_02_Cx2.01c66af0a50e.mp3'
aai.settings.api_key = os.getenv('ASSEMBLYAI_API_KEY')
transcriber = aai.Transcriber()
# speaker_labels=True turns on speaker diarization
config = aai.TranscriptionConfig(speaker_labels=True)
transcript = transcriber.transcribe(audio_url, config=config)
transcript_json = transcript.json_response

# Make sure the output folder exists before writing to it
os.makedirs('json', exist_ok=True)
with open('json/response.json', 'w') as f:
    json.dump(transcript_json, f, indent=2)

diarized_transcript = ''
# Each utterance is one uninterrupted stretch of speech by a single speaker
for utterance in transcript_json['utterances']:
    speaker = utterance['speaker']
    text = utterance['text']
    # start and end are in milliseconds; convert to whole seconds for H:MM:SS
    start_sec = str(datetime.timedelta(seconds=utterance['start'] // 1000))
    end_sec = str(datetime.timedelta(seconds=utterance['end'] // 1000))
    diarized_transcript += f'Speaker:{speaker} [{start_sec} - {end_sec}]\n{text}\n\n'

with open('json/diarized_transcript.txt', 'w') as f:
    f.write(diarized_transcript)

The response JSON contains an array called utterances. Each utterance object holds the speaker label, the start and end times (in milliseconds), and the text. You can iterate over the array and combine these fields to build a diarized transcript, as the code above does.
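If you want to try the formatting step without calling the API, here is a minimal offline sketch. The helper names `format_timestamp` and `build_diarized_transcript` are mine, and the sample utterances are hypothetical: they only mimic the four fields used above, with made-up millisecond values chosen to fall in the same range as the episode's opening exchange.

```python
import datetime


def format_timestamp(milliseconds):
    """Convert a millisecond offset into an H:MM:SS string."""
    return str(datetime.timedelta(seconds=milliseconds // 1000))


def build_diarized_transcript(utterances):
    """Format a list of utterance dicts as speaker-labelled blocks."""
    blocks = []
    for u in utterances:
        start = format_timestamp(u['start'])
        end = format_timestamp(u['end'])
        blocks.append(f"Speaker:{u['speaker']} [{start} - {end}]\n{u['text']}")
    # Separate the blocks with a blank line
    return '\n\n'.join(blocks)


# Hypothetical sample data shaped like the 'utterances' array
sample_utterances = [
    {'speaker': 'B', 'start': 94500, 'end': 97200,
     'text': 'The last episode of 2023.'},
    {'speaker': 'A', 'start': 97300, 'end': 128900,
     'text': "Yeah, I'm excited to do a picoders wrap up."},
]

print(build_diarized_transcript(sample_utterances))
```

The integer division by 1000 drops the fractional seconds, which is why the timestamps come out as whole seconds. The full script above applies the same formatting to every utterance in the real response.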

Here is the result:

Diarized Transcript
Speaker:A [0:00:00 - 0:01:34]
Welcome to the Real Python podcast. This is episode 184. It's been a fascinating year for the Python language and community. Pycoders Weekly included over 1500 links to articles, blog posts, tutorials, and projects. In 2023, Christopher Trudeau is back on the show this week to help wrap up everything by sharing some highlights and python trends from across the year. Christopher shares the top five links explored by Pycoders readers. We also dig into trends we noticed across all the articles and stories this year, including removing dead batteries from the standard library, ongoing work to speed up Python using rust code in libraries, and moving away from microservices. We hope you enjoy this review. We look forward to bringing you an upcoming year full of great python news, articles, topics, and projects. All right, let's get started. The real Python podcast is a weekly conversation about using Python in the real world. My name is Christopher Bailey, your host. Each week we feature interviews with experts in the community and discussions about the topics, articles, and courses found@realpython.com. After the podcast, join us and learn real world python skills with a community of experts@realpython.com. Hey Christopher, welcome back.

Speaker:B [0:01:34 - 0:01:37]
The last episode of 2023.

Speaker:A [0:01:37 - 0:02:08]
Yeah, I'm excited to do a picoders wrap up. We've never done it before, and I think this will be interesting and let us know what you guys think. I'm interested in covering things like this, like kind of trends and so forth. And it was weird looking at the numbers of how many links that you guys publish every year. When we got into the thousands, I was like, okay. Or at least 1500 or something like that. I was like, wow, okay. A lot of information goes out every week.

Speaker:B [0:02:09 - 0:02:17]
Yeah, well, we usually do about somewhere around 20 links a week is pretty typical, and 52 weeks a year.

Speaker:A [0:02:17 - 0:02:26]
So, yeah, it adds up. Cool. So we are going to start with some news, though some things have been happening still here at the end of the year.

Speaker:B [0:02:26 - 0:03:14]
Yeah, the march towards Python 313 steadily carries on. Feels like only a few weeks ago we were announcing 312, but that's how it goes. Python 313 Alpha two is now out. This is the second in seven planned alpha releases. So, yeah, progress happens. Another chunk of news, also from the Python Software foundation, is they've hired a supporting developer in residence. This was almost accidental. You might recall there was an announcement a while back about the position for the deputy developer in residence. Well, they got a bit more funding than they expected for the role, so they were able to hire their second choice for that job as well. So Serhi Storchaka is now getting paid to write code for the PSF.

Speaker:A [0:03:14 - 0:03:15]
So good.

Speaker:B [0:03:15 - 0:03:21]
Welcome. And although it sounds like he's been a longtime contributor, so that's nice that people are getting paid to do this.

Speaker:A [0:03:21 - 0:03:24]
Yeah, Wukus has a little team now. That's pretty cool.

Speaker:B [0:03:24 - 0:03:47]
That's right. And then a couple of quick release announcements. Django Five has gone gold, so it's time to upgrade there. Get off your release candidate if you're playing around there. And one of my favorite Django third party libraries, which is Ninja, it's a fast API like tool for writing APIs. They just released their 1st 1.0 version, so congrats to both groups for finishing their big releases.

Speaker:A [0:03:47 - 0:03:53]
Is there much change from the versions or just kind of solidifying and making it version one?

Speaker:B [0:03:53 - 0:04:04]
I didn't see anything massively changed. They've been on .9 for quite some time, so I think it's probably just been bug fixes and things. But yeah, it's progress.

Speaker:A [0:04:04 - 0:04:08]
Maybe I'll include a link to your course that you did on.

Speaker:B [0:04:08 - 0:04:11]
That's right, yeah, we can go learn more about ninja.

Speaker:A [0:04:11 - 0:04:36]
One of the things that we thought of featuring is out of all those links that were shared of the year, we keep metrics of how many people click on the different things and kind of can get an idea of what is the most popular of all the different links from pycoders across the year. And so it looks like you compiled the top five here, and we can kind of go through them and kind of just cover them quickly.

Speaker:B [0:04:36 - 0:05:23]
I have the top 1500, but we'll stick with the top five. Our illustrious editor in chief at real Python ran set up a query for me and it almost knocked the Heroku server over. Took a little bit, but that's fine. Yeah, so I figured folks might be curious about it. So with apologies to David Letterman, here is the Pycoders top five. All right, number five was a real Python article by Gerard Hilla called Python 312. Cool new features for you to try. This isn't terribly surprising. Lots of our coverage was on Python 312, and there was a fair amount of interest. And if you somehow missed it, there is both a tutorial article and a video course version of this content. So if you want to go back and learn more about 312, that's there.

Speaker:A [0:05:23 - 0:05:25]
Yes, and a podcast episode about it.

Speaker:B [0:05:25 - 0:05:45]
That's right. Number four, frequent link inamar Turner Trowaring's article speeding up your code when multiple cores aren't an option. This article covered ways of improving your code speed without touching parallelism. We actually talked about this in episode 176.

Speaker:A [0:05:45 - 0:05:46]
Yeah.

Speaker:B [0:05:46 - 0:06:24]
Number three was actually a fairly recent article. This one was by Mike Driscoll called Learning about code metrics in Python with radon. This article covered different ways of measuring your code for quality using the radon library. Interestingly enough, this one was also covered in episode 176. So we must have done something right that day. I don't know how that worked out. Number two is another Python 312 article. This one is Python 312. More intuitive and consistent f strings. And it's by Leodanos, Potzo, Ramos. And as the title implies, it's all about the changes introduced with the new f string parser.

Speaker:A [0:06:24 - 0:06:26]
Yeah. Popularity of f strings there.

Speaker:B [0:06:26 - 0:06:32]
That's right. And finally, I don't have a drum roll sound effect, so you'll just have to imagine it.

Speaker:A [0:06:32 - 0:06:32]
Okay.

Speaker:B [0:06:32 - 0:06:59]
Design and guidance. Object oriented programming in Python. This was actually one of my courses based on a tutorial also by Leodonis. It was the third part in a three part course all about oocoding, with part three focusing on the solid principles. It's always nice to know that some of the content I'm creating is getting viewed. So where we go, I don't know. I got to get my little gold star for number one, I guess.

Speaker:A [0:06:59 - 0:10:34]
Yeah, good job. Yeah, that whole series was really great, and it was fun working with you to turn it into a set of video courses. And, yeah, it's been very popular. We joked around in the real Python meetings and so forth, that it was sort of the autumn of oop at real Python. There was so many object oriented kind of diving into that area of python. And so that was neat kind of coverage this year. So great. All right, I think that takes us into. What I'm going to talk about are trends across a handful of different articles and include a bunch of links. Some of these we might have touched on, and some of them we haven't, and then also mention sort of trends in some new people posting that we featured a lot this year. And the first one, the trend I wanted to talk about is about batteries. I know that may sound strange, but batteries are definitely one of those things that we talk a lot about in Python. This idea that Python has batteries included, the trends I kind of noticed is this trend toward deprecating these batteries and idea that there are dead batteries. And then there's a couple of articles that talk about there's sort of missing batteries or they would like more batteries. And so it's kind of an interesting idea of these libraries that are additions to Python that can kind of help out in ways that people would love if it was even included by default as part of a built in thing. So the first article I want to feature is actually an article about Python 312, and it's by our newish friend Bytecode from October 3. It's Python 312, what didn't make the headlines, and I'll just kind of COVID what's in the article. Briefly, they wanted to feature a handful of things that didn't quite get the high points that were covered in maybe some of real Python's coverage or other people's coverage. And the first one is there was a lot of hay being made about increasing the performance of Python over the last couple of versions. 
And it was noted that there was a 5% gain in general as far as what was officially sort of shared, and that maybe that was a bit of a letdown. They also ran tests that showed that between eleven and twelve, 311 and 312. It showed only 14 of the tests that they ran being faster and 79 running slower. So in this test suite that they ran, it was not really much of a gain and maybe in some cases might have been slower. So kind of a mixed story on performance, which is kind of interesting. They also talked about pathlib improvements, which I think we talked a little bit about, but Pathlib added a walk feature which is very handy. Better debugging experiences. Not only were the better error messages that we talked about in detail, but there's some new ways that PDB functions, and the article kind of digs into that a little bit. Some new command line interfaces, things that you might not. Many people are aware that there's this HTTP server built into Python, or you maybe aren't aware of it, but you can just type python mhtttp server and boom, you got an HTTP server set up which is really handy for doing different things. 312 added a SQL lite shell that does something similar. And then it also added a UUId. Is it universal? I forget what the extra U is in UUID, but you can just generate one.

Speaker:B [0:10:34 - 0:10:36]
It stands for very long.

Speaker:A [0:10:36 - 0:17:51]
Very long. Id. Yeah, exactly. And unique. And then at the very end of it is where I wanted to get to, which was headline, which was deprecations, deprecation, deprecations. And things that they call out is Telnet lib two to three being deprecated, date time UTC now. So these are things that are noted as deprecations. Their joke is maybe don't update to the new version on a Friday. So there's lots of these little things that are kind of happening and they link to the Docs section of the 312 release notes which digs into this whole deprecated section. What's interesting is just kind of going through the list and kind of seeing okay, is that going to affect your code and these things that are being removed, these older things that maybe there's new ways that they're being approached. And the other thing that I thought was interesting is the next section kind of goes into this pending removal for 313. And that leads me into the next article which Python 313 removes 20 standard library modules. So again, coming up here in 313. Hence why you may want to pay attention to the upcoming alpha releases. Core developers are busy working on pep five nine four, which is sort of known as the removing dead batteries pep. It was approved. Kind of a long post that goes into the discussion forums. There's also this link is the first post kicking off this thread from Victor Stenner. He's a cordev and it runs through the highlights. But I thought this section on advice was kind of interesting. If your project is affected, you have some different solutions. First one is do nothing for now, only remain compatible with 311 and older. In his opinion, it's dangerous long term choice that the technical debt only becomes more expensive over time. But maybe someone will come with a solution for you in the meanwhile. 
Next one is attempt to propose recipes and alternatives in the what's new documents and create a group of volunteers and give a new life to the remove module by maintaining them on Pipi. You will be able to use Pip install, then Linux distributions should package it and add it as a new dependency to your projects. And the last advice is copy the remove module inside your project and maintain it there. Usually it's a single short py file. If you choose to skip documentation and tests, just be careful about the license and copyright. You're now on your own to maintain it. But this solution is quick and simple. So a handful of approaches, I think all of them have their own baggage potentially there that you need to pay attention to. His next post is thanking Christian heims and Bret cannon for being stubborn and getting this pep five nine four accepted and implemented. So yeah, digging into the batteries being removed and then what I thought was interesting is this trend across picoders over the year though of including links that talked about either missing batteries or maybe the desire for other things that are like this. The first one is missing batteries, essential libraries you're missing out on. Even though Python standard library comes with batteries included, it's still missing some essentials. This article is from Martin Hines, who we feature pretty often. It's from May 1. Again, kind of digging to that idea that batteries are included thanks to its extensive library in Python. Many modules and functions that you would not expect to be there are there, which is really kind of nice. However, there are more that he feels are essential, and he just goes through and lists a whole bunch of them and kind of digs into them in detail. One is a set called bolt ons, which includes json utils, time utils, and iter utils. The json utils adds a set of tools for working with JSON. 
One that's really unique that I haven't seen before is called json lines, which deals with the JSON JSONL format. There's a date range in the time utils that he highlights that creates date ranges that can be iterated over, which is really kind of a neat functionality. Iter utils, not to be confused with iter tools, which we'll talk about. The iterutils has a bunch of really interesting remapping sort of functions for iterators. He talks about a library sh that looks for shell commands and binary packages in your path. It can use pseudo commands, has a bunch of stuff under data validation, a validators library that instead of creating your own reg x formats for things like email addresses or credit cards or IP addresses, that has that stuff kind of built into it, which is nice. There's a library that originally was called fuzzy wuzzy. That's how I had heard of it originally. A coworker that I had back in Hawaii shout out to TJ. He loved that package. He would use it all the time. We worked in the marketing department. And so finding close or close enough matches in text strings and things like that was really important. They've renamed it to the fuzz. It has functions and modules, things like a ratio function, get close matches, sequence matchers, really handy for again, that kind of fuzzy, close enough matching in text. A couple of others that are nice, freeze gun which can freeze time when you're doing testing. And then kind of similar to the fuzz, there's a thing called dirty underscore equals that has like is approx is now is JSON is positive int has all these kind of interesting little different sort of functionality in the library. He also links out to an earlier post that he talks about in 2020. He had one about iter tools and the more iter tools library. And then the other one I want to mention is a post from Carlton Gibson that's called more batteries, please. 
He's kind of in the hashtag more batteries camp, as opposed to the remove the dead batteries camp, which I think is kind of interesting. More than just including the link, I want to mention that he refers to the series by Doug Hellman that started in the Python two era that was called the module of the week, and it got updated to the Python three module of the week, which I'll include a link to that also. And it's just a great way to sort of study what is all inside of python as far as looking at the batteries and the things that are there and all the different modules. And I can't think of a better way to kind of learn some of the deeper areas of python than to kind of look at each know maybe one a week as way of practicing and learning a little bit more about it. And then Christopher and I, we covered in some detail. We had a discussion in episode 171. The title of the episode was making each line of code efficient. And we discussed iter tools. We discussed more iter tools. We talked about enumerate and functions like all and any and stuff like that, ways that you can kind of write code, how there's lots of built in kind of functionality that you're not having to recreate this functionality yourself, that a lot of that's built into Python, and you can write efficient python with very few lines of code. So, yeah, so kind of a trend across a variety of articles there. I don't know if you'd heard of any of those other libraries.

Speaker:B [0:17:51 - 0:17:56]
I've come across a couple of them, but none of them are common in my toolkit, for sure.

Speaker:A [0:17:56 - 0:18:08]
Yeah. I think it kind of depends on the way that you work. It sounds like a lot of them would be handy in the testing environment. Those things like the dirty equals and that freeze gun, I thought was interesting as far as freezing time.

Speaker:B [0:18:08 - 0:18:19]
Yeah. And I think that's sort of always the debate with the, should it be included as a battery or not? Right. I've seen a couple conversations, particularly about the requests library.

Speaker:A [0:18:19 - 0:18:20]
Yeah.

Speaker:B [0:18:20 - 0:18:40]
Everyone's like, everyone uses it. Why isn't it just part of the standard library? And the counterargument is because then we can control the release path and it isn't tied to a Python release. And so even if everybody is using it, there's an argument to keep it out. Right. So, yeah, I think that's anytime you have these kinds of things, it's always a trade off.

Speaker:A [0:18:40 - 0:18:49]
That one's a hybrid in a way that I don't want to call it abandoned, but it kind of got taken over, and I think the Python packaging authority is in charge of request now.

Speaker:B [0:18:49 - 0:18:50]
Oh, really?

Speaker:A [0:18:50 - 0:19:21]
I hadn't heard that it shifted ownership in some way. I'd have to maybe do the research again, but it's definitely under a different GitHub than originally as far as the path of it and its development. But, yeah, it's crucial. But should it be built in? Yeah. And then that gets into a whole other debate that we talked about with Waziwasm and even these other embedded tools. Where can't we just get Python down to its core as opposed to everything else added? What was your trend or area that you wanted to talk about?

Speaker:B [0:19:21 - 0:21:06]
So I've got an article actually on trends. It's called three Python trends in 2023, and it's from the site Jerry codes. I assume that's not his real name. Okay, Jerry might be, but I'm suspicious when the last name is codes for a programmer. This article is highlighted way back in issue 563, which is at the beginning of the year. Hadn't highlighted it in one of the podcasts before, but I thought it would be kind of interesting. He talks about three different things, and they're sort of predictions, or you could kind of claim that the trends have continued, because he was talking about this in February, and they're still valid. So the first one that Jerry covers is he subtitled it Python handshake emoji rust. And he mentions the PI three library that allows Rust to call Python and Python to call Rust. And this gives several examples of libraries where this inter language friendship is sort of blossoming. Pedantic polars and rough are sort of the three of the more popular components out there that have rust chunks in them. Rust is far safer than C and almost as fast, so it works as a nice companion when you need that whole get down closer to the metal thing that some Python libraries do. So this is something that he was saying is a trend in 2023, and I think it's very much continued and will keep happening in 2024. And as rough keeps getting new features, it's starting to win the battle as to which linter you should be using as well. We talked a couple of episodes back about the fact that they've added formatting to it, and the fact that it allows single quotes and doesn't scream about them is enough to convert from black, in my opinion. But anyways, I'll get off my soapbox.

Speaker:A [0:21:07 - 0:21:14]
If anything was a trend, people talking about rough was like one of the most common things I heard the second.

Speaker:B [0:21:14 - 0:22:37]
Trend Jerry mentions is web apps. There were several announcements, pyodide, PyScript, those things sort of the year built on WASM, and all the things that are happening in that space. He also mentions a couple of framework libraries that are Python libraries for building web things, streamlit, nice GuI, and Pine cone spelt with a y. So there seems to be almost a resurgence in this space. And I do wonder, particularly with the increase in popularity of HTMX, there's been a bit of a push back away from single page applications, because HTMX lets you get a lot closer to that without all of the overhead. So I think these simpler libraries in combination with something like HTMX gives you a fair amount of power. So it's almost a resurgence of Web 1.0 there without the restrictions and handcuffs that Web 1.0 had. And the third trend is typing and type safety. Each new release of Python adds more typing features, and we've talked about it many times in the podcast, and it essentially just keeps having a deep effect on the language, and I suspect that one's going to keep happening. As I mentioned, Jerry wrote this article in February. The trends he selected were pretty prognostic. They're still happening, and I suspect we're still going to get new additions of a lot of this into 2024 as well.

Speaker:A [0:22:37 - 0:22:50]
Yeah, I think the other two things that you dug into there, the HTMX is one of the other phrases, or the other libraries, or whatever you want to call it. What is HTMX?

Speaker:B [0:22:50 - 0:22:58]
No, it's a library. It's definitely a library. It's a JavaScript library. That means you don't have to write JavaScript, which is the best kind.

Speaker:A [0:22:58 - 0:23:32]
The beauty of it, right? Just add it and you're good. I've heard so much about it. That's probably just before Pycon I started to hear about it, and then I sat down at a table and that was like all these people wanted to talk about around me. And then I kept seeing it in other places, and I never quite got somebody on the show about it this year. But next year I do plan to dig into it and talk a little bit about it. I know that you did a course kind of digging into it, and so you've had your hands on it a little bit more. Do you continue to see it growing?

Speaker:B [0:23:33 - 0:23:39]
I think it's a little bit like your favorite band. That is a low overnight success. That's on their fifth album.

Speaker:A [0:23:39 - 0:23:40]
Okay.

Speaker:B [0:23:40 - 0:23:50]
It's been around for a couple of years. It used to be called something else. So I think it's just kind of hit the peak where enough people have figured out that this is helpful.

Speaker:A [0:23:50 - 0:23:51]
Okay.

Speaker:B [0:23:51 - 0:24:33]
There's some popularity going there. It's also got a really strong social media presence, for whatever reason. A few months back, somebody on the HTMX hashtag and Twitter discovered that there's a chapter in my book on it and posted, hey, this is coming. Isn't this great? And I couldn't believe the traffic. And it was a lot of memes with horses and laser beams coming in their eyes. And the fact that I had said the word HTMX, that's funny, it just sort of caused everything to blow up. So there's definitely a group of folks who are very excited about pushing it.

Speaker:A [0:24:34 - 0:29:24]
Yeah, this week I want to shine a spotlight on another real Python video course. This year we've talked a lot about object oriented programming, or oop, in Python, the method of structuring a program by bundling related properties and behaviors into individual objects. The course I'm featuring this week builds on a previous Python basics course, object oriented programming, and both courses are built from the section of the real Python book, Python Basics a practical introduction to Python three. This one's titled Python Basics building systems with classes. In the course, Ian Curry takes you through how to compose classes together by creating layers of functionality, how to inherit and override behavior from other classes to create variations, extending a parent class, how to use the super function, and how to creatively mix and match these approaches. Object oriented programming can be a bit intimidating for someone starting on their Python journey, and this course continues with a steady hand to lead you deeper into the topic. And now you can check out the companion course with exercises and a challenge to practice these techniques and hone your skills. This companion course is titled Python Basics Exercises, building systems with classes and the instructor is Martin Royce. And like all the video courses on real Python, the course is broken into easily consumable sections, and you get additional resources and code examples for the techniques shown. All the course lessons have a transcript, including closed captions. And for all the basics courses, you'll see that we've created quizzes for you to test your knowledge as you work through them. Check out the video course. You can find a link in the show notes, or you can find it using the enhanced search tool on realpython.com. One of the other trends in articles that I wanted to talk about briefly is this sort of desire for speed. We've had a handful of different things kind of digging into it. 
We talked a lot about how what's going to be happening in Python 313 with the gillectomy, or whatever you want to call it, the idea of trying to remove the gill and the plans for how to do that and ways that might approach speed. We've talked about the rustification, we talked about a lot of these other things, but one of the more interesting ones that I felt like kind of came out of left field a little bit was the introduction of Mojo. There was a very flashy launch of it without being able to play in it. You had to sign up to be in their playground. This is back in May. So the first article was Mojo, a superset of Python. And this was from Jeremy Howard from Fast AI, where he was talking about it. And Mojo is a new programming language which is a superset or builds on top of Python. So it's very pythonic in its design, but it is its own language. And the goal is to fix bottlenecks for Python's performance in specifically machine learning and AI areas where you're dealing with these massive data processing situations where, like, okay, not only that, but also how would you potentially deploy these models? It was very interesting to kind of watch it in the background. There's a follow up link that I'll include that's just titled why Mojo. This one's from Chris Latner, who's from Modular. We talked a lot about Chris and his background and how he's always been a compiler person. The subtitle of that is a backstory and rationale for why we created the Mojo language. And then a little later in the year, we got the SDK release for Linux. So I have that link. And that allowed another article to kind of come out, which was a mojo head to head with Python. And the person who did this comparison, Maxim Saplin. His article covers this Mandelbrot based benchmark of Python, variations of Numba, comparing it to the newly available Mojo. And although Mojo was fast, it takes up a lot more work than this author expected to translate Python to it. 
And with the right parameters, Numb was still beating it. So kind of an interesting comparison. I think there's going to be continuing trends of that. I think there was another article that I didn't find it in the list there that, where we talked about it did finally get released on the Mac platform, and luckily it's working on Apple Silicon. So mojo looks like it's kind of across all the platforms now, which is nice. So if you're interested in diving into this new language that is sort of this subset of python. Here's a set of articles that can kind of give you the background at it in a way that you can kind of dig in. What's the last topic you wanted to dig into?

Speaker:B [0:29:24 - 0:29:31]
So I don't know if I want to fully declare 2023 the year microservices died.

Speaker:A [0:29:31 - 0:29:32]
Okay.

Speaker:B [0:29:32 - 0:29:46]
But there definitely was a lot of counter coverage throughout the year. I saw a whole bunch of articles basically saying, you shouldn't do this, or you shouldn't do this first, or we did it and we regret it.

Speaker:A [0:29:49 - 0:29:50]
Learn from us.

Speaker:B [0:29:50 - 0:32:27]
The one I want to highlight is by David Sudden of Kraken Technologies, who wrote a pretty good article on how they structure their python monolith. And he didn't really get into the microservices thing, but it was essentially the hey, at scale, you can still do this even with Python. It was linked in issue 586, and the code at Kraken has over 25,000 modules, not including their tests, and over 400 developers actively working on it. So this is not a small undertaking. And very few articles out there talk about structure at this size. And those that do are almost always, this is how you cloudify, or this is why you should break it down into microservices. So it was kind of great to see somebody with practical experience saying, no, you can still do this with a monolith. And these are the design decisions we made and why David spends a fair amount of time talking about how they layer their modules in their architecture and specifically what kind of approach they took at Kraken in order to stop it from being unmaintainable. One of the key decisions they made was to enforce that dependencies are always downward facing in their stack, so the module on bottom can't have dependencies on the module above it in the layer. And then to enforce this, they actually have a linting tool, which screams if the rules are violated. So the linter goes through and checks what modules are being called. And they essentially have groupings of modules where they say, hey, this is a layer a thing, and layer B, it checks. Oh, okay. Layer B is allowed to call A, but isn't allowed to call C, for example. The article then goes on to talk about how they recognize and address their technical debt, which isn't so much about monoliths as it is about having large chunks of software that you're maintaining. And if you're going to keep adding features as you go, you have to think about this and how to grow it. And if you don't clean up as you go along, you're going to end up with a mess. 
So this becomes part of managing larger software projects. And then finally the article finishes up by recognizing reality, which is kind of what I just finished saying a minute ago. All design decisions are trade offs. So David then kind of touches know where there are rough spots and how they try to work around those rough spots at Kraken. So I love when larger projects reveal how they think about things. This is good content for developers to learn from. You're not on a large project. There isn't a lot of content like this. And if you are on a large project, there might be some takeaway ideas in here that you can beg, borrow, and steal for your own painful journey.

Speaker:A [0:32:27 - 0:32:48]
I'm trying to think of when the microservice trend was at its peak. I feel like it was the year that I kind of started in Python because I heard so much about it that year, which was 2017, 2018. And then I feel like ever since then, it's been on the wane.

Speaker:B [0:32:49 - 0:32:53]
My gut is it's been about a decade, but I don't have any evidence.

Speaker:A [0:32:53 - 0:32:54]
To say.

Speaker:B [0:32:56 - 0:33:07]
That could be an old man. After a certain point in time, it just becomes, well, we talked about it before COVID and that's the new line for everything.

Speaker:A [0:33:07 - 0:33:08]
The gray area.

Speaker:B [0:33:08 - 0:33:10]
Yeah, it must be old technology.

Speaker:A [0:33:10 - 0:33:31]
It was before COVID It's interesting that the trend, I'm guessing a lot of companies just stayed with what they had. This is why it's working. I agree. The idea of showing how they're doing things and their design philosophies. I really do enjoy those kind of articles, too. I feel like that helps.

Speaker:B [0:33:32 - 0:34:09]
Well, and I think particularly because it's Python as well. Right. This was specific to a Python conversation because this is one of the things, not that Python is a toy language, but because so many people do use it for like 50 line scripts, I think folks think, oh, well, you can't really engineer in that. And a lot of the type hinting and type safety kind of conversations are almost always around that kind of space. Well, because Python isn't strongly typed, you can't use it to do something of a certain size. So it's always nice to see a counterexample where it's like, oh, yeah, we've got hundreds of thousands of lines of code here, pure python, and it's working fine.

Speaker:A [0:34:09 - 0:37:19]
Right. Yeah. We don't have a discussion for you this week. I have a single project that I wanted to kind of at least cover. It's sort of an article and a project at the same time, I feel like it's something I can kind of leave you with to play with. This is from Lucas Krimpov. It's on Hacker noon, and the title of it is Python and folium to visualize my outdoor activities. Lucas is a hiker, kind of an outdoors person, and he has a dream of hiking from Munich to Venice and tracking across the beautiful Alps, which I think would be quite the intense hike if you were to take that on. He likes to track his hikes and his exercise and other outdoor activities. He's found that a lot of these outdoor or sport apps, things like adidas running, comout, I'm not familiar with that one. But Strava I'm familiar with and a few others, allow you to export the activities that you do as a unique file format, a thing called a GPX file. GPX stands for GPS exchange format. Takes you into how to work with these GPX files and walks you through creating up a Jupyter notebook. And he's using Folium, which we've talked about a few times across the last couple of years. It's a really great library for doing interactive maps, and a very popular way to do that is inside of a Jupyter notebook. He goes into plotting these GPX trails onto this interactive map. He uses a library called GPX PI, which makes sense. And the GPX stuff, I guess, includes not only the GPS data, but elevation also through this you can kind of practice Foleum's feature groups, their polyline feature, and then also this nice feature of adding different types of markers across the maps for the interactivity, like what you want to turn on and turn off and kind of check between different things. He mentions that to stay tuned because he's got much more to come. Things like deploying a website using an AWS, plotting elevation and speed profiles, using Python, and plotly enhancing trails of pictures, and much more. 
He provides the link to his GitHub, which has the Jupyter notebook in it. The notebook that he shares on this hacker noon article is kind of formatted a little weird, but you can kind of see the documentation he's done inside there that gives a lot of detail of not only the sets of specific data and GPS areas and elevations and things like that, but it has. Why is it organized like this and so forth? You can dig in and learn a little bit more about working with it if you're interested. For additional resources, there's a couple of resources on real Python about folium. One is a tutorial by Martin Royce, and then Kimberly Fessel created a video course based on that that actually dug in a little bit deeper and covered even more standardized features of working with folium to create webmaps for your data. Well, Chris, thanks for helping me cover all these picoders, articles and projects across 2023.

Speaker:B [0:37:19 - 0:37:23]
It's been an interesting year. We'll see what 2024 has in store for us.

Speaker:A [0:37:23 - 0:38:11]
Yes. All right, well, see you next year. Cheers. I want to thank Christopher Trudeau for coming on the show and helping me wrap up all the pycoders news and articles and projects. And I want to thank you for listening to the Real Python podcast. Make sure that you click that follow button in your podcast player, and if you see a subscribe button somewhere, remember that the real Python podcast is free. If you like the show, please leave us a review. You can find show notes with links to all the topics we spoke about inside your podcast player or@realpython.com Slash podcast. And while you're there, you can leave us a question or a topic idea. I've been your host, Christopher Bailey, and look forward to talking to you soon.

You can see that the speaker labels come from telling the voices in the audio apart. AssemblyAI does not know the speakers' names, so it just uses generic labels like A and B.

If you already know who is A and who is B, you can pass that information to the API, but you must be careful when you do this. Labels are assigned in the order the voices first appear. For example, the voiceover which introduces the podcast is often neither the host nor the guest, but a freelancer who does the voiceover work, and that voice is the one assigned label A 🙂
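If you would rather apply the names yourself after the transcript comes back, a small post-processing step is enough. Here is a minimal sketch. It assumes the saved JSON response has an `utterances` list whose items carry `speaker`, `start`, `end` (in milliseconds) and `text` fields, so check that against your own response.json. The helper names `format_ms` and `relabel_utterances` and the name mapping are made up for illustration.

```python
from datetime import timedelta


def format_ms(ms):
    """Render milliseconds as H:MM:SS, matching the timestamps shown above."""
    return str(timedelta(seconds=ms // 1000))


def relabel_utterances(utterances, names):
    """Return the transcript text with generic labels swapped for known names.

    Labels that are missing from `names` keep the generic 'Speaker:X' form,
    which is useful for voices you did not expect (such as an intro voiceover).
    """
    lines = []
    for utterance in utterances:
        label = utterance["speaker"]
        display = names.get(label, f"Speaker:{label}")
        start = format_ms(utterance["start"])
        end = format_ms(utterance["end"])
        lines.append(f"{display} [{start} - {end}]")
        lines.append(utterance["text"])
        lines.append("")
    return "\n".join(lines)


# Hand-written sample in the assumed shape (not real API output).
sample_utterances = [
    {"speaker": "B", "start": 1764000, "end": 1771000,
     "text": "So I don't know if I want to fully declare 2023 the year microservices died."},
    {"speaker": "A", "start": 1771000, "end": 1772000, "text": "Okay."},
    {"speaker": "C", "start": 1772000, "end": 1786000, "text": "(an unexpected third voice)"},
]

# Assumption: you listened to the audio and confirmed who A and B really are.
speaker_names = {"A": "Christopher Bailey", "B": "Christopher Trudeau"}

relabeled = relabel_utterances(sample_utterances, speaker_names)
print(relabeled)
```

Falling back to the generic label for unknown speakers makes a wrong mapping easier to spot: if a name shows up on the intro voiceover, you know the letters did not line up the way you expected.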


About this website

BotFlo [1] was created by Aravind Mohanoor as a website which provided training and tools for non-programmers who were [2] building Dialogflow chatbots.

This website has now expanded into other topics in Natural Language Processing, including the recent Large Language Models (GPT etc.) with a special focus on helping non-programmers identify and use the right tool for their specific NLP task.

[1] BotFlo was previously called MiningBusinessData. That is why you see that name in many videos.

[2] And still are building Dialogflow chatbots. Dialogflow ES first evolved into Dialogflow CX, and Dialogflow CX itself evolved to add Generative AI features in mid-2023.