Google Summer of Code 2024 - Kiwix

Uchechukwu Orji

In the summer of 2024, I participated in Google Summer of Code with Kiwix, developing an automated download speed testing solution that measures the download speeds users get from Kiwix download mirrors at various locations around the world. The goal of the project was to provide the organization with data and analytics that would help it reconfigure MirrorBrain to direct users to the “best” mirrors for their region. Overall, this would ensure fast and reliable download speeds for all users.

About the Organization

Kiwix is a non-profit organization and a free and open-source software project dedicated to providing offline access to free educational content. It compresses copies of entire websites into single ZIM files small enough to fit on a user’s device and provides applications that can read these local copies, enabling people with no or limited internet access to enjoy the same browsing experience as anyone else.

Project Details

At a high level, the project involved:

Project Overview

Work Done

The project was a large, 350-hour project that spanned over 20 pull requests, with Python and SQL as the primary languages used in development. The code for the project lives at kiwix/mirrors-qa.

To avoid tight coupling of logic, the project was split into three core services.

All services were containerized using Docker, making it easy to set up and deploy.
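
To make the core idea of a speed test concrete, here is a minimal sketch of timing a download and turning it into a speed figure. It is not the code from kiwix/mirrors-qa; the function name `measure_speed` and the returned keys are assumptions made for illustration. The function takes any iterable of byte chunks, so it can be exercised without touching the network.

```python
import time
from typing import Callable, Dict, Iterable


def measure_speed(
    chunks: Iterable[bytes],
    clock: Callable[[], float] = time.perf_counter,
) -> Dict[str, float]:
    """Consume a stream of byte chunks and report size, duration and speed.

    `chunks` is any iterable of bytes objects, for example the body of an
    HTTP response read piece by piece. `clock` is injectable so the timing
    can be controlled in tests.
    """
    start = clock()
    total_bytes = 0
    for chunk in chunks:
        total_bytes += len(chunk)
    elapsed = clock() - start

    # Guard against a zero-duration measurement on very small inputs.
    speed = total_bytes / elapsed if elapsed > 0 else 0.0
    return {
        "bytes": float(total_bytes),
        "seconds": elapsed,
        "bytes_per_second": speed,
    }


if __name__ == "__main__":
    # Local demonstration with in-memory data instead of a real mirror.
    fake_body = (b"x" * 65536 for _ in range(16))
    print(measure_speed(fake_body))
```

In a real test, the chunks would come from an HTTP response for a file hosted on a mirror, for instance by repeatedly calling `read()` on the object returned by `urllib.request.urlopen`. Running the same measurement from different locations is what makes the results comparable per region.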

Challenges

The biggest challenges I faced during the project revolved around the limitations of the open-source version of Metabase. At the time of writing, it didn’t support charts like the box plot, and its pivot table didn’t support sticky columns.

Also, filter names could not be scoped to a specific tab in the dashboard. This made it difficult to have filters that shared the same name in different tabs but didn’t necessarily have to be connected.

Another challenge I had was developing a script to download all WireGuard configuration files from ProtonVPN. I hacked around it by digging through their core libraries and experimenting with their abstract base classes. It was a thrilling experience and I couldn’t have done it without ripgrep to scan for “helpful” variable names and methods across their core libraries.

Future Work

At the time of writing, there is a plan to automatically retrieve data from Metabase, process it and use it to configure MirrorBrain.

Things I Learned

Working on the project made me more proficient with industry tools like GitHub Actions and Docker. Prior to the project, I had only used some of their well-known features for my hobby projects.

Also, I learnt a lot more about SQL than I had known before. I had never used features like window functions or custom functions, but they ended up playing a crucial role in achieving the goal of the project.
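
For readers who haven’t met these SQL features, here is a small sketch of both. It is an illustration only, not a query from the project: the `test` table, its columns, the sample rows and the `to_mbps` function are all made up, and SQLite (through Python’s built-in `sqlite3` module) stands in for whatever database the project uses. The window function ranks mirrors by average speed within each country, and the custom function converts bytes per second to megabits per second.

```python
import sqlite3

# Hypothetical sample data: (country, mirror, speed in bytes per second).
SAMPLE = [
    ("NG", "mirror-a", 1_000_000),
    ("NG", "mirror-a", 3_000_000),
    ("NG", "mirror-b", 5_000_000),
    ("FR", "mirror-a", 9_000_000),
    ("FR", "mirror-b", 4_000_000),
]


def rank_mirrors(rows):
    """Rank mirrors per country by their average measured speed."""
    conn = sqlite3.connect(":memory:")
    try:
        # Custom SQL function (hypothetical name): bytes/s to megabits/s.
        conn.create_function("to_mbps", 1, lambda b: b * 8 / 1_000_000)
        conn.execute("CREATE TABLE test (country TEXT, mirror TEXT, speed REAL)")
        conn.executemany("INSERT INTO test VALUES (?, ?, ?)", rows)
        query = """
            SELECT country,
                   mirror,
                   to_mbps(avg_speed) AS avg_mbps,
                   -- Window function: the ranking restarts for each country.
                   RANK() OVER (
                       PARTITION BY country ORDER BY avg_speed DESC
                   ) AS mirror_rank
            FROM (
                SELECT country, mirror, AVG(speed) AS avg_speed
                FROM test
                GROUP BY country, mirror
            )
            ORDER BY country, mirror_rank
        """
        return conn.execute(query).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in rank_mirrors(SAMPLE):
        print(row)
```

Window functions need SQLite 3.25 or newer, which recent Python builds ship with. The appeal of `RANK() OVER (PARTITION BY ...)` is that the “best mirror per region” question is answered in a single query, without grouping the rows by hand in application code.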

Acknowledgements

I express gratitude to Google for providing me with this opportunity to contribute to Open Source Software and develop my skills at the same time.

Thanks to the team at Kiwix for reviewing my pull requests during the submission phase, accepting my proposal and making this a reality.

My biggest thanks go to my project mentor, Renaud Gaudin, for his feedback and help with challenges during the project. It was a great experience working with him and I always looked forward to his code reviews. Working with him helped me improve my coding style and approach to problem-solving.