Google Summer of Code 2025 - Reengineering ZIMFarm from the Ground Up

Uchechukwu Orji

I participated in Google Summer of Code with Kiwix again, this time reengineering the Zimfarm project. Zimfarm is a semi-decentralized software solution for building ZIM files efficiently. This means scraping web content, packaging it into a ZIM file, and uploading the result to an online repository of ZIM files.

About the Organization

Kiwix is a non-profit organization and a free and open-source software project dedicated to providing offline access to free educational content. It compresses copies of entire websites into single ZIM files that fit on a user’s device and provides applications that read these local copies, enabling people with limited or no internet access to enjoy the same browsing experience as anyone else.

Project Details

At a high level, the project comprises:

Work Done

As of September 1st, 2025, the reengineering of the Zimfarm project spanned over 100 pull requests, with Python and TypeScript as the primary programming languages. The code for the project lives in the openzim/zimfarm repository.

Starting with the backend, I introduced Hatch as a dependency manager to pin every dependency to a specific version. I then upgraded some of the existing libraries to newer major versions, while others were replaced entirely with more feature-rich ones. The most notable replacements in this ambitious reengineering were:

As a consequence of these upgrades and replacements, a few breaking changes were introduced (they were unavoidable). This meant bumping the backend API to v2, not only to take full advantage of features from the replacement libraries but also to clean up parts of the old API that were fragile and inelegant, such as:

Aside from the library switches and upgrades, the reengineering introduced significant features including but not limited to:

To avoid turning the list into a long and boring changelog, I’ve highlighted only a few select improvements (not ranked in any way). If you are curious, you can browse the full list of pull requests on GitHub.

I tried to keep the UI largely the same as the original (partly because I am not good at design 😅) and only made ambitious changes where necessary. Relying heavily on Vuetify, I gave the UI a more modern design and introduced some additional features and pages.

Challenges

Reengineering a project of this size was by no means a small feat, and it was as challenging as it was exciting. The biggest challenges I faced during the project (in no particular order) revolved around:

Again, this list is not exhaustive and does not fully capture the variety of hurdles I faced during the project, but it highlights some of the more interesting ones.

The frontend became a hands-on learning journey that helped me improve my proficiency with Vue 3 and TypeScript. Prior to this, I had only used them sparingly (I cannot remember the last time I did something frontend-related). But after the first couple of weeks, I began to find my feet, thanks to my foundational knowledge of JavaScript.

Future Work

We plan to wrap up this GSoC project with the deployment of the zimfarm-upgrade branch on September 8th, 2025. Of course, the journey doesn’t end here. There’s still plenty of work ahead, with issues of different priorities (prio1, prio2, and prio3) as well as a broader backlog waiting to be tackled.

I’m very glad to share that I’ll continue working with Kiwix as a contractor until at least the end of 2025, which is both a testament to their trust in the work I’ve done and to how much I’ve developed since my first pull request to their codebase two years ago.

Things I Learned

Reengineering the project exposed me to more situations where I had to make decisions regarding backwards-compatibility and the trade-offs associated with it.

Navigating the challenges mentioned earlier meant I had to think hard to come up with elegant solutions while still maintaining strict type-checking standards. I also learnt new things about SQL, most importantly the JSON functions and how to wrangle data directly in the database.

If last year’s participation in the Google Summer of Code changed the way I wrote code, this year’s edition deepened my engineering discipline and how I think about building systems.

Acknowledgements

I express gratitude to Google for providing me with this opportunity to contribute to Open Source Software for the second time in a row.

Thanks to the team at Kiwix for reviewing my pull requests during the submission phase with the same responsiveness as before, for accepting my proposal, and for making this a reality.

Whether you are a newbie or a seasoned developer looking to get started with open-source and collaborative development, I implore you to start with the Kiwix codebase. The team is incredibly responsive, offers constructive feedback, and makes the contribution process both welcoming and rewarding. You’ll not only sharpen your technical skills but also get the chance to work on projects that make a real impact.

My deepest thanks go to my project mentor, Benoît Béraud, for his feedback and help with challenges during the project. Without his feedback, none of this would have been possible, as he almost always had a suggestion when I hit a wall. His careful organization of the issues and detailed explanations meant there was little back-and-forth on the issues, which accelerated development.

Working with him significantly improved my approach to problem-solving.