bookingkit is one of the leading Berlin-based companies in the leisure industry and is highly recognized in the sector. It operates as a neutral Global Distribution System (GDS), giving travel agencies and marketing networks access to tours and activities in the form of a digital inventory. It is also a web-based, integrated software-as-a-service solution for providers of tours and activities.
bookingkit approached us with an idea for changing their platform's architecture so it could scale even further. The chosen solution is based on sharding: creating several instances that share the same codebase but hold different datasets, derived from the source database and divided by a chosen factor.
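To make the sharding idea concrete, here is a minimal, purely illustrative Go sketch (not bookingkit's actual implementation) in which the dividing factor is assumed to be something like a vendor ID, hashed to pick one of the instances:

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// shardBaseURLs lists the application instances; the names and the count
// are hypothetical, not bookingkit's actual setup.
var shardBaseURLs = []string{
	"https://shard-0.example.internal",
	"https://shard-1.example.internal",
	"https://shard-2.example.internal",
}

// shardFor maps a sharding key (for example an account or vendor ID) to one
// of the instances using a stable hash, so the same key always lands on the
// same dataset.
func shardFor(key string) string {
	h := fnv.New32a()
	h.Write([]byte(key))
	return shardBaseURLs[int(h.Sum32()%uint32(len(shardBaseURLs)))]
}

func main() {
	fmt.Println(shardFor("vendor-1234")) // always resolves to the same shard
}
```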
Such an approach, similar to a microservice architecture, creates certain challenges, such as routing each request to the right instance and presenting the sharded data to API consumers as a single, consistent whole.
That can be achieved, for instance, by placing an API gateway in front of the application instances, and that is what we aimed for.
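As a rough sketch of what such a gateway can look like in Go, assuming the standard library's reverse proxy and a made-up X-Shard routing header (the real routing factor and hosts are bookingkit-specific):

```go
package main

import (
	"log"
	"net/http"
	"net/http/httputil"
	"net/url"
)

func main() {
	// Hypothetical shard instances; names and hosts are placeholders.
	shards := map[string]*httputil.ReverseProxy{}
	for name, raw := range map[string]string{
		"shard-a": "http://shard-a.internal:8080",
		"shard-b": "http://shard-b.internal:8080",
	} {
		target, err := url.Parse(raw)
		if err != nil {
			log.Fatal(err)
		}
		shards[name] = httputil.NewSingleHostReverseProxy(target)
	}

	// The gateway exposes a single entry point and forwards each request
	// to the instance resolved from the request. A header is used as the
	// routing key here purely for illustration.
	gateway := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		proxy, ok := shards[r.Header.Get("X-Shard")]
		if !ok {
			http.Error(w, "unknown shard", http.StatusBadGateway)
			return
		}
		proxy.ServeHTTP(w, r)
	})

	log.Fatal(http.ListenAndServe(":8000", gateway))
}
```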
In addition, the solution had to cover everything the existing bookingkit API could do, run on AWS services, and keep response latencies as low as possible.
All of this had to be highly available - bookingkit works with demanding partners such as Tripadvisor and GetYourGuide, which means its services have to be available 24/7.
faster responses for an API request
attractions in Europe which can be booked in real time
place in Digitalization of CX category (Digital Champions Award)
When you have been building a platform for years, it is not easy, and sometimes not even viable, to go from a monolith to full microservices right away. Such an operation could consume too much time and too many resources, and it is simply risky. An API gateway, however, is a necessary step forward in creating a distributed system. The new service had to do everything that bookingkit's existing API could do.
So now we knew the issues and the main goal. The question that remained was: “How exactly are we going to achieve that?” Finding the complete answer took a couple of steps, and it all started with an intensive technical workshop.
The first concept we discussed was a service that would not only mimic the existing API but would also hold a copy of the whole database, combining all the data that was supposed to be sharded across the instances connected to it.
After the workshops and a couple of follow-up talks, it became apparent that we should validate the assumptions with a Proof of Concept (the chosen technology was Flask, a Python framework). The results were promising, but we still saw potential issues with performance, the overall complexity of the new service, and other risks regarding data integrity and migrations.
Another technical meeting was held in our office, after which we decided to verify a lighter version of the service - one that would “just” relay all the requests to the instances behind it and combine the responses into one. In search of better performance, we also decided to compare two technologies this time: Python and Golang.
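To illustrate what “relaying and combining” means in practice, here is a minimal, hypothetical Go sketch that fans the same request out to all shards concurrently and merges the JSON array responses; the endpoint and the merge strategy are assumptions for the example, not bookingkit's actual API:

```go
package main

import (
	"context"
	"encoding/json"
	"fmt"
	"net/http"
	"sync"
	"time"
)

// fetchAll asks every shard for the same resource concurrently and merges
// the JSON array responses into one slice, which is roughly what the
// "lighter" gateway was meant to do for list endpoints.
func fetchAll(ctx context.Context, shards []string, path string) ([]json.RawMessage, error) {
	var (
		mu       sync.Mutex
		wg       sync.WaitGroup
		combined []json.RawMessage
		firstErr error
	)
	client := &http.Client{Timeout: 5 * time.Second}

	fail := func(err error) {
		mu.Lock()
		if firstErr == nil {
			firstErr = err
		}
		mu.Unlock()
	}

	for _, base := range shards {
		wg.Add(1)
		go func(base string) {
			defer wg.Done()
			req, err := http.NewRequestWithContext(ctx, http.MethodGet, base+path, nil)
			if err != nil {
				fail(err)
				return
			}
			resp, err := client.Do(req)
			if err != nil {
				fail(err)
				return
			}
			defer resp.Body.Close()
			var items []json.RawMessage
			if err := json.NewDecoder(resp.Body).Decode(&items); err != nil {
				fail(err)
				return
			}
			mu.Lock()
			combined = append(combined, items...)
			mu.Unlock()
		}(base)
	}
	wg.Wait()
	return combined, firstErr
}

func main() {
	// Both the shard hosts and the endpoint are placeholders for the sketch.
	shards := []string{"http://shard-a.internal:8080", "http://shard-b.internal:8080"}
	items, err := fetchAll(context.Background(), shards, "/v1/bookings")
	if err != nil {
		fmt.Println("fetch error:", err)
	}
	fmt.Printf("combined %d items from %d shards\n", len(items), len(shards))
}
```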
That was the moment when it “clicked”: not only had we confirmed the solution to be the right fit, but we also found that choosing Go yields surprisingly better results - the TTFB was half that of the Python solution.
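For reference, TTFB can be measured per request with Go's net/http/httptrace; the sketch below uses placeholder URLs rather than the actual benchmark setup:

```go
package main

import (
	"fmt"
	"log"
	"net/http"
	"net/http/httptrace"
	"time"
)

// timeToFirstByte issues a single GET and measures how long it takes until
// the first byte of the response arrives - the metric the two gateway
// prototypes were compared on.
func timeToFirstByte(url string) (time.Duration, error) {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return 0, err
	}

	var firstByte time.Time
	trace := &httptrace.ClientTrace{
		GotFirstResponseByte: func() { firstByte = time.Now() },
	}
	req = req.WithContext(httptrace.WithClientTrace(req.Context(), trace))

	start := time.Now()
	resp, err := http.DefaultTransport.RoundTrip(req)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()

	return firstByte.Sub(start), nil
}

func main() {
	// Placeholder URLs - in practice these would point at the Go and
	// Python prototypes serving the same endpoint.
	for _, target := range []string{"http://go-gateway.local/ping", "http://flask-gateway.local/ping"} {
		ttfb, err := timeToFirstByte(target)
		if err != nil {
			log.Printf("%s: %v", target, err)
			continue
		}
		fmt.Printf("%s TTFB: %v\n", target, ttfb)
	}
}
```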
Verified concepts: a full aggregation service with its own combined database, and a lightweight gateway that relays requests to the instances and merges their responses.
Considered technologies: Python (Flask) and Golang.
An API gateway proxying requests and aggregating data to and from the instances was created. The programming language chosen was Golang, and the infrastructure, a big part of the solution, runs on AWS (AWS ECS). We also prepared a robust CI/CD pipeline with automated tests of the whole API written in Cypress, automated performance tests in k6, and automatic deployment from the pipeline to the staging and production environments. The infrastructure itself is managed as code with Terraform.
we established that the Gateway pattern would be suitable for this project, covering all the functional requirements while remaining lightweight
Go is the way to go! The application written in Golang responds to an API request twice as fast as the one created in Python/Flask.
the initial plan for the infrastructure set-up was conceived - our non-functional requirements covered the use of AWS services, high availability, and the lowest possible response latencies
a specific Gateway response latency, measured as TTFB (Time To First Byte), became our key goal to be met