What I wish I knew about building and deploying a full-stack ML project
The goal of this post is to share some of the things I learned while building Scribe - an AI note-taking tool for recorded meetings. If you haven’t shipped full-stack ML apps before, this article should help make it a smoother ride for you.
Scribe is a web application that takes an audio/video recording or a transcript of a meeting and produces a summary, which is then sent to the user’s inbox. To achieve this, we first transcribe the audio using OpenAI’s Whisper speech-to-text model and then use GPT to generate a summary from the transcript.
Our tech stack:
Next.js for frontend
Stripe for payments
Supabase for database and authentication
Flask for server
Vercel and AWS for hosting
Docker for making life easier
We run the Whisper model on our own server, and for summarization we use the OpenAI API. You can check out the source code here.
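To make the flow concrete, here is a minimal sketch of the transcribe-then-summarize pipeline described above. It is not Scribe’s actual code: the names MeetingInput, summarize_meeting, fake_transcribe and fake_summarize are made up for illustration, and the two model calls are passed in as plain functions so the example runs without Whisper or an OpenAI key.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class MeetingInput:
    # Hypothetical input shape: the user uploads either a recording or a transcript.
    audio_path: Optional[str] = None
    transcript: Optional[str] = None


def summarize_meeting(
    meeting: MeetingInput,
    transcribe: Callable[[str], str],
    summarize: Callable[[str], str],
) -> str:
    """Return a summary, transcribing first only when no transcript was given."""
    if meeting.transcript is not None:
        text = meeting.transcript
    elif meeting.audio_path is not None:
        text = transcribe(meeting.audio_path)
    else:
        raise ValueError("need either an audio path or a transcript")
    return summarize(text)


# Stand-ins for the real speech-to-text and summarization calls (assumptions).
def fake_transcribe(path: str) -> str:
    return f"transcript of {path}"


def fake_summarize(text: str) -> str:
    return f"summary: {text[:40]}"


if __name__ == "__main__":
    print(summarize_meeting(MeetingInput(audio_path="standup.mp3"),
                            fake_transcribe, fake_summarize))
```

Keeping the model calls behind plain function arguments also makes it cheap to swap a self-hosted model for an API call later, or the other way around.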
So with the context out of the way, let’s dive right in!
Business goals first, systems goals second
This is by far the most important insight. As engineers, we are always tempted to build the most beautiful and robust systems we can. This means we can have a hard time translating the business requirements into architecture and explaining the design choices to non-technical people.
To avoid these issues, you need to think about what you want to achieve with your product, and then about which system design and metrics would get you there. The right direction is to go from the problems the system tries to solve to the technical solutions, not the other way around. If you don’t do this, you’re on a sure path to refactoring your entire codebase in the future. It also makes conversations with your cofounders or teammates easier: you will ask the right questions and be able to explain your choices, because you have already thought about the tradeoffs and have solid reasoning behind each decision.
In the case of Scribe, our #1 goal was to get the product in front of users as quickly as possible. Since this was a side project, we also wanted it to be cost-efficient. Finally, we wanted to be able to tell when something went wrong, so we could fix it quickly.
One of the ways I messed this up was by not considering the cost of running a Celery worker on a GPU instance 24/7. Without any users it doesn’t do much useful work, and it takes quite a bit of money out of our pockets. I had latency and availability concerns in my head, but they were not actually a problem, because our users only needed a 24-hour turnaround. So I had to redesign large parts of our backend to accommodate batching and the cloud infrastructure.
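A quick back-of-the-envelope calculation would have caught this early. The sketch below compares an always-on instance with one that is started once a day to process a batch. Every number in it is a made-up placeholder, not a real AWS price or Scribe’s real workload, and the function names are hypothetical.

```python
HOURS_PER_MONTH = 730.0  # common approximation: 24 * 365 / 12


def monthly_cost_always_on(hourly_rate: float) -> float:
    """Cost of keeping one instance running for the whole month."""
    return hourly_rate * HOURS_PER_MONTH


def monthly_cost_batched(
    hourly_rate: float,
    busy_hours_per_day: float,
    overhead_hours_per_run: float = 0.25,  # assumed boot and model-load time
    runs_per_day: int = 1,
    days: int = 30,
) -> float:
    """Cost when the instance only runs while a batch is being processed."""
    hours_per_day = busy_hours_per_day + overhead_hours_per_run * runs_per_day
    return hourly_rate * hours_per_day * days


if __name__ == "__main__":
    rate = 1.00  # hypothetical price per GPU hour
    always_on = monthly_cost_always_on(rate)
    batched = monthly_cost_batched(rate, busy_hours_per_day=0.5)
    print(f"always on: {always_on:.2f}, batched: {batched:.2f}")
```

With a 24-hour turnaround, one batch a day is enough, so the instance is only billed for the time it is actually working plus startup overhead.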
Don’t be like me: figure out what the product needs, and then build. And since your product will almost surely change, you should still be prepared to make changes to your architecture.
Double your timelines
When you’re dealing with a new type of project, you will certainly encounter more problems than you expect. This applies even if you are already familiar with parts of the task, as long as there are any meaningful differences from what you have done before.
The reason for this is that reality has a surprising amount of detail. There are a bunch of things that are right in front of us, but we are totally blind to them. In my case, it turned out to be deploying an ML project.
I had some experience deploying fairly static React apps on easy-to-use platforms. Naively, I thought that deploying a backend that runs ML models in production would be similar.
It’s not.
Deployment literally took more time than the rest of the project, and I had to learn new technologies and platforms to get it done. I went through what felt like a hundred Azure and AWS services to learn things like the fact that Azure App Service doesn’t support mounting Azure Storage for Docker Compose applications, which is not even directly mentioned in the docs. It can be a struggle to do something as simple as setting up CLI access to AWS resources or connecting to a virtual machine because of one small thing that you missed.
Quick tip: learn Docker as soon as you can - it will make your development process a lot smoother (especially when you have a team of engineers and need to run a lot of processes at once) and ease the pain of deployment (from what I saw, deploying without Docker containers can be more complicated).
So the truth is that you need to be ready to deal with all of those small, annoying details. There is just no way around them. But that’s the beauty of learning by doing: you will get better and stop noticing these details. They will become so trivial that they turn invisible again, but this time you will be on the other side.
Find the right support system
Some people can keep working consistently on a project on their own for long periods, through all the problems, but this doesn’t apply to everyone. The idea that was so exciting in the beginning turns into something mundane, you start to doubt whether it is worth the time and energy, and other ideas and responsibilities get in the way. The best way I know to combat this is to work with others.
Levels of enthusiasm come in waves and if you are on your own, you’ll be tempted to drop the project at the bottom of the trough. But if you have partners, it is likely that your waves are not exactly in sync, which means that there will be some destructive interference. This will hopefully cancel out the lowest of the lows. The goal is not to always be at your best level, the goal is to not get to your worst.
Finding the right partners is the tricky part. There is no easy way to know if you are a good match and the only option that I see is to start working with others as early as you can. This will allow you to train your internal model which can help you to notice important cues about your partners early on. This model will be especially useful if later you decide to work on startups.
One more thing you can play around with is the timeline of your side project. Instead of pursuing longer and more intensive ideas, you can focus on smaller ventures, which still give you valuable experience while avoiding the problem I described above. But at some point you will have to face this challenge.
Build, share, repeat
The last thing I want to get across: build and share things. This is the most exciting and productive way of learning, and it can unlock opportunities that you never thought about. Sharing is crucial because others can also benefit from your work and lessons. I hope I was able to do that today.