The GetUpside Marketplace is our core product: it is where all our business happens, across every channel, for users, merchants, and internal operators. As we grow, the Marketplace accrues (strategic) tech debt, along with system complexity we no longer need. But as at most tech companies, addressing that debt is often deprioritized in favor of product improvements that directly affect our quarterly goals.
So we put our heads together and created a structure and a team to prioritize work on the Marketplace and measure its impact.
If your company is looking for best practices on how to prioritize internal work and improve the way your system runs, read on.
The State of the Marketplace
Before we prioritized this work internally, the GetUpside Marketplace was high-functioning but costly.
We’ve always had high uptime and operated well at our current transaction levels, but the associated operational and engineering costs began to scale alongside our product.
Unfortunately, these costs (beyond the financial ones) are often hidden from view because we track no numbers that expose their toll on the team.
By thinking in terms of “costs,” it’s easier to bring maintenance-related issues to light and prioritize platform improvements in your roadmap. How can you do this with your team? Ensure your roadmap prioritization considers these three things.
#1 Reduce Toil
The Google Site Reliability Engineering ebook defines toil as, “the kind of work tied to running a production service that tends to be manual, repetitive, automatable, tactical, devoid of enduring value, and that scales linearly as a service grows.”
While you can’t fully eliminate toil—especially if you’re a smaller startup or organization—it’s important to track it because too much toil can be harmful to the business.
The costs that can help you track toil are:
- Financial costs: The proportional cost of engineers’ salaries tied to toil
- Opportunity costs: Hours spent on toil are hours not spent moving the business forward
- Morale costs: The morale of the team as the level of toil increases
Once you define these costs and find a way to track and measure them on an ongoing basis, you can ensure that work gets prioritized to keep toil below a level that you and your business are willing to accept.
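As a rough illustration of tracking the first two costs above, here is a minimal Python sketch. It assumes each engineer logs toil hours for a period; the `ToilLog` record, its field names, and the sample numbers are all hypothetical and are not taken from GetUpside's tooling. Morale is left out because it is better captured through surveys than arithmetic.

```python
from dataclasses import dataclass


@dataclass
class ToilLog:
    """One engineer's toil for a tracking period (hypothetical record)."""
    engineer: str
    toil_hours: float     # hours spent on manual, repetitive work
    total_hours: float    # total hours worked in the same period
    period_salary: float  # salary cost for the same period


def toil_fraction(logs):
    """Share of all engineering hours that went to toil."""
    total = sum(log.total_hours for log in logs)
    return sum(log.toil_hours for log in logs) / total if total else 0.0


def toil_financial_cost(logs):
    """Financial cost: salary proportional to each engineer's toil share."""
    return sum(
        log.period_salary * log.toil_hours / log.total_hours
        for log in logs
        if log.total_hours
    )


def toil_opportunity_hours(logs):
    """Opportunity cost: hours that did not go to moving the business forward."""
    return sum(log.toil_hours for log in logs)


def over_budget(logs, max_fraction):
    """True when toil exceeds the level the team has agreed to accept."""
    return toil_fraction(logs) > max_fraction


# Made-up sample data for two engineers over one week.
logs = [
    ToilLog("engineer_a", toil_hours=10, total_hours=40, period_salary=2000),
    ToilLog("engineer_b", toil_hours=30, total_hours=40, period_salary=2000),
]
print(toil_fraction(logs), toil_financial_cost(logs), over_budget(logs, 0.4))
```

The threshold passed to `over_budget` is the part each team has to decide for itself; the code only makes the comparison repeatable from one period to the next.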
#2 Improve Quality Control
Product and engineering teams often think that quality control means having some type of QA process to test code before it is deployed. However, it’s imperative for quality control to be baked into whatever product or platform is being built.
When we look at how to prioritize work to improve our quality control, we measure against the following costs:
- Financial costs: We look at how much money is lost every time we have to reprocess a user’s transaction.
- Reputational costs: As a qualitative measure, we see how miscalculations negatively affect our reputation with our merchant partners. It makes them question our quality control processes, and the lack of trust slows down ongoing engagements.
- Morale costs: This is also qualitative, but important to keep in mind. Engineers deep in the code know the inconsistencies and potential risks behind a quality issue. When resolving those issues is never prioritized, engineers become less engaged and less likely to raise issues in the future.
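Of the three costs above, only the financial one lends itself to direct calculation. The sketch below is one possible way to estimate it, assuming the loss per reprocessed transaction is made up of engineer time plus any correction paid out; that breakdown, the function name, and the sample numbers are assumptions for illustration only.

```python
def reprocessing_cost(reprocessed_count, engineer_minutes_each,
                      hourly_rate, correction_each=0.0):
    """Estimate money lost to reprocessing transactions in a period.

    All parameters are hypothetical inputs:
      reprocessed_count      number of transactions reprocessed
      engineer_minutes_each  engineer time spent per reprocessed transaction
      hourly_rate            loaded hourly cost of an engineer
      correction_each        average payout correction per transaction
    """
    labor = reprocessed_count * engineer_minutes_each / 60 * hourly_rate
    corrections = reprocessed_count * correction_each
    return labor + corrections


# Made-up example: 120 reprocessed transactions in a month.
monthly_cost = reprocessing_cost(120, engineer_minutes_each=5,
                                 hourly_rate=60.0, correction_each=0.5)
print(monthly_cost)
```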
At GetUpside, we often prioritize work that will achieve clear quality control outcomes. One initiative we continue to work on is what we call “auditability.” Are the pipelines and algorithms that are core to our business observable and monitorable across our company? Do we have the tools in place to ensure that when the inevitable quality control issue occurs, we find it before our customers do?
#3 Mitigate Outages
According to Gartner, web application downtime can cost companies $5,600 per minute, which adds up to more than $300,000 per hour.
It’s critical to work on ways to prevent outages and to help ensure business continuity. However, as you grow your business, things can get lost in the shuffle. GetUpside is working to hire 30+ engineers this year, so some code will inevitably reach production without the “right” pair of eyes reviewing it first.
The costs associated with the risks of outages include:
- Financial costs: Every minute of an outage means transactions are not flowing through our system, which directly impacts our revenue. We also have strict service-level agreements with partners that carry financial penalties for non-compliance.
- Reputational costs: Depending on where and when the outage happens, our reputation with consumers or merchants will suffer.
- Morale costs: Having to fight fires increases the toil on the engineering team, causing a downward spiral in morale as the number of outages increases.
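The financial cost in the list above has two parts: revenue lost while transactions are not flowing, and any SLA penalty. A minimal sketch of that arithmetic follows. The function, its parameters, and the simple "flat penalty once an allowed downtime is exceeded" rule are assumptions for illustration, not GetUpside's actual contract terms, and the per-minute figure in the example is the Gartner number cited earlier rather than our own revenue.

```python
def outage_cost(outage_minutes, revenue_per_minute,
                sla_allowed_minutes=0.0, sla_penalty=0.0):
    """Estimate the direct financial cost of one outage.

    Assumes a flat SLA penalty that applies only when the outage
    lasts longer than the allowed downtime (hypothetical rule).
    """
    lost_revenue = outage_minutes * revenue_per_minute
    penalty = sla_penalty if outage_minutes > sla_allowed_minutes else 0.0
    return lost_revenue + penalty


# Made-up example: a 45-minute outage against a 30-minute SLA allowance.
long_outage = outage_cost(45, 5600, sla_allowed_minutes=30, sla_penalty=10000)
# A 20-minute outage stays inside the allowance, so no penalty is added.
short_outage = outage_cost(20, 5600, sla_allowed_minutes=30, sla_penalty=10000)
print(long_outage, short_outage)
```

Putting even a rough number like this next to a proposed reliability project makes it easier to compare that project with feature work on the same roadmap.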
As product teams develop roadmaps, it’s critical to implement processes that reduce the chance of potential outages — from improved testing coverage to scaling services for projected growth.
What’s next for you?
Our platform prioritization rubric is just one of the many examples of how GetUpside Engineering is leading the way in process and organizational structure — not to mention the tech we’re actually building. If you’re interested in learning more or joining the team, reach out. We’re hiring.