All software is built atop a core set of assumptions. As new code is added and new use cases emerge, software can become unmoored from those assumptions. When this happens, a fundamental tension arises between revisiting those foundational assumptions, which usually entails a lot of work, and trying to support new behavior atop the existing architecture. The latter approach is usually recommended, to save time and reduce risk.
However, there are times when it’s worth revising the core architecture of a large software application. Recently at Slack we did just that, taking a step back to change how our backend and clients (the desktop and mobile applications) work at a foundational level.
Slack launched in 2013 with a simple architecture: each user belonged to a single workspace, where they joined channels and sent messages. To view messages from a different workspace (that you were also logged in to), you needed to click into that workspace.
This model held until 2017, when we launched Enterprise Grid, which lets Slack’s largest customers divide their organizations into multiple workspaces, each with a specific focus. At first Enterprise Grid users were usually in just a single workspace, but over time usage patterns changed, and today these users often belong to multiple workspaces. Concurrently, we’ve built ways for Slack clients to share data across multiple workspaces on the same Grid, such as the Threads and Unreads views and cross-workspace channels.
This led to a natural question: if data is shared between multiple workspaces on the same Grid, and users need to switch between these workspaces to do their jobs, why not instead provide a single, unified view of all the data a user can access within their Grid? Not only would this provide a superior user experience, it would eliminate a class of bugs caused by syncing org-wide data across multiple workspaces. And it would improve performance, since data for multiple workspaces could be loaded in a single API request.
With this insight, the Unified Grid project was born. But because Slack was architected with the assumption that most data is particular to a single workspace, it was initially unclear whether Unified Grid was even feasible. Still, we decided that because the product continued to push against the limits of a workspace-centric architecture, we had to try.
Enterprise Grid: The evolution of Slack’s architecture
To understand what made Unified Grid such an ambitious project, it’s worth zooming out to examine Slack’s architecture and how it has evolved over the years.
In 2013, Slack launched with a relatively simple model. Users belonged to workspaces within which they joined channels and sent messages. Each workspace represented a customer, and all the data for a particular workspace was stored on a single database server, or “shard.” Slack clients authenticated their API requests using session tokens containing the user ID and workspace ID (referred to as “workspace tokens”); the backend then parsed the workspace ID and used it to associate each API request with a workspace, route queries to that workspace’s database shard, and perform access control. This model also extended to the client, where the data for each workspace was stored in a separate repository with distinct login sessions.
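As a rough illustration of the workspace-centric request path described above, the sketch below parses a workspace ID out of a session token, then uses it to check access and pick a shard. The token format (`user_id:workspace_id`), the `SHARD_FOR_WORKSPACE` and `MEMBERS` tables, and every function name here are hypothetical; the post does not describe Slack’s real token format or routing code.

```python
from dataclasses import dataclass

# Hypothetical workspace -> shard mapping; each workspace lives on one shard.
SHARD_FOR_WORKSPACE = {"T1": "shard-a", "T2": "shard-b"}

# Hypothetical membership table used for access control.
MEMBERS = {"T1": {"U1", "U2"}, "T2": {"U2"}}


@dataclass(frozen=True)
class RequestContext:
    user_id: str
    workspace_id: str
    shard: str


def parse_workspace_token(token: str) -> tuple[str, str]:
    """Split an illustrative 'user_id:workspace_id' token into its parts."""
    user_id, sep, workspace_id = token.partition(":")
    if not sep or not user_id or not workspace_id:
        raise ValueError("malformed token")
    return user_id, workspace_id


def build_context(token: str) -> RequestContext:
    """Associate a request with a workspace, check access, and pick a shard."""
    user_id, workspace_id = parse_workspace_token(token)
    # Access control: the user must be a member of the token's workspace.
    if user_id not in MEMBERS.get(workspace_id, set()):
        raise PermissionError("user is not a member of this workspace")
    return RequestContext(user_id, workspace_id, SHARD_FOR_WORKSPACE[workspace_id])
```

The point of the sketch is that a single value, the workspace ID, drives association, routing, and access control at once, which is why so much of the system came to depend on it.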
As Slack grew, we noticed that individual divisions within the same company often created separate Slack workspaces. We wanted to give companies a simple way to administer these workspaces via a single UI, where they could implement security policies and handle billing across their entire organization. Thus, Enterprise Grid, our solution for our largest and most complex customers, was born.
To support Enterprise Grid, we introduced the concept of an “org” that effectively served as a “parent” to multiple workspaces. Users still navigated Slack from the perspective of an individual workspace, but now it was also possible for data to be stored at the org level. For example, customers could create cross-workspace (XWS) channels, which were stored on the org’s database shard and visible across multiple workspaces. This meant that the Slack backend was required to query data on the workspace shard and, if the data was absent there, on the org shard (for workspaces that are part of an Enterprise Grid). Because Enterprise Grid users could be assigned permissions at the level of the workspace, the org, or both, the backend also had to check permissions at both the workspace and org levels.
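The two-step lookup described above can be sketched as follows. This is only an illustration under stated assumptions: the in-memory `SHARDS` dictionary, the `ws-`/`org-` shard naming, and `find_channel` are hypothetical stand-ins, not Slack’s actual data layer.

```python
from typing import Optional

# Hypothetical in-memory "shards": shard name -> {channel_id: channel row}.
SHARDS = {
    "ws-T1": {"C1": {"name": "team-one-general"}},
    "org-E1": {"C9": {"name": "xws-announcements"}},
}

# Hypothetical mapping from a workspace to its parent org.
# Workspaces that are not part of an Enterprise Grid have no entry.
ORG_FOR_WORKSPACE = {"T1": "E1"}


def find_channel(workspace_id: str, channel_id: str) -> Optional[dict]:
    """Look on the workspace shard first, then fall back to the org shard."""
    row = SHARDS.get(f"ws-{workspace_id}", {}).get(channel_id)
    if row is not None:
        return row
    org_id = ORG_FOR_WORKSPACE.get(workspace_id)
    if org_id is None:
        return None  # Not on a Grid, so there is no org shard to consult.
    return SHARDS.get(f"org-{org_id}", {}).get(channel_id)
```

Here a workspace-local channel such as `C1` is found on the first lookup, while a cross-workspace channel such as `C9` costs a second query against the org shard.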
The changing landscape
Initially, since end users were usually in a single workspace, their experience didn’t change much in Enterprise Grid. However, over time the way customers use Slack has evolved. Now, a significant portion of users do belong to multiple workspaces on the Grid, which has led to context switching and missed activity.
We wanted to address these problems, and several infrastructure-level changes we’d made suggested a way forward. With the Vitess migration, we began sharding data along axes other than workspace or org ID, meaning that the workspace or org was no longer required to route queries to the appropriate database shard for our most important tables. We also enhanced our real-time messaging (RTM) stack to remove the need to fan out org-wide data to every workspace on the Grid (and some of our largest customers have thousands of workspaces!). Finally, we updated clients to share org-wide data across all workspaces within their Grid. Leveraging these infrastructure investments, we built views that aggregated content from multiple workspaces, like our Threads and Unreads views.
However, even with these improvements, our workspace-centric architecture still caused significant frustration. We knew that to truly solve the problem, we’d need to move to an org-wide architecture, although this would entail updating thousands of APIs, database queries, and permissions checks.
Prototyping the path
Execs, not to mention engineers, were understandably concerned about the cost of Unified Grid, and not convinced that the payoff would be worth the effort. Therefore, rather than start by tackling what were potentially thousands of broken APIs, we decided to build a proof of concept to better understand the benefits of Unified Grid and the work that would be required to ship it end-to-end.
At Slack, we call this prototyping the path: building incrementally, proving out and refining our ideas as we go. Because we’re some of the heaviest users of Slack, we knew that if we could use Unified Grid in our day-to-day work, we’d start getting good signals about what did and didn’t work. And as the project grew in maturity, we could opt in more of our peers, gathering valuable feedback from them.
First, we needed to be able to boot the Slack client in Unified Grid mode, with an org-wide view of all the user’s channels rather than a workspace-scoped view. To this end, we built a new boot API which returns data for all the workspaces and channels the user belongs to across the entire Grid. We updated clients to store this boot data at the org level, since users in Unified Grid no longer navigate from the perspective of a single Grid workspace at a time.
Once the client could boot, we updated our homegrown API framework such that an API could be marked compatible with the new Unified Grid client. We then began fixing APIs and client-side checks as we encountered issues, prioritizing those that impacted our day-to-day work. We had several main strategies for fixing broken APIs:
- If an API didn’t rely on workspace context for routing (perhaps because it had been migrated to a new sharding scheme during the Vitess migration), we allowed it to be called in Unified Grid and confirmed that the query still behaved correctly. For example, because the messages table is now sharded by channel ID, we could efficiently fetch messages for a channel without significant changes.
- If an API acted directly on a workspace, we could often prompt users to select a workspace and then pass that workspace to the API. For example, we updated the channel creation flow such that the user must select the workspace in which the channel should be created, since the workspace can no longer be inferred from the state of the client.
- Finally, if all else failed, we could iterate over the user’s relevant workspaces, attempting to resolve the query against each workspace’s shard. Because most users are in only a handful of workspaces, this approach is surprisingly performant. However, there is a long tail of users in hundreds of workspaces. Because such users tend to be administrators who don’t interact with all these workspaces, we decided to cap the number of “relevant” workspaces at 50 and allow users to manually configure this list. Restricting the relevant workspaces for each user ensures reasonable performance and makes Slack usable for these outliers.
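The last strategy, iterating over a capped list of relevant workspaces, can be sketched like this. Only the cap of 50 comes from the text above. The function names and the query callback are hypothetical, and falling back to the first 50 workspaces when the user has not configured a list is an assumption made for illustration.

```python
from typing import Callable, Iterable, Optional, TypeVar

T = TypeVar("T")

MAX_RELEVANT_WORKSPACES = 50  # The cap mentioned in the text above.


def relevant_workspaces(all_workspaces: list[str],
                        user_configured: Optional[list[str]] = None) -> list[str]:
    """Return at most MAX_RELEVANT_WORKSPACES IDs, preferring the user's own list.

    Assumption: with no configured list, we simply take the first entries.
    """
    chosen = user_configured if user_configured else all_workspaces
    return chosen[:MAX_RELEVANT_WORKSPACES]


def resolve_across_workspaces(workspaces: Iterable[str],
                              query: Callable[[str], Optional[T]]) -> Optional[T]:
    """Try the query against each workspace in turn; return the first hit."""
    for workspace_id in workspaces:
        result = query(workspace_id)
        if result is not None:
            return result
    return None
```

Because the loop stops at the first hit and never visits more than 50 workspaces, the worst case is bounded even for an administrator who belongs to hundreds of workspaces.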
Though our prototype had a lot of rough edges, we felt the benefit of reduced context switching and a simpler UX. From there, we started opting in more coworkers, eventually inviting execs like our then-CEO Stewart Butterfield to try the new client. His feedback summed up how we felt: “This is obviously better.”
From prototype to production
As mentioned above, Unified Grid potentially impacted every API and permission check invoked by the Slack client. It would require significant effort from scores of engineers across most of Slack’s product engineering teams to ensure these APIs and permission checks continued to behave correctly. Concurrently, we were building IA4, a redesign of the Slack client which introduced our Activity, DMs, and Later tabs. To avoid subjecting customers to separate large changes at the same time, Unified Grid became a foundational component of IA4, and with it a top company priority.
We began with spreadsheets listing all APIs invoked by Slack clients, as well as all permission checks performed by clients and the backend, dividing the work among the relevant product teams. In keeping with prototyping the path, we asked engineers to take two passes over each API: a first pass to make the API work well enough for internal usage, and then, perhaps weeks later, a second pass to ensure the integration tests, permissions checks, and other edge cases behaved correctly. This two-phase approach allowed us to manually verify and get a feel for functionality which was not fully ready for primetime.
The core team now pivoted away from prototyping to support the migration effort more scalably with tools and frameworks:
Docs: Most importantly, we put together a detailed guide with step-by-step instructions for ensuring that an API behaves correctly in Unified Grid, including the strategies for fixing APIs listed in the “Prototyping the path” section.
Tests: We created a parallel integration test suite which ran all our existing integration tests using org context instead of workspace context. This let us reuse thousands of tests rather than rewriting them from the ground up. As expected, hundreds of test suites were broken initially, providing us with a concrete list of test suites to fix as part of marking an API compatible with Unified Grid.
Helpers: We added numerous convenience helpers to correctly fetch channels and perform permissions checks across all of a user’s workspaces on their Enterprise Grid, on both clients and the backend. For example, to check whether a user can act as an admin within a cross-workspace channel, these helpers check whether the user is a workspace admin in any of the workspaces with which the channel is shared, or is an admin at the org level.
Client Infrastructure: In addition to the work needed to support these permissions checks, clients also required new infrastructure to migrate workspace-scoped repositories to the new data model. The clients solved this problem in different ways: some clients added an org-level data store but continued to save data in workspace-scoped repositories, while other clients moved everything to an org-wide store. These data migrations could be completed and shipped in parallel with the overall Unified Grid project, which allowed us to de-risk the project itself.
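To make the admin check described under Helpers above more concrete, here is a minimal sketch of what such a helper might look like. The function name, its parameters, and the plain sets and dictionaries standing in for permission data are all hypothetical; the real helpers are not shown in this post.

```python
def can_admin_channel(user_id: str,
                      channel_workspaces: set[str],
                      workspace_admins: dict[str, set[str]],
                      org_admins: set[str]) -> bool:
    """Return True if the user may act as an admin in a cross-workspace channel.

    The user qualifies as an org-level admin, or as a workspace admin in any
    of the workspaces the channel is shared with.
    """
    if user_id in org_admins:
        return True
    return any(
        user_id in workspace_admins.get(workspace_id, set())
        for workspace_id in channel_workspaces
    )
```

For example, a user who administers only workspace `T2` can act as an admin in a channel shared with `T1` and `T2`, but not in a channel shared only with `T1`.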
Conclusion
By summer 2023, Unified Grid was in a place where much of the company was using it for their day-to-day work. We began rolling out to customers in fall 2023 and completed the rollout in March 2024. What had begun as a barely functional prototype was, almost two years later, a core component of our redesigned client and a solid foundation atop which to keep innovating.
It’s a truism that you shouldn’t attempt large rewrites of existing software applications. But like all truisms, it’s only almost always true. Sometimes, when the architecture of an application drifts far enough from how that application is used, prototyping a path towards rewriting the core foundation is actually the best way to achieve your goals.
Now that Unified Grid is live, we’re excited to see what’s next. What else can be built atop a more flexible information architecture? Whatever it is, we know that we’ll be prototyping the path to new, intuitive product experiences well into the future. If that’s something that excites you too, come join us.