
Learning Elixir, Phoenix and Ash Part 3: Multi-tenancy

The first ever Ruby on Rails app I built was a photo sharing web site. One of the first features I implemented was signing up and logging in as a user, and then everything that happened in the app was scoped to that user. This was an obvious requirement and I never thought much of it. This was multi-tenancy before I even knew what that meant. Fast forward a few years and multi-tenancy gems started to appear, blog posts were written, and multi-tenancy started to become “a thing”. It still feels a bit weird to me that it’s called out as its own thing.

Ash has multi-tenancy built in and gives you a way to explicitly configure your resources for it.

defmodule MyApp.Factories.Order do
  # ...
  multitenancy do
    strategy :attribute
    attribute :factory_id
  end
end

Ash enables two types of multi-tenancy [1] out of the box: attribute and context. Attribute is shown in the code above. It’s where you have a “God object” that all other objects reference; in this code it’s the Factory. Context multi-tenancy works at a lower level, and how it’s implemented depends on the data store. For example, in AshPostgres each tenant has its own database schema rather than everything living in public.

Attribute multi-tenancy is how I have always done it in Rails. You scope all your queries to the current “tenant”, e.g. current_factory. For example, in Rails:

# Wrong! A hacker could change the id in the URL and access another user's Orders
Order.find(params[:id])

# Correct. This will raise Not Found if you try to access an Order from a different Factory
current_factory.orders.find(params[:id])

You just have to make sure you remember to do it. In Ash, though, once you declare a resource as multitenant, Ash enforces the tenant. Whenever you make a query and forget to pass in the tenant, the query fails with an error:

Queries against the MyApp.Factories.Order resource require a tenant to be specified

This is awesome, no more accidentally leaking information. But it’s been a pain to figure out how it works.
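
As a rough illustration of the idea (plain Elixir, not Ash itself), the sketch below shows a read function that refuses to run without a tenant and otherwise filters by the tenant attribute. The module name TenantSketch, the sample orders, and the error wording are all made up for the example.

```elixir
defmodule TenantSketch do
  # Hypothetical in-memory data standing in for an orders table.
  @orders [
    %{id: 1, factory_id: "factory-a", item: "bolts"},
    %{id: 2, factory_id: "factory-b", item: "nuts"},
    %{id: 3, factory_id: "factory-a", item: "washers"}
  ]

  # Reads orders for one tenant. Raises when no tenant is given,
  # instead of silently returning every tenant's rows.
  def read!(opts \\ []) do
    case Keyword.get(opts, :tenant) do
      nil ->
        raise ArgumentError, "queries against orders require a tenant to be specified"

      tenant ->
        Enum.filter(@orders, fn order -> order.factory_id == tenant end)
    end
  end
end

# Only factory-a's orders come back.
TenantSketch.read!(tenant: "factory-a") |> Enum.map(& &1.id) |> IO.inspect()
# prints [1, 3]
```

Calling TenantSketch.read!() with no tenant raises, which is the behaviour that makes forgetting the scope impossible to miss.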

Querying Resources

Whenever you make a query you need to pass in the tenant. This is fine except there are several ways to do this.

  1. Compose it into the query
factory = Ash.load!(socket.assigns.current_user, :factory).factory

{:ok, customer} =
  MyApp.Factories.Customer
  |> Ash.Query.for_read(:get_by_id, %{id: customer.id})
  |> Ash.Query.set_tenant(factory)
  |> Ash.read_one()

{:ok, customer} =
  MyApp.Factories.Customer
  |> Ash.Changeset.for_create(:create, %{name: "Jane Smith"})
  |> Ash.Changeset.set_tenant(factory) # <--- Note this is a Changeset
  |> Ash.create() # run the create action for the changeset

This makes sense: when you construct the Query (or Changeset) you add the tenant constraint. For a read, this adds where factory_id = factory.id to the SQL. Or:

  2. Some methods take the tenant as a parameter.
Ash.read!(MyApp.Factories.Customer, tenant: socket.assigns[:current_tenant])

Ash.get!(MyApp.Factories.Customer, id, tenant: socket.assigns.current_user.factory_id)

This second method passes the tenant as an option to the Ash functions themselves. (Code interfaces, the functions you can define on a domain or resource, accept the same tenant: option.) It’s not clear which way is better or preferred (idiomatic), but this is obviously way shorter 😁 This also shows one of the other oddities: where does the tenant come from?

Conn and Socket

When implementing multi-tenancy in Rails it’s helpful to set your tenant object as a “global” somewhere so you can access it from (mostly) anywhere in the code. You can do this in your application controller, the same way you would set the current_user, or, more recently, with the Current object. This avoids a query every time you need to find the current tenant.

In Phoenix there are a couple of data structures that can sort of fulfil the same role, the conn struct and the socket struct.

The conn is a struct that gets passed to every function in the web side [2] of your app when you are using the traditional request-based approach. The socket is passed around when you are using LiveView. (The question I keep asking myself is: why have both? Surely the conn could be repurposed for LiveView as well as “dead views” [3].)

The problem I have is figuring out how to get the tenant into the socket. While debugging something else I discovered there is an existing key in the socket.assigns map called current_tenant.

#Phoenix.LiveView.Socket<
  id: "phx-GCNc06drdWo8U4si",
  endpoint: GidocaPhxWeb.Endpoint,
  view: GidocaPhxWeb.OrderLive.Index,
  parent_pid: nil,
  root_pid: #PID<0.8003.0>,
  router: GidocaPhxWeb.Router,
  assigns: %{
    __changed__: %{current_user: true, current_tenant: true},
    current_user: #GidocaPhx.Accounts.User<
      factory: #Ash.NotLoaded<:relationship, field: :factory>,
      __meta__: #Ecto.Schema.Metadata<:loaded, "users">,
      confirmed_at: nil,
      id: "b2f3d266-e183-4585-836d-0f487a3298a0",
      email: #Ash.CiString<"hmaddocks@me.com">,
      factory_id: "9bb073f3-7277-4767-9a3f-0518e3a7a737",
      aggregates: %{},
      calculations: %{},
      ...
    >,
    flash: %{},
    current_tenant: nil, # <---
    live_action: :index
  },
  transport_pid: #PID<0.7994.0>,
  ...
>

I have no idea where this came from, but it’s there, so I tried to set the key to the factory.id. We can see that the current_user is also in the assigns map. This appears to be set in the authentication code.

conn
|> delete_session(:return_to)
|> store_in_session(user)
|> assign(:current_user, user)
|> put_flash(:info, message)
|> redirect(to: return_to)

There’s a lot about this code that has me confused. This is building the conn, but there are also functions talking about the session. We store the user in the session as a bare object and assign the user to :current_user in the conn. Why both? And the other thing: this is conn-related code, but I’m using LiveView, so I need to get the tenant into the socket. The current_user makes its way into the socket, so something must be copying it from one place to the other.

After I had got nowhere, a member of the Ash Discord said I have to store the tenant in the session and write a Plug to set it in the conn. A Plug is a composable function or module used to transform the HTTP request and response, enabling functionality such as authentication, logging, and parameter parsing within the connection pipeline.

This is what I ended up with:

defmodule MyAppWeb.Plugs.SetTenant do
  @moduledoc """
  Sets a default tenant if none.
  """

  alias MyApp.Factories.Factory

  @current_tenant "current_tenant"

  def init(opts), do: opts

  def call(conn, _opts) do
    with %{} <- conn.assigns[:current_user],
         {:tenant, tenant} when not is_nil(tenant) <-
           {:tenant, Plug.Conn.get_session(conn, :current_tenant)} do
      tenant = Ash.ToTenant.to_tenant(tenant, Factory)
      Ash.PlugHelpers.set_tenant(conn, tenant)
    else
      # no user logged in
      nil ->
        conn

      {:tenant, nil} ->
        conn
    end
  end
end

I find these with statements really hard to read. What this code does is:

  1. Check the conn.assigns to see if there is a :current_user. This means we’re logged in.
  2. Get the :current_tenant from the session in the conn if it’s not nil.
  3. Convert the Factory to a tenant. More about this later.
  4. Then, using the Ash.PlugHelpers, set the tenant on the conn.
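
The shape of that with can be tried out in isolation. The sketch below is only an approximation: a plain map stands in for the real Plug.Conn struct, and the module name WithSketch and the :private / :tenant keys are invented for the example (the real Ash.PlugHelpers.set_tenant stores the tenant its own way).

```elixir
defmodule WithSketch do
  # `conn` is a plain map standing in for %Plug.Conn{} (an assumption for this sketch).
  def set_tenant(conn) do
    with %{} <- conn.assigns[:current_user],
         {:tenant, tenant} when not is_nil(tenant) <-
           {:tenant, conn.session[:current_tenant]} do
      # Both clauses matched: a user is logged in and the session holds a tenant.
      put_in(conn, [:private, :tenant], tenant)
    else
      # First clause failed: assigns[:current_user] was nil, so nobody is logged in.
      nil -> conn
      # Second clause failed: the guard rejected a nil tenant.
      {:tenant, nil} -> conn
    end
  end
end

logged_out = %{assigns: %{}, session: %{}, private: %{}}
no_tenant = %{assigns: %{current_user: %{id: 1}}, session: %{}, private: %{}}
ready = %{assigns: %{current_user: %{id: 1}}, session: %{current_tenant: "factory-a"}, private: %{}}

IO.inspect(WithSketch.set_tenant(logged_out).private) # prints %{}
IO.inspect(WithSketch.set_tenant(no_tenant).private)  # prints %{}
IO.inspect(WithSketch.set_tenant(ready).private)      # prints %{tenant: "factory-a"}
```

Wrapping the second value in a {:tenant, ...} tuple is what lets the else block tell the two failures apart: a bare nil can only have come from the first clause.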

This seems to work, in that my app breaks if I remove it, but I don’t really understand what’s happening here yet. The line Ash.PlugHelpers.set_tenant(conn, tenant) sets the tenant in a private field in the conn, and if we check the docs for PlugHelpers we can see that it has a matching get_tenant function. I have tried to get the tenant using this function but it always returns nil.

Then I need to add my plug into the request pipeline

pipeline :browser do
  plug :accepts, ["html"]
  plug :fetch_session
  plug :fetch_live_flash
  plug :put_root_layout, html: {MyAppWeb.Layouts, :root}
  plug :protect_from_forgery
  plug :put_secure_browser_headers
  plug :load_from_session
  plug MyAppWeb.Plugs.SetTenant
end

I don’t know if there is a specific order these should go in, but I need stuff from the session so I thought this was the best place (maybe it’s in the wrong place, which is why Ash.PlugHelpers.get_tenant doesn’t work ¯\_(ツ)_/¯).
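
Plugs in a pipeline run in the order they are listed, so a plug that reads the session has to come after the plug that fetches it. A small sketch of why order matters, using plain functions over a map rather than real plugs (PipelineSketch and its two functions are hypothetical stand-ins, not Phoenix or Ash APIs):

```elixir
defmodule PipelineSketch do
  # Stand-in for a plug that loads the session into the conn-like map.
  def fetch_session(conn), do: Map.put(conn, :session, %{current_tenant: "factory-a"})

  # Stand-in for a tenant plug: copies the tenant out of the session if one is there.
  def set_tenant(conn) do
    case conn[:session] do
      %{current_tenant: tenant} -> Map.put(conn, :tenant, tenant)
      _ -> conn
    end
  end

  # Runs each function in list order, feeding the result of one into the next.
  def run(conn, plugs), do: Enum.reduce(plugs, conn, fn plug, acc -> plug.(acc) end)
end

right_order = PipelineSketch.run(%{}, [&PipelineSketch.fetch_session/1, &PipelineSketch.set_tenant/1])
wrong_order = PipelineSketch.run(%{}, [&PipelineSketch.set_tenant/1, &PipelineSketch.fetch_session/1])

IO.inspect(Map.get(right_order, :tenant)) # prints "factory-a"
IO.inspect(Map.get(wrong_order, :tenant)) # prints nil
```

By the same reasoning, a tenant plug that checks for :current_user needs to sit after whichever plug loads the user from the session.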

Finally we set the current_tenant in the session so that it can be picked up by the Plug.

conn
  |> delete_session(:return_to)
  |> store_in_session(user)
  |> assign(:current_user, user)
  |> put_flash(:info, message)
  |> put_session(:current_tenant, Ash.load!(user, :factory).factory) # <---
  |> redirect(to: return_to)

That was a mission! Unfortunately it’s not over yet. The problem is this seems to be inconsistently applied. I’m finding several situations where the current_tenant isn’t being set. In those cases I’m falling back to querying for the factory or using the factory_id from the socket.assigns.current_user. I must be missing something.

Forms

Forms are the last part of the puzzle. My expectation as a Rails developer is that you either add the tenant to the params after the form has been submitted but before you create the resource, or you add the tenant id to the form template as a hidden field.

But no, you add it as an option when the form object is created, before it’s rendered by the template [4]. I would never have figured this out without help. This was also a case where the tenant wasn’t in the socket so I got it from the current_user. I’m beginning to think that’s the best way and I can drop the whole conn and Plug business.

# Do it once for create
AshPhoenix.Form.for_create(MyApp.Factories.Customer, :create,
  as: "customer",
  actor: socket.assigns.current_user,
  tenant: socket.assigns.current_user.factory_id
)

# Do it once for update
AshPhoenix.Form.for_update(customer, :update,
  as: "customer",
  actor: socket.assigns.current_user,
  tenant: socket.assigns.current_user.factory_id
)

ToTenant

As I mentioned when talking about the Plug, there was a call to Ash.ToTenant. This is a protocol, which is like an interface, and it gives the Factory the ability to act as a tenant. This is most useful where the tenant “key” isn’t an ID, e.g. when you are using schema-based tenants. It also enables you to pass a Factory as a tenant instead of a factory.id. I think Ash will do the conversion anyway, but I added this code for completeness:

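# Assumes this sits inside the MyApp.Factories.Factory resource module,
# where defimpl without for: applies to the enclosing module
defimpl Ash.ToTenant do
  def to_tenant(%{id: id}, _resource), do: "#{id}"
end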
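
For anyone coming from Rails, the protocol mechanism itself can be seen in a standalone sketch. This is not Ash.ToTenant; TenantKey and the Factory struct below are hypothetical names used only to show how one function can accept either a struct or a raw id.

```elixir
defprotocol TenantKey do
  @doc "Hypothetical protocol: turns a value into the key used to scope queries."
  def to_key(value)
end

defmodule Factory do
  # Minimal stand-in for the Factory resource.
  defstruct [:id, :name]
end

defimpl TenantKey, for: Factory do
  # A Factory struct is converted to its id as a string.
  def to_key(%Factory{id: id}), do: to_string(id)
end

defimpl TenantKey, for: BitString do
  # A string id is already a usable key, so it passes through unchanged.
  def to_key(id), do: id
end

factory = struct!(Factory, id: 42, name: "Widgets")

IO.inspect(TenantKey.to_key(factory)) # prints "42"
IO.inspect(TenantKey.to_key("42"))    # prints "42"
```

Callers never need to know which kind of value they were handed, which is the convenience the ToTenant implementation buys when passing a Factory as the tenant.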

Conclusion

There’s still a lot about this that I’m not certain about and I’m sure I have got a lot of it wrong. The inconsistent behaviour of the current_tenant field in the conn and socket still bothers me. And while writing this it’s become clear that the whole business of storing and setting the Factory as the tenant isn’t necessary because I can just grab the factory_id from the current_user. This makes me think my design needs some work.

My app works at the moment but hopefully someone will confirm one way or another the right way to do this.


  1. Two of the most recent Ruby on Rails applications I have worked on implemented multi-tenancy by having entirely separate standalone databases for each client. This is good on one hand because there is absolutely no way you could accidentally expose the wrong client’s information. But there are several downsides: it’s tempting to let the schemas drift out of sync, and when that happens it’s a nightmare to resolve. And from a developer point of view, people working on the app aren’t exposed to the security implications of hosting clients in a single database, so if they move on to another company they don’t have that knowledge.

  2. One of the nice things about the design of Phoenix web apps is they are structured in two halves, your application code and the web code. So for an app called MyApp in the lib directory you will have my_app where your business logic and persistence code lives, and my_app_web where your presentation logic lives. This separation isn’t enforced, you can still reach into your database from your web views, but the separation reminds you not to do that.

  3. My totally noob perspective is that Phoenix started out as a traditional stateless request-based framework. Then they added websockets using a channel-based mechanism. Then they added the option to push server-side rendered HTML through the websockets: LiveView. And now it appears they are moving to make stateful LiveView the default mechanism for rendering views. If this is true then socket might take over as the default? Other than the existence of both the conn and the socket I really like this aspect of Phoenix’s design. Every function takes the conn, which it can choose to build upon, and at the end of the pipeline the conn is used to make the response. If I was going to write my own Ruby framework I would copy this design.

  4. I inspected the rendered form HTML and didn’t find a hidden tenant field.

#ash #elixir #phoenix