Ash Elixir•3y ago

flat file json or yml data layer

I have a very deeply nested JSON object that we think might be really nice to wrap a GraphQL query API over. Unless I missed it, I don't see any data layers similar to the CSV data layer that target flat files in JSON/YML, but I think the CSV data layer would be a good starting point for seeing what needs to be implemented for the behaviour. I only really need to support read/query actions, and ideally I'd like to provide the data from the previous GQL field as the context for any child fields. I'm pretty sure I could devise a quick strategy for this in Absinthe based on dated previous experience, but I'm attracted to the resource model of Ash and some of the adjacent functionality I could implement around the GQL API. Is there a direction I should start in for read/query-only access to flat files, with context passing between associations that exist as sub-resources on a type, that plays well with Ash OOTB? Is writing my own data layer the best or only design choice?

9 Replies

ZachDaniel•3y ago

Hey there! Is this flat file information in a file that changes or doesn't change?

jedschneiderOP•3y ago

in the current strategy, the flat file would be read-only and any changes would be versioned objects in the same S3 location.

ZachDaniel•3y ago

ah, okay but you can't do the work at compile time, you'll want to read the file on every request

jedschneiderOP•3y ago

right, it would be accessed via an HTTP or S3 URL in the root query. Directionally, this is what I was thinking:

rootQuery -> get S3 object
childField -> lens over object
nextChildField -> lens over childObject context

Each field would map to an Ash resource, and each resource would need to know how to take the input JSON and return resources (well, either a property on the existing Ash resource, or a nested one).
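
The "lens over object" chain can be pictured in plain Elixir, independent of Ash or Absinthe. This is only a minimal sketch, assuming the S3 object has already been decoded into a map with string keys; the module name `LensSketch` and the keys `"root"`, `"child"`, and `"next_child"` are hypothetical and do not come from the thread.

```elixir
defmodule LensSketch do
  # Each function receives the value produced by its parent and narrows it,
  # the way a child GraphQL field resolves against its parent object.

  # Root field: pick the top-level object out of the decoded document.
  def root(doc), do: Map.get(doc, "root")

  # Child field: narrow the root object to one of its nested objects.
  def child(root), do: get_in(root, ["child"])

  # Next child field: narrow again, using the child object as context.
  def next_child(child), do: get_in(child, ["next_child", "value"])
end
```

Because `get_in/2` returns `nil` when a key is missing from a map, a field that is absent at any level yields `nil` rather than raising.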

ZachDaniel•3y ago

TBH I think you might be surprised at how easy it is to get what you want without needing to write a custom data layer. With the simple data layer (the default) and some embedded resources you can get pretty much everything you want. If your structure is basically the same, you can do this:

defmodule SomeNestedResource do
  use Ash.Resource, data_layer: :embedded

  attributes do
    attribute :foo, :integer
  end
end

defmodule MyApp.Resource do
  use Ash.Resource # no data layer

  attributes do
    # Everything needs a primary key in Ash. This generates one automatically
    # on every read; if the data already has a primary key, you could do
    # something like this instead:
    # attribute :id, :integer, primary_key?: true
    uuid_primary_key :id
    attribute :foo, :string
    attribute :some_nested_resources, {:array, SomeNestedResource}
  end

  actions do
    defaults [:create, :read]
  end

  preparations do
    # preparations declared here run on every read action
    prepare fn query, _ ->
      Ash.Query.before_action(query, fn query ->
        data =
          read_file()
          |> Enum.map(fn json_object ->
            __MODULE__
            # this will cast all of the embedded resources
            |> Ash.Changeset.for_create(:create, json_object)
            |> YourApi.create!()
          end)

        Ash.DataLayer.Simple.set_data(query, data)
      end)
    end
  end
end

If the embedded resources are mostly the same, then you can do things like add arguments to the create/update actions on those resources:

# in embedded resource
actions do
  create :create do
    argument :foo, :string
    change set_attribute(:bar, arg(:foo)) # map nested attribute name to a different name 
  end
end

You could also use "regular elixir" to do said transformation in the preparation:

        data = 
          read_file()
          |> transform_into_your_structure()
          |> Enum.map(fn json_object -> 
            __MODULE__
            # this will cast all of the embedded resources
            |> Ash.Changeset.for_create(:create, json_object) 
            |> YourApi.create!()
          end)
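
The thread leaves `transform_into_your_structure()` undefined. As a rough sketch of what such a step might look like, assume a hypothetical input of the form `%{"items" => [%{"Foo" => ..., "children" => [%{"fooValue" => ...}]}]}`; that input shape and the module name `TransformSketch` are invented for illustration. The function reshapes the decoded JSON into one map per record, keyed by the attribute names used on the resource above.

```elixir
defmodule TransformSketch do
  # Reshape a decoded JSON document (hypothetical shape) into a list of
  # maps whose keys match the resource's attribute names.
  def transform_into_your_structure(%{"items" => items}) when is_list(items) do
    Enum.map(items, fn item ->
      %{
        "foo" => Map.get(item, "Foo"),
        "some_nested_resources" =>
          item
          |> Map.get("children", [])
          |> Enum.map(fn child -> %{"foo" => Map.get(child, "fooValue")} end)
      }
    end)
  end
end
```

Each resulting map is what would then be handed to the create step in the pipeline above.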

And finally, if you have lots of transformations you need to do and you want to do them lazily based on what is selected in the GraphQL query, you would use calculations instead of attributes, for example:

attributes do
  attribute :json, :map do
    private? true
  end
end

calculations do
  calculate :foo, :string, GetFoo
end

and GetFoo would look like this:

defmodule GetFoo do
  use Ash.Calculation
  def calculate(records, _, _) do
    Enum.map(records, fn record -> 
      record.json["foo"] |> transform_in_some_way()
    end)
  end
end

And that calculation pattern should work for embedded resources too: you can have a calculation that produces an embedded resource where you just set the json attribute (which won't appear in the GraphQL schema), and then have calculations on it that handle the nested attributes. One of those strategies ought to do the trick, and then Ash and AshGraphql will handle the rest for you.
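
To make the data flow of that nested pattern concrete, here is a plain-Elixir mock that does not use Ash at all. It only mirrors the `calculate(records, _, _)` shape from the GetFoo example; the struct `NestedSketch`, the modules `GetNestedSketch` and `GetNestedFooSketch`, and the `"nested"` key are hypothetical stand-ins. In a real resource these would be Ash calculations and an embedded resource.

```elixir
defmodule NestedSketch do
  # Stand-in for an embedded resource that only carries its raw JSON.
  defstruct [:json]
end

defmodule GetNestedSketch do
  # Parent-level step: wrap the nested JSON without transforming it yet.
  # Returns one result per input record, in the same order.
  def calculate(records, _opts, _context) do
    Enum.map(records, fn record ->
      %NestedSketch{json: Map.get(record.json, "nested", %{})}
    end)
  end
end

defmodule GetNestedFooSketch do
  # Nested-level step: pull a single field out of the carried JSON,
  # only when this function is actually called.
  def calculate(records, _opts, _context) do
    Enum.map(records, fn record -> Map.get(record.json, "foo") end)
  end
end
```

The point of the sketch is that the nested value is extracted only when the second step runs, which is the lazy behaviour described above.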

jedschneiderOP•3y ago

@Zach Daniel this is such a thoughtful response, thanks so much, I'm gonna give this a go this week and see how it works out. Really appreciate it!

ZachDaniel•3y ago

LMK how it goes! The other nice thing about this setup is that your data will be filterable and sortable OOTB. We filter and sort in-memory collections like what you'd return from your actions, so it will be filterable automatically via the generated GraphQL.

jedschneiderOP•3y ago

yeh, thats part of what i was counting on, and granular policies on the resource layer

ZachDaniel•3y ago

(Of course we do it in the database for other data layers)
