Ash query with 1 filter performs worse than the same Ecto query

We've noticed quite bad performance when querying data produced by paper_trail using Ash. The same kind of queries made with regular Ecto.Query perform much better. The tables involved are huge (it's paper trail after all), but the column by which we filter is indexed. Upon examining the difference between the produced queries, the only difference I see is in the WHERE clause.

Ash:
OurApp.Resource.Version
|> Ash.Query.new()
|> Ash.Query.filter(document_id: document_id)
|> Ash.read!()
... WHERE (d0."document_id"::bigint::bigint = $1::bigint::bigint) [12345]
Ecto:
Repo.all(from(s in OurApp.Resource.Version, where: s.document_id == ^document_id))
... WHERE (d0."document_id" = $1) [12345]
The document_id is added to the Version resources using attributes_as_attributes([:document_id]) within paper_trail. Why is this casting produced in the Ash version of the query necessary? Is it something that can be avoided by better definitions and or configuration?
10 Replies
Rebecca Le
Rebecca Le4mo ago
what version of Ash are you using? I think Zach fixed something about that recently https://discord.com/channels/711271361523351632/1245194918490542132/1379839226966835281
arconaut
arconautOP4mo ago
In prod we currently have ash 3.4.67 and ash_postgres 2.5.10. But I've tried locally with ash 3.5.18 and ash_postgres 2.6.6 - same SQL query is produced.
ZachDaniel
ZachDaniel4mo ago
😡 will fix. Question: is the performance issue related to not using an index? nvm, I can see that it is from your original question 🙂 Is document_id actually a bigint in your database? AFAIK this should not stop the query from using an index; I did some local testing to that effect.

I removed the double query casting issue in a previous version, but it hit some edge cases where the type casting was doing real work, so I added it back in certain cases temporarily (we're not intentionally double type-casting, but it was a cosmetic issue, so better that than queries failing). So the thing we need to figure out at this point is:
- is the type casting actually preventing you from using the index?
- if not, is the type casting actually the source of the issue?

I'm working on a more targeted patch to ash_sql to reduce the double casting, but we'll need to confirm what the cause is. i.e. if you have access to production, can you run that same Ecto query but like this:
Repo.all(from(s in OurApp.Resource.Version,
  where: fragment("?::bigint::bigint", s.document_id) == fragment("?::bigint::bigint", ^document_id)))
Also be aware that depending on how you are testing, the first time you run any Ecto query you pay a caching cost. So if you're testing locally, for example by running the Ash query and then the Ecto query, make sure you run both once before comparing.

Okay, so I have a fix for the double casting. BUT: keep in mind that in general, index usage is typically helped by type casts, because they resolve ambiguity around the type of something during planning. This may be slightly outdated depending on the Postgres version you are on, but generally speaking, casting to the types expected by an index at least won't hurt. So it's possible there is some other kind of performance issue 🙂 i.e. maybe there is some kind of issue with loading data into Ash structs etc.

On ash_sql version 0.2.80 that query should now be generated as
WHERE (d0."document_id"::bigint = $1::bigint) [12345]
I did some benchmarking, and there is overhead to using an Ash action for a lookup by id, but it's roughly 100µs, which probably isn't worth optimizing at this point. i.e. ~100µs to look up by id with Ecto, and ~200µs with Ash. That's important, and there are a whole slew of code optimizations that could be made to that end, but still, 200µs is pretty fast. @arconaut any updates on this? The latest ash_sql version 0.2.80 should fix the double casting, but I'd still be pretty surprised if that's your issue.
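A sketch of the kind of side-by-side benchmark described above, using the Benchee library (the `document_id` value and warmup approach are assumptions, not from this thread; remember to run each query once beforehand so both sides have paid their query-cache cost):

```elixir
import Ecto.Query

# Sketch only: assumes a seeded document_id that exists in the table.
document_id = 12345

Benchee.run(%{
  "ecto lookup" => fn ->
    Repo.all(from(s in OurApp.Resource.Version, where: s.document_id == ^document_id))
  end,
  "ash lookup" => fn ->
    OurApp.Resource.Version
    |> Ash.Query.new()
    |> Ash.Query.filter(document_id: document_id)
    |> Ash.read!()
  end
})
```

Benchee handles warmup runs by default, which also neutralizes the first-run caching cost mentioned above.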
Failz
Failz4mo ago
Run the ash query through explain/analyze and share the results.
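For the Ecto side, one way to capture that plan from IEx without hand-writing SQL is `Ecto.Adapters.SQL.explain/4` from ecto_sql (a sketch; repo and resource names are taken from this thread, and `analyze: true` actually executes the query):

```elixir
import Ecto.Query

query = from(s in OurApp.Resource.Version, where: s.document_id == ^document_id)

# Returns the EXPLAIN ANALYZE output as a string
Repo
|> Ecto.Adapters.SQL.explain(:all, query, analyze: true)
|> IO.puts()
```

For the Ash query, you can grab the generated SQL from the debug logs and run EXPLAIN ANALYZE on it directly in psql.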
arconaut
arconautOP4mo ago
Sorry, a bit of a busy day yesterday, didn't get the chance to come back to it. I'll try to look into this today. I can confirm that the double casting is indeed gone in ash_sql 0.2.80. I've run EXPLAIN ANALYZE for both versions of the query, each time with a new connection (subsequent runs do indeed use the cache). The version with casting (as produced by ash_sql 0.2.80); often this even times out after 15 sec:
Gather (cost=1000.00..1085496.95 rows=60757 width=616) (actual time=6635.784..9291.554 rows=20 loops=1)
Workers Planned: 2
Workers Launched: 1
-> Parallel Seq Scan on proofreading_issues_versions p0 (cost=0.00..1078421.25 rows=25315 width=616) (actual time=7942.632..9267.804 rows=10 loops=2)
Filter: ((document_id)::bigint = '791785'::bigint)
Rows Removed by Filter: 5861568
Planning Time: 0.253 ms
Execution Time: 9291.610 ms
The version without casting (the plain Ecto query):
Index Scan using proofreading_issues_versions_document_id_index on proofreading_issues_versions p0 (cost=0.43..550.18 rows=162 width=616) (actual time=0.061..1.317 rows=20 loops=1)
Index Cond: (document_id = 791785)
Planning Time: 0.370 ms
Execution Time: 1.345 ms
It looks like the index is indeed ignored when the argument is cast. The document_id column is of type integer in that table; is that perhaps the reason? We may have manually written the DB migrations for the paper_trail tables (from not knowing how to generate them back then). If it helps, this is PostgreSQL 17.4.
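For reference: the plain btree index on the integer column can't serve a predicate on `(document_id)::bigint`, because the indexed expression no longer matches. An expression index over the cast would match it (a hypothetical workaround sketch, index name invented, not something suggested in the thread; the type fix discussed below is the cleaner solution):

```sql
-- The existing index covers the raw integer column:
--   proofreading_issues_versions_document_id_index ON (document_id)
-- and cannot serve the predicate (document_id)::bigint = $1.
-- An expression index over the cast matches that predicate exactly:
CREATE INDEX proofreading_issues_versions_document_id_bigint_index
  ON proofreading_issues_versions ((document_id::bigint));
```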
Failz
Failz4mo ago
The slow seq scan version is definitely missing the index. Seq scan means it's reading the entire table. Can you share the output of \d proofreading_issues_versions?

Might also be worth running ANALYZE on the proofreading_issues_versions table. The query planner estimated rows=25315 would be returned when it really returned rows=10. ANALYZE will update the statistics the Postgres query planner uses and help it make better choices.

edit: I didn't read the bottom paragraph of your post before responding. 🤦‍♂️ If the db column is integer and the query is casting to bigint, I think that explains the index miss. There are certain situations where Postgres will handle this gracefully, but I don't know or understand all of them. Initially I would have thought that because your query parameter of 791785 is within the integer range it would have been able to convert it, but obviously it's missing the index. If the query parameter had been something above the integer max value, I would expect an index miss.
ZachDaniel
ZachDaniel4mo ago
Yes, that will be the problem 🙂 Let me look at something. Okay: so right now we are casting all integers to bigints, and I think no one cared because the migration generator also creates all integer fields as bigints. I will figure out a way for you to express on the attribute that it is a regular integer, to avoid the type casting, and then you should be good to go 👌
arconaut
arconautOP4mo ago
Thanks, Zach! Sounds good! As homework for our team: we should be more careful with integer types (maybe we need to write some DB migrations).
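If the team does go the migration route, widening the column to match Ash's default is one option (a sketch with an invented migration name, assuming the table from the EXPLAIN output; note the cost on a table this large):

```elixir
defmodule OurApp.Repo.Migrations.WidenVersionsDocumentId do
  use Ecto.Migration

  # Widening integer -> bigint rewrites the table and holds an
  # ACCESS EXCLUSIVE lock for the duration; on a paper_trail table
  # with millions of rows, schedule this for a maintenance window.
  def change do
    alter table(:proofreading_issues_versions) do
      modify :document_id, :bigint, from: :integer
    end
  end
end
```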
ZachDaniel
ZachDaniel4mo ago
Something that could also be pretty useful is for Ash to have a tool you can run to find things like this, since we know the types of everything. @arconaut I think you actually have the tools to do this right now:
defmodule MyApp.Types.SmallInt do
  use Ash.Type.NewType, subtype_of: :integer

  # :int4 is Postgres's plain 4-byte `integer` type, matching the column
  # (smallint would be :int2), so ash_sql casts to ::int4 instead of ::bigint
  def storage_type(_), do: :int4
end
See how that does for you, then use it in place of your integer type.
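Used in a resource, that might look like the following (a sketch only; since document_id comes from paper_trail's attributes_as_attributes, the exact place to override the type may differ in practice):

```elixir
# Hypothetical resource attribute using the custom type, so generated
# SQL casts to ::int4, matching the integer column and its index.
attributes do
  attribute :document_id, MyApp.Types.SmallInt do
    allow_nil? false
    public? true
  end
end
```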
arconaut
arconautOP4mo ago
Oh, interesting, thanks!