st_distance vs <-> in ash_geo for nearest neighbor search/filter (knn)

I'm just digging into ash_geo and attempting to implement a knn filter, as described here: https://postgis.net/workshops/postgis-intro/knn.html
Unlike a distance search, the “nearest neighbour” search doesn’t include any measurement restricting how far away candidate geometries might be, features of any distance away will be accepted, as long as they are the nearest. PostgreSQL solves the nearest neighbor problem by introducing an “order by distance” (<->) operator that induces the database to use an index to speed up a sorted return set. With an “order by distance” operator in place, a nearest neighbor query can return the “N nearest features” just by adding an ordering and limiting the result set to N entries.
I see the st_distance function, in both ash_geo and geo_postgis, but don't yet grok how to use it like the <-> operator, to get an unbounded, sorted result set. I initially dove in with st_distance but then realized I was looking at something ~like:
filter expr(^st_distance(^arg(:search_point), location) < some_integer_distance?) # not quite what we want
The application idea is a user types in an address, I geocode that and get {long, lat}, and run a knn against that to see the nearest points of interest. Is there a nearest function I'm overlooking or some other way to filter and order geometries? Manually run raw SQL? Thanks! (Sorry if this is the wrong place, I didn't see an ash_geo tag)
14 Replies
ZachDaniel (2y ago)
You can use fragment to embed a raw SQL fragment in your expression, e.g. fragment("? + ?", field1, field2). Does that help?
\ ឵឵឵ (2y ago)
For kNN, you definitely want order by ? <-> ?, as the index performance is significantly better. The rest gets a little finicky:
- Currently, using this in a sort clause requires that you create an expression-based calculation.
- This will need to be a module-based Ash.Calculation, implementing expression/2.
- Even module-based calcs only receive the context, not a changeset, so you would need to use the set_context change builtin.
- As far as I know, set_context doesn't support the arg(:?) syntax, but it does accept an MFA that receives the changeset, so you need to provide one that extracts the argument(s) you want.
- Your calc module can then use these to build the expression, which you can then sort on.
Take all of the above with a grain or two of salt, but either way it might be a more maintainable solution for now to do a manual read/modify_query.
ZachDaniel (2y ago)
FWIW, you can technically use Ash.Calculation.new and put that in a sort clause. Calculations can accept arguments, also:
calculate :score_plus_n, :integer, expr(score + ^arg(:n)) do
  argument :n, :integer, allow_nil?: false
end
load(score_plus_n: %{n: 10}) and sort: [score_plus_n: {:asc, %{n: 10}}]
\ ឵឵឵ (2y ago)
This part seems fine:
calculate :distance_to, :float, expr(fragment("? <-> ?", location, ^arg(:search_point))) do
  argument :search_point, App.NarrowedPointType, allow_nil?: false
end
But it looks like the arguments to the calculation in load are statically supplied in the example. The argument is meant to be a geometry (point) supplied as user input. Can load, sort or calculate capture arguments from the action?
ZachDaniel (2y ago)
Kind of. But the idea is that you pass the values from the client into the call to load/sort. For example, in gql those calculation arguments are added as arguments to the field.
axdc (OP, 2y ago)
I have my system working! The action:
read :nearest do
  argument :location, :geo_any do
    allow_nil? false
    constraints geo_types: :point, force_srid: 4326
  end

  modify_query {Panacea.Lociary.ManualNearest, :modify, []}
end
The "modified" query (I guess I'm more making one from scratch using the ash argument):
defmodule Panacea.Lociary.ManualNearest do
  import Ecto.Query

  # The incoming Ecto query is ignored; a new query is built from scratch.
  def modify(ash_query, _ecto_query) do
    location = ash_query.arguments.location

    # Order by the <-> distance operator and keep the 10 nearest rows.
    modified_ecto_query =
      from listing in Panacea.Lociary.Listing,
        where: not is_nil(listing.location),
        order_by: [asc: fragment("? <-> ?", listing.location, ^location)],
        limit: 10

    # Debug output of the generated query.
    IO.inspect(modified_ecto_query)
    {:ok, modified_ecto_query}
  end
end
I'm going to have to filter out some obviously bad data in addition to nils (I've got some null islands and some just... strange entries), but it's working! (I'm not sure if this is optimal, or what the pros/cons are vs. a calculate-based approach.)
ZachDaniel (2y ago)
I'd suggest using a calculation and a preparation, as opposed to modify_query.
require Ash.Query # <- don't forget this for `Ash.Query.filter`

actions do
  read :nearest do
    argument :location, :geo_any do
      allow_nil? false
      constraints geo_types: :point, force_srid: 4326
    end

    # you can also use a module here
    prepare fn query, _ ->
      query
      |> Ash.Query.limit(10)
      |> Ash.Query.filter(not is_nil(location))
      |> Ash.Query.sort(distance_from: {:asc, %{location: query.arguments.location}})
    end
  end
end

calculations do
  # `<->` returns a float distance (double precision in Postgres)
  calculate :distance_from, :float, expr(fragment("? <-> ?", location, ^arg(:location))) do
    argument :location, :geo_any, constraints: [geo_types: :point, force_srid: 4326]
  end
end
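For the "you can also use a module here" comment, a minimal sketch of what a module-based preparation might look like. The module name MyApp.Preparations.SortByDistance is hypothetical, and the sort tuple shape is copied from the snippet above, so check it against the Ash version in use. It assumes the resource defines the distance_from calculation and the action declares a :location argument.

```elixir
defmodule MyApp.Preparations.SortByDistance do
  # Hypothetical module; the read action would reference it with
  # `prepare MyApp.Preparations.SortByDistance` instead of the anonymous fn.
  use Ash.Resource.Preparation

  require Ash.Query

  @impl true
  def prepare(query, _opts, _context) do
    # Read the action argument supplied by the caller.
    search_point = Ash.Query.get_argument(query, :location)

    query
    # `location` here is the resource attribute, not a local variable.
    |> Ash.Query.filter(not is_nil(location))
    # Sort tuple shape taken from the thread; it may differ between Ash versions.
    |> Ash.Query.sort(distance_from: {:asc, %{location: search_point}})
    |> Ash.Query.limit(10)
  end
end
```

Keeping the logic in a module makes it reusable across several read actions, at the cost of one more file to maintain.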
\ ឵឵឵ (2y ago)
I like this, because what I really want to say is:
filter expr(not is_nil(location))
sort :distance_from, location: arg(:location)
limit 10
Reading the body of the prepare fn, this seems like it is closest to that.
axdc (OP, 2y ago)
I'm getting error: undefined variable "listing". Reading the prepare docs now; I haven't used this part of Ash yet! Is a benefit of using Ash primitives like preparations and calculations, rather than modify_query, that it will be simpler to integrate with other Ash functionality like pagination?
ZachDaniel (2y ago)
Add require Ash.Query at the top. And yes, it's exactly that kind of thing 🙂 Oh, lol, Ash.Query.where is not a thing, it's Ash.Query.filter, sorry 😆 I have been writing Ecto code recently.
axdc (OP, 2y ago)
given that correction:
* filter: Invalid reference listing.location at relationship_path [:listing]
at filter
Nice, definitely wanna stay in-ecosystem for all the goodies 🙂
ZachDaniel (2y ago)
Sorry, remove the listing. prefix. I copied it from your example; in Ecto you use bindings, but in Ash it's implicit.
axdc (OP, 2y ago)
require Ash.Query
read :nearest do
  argument :location, :geo_any do
    allow_nil? false
    constraints geo_types: :point, force_srid: 4326
  end

  prepare fn query, _ ->
    query
    |> Ash.Query.filter(not is_nil(location))
    |> Ash.Query.sort(distance_from: {:asc, %{location: query.arguments.location}})
  end

  pagination do
    offset? true
  end
end
calculations do
  # `<->` returns a float distance (double precision in Postgres)
  calculate :distance_from,
            :float,
            expr(fragment("? <-> ?", location, ^arg(:location))) do
    argument :location, :geo_any, constraints: [geo_types: :point, force_srid: 4326]
  end
end
ZachDaniel (2y ago)
The other benefit to this strategy is that you can also do things like:
Resource
|> Ash.Query.for_read(:nearest, %{location: location})
|> Ash.Query.load(distance_from: %{location: location})
You can return the distance, or filter on it: Ash.Query.filter(distance_from(location: ^location) < ^some_threshold)
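Putting those pieces together, here is a rough sketch of a helper that builds the query from a geocoded {long, lat} pair. The module and function names (NearestExample, nearest_within/3) are made up for illustration, and it assumes the resource has the :nearest action and distance_from calculation from this thread. Since the points are geometries with SRID 4326, the value produced by <-> is in degrees, not meters, so max_distance must be given in degrees too.

```elixir
defmodule NearestExample do
  # Hypothetical helper, not part of ash_geo or Ash.
  require Ash.Query

  # Builds (but does not run) a query for the nearest records within
  # `max_distance` of the given {lng, lat} pair.
  def nearest_within(resource, {lng, lat}, max_distance) do
    # Geo.Point comes from the `geo` library; PostGIS points are {x, y} = {lng, lat}.
    point = %Geo.Point{coordinates: {lng, lat}, srid: 4326}

    resource
    |> Ash.Query.for_read(:nearest, %{location: point})
    # Load the distance so it is returned on each record.
    |> Ash.Query.load(distance_from: %{location: point})
    # Outside variables must be pinned with ^ inside filter expressions.
    |> Ash.Query.filter(distance_from(location: ^point) < ^max_distance)
  end
end
```

The helper returns the query without running it, so the caller can pass it to whichever read function their Ash version and API/domain setup provides.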
