TanStack4w ago
ambitious-aqua

Pattern: Populating a collection "as needed" with multiple Queries throughout app - no "base" query.

This thread explores using various useQuery hooks throughout the app to "feed" our collections, only adding records as they're called for throughout the app. We don't want to force the user to load entire db tables since 95% of the time they will only use a small % of them. We do want to progressively build up a store and use its querying methods to performantly select from the data that has been fetched. As of 7/31 we have "manual sync update methods" per this issue: https://github.com/TanStack/db/issues/294, though I'm not certain that quite addresses our goal, as it still seems to require the collection to be built on a default query. Within our app, we can't be certain which query will be called upon first to begin populating a given collection. In theory I'd like to do this:
import { dispatchCollection, vehicleDispatchCollection, workOrderCollection } from '@/db/dispatch.collections'

export const useDbDispatchDateQuery = (date: string) => {
  return useQuery({
    queryKey: [GET_DB_DISPATCH_DATE_QUERY_KEY, { date }],
    queryFn: async () => {
      const result = await mainGraphQLClient.request(GET_DB_DISPATCH_DATE, { date })

      // A queryFn must not resolve to undefined, so return null when there is no data.
      if (!result.dbDispatchesForDispatchDate) return null
      const { dispatches, vehicleDispatches, workOrders } = result.dbDispatchesForDispatchDate

      dispatchCollection.utils.syncUpsert(dispatches)
      vehicleDispatchCollection.utils.syncUpsert(vehicleDispatches)
      workOrderCollection.utils.syncUpsert(workOrders)

      return result.dbDispatchesForDispatchDate
    },
    enabled: !!date,
  })
}
...with the hope that it would update any records with a matching key (unique ID), and insert any new records. I'm already seeing a few related questions/issues/discord messages around a similar topic, so hopefully this thread can be a good centralized place to discuss the optimal approach for this. Pinging @Kyle Mathews since he helpfully pointed out the manual sync update.
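The upsert semantics described above (update on a matching key, insert otherwise) can be sketched in plain TypeScript. This is only an illustration of the intended behavior, not TanStack DB's implementation: `upsertAll`, the `Map`-based store, and the sample rows are all hypothetical.

```typescript
// Hypothetical helper, not part of TanStack DB: upsert rows into a keyed store.
type Row = { id: string };

function upsertAll<T extends Row>(store: Map<string, T>, rows: readonly T[]): Map<string, T> {
  for (const row of rows) {
    const existing = store.get(row.id);
    // Matching key: shallow-merge so newly fetched fields overwrite old ones.
    // New key: insert the row as-is.
    store.set(row.id, existing ? { ...existing, ...row } : row);
  }
  return store;
}

// Two fetches that overlap on one id (sample data, made up for the sketch).
type Dispatch = { id: string; status: string; note?: string };
const dispatches = new Map<string, Dispatch>();
upsertAll(dispatches, [{ id: "d1", status: "open", note: "first fetch" }]);
upsertAll(dispatches, [
  { id: "d1", status: "closed" },
  { id: "d2", status: "open" },
]);
```

After both calls the store holds two rows, and `d1` carries the newer `status` while keeping the `note` from the first fetch.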
GitHub
Add Manual Sync Updates API to @tanstack/query-db-collection · Iss...
Background The @tanstack/query-db-collection package integrates TanStack Query with TanStack DB collections, providing automatic synchronization between query results and collection state. Currentl...
24 Replies
skilled-lime
skilled-lime4w ago
if you want to lazily load data as needed — check out and comment on these proposals https://github.com/TanStack/db/issues/343 & https://github.com/TanStack/db/issues/315
GitHub
Paginated / Infinite Collections · Issue #343 · TanStack/db
A common request we are receiving is to lazily load data into a "query collection" using the infinite query pattern. We need to consider how to support this in a way that is then useable ...
GitHub
Partitioned collections · Issue #315 · TanStack/db
A very common use case, and question, is how to handle collections where you don't want to download all of it. Such as issues in an issue tracker, downloading by project/status/createdData etc....
ambitious-aqua
ambitious-aquaOP4w ago
Reading them now. Also you mentioned a "derived" collection that joins the query cache of multiple queries. That sounds like it could work, but I'm not seeing anything in the docs about derived collections or how to build them?
skilled-lime
skilled-lime4w ago
Also, you don't want to manually dispatch stuff like that as a normal course. Queries are derived collections.
skilled-lime
skilled-lime4w ago
Overview | TanStack DB Docs
TanStack DB Documentation Welcome to the TanStack DB documentation. TanStack DB is a reactive client store for building super fast apps on sync. It extends TanStack Query with collections, live querie...
ambitious-aqua
ambitious-aquaOP4w ago
Oh, I was thinking react query queries, not live queries.
skilled-lime
skilled-lime4w ago
so
const dispatchCollection = createLiveQueryCollection({
  startSync: true,
  query: (q) =>
    // The alias given to from() must match the name destructured in where().
    q.from({ item: graphqlCollection }).where(({ item }) => eq(item.type, `dispatch`)),
})
ambitious-aqua
ambitious-aquaOP4w ago
So that shows deriving from imported collection instances, not react query queries - does that mean that I should replace my useQuery hooks with their own createCollection( queryCollectionOptions({}) ) and then use the pattern you just shared to extract the relevant data types from each of them?
skilled-lime
skilled-lime4w ago
yup! Just convert the response into a flat array w/ a type field of some sort and then you can easily split them into their own collections
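A minimal sketch of that flattening step, assuming the response shape from the first message (`dispatches`, `vehicleDispatches`, `workOrders`). The `type` values, the composite `key` field, and the `flattenResponse` name are assumptions made for illustration.

```typescript
type Entity = { id: string };
type DispatchDateResponse = {
  dispatches: Entity[];
  vehicleDispatches: Entity[];
  workOrders: Entity[];
};
type SourceType = "dispatch" | "vehicleDispatch" | "workOrder";
// `key` combines type and id so rows of different types never collide.
type SourceItem = Entity & { type: SourceType; key: string };

function flattenResponse(res: DispatchDateResponse): SourceItem[] {
  // Build a mapper that tags each row with its type and a composite key.
  const tag = (type: SourceType) => (row: Entity): SourceItem => ({
    ...row,
    type,
    key: `${type}:${row.id}`,
  });
  return [
    ...res.dispatches.map(tag("dispatch")),
    ...res.vehicleDispatches.map(tag("vehicleDispatch")),
    ...res.workOrders.map(tag("workOrder")),
  ];
}

const flat = flattenResponse({
  dispatches: [{ id: "1" }],
  vehicleDispatches: [{ id: "1" }],
  workOrders: [{ id: "7" }],
});
```

A derived collection can then filter on `type`, and a composite value like `key` is a natural candidate for `getKey`.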
ambitious-aqua
ambitious-aquaOP4w ago
So when multiple collections have a given ID, presumably I'll need to write custom logic so it knows how to reconcile which one should win? (probably just comparing "modifiedAt" in most cases)
skilled-lime
skilled-lime4w ago
what collections are you talking about?
ambitious-aqua
ambitious-aquaOP4w ago
Let's say I have three queries throughout the app that all fetch datasets that include Dispatch records, and though their filtersets are unique, they likely all contain some overlap - same database rows, but fetched at different times. The derived collection would be trying to combine all Dispatch records from all three query collections, and would see the same unique ID up to three times.
skilled-lime
skilled-lime4w ago
ah ok, then you could do a full outer join to merge the overlapping collections
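Independent of the query builder, the "newest modifiedAt wins" rule raised earlier can be sketched in plain TypeScript. This is not a TanStack DB join; `mergeNewest` and the sample rows are hypothetical, and `modifiedAt` is assumed to be an ISO 8601 string.

```typescript
type Versioned = { id: string; modifiedAt: string }; // ISO 8601 timestamps assumed

function mergeNewest<T extends Versioned>(...sources: readonly T[][]): T[] {
  const byId = new Map<string, T>();
  for (const rows of sources) {
    for (const row of rows) {
      const current = byId.get(row.id);
      // Keep the incoming row only if the id is new or the row is strictly newer.
      if (!current || Date.parse(row.modifiedAt) > Date.parse(current.modifiedAt)) {
        byId.set(row.id, row);
      }
    }
  }
  return Array.from(byId.values());
}

// Two overlapping result sets fetched at different times (made-up data).
const byDate = [{ id: "d1", modifiedAt: "2025-07-30T10:00:00Z", status: "open" }];
const byJob = [
  { id: "d1", modifiedAt: "2025-07-31T09:00:00Z", status: "closed" },
  { id: "d2", modifiedAt: "2025-07-29T08:00:00Z", status: "open" },
];
const merged = mergeNewest(byDate, byJob);
```

Here `d1` appears in both sources, and the copy with the later `modifiedAt` is the one that survives.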
ambitious-aqua
ambitious-aquaOP4w ago
Hmm, is there a way to create a queryCollectionOptions with a dynamic param like date? Right now both enabled and queryKey can't access date.
export const dbDispatchDateQueryCollection = createCollection(
  queryCollectionOptions({
    queryKey: [GET_DB_DISPATCH_DATE_QUERY_KEY, { date }],
    queryFn: async ({ date }: { date: string }) => {
      const result = await mainGraphQLClient.request(GET_DB_DISPATCH_DATE, { date })

      return result.dbDispatchesForDispatchDate
    },
    getKey: item => item.id,
    enabled: !!date,
  })
)
skilled-lime
skilled-lime4w ago
Make a factory function?
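One reading of the factory suggestion, as a rough sketch: close over the dynamic parameter and memoize one instance per value. `createKeyedFactory` is a hypothetical name, and the object returned below is only a stand-in for `createCollection(queryCollectionOptions({ ... }))`.

```typescript
// Returns a getter that creates one instance per distinct param and reuses it.
function createKeyedFactory<P extends string | number, T>(create: (param: P) => T) {
  const cache = new Map<P, T>();
  return (param: P): T => {
    let instance = cache.get(param);
    if (instance === undefined) {
      instance = create(param);
      cache.set(param, instance);
    }
    return instance;
  };
}

// Stand-in for a collection: the param is now in scope for queryKey and enabled.
let created = 0;
const getByDate = createKeyedFactory((date: string) => {
  created += 1;
  return { queryKey: ["dbDispatchDate", { date }] as const, enabled: date.length > 0 };
});

const a = getByDate("2025-07-31");
const b = getByDate("2025-07-31"); // same instance as `a`
const c = getByDate("2025-08-01"); // new instance for a new date
```

Note that a plain `Map` holds every instance strongly, so entries stay alive until something removes them.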
ambitious-aqua
ambitious-aquaOP4w ago
Wouldn't that mean that the collections are destroyed when the component that calls them for a given date is unmounted? Edit: Apparently not. AI studio has informed me that I am dumb. So is this the right idea? (aside from the goofy short gcTime)
// src/lib/createCollectionFactory.ts (NEW UTILITY FILE)
import { Collection } from '@tanstack/react-db';

// The TTL should be slightly longer than the collection's gcTime to avoid race conditions.
const CACHE_TTL = 6000; // 6 seconds for a 5-second gcTime

// This is a higher-order function: a function that creates a factory function.
export function createCollectionFactory<TItem, TParams extends string | number>(
  creatorFn: (params: TParams) => Collection<TItem>
) {
  const cache = new Map<TParams, Collection<TItem>>();
  const timeouts = new Map<TParams, NodeJS.Timeout>();

  return (params: TParams): Collection<TItem> => {
    // If a cleanup timer was set for this key, cancel it because we're using it again.
    if (timeouts.has(params)) {
      clearTimeout(timeouts.get(params));
      timeouts.delete(params);
    }

    // If the collection is already in the cache, return it.
    if (cache.has(params)) {
      return cache.get(params)!;
    }

    // Otherwise, create a new collection instance.
    const newCollection = creatorFn(params);
    cache.set(params, newCollection);

    // IMPORTANT: Listen for when the collection is cleaned up by Tanstack DB's gcTime.
    // When it is, we remove it from our factory's cache to prevent memory leaks.
    const unsubscribe = newCollection.subscribe(() => {
      if (newCollection.status === 'cleaned-up') {
        cache.delete(params);
        timeouts.delete(params); // Clean up any stray timeout
        unsubscribe(); // Clean up the subscription itself
      }
    });

    return newCollection;
  };
}
Then use it like this:
// src/features/dispatch/dispatch.source-collections.ts (REVISED FACTORY)
import { createCollection } from '@tanstack/react-db';
import { queryCollectionOptions } from '@tanstack/query-db-collection';
import { createCollectionFactory } from '@/lib/createCollectionFactory'; // <-- IMPORT THE NEW UTILITY

// ... other imports and type definitions ...
type SourceItem = DispatchType | VehicleDispatchType | WorkOrderType;

// --- Factory for the Date-based Query ---
export const getSourceCollectionByDate = createCollectionFactory((date: string) => {
  return createCollection(
    queryCollectionOptions<SourceItem>({
      queryKey: ['dbDispatchDate', { date }],
      queryFn: async () => { /* ... fetch and flatten data ... */ },
      getKey: (item) => `${item.__typename}:${item.id}`,
      // You can configure gcTime here if you want it to be longer or shorter than 5s
      // gcTime: 30000, // e.g., 30 seconds
    })
  );
});

// --- Factory for the Job-based Query ---
export const getSourceCollectionByJob = createCollectionFactory((jobId: string) => {
  return createCollection(
    queryCollectionOptions<SourceItem>({
      queryKey: ['dispatchesByJob', { jobId }],
      queryFn: async () => { /* ... fetch and flatten data ... */ },
      getKey: (item) => `${item.__typename}:${item.id}`,
    })
  );
});
Then to create a unified live query, we'd need to manage the generated collections in global state, right? eg.
// src/hooks/useRegisterSourceCollection.ts
// ...imports

export type SourceItem = DispatchType | VehicleDispatchType | WorkOrderType;
export const activeSourceCollectionsAtom = atom<Collection<SourceItem>[]>([]);

// A module-level map to keep track of removal timers for each collection instance.
const removalTimers = new Map<Collection<SourceItem>, NodeJS.Timeout>();

export const useRegisterSourceCollection = (collection: Collection<SourceItem> | null) => {
  const setActiveSourceCollections = useSetAtom(activeSourceCollectionsAtom);

  useEffect(() => {
    if (!collection) return;

    // Check if a removal timer is pending for this collection.
    // If so, the user has returned to a view using this collection before it was GC'd.
    if (removalTimers.has(collection)) {
      // Cancel the pending removal.
      clearTimeout(removalTimers.get(collection)!);
      removalTimers.delete(collection);
    }

    // Add the collection to the global registry if it's not already there.
    setActiveSourceCollections((prev) => {
      if (prev.includes(collection)) {
        return prev;
      }
      return [...prev, collection];
    });

    // On unmount, schedule the collection for removal from the global registry.
    return () => {
      // Set a timer to remove the collection after the delay.
      const timerId = setTimeout(() => {
        console.log(`Removing collection from active registry due to gcTime expiration...`);
        setActiveSourceCollections((prev) => prev.filter((c) => c !== collection));
        removalTimers.delete(collection);
      }, REMOVAL_DELAY_MS);

      removalTimers.set(collection, timerId);
    };
  }, [collection, setActiveSourceCollections]);
};
and THEN unify with a live query:
// src/features/dispatch/hooks/useUnifiedDispatches.ts
// ...imports

export const useUnifiedDispatches = () => {
  // 1. Subscribe to the list of active source collections
  const activeSources = useAtomValue(activeSourceCollectionsAtom);

  // 2. Create a live query that depends on the list of active sources
  const { data, ...rest } = useLiveQuery(
    (q) => {
      // Handle edge cases: no sources active yet
      if (activeSources.length === 0) {
        return q.from({ empty: [] }).select(() => ({} as DispatchType));
      }

      // Dynamically build the query with full outer joins
      let query = q.from({ s0: activeSources[0] });
      const aliases = ['s0'];

      // Chain full joins for all other active sources
      for (let i = 1; i < activeSources.length; i++) {
        const alias = `s${i}`;
        aliases.push(alias);
        query = query.fullJoin({ [alias]: activeSources[i] }, (row) =>
          // Join on our unique composite key
          eq(row.s0.id, row[alias].id)
        );
      }

      // Merge the results and filter for only Dispatches
      return query
        .select((row) => {
          // Coalesce finds the first non-null value, effectively merging the rows.
          // We reverse the aliases to prioritize sources added later.
          // Copy first: Array.prototype.reverse mutates in place, which would
          // flip the shared aliases array on every call.
          const aliasedSources = [...aliases].reverse().map((alias) => row[alias]);
          return coalesce(...aliasedSources) as SourceItem;
        })
        .where((merged) => eq(merged.__typename, 'Dispatch'))
        .select((merged) => merged as DispatchType); // Final cast to the correct type
    },
    [activeSources] // CRITICAL: Rerun the query builder when the list of sources changes
  );

  return { data: data ?? [], ...rest };
};
Then finally create/use it:
import { useLiveQuery } from '@tanstack/react-db';
// getSourceCollectionByJob is imported too, since JobDetails below uses it.
import { getSourceCollectionByDate, getSourceCollectionByJob } from '../dispatch.source-collections';
import { useRegisterSourceCollection } from '../hooks/useRegisterSourceCollection';
import { useUnifiedDispatches } from '../hooks/useUnifiedDispatches';

function DispatchScreen({ dateString }) {
  // 1. Get the source collection instance for this specific date from the factory.
  const sourceCollection = getSourceCollectionByDate(dateString);

  // 2. Register this source collection with our global store.
  // It will be automatically removed when this component unmounts.
  useRegisterSourceCollection(sourceCollection);

  // 3. Any component in the app can now use this hook to get ALL dispatches.
  const { data: allKnownDispatches, isLoading } = useUnifiedDispatches();

  // You can still perform additional client-side filtering if needed for the view
  const dispatchesForThisDate = allKnownDispatches.filter(d => d.date === dateString);

  if (isLoading) return <Spinner />;

  // ... render your UI with `dispatchesForThisDate`
}

// In your Job Details Component
function JobDetails({ jobId }) {
  // 1. This component gets and registers a DIFFERENT source collection.
  const sourceCollection = getSourceCollectionByJob(jobId);
  useRegisterSourceCollection(sourceCollection);

  // 2. It can also use the SAME unified hook to get all known dispatches.
  const { data: allKnownDispatches } = useUnifiedDispatches();

  // This component's data need is different, so it filters differently.
  const dispatchesForThisJob = allKnownDispatches.filter(d => d.workOrder?.jobId === jobId);

  // ... render UI
}
Though that's a lot of hoops to jump through just to get to use the querying/selector capability of db. To avoid the factories, I'm creating a "default" query for each table that gets the most common recent records - they'll all run at launch. Then I'll use the new utils.syncUpsert function to add the "as needed" results from my various useQuery hooks.
skilled-lime
skilled-lime4w ago
Not at computer still, so it's hard to give this a close read -- but yeah, just directly writing out could definitely be easier. Using a WeakMap is probably easier than doing manual cleanup. Once nothing is using it, it'd get GCed.
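One possible reading of the WeakMap suggestion, sketched under assumptions: per-collection bookkeeping (such as the `removalTimers` map above) is keyed by the collection object in a `WeakMap`, so an entry becomes collectible as soon as the collection itself is unreachable. `retain`, `release`, and the reference-counting scheme are hypothetical and are not what the linked issue comment necessarily does.

```typescript
// Bookkeeping keyed by the instance itself. A WeakMap does not keep its keys
// alive, so no timer-based cleanup is needed to avoid leaking entries.
const refCounts = new WeakMap<object, number>();

// Record one more user of the instance and return the new count.
function retain(instance: object): number {
  const next = (refCounts.get(instance) ?? 0) + 1;
  refCounts.set(instance, next);
  return next;
}

// Record one fewer user; drop the entry once nobody is left.
function release(instance: object): number {
  const next = Math.max(0, (refCounts.get(instance) ?? 0) - 1);
  if (next === 0) {
    refCounts.delete(instance);
  } else {
    refCounts.set(instance, next);
  }
  return next;
}

// Stand-in object for a collection instance.
const fakeCollection = { id: "dbDispatchDate:2025-07-31" };
const afterFirstMount = retain(fakeCollection);
const afterSecondMount = retain(fakeCollection);
const afterFirstUnmount = release(fakeCollection);
const afterSecondUnmount = release(fakeCollection);
```

WeakMap keys must be objects, so a cache keyed by a string such as a date cannot use a WeakMap directly and would need a different structure.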
ambitious-aqua
ambitious-aquaOP4w ago
I commented with an attempt at implementing the WeakMap on the partitioned collections issue: https://github.com/TanStack/db/issues/315#issuecomment-3145541811
skilled-lime
skilled-lime4w ago
Nice! Yeah, that looks like the pattern you want
ambitious-aqua
ambitious-aquaOP4w ago
Is it correct to interpret the lack of examples in the docs as "we really want to nudge you in the direction of downloading it all to the client and selecting client-side"?
skilled-lime
skilled-lime4w ago
GitHub
Partitioned collections · Issue #315 · TanStack/db
A very common use case, and question, is how to handle collections where you don't want to download all of it. Such as issues in an issue tracker, downloading by project/status/createdData etc....
GitHub
Paginated / Infinite Collections · Issue #343 · TanStack/db
A common request we are receiving is to lazily load data into a "query collection" using the infinite query pattern. We need to consider how to support this in a way that is then useable ...
ambitious-aqua
ambitious-aquaOP4w ago
Yeah, I see that the desire for the functionality is acknowledged in those. I'm more trying to put my brain in the frame of mind as the devs/creators to fully grasp the intent of how and why it's been designed the way that it has.
skilled-lime
skilled-lime4w ago
well it's just do one thing at a time 😆
ambitious-aqua
ambitious-aquaOP4w ago
in terms of expanding the library's features?
skilled-lime
skilled-lime4w ago
right nothing is born fully formed
