Foreign data wrapper
Typically, joins or any derived tables from subqueries or CTEs are not available on the foreign server and have to be executed locally. I.e., all rows remaining after the simple WHERE
clause in your example have to be retrieved and processed locally like you observed.
If all else fails you can execute the subquery SELECT id FROM lookup_table WHERE x = 5
and concatenate results into the query string.
More conveniently, you can automate this with dynamic SQL and EXECUTE
in a PL/pgSQL function. Like:
CREATE OR REPLACE FUNCTION my_func(_c1 int, _l_id int)
RETURNS TABLE(id int, c1 int, c2 int, c3 int) AS
$func$
BEGIN
RETURN QUERY EXECUTE
'SELECT id,c1,c2,c3 FROM big_table
WHERE c1 = $1
AND id = ANY ($2)'
USING _c1
, ARRAY(SELECT l.id FROM lookup_table l WHERE l.x = _l_id);
END
$func$ LANGUAGE plpgsql;
Related:
Or try this search on SO.
Or you might use the meta-command gexec
in psql. See:
Or this might work: (Feedback says does not work.)
SELECT id,c1,c2,c3
FROM big_table
WHERE c1 = 2
AND id = ANY (ARRAY(SELECT id FROM lookup_table WHERE x = 5));
Testing locally, I get a query plan like this:
Index Scan using big_table_idx on big_table (cost= ...)
Index Cond: (id = ANY ($0))
Filter: (c1 = 2)
InitPlan 1 (returns $0)
-> Seq Scan on lookup_table (cost= ...)
Filter: (x = 5)
Bold emphasis mine.
The parameter $0
in the plan inspires hope. The generated array might be something Postgres can pass on to be used remotely. I don't see a similar plan with any of your other attempts or some more I tried myself. Can you test with your fdw?
Related question concerning postgres_fdw
:
General technique in SQL
That's a different story. Just use a CTE. But I don't expect that to help with the FDW.
WITH cte AS (SELECT id FROM lookup_table WHERE x = 5)
SELECT id,c1,c2,c3
FROM big_table b
JOIN cte USING (id)
WHERE b.c1 = 2;
PostgreSQL 12 changed (improved) behavior, so that CTEs can be inlined like subqueries, given some preconditions. But, quoting the manual:
You can override that decision by specifying MATERIALIZED
to force separate calculation of the WITH query
So:
WITH cte AS MATERIALIZED (SELECT id FROM lookup_table WHERE x = 5)
...
Typically, none of this should be necessary if your DB server is configured properly and column statistics are up to date. But there are corner cases with uneven data distribution ...