I had this problem for a while. It’s very difficult to describe, and even more difficult to report, because I do not know a good way to reproduce the problem.
The reason I am writing about it is, that if somebody ever had or will have a similar problem, then a) you know there is a way to fix it and b) if there is more than one person experiencing the same problem, together we can find the root cause.
So… when we import data from our external service providers databases, we use a EC2 machine with a Postgres instance running on it, as our “proxy”. We have several foreign data wrappers installed on the said EC2 instance, and all the external databases (which use different DBMS’s) are mapped to the Postgres database, from where they are mapped to our Data Warehouse. The Data Warehouse resides on RDS, which means, that only a Postgres FDW is available.
We didn’t have any issues while we were only using this setup to refresh materialized views in our Data Warehouse. But recently we started to use the same proxy to communicate with one of the external databases from the OLTP database. And that’s when strange things started to happen.
They happen when we have “a complex” query, and that’s what I can’t quantify. I can’t say “if we have more than five external tables joined” or “if we have more than one join condition on more than two tables” … it just happens at some point. What happens? The query starts to return only the first row of the result set.
When I run the same query on proxy, it would return a correct number of rows. So the specific FDW does not appear to be a problem. Then what? I do not know the answer. They way I’ve fixed it – I’ve created a view on proxy, which would join all the tables I need, and mapped this view to the OLTP database. First I was reluctant to do it, because I was sure that the conditions won’t be pushed correctly to the lowest level, and thus the query would be incredibly slow, but life proved me wrong:). It works beautifully – and very fast.
So, for now the problem is solved, but I am still wondering, what exactly causes the problem in the original query…