Pavan Deolasee [Tue, 8 Aug 2017 04:43:16 +0000 (10:13 +0530)]
More thorough checks for distribution columns while creating inheritance
We now also do checks during CREATE TABLE. Also amend alter_table test case so
that a few tables are distributed using round robin method so that the new
checks/limitations don't come in their way. Also new test cases added to ensure
that the other checks for inheritance are exercised too.
Pavan Deolasee [Fri, 4 Aug 2017 11:00:37 +0000 (16:30 +0530)]
Check for partitioned table correctly.
While checking where to forward DROP TABLE command, we were not checking for
partitioned table correctly. That resuled in incorrectly sending DROP TABLE to
remote coordinator for temporary partitioned tables.
Pavan Deolasee [Fri, 4 Aug 2017 07:33:49 +0000 (13:03 +0530)]
Correct a mistake occurred during merging sequence.c code
We were incorrectly overwriting the 'cached' value in the SeqTable element,
thus causing another request to the GTM when nextval is fetched. This resulted
in an unintentional gaps in the sequence values. This patch fixes that, though
we might still get gaps unless sequence_range is set to 1. But this is by
design to reduce repeated round trips to the GTM.
Pavan Deolasee [Wed, 2 Aug 2017 06:52:14 +0000 (12:22 +0530)]
Make temporary tables use shared storage on datanodes
Since a temporary table may be accessed by multiple backends on a datanode, XL
mostly treats such tables as regular tables. But the technique that was used to
distingush between temporary tables that may need shared storage vs those which
are accessed only by a single backend, wasn't very full proof. We were relying
on global session activation to make that distinction. This clearly fails when
a background process, such as autovacuuum process, tries to figure out whether
a table is using local or shared storage. This was leading to various problems,
such as, when the underlying file system objects for the table were getting
cleaned up, but without first discarding all references to the table from the
shared buffers.
We now make all temp tables to use shared storage on the datanodes and thus
simplify things. Only EXECUTE DIRECT anyways does not set up global session, so
I don't think this will have any meaningful impact on the performance.
This should fix the checkpoint failures during regression tests.
Pavan Deolasee [Mon, 31 Jul 2017 14:21:58 +0000 (19:51 +0530)]
Don't run ALTER ENUM in an autocommit block on remote nodes
Before PG 10, Postgres did not allow ALTER ENUM to be run inside a transaction
block. So we used to run these commands in auto-commit mode on the remote
nodes. But now Postgres has removed the restriction. So we also run the
statements in transaction block.
This fixes regression failures in the 'enum' test case.
Pavan Deolasee [Mon, 31 Jul 2017 12:06:38 +0000 (17:36 +0530)]
Copy distribution information correctly to ProjectSet path
ProjectSet is a new path type in PG 10 and we'd missed to copy the distribution
information correctly to the path. This was resulting in failures in many
regression test cases. Lack of distribution information, prevented the
distributed query planner from adding a Remote Subplan node on top of the plan,
thus resulting in local execution of the plan. Since the underlying table is
actually a distributed table, local execution fails to fetch any data.
Fix this by properly copying distribution info. Several regression failures are
fixed automatically with this patch.
Tomas Vondra [Sun, 30 Jul 2017 21:52:54 +0000 (23:52 +0200)]
Partially accept plan changes in updatable_views
Upstream commit
215b43cdc8d6b4a1700886a39df1ee735cb0274d significantly
reworked planning of leaky functions. In practice that change means we
no longer have to push leaky functions into a subquery. Which greatly
simplifies some plans, including the two in this patch.
This commit accepts the plans only partially, though. It uses the plans
from upstream, and adds a Remote Subquery Scan node at the top, so we
accept the general plan shape change.
But there are a few additional differences that need futher evaluation,
particularly in target lists (Postgres-XL generating more entries than
upstream) and SubPlans (Postgres-XL only generating one subplan, while
upstream generates two alternative ones).
Tomas Vondra [Sun, 30 Jul 2017 18:41:44 +0000 (20:41 +0200)]
Accept aggregation plan changes in xc_remote tests
The plans changed mainly due to abandoning the custom implementation
two-phase aggregation code, and using the upstream parallel aggregation.
That means we have stopped showing schema name in target lists, so
instead of
Output: pg_catalog.avg((avg(xcrem_employee.salary)))
the EXPLAIN now shows
Output: avg(xcrem_employee.salary)
and we also do projection at the scan nodes, so the target list only
shows the necessary subset of columns.
A somewhat surprising change is that the plans switch from distributed
aggregate plans like this one
-> Aggregate
-> Remote Subquery Scan
-> Aggregate
-> Seq Scan
to always performing simple (non-distributed) aggregate like this
-> Aggregate
-> Remote Subquery Scan
-> Seq Scan
This happens due to create_grouping_paths() relying on consider_parallel
flag when setting try_distributed_aggregate, disabling distributed
aggregation when consider_parallel=false. Both affected plans are however
for UPDATE queries, and PostgreSQL disables parallelism for queries that
do writes, so we end up with try_distributed_aggregate=false.
We should probably enable distributed aggregates in these cases, but we
can't ignore consider_parallel entirely, as we likely need some of the
checks. We will probably end up with consider_distributed flag, set in
a similar way to consider_parallel, but that's more an enhancement than
a bug fix.
Tomas Vondra [Sun, 30 Jul 2017 16:17:27 +0000 (18:17 +0200)]
Reject SQL functions containing utility statements
The check was not effective for the same reason as
5a54abb7acd, that is
not accounting for XL wrapping the original command into RawStmt. Fix
that by checking parsetree->stmt, and also add an assert checking we
actually got a RawStmt in the first place.
Tomas Vondra [Sun, 30 Jul 2017 15:27:36 +0000 (17:27 +0200)]
Produce proper error message for COPY (SELECT INTO)
Produce the right error message for COPY (SELECT INTO) queries, that is
ERROR: COPY (SELECT INTO) is not supported
instead of the incorrect
ERROR: COPY query must have a RETURNING clause
The root cause is that the check in BeginCopy() was testing raw_query,
but XL wraps the original command in RawStmt, so we should be checking
raw_query->stmt instead.
Tomas Vondra [Sun, 30 Jul 2017 13:25:29 +0000 (15:25 +0200)]
Add explicit VACUUM to inet test to actually do IOS
Some of the queries in inet test are meant to exercise Index Only Scans.
Postgres-XL was not however picking those plans due to stale stats on
the coordinator (reltuples and relpages in pg_class).
On plain PostgreSQL the tests work fine, as CREATE INDEX also updates
statistics stored in the pg_class catalog. For example this
CREATE TABLE t (a INT);
INSERT INTO t SELECT i FROM generate_series(1,1000) s(i);
SELECT relpages, reltuples FROM pg_class
WHERE relname = 't';
CREATE INDEX ON t(a);
SELECT relpages, reltuples FROM pg_class
WHERE relname = 't';
will show zeroes before the CREATE INDEX command, and accurate values
after it completes.
On Postgres-XL that is not the case, and we will return zeroes even after
the CREATE INDEX command. To actually update the statistics we need to
fetch information from the datanodes the way VACUUM does it.
Fixed by adding an explicit VACUUM call right after the CREATE INDEX, to
fetch the stats from the datanodes and update the coordinator catalogs.
Tomas Vondra [Sat, 29 Jul 2017 22:29:24 +0000 (00:29 +0200)]
Tweak the query plan check in join regression test
The test expects the plan to use Index Scan, but with 1000 rows the
differences are very small. With two data nodes, we however compute
the estimates as if the tables had 500 rows, making the cost difference
even smaller.
Fixed by increasing the total number of rows to 2000, which means each
datanode has about 1000 and uses the same cost estimates as upstream.
Tomas Vondra [Sat, 29 Jul 2017 16:54:59 +0000 (18:54 +0200)]
Remove extra snprintf call in pg_tablespace_databases
The XL code did two function calls in the else branch, about like this:
else
/* Postgres-XC tablespaces also include node name in path */
sprintf(fctx->location, "pg_tblspc/%u/%s_%s", tablespaceOid,
TABLESPACE_VERSION_DIRECTORY, PGXCNodeName);
fctx->location = psprintf("pg_tblspc/%u/%s_%s", tablespaceOid,
TABLESPACE_VERSION_DIRECTORY,
PGXCNodeName);
which is wrong, as only the first call is actually the else branch, the
second call is executed unconditionally.
In fact, the two calls attempt to construct the same location string,
but the sprintf call assumes the 'fctx->location' string is already
allocated. But it actually is not, so it's likely to cause a segfault.
Fixed by removing the sprintf() call, keeping just the psprintf() one.
Noticed thanks to GCC 6.3 complaining about incorrect indentation.
Backpatch to XL 9.5.
Tomas Vondra [Sat, 29 Jul 2017 16:51:55 +0000 (18:51 +0200)]
Fix confusing indentation in gtm_client.c
GCC 6.3 complains that the indentation in gtm_sync_standby() is somewhat
confusing, as it might mislead people to think that a command is part of
an if branch. So fix that by removing the unnecessary indentation.
Tomas Vondra [Fri, 28 Jul 2017 22:34:53 +0000 (00:34 +0200)]
Refactor the construction of distributed grouping paths
The code generating distributed grouping paths was originally structured
like this:
if (try_distributed_aggregation)
{ ... }
if (can_sort && try_distributed_aggregation)
{ ... }
if (can_hash && try_distributed_aggregation)
{ ... }
It's refactored like this, to resemble the upstream part of the code:
if (try_distributed_aggregation)
{
...
if (can_sort)
{ ... }
if (can_hash)
{ ... }
}
Tomas Vondra [Fri, 28 Jul 2017 22:00:09 +0000 (00:00 +0200)]
Accept plan change in xc_groupby regression test
The plan changed in two ways. Firstly, the targetlists changed due to
abandoning the custom distributed aggregation and reusing the upstream
partial aggregation code. That means we're not prefixing the aggregate
with schema name, etc.
The plan also switches from distributed aggregation to plain aggregation
with all the work done on top of a remote query. This happens simply due
to costing, as the tables are tiny and two-phase aggregation has some
overhead. The original implementation (as in XL 9.5) distributed the
aggregate unconditionally, ignoring the costing.
Parf of the problem is that the query groups by two columns from two
different tables, resulting in overestimation of the number of groups.
That means the optimizer thinks distributing the aggregation would not
reduce the number of rows, which increases the cost estimate as each
row requires network transfer and the finalize aggregate also depends
on the number of input rows.
We could make the tables larger and the optimizer would eventually
switch to distributed aggregate. For example this seems to do the
trick:
insert into xc_groupby_tab1 select 1, mod(i,1000)
from generate_series(1,20000) s(i);
insert into xc_groupby_tab2 select 1, mod(i,1000)
from generate_series(1,20000) s(i);
But it does not seem worth it, considering it's just a workaround
for the estimation issue and the increased duration. And we already
have other regression tests testing plausible queries benefiting from
distributed aggregation. So just accept the plan change.
Pavan Deolasee [Wed, 26 Jul 2017 07:26:00 +0000 (12:56 +0530)]
Accept some of the differences in 'tidscan' test case
Most of these differences arise from the the fact that rows are fetched from
multiple datanodes in XL and hence TIDs can be duplicated. There are some other
differences because of EXPLAIN output.
The test case does not yet pass because of other unaddressed failures.
Pavan Deolasee [Wed, 26 Jul 2017 06:17:28 +0000 (11:47 +0530)]
Fix remaining problems in 'triggers' test case.
We don't support different column ordering for partitions in XL. So enforce
that either by ensuring consistent column ordering or accepting errors. Also we
don't support triggers in XL, so some changes were necessary to take that into
account.
Pavan Deolasee [Wed, 26 Jul 2017 05:13:24 +0000 (10:43 +0530)]
Fix 'union' test case.
Since
93cbab90b0c6fc3fc4aa515b93057127c0ee8a1b we expect that the child table
has columns at the same position as the parent table. So fix the test case to
follow the rule and resolve expected output differences arising from that.
The test case should pass with this change.
Pavan Deolasee [Wed, 26 Jul 2017 04:48:51 +0000 (10:18 +0530)]
Don't try to fetch table details using the old name after ExecRenameStmt
This used to work before PG 10, but some changes must have caused
non-deterministic behaviour. It anyways seems unsafe to lookup the catalogs
using the old name once ExecRenameStmt has finished. The lookup may or may not
see the old tuple, depending on whether CommandCounterIncrement has happened in
between. We now fetch the requried details before calling ExecRenameStmt and
use that info for subsequent processing.
This fixes some wierd issues in 'alter_table' test case where we were failing
to send ALTER TABLE RENAME TO command to remote nodes and causing inconsistent
catalog entries between the coordinator and the remote nodes.
Pavan Deolasee [Mon, 24 Jul 2017 05:42:57 +0000 (11:12 +0530)]
Accept some obvious regression differences in the 'join' test case
These are only placements of Remote FQS or Remote Subplan nodes in the newly
added explain plans in the test case. There are some remaining failures in the
test case which will need more scrutiny.
Tomas Vondra [Mon, 17 Jul 2017 23:22:26 +0000 (01:22 +0200)]
Accept changes in foreign_data test due to unsupported FDW
Postgres-XL does not support Foreign Data Wrappers, so accept failures
and output differences in the foreign_data regression test as expected.
Tomas Vondra [Mon, 17 Jul 2017 22:35:22 +0000 (00:35 +0200)]
Accept failures in triggers test du to unsupported features
Postgres-XL does not support triggers, naturally causing many failures
in the triggers regression test. So treat the error messages as expected
behavior and remove the NOTICEs produced by the triggers.
Another failure was caused by INSERT in a subquery, which is another
feature unsupported by Postgres-XL.
Tomas Vondra [Mon, 17 Jul 2017 22:01:31 +0000 (00:01 +0200)]
Accept two query plans in the join regression test
Both plans keep the same join algorithm as on PostgreSQL, and the plans
match those produced by Postgres-XL 9.5.
Tomas Vondra [Mon, 17 Jul 2017 21:26:25 +0000 (23:26 +0200)]
Resolve failures related to pg_node_tree type input
The input function for pg_node_tree pseudotype simply returns an error,
essentially disabling input from text. Some of the regression tests do
CREATE TABLE t AS SELECT * FROM pg_class;
or something like that, which works fine on PostgreSQL, but it fails on
Postgres-XL as we need to send data between nodes. Unfortunately the
pg_class now contains pg_node_tree column (relpartbound), causing falure
when reading the column on the remote node.
So instead use another catalogue, for example pg_attribute, which does
not contain any such column.
Tomas Vondra [Mon, 17 Jul 2017 20:32:18 +0000 (22:32 +0200)]
Drop the foo table in rangefuncs regression tes
The table was made non-temporary by commit
e98209019b4d2012, but it was
not dropped, wich caused failures in the returning regression test. So
add the missing DROP TABLE.
Tomas Vondra [Sun, 16 Jul 2017 20:20:14 +0000 (22:20 +0200)]
Accept failures in large_object regression test
Large objects are unsupported on Postgres-XL, so the failures are
expected.
Tomas Vondra [Sun, 16 Jul 2017 20:11:47 +0000 (22:11 +0200)]
Accept plan changes in aggregates regression test
The plan changes come from the upstream, and XL only adds Remote Fast
Query Execution at the top.
Tomas Vondra [Sun, 16 Jul 2017 19:53:04 +0000 (21:53 +0200)]
Remove check of old-style C functions from misc test
The old-style C functions do not exist anymore, so remove the block of
code testing them (and failing). This test was removed from upstream,
and was only kept by mistake during the merge.
Tomas Vondra [Sun, 16 Jul 2017 19:36:56 +0000 (21:36 +0200)]
Resolve all failures in rangefuncs regression test
The failures were fairly trivial in nature:
* extra output for test block removed in PostgreSQL 10
* non-deterministic ordering of results
* functions referencing temporary tables, which does not work on XL
* triggers not supported on XL
Resolving the first two issue is simple - remove the extra blocks and
add ORDER BY to stabilize the ordering.
To fix the temp tables vs. functions issue I've simply made all the
tables non-temporary. The triggers were used merely to generate some
notices, so removing those from the expected output was enough.
Tomas Vondra [Sun, 16 Jul 2017 18:52:59 +0000 (20:52 +0200)]
Accept type name in duplicate key error message
The output routine for anyarray data type (anyarray_out) is modified to
include name of the data type in the text representation. Which is also
used when an error message includes the value, for example duplicate
key errors.
Tomas Vondra [Sun, 16 Jul 2017 14:22:46 +0000 (16:22 +0200)]
Modify enum tests to not rely on SAVEPOINT
Postgres-XL does not support SAVEPOINT (or subtransactions in general).
Some regression tests use this to test cases that are expected to fail,
and we want to keep testing that. So instead of removing the tests,
split them into independent transactions (and tweak the expected output
accordingly).
Note: The tests are still failing, though. Apparently XL does not undo
the effect of ALTER TYPE ... ADD VALUE on rollback.
Tomas Vondra [Sun, 16 Jul 2017 14:09:17 +0000 (16:09 +0200)]
Accept minor changes in sequence regression tests
Mostly just renames of the objects, received from upstream.
Tomas Vondra [Sat, 15 Jul 2017 17:01:50 +0000 (19:01 +0200)]
Accept plan changes in the equivclass regression test
The plans are fairly close to those generated on Postgres-XL 9.5, with
some minor differences due to upstream optimizer changes (e.g. missing
Subquery Scan when we can do Index Scan instead).
The main change is that when a Merge Join is identified as unique, we
may replace
-> Materialize
-> Sort
Sort Key: ec1.f1 USING <
-> Remote Subquery Scan on all (datanode1)
-> Index Scan using ec1_pkey on ec1
Index Cond: (ff = '42'::bigint)
with
-> Remote Subquery Scan on all (datanode_1)
-> Sort
Sort Key: ec1.f1 USING <
-> Index Scan using ec1_pkey on ec1
Index Cond: (ff = '42'::bigint)
as there will be no rescans on the inner relation (so we do not need
the additional Materialize step).
Tomas Vondra [Sat, 15 Jul 2017 16:50:20 +0000 (18:50 +0200)]
i# Summary (one line, 50 chars or less) ===========
Accept simple plan changes in tsearch regression test
The accepted plan changes are fairly simple, switching plain aggregation
to a distributed one.
Tomas Vondra [Sat, 15 Jul 2017 15:39:17 +0000 (17:39 +0200)]
Ensure grouping sets get properly distributed data
Grouping sets are stricter about distribution of input data, as all the
execution happens on the coordinator - there is no support for partial
grouping sets yet, so we can either push all the grouping set work to
the remote node (if all the sets include the distribution key), or make
sure that there is a Remote Subquery on the input path.
This is what Postgres-XL 9.6 was doing, but it got lost during merge
with PostgreSQL 10 which significantly reworked this part of the code.
Two queries still produce incorrect result, but those are not actually
using the grouping sets paths because
GROUP BY GROUPING SETS (a, b)
gets transformed into simple
GROUP BY a, b
and ends up using parallel aggregation. The bug seems to be that the
sort orders mismatch for some reason - the remote part produces data
sorted by "a" but the "Finalize GroupAggregate" expects input sorted
by "a, b" leading to duplicate groups in the result.
Tomas Vondra [Sat, 15 Jul 2017 14:20:23 +0000 (16:20 +0200)]
Adjust plans for new queries in privileges tests
The upstream privileges regression test added multiple checks of explain
plans, so the plans needed to be adjusted for Postgres-XL (by adding the
Remote Subquery nodes to appropriate places).
There are two plans that however mismatch the upstream version, using
a different join algorithm (Nested Loop vs. Hash Join). Turns out this
happens due to Postgres-XL not collecting stats for expression indexes,
and the two queries rely on that feature. Without the statistics the
estimates change dramatically, triggering a plan change.
We need to extend analyze_rel_coordinator() to collect stats not only
for the table, but for all indexes too. But that's really a matter for
a separate commit.
Tomas Vondra [Sat, 15 Jul 2017 00:02:07 +0000 (02:02 +0200)]
Stabilize ordering and accept plan in domain test
Trivial result ordering stabilization by adding ORDER BY clause, and
accepting explain output with additional Remote Subquery node (on top
of the upstream plan).
Tomas Vondra [Fri, 14 Jul 2017 23:43:53 +0000 (01:43 +0200)]
Stabilize result ordering in xc_having regression test
Stabilized by adding ORDER BY clause to two queries. Generate explain
plans both for the original and modified queries.
Tomas Vondra [Fri, 14 Jul 2017 23:28:34 +0000 (01:28 +0200)]
Add block of expected output to truncate test
The expected output was missing block testing ON TRUNCATE triggers. Add
it and tweak to reflect Postgres-XL limitations (no triggers or RESTART
IDENTITY).
There seems to be an issue in handling sequences with START WITH clause,
where we don't quite respet that and start with a higher value. And we
also don't increment by 1 for some reason.
Tomas Vondra [Fri, 14 Jul 2017 22:54:20 +0000 (00:54 +0200)]
Stabilize ordering of results in xc_FQS test
Ordering of some results in xc_FQS tests became unstable, so stabilize
it by adding ORDER BY clauses. Instead of just changing the explain
plans in the same way, generate plans both for the original query
(without the ORDER BY clause) and the new one.
Tomas Vondra [Fri, 14 Jul 2017 22:47:35 +0000 (00:47 +0200)]
Accept trivial plan change in xc_for_update test
The join is detected to be unique, which is indicated by 'Inner Unique'
key in the explain plan.
Tomas Vondra [Fri, 14 Jul 2017 19:57:48 +0000 (21:57 +0200)]
Accept int2vector not being hash distributable
Upstream commit
5c80642aa8de8393b08cd3cbf612b325cedd98dc removed support
for hashing int2vector data type, as it was dead code (upstream). That
means we can no longer distribute table by hash on this datatype.
We could reintroduce the hash function in Postgres-XL, but int2vector
seems rarely used as distribution key, so let's just fix the tests. If
needed, we can add the functions in the future.
Tomas Vondra [Thu, 13 Jul 2017 18:58:06 +0000 (20:58 +0200)]
Replace WARNING about skipped stats with a SELECT
ANALYZE throws a WARNING that extended statistics was not built due to
statistics target being 0 for some of the columns, but that gets thrown
on the datanode and we never forward it to the coordinator.
So explicitly check contents of pg_statistic_ext if the statistic was
built or not.
Tomas Vondra [Thu, 13 Jul 2017 17:51:36 +0000 (19:51 +0200)]
Accept output changes due to psql \d format tweaks
The format used by psql \d and \d+ changed a bit, splitting the single
Modifiers column into Collation, Nullable, Default. Additional commands
changed too, for example \dew+ now uses "options" instead of "Options"
and so on.
This commit accepts all such output changes across all regression tests.
Tomas Vondra [Thu, 13 Jul 2017 17:14:16 +0000 (19:14 +0200)]
Stabilize plan changes and ordering in xc_groupby test
The regression test was failing because many queries started producing
results with unstable ordering, so fix that by adding ORDER BY clause
to many of them.
This of course affects the plans that were part of the test too, so
instead of running query with ORDER BY clause and then checking plan
for a query without it, check plans for both query versions. This makes
the test somewhat bigger, but it seems to be worth it and the impact on
test duration is negligible.
Multiple query plans changed in a non-trivial ways, too. This commit
accepts changes that are clearly inherited from the upstream, which was
verified by running the query on PostgreSQL 10 and accepting simple
changes (essentially adding "Remote Subquery" to reasonable places in
the plan or "Remote Fast Query Execution" at the top).
More complicated plan changes (e.g. switching from Group Aggregate to
Hash Aggregate or back, join algorithms etc.) are left unaccepted for
additional analysis.
The SQL script also generates multiple query plans that are not included
in the expected output. This is intentional, as the generated plans are
incorrect and produce incorrect ordering of results. The bug is that
queries like
SELECT sum(a) FROM t GROUP BY 1
end up producing results sorted by 'a' and not 'sum(a)'.
Tomas Vondra [Thu, 13 Jul 2017 16:56:16 +0000 (18:56 +0200)]
Remove 'current transaction is aborted' from rowsecurity.out
The rowsecurity test suite hits unsupported features on two places:
* WHERE CURRENT OF clause for cursors
* SAVEPOINTS (subtransactions)
which results in 'current transaction is aborted' errors in the rest of
the transaction. Those errors were added to the expected output file,
making the regression test succeed, but it may easily mask issues in the
aborted part of the transaction.
We need to rework the rest so that it skips the unsupported features,
and still exercises the test on the remaining parts.
Tomas Vondra [Thu, 13 Jul 2017 16:48:36 +0000 (18:48 +0200)]
Stabilize expected plans/output for rowsecurity tests
This is a mix of simple fixes to stabilize the rowsecurity test.
1) Adding ORDER BY to multiple queries, to stabilize the output order.
2) Update expected query output by copying it from upstream (PostgreSQ).
Apparently some of this got broken during merge conflict resolution.
3) Accept simple plan changes that add Remote Subquery either at the top
of the plan, or in InitPlan / SubPlan or CTE.
4) Accept plan changes inherited from upstream, particularly removal of
the "Subquery Scan" nodes from the plan.
5) Add expected output for "viewpoint from regress_rls_dave" section,
missing in the expected file for some reason.
Tomas Vondra [Thu, 13 Jul 2017 16:38:06 +0000 (18:38 +0200)]
Accept simple plan changes in select_views test
All accepted plan changes are simply adding Remote Subquery at the top
of a plan merged from upstream, in a fairly obviously correct way.
There are two additional fixes, either adding a missing block of
expected output (copied from upstream), or removing an extra output.
Tomas Vondra [Thu, 13 Jul 2017 16:22:30 +0000 (18:22 +0200)]
Build extended stats on coordinators during ANALYZE
When running ANALYZE on a coordinator, we simply fetch the statistics
built on datanodes, and keep stats from a random datanode (assuming all
datanodes are similar in terms of data volume and data distribution).
This was only done for regular per-attribute stats, though, not for the
extended statistics added in PostgreSQL 10, causing various failures in
stats_ext tests due to missing statistics. This commit fixes this gap
by using the same approach as for simple statistics - we collect stats
from datanodes and keep the first result we receive for each statistic.
While working on this I realized this approach has some inherent issues,
particularly on columns that are distribution keys. As we keep stats
from a random node, we completely ignore MCV and histograms from the
remaining nodes. That may cause planning issues, but addressing it is
out of scope for this commit.
Pavan Deolasee [Thu, 13 Jul 2017 09:22:22 +0000 (14:52 +0530)]
Merge remote-tracking branch 'remotes/PGSQL/master' of PG 10
This merge includes all commits upto
bc2d716ad09fceeb391c755f78c256ddac9d3b9f
of PG 10.
Pavan Deolasee [Thu, 13 Jul 2017 07:24:55 +0000 (12:54 +0530)]
Ensure that child table inherits distribution stretegy from the parent.
For partitioned tables or in general inherited tables, we now enforce that the
child table always inherit the distribution strategy of the parent. This not
only makes it far easier to handle various cases correctly, but would also
allow us to optimise distributed queries on partitioned tables much easily.
Tank.zhang <
6220104@qq.com> originally reported a problem with partitioned
tables and incorrect query execution. Upon investigations, we decided to make
these restrictions to simplify things.
Tom Lane [Wed, 12 Jul 2017 22:00:04 +0000 (18:00 -0400)]
Fix ruleutils.c for domain-over-array cases, too.
Further investigation shows that ruleutils isn't quite up to speed either
for cases where we have a domain-over-array: it needs to be prepared to
look past a CoerceToDomain at the top level of field and element
assignments, else it decompiles them incorrectly. Potentially this would
result in failure to dump/reload a rule, if it looked like the one in the
new test case. (I also added a test for EXPLAIN; that output isn't broken,
but clearly we need more test coverage here.)
Like commit
b1cb32fb6, this bug is reachable in cases we already support,
so back-patch all the way.
Heikki Linnakangas [Wed, 12 Jul 2017 19:03:38 +0000 (22:03 +0300)]
Reduce memory usage of tsvector type analyze function.
compute_tsvector_stats() detoasted and kept in memory every tsvector value
in the sample, but that can be a lot of memory. The original bug report
described a case using over 10 gigabytes, with statistics target of 10000
(the maximum).
To fix, allocate a separate copy of just the lexemes that we keep around,
and free the detoasted tsvector values as we go. This adds some palloc/pfree
overhead, when you have a lot of distinct lexemes in the sample, but it's
better than running out of memory.
Fixes bug #14654 reported by James C. Reviewed by Tom Lane. Backport to
all supported versions.
Discussion: https://www.postgresql.org/message-id/
20170514200602.1451.46797@wrigleys.postgresql.org
Alvaro Herrera [Wed, 12 Jul 2017 18:39:44 +0000 (14:39 -0400)]
commit_ts test: Set node name in test
Otherwise, the script output has a lot of pointless warnings.
This was forgotten in
9def031bd2821f35b5f506260d922482648a8bb0
Tom Lane [Wed, 12 Jul 2017 17:24:16 +0000 (13:24 -0400)]
Avoid integer overflow while sifting-up a heap in tuplesort.c.
If the number of tuples in the heap exceeds approximately INT_MAX/2,
this loop's calculation "2*i+1" could overflow, resulting in a crash.
Fix it by using unsigned int rather than int for the relevant local
variables; that shouldn't cost anything extra on any popular hardware.
Per bug #14722 from Sergey Koposov.
Original patch by Sergey Koposov, modified by me per a suggestion
from Heikki Linnakangas to use unsigned int not int64.
Back-patch to 9.4, where tuplesort.c grew the ability to sort as many
as INT_MAX tuples in-memory (commit
263865a48).
Discussion: https://postgr.es/m/
20170629161637.1478.93109@wrigleys.postgresql.org
Tomas Vondra [Wed, 12 Jul 2017 16:41:51 +0000 (18:41 +0200)]
Resolve most failures in stats_ext regression tests
This addresses simple failures in stats_ext regression tests, namely:
1) Failure to drop column that happens to be a distribution key of the
table (picked automatically by CREATE TABLE). Resolved by explicitly
distributing the table by another columns.
2) Commenting out DDL on FDWs, which are unsupported in Postgres-XL, and
we already have this covered by plenty of other tests.
3) Commenting out a DO block, checking that we can't create statistics
on a TOAST table. The error message will vary depending on what OID
is assigned to the TOAST table. There's no nice way to catch that
error in Postgres-XL, due to missing support for subtransactions.
4) Accept trivial plan changes, that simply add Remote Subquery into the
plan, or ship the whole query to the nodes (Fast Query Execution).
There additional plan changes that change the plans in other ways,
but those need more investigation and are not accepted by this commit.
Heikki Linnakangas [Wed, 12 Jul 2017 14:07:35 +0000 (17:07 +0300)]
Fix variable and type name in comment.
Kyotaro Horiguchi
Discussion: https://www.postgresql.org/message-id/
20170711.163441.
241981736.horiguchi.kyotaro@lab.ntt.co.jp
Heikki Linnakangas [Wed, 12 Jul 2017 12:30:52 +0000 (15:30 +0300)]
Fix ordering of operations in SyncRepWakeQueue to avoid assertion failure.
Commit
14e8803f1 removed the locking in SyncRepWaitForLSN, but that
introduced a race condition, where SyncRepWaitForLSN might see
syncRepState already set to SYNC_REP_WAIT_COMPLETE, but the process was
not yet removed from the queue. That tripped the assertion, that the
process should no longer be in the uqeue. Reorder the operations in
SyncRepWakeQueue to remove the process from the queue first, and update
syncRepState only after that, and add a memory barrier in between to make
sure the operations are made visible to other processes in that order.
Fixes bug #14721 reported by Const Zhang. Analysis and fix by Thomas Munro.
Backpatch down to 9.5, where the locking was removed.
Discussion: https://www.postgresql.org/message-id/
20170629023623.1480.26508%40wrigleys.postgresql.org
Heikki Linnakangas [Wed, 12 Jul 2017 09:30:50 +0000 (12:30 +0300)]
Remove unnecessary braces, to match the surrounding style.
Mostly in the new subscription-related commands. Backport the few that
were also present in older versions.
Thomas Munro
Discussion: https://www.postgresql.org/message-id/CAEepm=3CyW1QmXcXJXmqiJXtXzFDc8SvSfnxkEGD3Bkv2SrkeQ@mail.gmail.com
Tom Lane [Tue, 11 Jul 2017 20:48:59 +0000 (16:48 -0400)]
Fix multiple assignments to a column of a domain type.
We allow INSERT and UPDATE commands to assign to the same column more than
once, as long as the assignments are to subfields or elements rather than
the whole column. However, this failed when the target column was a domain
over array rather than plain array. Fix by teaching process_matched_tle()
to look through CoerceToDomain nodes, and add relevant test cases.
Also add a group of test cases exercising domains over array of composite.
It's doubtless accidental that CREATE DOMAIN allows this case while not
allowing straight domain over composite; but it does, so we'd better make
sure we don't break it. (I could not find any documentation mentioning
either side of that, so no doc changes.)
It's been like this for a long time, so back-patch to all supported
branches.
Discussion: https://postgr.es/m/4206.
1499798337@sss.pgh.pa.us
Pavan Deolasee [Tue, 11 Jul 2017 08:34:11 +0000 (14:04 +0530)]
Ensure all partitions of a partitioned table has the same distribution.
To optimise and simplify XL's distributed query planning, we enforce that all
partitions of a partitioned table use the same distribution strategy. We also
put further restrictions that all columns in the partitions and the partitioned
table has matching positions. This can cause some problems when tables have
dropped columns etc, but we think it's far better to optimise XL's plans than
supporting all corner cases. We can look at removing some of these
restrictions later once the more usual queries run faster.
These restrictions allow us to unconditionally push down Append and MergeAppend
nodes to datanodes when these nodes are processing partitioned tables.
Some regression tests currently fail because of these added restrictions. We
would look at them in due course of time.
Tom Lane [Mon, 10 Jul 2017 20:26:20 +0000 (16:26 -0400)]
Stamp 10beta2.
Alvaro Herrera [Mon, 10 Jul 2017 15:51:32 +0000 (11:51 -0400)]
Translation updates
Source-Git-URL: git://git.postgresql.org/git/pgtranslation/messages.git
Source-Git-Hash:
c5a8de3653bb1af6b0eb41cc6bf090c5522df52b
Tom Lane [Mon, 10 Jul 2017 15:00:09 +0000 (11:00 -0400)]
On Windows, retry process creation if we fail to reserve shared memory.
We've heard occasional reports of backend launch failing because
pgwin32_ReserveSharedMemoryRegion() fails, indicating that something
has already used that address space in the child process. It's not
very clear what, given that we disable ASLR in Windows builds, but
suspicion falls on antivirus products. It'd be better if we didn't
have to disable ASLR, anyway. So let's try to ameliorate the problem
by retrying the process launch after such a failure, up to 100 times.
Patch by me, based on previous work by Amit Kapila and others.
This is a longstanding issue, so back-patch to all supported branches.
Discussion: https://postgr.es/m/CAA4eK1+R6hSx6t_yvwtx+NRzneVp+MRqXAdGJZChcau8Uij-8g@mail.gmail.com
Heikki Linnakangas [Mon, 10 Jul 2017 12:36:02 +0000 (15:36 +0300)]
Fix missing tag in the docs.
Masahiko Sawada
Discussion: https://www.postgresql.org/message-id/CAD21AoBCwcTNMdrVWq8T0hoOs2mWSYq9PRJ_fr6SH8HdO+m=0g@mail.gmail.com
Heikki Linnakangas [Mon, 10 Jul 2017 12:29:36 +0000 (15:29 +0300)]
Fix check for empty hostname.
As reported by Arthur Zakirov, Gcc 7.1 complained about this with
-Wpointer-compare.
Discussion: https://www.postgresql.org/message-id/CAKNkYnybV_NFVacGbW=VspzAo3TwRJFNi+9iBob66YqQMZopwg@mail.gmail.com
Andrew Gierth [Mon, 10 Jul 2017 10:40:08 +0000 (11:40 +0100)]
Fix COPY's handling of transition tables with indexes.
Commit
c46c0e5202e8cfe750c6629db7852fdb15d528f3 failed to pass the
TransitionCaptureState object to ExecARInsertTriggers() in the case
where it's using heap_multi_insert and there are indexes. Repair.
Thomas Munro, from a report by David Fetter
Discussion: https://postgr.es/m/
20170708084213.GA14720%40fetter.org
Heikki Linnakangas [Mon, 10 Jul 2017 09:28:57 +0000 (12:28 +0300)]
Allow multiple hostaddrs to go with multiple hostnames.
Also fix two other issues, while we're at it:
* In error message on connection failure, if multiple network addresses
were given as the host option, as in "host=127.0.0.1,127.0.0.2", the
error message printed the address twice.
* If there were many more ports than hostnames, the error message would
always claim that there was one port too many, even if there was more than
one. For example, if you gave 2 hostnames and 5 ports, the error message
claimed that you gave 2 hostnames and 3 ports.
Discussion: https://www.postgresql.org/message-id/
10badbc6-4d5a-a769-623a-
f7ada43e14dd@iki.fi
Tom Lane [Mon, 10 Jul 2017 04:44:05 +0000 (00:44 -0400)]
Doc: remove claim that PROVE_FLAGS defaults to '--verbose'.
Commit
e9c81b601 changed this, but missed updating the documentation.
The adjacent claim that we use TAP tests only in src/bin seems pretty
obsolete as well. Minor other copy-editing.
Tom Lane [Mon, 10 Jul 2017 04:08:19 +0000 (00:08 -0400)]
Doc: clarify wording about tool requirements in sourcerepo.sgml.
Original wording had confusingly vague antecedent for "they", so replace
that with a more repetitive but clearer formulation. In passing, make the
link to the installation requirements section more specific. Per gripe
from Martin Mai, though this is not the fix he initially proposed.
Discussion: https://postgr.es/m/CAN_NWRu-cWuNaiXUjV3m4H-riWURuPW=j21bSaLADs6rjjzXgQ@mail.gmail.com
Tom Lane [Mon, 10 Jul 2017 00:11:21 +0000 (20:11 -0400)]
Doc: desultory copy-editing for v10 release notes.
Improve many item descriptions, improve markup, relocate some items
that seemed to be in the wrong section.
Tomas Vondra [Sun, 9 Jul 2017 20:02:21 +0000 (22:02 +0200)]
Properly redistribute results of Gather Merge nodes
The optimizer was not generating correct distributed paths with Gather
Merge nodes, because those nodes always looked as if the data was not
distributed at all. There were two bugs causing this:
1) Gather Merge did not copy distribution from the subpath, leaving it
NULL (as if running on coordinator), so no Remote Subquery needed.
2) create_grouping_paths() did not check if a Remote Subquery is needed
on top of Gather Merge anyway.
After fixing these two issues, we're now generating correct plans (at
least judging by select_parallel regression suite).
Tomas Vondra [Sun, 9 Jul 2017 19:34:48 +0000 (21:34 +0200)]
Accept reasonable plan changes in select_parallel
All the accepted plan changes are simply adding Remote Subquery, and
seem correct and reasonable. Where possible, I've verified that the
older XL versions produce the same (or very similar) plan.
There are also three additional minor fixes:
1) An extra EXPLAIN query, as EXPLAIN ANALYZE hides the part below
Remote Subquery, making it mostly useless. The extra EXPLAIN
shows the whole plan and addresses this.
2) Postgres-XL does not support subtransactions, so the block setting
effective_io_concurrency was failing, and aborting the surrounding
transaction. Removing the EXCEPTION clause may cause issues on
systems not supporting this GUC, but that should be rare.
3) Removed a section of expected output, matching a block removed
from the SQL script.
Tom Lane [Sun, 9 Jul 2017 19:27:21 +0000 (15:27 -0400)]
Doc: update v10 release notes through today.
Tom Lane [Sun, 9 Jul 2017 16:51:15 +0000 (12:51 -0400)]
Doc: fix backwards description of visibility map's all-frozen data.
Thinko in commit
a892234f8.
Vik Fearing
Discussion: https://postgr.es/m/
b6aaa23d-e26f-6404-a30d-
e89431492d5d@2ndquadrant.com
Tomas Vondra [Sun, 9 Jul 2017 16:33:35 +0000 (18:33 +0200)]
Accept pgxc_node as containing no pinned objects
The misc_sanity test checks for catalogs containing no pinned objects
after initdb. As pgxc_node is empty at that point (as expected), it got
caught by the new regression test. So add it to expected output.
Tomas Vondra [Sun, 9 Jul 2017 16:19:38 +0000 (18:19 +0200)]
Remove the setup_storm call omitted from last commit
Should have been part of the previous commit removing storm_catalog, but
I forgot to include this bit.
Tomas Vondra [Sun, 9 Jul 2017 14:07:26 +0000 (16:07 +0200)]
Remove storm_catalog schema
The storm_catalog schema is supposed to contain the same catalogs and
views as pg_catalog, but filtered to the current database. The use case
for this is multi-tenant systems, which was a StormDB feature.
But on XL this is mostly irrelevant, and the schema was not populated
since commit
8096e3edf17b260de15472eb04567d1beec1e3e6 which disabled
this part of initdb.
So instead of fixing the regression failures in misc_sanity caused by
this (initdb-time schema with no pinned objects), just rip all the
remaining bits out, including the pgxc_catalog_remap GUC etc.
This also removes the setup_storm() call disabled by
8096e3edf1, as the
function got removed since then.
Tomas Vondra [Sun, 9 Jul 2017 12:45:52 +0000 (14:45 +0200)]
Correct function line number in an error message
The line number was incorrect for some reason. Should be 24 and not 23.
Tomas Vondra [Sun, 9 Jul 2017 12:42:52 +0000 (14:42 +0200)]
Make the last query in object_address work again
The last query in the test was not really doing anything as it failed
on the first object unsupported by Postgres-XL. So remove all such
unsupported objects, to make it work again.
Tomas Vondra [Sun, 9 Jul 2017 12:36:38 +0000 (14:36 +0200)]
Add missing block of expected output to object_address
The regression test was clearly missing a block of expected output,
likely due to a mistake during initial merge conflict resolution.
Tomas Vondra [Sun, 9 Jul 2017 11:52:50 +0000 (13:52 +0200)]
Disable support for CREATE PUBLICATION/SUBSCRIPTION
As the in-core logical replication is based on decoding WAL, there's no
easy way to support it on Postgres-XL as the WAL is spread over many
nodes. We essentially forward the actions to coordinators/datanodes,
and each of them has it's own local WAL. Reconstructing the global WAL
(which is needed for publications) would be challenging (e.g. because
replicated tables have data on all nodes), and it's certainly not
something we want to do during stabilization phase.
Supporting subscriptions would be challenging to, although for different
reasons (multiple subscriptions vs. multiple coordinators).
So instead just disable the CREATE PUBLICATION / SUBSCRIPTION commands,
just like we do for other unsupported features (e.g. triggers).
Noah Misch [Sun, 9 Jul 2017 07:43:17 +0000 (00:43 -0700)]
MSVC: Repair libpq.rc generator.
It generates an empty file, so libpq.dll advertises no version
information. Commit
facde2a98f0b5f7689b4e30a9e7376e926e733b8
mistranslated "print O;" in this one place.
Tom Lane [Sat, 8 Jul 2017 16:42:25 +0000 (12:42 -0400)]
Avoid unreferenced-function warning on low-functionality platforms.
On platforms lacking both locale_t and ICU, collationcmds.c failed
to make any use of its static function is_all_ascii(), thus probably
drawing a compiler warning. Oversight in my commit
ddb5fdc06.
Per buildfarm member gaur.
Tomas Vondra [Fri, 7 Jul 2017 23:17:29 +0000 (01:17 +0200)]
Accept trivial plan changes in brin regression test
Those are new plans in upstream, and the only that happened to them is
adding "Remote Fast Query Execution" to the top of the plan.
Tomas Vondra [Fri, 7 Jul 2017 23:11:47 +0000 (01:11 +0200)]
Accept trivial plan changes in join regression tests
This commit only accepts trivial plan changes, caused either by unique
joins patch (which may turn semijoins into regular joins), or by adding
a Remote Subquery on top of a new upstream plan.
There are multiple plans that change in a more complicated way,
requiring a more thorough investigation.
Tomas Vondra [Fri, 7 Jul 2017 22:51:01 +0000 (00:51 +0200)]
Resolve a trivial failure in the union regression test
The expected output for one of the queries contained an extra row, most
likely due to a mistake during merge conflict resolution. There were in
fact 8 rows, yet the expected query output said '(7 rows)'.
Tomas Vondra [Fri, 7 Jul 2017 22:43:12 +0000 (00:43 +0200)]
Accept plan changes in updatable_views regression tests
The accepted plan changes are trivial and exactly match upstream changes
mostly due to commits
215b43cdc8 and
92a43e4857.
I've not accepted changes in the two UPDATE plans at the very end, as
those are more complicated and require more thorough investigation.
Tomas Vondra [Fri, 7 Jul 2017 22:06:05 +0000 (00:06 +0200)]
Accept plan changes in inherit regression tests
The accepted plans are actually new in the upstream, testing planning on
partitioned tables. The changes are just adding "Remote Subquery" node
at the top, to distribute the query to all datanodes.
The plans look reasonable and correct in general, but as a sanity check
I've also reproduced them on XL 9.5 using plain inheritance (because
partitioning is new in PostgreSQL 10). Naturally there are differences
)e.g. with partitioning the planner includes only leaf partitions in
the plan, while with inheritance we include the whole inheritance tree)
but otherwise the plans seem to be close enough.
Alvaro Herrera [Fri, 7 Jul 2017 21:10:36 +0000 (17:10 -0400)]
Fix typo
Noticed while reviewing code.
Tomas Vondra [Fri, 7 Jul 2017 18:12:03 +0000 (20:12 +0200)]
Add missing block of expected output to inherit test
The expected output of 'inherit' regression test was missing a whole
block of code, likely due to incorrect resolution of merge conflict.
Add the missing piece, copied from upstream.
Tomas Vondra [Fri, 7 Jul 2017 18:10:07 +0000 (20:10 +0200)]
Stabilize order of results in insert regresion test
Same issue as in
c2392efc83, but in different regression test. Fixed the
same way, i.e. by adding ORDER BY clauses to stabilize the order.
Tomas Vondra [Fri, 7 Jul 2017 17:27:42 +0000 (19:27 +0200)]
Stabilize order of results in macaddr8 regression test
Same issue as in
c2392efc83, but in different regression test. Fixed the
same way, i.e. by adding ORDER BY clauses to stabilize the order.
Tomas Vondra [Fri, 7 Jul 2017 17:24:46 +0000 (19:24 +0200)]
Stabilize order of results in inet regression test
The tests were relying on ordering of rows implied by storage, but that
does not work in multi-node clusters. Fixed by adding ORDER BY clauses
stabilizing the order, and updating the expected results.
Magnus Hagander [Fri, 7 Jul 2017 12:08:55 +0000 (15:08 +0300)]
Fix out of date comment
Author: Masahiko Sawada <sawada.mshk@gmail.com>
Tomas Vondra [Thu, 6 Jul 2017 17:21:30 +0000 (19:21 +0200)]
Change type to (Node *) to fix compiler warning
get_object_address() expects the second parameter to be (Node *) but
we've been passing (List *), so that compilers were complaining. Just
change the type to fix this.
Tomas Vondra [Thu, 6 Jul 2017 17:08:48 +0000 (19:08 +0200)]
Add OCLASS_PGXC items to several switch statements
Multiple switch statements on oclass values are intentionally missing
the default clause. As the PGXC oclass options were missing, compilers
were complaining about it.
Teodor Sigaev [Thu, 6 Jul 2017 14:18:55 +0000 (17:18 +0300)]
Fix potential data corruption during freeze
Fix oversight in
3b97e6823b94 bug fix. Bitwise AND is used instead of OR and
it cleans all bits in t_infomask heap tuple field.
Backpatch to 9.3
Dean Rasheed [Thu, 6 Jul 2017 11:46:08 +0000 (12:46 +0100)]
Clarify the contract of partition_rbound_cmp().
partition_rbound_cmp() is intended to compare range partition bounds
in a way such that if all the bound values are equal but one is an
upper bound and one is a lower bound, the upper bound is treated as
smaller than the lower bound. This particular ordering is required by
RelationBuildPartitionDesc() when building the PartitionBoundInfoData,
so that it can consistently keep only the upper bounds when upper and
lower bounds coincide.
Update the function comment to make that clearer.
Also, fix a (currently unreachable) corner-case bug -- if the bound
values coincide and they contain unbounded values, fall through to the
lower-vs-upper comparison code, rather than immediately returning
0. Currently it is not possible to define coincident upper and lower
bounds containing unbounded columns, but that may change in the
future, so code defensively.
Discussion: https://postgr.es/m/CAAJ_b947mowpLdxL3jo3YLKngRjrq9+Ej4ymduQTfYR+8=YAYQ@mail.gmail.com