git.postgresql.org Git - postgres-xl.git/log

More thorough checks for distribution columns while creating inheritance

We now also do checks during CREATE TABLE. Also amend the alter_table test case
so that a few tables are distributed using the round robin method, so the new
checks/limitations don't get in their way. New test cases are also added to
ensure that the other checks for inheritance are exercised too.

Check for partitioned table correctly.

While checking where to forward DROP TABLE command, we were not checking for
partitioned table correctly. That resulted in incorrectly sending DROP TABLE to
remote coordinator for temporary partitioned tables.

Correct a mistake that occurred while merging sequence.c code

We were incorrectly overwriting the 'cached' value in the SeqTable element,
thus causing another request to the GTM when nextval is fetched. This resulted
in unintentional gaps in the sequence values. This patch fixes that, though
we might still get gaps unless sequence_range is set to 1. But this is by
design to reduce repeated round trips to the GTM.
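
The caching behaviour described above can be sketched with a toy model. This
is only an illustration under stated assumptions, not the sequence.c code:
FakeGtm, SeqCache, gtm_reserve and seq_nextval are hypothetical names, and the
GTM is reduced to a counter that hands out contiguous ranges.

```c
#include <assert.h>

/* Hypothetical stand-in for the GTM: hands out contiguous ranges of values. */
typedef struct FakeGtm
{
    long next_value;    /* first value not yet handed out */
    int  round_trips;   /* number of range requests served */
} FakeGtm;

/* Hypothetical per-session cache, loosely modelled on a SeqTable element. */
typedef struct SeqCache
{
    long last;          /* last value returned to the caller */
    long cached;        /* last value reserved for this session */
} SeqCache;

/* Reserve 'range' values and return the first one (one simulated round trip). */
long
gtm_reserve(FakeGtm *gtm, long range)
{
    long first = gtm->next_value;

    assert(range >= 1);
    gtm->next_value += range;
    gtm->round_trips++;
    return first;
}

/* Return the next value, contacting the fake GTM only when the cache is used up. */
long
seq_nextval(FakeGtm *gtm, SeqCache *cache, long range)
{
    long first;

    if (cache->last < cache->cached)
        return ++cache->last;           /* served from the cache, no round trip */

    first = gtm_reserve(gtm, range);
    cache->last = first;
    cache->cached = first + range - 1;  /* overwriting this forces extra round trips */
    return first;
}
```

With a range of 10, two sessions sharing one fake GTM interleave values from
different ranges, so gaps remain visible. With a range of 1 every call makes a
round trip and the values come out consecutive.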

Make temporary tables use shared storage on datanodes

Since a temporary table may be accessed by multiple backends on a datanode, XL
mostly treats such tables as regular tables. But the technique that was used to
distinguish between temporary tables that may need shared storage vs those which
are accessed only by a single backend wasn't foolproof. We were relying
on global session activation to make that distinction. This clearly fails when
a background process, such as the autovacuum process, tries to figure out whether
a table is using local or shared storage. This was leading to various problems,
such as, when the underlying file system objects for the table were getting
cleaned up, but without first discarding all references to the table from the
shared buffers.

We now make all temp tables use shared storage on the datanodes and thus
simplify things. Only EXECUTE DIRECT anyways does not set up global session, so
I don't think this will have any meaningful impact on the performance.

This should fix the checkpoint failures during regression tests.

Don't run ALTER ENUM in an autocommit block on remote nodes

Before PG 10, Postgres did not allow ALTER ENUM to be run inside a transaction
block. So we used to run these commands in auto-commit mode on the remote
nodes. But now Postgres has removed the restriction. So we also run the
statements in transaction block.

This fixes regression failures in the 'enum' test case.

Copy distribution information correctly to ProjectSet path

ProjectSet is a new path type in PG 10 and we had failed to copy the distribution
information correctly to the path. This was resulting in failures in many
regression test cases. Lack of distribution information prevented the
distributed query planner from adding a Remote Subplan node on top of the plan,
thus resulting in local execution of the plan. Since the underlying table is
actually a distributed table, local execution fails to fetch any data.

Fix this by properly copying distribution info. Several regression failures are
fixed automatically with this patch.

Partially accept plan changes in updatable_views

Upstream commit 215b43cdc8d6b4a1700886a39df1ee735cb0274d significantly
reworked planning of leaky functions. In practice that change means we
no longer have to push leaky functions into a subquery. Which greatly
simplifies some plans, including the two in this patch.

This commit accepts the plans only partially, though. It uses the plans
from upstream, and adds a Remote Subquery Scan node at the top, so we
accept the general plan shape change.

But there are a few additional differences that need further evaluation,
particularly in target lists (Postgres-XL generating more entries than
upstream) and SubPlans (Postgres-XL only generating one subplan, while
upstream generates two alternative ones).

Accept aggregation plan changes in xc_remote tests

The plans changed mainly due to abandoning the custom implementation of
two-phase aggregation, and using the upstream parallel aggregation.

That means we have stopped showing schema name in target lists, so
instead of

    Output: pg_catalog.avg((avg(xcrem_employee.salary)))

the EXPLAIN now shows

    Output: avg(xcrem_employee.salary)

and we also do projection at the scan nodes, so the target list only
shows the necessary subset of columns.

A somewhat surprising change is that the plans switch from distributed
aggregate plans like this one

    ->  Aggregate
        ->  Remote Subquery Scan
            ->  Aggregate
                -> Seq Scan

to always performing simple (non-distributed) aggregate like this

    ->  Aggregate
        ->  Remote Subquery Scan
            -> Seq Scan

This happens due to create_grouping_paths() relying on consider_parallel
flag when setting try_distributed_aggregate, disabling distributed
aggregation when consider_parallel=false. Both affected plans are however
for UPDATE queries, and PostgreSQL disables parallelism for queries that
do writes, so we end up with try_distributed_aggregate=false.

We should probably enable distributed aggregates in these cases, but we
can't ignore consider_parallel entirely, as we likely need some of the
checks. We will probably end up with consider_distributed flag, set in
a similar way to consider_parallel, but that's more an enhancement than
a bug fix.

Reject SQL functions containing utility statements

The check was not effective for the same reason as 5a54abb7acd, that is
not accounting for XL wrapping the original command into RawStmt. Fix
that by checking parsetree->stmt, and also add an assert checking we
actually got a RawStmt in the first place.

Produce proper error message for COPY (SELECT INTO)

Produce the right error message for COPY (SELECT INTO) queries, that is

ERROR: COPY (SELECT INTO) is not supported

instead of the incorrect

ERROR: COPY query must have a RETURNING clause

The root cause is that the check in BeginCopy() was testing raw_query,
but XL wraps the original command in RawStmt, so we should be checking
raw_query->stmt instead.

Add explicit VACUUM to inet test to actually do IOS

Some of the queries in inet test are meant to exercise Index Only Scans.
Postgres-XL was not however picking those plans due to stale stats on
the coordinator (reltuples and relpages in pg_class).

On plain PostgreSQL the tests work fine, as CREATE INDEX also updates
statistics stored in the pg_class catalog. For example this

    CREATE TABLE t (a INT);

    INSERT INTO t SELECT i FROM generate_series(1,1000) s(i);

    SELECT relpages, reltuples FROM pg_class
     WHERE relname = 't';

    CREATE INDEX ON t(a);

    SELECT relpages, reltuples FROM pg_class
     WHERE relname = 't';

will show zeroes before the CREATE INDEX command, and accurate values
after it completes.

On Postgres-XL that is not the case, and we will return zeroes even after
the CREATE INDEX command. To actually update the statistics we need to
fetch information from the datanodes the way VACUUM does it.

Fixed by adding an explicit VACUUM call right after the CREATE INDEX, to
fetch the stats from the datanodes and update the coordinator catalogs.

Tweak the query plan check in join regression test

The test expects the plan to use Index Scan, but with 1000 rows the
differences are very small. With two data nodes, we however compute
the estimates as if the tables had 500 rows, making the cost difference
even smaller.

Fixed by increasing the total number of rows to 2000, which means each
datanode has about 1000 and uses the same cost estimates as upstream.

Remove extra sprintf call in pg_tablespace_databases

The XL code did two function calls in the else branch, about like this:

    else
        /* Postgres-XC tablespaces also include node name in path */
        sprintf(fctx->location, "pg_tblspc/%u/%s_%s", tablespaceOid,
                TABLESPACE_VERSION_DIRECTORY, PGXCNodeName);
        fctx->location = psprintf("pg_tblspc/%u/%s_%s", tablespaceOid,
                                  TABLESPACE_VERSION_DIRECTORY,
                                  PGXCNodeName);

which is wrong, as only the first call is actually the else branch, the
second call is executed unconditionally.

In fact, the two calls attempt to construct the same location string,
but the sprintf call assumes the 'fctx->location' string is already
allocated. But it actually is not, so it's likely to cause a segfault.

Fixed by removing the sprintf() call, keeping just the psprintf() one.
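
To illustrate why the allocating call is the safe one, here is a minimal
sketch in plain C. It is not the PostgreSQL psprintf() implementation (which
allocates in a memory context); format_location and LOCATION_FMT are
hypothetical names, and malloc() stands in for the allocator. The point is
only that the buffer is sized and allocated before anything is written to it.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define LOCATION_FMT "pg_tblspc/%u/%s_%s"

/*
 * Hypothetical helper: build "pg_tblspc/<oid>/<version_dir>_<node_name>" in a
 * freshly allocated buffer. The caller releases the result with free().
 */
char *
format_location(unsigned int tablespace_oid, const char *version_dir,
                const char *node_name)
{
    char *buf;
    int   len;

    assert(version_dir != NULL && node_name != NULL);

    /* First pass: measure the formatted length without writing anything. */
    len = snprintf(NULL, 0, LOCATION_FMT, tablespace_oid, version_dir, node_name);
    if (len < 0)
        return NULL;

    buf = malloc((size_t) len + 1);
    if (buf == NULL)
        return NULL;

    /* Second pass: format into a buffer known to be large enough. */
    snprintf(buf, (size_t) len + 1, LOCATION_FMT, tablespace_oid, version_dir,
             node_name);
    assert(strlen(buf) == (size_t) len);
    return buf;
}
```

A bare sprintf() into an unallocated pointer skips the first two steps, which
is the failure mode described above.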

Noticed thanks to GCC 6.3 complaining about incorrect indentation.

Backpatch to XL 9.5.

Fix confusing indentation in gtm_client.c

GCC 6.3 complains that the indentation in gtm_sync_standby() is somewhat
confusing, as it might mislead people to think that a command is part of
an if branch. So fix that by removing the unnecessary indentation.

Refactor the construction of distributed grouping paths

The code generating distributed grouping paths was originally structured
like this:

    if (try_distributed_aggregation)
    { ... }

    if (can_sort && try_distributed_aggregation)
    { ... }

    if (can_hash && try_distributed_aggregation)
    { ... }

It's refactored like this, to resemble the upstream part of the code:

    if (try_distributed_aggregation)
    {
        ...

        if (can_sort)
        { ... }

        if (can_hash)
        { ... }
    }

Accept plan change in xc_groupby regression test

The plan changed in two ways. Firstly, the targetlists changed due to
abandoning the custom distributed aggregation and reusing the upstream
partial aggregation code. That means we're not prefixing the aggregate
with schema name, etc.

The plan also switches from distributed aggregation to plain aggregation
with all the work done on top of a remote query. This happens simply due
to costing, as the tables are tiny and two-phase aggregation has some
overhead. The original implementation (as in XL 9.5) distributed the
aggregate unconditionally, ignoring the costing.

Part of the problem is that the query groups by two columns from two
different tables, resulting in overestimation of the number of groups.
That means the optimizer thinks distributing the aggregation would not
reduce the number of rows, which increases the cost estimate as each
row requires network transfer and the finalize aggregate also depends
on the number of input rows.

We could make the tables larger and the optimizer would eventually
switch to distributed aggregate. For example this seems to do the
trick:

    insert into xc_groupby_tab1 select 1, mod(i,1000)
      from generate_series(1,20000) s(i);

    insert into xc_groupby_tab2 select 1, mod(i,1000)
      from generate_series(1,20000) s(i);

But it does not seem worth it, considering it's just a workaround
for the estimation issue and the increased duration. And we already
have other regression tests testing plausible queries benefiting from
distributed aggregation. So just accept the plan change.

Accept some of the differences in 'tidscan' test case

Most of these differences arise from the fact that rows are fetched from
multiple datanodes in XL and hence TIDs can be duplicated. There are some other
differences because of EXPLAIN output.

The test case does not yet pass because of other unaddressed failures.

Fix remaining problems in 'triggers' test case.

We don't support different column ordering for partitions in XL. So enforce
that either by ensuring consistent column ordering or accepting errors. Also we
don't support triggers in XL, so some changes were necessary to take that into
account.

Fix 'union' test case.

Since 93cbab90b0c6fc3fc4aa515b93057127c0ee8a1b we expect that the child table
has columns at the same position as the parent table. So fix the test case to
follow the rule and resolve expected output differences arising from that.

The test case should pass with this change.

Don't try to fetch table details using the old name after ExecRenameStmt

This used to work before PG 10, but some changes must have caused
non-deterministic behaviour. It anyways seems unsafe to look up the catalogs
using the old name once ExecRenameStmt has finished. The lookup may or may not
see the old tuple, depending on whether CommandCounterIncrement has happened in
between. We now fetch the required details before calling ExecRenameStmt and
use that info for subsequent processing.

This fixes some weird issues in 'alter_table' test case where we were failing
to send ALTER TABLE RENAME TO command to remote nodes and causing inconsistent
catalog entries between the coordinator and the remote nodes.

Accept some obvious regression differences in the 'join' test case

These are only placements of Remote FQS or Remote Subplan nodes in the newly
added explain plans in the test case. There are some remaining failures in the
test case which will need more scrutiny.

Accept changes in foreign_data test due to unsupported FDW

Postgres-XL does not support Foreign Data Wrappers, so accept failures
and output differences in the foreign_data regression test as expected.

Accept failures in triggers test due to unsupported features

Postgres-XL does not support triggers, naturally causing many failures
in the triggers regression test. So treat the error messages as expected
behavior and remove the NOTICEs produced by the triggers.

Another failure was caused by INSERT in a subquery, which is another
feature unsupported by Postgres-XL.

Accept two query plans in the join regression test

Both plans keep the same join algorithm as on PostgreSQL, and the plans
match those produced by Postgres-XL 9.5.

Resolve failures related to pg_node_tree type input

The input function for pg_node_tree pseudotype simply returns an error,
essentially disabling input from text. Some of the regression tests do

CREATE TABLE t AS SELECT * FROM pg_class;

or something like that, which works fine on PostgreSQL, but it fails on
Postgres-XL as we need to send data between nodes. Unfortunately the
pg_class now contains pg_node_tree column (relpartbound), causing failure
when reading the column on the remote node.

So instead use another catalogue, for example pg_attribute, which does
not contain any such column.

Drop the foo table in rangefuncs regression test

The table was made non-temporary by commit e98209019b4d2012, but it was
not dropped, which caused failures in the returning regression test. So
add the missing DROP TABLE.

Accept failures in large_object regression test

Large objects are unsupported on Postgres-XL, so the failures are
expected.

Accept plan changes in aggregates regression test

The plan changes come from the upstream, and XL only adds Remote Fast
Query Execution at the top.

Remove check of old-style C functions from misc test

The old-style C functions do not exist anymore, so remove the block of
code testing them (and failing). This test was removed from upstream,
and was only kept by mistake during the merge.

Resolve all failures in rangefuncs regression test

The failures were fairly trivial in nature:

* extra output for test block removed in PostgreSQL 10
* non-deterministic ordering of results
* functions referencing temporary tables, which does not work on XL
* triggers not supported on XL

Resolving the first two issues is simple - remove the extra blocks and
add ORDER BY to stabilize the ordering.

To fix the temp tables vs. functions issue I've simply made all the
tables non-temporary. The triggers were used merely to generate some
notices, so removing those from the expected output was enough.

Accept type name in duplicate key error message

The output routine for anyarray data type (anyarray_out) is modified to
include name of the data type in the text representation. Which is also
used when an error message includes the value, for example duplicate
key errors.

Modify enum tests to not rely on SAVEPOINT

Postgres-XL does not support SAVEPOINT (or subtransactions in general).
Some regression tests use this to test cases that are expected to fail,
and we want to keep testing that. So instead of removing the tests,
split them into independent transactions (and tweak the expected output
accordingly).

Note: The tests are still failing, though. Apparently XL does not undo
the effect of ALTER TYPE ... ADD VALUE on rollback.

Accept minor changes in sequence regression tests

Mostly just renames of the objects, received from upstream.

Accept plan changes in the equivclass regression test

The plans are fairly close to those generated on Postgres-XL 9.5, with
some minor differences due to upstream optimizer changes (e.g. missing
Subquery Scan when we can do Index Scan instead).

The main change is that when a Merge Join is identified as unique, we
may replace

   ->  Materialize
         ->  Sort
               Sort Key: ec1.f1 USING <
               ->  Remote Subquery Scan on all (datanode1)
                     ->  Index Scan using ec1_pkey on ec1
                           Index Cond: (ff = '42'::bigint)

with

   ->  Remote Subquery Scan on all (datanode_1)
         ->  Sort
               Sort Key: ec1.f1 USING <
               ->  Index Scan using ec1_pkey on ec1
                     Index Cond: (ff = '42'::bigint)

as there will be no rescans on the inner relation (so we do not need
the additional Materialize step).

Accept simple plan changes in tsearch regression test

The accepted plan changes are fairly simple, switching plain aggregation
to a distributed one.

Ensure grouping sets get properly distributed data

Grouping sets are stricter about distribution of input data, as all the
execution happens on the coordinator - there is no support for partial
grouping sets yet, so we can either push all the grouping set work to
the remote node (if all the sets include the distribution key), or make
sure that there is a Remote Subquery on the input path.

This is what Postgres-XL 9.6 was doing, but it got lost during merge
with PostgreSQL 10 which significantly reworked this part of the code.

Two queries still produce incorrect results, but those are not actually
using the grouping sets paths because

GROUP BY GROUPING SETS (a, b)

gets transformed into simple

GROUP BY a, b

and ends up using parallel aggregation. The bug seems to be that the
sort orders mismatch for some reason - the remote part produces data
sorted by "a" but the "Finalize GroupAggregate" expects input sorted
by "a, b" leading to duplicate groups in the result.

Adjust plans for new queries in privileges tests

The upstream privileges regression test added multiple checks of explain
plans, so the plans needed to be adjusted for Postgres-XL (by adding the
Remote Subquery nodes to appropriate places).

There are two plans that however mismatch the upstream version, using
a different join algorithm (Nested Loop vs. Hash Join). Turns out this
happens due to Postgres-XL not collecting stats for expression indexes,
and the two queries rely on that feature. Without the statistics the
estimates change dramatically, triggering a plan change.

We need to extend analyze_rel_coordinator() to collect stats not only
for the table, but for all indexes too. But that's really a matter for
a separate commit.

Stabilize ordering and accept plan in domain test

Trivial result ordering stabilization by adding ORDER BY clause, and
accepting explain output with additional Remote Subquery node (on top
of the upstream plan).

Stabilize result ordering in xc_having regression test

Stabilized by adding ORDER BY clause to two queries. Generate explain
plans both for the original and modified queries.

Add block of expected output to truncate test

The expected output was missing block testing ON TRUNCATE triggers. Add
it and tweak to reflect Postgres-XL limitations (no triggers or RESTART
IDENTITY).

There seems to be an issue in handling sequences with START WITH clause,
where we don't quite respect that and start with a higher value. And we
also don't increment by 1 for some reason.

Stabilize ordering of results in xc_FQS test

Ordering of some results in xc_FQS tests became unstable, so stabilize
it by adding ORDER BY clauses. Instead of just changing the explain
plans in the same way, generate plans both for the original query
(without the ORDER BY clause) and the new one.

Accept trivial plan change in xc_for_update test

The join is detected to be unique, which is indicated by 'Inner Unique'
key in the explain plan.

Accept int2vector not being hash distributable

Upstream commit 5c80642aa8de8393b08cd3cbf612b325cedd98dc removed support
for hashing int2vector data type, as it was dead code (upstream). That
means we can no longer distribute table by hash on this datatype.

We could reintroduce the hash function in Postgres-XL, but int2vector
seems rarely used as distribution key, so let's just fix the tests. If
needed, we can add the functions in the future.

Replace WARNING about skipped stats with a SELECT

ANALYZE throws a WARNING that extended statistics were not built due to
statistics target being 0 for some of the columns, but that gets thrown
on the datanode and we never forward it to the coordinator.

So explicitly check the contents of pg_statistic_ext to see whether the
statistic was built or not.

Accept output changes due to psql \d format tweaks

The format used by psql \d and \d+ changed a bit, splitting the single
Modifiers column into Collation, Nullable, Default. Additional commands
changed too, for example \dew+ now uses "options" instead of "Options"
and so on.

This commit accepts all such output changes across all regression tests.

Stabilize plan changes and ordering in xc_groupby test

The regression test was failing because many queries started producing
results with unstable ordering, so fix that by adding ORDER BY clause
to many of them.

This of course affects the plans that were part of the test too, so
instead of running query with ORDER BY clause and then checking plan
for a query without it, check plans for both query versions. This makes
the test somewhat bigger, but it seems to be worth it and the impact on
test duration is negligible.

Multiple query plans changed in non-trivial ways, too. This commit
accepts changes that are clearly inherited from the upstream, which was
verified by running the query on PostgreSQL 10 and accepting simple
changes (essentially adding "Remote Subquery" to reasonable places in
the plan or "Remote Fast Query Execution" at the top).

More complicated plan changes (e.g. switching from Group Aggregate to
Hash Aggregate or back, join algorithms etc.) are left unaccepted for
additional analysis.

The SQL script also generates multiple query plans that are not included
in the expected output. This is intentional, as the generated plans are
incorrect and produce incorrect ordering of results. The bug is that
queries like

SELECT sum(a) FROM t GROUP BY 1

end up producing results sorted by 'a' and not 'sum(a)'.

Remove 'current transaction is aborted' from rowsecurity.out

The rowsecurity test suite hits unsupported features in two places:

* WHERE CURRENT OF clause for cursors
* SAVEPOINTS (subtransactions)

which results in 'current transaction is aborted' errors in the rest of
the transaction. Those errors were added to the expected output file,
making the regression test succeed, but it may easily mask issues in the
aborted part of the transaction.

We need to rework the test so that it skips the unsupported features,
and still exercises the test on the remaining parts.

Stabilize expected plans/output for rowsecurity tests

This is a mix of simple fixes to stabilize the rowsecurity test.

1) Adding ORDER BY to multiple queries, to stabilize the output order.

2) Update expected query output by copying it from upstream (PostgreSQL).
   Apparently some of this got broken during merge conflict resolution.

3) Accept simple plan changes that add Remote Subquery either at the top
   of the plan, or in InitPlan / SubPlan or CTE.

4) Accept plan changes inherited from upstream, particularly removal of
   the "Subquery Scan" nodes from the plan.

5) Add expected output for "viewpoint from regress_rls_dave" section,
   missing in the expected file for some reason.

Accept simple plan changes in select_views test

All accepted plan changes are simply adding Remote Subquery at the top
of a plan merged from upstream, in a fairly obviously correct way.

There are two additional fixes, either adding a missing block of
expected output (copied from upstream), or removing an extra output.

Build extended stats on coordinators during ANALYZE

When running ANALYZE on a coordinator, we simply fetch the statistics
built on datanodes, and keep stats from a random datanode (assuming all
datanodes are similar in terms of data volume and data distribution).

This was only done for regular per-attribute stats, though, not for the
extended statistics added in PostgreSQL 10, causing various failures in
stats_ext tests due to missing statistics. This commit fixes this gap
by using the same approach as for simple statistics - we collect stats
from datanodes and keep the first result we receive for each statistic.

While working on this I realized this approach has some inherent issues,
particularly on columns that are distribution keys. As we keep stats
from a random node, we completely ignore MCV and histograms from the
remaining nodes. That may cause planning issues, but addressing it is
out of scope for this commit.

Merge remote-tracking branch 'remotes/PGSQL/master' of PG 10

This merge includes all commits up to bc2d716ad09fceeb391c755f78c256ddac9d3b9f
of PG 10.

Ensure that child table inherits distribution strategy from the parent.

For partitioned tables or in general inherited tables, we now enforce that the
child table always inherits the distribution strategy of the parent. This not
only makes it far easier to handle various cases correctly, but would also
allow us to optimise distributed queries on partitioned tables much more easily.

Tank.zhang <6220104@qq.com> originally reported a problem with partitioned
tables and incorrect query execution. Upon investigation, we decided to make
these restrictions to simplify things.

Fix ruleutils.c for domain-over-array cases, too.

Further investigation shows that ruleutils isn't quite up to speed either
for cases where we have a domain-over-array: it needs to be prepared to
look past a CoerceToDomain at the top level of field and element
assignments, else it decompiles them incorrectly. Potentially this would
result in failure to dump/reload a rule, if it looked like the one in the
new test case. (I also added a test for EXPLAIN; that output isn't broken,
but clearly we need more test coverage here.)

Like commit b1cb32fb6, this bug is reachable in cases we already support,
so back-patch all the way.

Reduce memory usage of tsvector type analyze function.

compute_tsvector_stats() detoasted and kept in memory every tsvector value
in the sample, but that can be a lot of memory. The original bug report
described a case using over 10 gigabytes, with statistics target of 10000
(the maximum).

To fix, allocate a separate copy of just the lexemes that we keep around,
and free the detoasted tsvector values as we go. This adds some palloc/pfree
overhead, when you have a lot of distinct lexemes in the sample, but it's
better than running out of memory.

Fixes bug #14654 reported by James C. Reviewed by Tom Lane. Backport to
all supported versions.

Discussion: https://www.postgresql.org/message-id/20170514200602.1451.46797@wrigleys.postgresql.org

commit_ts test: Set node name in test

Otherwise, the script output has a lot of pointless warnings.

This was forgotten in 9def031bd2821f35b5f506260d922482648a8bb0

Avoid integer overflow while sifting-up a heap in tuplesort.c.

If the number of tuples in the heap exceeds approximately INT_MAX/2,
this loop's calculation "2*i+1" could overflow, resulting in a crash.
Fix it by using unsigned int rather than int for the relevant local
variables; that shouldn't cost anything extra on any popular hardware.
Per bug #14722 from Sergey Koposov.
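
A minimal sketch of the index arithmetic involved, assuming a plain binary
min-heap of ints rather than the tuplesort.c data structures (sift_down and
left_child are hypothetical names). The child index 2*i+1 is computed in
unsigned int, which stays in range on the usual platforms where UINT_MAX is
2*INT_MAX+1, whereas the same expression in int would overflow once i exceeds
roughly INT_MAX/2.

```c
#include <assert.h>
#include <limits.h>

/*
 * Restore the min-heap property below index i in heap[0..n-1].
 * Hypothetical illustration; not the tuplesort.c code.
 */
void
sift_down(int *heap, unsigned int n, unsigned int i)
{
    /* Keeping both within INT_MAX means 2 * i + 1 cannot wrap around. */
    assert(n <= (unsigned int) INT_MAX);
    assert(i <= (unsigned int) INT_MAX);

    for (;;)
    {
        unsigned int j = 2 * i + 1;     /* left child, unsigned arithmetic */
        int          tmp;

        if (j >= n)
            break;
        if (j + 1 < n && heap[j + 1] < heap[j])
            j++;                        /* pick the smaller child */
        if (heap[i] <= heap[j])
            break;

        tmp = heap[i];
        heap[i] = heap[j];
        heap[j] = tmp;
        i = j;
    }
}

/* Left-child index in unsigned arithmetic, for checking the boundary case. */
unsigned int
left_child(unsigned int i)
{
    return 2 * i + 1;
}
```

For i = INT_MAX/2 + 1 the child index is INT_MAX + 2, which is representable
as unsigned int but not as int.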

Original patch by Sergey Koposov, modified by me per a suggestion
from Heikki Linnakangas to use unsigned int not int64.

Back-patch to 9.4, where tuplesort.c grew the ability to sort as many
as INT_MAX tuples in-memory (commit 263865a48).

Discussion: https://postgr.es/m/20170629161637.1478.93109@wrigleys.postgresql.org

Resolve most failures in stats_ext regression tests

This addresses simple failures in stats_ext regression tests, namely:

1) Failure to drop column that happens to be a distribution key of the
   table (picked automatically by CREATE TABLE). Resolved by explicitly
   distributing the table by another column.

2) Commenting out DDL on FDWs, which are unsupported in Postgres-XL, and
   we already have this covered by plenty of other tests.

3) Commenting out a DO block, checking that we can't create statistics
   on a TOAST table. The error message will vary depending on what OID
   is assigned to the TOAST table. There's no nice way to catch that
   error in Postgres-XL, due to missing support for subtransactions.

4) Accept trivial plan changes, that simply add Remote Subquery into the
   plan, or ship the whole query to the nodes (Fast Query Execution).
   There are additional plan changes that change the plans in other ways,
   but those need more investigation and are not accepted by this commit.

Fix variable and type name in comment.

Kyotaro Horiguchi

Discussion: https://www.postgresql.org/message-id/20170711.163441.241981736.horiguchi.kyotaro@lab.ntt.co.jp

Fix ordering of operations in SyncRepWakeQueue to avoid assertion failure.

Commit 14e8803f1 removed the locking in SyncRepWaitForLSN, but that
introduced a race condition, where SyncRepWaitForLSN might see
syncRepState already set to SYNC_REP_WAIT_COMPLETE, but the process was
not yet removed from the queue. That tripped the assertion, that the
process should no longer be in the queue. Reorder the operations in
SyncRepWakeQueue to remove the process from the queue first, and update
syncRepState only after that, and add a memory barrier in between to make
sure the operations are made visible to other processes in that order.
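
The ordering described above can be sketched with C11 atomics. This is an
illustration only: PostgreSQL uses its own barrier primitives rather than
<stdatomic.h>, and Waiter, wake_waiter and wait_is_complete are hypothetical
names standing in for the real queue and state handling.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

enum { WAITING = 0, WAIT_COMPLETE = 1 };

/* Hypothetical waiter: queue membership plus a state flag. */
typedef struct Waiter
{
    bool       in_queue;    /* plain field, published by the release fence */
    atomic_int state;
} Waiter;

/* Waker side: detach from the queue first, then publish the state change. */
void
wake_waiter(Waiter *w)
{
    w->in_queue = false;
    atomic_thread_fence(memory_order_release);  /* barrier between the two steps */
    atomic_store_explicit(&w->state, WAIT_COMPLETE, memory_order_relaxed);
}

/* Waiter side: once WAIT_COMPLETE is visible, the detach is visible too. */
bool
wait_is_complete(Waiter *w)
{
    if (atomic_load_explicit(&w->state, memory_order_relaxed) != WAIT_COMPLETE)
        return false;
    atomic_thread_fence(memory_order_acquire);
    assert(!w->in_queue);   /* the assertion that the old ordering could trip */
    return true;
}
```

Swapping the two steps in wake_waiter would let a waiter observe
WAIT_COMPLETE while still marked as queued, which is the race the commit
removes.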

Fixes bug #14721 reported by Const Zhang. Analysis and fix by Thomas Munro.
Backpatch down to 9.5, where the locking was removed.

Discussion: https://www.postgresql.org/message-id/20170629023623.1480.26508%40wrigleys.postgresql.org

Remove unnecessary braces, to match the surrounding style.

Mostly in the new subscription-related commands. Backport the few that
were also present in older versions.

Thomas Munro

Discussion: https://www.postgresql.org/message-id/CAEepm=3CyW1QmXcXJXmqiJXtXzFDc8SvSfnxkEGD3Bkv2SrkeQ@mail.gmail.com

Fix multiple assignments to a column of a domain type.

We allow INSERT and UPDATE commands to assign to the same column more than
once, as long as the assignments are to subfields or elements rather than
the whole column.  However, this failed when the target column was a domain
over array rather than plain array.  Fix by teaching process_matched_tle()
to look through CoerceToDomain nodes, and add relevant test cases.

Also add a group of test cases exercising domains over array of composite.
It's doubtless accidental that CREATE DOMAIN allows this case while not
allowing straight domain over composite; but it does, so we'd better make
sure we don't break it.  (I could not find any documentation mentioning
either side of that, so no doc changes.)

It's been like this for a long time, so back-patch to all supported
branches.

Discussion: https://postgr.es/m/4206.1499798337@sss.pgh.pa.us

Ensure all partitions of a partitioned table has the same distribution.

To optimise and simplify XL's distributed query planning, we enforce that all
partitions of a partitioned table use the same distribution strategy. We also
put further restrictions that all columns in the partitions and the partitioned
table have matching positions. This can cause some problems when tables have
dropped columns etc, but we think it's far better to optimise XL's plans than
supporting all corner cases. We can look at removing some of these
restrictions later once the more usual queries run faster.

These restrictions allow us to unconditionally push down Append and MergeAppend
nodes to datanodes when these nodes are processing partitioned tables.

Some regression tests currently fail because of these added restrictions. We
will look at them in due course.

Stamp 10beta2.

Translation updates

Source-Git-URL: git://git.postgresql.org/git/pgtranslation/messages.git
Source-Git-Hash: c5a8de3653bb1af6b0eb41cc6bf090c5522df52b

On Windows, retry process creation if we fail to reserve shared memory.

We've heard occasional reports of backend launch failing because
pgwin32_ReserveSharedMemoryRegion() fails, indicating that something
has already used that address space in the child process.  It's not
very clear what, given that we disable ASLR in Windows builds, but
suspicion falls on antivirus products.  It'd be better if we didn't
have to disable ASLR, anyway.  So let's try to ameliorate the problem
by retrying the process launch after such a failure, up to 100 times.
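
A bounded retry of this kind might look like the sketch below. It assumes
that "up to 100 times" means 100 attempts in total and that only the
shared-memory reservation failure is worth retrying; the LaunchResult type,
launch_with_retry() and the flaky_launch() demo launcher are hypothetical
and do not reflect the actual Windows code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption: 100 attempts in total; the constant name is hypothetical. */
#define MAX_LAUNCH_ATTEMPTS 100

typedef enum
{
    LAUNCH_OK,
    LAUNCH_RESERVE_FAILED,  /* address space already in use in the child */
    LAUNCH_FATAL            /* any other failure; retrying will not help */
} LaunchResult;

typedef LaunchResult (*launch_fn) (void *arg);

/*
 * Call launch() until it succeeds, fails for a non-retryable reason, or the
 * attempt budget is exhausted. *attempts_used reports how many calls were made.
 */
static bool launch_with_retry(launch_fn launch, void *arg, int *attempts_used)
{
    assert(launch != NULL && attempts_used != NULL);

    *attempts_used = 0;
    for (int attempt = 1; attempt <= MAX_LAUNCH_ATTEMPTS; attempt++)
    {
        LaunchResult result = launch(arg);

        *attempts_used = attempt;
        if (result == LAUNCH_OK)
            return true;
        if (result != LAUNCH_RESERVE_FAILED)
            return false;       /* do not retry unrelated failures */
    }
    return false;               /* still failing after all attempts */
}

/* Demo launcher: fails with a reservation error *arg times, then succeeds. */
static LaunchResult flaky_launch(void *arg)
{
    int *remaining_failures = arg;

    if (*remaining_failures > 0)
    {
        (*remaining_failures)--;
        return LAUNCH_RESERVE_FAILED;
    }
    return LAUNCH_OK;
}
```

For example, with a launcher that fails three times, launch_with_retry()
succeeds on the fourth attempt; with one that never succeeds, it gives up
after MAX_LAUNCH_ATTEMPTS calls.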

Patch by me, based on previous work by Amit Kapila and others.
This is a longstanding issue, so back-patch to all supported branches.

Discussion: https://postgr.es/m/CAA4eK1+R6hSx6t_yvwtx+NRzneVp+MRqXAdGJZChcau8Uij-8g@mail.gmail.com

Fix missing tag in the docs.

Masahiko Sawada

Discussion: https://www.postgresql.org/message-id/CAD21AoBCwcTNMdrVWq8T0hoOs2mWSYq9PRJ_fr6SH8HdO+m=0g@mail.gmail.com

Fix check for empty hostname.

As reported by Arthur Zakirov, GCC 7.1 complained about this with
-Wpointer-compare.
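
The commit text does not show the offending code. As a hedged illustration
of the kind of mistake -Wpointer-compare flags, the sketch below contrasts
comparing a pointer with '\0' (which only tests for a null pointer) with
checking the first character. The function name is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * A check written as
 *
 *     host != NULL && host != '\0'
 *
 * compares the pointer itself with the character constant zero, so the
 * second half is just another null-pointer test and an empty string slips
 * through. The intended test dereferences the pointer:
 */
static bool hostname_is_empty(const char *host)
{
    return host == NULL || host[0] == '\0';
}
```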

Discussion: https://www.postgresql.org/message-id/CAKNkYnybV_NFVacGbW=VspzAo3TwRJFNi+9iBob66YqQMZopwg@mail.gmail.com

Fix COPY's handling of transition tables with indexes.

Commit c46c0e5202e8cfe750c6629db7852fdb15d528f3 failed to pass the
TransitionCaptureState object to ExecARInsertTriggers() in the case
where it's using heap_multi_insert and there are indexes. Repair.

Thomas Munro, from a report by David Fetter
Discussion: https://postgr.es/m/20170708084213.GA14720%40fetter.org

Allow multiple hostaddrs to go with multiple hostnames.

Also fix two other issues, while we're at it:

* In error message on connection failure, if multiple network addresses
were given as the host option, as in "host=127.0.0.1,127.0.0.2", the
error message printed the address twice.

* If there were many more ports than hostnames, the error message would
always claim that there was one port too many, even if there was more than
one. For example, if you gave 2 hostnames and 5 ports, the error message
claimed that you gave 2 hostnames and 3 ports.
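
To illustrate the second point, here is a small sketch that counts the
entries in each comma-separated list and reports the real counts. It
assumes that a single port is allowed to apply to every host; the function
names and the message wording are hypothetical, not libpq's.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Number of comma-separated items in list; 0 for NULL or an empty string. */
static int count_list_items(const char *list)
{
    int n;

    if (list == NULL || list[0] == '\0')
        return 0;
    n = 1;
    for (const char *p = strchr(list, ','); p != NULL; p = strchr(p + 1, ','))
        n++;
    return n;
}

/*
 * Return 0 if the port list is compatible with the host list, else -1 with
 * a message in errbuf that names the actual counts (not "hosts + 1").
 */
static int check_port_count(const char *hosts, const char *ports,
                            char *errbuf, size_t errlen)
{
    int nhosts = count_list_items(hosts);
    int nports = count_list_items(ports);

    assert(errbuf != NULL && errlen > 0);
    errbuf[0] = '\0';

    /* Assumption: zero or one port is fine, it applies to all hosts. */
    if (nports <= 1 || nports == nhosts)
        return 0;

    snprintf(errbuf, errlen, "could not match %d port numbers to %d hosts",
             nports, nhosts);
    return -1;
}
```

Given 2 hostnames and 5 ports, this sketch reports 5 ports and 2 hosts
rather than 3 and 2.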

Discussion: https://www.postgresql.org/message-id/10badbc6-4d5a-a769-623a-f7ada43e14dd@iki.fi

Doc: remove claim that PROVE_FLAGS defaults to '--verbose'.

Commit e9c81b601 changed this, but missed updating the documentation.
The adjacent claim that we use TAP tests only in src/bin seems pretty
obsolete as well. Minor other copy-editing.

Doc: clarify wording about tool requirements in sourcerepo.sgml.

Original wording had confusingly vague antecedent for "they", so replace
that with a more repetitive but clearer formulation. In passing, make the
link to the installation requirements section more specific. Per gripe
from Martin Mai, though this is not the fix he initially proposed.

Discussion: https://postgr.es/m/CAN_NWRu-cWuNaiXUjV3m4H-riWURuPW=j21bSaLADs6rjjzXgQ@mail.gmail.com

Doc: desultory copy-editing for v10 release notes.

Improve many item descriptions, improve markup, relocate some items
that seemed to be in the wrong section.

Properly redistribute results of Gather Merge nodes

The optimizer was not generating correct distributed paths with Gather
Merge nodes, because those nodes always looked as if the data was not
distributed at all. There were two bugs causing this:

1) Gather Merge did not copy distribution from the subpath, leaving it
NULL (as if running on coordinator), so no Remote Subquery needed.

2) create_grouping_paths() did not check if a Remote Subquery is needed
on top of Gather Merge anyway.

After fixing these two issues, we're now generating correct plans (at
least judging by select_parallel regression suite).

Accept reasonable plan changes in select_parallel

All the accepted plan changes are simply adding Remote Subquery, and
seem correct and reasonable. Where possible, I've verified that the
older XL versions produce the same (or very similar) plan.

There are also three additional minor fixes:

1) An extra EXPLAIN query, as EXPLAIN ANALYZE hides the part below
   Remote Subquery, making it mostly useless. The extra EXPLAIN
   shows the whole plan and addresses this.

2) Postgres-XL does not support subtransactions, so the block setting
   effective_io_concurrency was failing, and aborting the surrounding
   transaction. Removing the EXCEPTION clause may cause issues on
   systems not supporting this GUC, but that should be rare.

3) Removed a section of expected output, matching a block removed
   from the SQL script.

Doc: update v10 release notes through today.

Doc: fix backwards description of visibility map's all-frozen data.

Thinko in commit a892234f8.

Vik Fearing

Discussion: https://postgr.es/m/b6aaa23d-e26f-6404-a30d-e89431492d5d@2ndquadrant.com

Accept pgxc_node as containing no pinned objects

The misc_sanity test checks for catalogs containing no pinned objects
after initdb. As pgxc_node is empty at that point (as expected), it got
caught by the new regression test. So add it to expected output.

Remove the setup_storm call omitted from last commit

Should have been part of the previous commit removing storm_catalog, but
I forgot to include this bit.

Remove storm_catalog schema

The storm_catalog schema is supposed to contain the same catalogs and
views as pg_catalog, but filtered to the current database. The use case
for this is multi-tenant systems, which was a StormDB feature.

But on XL this is mostly irrelevant, and the schema was not populated
since commit 8096e3edf17b260de15472eb04567d1beec1e3e6 which disabled
this part of initdb.

So instead of fixing the regression failures in misc_sanity caused by
this (initdb-time schema with no pinned objects), just rip all the
remaining bits out, including the pgxc_catalog_remap GUC etc.

This also removes the setup_storm() call disabled by 8096e3edf1, as the
function got removed since then.

Correct function line number in an error message

The line number was incorrect for some reason. Should be 24 and not 23.

Make the last query in object_address work again

The last query in the test was not really doing anything as it failed
on the first object unsupported by Postgres-XL. So remove all such
unsupported objects, to make it work again.

Add missing block of expected output to object_address

The regression test was clearly missing a block of expected output,
likely due to a mistake during initial merge conflict resolution.

Disable support for CREATE PUBLICATION/SUBSCRIPTION

As the in-core logical replication is based on decoding WAL, there's no
easy way to support it on Postgres-XL as the WAL is spread over many
nodes. We essentially forward the actions to coordinators/datanodes,
and each of them has its own local WAL. Reconstructing the global WAL
(which is needed for publications) would be challenging (e.g. because
replicated tables have data on all nodes), and it's certainly not
something we want to do during stabilization phase.

Supporting subscriptions would be challenging too, although for different
reasons (multiple subscriptions vs. multiple coordinators).

So instead just disable the CREATE PUBLICATION / SUBSCRIPTION commands,
just like we do for other unsupported features (e.g. triggers).

MSVC: Repair libpq.rc generator.

It generates an empty file, so libpq.dll advertises no version
information. Commit facde2a98f0b5f7689b4e30a9e7376e926e733b8
mistranslated "print O;" in this one place.

Avoid unreferenced-function warning on low-functionality platforms.

On platforms lacking both locale_t and ICU, collationcmds.c failed
to make any use of its static function is_all_ascii(), thus probably
drawing a compiler warning. Oversight in my commit ddb5fdc06.
Per buildfarm member gaur.

Accept trivial plan changes in brin regression test

Those are new plans in upstream, and the only thing that happened to them is
adding "Remote Fast Query Execution" to the top of the plan.

Accept trivial plan changes in join regression tests

This commit only accepts trivial plan changes, caused either by unique
joins patch (which may turn semijoins into regular joins), or by adding
a Remote Subquery on top of a new upstream plan.

There are multiple plans that change in a more complicated way,
requiring a more thorough investigation.

Resolve a trivial failure in the union regression test

The expected output for one of the queries contained an extra row, most
likely due to a mistake during merge conflict resolution. There were in
fact 8 rows, yet the expected query output said '(7 rows)'.

Accept plan changes in updatable_views regression tests

The accepted plan changes are trivial and exactly match upstream changes
mostly due to commits 215b43cdc8 and 92a43e4857.

I've not accepted changes in the two UPDATE plans at the very end, as
those are more complicated and require more thorough investigation.

Accept plan changes in inherit regression tests

The accepted plans are actually new in the upstream, testing planning on
partitioned tables. The changes are just adding "Remote Subquery" node
at the top, to distribute the query to all datanodes.

The plans look reasonable and correct in general, but as a sanity check
I've also reproduced them on XL 9.5 using plain inheritance (because
partitioning is new in PostgreSQL 10). Naturally there are differences
(e.g. with partitioning the planner includes only leaf partitions in
the plan, while with inheritance we include the whole inheritance tree)
but otherwise the plans seem to be close enough.

Fix typo

Noticed while reviewing code.

Add missing block of expected output to inherit test

The expected output of 'inherit' regression test was missing a whole
block of code, likely due to incorrect resolution of merge conflict.
Add the missing piece, copied from upstream.

Stabilize order of results in insert regression test

Same issue as in c2392efc83, but in different regression test. Fixed the
same way, i.e. by adding ORDER BY clauses to stabilize the order.

Stabilize order of results in macaddr8 regression test

Same issue as in c2392efc83, but in different regression test. Fixed the
same way, i.e. by adding ORDER BY clauses to stabilize the order.

Stabilize order of results in inet regression test

The tests were relying on ordering of rows implied by storage, but that
does not work in multi-node clusters. Fixed by adding ORDER BY clauses
stabilizing the order, and updating the expected results.

Fix out of date comment

Author: Masahiko Sawada <sawada.mshk@gmail.com>

Change type to (Node *) to fix compiler warning

get_object_address() expects the second parameter to be (Node *) but
we've been passing (List *), so that compilers were complaining. Just
change the type to fix this.

Add OCLASS_PGXC items to several switch statements

Multiple switch statements on oclass values are intentionally missing
the default clause. As the PGXC oclass options were missing, compilers
were complaining about it.
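
A minimal sketch of why the missing items matter: when a switch over an
enum has no default clause, compilers such as GCC (with -Wswitch) can report
every enum value that lacks a case, so newly added values have to be handled
explicitly. The enum and its members below are illustrative only and are not
the real oclass list.

```c
/* Illustrative enum; the real object class list is much longer. */
typedef enum
{
    OCLASS_CLASS,
    OCLASS_PROC,
    OCLASS_PGXC_NODE    /* stand-in for a newly added PGXC value */
} ObjectClass;

static const char *object_class_name(ObjectClass oclass)
{
    switch (oclass)
    {
        case OCLASS_CLASS:
            return "relation";
        case OCLASS_PROC:
            return "function";
        case OCLASS_PGXC_NODE:
            return "node";

            /*
             * No default clause on purpose: removing the case above makes
             * the compiler warn that OCLASS_PGXC_NODE is not handled.
             */
    }
    return "unrecognized";      /* reached only for out-of-range values */
}
```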

Fix potential data corruption during freeze

Fix oversight in 3b97e6823b94 bug fix. Bitwise AND was used instead of OR,
which cleared all bits in the t_infomask heap tuple field.
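
The effect of mixing up the two operators can be shown with a short sketch.
The flag values and function names are hypothetical; only the AND-versus-OR
behaviour is taken from the description above.

```c
#include <stdint.h>

/* Hypothetical flag bits; not the real t_infomask definitions. */
#define FLAG_HAS_NULLS      ((uint16_t) 0x0001)
#define FLAG_XMIN_COMMITTED ((uint16_t) 0x0100)
#define FLAG_XMAX_INVALID   ((uint16_t) 0x0800)

/* Buggy shape: AND keeps only the bit being "set", so everything else is lost. */
static uint16_t mark_xmax_invalid_buggy(uint16_t infomask)
{
    return (uint16_t) (infomask & FLAG_XMAX_INVALID);
}

/* Intended shape: OR adds the bit and preserves the existing ones. */
static uint16_t mark_xmax_invalid_fixed(uint16_t infomask)
{
    return (uint16_t) (infomask | FLAG_XMAX_INVALID);
}
```

Starting from a mask with FLAG_HAS_NULLS and FLAG_XMIN_COMMITTED set, the
AND variant yields 0, while the OR variant keeps both bits and adds
FLAG_XMAX_INVALID.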

Backpatch to 9.3

Clarify the contract of partition_rbound_cmp().

partition_rbound_cmp() is intended to compare range partition bounds
in a way such that if all the bound values are equal but one is an
upper bound and one is a lower bound, the upper bound is treated as
smaller than the lower bound. This particular ordering is required by
RelationBuildPartitionDesc() when building the PartitionBoundInfoData,
so that it can consistently keep only the upper bounds when upper and
lower bounds coincide.
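
The ordering rule can be sketched with a simplified single-column
comparator. The RangeBound type and range_bound_cmp() are hypothetical; the
real function also handles multi-column and unbounded values, which this
sketch leaves out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified single-column bound; hypothetical, for illustration only. */
typedef struct
{
    int  value;
    bool is_lower;  /* true for a lower bound, false for an upper bound */
} RangeBound;

/*
 * Compare by value first. When the values are equal but one bound is an
 * upper bound and the other a lower bound, the upper bound sorts first.
 */
static int range_bound_cmp(const RangeBound *a, const RangeBound *b)
{
    assert(a != NULL && b != NULL);

    if (a->value != b->value)
        return (a->value < b->value) ? -1 : 1;
    if (a->is_lower == b->is_lower)
        return 0;
    return a->is_lower ? 1 : -1;    /* upper bound is "smaller" */
}
```

With this rule, the upper bound of a partition ending at 10 sorts
immediately before the lower bound of the next partition starting at 10, so
a caller that sorts the bounds can drop the coinciding lower bound and keep
only the upper one.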

Update the function comment to make that clearer.

Also, fix a (currently unreachable) corner-case bug -- if the bound
values coincide and they contain unbounded values, fall through to the
lower-vs-upper comparison code, rather than immediately returning
0. Currently it is not possible to define coincident upper and lower
bounds containing unbounded columns, but that may change in the
future, so code defensively.

Discussion: https://postgr.es/m/CAAJ_b947mowpLdxL3jo3YLKngRjrq9+Ej4ymduQTfYR+8=YAYQ@mail.gmail.com