Pavan Deolasee [Thu, 25 Oct 2018 09:05:27 +0000 (14:35 +0530)]
Allow RECURSIVE queries on function scans
Pavan Deolasee [Thu, 25 Oct 2018 09:05:04 +0000 (14:35 +0530)]
Put proper date against GA release
Pavan Deolasee [Thu, 25 Oct 2018 05:35:16 +0000 (11:05 +0530)]
Fix a typo
Pavan Deolasee [Thu, 25 Oct 2018 05:15:20 +0000 (10:45 +0530)]
Stamp XL 10r1 GA
Pavan Deolasee [Tue, 23 Oct 2018 08:09:20 +0000 (13:39 +0530)]
Update release notes from XL 10r1 GA release
Pavan Deolasee [Tue, 23 Oct 2018 05:18:39 +0000 (10:48 +0530)]
Add two more parameters to pgxc_ctl.conf
pgxc_ctl sets up postgresql.conf and recovery.conf while adding and removing a
coordinator/datanode standby. This interferes with user-defined settings if the
user has chosen a different mechanism for backup/recovery, such as barman. So
this band-aid fix avoids overwriting the user-defined values while
adding/removing standbys. We assume that if these parameters are set to 'y'
(the default is 'n', which preserves the old behaviour), then the user will
manually set things up correctly.
Pavan Deolasee [Mon, 22 Oct 2018 12:00:03 +0000 (17:30 +0530)]
Use wal_level=replica as hot_standby is deprecated
Pavan Deolasee [Mon, 22 Oct 2018 10:54:47 +0000 (16:24 +0530)]
Catch and report GTM errors correctly
If the connection with the GTM is dropped, we should ensure that the caller
gets to see the error. This fixes some of the issues noticed during GTM
stop/start tests.
Pavan Deolasee [Fri, 19 Oct 2018 05:08:03 +0000 (10:38 +0530)]
Another improvement to handle multi-command queries correctly.
The new facilities in PG10 offer ways to split and detect the individual
commands in a multi-command query. But we were failing to use that correctly in
a few places. This was detected by the isolation tester, but there was also an
existing bug masked by incorrect acceptance of expected output. To be fair, the
failing test case was copied to xl_known_bugs, so this was a known issue for a
while.
This now fixes the regression and isolation tests.
Pavan Deolasee [Thu, 18 Oct 2018 08:15:22 +0000 (13:45 +0530)]
Temporary fix to get consistent results from ANALYZE
While we are not yet quite sure about the underlying problem, it seems that
sometimes ANALYZE fails to take into account recent updates. We suspect that
this happens because the cluster monitor takes a little longer to report
xmin and update the global xmin. For now, force a wait to get up-to-date stats.
Pavan Deolasee [Thu, 18 Oct 2018 08:05:21 +0000 (13:35 +0530)]
Fix expected output in the 'xml' test case
Pavan Deolasee [Thu, 18 Oct 2018 07:42:45 +0000 (13:12 +0530)]
Ensure that we don't try to dereference a NULL pointer
Pavan Deolasee [Mon, 15 Oct 2018 08:19:22 +0000 (13:49 +0530)]
Fix a couple of regressions introduced in the previous commit.
The format of the GTM control file set up during `initgtm` was broken. Also, we
failed to consider the case of a GTM standby initialising, which might pass a NULL
file handle.
Tomas Vondra [Fri, 12 Oct 2018 12:32:54 +0000 (14:32 +0200)]
Fix incorrect comparison in pgxcnode_gethash
The check is supposed to ensure NULL/empty nodename gets hashed to 0,
but (nodename == '\0') is comparing the pointer itself, not the first
character. So dereference that correctly.
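A minimal sketch of the class of bug being fixed (the function name and hashing scheme here are illustrative, not the actual XL implementation; only the NULL/empty-name check mirrors the fix):

```c
#include <stddef.h>

/* Illustrative stand-in for pgxcnode_gethash; the hashing details are
 * hypothetical, only the NULL/empty-string check mirrors the fix. */
static int
gethash_sketch(const char *nodename)
{
    /* The buggy form was (nodename == '\0'), which compares the pointer
     * itself against NULL, so an empty string would not hash to 0. */
    if (nodename == NULL || *nodename == '\0')
        return 0;               /* NULL or empty name hashes to bucket 0 */

    int hash = 0;
    for (const char *p = nodename; *p; p++)
        hash = (hash * 31 + (unsigned char) *p) % 1024;
    return hash;
}
```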
Tomas Vondra [Fri, 12 Oct 2018 12:23:29 +0000 (14:23 +0200)]
Use sufficiently large buffer in SharedQueueWrite
The sq_key alone may be up to 64 bytes, so we need more than that.
We could use dynamic memory instead, but 128 bytes should be enough
both for the sq_key and the other pieces.
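As a rough illustration (the name-composition format below is made up, not the actual SharedQueueWrite code), snprintf's return value shows why a buffer no larger than the sq_key itself cannot also hold a suffix:

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical sketch: compose a queue identifier from an sq_key that
 * may itself occupy up to 64 bytes.  snprintf returns the length the
 * full string needs, so the caller can verify destlen was sufficient
 * instead of silently truncating. */
static int
compose_sq_name(char *dest, size_t destlen, const char *sq_key, int nodeid)
{
    return snprintf(dest, destlen, "%s.%d", sq_key, nodeid);
}
```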
Tomas Vondra [Thu, 11 Oct 2018 14:51:09 +0000 (16:51 +0200)]
Use dynamic buffer to parse NODE_LIST_RESULT in GTM
When processing NODE_LIST_RESULT messages, gtmpqParseSuccess() used
a static buffer, defined as "char buf[8092]". This is an issue, as
the message has variable length, and may get long enough to exceed
any hard-coded limit. While that's not very common (it requires
long paths, node names and/or many GTM sessions on the node), it
may happen, in which case the memcpy() causes a buffer overflow and
corrupts the stack.
Fixing this is simple - allocate the buffer using malloc() instead,
requesting exactly the right amount of memory. This however hits
a latent pre-existing issue in the code, because the code was doing
memcpy(&buf,...) instead of memcpy(buf,...). With static buffers
this was harmless, because (buf == &buf), so the code was working
as intended (except when there were more than 8092 bytes). With
dynamic memory this is no longer true, because (buf != &buf), and
the stack corruption was much easier to trigger (just 8 bytes).
Per report and debug info by Hengbing. Patch by Pavan and me.
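The array-versus-pointer distinction described above can be demonstrated directly (the helper below is illustrative, not code from the patch):

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Demonstrates why memcpy(&buf, ...) worked by accident with a static
 * array but corrupts the stack with malloc'd memory: for an array the
 * array and its address coincide; for a pointer they do not. */
static int
demo_buf_vs_addr(void)
{
    char  fixed[16];
    char *dynamic = malloc(16);

    /* Array case: buf and &buf are numerically the same address. */
    assert((void *) fixed == (void *) &fixed);

    /* Pointer case: &dynamic is the address of the pointer variable
     * itself (on this function's stack), not of the heap buffer. */
    assert((void *) dynamic != (void *) &dynamic);

    /* Correct form: copy into the buffer the pointer refers to. */
    memcpy(dynamic, "ok", 3);
    int result = strcmp(dynamic, "ok");
    free(dynamic);
    return result;
}
```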
Pavan Deolasee [Wed, 10 Oct 2018 10:51:40 +0000 (16:21 +0530)]
Ensure CREATE TABLE (LIKE INCLUDING ALL) honors distribution strategy
We introduce two INCLUDING/EXCLUDING options to copy the distribution strategy
and the target nodes. It's quite unlikely that someone would create a LIKE
table but not want to copy the distribution information, but it gives the user
the flexibility to do that. Since we did not want to add more keywords, we simply
used DISTRIBUTE and NODE, though DISTRIBUTION and NODES may have been better
names.
Pavan Deolasee [Wed, 10 Oct 2018 04:33:58 +0000 (10:03 +0530)]
Wake up cluster monitor more aggressively when a backend is waiting
When a backend is waiting for another transaction to commit on the GTM, it
wakes up the cluster monitor process before going to sleep. This ensures that
the backend gets to know about transaction completion much sooner. This
mechanism shouldn't be too bad because a backend may potentially wait on
the global commit only after the transaction is locally committed. Hence in
most common cases, the wait shouldn't be too long.
Per report from Hengbing on the last patch.
Pavan Deolasee [Mon, 8 Oct 2018 07:38:50 +0000 (13:08 +0530)]
Track global GTM state via cluster monitor.
We received a report that sometimes a transaction fails to see a recently
committed row, especially when the query tries to insert a duplicate primary
key, sees another concurrent insert and then waits for the other transaction to
commit. The next statement in the transaction should see the concurrently
inserted row if the transaction is running in read committed mode. But in
practice, sometimes GTM may see the final COMMIT message of the second
transaction after it hands out a new snapshot to the first transaction and this
snapshot still shows the second transaction as running. The MVCC check then
fails to see the row.
This is quite a complex situation and a tricky one to handle in a distributed
system without some kind of logical ordering of events. For example, if a
transaction T1 sees a COMMIT of another transaction T2 anywhere in the cluster,
all subsequent actions of T1 should see T2's COMMIT. Enforcing such logical
ordering is not trivial without additional communication overhead.
So what we are now doing is to ensure that a transaction is reported as
in-progress on all nodes until the GTM sees the final COMMIT. But instead of
querying the GTM every time, we now maintain a local copy of the GTM's view of
running transactions. This local copy is either updated by the cluster-monitor
process at regular intervals or when other backends fetch a new snapshot from
the GTM. In order to ensure that the state only moves forward, we now also
have a concept of a snapshot_id or counter, which is incremented every time the
state on the GTM changes. Being a 64-bit counter, we don't need to worry about
wrap-around. A transaction is considered to be in-progress as long as it's
open on the GTM.
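The forward-only update rule can be sketched as follows (the structure and function names are hypothetical, not the actual XL code):

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical sketch of the forward-only rule: the local copy of the
 * GTM state is replaced only when the 64-bit counter advances, so a
 * stale report from the cluster monitor can never rewind the state. */
typedef struct
{
    uint64_t snapshot_id;       /* incremented on every GTM state change */
} LocalGTMState;

static bool
maybe_apply_update(LocalGTMState *state, uint64_t incoming_id)
{
    if (incoming_id <= state->snapshot_id)
        return false;           /* stale or duplicate update: ignore */
    state->snapshot_id = incoming_id;
    return true;
}
```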
Accept changes in xc_for_update test case's expected output. In fact, the old
output was wrong since we were disregarding the prepared transaction holding
the AEL on the table. This looks like a separate bug, which should be
investigated in more detail.
Pavan Deolasee [Tue, 9 Oct 2018 06:43:17 +0000 (12:13 +0530)]
Harden the rules for GTM control file structure
We introduced GTM control file version in commit
d372df43649, way back in
2016-03-02. But in order to support the older version, we were accepting any
bad input and overwriting the control file with initial data. This can lead to
many issues. Hence, stop accepting a bad version or bad contents. This should
hopefully not be a problem since we're working on a new XL 10 release. And even
if we backpatch this to XL 9.5, it shouldn't cause too many problems as the
version change was introduced before XL 9.5 GA.
We need to work on this more by adding even stricter checks for control
file sanity, including CRC checksums. But that can happen via a separate
commit.
Pavan Deolasee [Wed, 3 Oct 2018 06:00:30 +0000 (11:30 +0530)]
Relax restriction for column ordering for partition tables
We'd put an overly restrictive limitation that the column orders and column
positions must match between a partitioned table and its children. This was
done during the PG 10 merge while we were still fixing many other issues. But
now is the right time to revisit this restriction. While we still need to
ensure that the distribution column's type, name and position match across
all tables, we can relax the restriction for other columns. This should help in
a common situation where columns are added and dropped in the parent table. As
long as the physical position of the distribution column does not change, one
should now be able to create a child table and attach it as a partition without
worrying too much about dropped columns. Many other restrictions still apply as
they do in PostgreSQL.
A few test cases are added to test the new behaviour.
Pavan Deolasee [Fri, 28 Sep 2018 09:24:11 +0000 (14:54 +0530)]
Ensure config changes caused by functions are rolled back correctly.
When a function body has SET clauses attached to it, the function
validation/execution reflects those config changes locally as well as on the
remote nodes. But we were failing to restore the old values when the
command ends. This had gone unnoticed so far for lack of any test case
exercising this code. Note that there were existing test cases in this area,
but the bug got unmasked only after we added a temporary table in the session.
When a temporary table is accessed in a session, we don't reset the session at
the end of the transaction and hence the issue surfaced.
This was causing failure in a new test added in the 'rules' test. This patch
fixes that.
Pavan Deolasee [Fri, 28 Sep 2018 07:44:34 +0000 (13:14 +0530)]
Accept regression diffs in 'object_address' test case.
The differences are caused by our lack of support for CREATE
SUBSCRIPTION/PUBLICATION
Pavan Deolasee [Fri, 28 Sep 2018 07:40:17 +0000 (13:10 +0530)]
Accept regression diffs in 'portal' test case.
One change relates to the addition of Remote Subplan node and the other relates
to our lack of support for WHERE CURRENT OF
Pavan Deolasee [Fri, 28 Sep 2018 06:49:43 +0000 (12:19 +0530)]
Fix an oversight in
c8a42a0726911b737837d6763707c7d14d9a2987
Well spotted by Virendra Kumar
Pavan Deolasee [Fri, 28 Sep 2018 05:48:43 +0000 (11:18 +0530)]
Some edits to a TAP test
Pavan Deolasee [Fri, 28 Sep 2018 04:33:16 +0000 (10:03 +0530)]
Minor refactoring of the code to send down SET options
Pavan Deolasee [Thu, 27 Sep 2018 09:56:23 +0000 (15:26 +0530)]
Add a TAP test to do sanity tests post addition of coordinators/datanodes.
It's a very simplistic test right now, but hopefully we can extend it further
over time.
Pavan Deolasee [Wed, 26 Sep 2018 17:11:59 +0000 (22:41 +0530)]
Use read-write transaction to dump a database
In XL we use nextval() to get a consistent value of a sequence since different
nodes may have stale view of the sequence and hence we must obtain a value from
the GTM. Calling nextval() though requires us to execute the transaction in
read-write mode. So we do that. It shouldn't be a big deal since pg_dump is
very careful about only reading from the database (it was only recently that the
transaction was made read-only). More to the point, in XL we'd never used a
read-only transaction while dumping a database, but by default direct
connections to a datanode are treated as read-only transactions, and hence merely
removing the READ ONLY transaction attribute wasn't enough. We had to explicitly
mark it READ WRITE to override the default datanode behaviour.
Per report by Virendra Kumar
Pavan Deolasee [Wed, 26 Sep 2018 09:29:33 +0000 (14:59 +0530)]
Accept regression diff in 'xml' test case caused by merge issue
Pavan Deolasee [Wed, 26 Sep 2018 09:25:52 +0000 (14:55 +0530)]
Accept regression diffs arising due to lack of FDW support
Pavan Deolasee [Wed, 26 Sep 2018 09:24:23 +0000 (14:54 +0530)]
Accept regression diffs arising due to lack of trigger support
Pavan Deolasee [Wed, 26 Sep 2018 09:19:55 +0000 (14:49 +0530)]
Accept expected output diffs pertaining to Remote Subplan addition
All these test cases were failing because of Remote Subplan nodes in the explain
output.
Pavan Deolasee [Wed, 26 Sep 2018 08:26:21 +0000 (13:56 +0530)]
Accept regression diffs in 'inherit' test case
We're simply adding Remote Subplan nodes to the query plans
Pavan Deolasee [Wed, 26 Sep 2018 08:24:49 +0000 (13:54 +0530)]
Accept regression diff in 'create_table' test
We're simply printing the distribution info too
Pavan Deolasee [Wed, 26 Sep 2018 08:22:16 +0000 (13:52 +0530)]
Amend and accept regression diffs in 'insert' test case
We don't support triggers, and hence the do-nothing trigger doesn't have the
desired effect in XL. As a result, the inserts are not skipped in XL (whereas
in PostgreSQL they are skipped because the before-row trigger returns nothing).
Also, we expect columns to be in the same order for partitions. So make those
changes too.
Pavan Deolasee [Wed, 26 Sep 2018 07:21:28 +0000 (12:51 +0530)]
Ensure datanode WAL directory changes are reflected on failover.
We'd failed to take into account the changes to WAL dirs in pgxc_ctl.conf upon
failover. This would later cause issues while reading the conf file.
Per report and patch by Virendra Kumar
Pavan Deolasee [Tue, 25 Sep 2018 11:04:43 +0000 (16:34 +0530)]
Merge tag 'REL_10_5' into XL_10_STABLE
Pavan Deolasee [Tue, 25 Sep 2018 09:25:01 +0000 (14:55 +0530)]
Change some expected output files to reflect change in error message
This accounts for the change in the error message introduced in
305ad92ae05a7a367501be3f68cde9ff9f827622.
Pavan Deolasee [Thu, 20 Sep 2018 08:56:42 +0000 (14:26 +0530)]
Leverage binary-upgrade facility of pg_dump
While adding a new coordinator or a datanode, we take a schema dump of an
existing node and restore it on the new node. This has some problems,
especially while restoring tables with dropped columns and different column
ordering. The new node won't restore the dropped columns and may recreate
columns in a different ordering, for inherited tables for example. This then
leads to issues during query execution.
We now leverage binary-upgrade dump facility available in pg_dump/pg_restore.
While XL is not very sensitive to OID preservation across nodes, we do care
about column ordering. So binary-upgrade's OID preservation facility is not of
great interest to us, but it shouldn't do any harm either.
Also ensure that while restoring sequence states, we don't force update the GTM
state.
A TAP test is added to check node addition done this way.
The original report came from Krzysztof Nienartowicz. Patch and further work by
me.
Pavan Deolasee [Wed, 19 Sep 2018 06:38:44 +0000 (12:08 +0530)]
Correct error message to use word "distributed" instead of "partitioned".
Per report by Pallavi Sontakke
Pallavi Sontakke [Fri, 14 Sep 2018 07:18:32 +0000 (12:48 +0530)]
Add test for Set Returning Functions
Fixes #204
Pavan Deolasee [Tue, 18 Sep 2018 10:31:43 +0000 (16:01 +0530)]
Ensure consistent output in the 'select_views' test case
We added an ORDER BY clause to the query, but that returns different ordering
depending on the LC_COLLATE setting on a given machine. So enforce the correct
ordering by specifying the desired collation in the query itself.
Pavan Deolasee [Mon, 17 Sep 2018 13:44:39 +0000 (19:14 +0530)]
Correct date in the release notes
Pavan Deolasee [Mon, 17 Sep 2018 09:56:58 +0000 (15:26 +0530)]
Restamp Postgres-XL 10r1beta1 correctly this time.
Pavan Deolasee [Mon, 17 Sep 2018 09:56:32 +0000 (15:26 +0530)]
Correct some typos in the release notes
Pavan Deolasee [Fri, 14 Sep 2018 08:12:06 +0000 (13:42 +0530)]
Stamp Postgres-XL 10r1beta1
Pavan Deolasee [Mon, 17 Sep 2018 04:09:26 +0000 (09:39 +0530)]
Make COPYRIGHT changes
Pavan Deolasee [Fri, 14 Sep 2018 07:53:44 +0000 (13:23 +0530)]
Some updates to the release notes draft
Pavan Deolasee [Fri, 14 Sep 2018 05:51:02 +0000 (11:21 +0530)]
Fix a couple of compiler warnings
Pavan Deolasee [Fri, 14 Sep 2018 05:29:48 +0000 (10:59 +0530)]
Fix some more compiler warnings
Pavan Deolasee [Fri, 14 Sep 2018 04:47:59 +0000 (10:17 +0530)]
Remove some unused variables, fix compiler warnings
Pavan Deolasee [Fri, 14 Sep 2018 04:43:54 +0000 (10:13 +0530)]
Fix a few compiler warnings
Pavan Deolasee [Thu, 13 Sep 2018 10:12:14 +0000 (15:42 +0530)]
Write draft release notes for Postgres-XL 10r1beta1
Pavan Deolasee [Thu, 13 Sep 2018 10:11:34 +0000 (15:41 +0530)]
Add missing documentation for pgxl_remote_fetch_size
Pavan Deolasee [Wed, 12 Sep 2018 09:15:55 +0000 (14:45 +0530)]
Initialise a variable correctly.
This was leading to unexpected/unexplained crashes in the cluster monitor
process. Per report by Hengbing
Pavan Deolasee [Wed, 12 Sep 2018 08:55:57 +0000 (14:25 +0530)]
Disable FQS for explicit cursor queries.
The FQS mechanism is not currently handling cursors well as seen by failures in
the 'combocid' test case. The regular planner handles it well though. So for
now disable FQS for such queries.
Also remove the restriction that TID scan cannot be performed by non-FQS
queries. Regression does not throw up any errors. So remove the restriction
unless we see a counter example.
Tomas Vondra commented that this works fine in XL 9.5, so most likely we are
failing to send a proper snapshot to the datanodes. Further investigation is
required.
Pavan Deolasee [Wed, 12 Sep 2018 08:54:43 +0000 (14:24 +0530)]
Accept regression diffs in 'xml' test case
They are related to our lack of support for subtransactions (exception blocks)
and adding Remote Subquery Scan plan nodes.
Amruta Deolasee [Tue, 11 Sep 2018 06:02:56 +0000 (06:02 +0000)]
Test ALTER TABLE.. ADD PRIMARY KEY
Ensure that this sets columns NOT NULL on all child tables on all nodes
Amruta Deolasee [Mon, 10 Sep 2018 06:44:12 +0000 (06:44 +0000)]
test ALTER TYPE.. RENAME VALUE
Pavan Deolasee [Tue, 11 Sep 2018 09:21:09 +0000 (14:51 +0530)]
Run ANALYZE (COORDINATOR) on remote coordinators iff running outside a txn
block
We'd seen some distributed deadlocks when ANALYZE (COORDINATOR) is run on
remote coordinators, in parallel with VACUUM FULL on catalog tables. The
investigations so far indicate that the auxiliary datanode connections from
the other coordinators can lead to a distributed deadlock as the connections
from the originating coordinator still have the transaction open.
We now run the remote ANALYZE only when we are not running inside a user
transaction block. This allows us to commit the first transaction and then
start a new transaction to run ANALYZE (COORDINATOR) on the remote
coordinators, thus avoiding the distributed deadlock.
Per report from Pallavi Sontakke and lots of analysis by me, which may
still not be sufficient.
Pavan Deolasee [Tue, 11 Sep 2018 05:55:09 +0000 (11:25 +0530)]
Remove a problem test from 'inherit' test case
There is a problem with reading an anonymous record type, resulting in an error
such as "ERROR: input of anonymous composite types is not implemented".
This is an existing issue affecting even XL 9.5, but was masked so far by lack
of testing and also because we inadvertently accepted wrong output in XL 9.5.
For now, move the case to xl_known_bugs and open an issue to track this
potential long-standing bug.
Pavan Deolasee [Fri, 7 Sep 2018 07:51:56 +0000 (13:21 +0530)]
Move failing tests from 'with' to 'xl_known_bugs'
There are two remaining failures in the 'with' test case. Both of these bugs
exist in XL 9.5 too and hence moving them to the xl_known_bugs test case. We
have issues open for these bugs.
Pavan Deolasee [Fri, 7 Sep 2018 06:12:38 +0000 (11:42 +0530)]
Adjust 'limit' test case to avoid regression failures.
There are two known problems in the test case.
1. When nextval() is pushed down to the remote nodes, the coordinator session
won't have seen a nextval() invocation, and hence a subsequent currval() will
throw an ERROR or return a stale value.
2. When nextval() is pushed down to the remote nodes and a set of rows are
fetched (either ordered or otherwise), the results may be somewhat inconsistent
because each node will run nextval() concurrently and hence the ordering will
change depending on who runs when.
For now, change the queries to force nextval() evaluation on the coordinator.
Also move the original test cases to the xl_known_bugs test case; we have
an internal issue open to track this problem.
Pavan Deolasee [Fri, 7 Sep 2018 06:08:51 +0000 (11:38 +0530)]
Fetch sn_xcnt just once to ensure consistent msg
We're seeing some reports where fetching a snapshot from the GTM hangs forever on
the client side, waiting for data which never arrives. One theory is that the
snapshot->sn_xcnt value changes while sending the snapshot from the GTM, thus
causing a mismatch between what the server sends and what the client expects.
We fixed a similar problem in
1078b079d5476e3447bd5268b317eacb4c455f5d, but maybe
it's not complete. Put in this experimental patch (which can't make things
any worse for sure) while we also investigate other bugs in that area.
Pavan Deolasee [Thu, 6 Sep 2018 08:12:48 +0000 (13:42 +0530)]
Fix problems associated with globalXmin tracking by ClusterMonitor
The very first report by the cluster monitor may be discarded by the GTM if the
reported xmin has fallen far behind GTM's view. This leads to the globalXmin
value remaining Invalid in the shared memory state, as tracked by the
ClusterMonitor. The ClusterMonitor process usually naps for CLUSTER_MONITOR_NAPTIME
(default 5s) between two successive reports. But we skip that nap during the
bootup process and report the xmin a bit more aggressively. This should in all
likelihood set the globalXmin correctly before the regular backends start
processing.
The other major problem with the current code was that when the globalXmin
tracked in the shared memory state is Invalid, the callers were using
FirstNormalXid as the globalXmin. This could be disastrous, especially when the
XID counter has wrapped around: we could accidentally remove visible rows by using
a wrong value of globalXmin. We now fix that by computing the globalXmin using
the local state (just like we would have computed globalXmin in vanilla PG).
This should ensure that we never use a wrong or newer value for globalXmin
than what is allowed.
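The wraparound hazard follows from PostgreSQL's circular XID ordering, sketched here in simplified form (the real TransactionIdPrecedes also special-cases permanent XIDs below FirstNormalTransactionId, which is omitted):

```c
#include <stdint.h>
#include <stdbool.h>

typedef uint32_t TransactionId;

/* Simplified form of upstream PostgreSQL's TransactionIdPrecedes:
 * XIDs live on a modulo-2^32 circle, so ordering is decided by the
 * sign of the 32-bit difference. */
static bool
xid_precedes(TransactionId id1, TransactionId id2)
{
    return (int32_t) (id1 - id2) < 0;
}
```

Under this rule, after wraparound a recent XID such as 0xFFFFFF00 compares as preceding FirstNormalXid (3), which is why silently substituting FirstNormalXid for an Invalid globalXmin could let cleanup remove rows that are still visible.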
Accept the regression diff in the txid test case resulting from the fix. The new
expected output actually matches what upstream produces.
Per report by Hengbing and investigations/fix by me.
Pavan Deolasee [Wed, 5 Sep 2018 07:56:02 +0000 (13:26 +0530)]
Use correct format specifier for transaction ids
Pavan Deolasee [Wed, 5 Sep 2018 07:14:39 +0000 (12:44 +0530)]
Remove duplicate (and unused) declaration of XidStatus
Pavan Deolasee [Fri, 31 Aug 2018 09:42:42 +0000 (15:12 +0530)]
Force restart cluster monitor process if the GTM restarts
If the GTM loses node registration information and returns
GTM_ERRCODE_NODE_NOT_REGISTERED to the coordinator/datanode, restart the
cluster monitor process (by simply exiting with exit code 0). This ensures
that the cluster monitor re-registers with the GTM and starts cleanly.
Per report by Virendra Kumar
Pavan Deolasee [Wed, 29 Aug 2018 05:58:38 +0000 (11:28 +0530)]
Ensure pgxc_node_str returns the result in correct format
Per report from Krzysztof Nienartowicz: the incorrect format broke the
pgxc_clean command on XL 10.
Pavan Deolasee [Thu, 23 Aug 2018 06:20:23 +0000 (11:50 +0530)]
Allow minimum value of pgxl_remote_fetch_size to zero
This is experimental work. By setting pgxl_remote_fetch_size to 0, the user can
fetch all rows at once from the remote side (instead of fetching a small number
at a time, thus giving the node a chance to produce more rows as the previous
set is consumed). While fetching all rows at once is not very useful, it allows
us to circumvent PostgreSQL's limitation of not supporting parallel queries
unless a Portal is run once and to the end.
We do see hangs in regression while running with PGOPTIONS set to "-c
force_parallel_mode=on -c pgxl_remote_fetch_size=0", so there are issues that
need to be addressed. Also, it doesn't seem quite possible for users to
dynamically set pgxl_remote_fetch_size to enforce parallel query. So this is an
experimental feature that we don't expect users to heavily use, just yet.
Pavan Deolasee [Thu, 23 Aug 2018 05:24:37 +0000 (10:54 +0530)]
Ensure parallelModeNeeded flag is sent down to the remote node.
This still does not solve the problem that the datanodes don't make use of
parallel queries during distributed execution (they work OK if queries are
FQSed). That's because PostgreSQL does not support parallel queries when a
portal is used to fetch only a part of the result set. We need separate patches
to either fix that or work around it.
Pavan Deolasee [Wed, 22 Aug 2018 06:15:42 +0000 (11:45 +0530)]
Accept some more regression diffs in inherit.out
We changed one test case to use HASH distribution so that a primary key can be
defined. Another diff turned out to be a change from the upstream.
Pavan Deolasee [Tue, 21 Aug 2018 11:53:18 +0000 (17:23 +0530)]
Fix an assertion failure
Pavan Deolasee [Tue, 21 Aug 2018 09:54:13 +0000 (15:24 +0530)]
Ensure that RemoteSubplan is marked parallel unsafe.
We can't support multiple backends running RemoteSubplan in parallel. So block
that. In fact, we'd already done so by setting parallel_safe to false while
creating the path, but there is another code path which builds the plan node
directly and we'd missed that.
Pavan Deolasee [Mon, 20 Aug 2018 10:47:31 +0000 (16:17 +0530)]
Fix an oversight in
32025718755c4bbf100563fdc88e96225fc1b250
In passing, add test cases and fix other issues around fetching table sizes
from remote nodes. For example, handling of temporary tables and of names
requiring quoting had been broken for a long time too.
Per report by Virendra Kumar and further tests by me.
Pavan Deolasee [Mon, 20 Aug 2018 10:43:20 +0000 (16:13 +0530)]
Do not pushdown aggregates with SRFs in the targetlist
Pavan Deolasee [Fri, 17 Aug 2018 10:09:38 +0000 (15:39 +0530)]
Some minor changes to tsrf test case
We don't support SRF in VALUES clause yet. So make some minor adjustments to
the test case.
Pavan Deolasee [Fri, 17 Aug 2018 09:53:33 +0000 (15:23 +0530)]
Accept regression diffs in the sequence test case
These diffs are on account of differences in the way we handle WAL logging
and sequences on the coordinator. There is scope for improvement here from a
usability perspective, but these are not regressions from past behaviour.
Pavan Deolasee [Thu, 16 Aug 2018 12:01:53 +0000 (17:31 +0530)]
Make improvements to sequence handling
We were prematurely checking for maximum and minimum values, thus throwing an
error when the maximum/minimum was still one count away. Fix that.
Improve the GTM error message by including the sequence name and the max/min
value reached. This slightly changes the end-user error message, though, because
we also include the database and schema in the sequence name at the GTM.
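The boundary rule being fixed can be sketched as follows (the function and its signature are hypothetical, and int64 overflow at the extremes is ignored): the value equal to maxval must still be handed out; only the step beyond it should raise the "maximum reached" error.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical sketch for an ascending sequence.  The premature check
 * (the bug) would be (*current >= maxval - increment), which rejects
 * the request while maxval itself is still one step away. */
static bool
seq_next_value(int64_t *current, int64_t increment, int64_t maxval,
               int64_t *result)
{
    if (*current > maxval - increment)
        return false;           /* caller reports: maximum value reached */
    *current += increment;
    *result = *current;
    return true;
}
```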
Make some changes to the way the sequence is WAL-logged. This is still not
entirely correct, since we log every time a value is fetched from the GTM. But
then WAL logging is not strictly required in XL because sequence values are
managed at the GTM. In the old code, we were not WAL logging at all (though the
code existed and was a bit confusing).
The sequence tuple is still not maintained correctly at the coordinator because
it may not know about the sequence values fetched and consumed by the
datanodes. But that's an existing problem and we should look at it
separately.
Pavan Deolasee [Tue, 14 Aug 2018 08:59:53 +0000 (14:29 +0530)]
Ensure table name is schema qualified while running remote ANALYZE
While sending down ANALYZE (COORDINATOR) command to the remote coordinator, we
must ensure that the table name is properly schema qualified. Also add a few
tests to confirm that this works as expected in various scenarios.
Per report by Virendra Kumar.
Pavan Deolasee [Thu, 9 Aug 2018 08:25:05 +0000 (13:55 +0530)]
Automatically trigger ANALYZE (COORDINATOR) on remote coordinators
One of the long-standing problems in a multi-coordinator setup is that when a
table is either manually ANALYZEd or auto-analyzed on one coordinator, the other
coordinators don't update their planner statistics automatically. This is an even
bigger problem with auto-analyze because a coordinator which is not involved
in any DMLs may not even be aware of the changes to the table, and hence it
will not pick up the table for auto-analyze, thus often generating very poor
query plans.
We now fix that by automatically running the ANALYZE (COORDINATOR) command on the
remote coordinators when a table is either manually or automatically analyzed
on one coordinator. ANALYZE (COORDINATOR) does not force an ANALYZE on the
datanodes, but only rebuilds coordinator-side stats using the current stats
available on the datanodes.
One problem with running ANALYZE on the remote coordinators in the auto-analyze
process is that if one of the coordinators is down or not reachable, then it
will fail. This seems a bit too harsh because the worst that can happen is that
the unreachable coordinator will be left with stale stats. But that seems
better than letting auto-analyze fail on a running coordinator. We address this
by introducing a new mechanism by which a coordinator can execute a
command/query only on the currently available remote coordinators. We consult
the health map to decide which coordinators are currently reachable. Since the
health map itself can be stale for some short duration, there is a risk that
auto-analyze may still fail. But it shouldn't fail forever because the node
will be skipped next time.
Pavan Deolasee [Thu, 9 Aug 2018 08:22:28 +0000 (13:52 +0530)]
Fetch minmxid from remote nodes and compute local value
This was a TODO for quite some time. Just like we fetch relfrozenxid from
the remote nodes and compute a value at the coordinator, we now do the same for
multi-xid too.
Pavan Deolasee [Thu, 9 Aug 2018 08:21:11 +0000 (13:51 +0530)]
Remove some dead/commented code
Pavan Deolasee [Thu, 9 Aug 2018 07:33:58 +0000 (13:03 +0530)]
Correctly track XID obtained by autovac as one from the GTM
We were observing that a transaction started by the autovac process was left
open on the GTM, thus holding back xmin. This must have been a regression after
recent changes to track autocommit and regular transactions. We now correctly
track and close such transactions on the GTM.
Pavan Deolasee [Wed, 8 Aug 2018 05:17:21 +0000 (10:47 +0530)]
Add a test to do sanity check at the end
Right now it only contains a test to check replicated tables and confirm that
all nodes have the same number of rows. More to be added later.
Tom Lane [Mon, 6 Aug 2018 20:05:31 +0000 (16:05 -0400)]
Stamp 10.5.
Peter Eisentraut [Mon, 6 Aug 2018 18:03:55 +0000 (20:03 +0200)]
Translation updates
Source-Git-URL: https://git.postgresql.org/git/pgtranslation/messages.git
Source-Git-Hash:
3463878fc340cb1436153b18a300f8cfdcb12adb
Tom Lane [Mon, 6 Aug 2018 17:13:41 +0000 (13:13 -0400)]
Last-minute updates for release notes.
Security: CVE-2018-10915, CVE-2018-10925
Tom Lane [Mon, 6 Aug 2018 14:53:35 +0000 (10:53 -0400)]
Fix failure to reset libpq's state fully between connection attempts.
The logic in PQconnectPoll() did not take care to ensure that all of
a PGconn's internal state variables were reset before trying a new
connection attempt. If we got far enough in the connection sequence
to have changed any of these variables, and then decided to try a new
server address or server name, the new connection might be completed
with some state that really only applied to the failed connection.
While this has assorted bad consequences, the only one that is clearly
a security issue is that password_needed didn't get reset, so that
if the first server asked for a password and the second didn't,
PQconnectionUsedPassword() would return an incorrect result. This
could be leveraged by unprivileged users of dblink or postgres_fdw
to allow them to use server-side login credentials that they should
not be able to use.
Other notable problems include the possibility of forcing a v2-protocol
connection to a server capable of supporting v3, or overriding
"sslmode=prefer" to cause a non-encrypted connection to a server that
would have accepted an encrypted one. Those are certainly bugs but
it's harder to paint them as security problems in themselves. However,
forcing a v2-protocol connection could result in libpq having a wrong
idea of the server's standard_conforming_strings setting, which opens
the door to SQL-injection attacks. The extent to which that's actually
a problem, given the prerequisite that the attacker needs control of
the client's connection parameters, is unclear.
These problems have existed for a long time, but became more easily
exploitable in v10, both because it introduced easy ways to force libpq
to abandon a connection attempt at a late stage and then try another one
(rather than just giving up), and because it provided an easy way to
specify multiple target hosts.
Fix by rearranging PQconnectPoll's state machine to provide centralized
places to reset state properly when moving to a new target host or when
dropping and retrying a connection to the same host.
Tom Lane, reviewed by Noah Misch. Our thanks to Andrew Krasichkov
for finding and reporting the problem.
Security: CVE-2018-10915
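The shape of that fix, a centralized per-attempt reset point, can be sketched outside libpq. Everything below, including the field and method names, is a toy model rather than libpq's real state machine:

```python
class ConnAttempt:
    """Toy model of the bug class fixed in PQconnectPoll: per-attempt
    state must be wiped before trying the next target host, or a flag
    such as password_needed leaks from a failed attempt into the next.
    """

    def __init__(self, hosts):
        self.hosts = hosts
        self.password_needed = False

    def _reset_attempt_state(self):
        # The centralized reset point the fix introduces: every
        # per-attempt variable returns to its initial value here.
        self.password_needed = False

    def connect(self, try_host):
        """try_host(host) -> (ok, asked_for_password)."""
        for host in self.hosts:
            self._reset_attempt_state()
            ok, asked = try_host(host)
            self.password_needed = asked
            if ok:
                return host
        return None
```

Without the `_reset_attempt_state()` call, a first host that asked for a password and then failed would leave `password_needed` set even though the second host never asked, mirroring the incorrect `PQconnectionUsedPassword()` result described above.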
Peter Eisentraut [Mon, 11 Jun 2018 21:19:11 +0000 (17:19 -0400)]
Adjust error message
Makes it look more similar to other ones, and avoids the need for
pluralization.
Tom Lane [Sun, 5 Aug 2018 20:38:43 +0000 (16:38 -0400)]
Release notes for 10.5, 9.6.10, 9.5.14, 9.4.19, 9.3.24.
Tom Lane [Sun, 5 Aug 2018 17:03:42 +0000 (13:03 -0400)]
Doc: fix incorrectly stated argument list for pgcrypto's hmac() function.
The bytea variant takes (bytea, bytea, text).
Per unsigned report.
Discussion: https://postgr.es/m/153344327294.1404.654155870612982042@wrigleys.postgresql.org
Tom Lane [Sat, 4 Aug 2018 23:38:58 +0000 (19:38 -0400)]
Fix INSERT ON CONFLICT UPDATE through a view that isn't just SELECT *.
When expanding an updatable view that is an INSERT's target, the rewriter
failed to rewrite Vars in the ON CONFLICT UPDATE clause. This accidentally
worked if the view was just "SELECT * FROM ...", as the transformation
would be a no-op in that case. With more complicated view targetlists,
this omission would often lead to "attribute ... has the wrong type" errors
or even crashes, as reported by Mario De Frutos Dieguez.
Fix by adding code to rewriteTargetView to fix up the data structure
correctly. The easiest way to update the exclRelTlist list is to rebuild
it from scratch looking at the new target relation, so factor the code
for that out of transformOnConflictClause to make it sharable.
In passing, avoid duplicate permissions checks against the EXCLUDED
pseudo-relation, and prevent useless view expansion of that relation's
dummy RTE. The latter is only known to happen (after this patch) in cases
where the query would fail later due to not having any INSTEAD OF triggers
for the view. But by exactly that token, it would create an unintended
and very poorly tested state of the query data structure, so it seems like
a good idea to prevent it from happening at all.
This has been broken since ON CONFLICT was introduced, so back-patch
to 9.5.
Dean Rasheed, based on an earlier patch by Amit Langote;
comment-kibitzing and back-patching by me
Discussion: https://postgr.es/m/CAFYwGJ0xfzy8jaK80hVN2eUWr6huce0RU8AgU04MGD00igqkTg@mail.gmail.com
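Why a plain "SELECT * FROM ..." view accidentally worked: remapping a Var's attribute number through such a view's targetlist is the identity. A toy sketch of that remapping, with the list representation invented for illustration:

```python
def remap_attno(view_targetlist, attno):
    """Map an attribute number on the view to the attribute number on
    the underlying table, using the view's targetlist (entry i holds
    the base-table column produced as view column i+1).

    For a plain 'SELECT * FROM t' view the list is [1, 2, 3, ...] and
    remapping is a no-op, which is why the missed rewrite of the ON
    CONFLICT UPDATE clause went unnoticed in that case.
    """
    return view_targetlist[attno - 1]
```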
Michael Paquier [Sat, 4 Aug 2018 20:32:12 +0000 (05:32 +0900)]
Reset properly errno before calling write()
Commit 6cb3372 forces errno to ENOSPC when fewer bytes than expected have been
written and errno is unset, but it forgot to reset errno before the write()
system call, so errno could still hold the value left by a previous system
call.
Reported-by: Tom Lane
Author: Michael Paquier
Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/31797.1533326676@sss.pgh.pa.us
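The convention being repaired: reset errno to 0 immediately before each write(), then treat a short write that leaves errno at 0 as out-of-disk-space. A minimal sketch of the interpretation step (not the actual backend code):

```python
import errno

def classify_short_write(written, expected, saved_errno):
    """Return the errno to report for a write() that moved fewer bytes
    than requested. If the kernel left errno unset (0), assume the
    disk is full, per the convention commit 6cb3372 introduced; this
    is only sound if errno was reset to 0 right before the write,
    which is the step the fix adds.
    """
    if written == expected:
        return 0
    return saved_errno if saved_errno != 0 else errno.ENOSPC
```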
Peter Geoghegan [Fri, 3 Aug 2018 21:44:56 +0000 (14:44 -0700)]
Add table relcache invalidation to index builds.
It's necessary to make sure that owning tables have a relcache
invalidation prior to advancing the command counter to make
newly-entered catalog tuples for the index visible. inval.c must be
able to maintain the consistency of the local caches in the event of
transaction abort. There is usually only a problem when CREATE INDEX
transactions abort, since there is a generic invalidation once we reach
index_update_stats().
This bug is of long standing. Problems were made much more likely by
the addition of parallel CREATE INDEX (commit 9da0cc35284), but it is
strongly suspected that similar problems can be triggered without
involving plan_create_index_workers(). (plan_create_index_workers()
triggers a relcache build or rebuild, which previously only happened in
rare edge cases.)
Author: Peter Geoghegan
Reported-By: Luca Ferrari
Diagnosed-By: Andres Freund
Reviewed-By: Andres Freund
Discussion: https://postgr.es/m/CAKoxK+5fVodiCtMsXKV_1YAKXbzwSfp7DgDqUmcUAzeAhf=HEQ@mail.gmail.com
Backpatch: 9.3-
Alvaro Herrera [Fri, 3 Aug 2018 20:45:08 +0000 (16:45 -0400)]
Add 'n' to list of possible values to pg_default_acl.defaclobjtype
This was missed in commit ab89e465cb20; backpatch to v10.
Author: Fabien Coelho <coelho@cri.ensmp.fr>
Discussion: https://postgr.es/m/alpine.DEB.2.21.1807302243001.13230@lancre
Alvaro Herrera [Fri, 3 Aug 2018 20:34:59 +0000 (16:34 -0400)]
Fix pg_replication_slot example output
The example output of pg_replication_slot is wrong. Correct it and make
the output stable by explicitly listing columns to output.
Author: Kyotaro Horiguchi <horiguchi.kyotaro@lab.ntt.co.jp>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/20180731.190909.42582169.horiguchi.kyotaro@lab.ntt.co.jp
Tom Lane [Fri, 3 Aug 2018 16:20:47 +0000 (12:20 -0400)]
Remove no-longer-appropriate special case in psql's \conninfo code.
\conninfo prints the results of PQhost() and some other libpq functions.
It used to override the PQhost() result with the hostaddr parameter if
that'd been given, but that's unhelpful when multiple hosts were listed
in the connection string. Furthermore, it seems unnecessary in the wake
of commit 1944cdc98, since PQhost does any useful substitution itself.
So let's just remove the extra code and print PQhost()'s result without
any editorialization.
Back-patch to v10, as 1944cdc98 (just) was.
Discussion: https://postgr.es/m/23287.1533227021@sss.pgh.pa.us
Tom Lane [Fri, 3 Aug 2018 16:12:10 +0000 (12:12 -0400)]
Change libpq's internal uses of PQhost() to inspect host field directly.
Commit 1944cdc98 changed PQhost() to return the hostaddr value when that
is specified and host isn't. This is a good idea in general, but
fe-auth.c and related files contain PQhost() calls for which it isn't.
Specifically, when we compare SSL certificates or other server identity
information to the host field, we do not want to use hostaddr instead;
that's not what's documented, that's not what happened pre-v10, and
it doesn't seem like a good idea.
Instead, we can just look at connhost[].host directly. This does what
we want in v10 and up; in particular, if neither host nor hostaddr
were given, the host field will be replaced with the default host name.
That seems useful, and it's likely the reason that these places were
coded to call PQhost() originally (since pre-v10, the stored field was
not replaced with the default).
Back-patch to v10, as 1944cdc98 (just) was.
Discussion: https://postgr.es/m/23287.1533227021@sss.pgh.pa.us
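The decision this commit encodes can be sketched as a tiny helper; the field names mirror the description above, not libpq's actual structs:

```python
def identity_check_name(host, hostaddr, default_host="localhost"):
    """Pick the name to verify server identity (e.g. an SSL
    certificate) against. Per the fix: use the host field directly,
    falling back to the default host name when neither host nor
    hostaddr was given, and never substitute hostaddr the way
    PQhost() now may.
    """
    if host:
        return host
    if hostaddr:
        # host absent but hostaddr given: there is no host name to
        # check; the stored host field stays empty in this case.
        return ""
    return default_host
```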