Tatsuo Ishii [Sun, 11 Aug 2019 02:27:58 +0000 (11:27 +0900)]
Fix test failure of extended-query-test/disable-load-balance-always.
It expected the first SELECT to be sent to load balance node but a
preceding write query (DROP/CREATE TABLE) prevented it because it set
the writing_transaction flag. Fix is, instead of issuing DROP/CREATE
TABLE before the SELECT, issue harmless SET command after the SELECT
in extended query mode.
Tatsuo Ishii [Sat, 10 Aug 2019 23:34:33 +0000 (08:34 +0900)]
Fix extra test scripts to not fail.
Now that SELECT version() is always issued, the scripts get confused
because they extract the lines in question by using "grep SELECT". To
avoid the confusion, add "grep -v version" to the command pipeline.
Tatsuo Ishii [Fri, 9 Aug 2019 08:04:28 +0000 (17:04 +0900)]
Fix "unable to bind. cannot get parse message" error.
This was caused by a too-eager memory free in parse_before_bind. It
called the
pool_remove_sent_message/pool_create_sent_message/pool_add_sent_message
combo to replace the query context in the sent message. Unfortunately
pool_remove_sent_message frees memory such as the statement name, which
was passed in by the caller. As a result, the new sent message created
by pool_create_sent_message pointed to the freed statement name. A
later search by statement name may then fail, because the freed memory
area might be overwritten by a later memory allocation. The fix is,
instead of calling pool_remove_sent_message etc., to just replace the
query context in the sent message.
Per bug 531.
Tatsuo Ishii [Fri, 9 Aug 2019 05:50:56 +0000 (14:50 +0900)]
Create PostgreSQL version cache as early as possible.
Once an erroneous query is issued, the query to create the version cache
(SELECT version()) is ignored, which leads to failure in creating the
version cache. The fix is to create the version cache after the query
context is created and before the user query is sent in SimpleQuery()
and Parse().
Tatsuo Ishii [Fri, 9 Aug 2019 01:20:53 +0000 (10:20 +0900)]
Doc: update 4.1 release note.
For these commits: 2019-05-27 [33df0d33], 2019-08-08 [3922c12c].
Muhammad Usama [Thu, 8 Aug 2019 13:50:51 +0000 (18:50 +0500)]
Fix for 0000483: online-recovery is blocked after a child process exits ...
The problem is that if some child process exits abnormally during the second stage
of online recovery, the connection counter that keeps track of exiting
processes does not get decremented, and Pgpool-II keeps waiting for the exit of
the already exited process. Eventually, the recovery fails after
client_idle_limit_in_recovery expires.
The fix for this issue is to set the connection counter to zero when
client_idle_limit_in_recovery is enabled and has a smaller value than
recovery_timeout, since all clients must have been kicked out by the time
client_idle_limit_in_recovery expires.
A similar fix was already committed as part of bug 431 by Tatsuo Ishii, so this
commit basically imports the same logic into the watchdog function that processes
the remote online recovery requests.
Apart from the above-mentioned change, Hoshiai San identified that the watchdog
IPC command timeout for the online recovery start functions executed through
watchdog is set to exactly the same value as recovery_timeout, and that it needs
to be increased to make the solution work correctly.
Tatsuo Ishii [Thu, 8 Aug 2019 07:35:22 +0000 (16:35 +0900)]
Doc: run auto indent using emacs.
Here is the emacs script F.Y.I.
;; must be run by emacs
(load "/home/t-ishii/.emacs.d/init.el")
(find-file (nth 0 command-line-args-left));
(indent-region (point-min) (point-max));
(save-buffer)
Bo Peng [Thu, 8 Aug 2019 06:35:36 +0000 (15:35 +0900)]
Doc: Update "Pgpool-II + Watchdog Setup Example" configuration example.
Muhammad Usama [Wed, 7 Aug 2019 15:22:01 +0000 (20:22 +0500)]
Fix for no primary on standby pgpool when primary is quarantined on master
Master watchdog Pgpool sends primary_node_id = -1 in the backend status sync
message if the primary node is quarantined on it. So standby watchdog Pgpool
must not update its primary_node_id if the primary backend node id in sync
message is invalid_node_id (-1) while the same sync message reports the
backend status of the current primary node as "NOT DOWN".
The issue was reported by "Tatsuo Ishii <ishii@sraoss.co.jp>" and fixed by me
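The rule described in this commit can be sketched as follows. This is a minimal Python illustration of the decision only, not Pgpool-II's C code: the function name next_primary_node_id and its three arguments are hypothetical, and only the value -1 for an invalid node id is taken from the commit message.

```python
INVALID_NODE_ID = -1  # value named in the commit message


def next_primary_node_id(current_primary, synced_primary, current_primary_is_down):
    """Return the primary node id a standby watchdog keeps after a sync message.

    current_primary: primary node id the standby currently holds
    synced_primary: primary node id carried by the sync message
    current_primary_is_down: status of current_primary in the same message
    (all three names are assumptions made for this sketch)
    """
    if synced_primary == INVALID_NODE_ID and not current_primary_is_down:
        # The master most likely has the primary quarantined: keep our own view.
        return current_primary
    return synced_primary
```

With this rule a standby keeps node 0 as primary when the master reports -1 but still reports node 0 as not down, and follows the master in every other case.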
Tatsuo Ishii [Thu, 8 Aug 2019 05:55:53 +0000 (14:55 +0900)]
Doc: mention quorum failover introduced in 3.7
Also fix indentation.
Tatsuo Ishii [Thu, 8 Aug 2019 05:44:11 +0000 (14:44 +0900)]
Doc: fix indentation.
Also remove unnecessary xref label of sect2.
Tatsuo Ishii [Thu, 8 Aug 2019 02:38:02 +0000 (11:38 +0900)]
Make waiting for TIME_WAIT in pgpool_setup optional.
Since commit 3b32bc4e583da700cc8df7c5777e90341655ad3b, the shutdownall
script generated by pgpool_setup waits until the Pgpool-II socket in
TIME_WAIT state has disappeared. However, in most cases this takes a
long time and makes developers' testing work uncomfortable.
This commit makes the wait optional: the script waits for the
TIME_WAIT state to clear only when the environment variable
"CHECK_TIME_WAIT" is set to a value other than "false".
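The opt-in check can be sketched like this. It is a Python illustration of the condition only (the real script is shell generated by pgpool_setup); the function name is hypothetical, and treating an unset variable the same as "false" is an assumption based on the wording above.

```python
import os


def should_wait_for_time_wait(environ=None):
    """True only when CHECK_TIME_WAIT is set to something other than "false"."""
    env = os.environ if environ is None else environ
    # An unset variable is treated like "false", i.e. no waiting (assumption).
    return env.get("CHECK_TIME_WAIT", "false") != "false"
```

Passing a plain dict instead of os.environ keeps the check easy to exercise without touching the real environment.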
Tatsuo Ishii [Thu, 8 Aug 2019 02:02:50 +0000 (11:02 +0900)]
Import some of memory manager debug facilities from PostgreSQL.
Now we can use CLOBBER_FREED_MEMORY, which is useful to detect
accesses to already pfreed memory.
Tatsuo Ishii [Thu, 8 Aug 2019 02:00:35 +0000 (11:00 +0900)]
Enhance extended query test driver.
- Change diff format using context diff.
- Suppress diffs related to message line number changes.
- Fix indentation.
Bo Peng [Thu, 8 Aug 2019 00:48:43 +0000 (09:48 +0900)]
Remove some code that was forgotten to be deleted in a previous commit.
Bo Peng [Thu, 8 Aug 2019 00:45:00 +0000 (09:45 +0900)]
Add new arguments in pgpool_recovery function and failover_command/failback_command/follow_master_command.
It is now possible to use "recovery node port number" in the pgpool_recovery function.
Also the following options are added to failover_command/failback_command/follow_master_command.
- %N = old primary node hostname
- %S = old primary node port number
Bo Peng [Thu, 8 Aug 2019 00:43:51 +0000 (09:43 +0900)]
Revert "Add new arguments in pgpool_recovery function and failover_command/failback_command/follow_master_command."
This reverts commit
25a4237c9bc8db33f6710df8e43b285f36751038.
Bo Peng [Thu, 8 Aug 2019 00:26:36 +0000 (09:26 +0900)]
Add new arguments in pgpool_recovery function and failover_command/failback_command/follow_master_command.
It is now possible to use "recovery node port number" in the pgpool_recovery function.
Also the following options are added to failover_command/failback_command/follow_master_command.
- %N = old primary node hostname
- %S = old primary node port number
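As an illustration of how the two new placeholders behave in a command template, here is a small Python sketch. It is not Pgpool-II's implementation: the function name expand_command and the script path in the usage note are hypothetical, and only %N and %S are handled because they are the only placeholders named in this commit.

```python
import re


def expand_command(template, old_primary_host, old_primary_port):
    """Expand %N and %S in a command template; leave everything else as is."""
    values = {"%N": old_primary_host, "%S": str(old_primary_port)}
    # A single pass avoids re-expanding text that came from a substitution.
    return re.sub(r"%[NS]", lambda m: values[m.group(0)], template)
```

For example, expand_command("/path/to/failover.sh %N %S", "db1", 5432) yields "/path/to/failover.sh db1 5432", and any other placeholder in the template is passed through untouched.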
Tatsuo Ishii [Wed, 7 Aug 2019 07:55:38 +0000 (16:55 +0900)]
Doc: move watchdog chapter from "Tutorial" to "Server Administration".
The chapter was not best suited for tutorials.
Tatsuo Ishii [Wed, 7 Aug 2019 02:24:27 +0000 (11:24 +0900)]
Fix extended test driver to not ignore given arguments.
src/test/extended-query-tests/test.sh did not consider the given arguments
while executing 3 node tests, because "$1" was lost during other
processing. To fix the issue, "$1" is saved and used in subsequent
processing.
Takuma Hoshiai [Tue, 6 Aug 2019 08:25:58 +0000 (17:25 +0900)]
Fix global static variable to local static variable.
The variable auto_failback_interval is used only by establish_persistent_connection(),
so it is changed from a global static variable to a local static variable.
Tatsuo Ishii [Tue, 6 Aug 2019 02:27:30 +0000 (11:27 +0900)]
Overhaul health check debug facility.
check_backend_down_request() in health_check.c is intended to simulate
a communication failure between the health check process and a
PostgreSQL backend node by creating a file containing lines such as:
1 down
where the first field is the node id starting from 0, followed by a
tab and "down". When the health check process finds the file, it makes
the health check fail on node 1.
After the health check brings the node into down status,
check_backend_down_request() changes "down" to "already_down" to
prevent repeating the node failure.
However, the question is whether this is necessary at all. I think
check_backend_down_request() should keep on reporting the down status,
and it should be called inside establish_persistent_connection(),
because the failing situation can be simulated better in this way. For
example, currently the health check retry is not simulated, but the new
way can do it.
Moreover, in the current watchdog implementation, bringing a node into
quarantine state requires *two* detections of a node communication
error. Since check_backend_down_request() allows raising node down only
*once* (after that the down state is changed to the already_down
state), it's impossible to test the watchdog quarantine using
check_backend_down_request(). I changed check_backend_down_request()
so that it continues to raise the "down" event as long as the down
request file exists.
This commit enhances check_backend_down_request() as described above.
1) caller of check_backend_down_request() is
establish_persistent_connection(), rather than
do_health_check_child().
2) check_backend_down_request() does not change "down" to
"already_down" anymore. This means that the second argument of
check_backend_down_request() is not useful anymore. Probably I
should remove the argument later on.
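The new behavior (report "down" on every check for as long as the request file says so, without rewriting the file) can be sketched as below. This is a Python illustration of the file format described above, not the C function itself; the function name and the choice to pass the file contents as a string are assumptions.

```python
def node_requested_down(file_text, node_id):
    """Return True if file_text has a line "<node_id><TAB>down"."""
    for line in file_text.splitlines():
        fields = line.split("\t")
        if len(fields) != 2:
            continue  # ignore lines that do not match the expected layout
        node, status = fields
        if node.strip() == str(node_id) and status.strip() == "down":
            return True
    return False
```

Because the sketch never modifies its input, calling it repeatedly keeps returning True, which is what allows a retry or a second failure detection to be simulated.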
Tatsuo Ishii [Fri, 2 Aug 2019 04:03:13 +0000 (13:03 +0900)]
Fix segfault in streaming replication check process.
In case of failure of pg_stat_replication() call, the streaming
replication check process unconditionally tried to free the allocated
memory. This should only be done when the call succeeds.
Tatsuo Ishii [Fri, 2 Aug 2019 00:05:49 +0000 (09:05 +0900)]
Doc: mention the separate parser patch.
Muhammad Usama [Thu, 1 Aug 2019 20:23:25 +0000 (01:23 +0500)]
Multiple performance enhancements, especially for large
INSERT and UPDATE statements
Pgpool-II needs very little information, especially for INSERT and
UPDATE statements, to decide where it needs to send the query.
For example, in master-slave mode, for INSERT statements Pgpool-II only
requires the relation name referenced in the statement; it doesn't care
much about the column values and other parameters. But the parser we use
in Pgpool-II is taken from the PostgreSQL source and parses the complete query,
including the value lists. This seems harmless for smaller statements, but in
the case of INSERT and UPDATE with lots of column values and large data in the
value items, it consumes significant time.
So the idea here is to short-circuit the INSERT and UPDATE statement parsing as
soon as we have the required information. For that purpose, the commit adds a
second, minimal parser that gets invoked in master-slave mode and tries to
gain performance for large INSERT and UPDATE statements.
Apart from the second parser addition, the following changes aimed at
performance enhancement are also part of the commit.
1- Some of the if statements in the pool_where_to_send() function are re-arranged to
make sure the more expensive function calls, pattern_compare()
and pool_has_function_call(), are only made when they are
absolutely necessary.
2- Eliminates the raw_parser() calls in case of unrecognized queries. Instead
of invoking the parser on "dummy read" and "dummy write" statements, the commit
adds functions that return the pre-built parse_trees for these dummy queries.
3- The strlen() call is removed from the scanner_init() function, and the length is
passed to it as an argument. We already have the query length in most cases
before invoking the parser, so there is no reason to waste CPU cycles on it. Again,
this becomes significant in case of large query strings.
4- Removes some of the unnecessary calls of the pool_is_likely_select() function.
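The short-circuit idea from this commit (stop as soon as the relation name is known, without looking at the value list) can be illustrated with a toy Python sketch. It is not the minimal parser added by the commit, which is a real grammar-based parser; the regular expression, the function name insert_target, and the decision to ignore comments and WITH clauses are all simplifying assumptions.

```python
import re

# Matches only the head of the statement; the VALUES list is never scanned.
_INSERT_HEAD = re.compile(
    r'\s*INSERT\s+INTO\s+((?:"[^"]+"|\w+)(?:\.(?:"[^"]+"|\w+))?)',
    re.IGNORECASE,
)


def insert_target(query):
    """Return the relation name of an INSERT, or None if not recognized."""
    m = _INSERT_HEAD.match(query)
    return m.group(1) if m else None
```

The cost of insert_target() depends on the length of the statement head only, so an INSERT with thousands of value rows is handled as quickly as a one-row INSERT.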
Tatsuo Ishii [Thu, 1 Aug 2019 06:45:21 +0000 (15:45 +0900)]
Update 4.1 release note.
Tatsuo Ishii [Thu, 1 Aug 2019 06:14:37 +0000 (15:14 +0900)]
Add Pgpool-II 4.1 release note.
Tatsuo Ishii [Mon, 29 Jul 2019 07:29:13 +0000 (16:29 +0900)]
Fix check_replication_lag to not break auto_failback.
Commit bbfe3adb broke auto_failback because it doesn't collect info
from pg_stat_replication for a down node. This commit fixes the breakage
and eliminates a thinko:
- Do not count the number of active nodes, since it is useful to show
streaming replication status in "show pool_nodes" for all nodes which
are actually active and running even if Pgpool-II thinks they are
down, regardless of whether auto_failback is on or off.
- Call pg_stat_replication once, rather than call for each node.
Takuma Hoshiai [Mon, 29 Jul 2019 06:07:21 +0000 (15:07 +0900)]
Fix watchdog_setup command option
The mode option was handled incorrectly: when the pgpool_setup command was
called by the watchdog_setup command, the mode option was not set.
Takuma Hoshiai [Mon, 29 Jul 2019 05:28:02 +0000 (14:28 +0900)]
Fix typo and copyright.
Tatsuo Ishii [Sun, 28 Jul 2019 02:11:07 +0000 (11:11 +0900)]
Fix pgpool_setup to produce correct follow master command.
The produced script incorrectly checked whether PostgreSQL is running
or not. As a result, it mistakenly thought PostgreSQL was always
running.
Tatsuo Ishii [Sun, 28 Jul 2019 02:05:51 +0000 (11:05 +0900)]
Adjust WHERE clause of pg_stat_replication to not issue against down node.
Previously the query to get replication status was unconditionally
issued to down node, which produced annoying "no row returned" log.
Tatsuo Ishii [Sat, 27 Jul 2019 00:34:39 +0000 (09:34 +0900)]
Doc: add index to "quorum".
Takuma Hoshiai [Thu, 25 Jul 2019 05:17:15 +0000 (14:17 +0900)]
Doc: update chapter of 'Examples'
Update 'Examples' chapter, and fix typo in other chapters.
Reviewed by Tatsuo Ishii and Bo Peng.
Tatsuo Ishii [Thu, 25 Jul 2019 04:59:12 +0000 (13:59 +0900)]
Doc: mention that quorum state can be shown by using pcp_watchdog_info.
Bo Peng [Wed, 24 Jul 2019 12:11:33 +0000 (21:11 +0900)]
Feature: Import PostgreSQL 12 beta2 new parser.
The attached patch imports PostgreSQL12 beta2 parser to Pgpool-II 4.1.
Major changes of PostgreSQL 12 parser include:
- Add new VACUUM options:
-- SKIP_LOCKED
-- INDEX_CLEANUP
-- TRUNCATE
- Add COMMIT AND CHAIN and ROLLBACK AND CHAIN commands
- Add a WHERE clause to COPY FROM
- Allow to use CREATE OR REPLACE AGGREGATE command
- Allow to use mcv (most-common-value) in CREATE STATISTICS
- Add REINDEX option CONCURRENTLY
- Add EXPLAIN option SETTINGS
- etc.
Takuma Hoshiai [Mon, 22 Jul 2019 08:05:41 +0000 (17:05 +0900)]
Add regression test for auto_failback
Takuma Hoshiai [Mon, 22 Jul 2019 00:42:18 +0000 (09:42 +0900)]
Feature: auto failback
In Pgpool-II 4.1 or later, a backend node whose status is DOWN can be reattached automatically. This feature is enabled if auto_failback is set to on.
The auto_failback feature requires that the PostgreSQL backend nodes are 9.1 or later in streaming replication mode, and that the sr_check process and health_check are enabled.
It performs a health check on a standby node when the status of the standby node is down but the status of streaming replication between the primary and the standby is 'streaming'. If the health check succeeds, the failback process is executed automatically. This feature is useful when, for example, a health check failed because of a temporary network error.
Tatsuo Ishii [Wed, 17 Jul 2019 08:24:09 +0000 (17:24 +0900)]
Remove useless lines.
Since "contents" is never NULL, those lines are never executed.
Per Coverity.
Tatsuo Ishii [Wed, 17 Jul 2019 07:51:31 +0000 (16:51 +0900)]
Fix the failover() so that it does not access out of array.
Per Coverity.
Tatsuo Ishii [Wed, 17 Jul 2019 07:48:37 +0000 (16:48 +0900)]
Enhance shutdown script of pgpool_setup.
I observe occasional regression test failure caused by bind error to
the TCP/IP port. This fix tries to confirm usage of the TCP/IP port
while executing shutdown script using netstat command.
Tatsuo Ishii [Tue, 16 Jul 2019 05:12:30 +0000 (14:12 +0900)]
Doc: enhance client authentication docs.
Bo Peng [Wed, 10 Jul 2019 07:30:32 +0000 (16:30 +0900)]
Fix Pgpool-II rewriting query error when the queries include "GROUPS" and "EXCLUDE" in frame clauses.
This occurs only in native replication mode.
Thanks to Yugo Nagata for the bug reporting.
Tatsuo Ishii [Wed, 10 Jul 2019 00:03:16 +0000 (09:03 +0900)]
Use foreach macro in pool_temp_tables module.
It used to use plain for(), but there's no reason to not use the
convenient macro.
Tatsuo Ishii [Tue, 9 Jul 2019 22:38:21 +0000 (07:38 +0900)]
Fix logically dead code pointed out by Coverity.
Whether pwd is NULL or not was already checked, and there's no point
to check it again.
Tatsuo Ishii [Tue, 9 Jul 2019 04:31:52 +0000 (13:31 +0900)]
Doc: enhance watchdog documents regarding quorum failover.
Tatsuo Ishii [Mon, 8 Jul 2019 04:20:08 +0000 (13:20 +0900)]
Fix memory leak pointed out by Coverity.
When failed to create a connection to backend, new_connection() failed
to free memory.
Tatsuo Ishii [Sun, 7 Jul 2019 13:58:35 +0000 (22:58 +0900)]
Fix possible out of array index access.
It was pointed out by Coverity that node_id could be -1.
Tatsuo Ishii [Sun, 7 Jul 2019 01:09:25 +0000 (10:09 +0900)]
Fix query cache module so that it checks oid array's bound.
Tatsuo Ishii [Sat, 6 Jul 2019 23:24:16 +0000 (08:24 +0900)]
Fix to check the return value of strchr() in Pgversion().
Pointed out by Coverity.
Tatsuo Ishii [Sat, 6 Jul 2019 23:08:25 +0000 (08:08 +0900)]
Fix off-by-one error in query cache module.
When debug print is enabled, it might have tried to access beyond the
bounds of the oid array.
Tatsuo Ishii [Sat, 6 Jul 2019 23:03:02 +0000 (08:03 +0900)]
Fix buffer overrun in Pgversion().
Also enhance comments.
Tatsuo Ishii [Fri, 5 Jul 2019 05:32:43 +0000 (14:32 +0900)]
Allow health check process to reload pgpool.conf.
When separate health check process was introduced, we forgot to send
signal to the health check process when pgpool.conf reload is
requested.
Bo Peng [Thu, 4 Jul 2019 02:33:34 +0000 (11:33 +0900)]
Generate Makefile.in by automake 1.13.4.
Tatsuo Ishii [Wed, 3 Jul 2019 03:59:11 +0000 (12:59 +0900)]
Bug525: Fix segfault when query cache is enabled.
When query cache is enabled, the
session_context->query_context->skip_cache_commit flag was set or
reset while processing an execute message. The problem was that this was
done before session_context->query_context was set. So the fix is to just
set/reset query_context->skip_cache_commit instead;
session_context->query_context is set later on anyway.
Per bug 525.
Bo Peng [Wed, 3 Jul 2019 03:51:23 +0000 (12:51 +0900)]
Add execute permission to extended-query-test/extra_scripts.
Tatsuo Ishii [Wed, 3 Jul 2019 00:05:55 +0000 (09:05 +0900)]
Change config type of relcache_query_target from string to enum.
This parameter is implemented as enum, so change samples and docs
accordingly.
Tatsuo Ishii [Tue, 2 Jul 2019 09:40:11 +0000 (18:40 +0900)]
Make shutdownall to wait for completion of shutdown of Pgpool-II.
It was observed that the regression test occasionally failed because
the previous test had not completely finished before the next test
started. To fix the problem, make the shutdownall script generated by
pgpool_setup wait for completion of the shutdown of Pgpool-II.
Tatsuo Ishii [Tue, 2 Jul 2019 05:09:54 +0000 (14:09 +0900)]
Doc: enhance performance docs.
Tatsuo Ishii [Tue, 2 Jul 2019 04:53:56 +0000 (13:53 +0900)]
Doc: add more explanation for quorum failover of watchdog.
Also add to the Japanese doc the newer descriptions that were added to the English doc.
Tatsuo Ishii [Tue, 2 Jul 2019 02:27:58 +0000 (11:27 +0900)]
Change check_temp_table's type to enum.
It was actually implemented as 'enum' rather than 'string'. So change
all docs and sample configuration files.
Also allow to specify on/off for backward compatibility.
Tatsuo Ishii [Tue, 2 Jul 2019 00:37:38 +0000 (09:37 +0900)]
Doc: mention the raw mode and load_balance_mode = off cases for relation cache.
In this case queries for relation cache will not be sent.
Tatsuo Ishii [Tue, 2 Jul 2019 00:08:57 +0000 (09:08 +0900)]
Downgrade LOG to DEBUG5 in sent message module.
The log was added in commit 56a6b6a72, but in some cases it
disturbs users.
Discussion: [pgpool-general: 6620] Fwd: A lot of "checking zapping sent message" in log
Tatsuo Ishii [Mon, 1 Jul 2019 22:25:05 +0000 (07:25 +0900)]
Add test case for 'trace' to 026.temp_table's test.sh.
Commit 7a0471bb for test.sh forgot to test the 'trace' case.
Tatsuo Ishii [Mon, 1 Jul 2019 22:16:51 +0000 (07:16 +0900)]
Downgrade some LOG messages to DEBUG1 in the temp table trace module.
Tatsuo Ishii [Mon, 1 Jul 2019 04:59:18 +0000 (13:59 +0900)]
Doc: fix typo.
Bo Peng [Mon, 1 Jul 2019 03:50:49 +0000 (12:50 +0900)]
feature: Disable load balance after a SELECT having functions specified in black/white function list.
In Pgpool-II 4.0 or earlier, if we set "disable_load_balance_on_write = transaction",
when a write query is issued inside an explicit transaction,
subsequent queries are sent to the primary only until the
end of this transaction in order to avoid the replication
delay.
However, SELECTs having write functions specified in black_function_list
are not regarded as write queries, and the subsequent read queries are still load balanced.
This commit disables load balance after a SELECT having functions
specified in the black/white function list.
Tatsuo Ishii [Sat, 29 Jun 2019 23:03:10 +0000 (08:03 +0900)]
Doc: change check_temp_table Japanese docs.
This is due to recent change to English doc of the part.
Also enhance docs to add literal tag to string parameters.
Tatsuo Ishii [Thu, 27 Jun 2019 05:49:34 +0000 (14:49 +0900)]
Add new method to check temporary table.
Checking temporary tables is slow because it needs to look up system
catalogs. To eliminate the lookup, a new method that traces CREATE TEMP
TABLE/DROP TABLE is added. For this purpose a linked list is added to
the session context. The list keeps information on temp tables being
created/dropped (not yet committed) and temp tables already
created/dropped (committed or aborted). By looking up the list, it is
possible to decide whether a given table is temporary or not.
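The bookkeeping described above can be sketched in Python as follows. This is an illustration of the idea only: the class and method names are hypothetical, Python sets stand in for the linked list kept in the session context, and statements run outside an explicit transaction are assumed to be followed by an immediate commit().

```python
class TempTableTracer:
    """Track temp tables seen in CREATE TEMP TABLE / DROP TABLE statements."""

    def __init__(self):
        self.committed = set()  # temp tables known to exist
        self.creating = set()   # created in the open transaction
        self.dropping = set()   # dropped in the open transaction

    def create_temp(self, name):
        self.creating.add(name)

    def drop(self, name):
        if name in self.creating:
            self.creating.discard(name)  # never became visible
        elif name in self.committed:
            self.dropping.add(name)

    def commit(self):
        # Apply drops first so a drop-then-recreate ends up as "exists".
        self.committed -= self.dropping
        self.committed |= self.creating
        self.creating.clear()
        self.dropping.clear()

    def abort(self):
        # Pending creates and drops are simply forgotten.
        self.creating.clear()
        self.dropping.clear()

    def is_temp(self, name):
        if name in self.creating:
            return True
        return name in self.committed and name not in self.dropping
```

A lookup never touches the system catalogs, which is the point of the new method; tables created inside functions or triggers are invisible to such a tracer.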
However, it is impossible to trace table creation in functions and
triggers. Thus the new method does not completely replace the existing
method, and a choice of methods is offered by changing the
check_temp_table type from boolean to string.
"catalog": existing method (catalog lookup, same as check_temp_table = on)
"trace": the proposed new way
"none": no temp table checking (same as check_temp_table = off)
Tatsuo Ishii [Wed, 26 Jun 2019 01:31:15 +0000 (10:31 +0900)]
Reduce internal queries against system catalogs.
Currently the relcache module issues 7+ queries to obtain various info
from PostgreSQL system catalogs. Some of them are necessary for
Pgpool-II to work with multiple versions of PostgreSQL.
The idea is that if we already know the version of PostgreSQL, we can
eliminate some of the queries. For example, we need to know if
pg_namespace exists, and for this purpose we send a query against
pg_class. But if we know that pg_namespace was introduced in
PostgreSQL 7.3, we do not need to query pg_class.
To implement this, new function:
PGVersion *
Pgversion(POOL_CONNECTION_POOL * backend)
is added and "SELECT version()" is used to get PostgreSQL's major and
minor version numbers (the minor version is not used for now). The
query itself is cached using the relcache infrastructure. In addition,
Pgversion() has its own cache using static memory. Thus obtaining the
PostgreSQL version is quite fast for the second and later calls to the
function.
In my testing, "select * from t1" issued 7 queries. After the
patch, the number is reduced to 3. The remaining queries are:
- to know if it's a temporary table or not.
- to know if it's an unlogged table or not.
- to know if it's a system catalog table or not.
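The version lookup with its own cache can be sketched as below. This is a Python illustration, not the C function Pgversion(): the names parse_pg_version, pg_version and has_pg_namespace are hypothetical, run_query stands in for whatever sends a query to the backend, a module-level variable plays the role of the static cache, and taking the first two numeric components as (major, minor) is a simplification.

```python
import re

_cached_version = None  # stand-in for the static cache


def parse_pg_version(version_string):
    """Extract the first two numeric components from SELECT version() output."""
    m = re.match(r"PostgreSQL (\d+)(?:\.(\d+))?", version_string)
    if m is None:
        raise ValueError("unrecognized version string: %r" % version_string)
    return int(m.group(1)), int(m.group(2) or 0)


def pg_version(run_query):
    """Return the cached version, issuing SELECT version() only once."""
    global _cached_version
    if _cached_version is None:
        _cached_version = parse_pg_version(run_query("SELECT version()"))
    return _cached_version


def has_pg_namespace(version):
    """pg_namespace exists from PostgreSQL 7.3 on (per the commit message)."""
    return version >= (7, 3)
```

With the version at hand, a check such as has_pg_namespace() becomes a local comparison instead of a catalog query.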
Discussion: [pgpool-hackers: 3344] Reducing query issued by relcache
Takuma Hoshiai [Tue, 25 Jun 2019 06:09:37 +0000 (15:09 +0900)]
Fix regression test 023 ssl_connection
If TLS1.3 was used by openssl, this test failed.
This mistake is due to #51bc494aaa7fd191e14038204d18effe2efb0ec8.
Tatsuo Ishii [Mon, 24 Jun 2019 13:13:18 +0000 (22:13 +0900)]
Fix mistake introduced in the previous commit.
Tatsuo Ishii [Mon, 24 Jun 2019 01:57:34 +0000 (10:57 +0900)]
Fix segfault when "samenet" is specified in pool_hba.conf.
When "samenet" is specified, SockAddr_cidr_mask(struct
sockaddr_storage *mask, char *numbits, int family) gets called with
numbits == NULL. However the function was not prepared for
it. Originally the function was imported from PostgreSQL. When the bug
was fixed in PostgreSQL, unfortunately the fix was not applied to
Pgpool-II. This commit applies the same fix as PostgreSQL.
Discussion: [pgpool-general: 6601] Pgpool-II + hba + samenet = segfault in libc-2.24.so
Tatsuo Ishii [Thu, 20 Jun 2019 04:06:27 +0000 (13:06 +0900)]
Allow to route relcache queries to load balance node.
Queries to build relcache entries were always sent to master (primary)
node. This is usually good because we could eliminate the bad effect
of replication delay. However, if users want to lower the load on the
master node, it would be nice if we could route the queries to a node
other than the master node. This patch introduces a new parameter,
"relcache_query_target". If it is set to 'load_balance_node', relcache
queries will be routed to the load balance node. If it is set to
'master', the queries are routed to the master node, which is the same
as before (this is the default).
Bo Peng [Thu, 20 Jun 2019 03:53:40 +0000 (12:53 +0900)]
doc: Fix documentation typos.
Bo Peng [Wed, 19 Jun 2019 01:59:59 +0000 (10:59 +0900)]
doc: Fix documentation typo.
Bo Peng [Wed, 19 Jun 2019 01:49:33 +0000 (10:49 +0900)]
doc: Fix documentation errors in follow_sh script.
Tatsuo Ishii [Tue, 11 Jun 2019 22:18:11 +0000 (07:18 +0900)]
Doc: fix typo in performance section.
Takuma Hoshiai [Tue, 18 Jun 2019 06:02:38 +0000 (15:02 +0900)]
Support ECDH key exchange with SSL
Pgpool-II now supports ECDH key exchange with SSL connections.
Add new parameters 'ssl_ecdh_curve' and 'ssl_dh_params_file' to use
ECDH key exchange. The user can use more secure communication with
SSL, as with PostgreSQL.
Tatsuo Ishii [Tue, 11 Jun 2019 04:47:42 +0000 (13:47 +0900)]
Fix health check process is not shutting down in certain cases.
When watchdog detects fatal events, including failing to reach
trusted_servers, watchdog kills itself with the POOL_EXIT_FATAL exit
status code. In this case the SIGCHLD handler reaper() of the watchdog's
parent, the pgpool main process, exits, and the on_exit callback calls
system_will_go_down(), which in turn calls terminate_all_children().
The problem is that terminate_all_children() forgot to kill the health
check process. This commit fixes that.
Also, some poorly behaving code is improved.
Back-patched to 3.7, where the bug was introduced.
Bo Peng [Fri, 7 Jun 2019 08:19:37 +0000 (17:19 +0900)]
Fix to deal with backslashes according to the config of standard_conforming_strings
in native replication mode.
per bug467.
Tatsuo Ishii [Fri, 7 Jun 2019 07:50:39 +0000 (16:50 +0900)]
Allow to use MD5 hashed password in health_check_password and sr_check_password.
If they are left blank (empty string), then pool_passwd is consulted.
Tatsuo Ishii [Sun, 2 Jun 2019 03:29:09 +0000 (12:29 +0900)]
Doc: add more description to pcp_node_info manual.
For some items to be displayed correctly, sr_check_user needs more privileges.
Tatsuo Ishii [Sun, 2 Jun 2019 02:40:40 +0000 (11:40 +0900)]
Doc: add description to pg_md5 man page on how to show a pool_passwd-ready string.
Sometimes it is necessary to just show md5 hash string suitable for
pool_passwd, without adding an entry to pool_passwd.
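For reference, the hash string itself can be reproduced with a few lines of Python. The sketch assumes that pool_passwd MD5 entries follow PostgreSQL's MD5 password format, that is, the literal prefix "md5" followed by the MD5 hex digest of the password concatenated with the user name; the function name is hypothetical, and the pg_md5 command remains the supported way to produce the string.

```python
import hashlib


def pool_passwd_md5(username, password):
    """Return "md5" + md5(password + username), PostgreSQL's MD5 format."""
    digest = hashlib.md5((password + username).encode("utf-8")).hexdigest()
    return "md5" + digest
```

The result is always 35 characters long: the 3-character prefix plus a 32-character hex digest.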
Tatsuo Ishii [Wed, 29 May 2019 17:33:39 +0000 (02:33 +0900)]
Fix segfault of worker process if sr_check_user does not have enough privilege.
If sr_check_user does not have enough privilege, selecting from
pg_stat_replication returns NULL for some columns, which caused a
segfault.
Muhammad Usama [Tue, 28 May 2019 09:30:21 +0000 (14:30 +0500)]
Documentation updates to reflect the changed behaviour of quarantined nodes.
Mention the resignation of the watchdog master when the primary backend gets
quarantined on it, and the continuous health checking of quarantined nodes.
Muhammad Usama [Sun, 26 May 2019 20:59:06 +0000 (01:59 +0500)]
Second part for [pgpool-hackers: 3295] duplicate failover request ...fix
As per the discussion on the thread [pgpool-hackers: 3295], we came to the
conclusion that the master watchdog node should resign from master
responsibilities if the primary backend node gets into quarantine state on it.
The commit implements this behaviour by making the master/coordinator watchdog
node resign from its status if it fails to get the consensus for the quarantined
primary node failover within FAILOVER_COMMAND_FINISH_TIMEOUT(15) seconds.
When the watchdog master resigns because of a quarantined primary node, its
wd_priority is decreased to (-1), so that it gets the least preference
in the next election for the master/coordinator node selection. Once the
election is concluded, the wd_priority for the node is restored to the
original configured value.
In case of failed consensus for standby node failover, no action is taken.
Muhammad Usama [Sun, 26 May 2019 20:46:38 +0000 (01:46 +0500)]
Enhancing the debugging/testing aid for health check.
To cater for continuous health checking of quarantined nodes, the commit extends
the check_backend_down_request() function to check for nodes that are marked
as "already_down" in the backend_down_request file. This should fix the
013.watchdog_failover_require_consensus test case, which started failing
after commit 3dd1cd3f15287ee6bb8b09f0642f99db98e9776a.
Bo Peng [Thu, 23 May 2019 10:46:27 +0000 (19:46 +0900)]
Fix compile error on freebsd.
Add missing include file "netinet/in.h".
per bug519 and bug512.
Tatsuo Ishii [Thu, 23 May 2019 05:36:10 +0000 (14:36 +0900)]
Fix memory leak in pgproto pointed out by Coverity.
Tatsuo Ishii [Wed, 22 May 2019 23:22:19 +0000 (08:22 +0900)]
Fix to not access pg_stat_replication view if PostgreSQL version is not appropriate.
PostgreSQL 9.0 does not have it.
Tatsuo Ishii [Wed, 22 May 2019 22:34:03 +0000 (07:34 +0900)]
Make failover in progress check more aggressive.
In pool_virtual_master_db_node_id(), the case when session context is
not available was not covered by the failover in progress check
because I thought it'd be too aggressive. However, a report from the
field showed that this could happen while authenticating a client (and
it causes a segfault). So I decided to move the check to the beginning
of the function to cover the case.
Tatsuo Ishii [Wed, 22 May 2019 14:12:29 +0000 (23:12 +0900)]
Add backend_application_name to "pgpool show backend" group.
Tatsuo Ishii [Wed, 22 May 2019 14:01:48 +0000 (23:01 +0900)]
Fix backend_application_name not being set while reloading config file.
Tatsuo Ishii [Wed, 22 May 2019 08:01:47 +0000 (17:01 +0900)]
Fix memory leak in outfuncs.c pointed out by Coverity.
Tatsuo Ishii [Wed, 22 May 2019 07:20:51 +0000 (16:20 +0900)]
Fix NULL pointer dereference pointed out by Coverity.
Tatsuo Ishii [Wed, 22 May 2019 06:15:37 +0000 (15:15 +0900)]
Fix memory leak pointed out by coverity.
Tatsuo Ishii [Wed, 22 May 2019 05:22:14 +0000 (14:22 +0900)]
Fix Coverity complaint.
Tatsuo Ishii [Wed, 22 May 2019 01:20:32 +0000 (10:20 +0900)]
Doc: fix mistake in the previous commit.
Follow master command's %P is "old primary node id" and should not
have been changed.
Tatsuo Ishii [Wed, 22 May 2019 00:48:29 +0000 (09:48 +0900)]
Doc: fix mistakenly described %P of failback command and follow master command.
These should have been "current primary node id", rather than "old
primary node id".