| 44bf2533 | 02-Aug-2016 |
antirez <[email protected]> |
Redis 3.2.3. |
| e67ad1d1 | 01-Aug-2016 |
Qu Chen <[email protected]> |
Fix a bug to delay bgsave while AOF rewrite in progress for replication |
| 7c6e288d | 28-Jul-2016 |
antirez <[email protected]> |
Redis 3.2.2. |
| 0a45fbc3 | 27-Jul-2016 |
antirez <[email protected]> |
Ability of slave to announce arbitrary ip/port to master.
This feature is especially useful in deployments that use Sentinel to set up Redis HA, where the slave runs behind NAT or port forwarding, so that the auto-detected IP/port, as listed in the "INFO replication" output of the master or as provided by the "ROLE" command, does not match the real address at which the slave is reachable for connections.
|
| c3982c09 | 12-Jul-2016 |
antirez <[email protected]> |
redis-benchmark: new option to show server errors on stdout.
Disabled by default; it can be activated with -e. Enabling it by default might have been safer, but that would depart from past behavior. |
| fdafe233 | 27-Jul-2016 |
antirez <[email protected]> |
Multiple GEORADIUS bugs fixed.
Grepping the continuous integration error log revealed a number of GEORADIUS test failures.
Fortunately when a GEORADIUS failure happens, the test suite logs enough information in order to reproduce the problem: the PRNG seed, coordinates and radius of the query.
By reproducing the issues, three different bugs were discovered and fixed in this commit. This commit also improves the already good reporting of the fuzzer and adds the failure vectors as regression tests.
The issues found:
1. We need larger squares around the poles in order to cover the area requested by the user. There were already checks in order to use a smaller step (larger squares) but the limit set (+/- 67 degrees) is not enough in certain edge cases, so 66 is used now.
2. Even near the equator, when the search area center is very near the edge of the square, the north, south, east or west square may not be able to fully cover the specified radius. Now a test is performed at the edge of the initially guessed search area, and larger squares are used in case the test fails.
3. Because of rounding errors between Redis and Tcl, sometimes the test signaled false positives. This is now addressed.
Whenever possible the original code was improved a bit in other ways. A debugging example stanza was added in order to make the next debugging session simpler when the next bug is found.
|
| a1bfe22a | 22-Jul-2016 |
antirez <[email protected]> |
Replication: when possible start RDB saving ASAP.
In a previous commit the replication code was changed in order to centralize the BGSAVE for replication trigger in replicationCron(). However, after further testing, the 1 second delay imposed by this change is not acceptable.
So now the BGSAVE is only delayed if the AOF rewriting process is active. However, past commits made sure that replicationCron() is always able to trigger the BGSAVE when needed, making the code generally more robust.
The new code is more similar to the initial @oranagra patch where the BGSAVE was delayed only if an AOF rewrite was in progress.
Trivia: delaying the BGSAVE uncovered a minor Sentinel issue that is now fixed.
|
| 5b5e6520 | 22-Jul-2016 |
antirez <[email protected]> |
Sentinel: check Slave INFO state more often when disconnected.
During the initial handshake with the master a slave will report to have a very high disconnection time from its master (since technically it was disconnected since forever, so the current UNIX time in seconds is reported).
However when the slave is connected again, the Sentinel may re-scan the INFO output only after 10 seconds, which is a long time. During this time Sentinels will consider this instance unable to failover, so a useless delay is introduced.
Actually this hardly ever happened in practice, because when a slave's master is down the INFO period for slaves changes to 1 second. However when a manual failover is attempted immediately after adding slaves (as in the Sentinel unit test), this problem may happen.
This commit changes the INFO period to 1 second even in the case the slave's master is not down, but the slave reported to be disconnected from the master (by publishing, last time we checked, a master disconnection time field in INFO).
This change is required as a result of an unrelated change in the replication code that adds a small delay in the master-slave first synchronization.
|
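The polling rule described in commit 5b5e6520 can be sketched roughly as follows. This is a minimal illustration, not the Sentinel source: the function name `info_period_ms`, its parameters and the two macro names are hypothetical; only the 1 second and 10 second periods come from the commit message.

```c
/* Hypothetical constants: only the 1 s and 10 s values come from the
 * commit message above. */
#define INFO_PERIOD_NORMAL_MS 10000
#define INFO_PERIOD_FAST_MS    1000

/* Return how often (in milliseconds) a slave should be polled with INFO.
 * master_down: non-zero if the slave's master is considered down.
 * master_link_down_time: disconnection time last reported by the slave,
 * with 0 meaning that the slave reported a working link to its master. */
long long info_period_ms(int master_down, long long master_link_down_time) {
    if (master_down || master_link_down_time != 0)
        return INFO_PERIOD_FAST_MS;
    return INFO_PERIOD_NORMAL_MS;
}
```

With this rule a slave that still reports itself as disconnected is re-checked after about one second instead of ten, which is the delay the commit message aims to remove.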
| 21cffc26 | 21-Jul-2016 |
antirez <[email protected]> |
Avoid simultaneous RDB and AOF child process.
This patch, written in collaboration with Oran Agra (@oranagra), is a companion to 780a8b1. Together the two patches should prevent the AOF and RDB saving processes from being spawned at the same time. Previously, the conditions that could lead to two saving processes at the same time were:
1. When AOF is enabled via CONFIG SET and an RDB saving process is already active.
2. When the SYNC command decides to start an RDB saving process ASAP in order to serve a new slave that cannot partially resynchronize (but only if we have a disk target for replication; for diskless replication there is no such problem).
Condition "1" is not very severe but "2" can happen often and is definitely good at degrading Redis performance in an unexpected way.
The two commits have the effect of always spawning RDB saves for replication in replicationCron() instead of attempting to start an RDB save synchronously. Moreover when a BGSAVE or AOF rewrite must be performed while the other process is active, it is instead just postponed using flags that will try to perform the operation ASAP.
Finally the BGSAVE command was modified in order to accept a SCHEDULE option so that if an AOF rewrite is in progress, when this option is given, the command no longer returns an error, but instead schedules an RDB save for when it will be possible to start it.
|
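The "postpone with flags" idea from commit 21cffc26 can be illustrated with a small state machine. This is only a sketch under stated assumptions: `save_state`, `request_rdb_save`, `request_aof_rewrite` and `cron_tick` are hypothetical names, no child process is actually forked, and the real Redis code is considerably more involved.

```c
/* Hypothetical, simplified model of the two background saving processes. */
typedef struct save_state {
    int rdb_child_active;   /* An RDB saving child is running. */
    int aof_child_active;   /* An AOF rewrite child is running. */
    int rdb_scheduled;      /* An RDB save was postponed. */
    int aof_scheduled;      /* An AOF rewrite was postponed. */
} save_state;

static int child_active(const save_state *s) {
    return s->rdb_child_active || s->aof_child_active;
}

/* Ask for an RDB save: returns 1 if it starts now, 0 if it was postponed. */
int request_rdb_save(save_state *s) {
    if (child_active(s)) { s->rdb_scheduled = 1; return 0; }
    s->rdb_child_active = 1;
    return 1;
}

/* Ask for an AOF rewrite: returns 1 if it starts now, 0 if it was postponed. */
int request_aof_rewrite(save_state *s) {
    if (child_active(s)) { s->aof_scheduled = 1; return 0; }
    s->aof_child_active = 1;
    return 1;
}

/* Periodic hook: once no child is running, start at most one postponed
 * operation, so the two children never overlap. */
void cron_tick(save_state *s) {
    if (child_active(s)) return;
    if (s->rdb_scheduled) {
        s->rdb_scheduled = 0;
        s->rdb_child_active = 1;
    } else if (s->aof_scheduled) {
        s->aof_scheduled = 0;
        s->aof_child_active = 1;
    }
}
```

In this model a request made while the other child is running behaves like the scheduled variant described above: it is recorded in a flag and served by the next periodic tick after the running child terminates.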
| 017378ec | 21-Jul-2016 |
antirez <[email protected]> |
Replication: start BGSAVE for replication always in replicationCron().
This makes the replication code conceptually simpler by removing the synchronous BGSAVE trigger in syncCommand(). This also means that socket and disk BGSAVE targets are handled by the same code.
|
| 21736b41 | 06-Jul-2016 |
antirez <[email protected]> |
getLongLongFromObject: use string2ll() instead of strict_strtoll().
strict_strtoll() has a bug that reports the empty string as ok and parses it as zero.
Apparently nobody ever replaced this old call with the faster/saner string2ll() which is used otherwise in the rest of the Redis core.
This commit closes #3333.
|
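One way an "empty string parses as zero" bug can arise with strtoll() is sketched below. Neither function is the Redis code: `parse_ll_naive` and `parse_ll_strict` are hypothetical helpers written for illustration, and unlike a hand-written parser both still accept the leading whitespace and "+" sign that strtoll() allows.

```c
#include <errno.h>
#include <stdlib.h>

/* Naive variant: only checks that strtoll() consumed the whole string.
 * For "" nothing is consumed, the end pointer already points at the
 * terminator, and the call reports success with value 0. */
int parse_ll_naive(const char *s, long long *out) {
    char *end;
    long long v = strtoll(s, &end, 10);
    if (*end != '\0') return 0;
    *out = v;
    return 1;
}

/* Stricter variant: returns 1 and stores the value in *out only if the
 * whole, non-empty string is a decimal integer in the long long range. */
int parse_ll_strict(const char *s, long long *out) {
    char *end;
    long long v;

    if (s[0] == '\0') return 0;        /* Reject the empty string. */
    errno = 0;
    v = strtoll(s, &end, 10);
    if (errno == ERANGE) return 0;     /* Value out of range. */
    if (*end != '\0') return 0;        /* Trailing garbage, e.g. "12abc". */
    *out = v;
    return 1;
}
```

The naive helper accepts the empty string and yields 0, while the strict one rejects it, which is the behavioral difference the commit message is concerned with.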
| 0b748e91 | 05-Jul-2016 |
antirez <[email protected]> |
redis-cli: check SELECT reply type just in state updated.
In issues #3361 / #3365 a problem was reported / fixed with redis-cli not correctly updating the current DB on error after SELECT.
In theory this bug was fixed in 0042fb0e, but actually the commit only fixed the prompt updating, not the fact that the state was set incorrectly.
This commit removes the check in the prompt update: now that the state itself is hopefully correct, this check is no longer needed.
|
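The idea of fixing the state rather than the prompt can be sketched as follows. This is a hypothetical model, not the redis-cli source: `reply_kind`, `cli_state` and `on_select_reply` are invented names that only illustrate updating the selected DB when the SELECT reply is not an error.

```c
/* Hypothetical reply classification, for illustration only. */
typedef enum reply_kind { REPLY_OK, REPLY_ERROR } reply_kind;

/* Hypothetical client state: the DB number the prompt is derived from. */
typedef struct cli_state {
    int dbnum;
} cli_state;

/* Update the client state after a SELECT command. The selected DB is
 * changed only when the server did not reply with an error, so a prompt
 * built from st->dbnum stays correct without any extra check. */
void on_select_reply(cli_state *st, reply_kind kind, int requested_db) {
    if (kind != REPLY_ERROR)
        st->dbnum = requested_db;
}
```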
| 1158386b | 01-Jul-2016 |
sskorgal <[email protected]> |
Fix for redis-cli printing default DB when select command fails. |
| 026f9fc7 | 04-Jul-2016 |
antirez <[email protected]> |
Sentinel: fix cross-master Sentinel address update.
This commit both fixes the crash reported with issue #3364 and also properly closes the old links after the Sentinel address for the other masters gets updated.
The two problems were:
1. The Sentinel that switched address may not monitor all the masters, so it is possible that there is no match and the 'match' variable is NULL. Now we check for no match and 'continue' to the next master.
2. While inspecting the code because of issue "1", I noticed a problem in the code that disconnects the link of the Sentinel that needs the address update. Basically link->disconnected is non-zero even if just *a single link* (cc -- command link or pc -- pubsub link) is disconnected, so checking if (link->disconnected) in order to close the links risks leaving one link connected.
I was able to manually reproduce the crash at "1" and verify that the commit resolves the issue.
Close #3364.
|
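Problem "2" in commit 026f9fc7 can be modeled with a tiny example. The struct and both functions below are hypothetical simplifications, not the Sentinel data structures; in particular the flag-based variant is only an assumption about what relying on a combined flag looks like, not a copy of the old code.

```c
#include <stddef.h>

/* Hypothetical, simplified model of the two connections to an instance. */
typedef struct link_model {
    void *cc;           /* Command connection, NULL when closed. */
    void *pc;           /* Pub/Sub connection, NULL when closed. */
    int disconnected;   /* Non-zero if at least one of the two is down. */
} link_model;

/* Close whichever connections are still open, ignoring the combined flag.
 * Returns the number of connections that were closed. */
int close_all_links(link_model *l) {
    int closed = 0;
    if (l->cc != NULL) { l->cc = NULL; closed++; }
    if (l->pc != NULL) { l->pc = NULL; closed++; }
    l->disconnected = 1;
    return closed;
}

/* Flag-based variant: treats the link as already closed whenever the
 * combined flag is set, so a half-open link keeps one connection alive. */
int close_links_using_flag(link_model *l) {
    if (l->disconnected) return 0;
    return close_all_links(l);
}
```

When only the command connection is down, the flag-based variant closes nothing and leaves the Pub/Sub connection open, while checking each connection individually closes it.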
| 11523b3e | 04-Jul-2016 |
antirez <[email protected]> |
CONFIG GET is now no longer case sensitive.
Like CONFIG SET always was. Close #3369. |
| 4c6ff74c | 04-Jul-2016 |
antirez <[email protected]> |
Make tcp-keepalive default to 300 in internal conf.
We already changed the default in the redis.conf template, but I forgot to change the internal config as well. |
| 27dbec2a | 01-Jul-2016 |
antirez <[email protected]> |
In Redis RDB check: more details in error reportings. |
| 41f30047 | 01-Jul-2016 |
antirez <[email protected]> |
In Redis RDB check: log decompression errors. |
| 278fe3e9 | 01-Jul-2016 |
antirez <[email protected]> |
In Redis RDB check: log object type on error. |
| f5110c3c | 01-Jul-2016 |
antirez <[email protected]> |
In Redis RDB check: minor output message changes. |
| 35b18bfb | 01-Jul-2016 |
antirez <[email protected]> |
In Redis RDB check: better error reporting. |
| f578f085 | 30-Jun-2016 |
antirez <[email protected]> |
In Redis RDB check: initial POC.
So far we used an external program (later executed within Redis) and parser in order to check RDB files for correctness. This forces us, at each RDB format update, to maintain two copies of the same format implementation that are hard to keep in sync. Moreover the former RDB checker only checked the very high-level format of the file, without actually trying to load things in memory. Certain corruptions can only be handled by really loading key-value pairs.
This first commit attempts to unify the Redis RDB loading code with the task of checking the RDB file for correctness. More work is needed but it looks like a sound direction so far.
|
| 7f1e1cae | 23-Jun-2016 |
tielei <[email protected]> |
A string with 21 chars is not representable as a 64-bit integer. |
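The claim in commit 7f1e1cae can be checked with a few lines of C, assuming `long long` is 64 bits wide as on common platforms. The helper names `ll_decimal_len` and `may_be_int64` are hypothetical and exist only for this illustration.

```c
#include <limits.h>
#include <stddef.h>
#include <stdio.h>

/* Number of characters needed to print v in decimal, sign included. */
int ll_decimal_len(long long v) {
    char buf[32];
    return snprintf(buf, sizeof(buf), "%lld", v);
}

/* Cheap pre-check on the string length alone: a decimal 64-bit integer
 * needs between 1 and 20 characters, so longer strings can be rejected
 * before any parsing is attempted. */
int may_be_int64(size_t len) {
    return len >= 1 && len <= 20;
}
```

The longest value, LLONG_MIN, prints as -9223372036854775808, which is 19 digits plus the sign, so a 21 character string can never be a 64-bit integer.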
| 70419679 | 27-Jun-2016 |
antirez <[email protected]> |
Fix quicklistReplaceAtIndex() by updating the quicklist ziplist size.
The quicklist keeps a cached version of the ziplist representation size in bytes. The implementation must update this length every time the underlying ziplist changes. However quicklistReplaceAtIndex() failed to update the length.
During LSET calls, the size of the ziplist blob and the cached size inside the quicklist diverged. Later, when this size is used in an authoritative way, for example during node splitting in order to copy the nodes, we end up with a duplicated node that may contain random garbage.
This commit should fix issue #3343. However, several problems that should be addressed sooner or later were found while reviewing the quicklist.c code in search of this bug.
For example:
1. Keeping a cached ziplist length is fragile, since failing to update it leads to this kind of issue.
2. The node splitting code needs auditing. For example it works only as a side effect of ziplistDeleteRange() being able to cope with a wrong count of elements to remove. The code inside quicklist.c assumes that -1 means "delete till the end" while actually it's just a count of how many elements to delete, and is an unsigned count. So -1 gets converted into the maximum integer, and just by chance the ziplist code stops deleting elements after there are no more to delete.
3. Node splitting is extremely inefficient: it copies the node and removes elements from both nodes even when only a single entry has to be moved from one node to the other, or when the new resulting node is empty, so that there is nothing to copy and a new node just needs to be created.
However, at least for Redis 3.2, introducing fresh code inside quicklist.c may be even more risky, so instead I'm writing a better fuzzy tester to stress the internals a bit more in order to anticipate other possible bugs.
This bug was found using a fuzzy tester written after having some clue about where the bug could be. The tester eventually created a ~2000 command sequence able to always crash Redis. I wrote a better version of the tester that automatically searched for the smallest sequence that could crash Redis. Later this smaller sequence was minimized by removing random commands as long as it still crashed the server. This resulted in a sequence of 7 commands. With this small sequence it was just a matter of filling the code with enough printf() calls to understand enough state to fix the bug.
|
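The "-1 means delete till the end" accident noted in commit 70419679 comes down to signed-to-unsigned conversion. The sketch below uses a hypothetical `delete_range_count` function as a stand-in for a range-delete API with an unsigned count; it is not the ziplist code and only mimics the clamping behavior the commit message describes.

```c
#include <limits.h>

/* Hypothetical stand-in for a range-delete API that takes an unsigned
 * count: returns how many of the `len` elements would be removed when
 * asked to delete `num` elements starting at position `index`. */
unsigned int delete_range_count(unsigned int len, unsigned int index,
                                unsigned int num) {
    unsigned int available;

    if (index >= len) return 0;
    available = len - index;
    /* Clamp the request to the elements that actually exist. A caller
     * passing -1 really passes UINT_MAX, and only this clamp turns the
     * call into "delete till the end". */
    return num < available ? num : available;
}
```

Passing -1 works here only because the count is clamped. An API that trusted the count would instead try to remove UINT_MAX elements, which is why the commit message calls the current behavior a side effect rather than a contract.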
| 04c7261f | 17-Jun-2016 |
antirez <[email protected]> |
Redis 3.2.1. |