redis源码阅读之hyperloglog-stolennnxb-ChinaUnix博客

本来按照大神所说http://blog.huangz.me/diary/2014/how-to-read-redis-source-code.html以为可以可以开开心心的直接去读具体数据结构的实现了，结果发现还是漏掉了这么个东西，哎，慢慢啃~~
先来段注释热热身，这段注释说的很清楚，hyperloglog是redis用来实现基值估算的。

点击(此处)折叠或打开

/* hyperloglog.c - Redis HyperLogLog probabilistic cardinality approximation.

这边理解就要分两头了

概念	解释
基值	学过数学的朋友们都应该对这个概念不陌生吧，就是一个集合当中包含的不同元素的个数
估算	估算就比较简单了，就是给出的值没有那么精确。

为什么要用估算呢？就是为了防止在数据量过大，但统计时只关心量，而不要求记录所有元素的所有数据的时候用的。比如算个什么日活跃用户这种，这时候要算的数据具有如下两个特点：1）去重；2）不需要记录具体的用户信息。这个时候就是我们hyperloglog登场的时候咯~~
hyperloglog在使用内存方面是非常节省的，它对每个key仅花费12KB(16384 个 6-bit registers)来存储，当一个元素到来时，它先是进行哈希，然后将其分配到16384个register当中的一个。
至于怎么估算，这里就要用到register当中的数据了。首先，register当中存放的是分配到当前register的值当中，二进制表示，从最后一位向前数，连续0的个数的最大值的对数（以2为底）。这里的数学理论依据是：一堆数，若其中二进制表示，从最后一位向前数，连续0的个数的最大值为k，那么这一堆数的基大概为2^k个。但是随着数据量的不断增大，2^k和2^k+1的差距也会越来越大，redis就巧妙的采取了将各个桶中的基进行调和求平均的方法来得到估算值，这期间还会用到修正因子神马的，这里就不细究了。

这里还要再说一下hyperloglog的具体存储方式，当数据量较大的时候，恒定每个key占用12KB，这种方式叫密集存储方式，其内存排列如下：

点击(此处)折叠或打开

+--------+--------+--------+------// //--+
|11000000|22221111|33333322|55444444 .... |
+--------+--------+--------+------// //--+

这里需要注意的是当取某一个register的内容时，由于其实6bit对齐，但是一个字节是8bit这里需要考虑数据位的提取问题。

但是如果数据量比较小的时候，仍旧采用这种方式就会造成一定的浪费，redis的解决方法就是采取最多对两个字节进行编码的稀疏存储的方式来表示所存数据，具体如下, 下表中的数据表示为二进制：

编码方式	解释（current register included）
ZERO: 00xxxxxx（单字节）	连续的N=xxxxxx+1个register为空，最多表示64个空register
XZERO: 01xxxxxx-yyyyyyyy（双字节）	连续的N=00xxxxxx-yyyyyyyy + 1个register均为空，最多表示16384个register为空，即hyperloglog的初始状态
VAL: 1vvvvvxx（单字节）	连续M=xx+1个register的值为vvvvv+1

当使用稀疏表示，并且表中只有3个非0的register，其位置分别为1000,1020,1021的时候，其内存表示为

点击(此处)折叠或打开

01000011 11100111 10001000 00011000 10001001 01111100 00000001

其中前两个字节表示有1000个空register，
第三个字节表示第1001（下标为1000）个register的值为2
第四字节表示后面的19个register为空
第五个字节表示后面连续两个register的值为3
最后两个字节表示后面的register均为空

这样的话，稀疏表示方法的局限也就露出来了，也就是单个register的最大保存值为32，当某一个register的值超过32的时候，会促使redis将其存储结构变为密集型存储。

由于有16384个register，而且基数统计要遍历所有的register，如果每来一个元素就整个遍历一下所有的register，就会显得很不划算，redis是设置了一个64位缓存来缓和这一现象，如果缓存最高位为1，则表示信息已经过期，如果为0，则余下的63位表示调和之后的值，该值只会在请求基数的时候才会重新计算，平时每次在更新register的时候都会将这一缓存的最高位置为1以表示过期。

redis当中使用了hllhdr来存储hyperloglog的相关信息，以便于操作，代码如下：

点击(此处)折叠或打开

struct hllhdr {
char magic[4]; /* "HYLL" */
uint8_t encoding; /* HLL_DENSE or HLL_SPARSE. */
uint8_t notused[3]; /* Reserved for future use, must be zero. */
uint8_t card[8]; /* Cached cardinality, little endian. */
uint8_t registers[]; /* Data bytes. */
};

在函数当中，比较核心的就是计算count和转换这边了，计算count代码如下：

点击(此处)折叠或打开

int hllPatLen(unsigned char *ele, size_t elesize, long *regp) {
uint64_t hash, bit, index;
int count;
/* Count the number of zeroes starting from bit HLL_REGISTERS
* (that is a power of two corresponding to the first bit we don't use
* as index). The max run can be 64-P+1 bits.
*
* Note that the final "1" ending the sequence of zeroes must be
* included in the count, so if we find "001" the count is 3, and
* the smallest count possible is no zeroes at all, just a 1 bit
* at the first position, that is a count of 1.
*
* This may sound like inefficient, but actually in the average case
* there are high probabilities to find a 1 after a few iterations. */
hash = MurmurHash64A(ele,elesize,0xadc83b19ULL);
index = hash & HLL_P_MASK; /* Register index. */
hash |= ((uint64_t)1<<63); /* Make sure the loop terminates. */
bit = HLL_REGISTERS; /* First bit not used to address the register. */
count = 1; /* Initialized to 1 since we count the "00000...1" pattern. */
while((hash & bit) == 0) {
count++;
bit <<= 1;
}
*regp = (int) index;
return count;
}

稀疏型存储转换为密集型存储代码如下，逐个遍历字节：

点击(此处)折叠或打开

int hllSparseToDense(robj *o) {
sds sparse = o->ptr, dense;
struct hllhdr *hdr, *oldhdr = (struct hllhdr*)sparse;
int idx = 0, runlen, regval;
uint8_t *p = (uint8_t*)sparse, *end = p+sdslen(sparse);
/* If the representation is already the right one return ASAP. */
hdr = (struct hllhdr*) sparse;
if (hdr->encoding == HLL_DENSE) return C_OK;
/* Create a string of the right size filled with zero bytes.
* Note that the cached cardinality is set to 0 as a side effect
* that is exactly the cardinality of an empty HLL. */
dense = sdsnewlen(NULL,HLL_DENSE_SIZE);
hdr = (struct hllhdr*) dense;
*hdr = *oldhdr; /* This will copy the magic and cached cardinality. */
hdr->encoding = HLL_DENSE;
/* Now read the sparse representation and set non-zero registers
* accordingly. */
p += HLL_HDR_SIZE;
while(p < end) {
if (HLL_SPARSE_IS_ZERO(p)) {
runlen = HLL_SPARSE_ZERO_LEN(p);
idx += runlen;
p++;
} else if (HLL_SPARSE_IS_XZERO(p)) {
runlen = HLL_SPARSE_XZERO_LEN(p);
idx += runlen;
p += 2;
} else {
runlen = HLL_SPARSE_VAL_LEN(p);
regval = HLL_SPARSE_VAL_VALUE(p);
while(runlen--) {
HLL_DENSE_SET_REGISTER(hdr->registers,idx,regval);
idx++;
}
p++;
}
}
/* If the sparse representation was valid, we expect to find idx
* set to HLL_REGISTERS. */
if (idx != HLL_REGISTERS) {
sdsfree(dense);
return C_ERR;
}
/* Free the old representation and set the new one. */
sdsfree(o->ptr);
o->ptr = dense;
return C_OK;
}

不足之处，还请大家多提宝贵意见~~~