Hadoop subproject ZooKeeper 3.3.4 released

fmms, 13 years ago
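     <p>ZooKeeper is an official Hadoop subproject. It is a reliable coordination system for large distributed systems, providing configuration maintenance, naming, distributed synchronization, and group services. ZooKeeper's goal is to encapsulate complex, error-prone key services and offer them to users through simple, easy-to-use interfaces on an efficient and stable system.<br /> <a href="/misc/goto?guid=4958192107847702038"><img alt="ZooKeeper 3.3.4 released" src="https://simg.open-open.com/show/d9bab28a4c9f4daca45847c89e81ac63.gif" width="79" height="112" /></a></p>    <p>ZooKeeper is an open-source implementation of Google's Chubby: a highly available and reliable coordination system. It can be used for leader election, configuration maintenance, and similar tasks. In a distributed environment we often need a single Master instance, a place to store configuration, or a guarantee that writes are consistent. For watches, ZooKeeper guarantees the following three points:</p>    <ul>     <li>Watches are ordered with respect to other events, other watches, and asynchronous replies. The ZooKeeper client libraries ensure that everything is dispatched in order.</li>     <li>A client will see a watch event for a znode it is watching before seeing the new data that corresponds to that znode.</li>     <li>The order of watch events from ZooKeeper corresponds to the order of the updates as seen by the ZooKeeper service.</li>    </ul>    <p> </p>    <p>In ZooKeeper, a znode is a node addressed by a path similar to a Unix file system path; data can be stored in it and read from it. If a znode is created with the EPHEMERAL flag, it is removed from ZooKeeper once the session of the client that created it ends, for example after the client loses its connection and the session expires. ZooKeeper uses Watchers to deliver event notifications. When a client receives an event, such as a connection timeout, a change to a node's data, or a change to a node's children, it can invoke the corresponding handler. The ZooKeeper wiki shows how to use ZooKeeper to implement event notification, queues, priority queues, locks, shared locks, revocable shared locks, and two-phase commit.</p>    <p>So what can ZooKeeper do for us? A simple example: suppose we have 20 search servers (each responsible for searching one part of the total index), a master server (which sends search requests to the 20 search servers and merges the result sets), a standby master (which takes over when the master goes down), and a web CGI (which sends search requests to the master). Of the 20 search servers, 15 are currently serving searches and 5 are building indexes. Servers switch roles frequently: one that is serving stops to rebuild its index, and one that has finished building its index starts serving again. With ZooKeeper, the master can automatically track which search servers are currently serving and send requests only to them, the standby master is started automatically when the master goes down, and the web CGI learns automatically when the master's network address changes. How is this done?</p>    <ol>     <li>Each serving search server creates a znode in ZooKeeper: zk.create("/search/nodes/node1", "hostname".getBytes(), Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);</li>     <li>The master fetches the list of children of a znode from ZooKeeper: zk.getChildren("/search/nodes", true);</li>     <li>The master iterates over these children and reads each child's data to build the list of serving search servers.</li>     <li>When the master receives a children-changed event, it goes back to step 2.</li>     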
<li>The master creates a node in ZooKeeper: zk.create("/search/master", "hostname".getBytes(), Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);</li>     <li>The standby master watches the "/search/master" node in ZooKeeper. When this znode changes, for example because the master's ephemeral node has been removed, it promotes itself to master and writes its own network address into the node.</li>     <li>The web CGI reads the master's network address from the "/search/master" node in ZooKeeper and sends search requests to it.</li>     <li>The web CGI watches the "/search/master" node in ZooKeeper. When this znode's data changes, it reads the master's network address from the node again and updates the address it uses.</li>    </ol>    <p><img alt="ZooKeeper 3.3.4 released" src="https://simg.open-open.com/show/762a6dbb01904fdbbe7c01e4b680206b.png" width="600" height="185" /></p>    <p><br style="font-weight:bold;" /> <span style="font-weight:bold;">Project home:</span> <a style="font-weight:bold;" href="/misc/goto?guid=4958192107847702038" target="_blank">http://zookeeper.apache.org/</a><br /> <br /> Main changes in Apache ZooKeeper 3.3.4:</p>    <table class="ForrestTable ke-zeroborder" cellspacing="1" cellpadding="0">     <tbody>      <tr>       <td><strong>Bug</strong></td>       <td> </td>      </tr>      <tr>       <td>[<a href="/misc/goto?guid=4958202060486089155">ZOOKEEPER-961</a>]</td>       <td>Watch recovery after disconnection when connection string contains a prefix</td>      </tr>      <tr>       <td>[<a href="/misc/goto?guid=4958202061231047034">ZOOKEEPER-1006</a>]</td>       <td>QuorumPeer "Address already in use" -- regression in 3.3.3</td>      </tr>      <tr>       <td>[<a href="/misc/goto?guid=4958202061964790775">ZOOKEEPER-1046</a>]</td>       <td>Creating a new sequential node results in a ZNODEEXISTS error</td>      </tr>      <tr>       <td>[<a href="/misc/goto?guid=4958202062706566004">ZOOKEEPER-1049</a>]</td>       <td>Session expire/close flooding renders heartbeats to delay significantly</td>      </tr>      <tr>       <td>[<a href="/misc/goto?guid=4958202063457311928">ZOOKEEPER-1069</a>]</td>       <td>Calling shutdown() on a QuorumPeer too quickly can lead to a corrupt log</td>      </tr>      <tr>       <td>[<a href="/misc/goto?guid=4958202064188439528">ZOOKEEPER-1087</a>]</td>       
<td>ForceSync VM argument not working when set to "no"</td>      </tr>      <tr>       <td>[<a href="/misc/goto?guid=4958202064945513392">ZOOKEEPER-1097</a>]</td>       <td>Quota is not correctly rehydrated on snapshot reload</td>      </tr>      <tr>       <td>[<a href="/misc/goto?guid=4958202065690996499">ZOOKEEPER-1117</a>]</td>       <td>zookeeper 3.3.3 fails to build with gcc &gt;= 4.6.1 on Debian/Ubuntu</td>      </tr>      <tr>       <td>[<a href="/misc/goto?guid=4958202066435018848">ZOOKEEPER-1154</a>]</td>       <td>Data inconsistency when the node(s) with the highest zxid is not present at the time of leader election</td>      </tr>      <tr>       <td>[<a href="/misc/goto?guid=4958202067175139272">ZOOKEEPER-1156</a>]</td>       <td>Log truncation truncating log too much - can cause data loss</td>      </tr>      <tr>       <td>[<a href="/misc/goto?guid=4958202067920481466">ZOOKEEPER-1174</a>]</td>       <td>FD leak when network unreachable</td>      </tr>      <tr>       <td>[<a href="/misc/goto?guid=4958202068664366536">ZOOKEEPER-1189</a>]</td>       <td>For an invalid snapshot file (less than 10 bytes in size) RandomAccessFile stream is leaking.</td>      </tr>      <tr>       <td>[<a href="/misc/goto?guid=4958202069407292310">ZOOKEEPER-1203</a>]</td>       <td>Zookeeper systest is missing Junit Classes</td>      </tr>      <tr>       <td>[<a href="/misc/goto?guid=4958202070154976839">ZOOKEEPER-1206</a>]</td>       <td>Sequential node creation does not always use digits in node name given certain Locales.</td>      </tr>      <tr>       <td>[<a href="/misc/goto?guid=4958202070906107497">ZOOKEEPER-1208</a>]</td>       <td>Ephemeral node not removed after the client session is long gone</td>      </tr>      <tr>       <td>[<a href="/misc/goto?guid=4958202071641994494">ZOOKEEPER-1212</a>]</td>       <td>zkServer.sh stop action is not conformant with LSB para 20.2 Init Script Actions</td>      </tr>      <tr>       <td>[<a 
href="/misc/goto?guid=4958202072376568411">ZOOKEEPER-1264</a>]</td>       <td>FollowerResyncConcurrencyTest failing intermittently</td>      </tr>      <tr>       <td>[<a href="/misc/goto?guid=4958202073126778736">ZOOKEEPER-1271</a>]</td>       <td>testEarlyLeaderAbandonment failing on solaris - clients not retrying connection</td>      </tr>      <tr>       <td>[<a href="/misc/goto?guid=4958202073862567161">ZOOKEEPER-1283</a>]</td>       <td>building 3.3 branch fails with Ant 1.8.2 (success with 1.7.1 though)</td>      </tr>      <tr>       <td><strong>Improvement</strong></td>       <td> </td>      </tr>      <tr>       <td>[<a href="/misc/goto?guid=4958202074607987465">ZOOKEEPER-1103</a>]</td>       <td>In QuorumTest, use the same "for ( .. try { break } catch { } )" pattern in testFollowersStartAfterLeaders as in testSessionMove.</td>      </tr>      <tr>       <td>[<a href="/misc/goto?guid=4958202075345106883">ZOOKEEPER-1239</a>]</td>       <td>add logging/stats to identify fsync stalls</td>      </tr>      <tr>       <td>[<a href="/misc/goto?guid=4958202076101464787">ZOOKEEPER-1301</a>]</td>       <td>backport patches related to the zk startup script from 3.4 to 3.3 release</td>      </tr>     </tbody>    </table>
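<p>Steps 1 to 4 of the search-cluster example earlier in this post can be modelled with a small in-memory sketch. It is not the ZooKeeper client API: ToyCoordinator, MasterView, SearchClusterSketch, and the session ids are hypothetical names invented for illustration. The sketch only imitates two behaviours the example relies on, under the assumption that ephemeral nodes vanish when their owning session ends and that a child watch fires once and has to be registered again.</p>

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Consumer;

// Toy stand-in for a coordination service (hypothetical, NOT the ZooKeeper API).
class ToyCoordinator {
    private final TreeMap<String, String> owners = new TreeMap<>(); // path -> session id
    private final Map<String, byte[]> data = new HashMap<>();       // path -> payload
    private final Map<String, List<Consumer<String>>> childWatches = new HashMap<>();

    // Step 1: a search server registers itself under an ephemeral node.
    void createEphemeral(String session, String parent, String name, byte[] value) {
        String path = parent + "/" + name;
        if (owners.containsKey(path)) {
            throw new IllegalStateException("node exists: " + path);
        }
        owners.put(path, session);
        data.put(path, value);
        fireChildWatches(parent);
    }

    // Step 2: list direct children and optionally leave a one-shot watch.
    List<String> getChildren(String parent, Consumer<String> watch) {
        if (watch != null) {
            childWatches.computeIfAbsent(parent, k -> new ArrayList<>()).add(watch);
        }
        String prefix = parent + "/";
        List<String> children = new ArrayList<>();
        for (String path : owners.keySet()) {
            if (path.startsWith(prefix) && path.indexOf('/', prefix.length()) < 0) {
                children.add(path.substring(prefix.length()));
            }
        }
        return children;
    }

    byte[] getData(String path) {
        return data.get(path);
    }

    // Ending a session removes every ephemeral node that session created.
    void closeSession(String session) {
        List<String> owned = new ArrayList<>();
        for (Map.Entry<String, String> e : owners.entrySet()) {
            if (e.getValue().equals(session)) {
                owned.add(e.getKey());
            }
        }
        for (String path : owned) {
            owners.remove(path);
            data.remove(path);
            fireChildWatches(path.substring(0, path.lastIndexOf('/')));
        }
    }

    private void fireChildWatches(String parent) {
        // Watches are one-shot: remove them before delivering the event.
        List<Consumer<String>> watches = childWatches.remove(parent);
        if (watches != null) {
            for (Consumer<String> w : watches) {
                w.accept(parent);
            }
        }
    }
}

// Steps 2 to 4: the master keeps a host list and rebuilds it on every child event.
class MasterView {
    private final ToyCoordinator zk;
    List<String> hosts = new ArrayList<>();

    MasterView(ToyCoordinator zk) {
        this.zk = zk;
        refresh("/search/nodes");
    }

    private void refresh(String parent) {
        List<String> fresh = new ArrayList<>();
        // Reading the children also re-registers the watch for the next change.
        for (String child : zk.getChildren(parent, this::refresh)) {
            fresh.add(new String(zk.getData(parent + "/" + child), StandardCharsets.UTF_8));
        }
        hosts = fresh;
    }
}

class SearchClusterSketch {
    public static void main(String[] args) {
        ToyCoordinator zk = new ToyCoordinator();
        MasterView master = new MasterView(zk);
        zk.createEphemeral("s1", "/search/nodes", "node1", "host1".getBytes(StandardCharsets.UTF_8));
        zk.createEphemeral("s2", "/search/nodes", "node2", "host2".getBytes(StandardCharsets.UTF_8));
        System.out.println(master.hosts); // both servers are serving
        zk.closeSession("s1");            // node1 leaves to rebuild its index
        System.out.println(master.hosts); // only the remaining server is listed
    }
}
```

<p>With a real ZooKeeper ensemble the same loop would be driven by the client's Watcher callbacks rather than by direct method calls, and session expiry would be decided by the server instead of an explicit closeSession call.</p>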