<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Star</title>
  
  <link href="/atom.xml" rel="self"/>
  
  <link href="http:starllap.space/"/>
  <updated>2019-09-18T06:02:28.642Z</updated>
  <id>http:starllap.space/</id>
  
  <author>
    <name>Star</name>
    
  </author>
  
  <generator uri="http://hexo.io/">Hexo</generator>
  
  <entry>
    <title>Two Years After Leaving This Place</title>
    <link href="http:starllap.space/2019/09/17/%E7%A6%BB%E5%BC%80%E8%BF%99%E9%87%8C%E4%B8%A4%E5%B9%B4%E5%90%8E-1/"/>
    <id>http:starllap.space/2019/09/17/离开这里两年后-1/</id>
    <published>2019-09-18T05:54:16.000Z</published>
    <updated>2019-09-18T06:02:28.642Z</updated>
    
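    <content type="html"><![CDATA[<script src="/assets/js/DPlayer.min.js"> </script><p>It has been two years, and I feel too far removed from the age when I was most imaginative and most full of passion. These days I keep thinking about how I turned into an increasingly mediocre person.</p>
<p>I can't even find a label for myself.</p>
<p>Here, at least, I can still write a little and look for what I truly want.</p>
<p>Today I received an email from someone asking for a referral, with no résumé attached. For the first time, I was not annoyed. I suddenly remembered how reckless and lost I was when I was looking for a job, the sleepless nights, the moments when I could not stop crying, and the me who could not face failure yet was full of resilience. At least back then I was hopeful, quick-witted, and kind.</p>
<p>I don't know what work has changed in me: whether it has made me harsh or impatient, or drained the life out of me. In the end I fell into society's trap.</p>
<p>After much thought: what I'm good at may not be good enough, but I do still love writing, recording, and thinking. I won't get anywhere in particular, I have no "small goals", and I may never publish a book in this life. But I want to keep writing properly and push myself a little harder.</p>
<p>In the past few years, perhaps because I had worked too hard before, I kept thinking about how to enjoy life once I started working. But I'm probably not cut out for enjoyment. I still miss the feeling of striving so hard that I forgot who I was, and I miss the me who was full of ideas.</p>
<p>I don't know whether a startup would suit me better. If I can make a living and keep myself fed and clothed, why should I choose mediocrity for the sake of so-called privilege?</p>
<p>I don't even know when I'll write the next post. I don't know whether I'll work hard tomorrow.</p>
<p>But I'm recording it here: at least in this moment, I still wondered whether I could be something other than ordinary.</p>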
]]></content>
    
    <summary type="html">
    
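      &lt;script src=&quot;/assets/js/DPlayer.min.js&quot;&gt; &lt;/script&gt;&lt;p&gt;It has been two years, and I feel too far removed from the age when I was most imaginative and most full of passion. These days I keep thinking about how I turned into an increasingly mediocre person.&lt;/p&gt;
&lt;p&gt;I can't even find a label for myself.&lt;/p&gt;
&lt;p&gt;Here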
    
    </summary>
    
      <category term="真实的生命" scheme="http:starllap.space/categories/%E7%9C%9F%E5%AE%9E%E7%9A%84%E7%94%9F%E5%91%BD/"/>
    
    
      <category term="真实的生命" scheme="http:starllap.space/tags/%E7%9C%9F%E5%AE%9E%E7%9A%84%E7%94%9F%E5%91%BD/"/>
    
  </entry>
  
  <entry>
    <title>Computer Networking Fundamentals: A First Pass at the Key Points</title>
    <link href="http:starllap.space/2017/06/07/%E8%AE%A1%E7%AE%97%E6%9C%BA%E7%BD%91%E7%BB%9C%E5%8E%9F%E7%90%86%E7%9F%A5%E8%AF%86%E7%82%B9%E5%88%9D%E6%95%B4%E7%90%86/"/>
    <id>http:starllap.space/2017/06/07/计算机网络原理知识点初整理/</id>
    <published>2017-06-07T17:57:55.000Z</published>
    <updated>2017-06-22T15:49:47.000Z</updated>
    
    <content type="html"><![CDATA[<script src="/assets/js/DPlayer.min.js"> </script><p>Work in progress...</p>
<h2 id="http">HTTP</h2>
<h3 id="http报文">HTTP Messages</h3>
<p>A request consists of a request line, header lines, and a message body:
<figure class="highlight plain"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div><div class="line">4</div></pre></td><td class="code"><pre><div class="line">&lt;method&gt; &lt;request-URL&gt; &lt;version&gt;</div><div class="line">&lt;headers&gt;</div><div class="line"></div><div class="line">&lt;entity-body&gt;</div></pre></td></tr></table></figure></p>
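<h3 id="http基本方法">Basic HTTP Methods</h3>
<p>GET, POST, PUT, and DELETE.
URL stands for Uniform Resource Locator: a URL identifies and locates a resource on the network.</p>
<h4 id="get">GET:</h4>
<p>Used to retrieve information. It should be both safe and idempotent.</p>
<p>Safe means the operation retrieves information rather than modifying it. In other words, a GET request should generally have no side effects: like a database query, it only reads resource data and does not modify or add data or change the state of the resource.</p>
<p>Idempotent means that sending the same request several times has the same effect on the server as sending it once. In practice, repeated requests for the same URL should return the same result as long as the resource itself has not changed.</p>
<p>Example GET request: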
<figure class="highlight plain"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div><div class="line">4</div><div class="line">5</div></pre></td><td class="code"><pre><div class="line">GET /books/?sex=man&amp;name=Professional HTTP/1.1</div><div class="line"> Host: www.example.com</div><div class="line"> User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.6)</div><div class="line"> Gecko/20050225 Firefox/1.0.1</div><div class="line"> Connection: Keep-Alive</div></pre></td></tr></table></figure></p>
<h4 id="post">POST:</h4>
<p>Example request:</p>
<p><figure class="highlight plain"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div><div class="line">4</div><div class="line">5</div><div class="line">6</div><div class="line">7</div><div class="line">8</div><div class="line">9</div></pre></td><td class="code"><pre><div class="line">POST / HTTP/1.1</div><div class="line">Host: www.example.com</div><div class="line">User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.6)</div><div class="line">Gecko/20050225 Firefox/1.0.1</div><div class="line">Content-Type: application/x-www-form-urlencoded</div><div class="line">Content-Length: 25</div><div class="line">Connection: Keep-Alive</div><div class="line"></div><div class="line">sex=man&amp;name=Professional</div></pre></td></tr></table></figure></p>
<p>Data submitted with POST is carried in the message body, but the HTTP protocol does not prescribe an encoding or data format for it. Developers are free to choose the format of the body, as long as the final HTTP request follows the structure above and the Content-Type and Content-Length headers describe the body correctly.</p>
<h5 id="post提交数据的方式">Ways to Submit POST Data</h5>
<h6 id="applicationx-www-form-urlencoded">application/x-www-form-urlencoded</h6>
<p>A browser's native &lt;form&gt;, when no enctype attribute is set, submits its data as application/x-www-form-urlencoded.</p>
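<p>As a rough illustration of what that encoding does, here is a minimal Java sketch that builds such a body from name/value pairs. The class FormBody and its methods are hypothetical names made up for this example; only URLEncoder and StandardCharsets come from the standard library, and the Charset overload of URLEncoder.encode assumes Java 10 or later.</p>

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration, not part of any library.
class FormBody {
    // pairs is laid out as {name1, value1, name2, value2, ...}
    static String encode(String... pairs) {
        StringBuilder body = new StringBuilder();
        for (int i = 0; i + 1 < pairs.length; i += 2) {
            if (body.length() > 0) {
                body.append('&'); // pairs are joined with an ampersand
            }
            // Each name and value is percent-encoded; a space becomes '+'.
            body.append(URLEncoder.encode(pairs[i], StandardCharsets.UTF_8))
                .append('=')
                .append(URLEncoder.encode(pairs[i + 1], StandardCharsets.UTF_8));
        }
        return body.toString();
    }

    // Content-Length counts bytes of the encoded body, not characters.
    static int contentLength(String body) {
        return body.getBytes(StandardCharsets.UTF_8).length;
    }
}
```

<p>For the pairs sex=man and name=Professional this produces the body sex=man&amp;name=Professional, which is 25 bytes long, so a matching request would carry Content-Length: 25.</p>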
<h6 id="multipartform-data">multipart/form-data</h6>
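<h3 id="响应报文">Response Messages</h3>
<p>An HTTP response is similar to an HTTP request and also consists of three parts:</p>
<ul>
<li>Status line</li>
<li>Response headers</li>
<li>Response body</li>
</ul>
<h3 id="常见状态码">Common Status Codes</h3>
<p><figure class="highlight plain"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div><div class="line">4</div><div class="line">5</div><div class="line">6</div><div class="line">7</div><div class="line">8</div><div class="line">9</div><div class="line">10</div></pre></td><td class="code"><pre><div class="line">200 OK The request succeeded.</div><div class="line">301 Moved Permanently The resource has been permanently redirected.</div><div class="line">302 Found The resource is temporarily redirected (called Moved Temporarily in HTTP/1.0).</div><div class="line">304 Not Modified The resource has not changed; the cached copy can be used.</div><div class="line">400 Bad Request The request has a syntax error and the server cannot understand it.</div><div class="line">401 Unauthorized The request lacks valid authentication. This status must be sent together with a WWW-Authenticate header.</div><div class="line">403 Forbidden The server received the request but refuses to serve it. It usually gives the reason in the response body.</div><div class="line">404 Not Found The requested resource does not exist, for example because the URL was mistyped.</div><div class="line">500 Internal Server Error The server hit an unexpected error and could not complete the request.</div><div class="line">503 Service Unavailable The server cannot handle the request right now and may recover after some time.</div></pre></td></tr></table></figure></p>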
<h3 id="keep-alive">Keep-alive:</h3>
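<p>In HTTP/1.1 all connections are kept alive by default and are closed only when &quot;Connection: close&quot; is sent. Most browsers today use HTTP/1.1, so they request Keep-Alive connections by default. Whether a connection actually stays alive therefore depends on the server configuration.</p>
<p>With persistent connections, how do the client and server know that a message has ended? There are two cases: 1. Check whether the amount of data received has reached the size given by Content-Length. 2. Dynamically generated content often has no Content-Length and is sent with chunked transfer encoding. In that case the end is determined from the chunked encoding itself: the data ends with a zero-length chunk, which marks the end of the message.</p>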
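<p>To make the second case concrete, here is a small Java sketch that decodes a chunked body and stops at the zero-length chunk. It is a simplification under stated assumptions: the whole body is already in memory as a string, the payload is plain ASCII (so characters and bytes coincide), and trailers after the last chunk are ignored. ChunkedBody is a hypothetical name, not a library class.</p>

```java
// Hypothetical helper for illustration: decodes a complete chunked body held in memory.
class ChunkedBody {
    static String decode(String raw) {
        StringBuilder out = new StringBuilder();
        int pos = 0;
        while (true) {
            // Each chunk starts with its size in hexadecimal, terminated by CRLF.
            int lineEnd = raw.indexOf("\r\n", pos);
            if (lineEnd == -1) {
                throw new IllegalArgumentException("missing chunk-size line");
            }
            String sizeLine = raw.substring(pos, lineEnd);
            int semi = sizeLine.indexOf(';');
            if (semi != -1) {
                sizeLine = sizeLine.substring(0, semi); // drop optional chunk extensions
            }
            int size = Integer.parseInt(sizeLine.trim(), 16);
            if (size == 0) {
                return out.toString(); // the zero-length chunk marks the end of the message
            }
            int dataStart = lineEnd + 2;
            out.append(raw, dataStart, dataStart + size);
            pos = dataStart + size + 2; // skip the chunk data and its trailing CRLF
        }
    }
}
```

<p>A real client would read from a stream and count bytes rather than characters, but the stopping rule is the same: keep reading chunks until one announces a size of zero.</p>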
<h3 id="cookie">Cookie</h3>
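<p>A cookie is a small piece of information that a web server sends to the client. The client sends it back with later requests, which lets the server identify the user. The server sets a cookie with the Set-Cookie response header, and the client stores it for later use.</p>
<p>The client can store a cookie in two ways. One is in client memory, called a session (temporary) cookie, which disappears when the browser is closed. The other is on the client's disk, called a persistent cookie. From then on, whenever the client visits the site it sends the cookie to the server again, provided the cookie has not expired. This is how the server keeps track of a client.</p>
<p>Cookies can be disabled.</p>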
<h3 id="session">Session</h3>
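<p>Each user has a separate session. Sessions are not shared between users; each one belongs to a single user and can hold information about that user.</p>
<p>The server creates a session object and generates a session ID to identify it. The session ID is put into a cookie and sent to the client. On the next visit the session ID is sent back to the server, which uses it to tell users apart.</p>
<p>Session tracking usually relies on cookies. If cookies are disabled, the session stops working unless the session ID is passed some other way, such as URL rewriting.</p>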
<h3 id="token">Token</h3>
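<p>The mainstream approach today is to use a token to defend against CSRF attacks.</p>
<p>Principles for using tokens</p>
<ul>
<li>Random enough: only then is the token unpredictable</li>
<li>One-time: refresh the token after every successful request, which makes attacks and prediction harder</li>
<li>Kept confidential: use POST for sensitive operations so the token does not appear in the URL</li>
</ul>
<p>Note: filtering user input cannot stop CSRF. What needs to be checked is where the request comes from.</p>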
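<p>The first principle can be sketched in a few lines of Java. This is only an illustration under assumptions: the class name CsrfTokens, the 32-byte token length, and the idea of keeping the expected token in the user's session are choices made for this example, not something prescribed by the notes above. SecureRandom, Base64, and MessageDigest.isEqual are standard library APIs.</p>

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.SecureRandom;
import java.util.Base64;

// Hypothetical helper for illustration.
class CsrfTokens {
    private static final SecureRandom RANDOM = new SecureRandom();

    // Generates an unpredictable token: 32 random bytes, URL-safe Base64, no padding.
    static String newToken() {
        byte[] bytes = new byte[32];
        RANDOM.nextBytes(bytes);
        return Base64.getUrlEncoder().withoutPadding().encodeToString(bytes);
    }

    // Compares the token stored for the user with the one submitted in the request.
    // MessageDigest.isEqual avoids leaking how many leading bytes matched.
    static boolean matches(String expected, String submitted) {
        if (expected == null || submitted == null) {
            return false;
        }
        return MessageDigest.isEqual(
            expected.getBytes(StandardCharsets.UTF_8),
            submitted.getBytes(StandardCharsets.UTF_8));
    }
}
```

<p>To follow the one-time principle, the server would call newToken() again after each successful request and store the new value in place of the old one.</p>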
<h2 id="tcp">TCP</h2>
<ul>
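<li>TCP provides a connection-oriented, reliable byte-stream service</li>
<li>A TCP connection has exactly two endpoints that talk to each other; broadcast and multicast cannot be used with TCP</li>
<li>TCP uses checksums, acknowledgments, and retransmission to ensure reliable delivery</li>
<li>TCP orders its segments by sequence number and uses cumulative acknowledgments, so data arrives in order and without duplicates</li>
<li>TCP uses a sliding window for flow control, and adjusts its congestion window dynamically for congestion control</li>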
</ul>
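<h3 id="三次握手-three-way-handshake">Three-way Handshake</h3>
<p>Establishing a TCP connection takes three packets in total between the client and the server.
The purpose of the three-way handshake is to connect to a given port on the server, establish the TCP connection, synchronize both sides' sequence and acknowledgment numbers, and exchange TCP window sizes. In socket programming, the handshake is triggered when the client calls connect().
<img src="http://i.imgur.com/DROOpLM.png" alt="">
In the TCP/IP protocol suite, TCP provides a reliable connection service and uses the three-way handshake to establish a connection.
First handshake: to open the connection, the client sends a SYN packet (syn=j) to the server and enters the SYN_SENT state, waiting for the server to acknowledge it.</p>
<p>Second handshake: the server receives the SYN packet and must acknowledge the client's SYN (ack=j+1). At the same time it sends its own SYN (syn=k), so the packet is a SYN+ACK, and the server enters the SYN_RECV state. Third handshake: the client receives the server's SYN+ACK and sends back an ACK (ack=k+1). Once this packet has been sent and received, the client and server are both in the ESTABLISHED state and the handshake is complete. The client and server can then start transferring data.</p>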
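<p>The bookkeeping in those three steps can be written down as a toy model in Java. It sends no real packets; it only tracks the numbers j and k used above, and the class and method names are made up for this sketch. The one detail it adds is that TCP sequence numbers are 32-bit values, so the +1 wraps around modulo 2^32.</p>

```java
// Toy model for illustration only: it tracks the numbers exchanged, not real packets.
class HandshakeDemo {
    private static final long MASK = 0xFFFFFFFFL; // sequence numbers are 32 bits wide

    // Returns {clientSyn, serverAck, serverSyn, clientAck}
    // for the initial sequence numbers j (client) and k (server).
    static long[] exchange(long j, long k) {
        long clientSyn = j & MASK;        // 1st: client sends SYN(seq=j), enters SYN_SENT
        long serverAck = (j + 1) & MASK;  // 2nd: server acknowledges with ack=j+1 ...
        long serverSyn = k & MASK;        //      ... and sends its own SYN(seq=k), enters SYN_RECV
        long clientAck = (k + 1) & MASK;  // 3rd: client acknowledges with ack=k+1, both ESTABLISHED
        return new long[] {clientSyn, serverAck, serverSyn, clientAck};
    }
}
```

<p>For example, exchange(100, 300) yields 100, 101, 300, 301, matching syn=j, ack=j+1, syn=k, ack=k+1.</p>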
<h2 id="socket">Socket</h2>
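<p>A socket is a wrapper around the TCP/IP protocol suite: a software abstraction layer between the application layer and TCP/IP. From a design-pattern point of view, a socket is a facade. It hides the complexity of TCP/IP behind the socket interface, so all the user sees is a small set of simple calls, and the socket takes care of organizing the data to match the chosen protocol.
A socket can also be seen as a way for processes on different machines to communicate over a network. The triple (IP address, protocol, port) uniquely identifies a process on the network, and processes use this identifier to interact with one another.</p>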
<p><img src="http://i.imgur.com/w4tZUCs.png" alt=""></p>
<p>References:
https://hit-alibaba.github.io/interview/basic/network/HTTP.html</p>
]]></content>
    
    <summary type="html">
    
      &lt;script src=&quot;/assets/js/DPlayer.min.js&quot;&gt; &lt;/script&gt;&lt;p&gt;Work in progress...&lt;/p&gt;
&lt;h2 id=&quot;http&quot;&gt;HTTP&lt;/h2&gt;
&lt;h3 id=&quot;http报文&quot;&gt;HTTP Messages&lt;/h3&gt;
&lt;p&gt;A request consists of a request line, heade
    
    </summary>
    
      <category term="技术冷板凳" scheme="http:starllap.space/categories/%E6%8A%80%E6%9C%AF%E5%86%B7%E6%9D%BF%E5%87%B3/"/>
    
    
      <category term="HTTP" scheme="http:starllap.space/tags/HTTP/"/>
    
      <category term="Network" scheme="http:starllap.space/tags/Network/"/>
    
  </entry>
  
  <entry>
    <title>Binary Search</title>
    <link href="http:starllap.space/2017/05/28/Binary%20Search/"/>
    <id>http:starllap.space/2017/05/28/Binary Search/</id>
    <published>2017-05-28T21:13:27.000Z</published>
    <updated>2017-05-30T14:26:30.000Z</updated>
    
    <content type="html"><![CDATA[<script src="/assets/js/DPlayer.min.js"> </script><h2 id="二分查找模板总结">Binary Search Template Summary</h2>
<p>Four key elements:</p>
<ul>
<li>start + 1 &lt; end</li>
<li>start + (end - start) / 2</li>
<li>A[mid] ==, &lt;, &gt;</li>
<li>A[start] A[end] ? target</li>
</ul>
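<p>Keys to binary search</p>
<ul>
<li>Keep head and tail pointers, take the midpoint, and decide which way to go</li>
<li>Look for the first or the last position that satisfies some condition</li>
<li>Keep the half that is guaranteed to contain the answer</li>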
</ul>
<h3 id="question-1-classical-binary-search">Question 1: classical-binary-search</h3>
<p>https://www.lintcode.com/en/problem/classical-binary-search/</p>
<p>Find any position of a target number in a sorted array. Return -1 if target does not exist.</p>
<p>Given [1, 2, 2, 4, 5, 5].</p>
<p>For target = 2, return 1 or 2.</p>
<p>For target = 5, return 4 or 5.</p>
<p>For target = 6, return -1.</p>
<h3 id="explanation">Explanation</h3>
<p>A basic binary search.</p>
<h3 id="code">Code</h3>
<p><figure class="highlight plain"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div><div class="line">4</div><div class="line">5</div><div class="line">6</div><div class="line">7</div><div class="line">8</div><div class="line">9</div><div class="line">10</div><div class="line">11</div><div class="line">12</div><div class="line">13</div><div class="line">14</div><div class="line">15</div><div class="line">16</div><div class="line">17</div><div class="line">18</div><div class="line">19</div><div class="line">20</div><div class="line">21</div><div class="line">22</div><div class="line">23</div><div class="line">24</div><div class="line">25</div><div class="line">26</div><div class="line">27</div><div class="line">28</div><div class="line">29</div><div class="line">30</div><div class="line">31</div><div class="line">32</div></pre></td><td class="code"><pre><div class="line">public class Solution &#123;</div><div class="line">    /**</div><div class="line">     * @param A an integer array sorted in ascending order</div><div class="line">     * @param target an integer</div><div class="line">     * @return an integer</div><div class="line">     */</div><div class="line">    public int findPosition(int[] A, int target) &#123;</div><div class="line">        // Write your code here</div><div class="line">        if (A == null || A.length == 0)&#123;</div><div class="line">            return -1;</div><div class="line">        &#125;</div><div class="line">        int start = 0;</div><div class="line">        int end = A.length - 1;</div><div class="line">        while (start + 1 &lt; end) &#123;</div><div class="line">            int mid = start + (end - start) / 2;</div><div class="line">            if (target == A[mid]) &#123;</div><div class="line">                end = mid;</div><div class="line">            &#125; else if (target &gt; A[mid]) &#123;</div><div class="line">                start = mid;</div><div 
class="line">            &#125; else &#123;</div><div class="line">                end = mid;</div><div class="line">            &#125;</div><div class="line">        &#125;</div><div class="line">        if (target == A[start]) &#123;</div><div class="line">            return start;</div><div class="line">        &#125;</div><div class="line">        if (target == A[end]) &#123;</div><div class="line">            return end;</div><div class="line">        &#125;</div><div class="line">        return -1;</div><div class="line">    &#125;</div><div class="line">&#125;</div></pre></td></tr></table></figure></p>
<h3 id="question-2-first-position-of-target">Question 2: First Position of target</h3>
<p>For a given sorted array (ascending order) and a target number, find the first index of this number in O(log n) time complexity.</p>
<p>If the target number does not exist in the array, return -1.</p>
<p>If the array is [1, 2, 3, 3, 4, 5, 10], for given target 3, return 2.</p>
<h3 id="code">Code</h3>
<p><figure class="highlight plain"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div><div class="line">4</div><div class="line">5</div><div class="line">6</div><div class="line">7</div><div class="line">8</div><div class="line">9</div><div class="line">10</div><div class="line">11</div><div class="line">12</div><div class="line">13</div><div class="line">14</div><div class="line">15</div><div class="line">16</div><div class="line">17</div><div class="line">18</div><div class="line">19</div><div class="line">20</div><div class="line">21</div><div class="line">22</div><div class="line">23</div><div class="line">24</div><div class="line">25</div><div class="line">26</div></pre></td><td class="code"><pre><div class="line">class Solution &#123;</div><div class="line">    /**</div><div class="line">     * @param nums: The integer array.</div><div class="line">     * @param target: Target to find.</div><div class="line">     * @return: The first position of target. 
Position starts from 0.</div><div class="line">     */</div><div class="line">    public int binarySearch(int[] nums, int target) &#123;</div><div class="line">        //write your code here</div><div class="line">        if (nums == null || nums.length == 0) return -1;</div><div class="line">        int start = 0;</div><div class="line">        int end = nums.length - 1;</div><div class="line">        while (start + 1 &lt; end) &#123;</div><div class="line">            int mid = start + (end - start) / 2;</div><div class="line">            if (target == nums[mid]) &#123;</div><div class="line">                end = mid;</div><div class="line">            &#125; else if (target &gt; nums[mid]) &#123;</div><div class="line">                start = mid;</div><div class="line">            &#125; else &#123;</div><div class="line">                end = mid;</div><div class="line">            &#125;</div><div class="line">        &#125;</div><div class="line">        if (target == nums[start]) return start;</div><div class="line">        if (target == nums[end]) return end;</div><div class="line">        return -1;</div><div class="line">    &#125;</div><div class="line">&#125;</div></pre></td></tr></table></figure></p>
<h3 id="question-3-last-position-of-target">Question 3: last-position-of-target</h3>
<p>Given an array sorted in ascending order, find the last position at which target appears. Return -1 if it does not appear.</p>
<h3 id="code">Code</h3>
<p><figure class="highlight plain"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div><div class="line">4</div><div class="line">5</div><div class="line">6</div><div class="line">7</div><div class="line">8</div><div class="line">9</div><div class="line">10</div><div class="line">11</div><div class="line">12</div><div class="line">13</div><div class="line">14</div><div class="line">15</div><div class="line">16</div><div class="line">17</div><div class="line">18</div><div class="line">19</div><div class="line">20</div><div class="line">21</div><div class="line">22</div><div class="line">23</div><div class="line">24</div><div class="line">25</div><div class="line">26</div><div class="line">27</div><div class="line">28</div><div class="line">29</div><div class="line">30</div><div class="line">31</div><div class="line">32</div><div class="line">33</div><div class="line">34</div></pre></td><td class="code"><pre><div class="line">public class Solution &#123;</div><div class="line">    /**</div><div class="line">     * @param A an integer array sorted in ascending order</div><div class="line">     * @param target an integer</div><div class="line">     * @return an integer</div><div class="line">     */</div><div class="line">    public int lastPosition(int[] A, int target) &#123;</div><div class="line">        // Write your code here</div><div class="line">        if (A == null || A.length == 0)&#123;</div><div class="line">            return -1;</div><div class="line">        &#125;</div><div class="line">        int start = 0;</div><div class="line">        int end = A.length - 1;</div><div class="line">        while (start + 1 &lt; end) &#123;</div><div class="line">            int mid = start + (end - start) / 2;</div><div class="line">            if (target == A[mid]) &#123;</div><div class="line">                //difference with the above questions</div><div class="line">                start = 
mid;</div><div class="line">            &#125; else if (target &gt; A[mid]) &#123;</div><div class="line">                start = mid;</div><div class="line">            &#125; else &#123;</div><div class="line">                end = mid;</div><div class="line">            &#125;</div><div class="line">        &#125;</div><div class="line">        //check end element first</div><div class="line">        if (target == A[end]) &#123;</div><div class="line">            return end;</div><div class="line">        &#125;</div><div class="line">        if (target == A[start]) &#123;</div><div class="line">            return start;</div><div class="line">        &#125;</div><div class="line">        return -1;</div><div class="line">    &#125;</div><div class="line">&#125;</div></pre></td></tr></table></figure></p>
<h3 id="question-4-search-insert-position">Question 4: search-insert-position</h3>
<p>Given a sorted array and a target value, return the index if the target is found. If not, return the index where it would be if it were inserted in order.</p>
<p>You may assume NO duplicates in the array.</p>
<p>[1,3,5,6], 5 → 2</p>
<p>[1,3,5,6], 2 → 1</p>
<p>[1,3,5,6], 7 → 4</p>
<p>[1,3,5,6], 0 → 0</p>
<h3 id="code">Code</h3>
<p><figure class="highlight plain"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div><div class="line">4</div><div class="line">5</div><div class="line">6</div><div class="line">7</div><div class="line">8</div><div class="line">9</div><div class="line">10</div><div class="line">11</div><div class="line">12</div><div class="line">13</div><div class="line">14</div><div class="line">15</div><div class="line">16</div><div class="line">17</div><div class="line">18</div><div class="line">19</div><div class="line">20</div><div class="line">21</div><div class="line">22</div><div class="line">23</div><div class="line">24</div><div class="line">25</div><div class="line">26</div></pre></td><td class="code"><pre><div class="line">public class Solution &#123;</div><div class="line">    /**</div><div class="line">     * param A : an integer sorted array</div><div class="line">     * param target :  an integer to be inserted</div><div class="line">     * return : an integer</div><div class="line">     */</div><div class="line">    public int searchInsert(int[] A, int target) &#123;</div><div class="line">        // write your code here</div><div class="line">        if ( A == null || A.length == 0) return 0;</div><div class="line">        int start = 0;</div><div class="line">        int end = A.length - 1;</div><div class="line">        while (start + 1 &lt; end) &#123;</div><div class="line">            int mid = start + (end - start) / 2;</div><div class="line">            if (target == A[mid]) &#123;</div><div class="line">                return mid;</div><div class="line">            &#125; else if (target &gt; A[mid]) &#123;</div><div class="line">                start = mid;</div><div class="line">            &#125; else &#123;</div><div class="line">                end = mid;</div><div class="line">            &#125;</div><div class="line">        &#125;</div><div class="line">        if (target &lt;= A[start]) 
return start;</div><div class="line">        if (target &lt;= A[end]) return end;</div><div class="line">        return end + 1;</div><div class="line">    &#125;</div><div class="line">&#125;</div></pre></td></tr></table></figure></p>
<h3 id="question-5-search-in-a-big-sorted-array">Question 5: Search in a big sorted array</h3>
<p>Given a big sorted array with positive integers sorted by ascending order. The array is so big so that you can not get the length of the whole array directly, and you can only access the kth number by ArrayReader.get(k) (or ArrayReader-&gt;get(k) for C++). Find the first index of a target number. Your algorithm should be in O(log k), where k is the first index of the target number.
Return -1, if the number doesn't exist in the array.</p>
<h3 id="explanation">Explanation</h3>
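<p>Unlike the problems above, which look for a number in an ordinary array, this array is very large. So the question is how to grow the search range by doubling: keep doubling the upper bound until the value there is no smaller than target, then find the first occurrence within that range.</p>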
<h3 id="code">Code</h3>
<p><figure class="highlight plain"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div><div class="line">4</div><div class="line">5</div><div class="line">6</div><div class="line">7</div><div class="line">8</div><div class="line">9</div><div class="line">10</div><div class="line">11</div><div class="line">12</div><div class="line">13</div><div class="line">14</div><div class="line">15</div><div class="line">16</div><div class="line">17</div><div class="line">18</div><div class="line">19</div><div class="line">20</div><div class="line">21</div><div class="line">22</div><div class="line">23</div><div class="line">24</div><div class="line">25</div><div class="line">26</div><div class="line">27</div><div class="line">28</div><div class="line">29</div><div class="line">30</div></pre></td><td class="code"><pre><div class="line">class Solution &#123;</div><div class="line">    /**</div><div class="line">     * @param nums: The integer array.</div><div class="line">     * @param target: Target to find.</div><div class="line">     * @return: The first position of target. 
Position starts from 0.</div><div class="line">     */</div><div class="line">    public int binarySearch(int[] nums, int target) &#123;</div><div class="line">        //write your code here</div><div class="line">        if (nums == null || nums.length == 0) return -1;</div><div class="line">        int end = 0;</div><div class="line">        while (end &lt; nums.length &amp;&amp; nums[end] &lt; target) &#123;</div><div class="line">          end = end * 2 + 1;</div><div class="line">        &#125;</div><div class="line">        if (end &gt;= nums.length) end = nums.length - 1;</div><div class="line">        int start = 0;</div><div class="line">        while (start + 1 &lt; end) &#123;</div><div class="line">            int mid = start + (end - start) / 2;</div><div class="line">            if (target == nums[mid]) &#123;</div><div class="line">                end = mid;</div><div class="line">            &#125; else if (target &gt; nums[mid]) &#123;</div><div class="line">                start = mid;</div><div class="line">            &#125; else &#123;</div><div class="line">                end = mid;</div><div class="line">            &#125;</div><div class="line">        &#125;</div><div class="line">        if (target == nums[start]) return start;</div><div class="line">        if (target == nums[end]) return end;</div><div class="line">        return -1;</div><div class="line">    &#125;</div><div class="line">&#125;</div></pre></td></tr></table></figure></p>
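<p>Because the problem statement says the length of the array cannot be read directly, here is an alternative sketch that relies only on get(k). The ArrayReader interface below is a hypothetical stand-in for the one in the problem, and the sketch assumes that get(k) returns Integer.MAX_VALUE for an index past the end of the array. That behavior is an assumption of this example, not something stated above.</p>

```java
// Hypothetical stand-in for the ArrayReader in the problem statement.
// Assumption: get(k) returns Integer.MAX_VALUE when k is past the end of the array.
interface ArrayReader {
    int get(int k);
}

class BigArraySearch {
    static int firstPosition(ArrayReader reader, int target) {
        // Doubling phase: grow the upper bound until the value there is not smaller than target.
        int end = 1;
        while (reader.get(end) < target) {
            end = end * 2;
        }
        // Binary search phase: same template as above, keeping the half that can hold the first match.
        int start = 0;
        while (start + 1 < end) {
            int mid = start + (end - start) / 2;
            if (reader.get(mid) >= target) {
                end = mid;
            } else {
                start = mid;
            }
        }
        // Check start before end so that the first occurrence wins.
        if (reader.get(start) == target) {
            return start;
        }
        if (reader.get(end) == target) {
            return end;
        }
        return -1;
    }
}
```

<p>The doubling phase takes O(log k) calls to get and leaves a range of size O(k), so the binary search phase is also O(log k). The sketch does not handle a target equal to Integer.MAX_VALUE, since that value doubles as the out-of-range marker here.</p>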
]]></content>
    
    <summary type="html">
    
      &lt;script src=&quot;/assets/js/DPlayer.min.js&quot;&gt; &lt;/script&gt;&lt;h2 id=&quot;二分查找模板总结&quot;&gt;Binary Search Template Summary&lt;/h2&gt;
&lt;p&gt;Four key elements:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;start + 1 &amp;lt; end&lt;/li&gt;
&lt;li&gt;start + 
    
    </summary>
    
      <category term="Data Structure" scheme="http:starllap.space/categories/Data-Structure/"/>
    
    
      <category term="Data Structure" scheme="http:starllap.space/tags/Data-Structure/"/>
    
      <category term="Binary Search" scheme="http:starllap.space/tags/Binary-Search/"/>
    
  </entry>
  
  <entry>
    <title>Install Sudo in Debian</title>
    <link href="http:starllap.space/2017/04/28/Install-Sudo-in-Debian/"/>
    <id>http:starllap.space/2017/04/28/Install-Sudo-in-Debian/</id>
    <published>2017-04-29T05:21:34.000Z</published>
    <updated>2017-04-29T02:26:58.000Z</updated>
    
    <content type="html"><![CDATA[<script src="/assets/js/DPlayer.min.js"> </script><p>A minimal Debian system does not come with sudo by default. The steps to install it are as follows.</p>
<p>Step 1: Switch to the root user<br>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><div class="line">1</div></pre></td><td class="code"><pre><div class="line">su -</div></pre></td></tr></table></figure></p>
<p>Step 2: Install sudo
<figure class="highlight plain"><table><tr><td class="gutter"><pre><div class="line">1</div></pre></td><td class="code"><pre><div class="line">apt-get install sudo</div></pre></td></tr></table></figure></p>
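<p>Step 3: Add your username to the sudoers list (running visudo instead of opening the file directly is safer, because it checks the syntax before saving)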
<figure class="highlight plain"><table><tr><td class="gutter"><pre><div class="line">1</div></pre></td><td class="code"><pre><div class="line">vi /etc/sudoers</div></pre></td></tr></table></figure></p>
<p>Edit the file and add a line like this, replacing star with your own username:
<figure class="highlight plain"><table><tr><td class="gutter"><pre><div class="line">1</div></pre></td><td class="code"><pre><div class="line">star ALL=(ALL:ALL) ALL</div></pre></td></tr></table></figure></p>
<p>Step 4: Exit the root shell and test sudo
<figure class="highlight plain"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div></pre></td><td class="code"><pre><div class="line">exit</div><div class="line"></div><div class="line">sudo su -</div></pre></td></tr></table></figure></p>
<p>Done!</p>
]]></content>
    
    <summary type="html">
    
      &lt;script src=&quot;/assets/js/DPlayer.min.js&quot;&gt; &lt;/script&gt;&lt;p&gt;A minimal Debian system does not come with sudo by default. The steps to install it are as follows.&lt;/p&gt;
&lt;p&gt;Step 1: Switch to the root user&lt;br&gt;
&lt;figure class=&quot;highlight plai
    
    </summary>
    
      <category term="技术冷板凳" scheme="http:starllap.space/categories/%E6%8A%80%E6%9C%AF%E5%86%B7%E6%9D%BF%E5%87%B3/"/>
    
    
      <category term="Debian" scheme="http:starllap.space/tags/Debian/"/>
    
      <category term="sudo" scheme="http:starllap.space/tags/sudo/"/>
    
  </entry>
  
  <entry>
    <title>Paper Notes: "Stronger consistency and semantics for low-latency geo-replicated storage"</title>
    <link href="http:starllap.space/2017/04/11/%E8%AE%BA%E6%96%87%E7%AC%94%E8%AE%B0%EF%BC%9A%E3%80%8AStronger-consistency-and-semantics-for-low-latency-geo-replicated-storage%E3%80%8B/"/>
    <id>http:starllap.space/2017/04/11/论文笔记：《Stronger-consistency-and-semantics-for-low-latency-geo-replicated-storage》/</id>
    <published>2017-04-12T06:11:01.000Z</published>
    <updated>2017-04-24T19:53:44.000Z</updated>
    
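    <content type="html"><![CDATA[<script src="/assets/js/DPlayer.min.js"> </script><p>These are reading notes on the paper "Stronger consistency and semantics for low-latency geo-replicated storage".</p>
<p>(Unfinished; to be updated...)</p>
<h3 id="背景">Background</h3>
<h4 id="地理复制geo-replication">Geo-replication</h4>
<p>Large web services today often need large-scale data storage that supports millions of concurrent users operating on the data. In these systems, data centers usually hold full replicas, so every data center stores all of the data. Facebook, for example, stores all user information in every data center. Replicating data across different geographic locations in this way is called geo-replication.<br>
Geo-replication has two benefits: fault tolerance and low latency. If the database at one location goes down, the others can keep serving requests, and users can be served by the replica closest to them.</p>
<h4 id="数据分区">Data Sharding</h4>
<p>In a large-scale database, the data in each data center is very large and often has to be spread across tens of thousands of machines. A common technique is sharding (partitioning), which places different parts of the data on different servers. When new machines are added, the data has to be re-partitioned. In short, data is replicated across geographic regions and partitioned within each region.</p>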
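<p>To see why adding machines forces re-partitioning, consider the simplest possible routing rule: hash the key and take it modulo the number of servers. The sketch below is a hypothetical illustration in Java, not the scheme used by the paper or by any particular system; ShardRouter and its methods are names invented for this example.</p>

```java
// Hypothetical routing rule for illustration only.
class ShardRouter {
    // Maps a key to a server index in the range 0 .. shardCount - 1.
    static int shardFor(String key, int shardCount) {
        // floorMod keeps the result non-negative even when hashCode() is negative.
        return Math.floorMod(key.hashCode(), shardCount);
    }

    // Counts how many keys land on a different server when the cluster is resized.
    static int movedKeys(String[] keys, int oldCount, int newCount) {
        int moved = 0;
        for (String key : keys) {
            if (shardFor(key, oldCount) != shardFor(key, newCount)) {
                moved++;
            }
        }
        return moved;
    }
}
```

<p>With this rule, growing from 4 to 5 servers moves most keys, because the modulus changes for almost every hash value. Schemes such as consistent hashing are designed to move far fewer keys when the number of servers changes.</p>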
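<p>To reduce the round-trip time (RTT), meaning the network delay between a user sending a request and receiving the response, the most important thing is to cut the latency of reaching the data store. One way is to serve data locally whenever possible, as shown in the figure:</p>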
<p><img src="http://i.imgur.com/oyQ8t2X.png" alt=""></p>
<h4 id="local-replica-only">Local-replica-only</h4>
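<p>Reading and writing at the nearest location is fastest. Reads are served by the nearest data center, with no remote fetch. Writes are also applied in that data center and return before the remote data centers are updated. This is called a "local-replica-only" database design: it avoids remote access so that operations never pay the RTT between data centers. However, a local-replica-only design usually cannot provide strong consistency.</p>
<h4 id="linearizability线性一致化">Linearizability</h4>
<p>Linearizability is a strong consistency model. It means that once a write has completed, a read in another data center must see that write. So in principle, low latency and strong consistency are in a trade-off relationship.</p>
<h4 id="alps系统">ALPS Systems</h4>
<p>From the CAP theorem we know that a system cannot provide Consistency, Availability, and Partition tolerance all at once. Modern web services therefore usually give up strong consistency in favor of availability and partition tolerance. Such systems can be called "ALPS systems": they provide Availability, low Latency, Partition tolerance, and high Scalability.</p>
<p>Since ALPS systems must give up strong consistency, the question is how strong a consistency model can still be achieved under the ALPS constraints. The answer proposed here is causal consistency with convergent conflict handling, that is, causal+ consistency.</p>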
]]></content>
    
    <summary type="html">
    
      &lt;script src=&quot;/assets/js/DPlayer.min.js&quot;&gt; &lt;/script&gt;&lt;p&gt;These are reading notes on the paper "Stronger consistency and semantics for low-latency geo-replicated
storage".&lt;/p
    
    </summary>
    
      <category term="Distributed System" scheme="http:starllap.space/categories/Distributed-System/"/>
    
    
      <category term="Distributed System" scheme="http:starllap.space/tags/Distributed-System/"/>
    
      <category term="Cloud Computing" scheme="http:starllap.space/tags/Cloud-Computing/"/>
    
      <category term="Geo-replication" scheme="http:starllap.space/tags/Geo-replication/"/>
    
      <category term="Data Storage" scheme="http:starllap.space/tags/Data-Storage/"/>
    
  </entry>
  
  <entry>
    <title>Paper Notes: Overview of Facebook Scalable Architecture</title>
    <link href="http:starllap.space/2017/03/29/%E8%AE%BA%E6%96%87%E7%AC%94%E8%AE%B0%EF%BC%9AFacebook%E5%8F%AF%E6%8B%93%E5%B1%95%E6%9E%B6%E6%9E%84%E6%A6%82%E8%A7%88/"/>
    <id>http:starllap.space/2017/03/29/论文笔记：Facebook可拓展架构概览/</id>
    <published>2017-03-30T04:43:57.000Z</published>
    <updated>2017-03-30T18:15:38.000Z</updated>
    
    <content type="html"><![CDATA[<script src="/assets/js/DPlayer.min.js"> </script><p>These are reading notes on the paper "Overview of Facebook Scalable Architecture" by Hugo Barrigas, Daniel Barrigas, Melyssa Barata, Pedro Furtado, and Jorge Bernardina. It has very few technical details and only sketches the big picture, so I will add to these notes when I have more material. To me the main highlight is <strong>MySQL+Memcached</strong>.</p>
<hr>
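<p>The paper introduces the architecture of the Facebook website and describes the difficulties encountered in scaling it and how they were solved, to give a better picture of how Facebook runs.
For large distributed systems, scalability is a very important property of networks, systems, and processes. It indicates whether a system has room to grow and can handle an increasing workload.
Size is only one aspect of scaling. Scalability covers the following:<br>
a. Whether storage capacity can be added easily<br>
b. How much additional traffic can be handled<br>
c. How many more transactions can be processed</p>
<h3 id="facebook网站架构">Facebook Website Architecture</h3>
<h4 id="facebook如何运作">How Facebook Works</h4>
<p>As its user base grew, Facebook made some changes but still uses the LAMP (Linux, Apache, MySQL, PHP) stack:<br>
a. It still uses PHP, but wrote a compiler that translates PHP into C++ to improve server performance.<br>
b. It still uses Linux, with some optimizations.<br>
c. The most controversial choice is MySQL, which is still the primary database.<br>
In addition, Facebook has two systems of its own:</p>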
<ol>
<li>Haystack: highly scalable, used to store very large numbers of photos.</li>
<li>Scribe: a scalable logging system.</li>
</ol>
<h4 id="facebook前端">Facebook Front End</h4>
<p>The front end consists of LAMP servers running with Memcached as the caching layer.</p>
<h5 id="linux-amp-apache">Linux &amp; Apache</h5>
<p>Facebook uses Linux and the Apache HTTP Server.</p>
<h5 id="php-amp-bigpipe">PHP &amp; BigPipe</h5>
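<p>BigPipe is a dynamic web page serving system developed by Facebook. Its main idea is to split a page into parts and process them in separate stages, pipelined between the server and the browser.
For example:</p>
<p><img src="http://i.imgur.com/QpgOhQI.png" alt=""></p>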
<h5 id="hiphop">HipHop</h5>
<p>A compiler that translates PHP into C++. Some key points:</p>
<ul>
<li>It is a compiler for PHP</li>
<li>It is easy to add extensions</li>
<li>It greatly reduces CPU and memory usage</li>
</ul>
<p><img src="http://i.imgur.com/SH8jSfb.png" alt=""></p>
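<h4 id="facebook后端">Facebook Back End</h4>
<h5 id="mysql">MySQL</h5>
<p>Facebook applies sharding and caching to its use of MySQL.
Sharding is the main technique that makes MySQL scale: the database is split into several parts. In addition, about 90% of queries are answered from the cache and never reach the database. Facebook relies heavily on Memcached, and it is worth noting that it runs several thousand MySQL servers across multiple data centers. Facebook also performs some complex join operations at the server level instead of directly on the tables. (The paper does not explain how, so I do not know the technical details.)</p>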
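<p>To make the combination of sharding and caching more concrete, here is a minimal Java sketch. It is not Facebook's implementation: the class <code>ShardedStore</code>, the modulo routing rule, and the in-memory maps that stand in for MySQL shards and Memcached are all assumptions made for illustration.</p>

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: hash-based shard routing with a cache-aside read path. */
class ShardedStore {
    private final List<Map<Long, String>> shards = new ArrayList<>(); // stand-ins for MySQL shards
    private final Map<Long, String> cache = new HashMap<>();          // stand-in for Memcached
    int databaseReads = 0;                                            // reads that had to go to a shard

    ShardedStore(int shardCount) {
        for (int i = 0; i < shardCount; i++) {
            shards.add(new HashMap<>());
        }
    }

    /** Picks the shard that owns a user id (simple modulo routing, an assumption). */
    int shardFor(long userId) {
        return (int) Math.floorMod(userId, (long) shards.size());
    }

    void write(long userId, String profile) {
        shards.get(shardFor(userId)).put(userId, profile);
        cache.remove(userId); // invalidate so the next read reloads fresh data
    }

    String read(long userId) {
        String cached = cache.get(userId);
        if (cached != null) {
            return cached; // cache hit: the database is not touched
        }
        databaseReads++;
        String value = shards.get(shardFor(userId)).get(userId);
        if (value != null) {
            cache.put(userId, value); // populate the cache for later reads
        }
        return value;
    }
}
```

<p>With this layout, repeated reads of the same key reach a shard only once, which is the effect behind the claim that most queries never reach the database.</p>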
<h5 id="scribe">Scribe</h5>
<p>Scribe is Facebook's logging system. It collects and aggregates log data from many servers and then forwards it to Hadoop:
<img src="http://i.imgur.com/NkAvOBJ.png" alt=""></p>
<h5 id="thrift">Thrift</h5>
<p>Thrift provides serialization across programming languages, which lets Facebook build applications with several languages working together.</p>
<h4 id="memcached">Memcached</h4>
<p>Memcached is an in-memory key-value caching system.</p>
<p><img src="http://i.imgur.com/KEljkA5.png" alt=""></p>
<h4 id="hadoop-amp-hive">Hadoop &amp; Hive</h4>
<p><img src="http://i.imgur.com/Zko5phE.png" alt=""></p>
<h4 id="haystack">Haystack</h4>
<p>Haystack adds a scalable cache in main memory:</p>
<p><img src="http://i.imgur.com/XT5VIhG.png" alt=""></p>
<h3 id="主体架构">Overall Architecture</h3>
<p><img src="http://i.imgur.com/HWAy2UQ.png" alt=""></p>
<hr>
<p><em>REFERENCES：</em></p>
<p>[1] Building Scalable Web Architecture and Distributed Systems http://www.drdobbs.com/web-development/building-scalable-web-architecture-and-d/240142422, (Accessed 26 January , 2014)</p>
<p>[2] How Does Facebook Work? The Nuts and Bolts [Technology Explained] http://www.makeuseof.com/tag/facebook-work-nuts-bolts-technology-explained/ (Accessed 25 February 2014)</p>
<p>[3] Lloys G. W. Lloyd and Connie U. S. 2008. Scalable Query Result Caching for Web Applications. PVLDB vol. 1 no. 1 pp. 550-561, 2008</p>
<p>[4] Parris I., Abdesslem F. B., and Henderson T. 2012. Facebook or Fakebook? The effect of simulation on location privacy user studies. Ad Hoc Networks vol. 12 pp. 35-49</p>
<p>[5] Performance and scalability techniques 101 http://www.webforefront.com/performance/scaling101.html , (Accessed 30 January 2014)</p>
<p>[6] Scaling the Messages Application Back End
https://www.facebook.com/note.php?note_id=10150148835363920 (Accessed 25 February 2014)</p>
<p>[7] The effects of teacher self-disclosure via Facebook on
teacher credibility
http://www.gtaan.gatech.edu/meetings/handouts/MazerFacebook.pdf (Accessed 25 February 2014)</p>
]]></content>
    
    <summary type="html">
    
      &lt;script src=&quot;/assets/js/DPlayer.min.js&quot;&gt; &lt;/script&gt;&lt;p&gt;These are my reading notes on the paper &quot;Overview of Facebook Scalable Architecture&quot; by Hugo Barrigas, Daniel Barrigas
    
    </summary>
    
      <category term="Cloud Computing" scheme="http:starllap.space/categories/Cloud-Computing/"/>
    
    
      <category term="Scalability" scheme="http:starllap.space/tags/Scalability/"/>
    
      <category term="Facebook" scheme="http:starllap.space/tags/Facebook/"/>
    
      <category term="Architecture" scheme="http:starllap.space/tags/Architecture/"/>
    
  </entry>
  
  <entry>
    <title>Distributed Systems Notes: Two-Phase Commit and Three-Phase Commit</title>
    <link href="http:starllap.space/2017/03/29/%E5%88%86%E5%B8%83%E5%BC%8F%E4%B8%80%E8%87%B4%E6%80%A7/"/>
    <id>http:starllap.space/2017/03/29/分布式一致性/</id>
    <published>2017-03-30T02:19:56.000Z</published>
    <updated>2017-03-30T02:02:10.000Z</updated>
    
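    <content type="html"><![CDATA[<script src="/assets/js/DPlayer.min.js"> </script><p>This post reviews two common concepts in distributed systems: the two-phase commit protocol and the three-phase commit protocol.</p>
<h2 id="二阶段提交协议two-phase-commit-protocol">Two-Phase Commit Protocol (2PC)</h2>
<p>2PC is an algorithm designed to keep all nodes of a distributed system <strong>consistent</strong> when they commit a transaction. Each node knows whether its own operation succeeded or failed, but it cannot know the outcome on the other nodes. A coordinator is therefore needed to collect the results from all nodes and then tell them whether to actually commit.</p>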
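<h3 id="第一阶段投票阶段">Phase 1: Voting</h3>
<p>1) The coordinator sends every participant a request asking whether it can commit, then waits for the responses.<br>
2) Each participant executes the transaction locally and writes its redo and undo logs, but does not commit.<br>
3) Each participant replies to the coordinator: "agree" if the transaction executed successfully, otherwise "abort".</p>
<h3 id="第二阶段提交阶段">Phase 2: Commit</h3>
<p>a. If every participant voted "agree" in phase 1:<br>
1) The coordinator sends a "commit" request to all participants.<br>
2) Each participant completes the operation and releases its resources.<br>
3) Each participant replies with "done".<br>
4) Once the coordinator has received every "done" message, the transaction is complete.</p>
<p>b. If any participant voted "abort" in phase 1:<br>
1) The coordinator sends a "rollback" request to all participants.<br>
2) Each participant uses its undo log to roll back and releases the resources it held during the transaction.<br>
3) Each participant replies with "rollback done".<br>
4) Once the coordinator has received every "rollback done" message, the transaction is cancelled.</p>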
<p>The whole process is shown in this diagram:</p>
<p><img src="http://i.imgur.com/J1kzk7I.png" alt=""></p>
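<p>The two phases can also be sketched in a few lines of Java. This is a simplified, single-process illustration under my own assumptions: the <code>Participant</code> and <code>Coordinator</code> classes and their state strings are hypothetical, and network messages, logs, timeouts, and crashes are left out entirely.</p>

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical participant: votes in phase one, then commits or rolls back. */
class Participant {
    private final boolean canCommit;
    String state = "INIT";

    Participant(boolean canCommit) {
        this.canCommit = canCommit;
    }

    /** Phase one: execute locally (logs omitted here) and vote without committing. */
    boolean prepare() {
        state = canCommit ? "PREPARED" : "ABORT_VOTED";
        return canCommit;
    }

    void commit() {
        state = "COMMITTED";
    }

    void rollback() {
        state = "ROLLED_BACK";
    }
}

class Coordinator {
    /** Returns true if the transaction committed on every participant. */
    static boolean runTransaction(List<Participant> participants) {
        boolean allAgreed = true;
        for (Participant p : participants) { // phase one: collect every vote
            if (!p.prepare()) {
                allAgreed = false;
            }
        }
        for (Participant p : participants) { // phase two: same decision everywhere
            if (allAgreed) {
                p.commit();
            } else {
                p.rollback();
            }
        }
        return allAgreed;
    }
}
```

<p>A single "abort" vote is enough to roll back every participant, which is why one failing node makes the whole transaction expensive.</p>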
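<h3 id="存在的问题">Problems</h3>
<ol>
<li>Nodes block while waiting for messages, so other processes on a node must wait for the blocked process to release its resources.</li>
<li>To cope with participant failures, the coordinator has to set a timeout for every participant, and the whole transaction fails once a timeout expires. 2PC has no fault tolerance: if a single node fails, the entire transaction must be rolled back, which is expensive.</li>
<li>If the coordinator fails, the participants cannot finish the transaction and stay blocked indefinitely.</li>
<li>If the coordinator crashes right after sending a commit message and the only participant that received it crashes as well, nobody knows the state of the transaction, even after a new coordinator has been elected.</li>
</ol>
<p>These problems led to the three-phase commit protocol.</p>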
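<h2 id="三阶段提交协议three-phase-commit-protocol">Three-Phase Commit Protocol (3PC)</h2>
<p>3PC is a non-blocking protocol.</p>
<p>It changes 2PC in two ways:</p>
<ol>
<li>It adds timeouts on both the coordinator and the participants.</li>
<li>It inserts a preparation (pre-commit) phase between the first and second phases of 2PC, which ensures that all participants are in a consistent state before the final commit phase.</li>
</ol>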
<p><img src="http://i.imgur.com/YLxLPqR.png" alt=""></p>
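<p>The participant side of 3PC can be pictured as a small state machine. The sketch below is only an illustration under my own assumptions: the state names and the <code>ThreePhaseParticipant</code> class are hypothetical, and the timeout rule shown (commit after pre-commit, abort otherwise) is the usual textbook default, not something taken from the references of this post.</p>

```java
/** Illustrative states for a 3PC participant (names are assumptions). */
enum ParticipantState { WAITING, READY, PRE_COMMITTED, COMMITTED, ABORTED }

class ThreePhaseParticipant {
    ParticipantState state = ParticipantState.WAITING;

    /** Phase 1: answer the coordinator's "can you commit?" question. */
    void onCanCommit(boolean localOk) {
        state = localOk ? ParticipantState.READY : ParticipantState.ABORTED;
    }

    /** Phase 2: the coordinator confirms that everyone voted yes. */
    void onPreCommit() {
        if (state == ParticipantState.READY) {
            state = ParticipantState.PRE_COMMITTED;
        }
    }

    /** Phase 3: the final commit. */
    void onDoCommit() {
        if (state == ParticipantState.PRE_COMMITTED) {
            state = ParticipantState.COMMITTED;
        }
    }

    /** Called when the coordinator has been silent for too long. */
    void onTimeout() {
        if (state == ParticipantState.PRE_COMMITTED) {
            // Pre-commit was received, so every participant must have voted yes.
            state = ParticipantState.COMMITTED;
        } else if (state == ParticipantState.WAITING || state == ParticipantState.READY) {
            // No pre-commit was seen, so aborting is the safe default.
            state = ParticipantState.ABORTED;
        }
    }
}
```

<p>Because a timed-out participant can always pick a next state on its own, it no longer blocks forever when the coordinator disappears.</p>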
<hr>
<p><em>References:</em></p>
<ol>
<li>3.1.4 Two-phase commit: http://itdoc.hitachi.co.jp/manuals/3000/30003D5030e/DESC0050.HTM</li>
<li>Wikipedia:<br>
https://zh.wikipedia.org/zh-hans/%E4%BA%8C%E9%98%B6%E6%AE%B5%E6%8F%90%E4%BA%A4</li>
<li>分布式协议之两阶段提交协议（2PC）和改进三阶段提交协议（3PC）:     http://www.mamicode.com/info-detail-890945.html</li>
</ol>
]]></content>
    
    <summary type="html">
    
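      &lt;script src=&quot;/assets/js/DPlayer.min.js&quot;&gt; &lt;/script&gt;&lt;p&gt;This post reviews two common concepts in distributed systems: the two-phase commit protocol and the three-phase commit protocol.&lt;/p&gt;
&lt;h2 id=&quot;二阶段提交协议two-phase-commit-protocol&quot;&gt;Two-Phase Commit Protocol (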
    
    </summary>
    
      <category term="Distributed System" scheme="http:starllap.space/categories/Distributed-System/"/>
    
    
      <category term="Distributed System" scheme="http:starllap.space/tags/Distributed-System/"/>
    
      <category term="2PC" scheme="http:starllap.space/tags/2PC/"/>
    
      <category term="3PC" scheme="http:starllap.space/tags/3PC/"/>
    
  </entry>
  
  <entry>
    <title>Cloud Computing Notes: Stream Processing with Apache Kafka and Samza</title>
    <link href="http:starllap.space/2017/03/25/%E4%BA%91%E8%AE%A1%E7%AE%97%E7%9F%A5%E8%AF%86%E6%95%B4%E7%90%86%EF%BC%9A%E4%BD%BF%E7%94%A8Apache-Kafka%E5%92%8CSamza%E8%BF%9B%E8%A1%8C%E6%B5%81%E5%A4%84%E7%90%86/"/>
    <id>http:starllap.space/2017/03/25/云计算知识整理：使用Apache-Kafka和Samza进行流处理/</id>
    <published>2017-03-25T22:48:14.000Z</published>
    <updated>2017-03-30T01:47:01.000Z</updated>
    
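    <content type="html"><![CDATA[<script src="/assets/js/DPlayer.min.js"> </script><p>This post introduces stream processing with Kafka and Samza and summarizes what I learned.</p>
<h3 id="流处理">Stream Processing</h3>
<p>Demand for real-time data processing keeps growing. Sources of real-time data include sensors (such as IoT devices), social network interactions, and live business data, and these use cases require extremely low latency. LinkedIn, for example, needs real-time ad-click data to keep extending its advertising infrastructure. Frameworks such as Hadoop and Spark, which separate data ingestion from data processing, cannot meet low-latency real-time requirements. This post therefore introduces stream processing frameworks that manage and process large volumes of real-time data.</p>
<h3 id="apache-kafka">Apache Kafka</h3>
<p>Kafka is a distributed publish-subscribe messaging system. It was developed at LinkedIn and later became an Apache project. Publishers place messages into categories without knowing how subscribers will use the data, and subscribers subscribe to specific categories and receive only the matching messages. Kafka persists data in a commit log, an ordered, immutable, append-only data structure. Kafka's biggest strength is that it offers one complete data structure from which every system in an organization can fetch data independently and reliably. Kafka can be thought of as a source of streaming data. Some key Kafka terms:</p>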
<ul>
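<li>
<p>Topic:
A user-defined category under which messages are published. Each topic is maintained as a partitioned log.</p>
</li>
<li>
<p>Producers:
Processes that publish messages to one or more topics in the Kafka cluster.</p>
</li>
<li>
<p>Consumers:
Processes that read messages from the Kafka cluster.</p>
</li>
<li>
<p>Partitions:
Each topic is split into multiple partitions, and a partition is the unit of parallelism. In general, more partitions mean higher throughput. Every message in a partition has its own offset, which consumers use to locate data. Put simply, Kafka groups data by key and serves it in order within each partition, similar to the map and shuffle stages of MapReduce.</p>
</li>
<li>
<p>Brokers:
Brokers handle data persistence and replication. They talk to producers to accept messages published to the cluster, and to consumers that fetch messages.</p>
</li>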
</ul>
<p><img src="http://i.imgur.com/sVrZDbo.png" alt=""></p>
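<p>To see how topics, partitions, keys, and offsets fit together, here is a toy in-memory model in Java. It is only a sketch under my own assumptions: <code>TopicLog</code> is a hypothetical class, it does not use the Kafka client API, and real Kafka chooses partitions with its own hash function, not <code>String.hashCode()</code>.</p>

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical in-memory model of one topic made of append-only partitions. */
class TopicLog {
    private final List<List<String>> partitions = new ArrayList<>();

    TopicLog(int partitionCount) {
        for (int i = 0; i < partitionCount; i++) {
            partitions.add(new ArrayList<>());
        }
    }

    /** Messages with the same key always land in the same partition. */
    int partitionFor(String key) {
        return Math.floorMod(key.hashCode(), partitions.size());
    }

    /** Appends a message and returns its offset within the chosen partition. */
    long append(String key, String message) {
        List<String> log = partitions.get(partitionFor(key));
        log.add(message);
        return log.size() - 1;
    }

    /** Consumers address a message by (partition, offset); entries are never modified. */
    String read(int partition, long offset) {
        return partitions.get(partition).get((int) offset);
    }
}
```

<p>Because messages are only appended, offsets within a partition grow by one for each message, and a consumer can resume from any offset it remembers.</p>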
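<p>Note that Kafka does not process data itself; it is only a way to store and categorize streaming data. In this course project, Samza is used to process the streams that Kafka provides.</p>
<p>This video is worth watching (when I cannot figure out a concept I usually search YouTube _(:зゝ∠)_ ): <a href="https://www.youtube.com/watch?v=Q5wOegcVa8E" target="_blank" rel="external">Understanding Kafka with Legos</a></p>
<h3 id="apache-samza">Apache Samza</h3>
<p>Samza is a distributed stream processing framework developed at LinkedIn. The stream processing stack has three layers, with these key components:</p>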
<ul>
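<li>Streaming: supplies input as partitioned streams; here this is Kafka.</li>
<li>Execution: schedules and coordinates tasks across machines; here this is YARN.</li>
<li>Processing: performs the actual data processing; here this is Samza.</li>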
</ul>
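<p>Samza terminology:</p>
<ul>
<li>Streams: equivalent to Kafka topics.</li>
<li>Jobs: code that uses the Samza API to read and process data from one or more streams. A job may be split into several tasks, and each task may consume one or more partitions of the input streams.</li>
<li>Stateful Stream Processing: stream processing can be either stateful or stateless.</li>
</ul>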
<p><img src="http://i.imgur.com/8497uxN.png" alt="">
<img src="http://i.imgur.com/TJSfPwq.png" alt=""></p>
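<p>The difference between stateless and stateful processing is easy to show in plain Java. The two classes below are hypothetical and do not use the Samza API; they only illustrate that a stateless step looks at one message at a time, while a stateful step keeps data between messages.</p>

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Stateless: the output depends only on the current message. */
class StatelessUppercase {
    String process(String message) {
        return message.toUpperCase(Locale.ROOT);
    }
}

/** Stateful: the output depends on everything seen so far for the same key. */
class StatefulCounter {
    private final Map<String, Integer> countsByKey = new HashMap<>();

    /** Returns how many times this key has been seen, including this message. */
    int process(String key) {
        return countsByKey.merge(key, 1, Integer::sum);
    }
}
```

<p>In a real deployment the counter's map would have to survive restarts, which is why stateful jobs need a store managed by the framework.</p>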
<p>Samza architecture:
<img src="http://i.imgur.com/53TilKn.png" alt=""></p>
<h3 id="samza-api">Samza API</h3>
<p>The Samza API is a simple abstraction. The following example code shows how a Twitter-like real-time feed could be served:</p>
<p><figure class="highlight plain"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div><div class="line">4</div><div class="line">5</div><div class="line">6</div><div class="line">7</div><div class="line">8</div><div class="line">9</div><div class="line">10</div><div class="line">11</div><div class="line">12</div><div class="line">13</div><div class="line">14</div><div class="line">15</div><div class="line">16</div><div class="line">17</div><div class="line">18</div><div class="line">19</div><div class="line">20</div><div class="line">21</div><div class="line">22</div></pre></td><td class="code"><pre><div class="line">￼public class FanOutTask implements StreamTask, InitableTask, WindowableTask &#123;</div><div class="line">   private KeyValueStore&lt;String, String&gt; socialGraph;</div><div class="line">   private KeyValueStore&lt;String, Map&lt;String, Object&gt;&gt; userTimeline;</div><div class="line">   private long numMessages = 0;</div><div class="line">   @Override</div><div class="line">   @SuppressWarnings(&quot;unchecked&quot;)</div><div class="line">   public void init(Config config, TaskContext context) throws Exception &#123;</div><div class="line">     socialGraph = (KeyValueStore&lt;String, String&gt;) context.getStore(&quot;social-graph&quot;);</div><div class="line">     userTimeline = (KeyValueStore&lt;String, Map&lt;String, Object&gt;&gt;) context.getStore(&quot;user-timelin</div><div class="line"> e&quot;);</div><div class="line">&#125;</div><div class="line">   @Override</div><div class="line">   @SuppressWarnings(&quot;unchecked&quot;)</div><div class="line">   public void process(IncomingMessageEnvelope envelope, MessageCollector collector, TaskCoordin</div><div class="line"> ator coordinator) &#123;</div><div class="line">     String incomingStream = envelope.getSystemStreamPartition().getStream();</div><div class="line">     if 
(incomingStream.equals(NewsfeedConfig.FOLLOWS_STREAM.getStream())) &#123;</div><div class="line">       processFollowsEvent((Map&lt;String, Object&gt;) envelope.getMessage());</div><div class="line">     &#125; else if (incomingStream.equals(NewsfeedConfig.MESSAGES_STREAM.getStream())) &#123;</div><div class="line">       processMessageEvent((Map&lt;String, Object&gt;) envelope.getMessage(), collector);</div><div class="line">     &#125; else &#123;</div><div class="line">       throw new I</div></pre></td></tr></table></figure></p>
<p>Every Samza job must implement the StreamTask interface, and some jobs also need to implement the InitableTask and WindowableTask interfaces.</p>
]]></content>
    
    <summary type="html">
    
      &lt;script src=&quot;/assets/js/DPlayer.min.js&quot;&gt; &lt;/script&gt;&lt;p&gt;This post introduces stream processing with Kafka and Samza and summarizes what I learned.&lt;/p&gt;
&lt;h3 id=&quot;流处理&quot;&gt;Stream Processing&lt;/h3&gt;
&lt;p&gt;Demand for real-time
    
    </summary>
    
      <category term="Cloud Computing" scheme="http:starllap.space/categories/Cloud-Computing/"/>
    
    
      <category term="Cloud Computing" scheme="http:starllap.space/tags/Cloud-Computing/"/>
    
      <category term="Stream Processing" scheme="http:starllap.space/tags/Stream-Processing/"/>
    
  </entry>
  
  <entry>
    <title>Cloud Computing Case Studies: OFA/Start-ups/Data Analytics/Cloud Migration Exercise</title>
    <link href="http:starllap.space/2017/03/06/%E4%BA%91%E8%AE%A1%E7%AE%97%E5%BA%94%E7%94%A8%E5%AE%9E%E4%BE%8B%EF%BC%9AObama-For-America/"/>
    <id>http:starllap.space/2017/03/06/云计算应用实例：Obama-For-America/</id>
    <published>2017-03-06T22:41:09.000Z</published>
    <updated>2017-03-30T01:47:17.000Z</updated>
    
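    <content type="html"><![CDATA[<script src="/assets/js/DPlayer.min.js"> </script><p>This post summarizes several case studies of cloud computing in practice. Whether to use the cloud is a complicated question, and most of the time a definite answer depends on requirements and budget. Cloud computing is the trend, but how to use it deserves careful thought.</p>
<h1 id="云计算应用实例obama-for-america">Cloud Computing Case Study: Obama For America</h1>
<blockquote>
<p>Obama For America (OFA) is a classic cloud computing case study. The cloud was used to raise campaign funds, analyze opponents, and spend campaign money effectively, all of which contributed greatly to Obama's re-election.</p>
</blockquote>
<h2 id="ofa是什么">What is OFA?</h2>
<p>OFA used data integration and predictive analytics to help Obama win the election, mainly by using data mining to identify and influence target voters in swing states. Its data sources included demographic data, voting history, fundraising data, volunteer data, social network data, and polling data. With this data in hand, the campaign tried to influence voters through door-to-door visits, phone calls, email, letters, online ads, social media, paid television, and websites.</p>
<h2 id="ofa的一些数据">OFA by the Numbers</h2>
<p>According to the head of DevOps, key figures for OFA include:</p>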
<ul>
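<li>4 Gbps of bandwidth</li>
<li>10,000 requests per second</li>
<li>2,000 data nodes</li>
<li>3 data centers</li>
<li>180 TB of data</li>
<li>About 8.5 billion requests in total</li>
<li>583 days from design through deployment to use</li>
<li>At most 30 minutes of downtime</li>
<li>A technical team of 40 people</li>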
</ul>
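<h2 id="核心应用">Core Applications</h2>
<p>The Obama For America team built 200 applications in total, including:</p>
<h3 id="narwhal">Narwhal</h3>
<p>Narwhal is a REST API written in Python that provides access to all of the stored data. Most OFA data lived in relational databases, primarily MySQL, but the team also used PostgreSQL, MS SQL Server, MongoDB, Vertica, LevelDB, S3, DynamoDB, and SimpleDB. Beyond storage, AWS EMR (Elastic MapReduce) and Vertica handled the heavy data modeling and analysis work.</p>
<h3 id="calltool">CallTool</h3>
<p>CallTool let volunteers phone voters. In the last four days of the election, more than 1,000 volunteers used it to call over a million voters. The tool matched volunteers with voters, and the team used AWS Auto Scaling to add cloud resources quickly at peak demand.</p>
<h3 id="dashboard">DashBoard</h3>
<p>A Rails web application for registering volunteers that also let them check their progress and the information collected.</p>
<h3 id="dreamcatcher">Dreamcatcher</h3>
<p>Dreamcatcher processed political sentiment on social networks to pinpoint target voters and win them over.</p>
<h3 id="gotv">GOTV</h3>
<p>GOTV stands for "Get out the vote". This application mobilized supporters so that their support turned into actual votes.</p>
<h2 id="aws如何帮助了ofa">How Did AWS Help OFA?</h2>
<p>Almost all OFA applications were deployed on AWS, using services such as distributed message queues, NoSQL and SQL databases, virtual private cloud, load balancing, and a content delivery network.
Besides meeting large-scale data processing needs, the cloud let the team keep response times low at peak demand by adding computing resources automatically. The team also took full advantage of the cloud for data backup.</p>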
<p><img src="http://i.imgur.com/LZfspfL.png" alt=""></p>
<p><img src="http://i.imgur.com/6DnWTB5.png" alt=""></p>
<p><img src="http://i.imgur.com/TJaoYcR.png" alt=""></p>
<p><img src="http://i.imgur.com/vE7R4Ui.png" alt=""></p>
<hr>
<p><strong>Reference:</strong></p>
<blockquote>
<ol>
<li><a href="https://gigaom.com/2012/11/12/how-obamas-tech-team-helped-deliver-the-2012-election/" target="_blank" rel="external">How Obama’s tech team helped deliver the 2012 election</a></li>
<li>Wikipedia:<a href="https://en.wikipedia.org/wiki/Organizing_for_America" target="_blank" rel="external">Organizing for America</a></li>
<li>CMU 15719 slides</li>
<li><a href="http://www.informationweek.com/cloud/6-ways-amazon-cloud-helped-obama-win/d/d-id/1107433" target="_blank" rel="external">6 Ways Amazon Cloud Helped Obama Win</a></li>
</ol>
</blockquote>
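<h1 id="云计算应用实例start-ups">Cloud Computing Case Study: Start-ups</h1>
<h2 id="背景">Background</h2>
<p>A start-up's computing needs tend to be varied, small in volume, and quick to change. On a limited budget, it also faces long hardware procurement cycles, long deployment cycles, unaffordable system administration, data center power supply, low utilization, and slow disaster response.</p>
<p>Cloud computing is therefore an attractive option for start-ups. For example:</p>
<h3 id="gosquared">GoSquared</h3>
<p>GoSquared provides real-time online website analytics. After TechCrunch covered it, usage surged and the company did not have enough resources to cope with the growth. It eventually moved its site to AWS, where it could obtain more resources quickly and scale fast.</p>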
<p><strong>Reference</strong></p>
<blockquote>
<p><a href="https://aws.amazon.com/solutions/case-studies/gosquared/" target="_blank" rel="external">AWS case study: GoSquared</a></p>
</blockquote>
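<h1 id="云计算应用实例data-analytics">Cloud Computing Case Study: Data Analytics</h1>
<p>Business decisions today are expected to be data-driven. The constant flow of multimedia data comes from many sources and ranges from text to images, audio, and video. Data processing is further divided into batch processing and stream processing. Traditional data processing pipelines are also gradually moving to the cloud.</p>
<h2 id="数据处理流程图">Data Processing Flow</h2>
<p><img src="http://i.imgur.com/7OTudIS.png" alt=""></p>
<h2 id="lambda架构">Lambda Architecture</h2>
<p>The Lambda architecture defines a clear set of architectural principles for processing large-scale data with batch and stream processing together. It was proposed mainly to answer one question: how can arbitrary queries be run in real time over an arbitrarily large data set?</p>
<p>The Lambda architecture has three layers:</p>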
<ul>
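<li><strong>Batch layer: write once, read many times in bulk</strong><br>
Implemented mainly with Hadoop, this layer stores the data and produces the view data. When new data arrives, MapReduce iteratively aggregates it into the views. Depending on the data size and the cluster size, each iteration takes roughly a few hours.</li>
<li><strong>Serving layer: random reads, no random writes, batch computation and batch writes</strong><br>
This layer is implemented with Cloudera Impala. For the raw files of precomputed views output by the batch layer, the serving layer creates a table in the Hive metastore whose metadata points to the files in HDFS, and users can then query the views through Impala. Impala is used here so that data stored and processed by Hadoop can be queried quickly and interactively. Because MapReduce has high latency for real-time data, a speed layer is also needed.
The serving layer merges the batch views and the real-time views to produce the final query results that users need.</li>
<li><strong>Speed layer: random reads, random writes, incremental computation</strong><br>
This layer computes real-time views with the Storm framework. It uses an incremental model and treats real-time views as temporary: once the data has made its way into the batch layer, the corresponding real-time views in the serving layer are discarded. Because of the incremental model, this layer tends to be complex.</li>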
</ul>
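<p>The overall flow is as follows: incoming data enters the batch layer and the speed layer in parallel, and a query is complete only after the results from both have been merged in the serving layer.</p>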
<p><img src="http://i.imgur.com/E2LbtPy.png" alt=""></p>
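<p>The merge step can be sketched in Java with two maps. This is a toy model under my own assumptions: <code>LambdaViews</code> is a hypothetical class, page-view counting is an invented example, and plain <code>HashMap</code>s stand in for the batch views in HDFS and the real-time views computed by the speed layer.</p>

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical page-view counter split into a batch view and a real-time view. */
class LambdaViews {
    final Map<String, Long> batchView = new HashMap<>();    // precomputed by the batch layer
    final Map<String, Long> realtimeView = new HashMap<>(); // maintained incrementally by the speed layer

    /** Speed layer: update the real-time view as each event arrives. */
    void onNewEvent(String page) {
        realtimeView.merge(page, 1L, Long::sum);
    }

    /** Batch layer finished a run that already covers the recent events. */
    void onBatchRecompute(Map<String, Long> recomputed) {
        batchView.clear();
        batchView.putAll(recomputed);
        realtimeView.clear(); // the temporary real-time view is discarded
    }

    /** Serving layer: answer a query by merging both views. */
    long query(String page) {
        return batchView.getOrDefault(page, 0L) + realtimeView.getOrDefault(page, 0L);
    }
}
```

<p>The sketch assumes each batch run covers every event counted so far; a real system has to track which events a batch run included before it can drop real-time results.</p>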
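<p>Exciting as this model is, it remains controversial. For example:</p>
<ul>
<li>Code has to be maintained in two complex distributed systems, each with its own framework and programming model.</li>
<li>Running and maintaining programs across two distributed systems is still hard; for example, ORM (object-relational mapping) across database systems is difficult.</li>
</ul>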
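<p>For details of the Lambda architecture, see the book <a href="https://www.manning.com/books/big-data" target="_blank" rel="external">Big Data: Principles and best practices of scalable realtime data systems</a>.
Besides introducing the Lambda architecture, it emphasizes several important ideas:</p>
<ul>
<li>Distributed thinking</li>
<li>Avoiding incremental architectures</li>
<li>Building recomputation algorithms</li>
<li>Data relevance</li>
<li>Focusing on data immutability</li>
</ul>
<p>This diagram shows well how Lambda handles a query:
<img src="http://i.imgur.com/qcUMzoL.png" alt=""></p>
<p>Later on, Spark came to be regarded as a reasonable implementation of the Lambda architecture, so writing about Lambda here feels a little dated, but since the instructor brought it up I wanted to understand it again. That is how technology moves: a stopgap like the Lambda architecture gets replaced by a more complete solution. Either way, progress is a pleasure to watch. When I have time I may write a more detailed post about Spark, which is amazing.</p>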
<p><strong>References：</strong></p>
<blockquote>
<ol>
<li><a href="http://blog.csdn.net/john_f_lau/article/details/25502203" target="_blank" rel="external">大数据Lambda架构</a></li>
<li><a href="https://en.wikipedia.org/wiki/Lambda_architecture" target="_blank" rel="external">Wikipedia Lambda Architecture</a></li>
<li><a href="https://dzone.com/articles/lambda-architecture-with-apache-spark" target="_blank" rel="external">Lambda Architecture with Apache Spark</a></li>
</ol>
</blockquote>
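<h1 id="云计算应用实例云迁移-究竟是否应该选择cloud">Cloud Computing Case Study: Cloud Migration (Should You Choose the Cloud?)</h1>
<p>Should a research lab use on-premises hardware or migrate to the cloud?
The answer, of course, is "It depends."</p>
<p>Suppose the requirements are as follows:</p>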
<p><img src="http://i.imgur.com/I3rZUZ1.png" alt=""></p>
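<p>If we choose on-premises machines, the cost over five years is:
<img src="http://i.imgur.com/03678Pu.png" alt=""></p>
<p>There are two cloud options:
Option 1:
<img src="http://i.imgur.com/9umJ84S.png" alt="">
Option 2:
<img src="http://i.imgur.com/0yWhXho.png" alt=""></p>
<p>We can therefore decide whether to migrate by weighing the requirements, the cloud options, the pricing models, and the access-control requirements.</p>
<p>When comparing, we also need to consider:</p>
<ul>
<li>For on-premises machines: power, cooling, security, and so on.</li>
<li>For the cloud: the cost of moving into and out of the cloud, and so on.</li>
</ul>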
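<p>The comparison itself is simple arithmetic, and a small Java helper makes the structure explicit. Everything here is an assumption for illustration: the <code>MigrationCost</code> class and the cost terms are hypothetical, and a real comparison should plug in the figures from the tables above plus items such as power, cooling, and migration effort.</p>

```java
/** Hypothetical helper that compares total cost over a planning horizon. */
class MigrationCost {
    /** On-premises: pay for hardware up front, then operate it every year. */
    static double onPremTotal(double hardwareCost, double yearlyOperations, int years) {
        return hardwareCost + yearlyOperations * years;
    }

    /** Cloud: pay a one-off migration cost, then pay per hour of use. */
    static double cloudTotal(double hourlyRate, double hoursPerYear, double migrationCost, int years) {
        return migrationCost + hourlyRate * hoursPerYear * years;
    }

    static boolean cloudIsCheaper(double onPremTotal, double cloudTotal) {
        return cloudTotal < onPremTotal;
    }
}
```

<p>With made-up inputs, say 50,000 for hardware plus 10,000 per year on-premises versus 2.0 per hour for 8,760 hours a year plus 5,000 for migration, the five-year totals are 100,000 and 92,600. Changing the utilization or the horizon can easily flip the result, which is exactly why the answer is "It depends."</p>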
]]></content>
    
    <summary type="html">
    
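      &lt;script src=&quot;/assets/js/DPlayer.min.js&quot;&gt; &lt;/script&gt;&lt;p&gt;This post summarizes several case studies of cloud computing in practice. Whether to use the cloud is a complicated question, and most of the time a definite answer depends on requirements and budget. Cloud computing is the trend, but how to use it deserves careful thought.&lt;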
    
    </summary>
    
      <category term="Cloud Computing" scheme="http:starllap.space/categories/Cloud-Computing/"/>
    
    
      <category term="Cloud Computing" scheme="http:starllap.space/tags/Cloud-Computing/"/>
    
      <category term="OFA" scheme="http:starllap.space/tags/OFA/"/>
    
  </entry>
  
  <entry>
    <title>Binary Tree</title>
    <link href="http:starllap.space/2017/02/28/Binary%20Tree/"/>
    <id>http:starllap.space/2017/02/28/Binary Tree/</id>
    <published>2017-02-28T22:12:00.000Z</published>
    <updated>2017-05-29T18:43:31.000Z</updated>
    
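    <content type="html"><![CDATA[<script src="/assets/js/DPlayer.min.js"> </script><p>This post summarizes commonly used facts about tree data structures, covering only the binary tree for now.</p>
<h2 id="binary-tree二叉树">Binary Tree</h2>
<h3 id="why-tree">Why Tree?</h3>
<p>Because a tree combines the strengths of other data structures:</p>
<ul>
<li>Sorted array: lookups with binary search are fast.</li>
<li>Linked list: insertion and deletion are fast because no values need to be shifted.</li>
</ul>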
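<h3 id="基本概念">Basic Concepts</h3>
<ul>
<li>Root: the node at the top of the tree.</li>
<li>Parent node</li>
<li>Child node</li>
<li>Leaf node: a node with no children.</li>
<li>Level (height): the number of generations in the tree.</li>
</ul>
<h3 id="平衡树和非平衡树">Balanced and Unbalanced Trees</h3>
<p>Balanced tree:
The heights of the left and right subtrees differ by at most 1, and the left and right subtrees are themselves balanced.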
<img src="http://i.imgur.com/8XBVd4y.png" alt=""></p>
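<p>The definition translates almost directly into a recursive check. The sketch below is one possible way to write it; the <code>BalanceCheck</code> class and its nested <code>Node</code> type are hypothetical helpers, separate from the BST code later in this post.</p>

```java
/** Hypothetical helper that checks the balanced-tree definition above. */
class BalanceCheck {
    static class Node {
        int key;
        Node left;
        Node right;

        Node(int key) {
            this.key = key;
        }
    }

    /** Returns the height of the subtree, or -1 if some subtree is unbalanced. */
    static int checkedHeight(Node node) {
        if (node == null) {
            return 0; // an empty tree has height 0 and is balanced
        }
        int leftHeight = checkedHeight(node.left);
        if (leftHeight < 0) {
            return -1;
        }
        int rightHeight = checkedHeight(node.right);
        if (rightHeight < 0) {
            return -1;
        }
        if (Math.abs(leftHeight - rightHeight) > 1) {
            return -1; // heights differ by more than 1
        }
        return Math.max(leftHeight, rightHeight) + 1;
    }

    static boolean isBalanced(Node root) {
        return checkedHeight(root) >= 0;
    }
}
```

<p>Computing the height and the balance flag in one pass keeps the check at O(n); calling a separate height function at every node would repeat work.</p>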
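<h3 id="full-tree-和-complete-tree">Full Tree and Complete Tree</h3>
<ul>
<li>Full Tree: every node has either 0 or 2 children.</li>
<li>Complete Tree: every level except possibly the last is completely filled, and the nodes on the last level are packed as far left as possible.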
<img src="http://i.imgur.com/Tp3azOo.png" alt=""></li>
</ul>
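<p>Both properties can be checked in code. The following Java sketch is one way to do it; <code>ShapeCheck</code> and its nested <code>Node</code> type are hypothetical names chosen for this example.</p>

```java
import java.util.ArrayDeque;
import java.util.Queue;

/** Hypothetical helper for the full-tree and complete-tree definitions. */
class ShapeCheck {
    static class Node {
        int key;
        Node left;
        Node right;

        Node(int key) {
            this.key = key;
        }
    }

    /** Full: every node has either zero or two children. */
    static boolean isFull(Node node) {
        if (node == null) {
            return true;
        }
        if ((node.left == null) != (node.right == null)) {
            return false; // exactly one child is missing
        }
        return isFull(node.left) && isFull(node.right);
    }

    /** Complete: in level order, no node may appear after a missing child. */
    static boolean isComplete(Node root) {
        if (root == null) {
            return true;
        }
        Queue<Node> queue = new ArrayDeque<>();
        queue.add(root);
        boolean seenGap = false;
        while (!queue.isEmpty()) {
            Node node = queue.poll();
            for (Node child : new Node[] {node.left, node.right}) {
                if (child == null) {
                    seenGap = true;
                } else {
                    if (seenGap) {
                        return false; // a node follows a gap, so the tree is not packed left
                    }
                    queue.add(child);
                }
            }
        }
        return true;
    }
}
```

<p>A tree can be complete without being full (for example when the last node is a lone left child), and full without being complete, so the two checks are independent.</p>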
<h3 id="遍历顺序">Traversal Order</h3>
<p><img src="http://i.imgur.com/MpkGahp.png" alt=""></p>
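<p>The three classic depth-first orders differ only in when the current node is visited relative to its subtrees. Here is a short recursive sketch; <code>Traversals</code> and its nested <code>Node</code> type are hypothetical names for this example, and each method collects keys into a list so the order is easy to inspect.</p>

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that collects keys in the three depth-first orders. */
class Traversals {
    static class Node {
        int key;
        Node left;
        Node right;

        Node(int key) {
            this.key = key;
        }
    }

    /** Pre-order: node, then left subtree, then right subtree. */
    static void preOrder(Node node, List<Integer> out) {
        if (node == null) {
            return;
        }
        out.add(node.key);
        preOrder(node.left, out);
        preOrder(node.right, out);
    }

    /** In-order: left subtree, then node, then right subtree. */
    static void inOrder(Node node, List<Integer> out) {
        if (node == null) {
            return;
        }
        inOrder(node.left, out);
        out.add(node.key);
        inOrder(node.right, out);
    }

    /** Post-order: left subtree, then right subtree, then node. */
    static void postOrder(Node node, List<Integer> out) {
        if (node == null) {
            return;
        }
        postOrder(node.left, out);
        postOrder(node.right, out);
        out.add(node.key);
    }
}
```

<p>On a binary search tree, the in-order traversal visits keys in ascending order, which is what the <code>traverse()</code> method of the interface in the next section asks for.</p>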
<h3 id="binary-tree代码实现">Binary Tree Implementation</h3>
<h4 id="binary-tree-interface">Binary Tree Interface</h4>
<p><figure class="highlight plain"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div><div class="line">4</div><div class="line">5</div><div class="line">6</div><div class="line">7</div><div class="line">8</div><div class="line">9</div><div class="line">10</div><div class="line">11</div><div class="line">12</div><div class="line">13</div><div class="line">14</div><div class="line">15</div><div class="line">16</div><div class="line">17</div><div class="line">18</div><div class="line">19</div><div class="line">20</div><div class="line">21</div><div class="line">22</div><div class="line">23</div><div class="line">24</div><div class="line">25</div><div class="line">26</div><div class="line">27</div><div class="line">28</div></pre></td><td class="code"><pre><div class="line">public interface BSTInterface &#123;</div><div class="line"></div><div class="line">    /**</div><div class="line">     * Searches for the specified key in the tree.</div><div class="line">     * @param key key of the element to search</div><div class="line">     * @return boolean value indication of success or failure</div><div class="line">     */</div><div class="line">    boolean find(int key);</div><div class="line"></div><div class="line">    /**</div><div class="line">     * Inserts a new element into the tree.</div><div class="line">     * @param key key of the element</div><div class="line">     * @param value value of the element</div><div class="line">     */</div><div class="line">    void insert(int key, double value);</div><div class="line"></div><div class="line">    /**</div><div class="line">     * Deletes an element from the tree using the specified key.</div><div class="line">     * @param key key of the element to delete</div><div class="line">     */</div><div class="line">    void delete(int key);</div><div class="line"></div><div class="line">    /**</div><div class="line">     * Traverses and prints values of the tree 
in ascending order based on key.</div><div class="line">     */</div><div class="line">    void traverse();</div><div class="line"></div><div class="line">&#125;</div></pre></td></tr></table></figure></p>
<h4 id="binary-tree功能实现">Implementing the Operations</h4>
<ol>
<li>
<p>Find:
<figure class="highlight plain"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div><div class="line">4</div><div class="line">5</div><div class="line">6</div><div class="line">7</div><div class="line">8</div><div class="line">9</div><div class="line">10</div><div class="line">11</div><div class="line">12</div><div class="line">13</div><div class="line">14</div><div class="line">15</div><div class="line">16</div><div class="line">17</div><div class="line">18</div><div class="line">19</div><div class="line">20</div><div class="line">21</div><div class="line">22</div><div class="line">23</div><div class="line">24</div></pre></td><td class="code"><pre><div class="line">public boolean find(int key) &#123;</div><div class="line">    // tree is empty</div><div class="line">    if (root == null) &#123;</div><div class="line">        return false;</div><div class="line">    &#125;</div><div class="line"></div><div class="line">    Node curr = root;</div><div class="line">    // while not found</div><div class="line">    while (curr.key != key) &#123;</div><div class="line">        if (curr.key &lt; key) &#123;</div><div class="line">            // go right</div><div class="line">            curr = curr.right;</div><div class="line">        &#125; else &#123;</div><div class="line">            // go left</div><div class="line">            curr = curr.left;</div><div class="line">        &#125;</div><div class="line"></div><div class="line">        // not found</div><div class="line">        if (curr == null) &#123;</div><div class="line">            return false;</div><div class="line">        &#125;</div><div class="line">    &#125;</div><div class="line">    return true; // found</div><div class="line">&#125;</div></pre></td></tr></table></figure></p>
</li>
<li>
<p>Insert
<figure class="highlight plain"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div><div class="line">4</div><div class="line">5</div><div class="line">6</div><div class="line">7</div><div class="line">8</div><div class="line">9</div><div class="line">10</div><div class="line">11</div><div class="line">12</div><div class="line">13</div><div class="line">14</div><div class="line">15</div><div class="line">16</div><div class="line">17</div><div class="line">18</div><div class="line">19</div><div class="line">20</div><div class="line">21</div><div class="line">22</div><div class="line">23</div><div class="line">24</div><div class="line">25</div><div class="line">26</div><div class="line">27</div><div class="line">28</div><div class="line">29</div><div class="line">30</div><div class="line">31</div><div class="line">32</div><div class="line">33</div><div class="line">34</div><div class="line">35</div><div class="line">36</div><div class="line">37</div></pre></td><td class="code"><pre><div class="line">public void insert(int key, double value) &#123;</div><div class="line">    Node newNode = new Node(key, value);</div><div class="line">    // empty tree</div><div class="line">    if (root == null) &#123;</div><div class="line">        root = newNode;</div><div class="line">        return;</div><div class="line">    &#125;</div><div class="line"></div><div class="line">    Node parent = root; // keep track of parent</div><div class="line">    Node curr = root;</div><div class="line">    while (true) &#123;</div><div class="line">        // no duplicate keys allowed</div><div class="line">        // simply keep the existing one here</div><div class="line">        if (curr.key == key) &#123;</div><div class="line">            return;</div><div class="line">        &#125;</div><div class="line"></div><div class="line">        parent = curr; // update parent</div><div class="line">        if (curr.key &lt; key) 
&#123;</div><div class="line">            // go right</div><div class="line">            curr = curr.right;</div><div class="line">            if (curr == null) &#123;</div><div class="line">                // found a spot</div><div class="line">                parent.right = newNode;</div><div class="line">                return;</div><div class="line">            &#125;</div><div class="line">        &#125; else &#123;</div><div class="line">            // go left</div><div class="line">            curr = curr.left;</div><div class="line">            if (curr == null) &#123;</div><div class="line">                // found a spot</div><div class="line">                parent.left = newNode;</div><div class="line">                return;</div><div class="line">            &#125;</div><div class="line">        &#125; // end of if-else to go right or left</div><div class="line">    &#125; // end of while</div><div class="line">&#125; // end of insert method</div></pre></td></tr></table></figure></p>
</li>
<li>
<p>Delete
<figure class="highlight plain"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div><div class="line">4</div><div class="line">5</div><div class="line">6</div><div class="line">7</div><div class="line">8</div><div class="line">9</div><div class="line">10</div><div class="line">11</div><div class="line">12</div><div class="line">13</div><div class="line">14</div><div class="line">15</div><div class="line">16</div><div class="line">17</div><div class="line">18</div><div class="line">19</div><div class="line">20</div><div class="line">21</div><div class="line">22</div><div class="line">23</div><div class="line">24</div><div class="line">25</div><div class="line">26</div><div class="line">27</div><div class="line">28</div><div class="line">29</div><div class="line">30</div><div class="line">31</div><div class="line">32</div><div class="line">33</div><div class="line">34</div><div class="line">35</div><div class="line">36</div><div class="line">37</div><div class="line">38</div><div class="line">39</div><div class="line">40</div><div class="line">41</div><div class="line">42</div><div class="line">43</div><div class="line">44</div><div class="line">45</div><div class="line">46</div><div class="line">47</div><div class="line">48</div><div class="line">49</div><div class="line">50</div><div class="line">51</div><div class="line">52</div><div class="line">53</div><div class="line">54</div><div class="line">55</div><div class="line">56</div><div class="line">57</div><div class="line">58</div><div class="line">59</div><div class="line">60</div><div class="line">61</div><div class="line">62</div><div class="line">63</div><div class="line">64</div><div class="line">65</div><div class="line">66</div><div class="line">67</div><div class="line">68</div><div class="line">69</div><div class="line">70</div><div class="line">71</div><div class="line">72</div><div class="line">73</div><div class="line">74</div></pre></td><td 
class="code"><pre><div class="line">public void delete(int key) &#123;</div><div class="line">    // empty tree</div><div class="line">    if (root == null) &#123;</div><div class="line">        return;</div><div class="line">    &#125;</div><div class="line"></div><div class="line">    Node parent = root;</div><div class="line">    Node curr = root;</div><div class="line">    /*</div><div class="line">     * flag to check left child</div><div class="line">     *</div><div class="line">     * need this flag because actual deletion process happens after the</div><div class="line">     * while loop that is to find the key to delete</div><div class="line">     */</div><div class="line">    boolean isLeftChild = true;</div><div class="line"></div><div class="line">    while (curr.key != key) &#123;</div><div class="line">        parent = curr; // update parent first</div><div class="line">        if (curr.key &lt; key) &#123; // go right</div><div class="line">            isLeftChild = false;</div><div class="line">            curr = curr.right;</div><div class="line">        &#125; else &#123; // go left</div><div class="line">            isLeftChild = true;</div><div class="line">            curr = curr.left;</div><div class="line">        &#125;</div><div class="line"></div><div class="line">        // case 1: not found</div><div class="line">        if (curr == null) &#123;</div><div class="line">            return;</div><div class="line">        &#125;</div><div class="line">    &#125;</div><div class="line"></div><div class="line">    if (curr.left == null &amp;&amp; curr.right == null) &#123;</div><div class="line">        // case 2: leaf</div><div class="line">        if (curr == root) &#123;</div><div class="line">            root = null;</div><div class="line">        &#125; else if (isLeftChild) &#123;</div><div class="line">            parent.left = null;</div><div class="line">        &#125; else &#123;</div><div class="line">            parent.right = 
null;</div><div class="line">        &#125;</div><div class="line">    &#125; else if (curr.right == null) &#123;</div><div class="line">        // case 3: no right child</div><div class="line">        if (curr == root) &#123;</div><div class="line">            root = curr.left;</div><div class="line">        &#125; else if (isLeftChild) &#123;</div><div class="line">            parent.left = curr.left;</div><div class="line">        &#125; else &#123;</div><div class="line">            parent.right = curr.left;</div><div class="line">        &#125;</div><div class="line">    &#125; else if (curr.left == null) &#123;</div><div class="line">        // case 3: no left child</div><div class="line">        if (curr == root) &#123;</div><div class="line">            root = curr.right;</div><div class="line">        &#125; else if (isLeftChild) &#123;</div><div class="line">            parent.left = curr.right;</div><div class="line">        &#125; else &#123;</div><div class="line">            parent.right = curr.right;</div><div class="line">        &#125;</div><div class="line">    &#125; else &#123;</div><div class="line">        // case 4: with two children</div><div class="line">        // here we use successor but using predecessor is also an option</div><div class="line">        Node successor = getSuccessor(curr);</div><div class="line"></div><div class="line">        if(curr == root) &#123;</div><div class="line">            root = successor;</div><div class="line">        &#125; else if(isLeftChild) &#123;</div><div class="line">            parent.left = successor;</div><div class="line">        &#125; else &#123;</div><div class="line">            parent.right = successor;</div><div class="line">        &#125;</div><div class="line">        successor.left = curr.left;</div><div class="line">    &#125;</div><div class="line">&#125;</div></pre></td></tr></table></figure></p>
</li>
<li>
<p>Find the successor node:
<figure class="highlight plain"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div><div class="line">4</div><div class="line">5</div><div class="line">6</div><div class="line">7</div><div class="line">8</div><div class="line">9</div><div class="line">10</div><div class="line">11</div><div class="line">12</div><div class="line">13</div><div class="line">14</div><div class="line">15</div><div class="line">16</div><div class="line">17</div><div class="line">18</div><div class="line">19</div><div class="line">20</div><div class="line">21</div><div class="line">22</div><div class="line">23</div><div class="line">24</div><div class="line">25</div><div class="line">26</div><div class="line">27</div><div class="line">28</div><div class="line">29</div><div class="line">30</div><div class="line">31</div><div class="line">32</div></pre></td><td class="code"><pre><div class="line">/**</div><div class="line"> * Helper method to find the successor of the toDelete node.</div><div class="line"> * This tries to find the smallest value of the right subtree</div><div class="line"> * of the toDelete node by going down to the left most node in the subtree</div><div class="line"> * @param toDelete node to delete</div><div class="line"> * @return the successor of the toDelete node</div><div class="line"> */</div><div class="line">private Node getSuccessor(Node toDelete) &#123;</div><div class="line">    Node successorParent = toDelete;</div><div class="line">    Node successor = toDelete;</div><div class="line">    // start the search from the root of the right subtree</div><div class="line">    Node curr = toDelete.right;</div><div class="line"></div><div class="line">    // move down to left as far as possible in the right subtree</div><div class="line">    // successor&apos;s left child must be null</div><div class="line">    while (curr != null) &#123;</div><div class="line">        successorParent = successor;</div><div 
class="line">        successor = curr;</div><div class="line">        curr = curr.left;</div><div class="line">    &#125;</div><div class="line"></div><div class="line">    /*</div><div class="line">     * If successor is NOT the right child of the node to delete, then</div><div class="line">     * need to take care of two connections in the right subtree</div><div class="line">     */</div><div class="line">    if (successor != toDelete.right) &#123;</div><div class="line">        successorParent.left = successor.right;</div><div class="line">        successor.right = toDelete.right;</div><div class="line">    &#125;</div><div class="line"></div><div class="line">    return successor;</div><div class="line">&#125;</div></pre></td></tr></table></figure></p>
</li>
<li>
<p>Traverse Binary Tree:
<figure class="highlight plain"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div><div class="line">4</div><div class="line">5</div><div class="line">6</div><div class="line">7</div><div class="line">8</div><div class="line">9</div><div class="line">10</div><div class="line">11</div><div class="line">12</div></pre></td><td class="code"><pre><div class="line">public void traverse() &#123;</div><div class="line">    inOrderHelper(root);</div><div class="line">    System.out.println();</div><div class="line">&#125;</div><div class="line"></div><div class="line">private void inOrderHelper(Node toVisit) &#123;</div><div class="line">    if(toVisit != null) &#123;</div><div class="line">        inOrderHelper(toVisit.left);</div><div class="line">        System.out.print(toVisit);</div><div class="line">        inOrderHelper(toVisit.right);</div><div class="line">    &#125;</div><div class="line">&#125;</div></pre></td></tr></table></figure></p>
</li>
</ol>
<hr>
<p>Reference: @Terry Lee</p>
]]></content>
    
    <summary type="html">
    
      &lt;script src=&quot;/assets/js/DPlayer.min.js&quot;&gt; &lt;/script&gt;&lt;p&gt;This post summarizes the commonly used facts about the Tree data structure; for now it only covers the Binary Tree.&lt;/p&gt;
&lt;h2 id=&quot;binary-tree二叉树&quot;&gt;Binary Tree&lt;/h2&gt;
&lt;h
    
    </summary>
    
      <category term="Data Structure" scheme="http:starllap.space/categories/Data-Structure/"/>
    
    
      <category term="Data Structure" scheme="http:starllap.space/tags/Data-Structure/"/>
    
      <category term="Tree" scheme="http:starllap.space/tags/Tree/"/>
    
  </entry>
  
  <entry>
    <title>Cloud Computing Project: Twitter Big Data Analytics</title>
    <link href="http:starllap.space/2017/02/27/%E4%BA%91%E8%AE%A1%E7%AE%97Project%EF%BC%9ATwitter%E5%A4%A7%E6%95%B0%E6%8D%AE%E5%88%86%E6%9E%90/"/>
    <id>http:starllap.space/2017/02/27/云计算Project：Twitter大数据分析/</id>
    <published>2017-02-28T04:37:26.000Z</published>
    <updated>2017-03-30T01:48:05.000Z</updated>
    
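    <content type="html"><![CDATA[<script src="/assets/js/DPlayer.min.js"> </script><p>This post introduces and reviews the Twitter Analytics on the Cloud project. It was a team assignment that we finished in a hurry, and looking back there is a lot that could be optimized. Thanks to my teammates @shuangshuang and @烟酱.</p>
<h2 id="项目介绍">Project Overview</h2>
<p>Goals:</p>
<ul>
<li>Build a high-performance, reliable web service on the cloud.</li>
<li>Design, develop, deploy, and optimize servers that can handle a heavy load of tens of thousands of requests per second.</li>
<li>Run ETL on a 1 TB dataset and load the result into MySQL and HBase.</li>
<li>Design the MySQL and HBase schemas and tune their configuration to improve performance.</li>
<li>Explore ways to find the bottlenecks of a cloud-based web service and improve its performance.</li>
</ul>
<h2 id="基本结构">Architecture:</h2>
<p>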
<img src="http://i.imgur.com/qMn31B4.png" alt=""></p>
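<h3 id="前端">Front end:</h3>
<ul>
<li>The web service is accessed through HTTP GET requests; each query type has its own path and its own parameters.</li>
<li>The service must return the correct responses and keep running normally through tests that last several hours.</li>
<li>The web service must not reject requests and has to withstand heavy load.</li>
</ul>
<h3 id="后端">Back end:</h3>
<ul>
<li>Store the data files used to answer queries.</li>
<li>Compare SQL (MySQL) with NoSQL (HBase).</li>
<li>Compare how different query types perform on different datasets to decide how to implement the back end.</li>
</ul>
<h3 id="数据集">Dataset:</h3>
<p>The Twitter dataset, larger than 1 TB, stored as JSON.</p>
<h2 id="项目实战">Hands-on Work</h2>
<h3 id="搭建前端">Building the front end:</h3>
<p>Before building the front end, choose the framework carefully. After comparing mainstream web frameworks with the help of the <a href="https://www.techempower.com/benchmarks/" target="_blank" rel="external">Techempower</a> benchmarks, we ended up developing with Vert.x and Undertow.
Some good setup guides:</p>
<p>Vertx:</p>
<p><a href="http://vertx.io/docs/" target="_blank" rel="external">Vert.x documentation</a>
<a href="http://vertx.io/blog/my-first-vert-x-3-application/index.html" target="_blank" rel="external">My first Vert.x 3 Application</a></p>
<h4 id="前端优化">Front-end optimization:</h4>
<ul>
<li>Use a cache: on every request, first check whether the result is already cached. When the cache is full, evict the least-used entry.</li>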
</ul>
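<p>The cache described above can be sketched roughly as follows. This is only an illustration, not the project's actual code: it assumes that "least used" means least recently used (LRU), and the class name <code>LruCacheSketch</code> and the key format are made up for the example.</p>

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not from the project: a tiny in-memory LRU cache.
class LruCacheSketch<K, V> {
    private final Map<K, V> entries;

    LruCacheSketch(final int capacity) {
        // accessOrder = true keeps the least recently used entry first.
        this.entries = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                // Called after every put; evict once the cache is over capacity.
                return size() > capacity;
            }
        };
    }

    // Returns the cached value, or null on a cache miss.
    synchronized V get(K key) {
        return entries.get(key);
    }

    synchronized void put(K key, V value) {
        entries.put(key, value);
    }

    synchronized int entryCount() {
        return entries.size();
    }

    public static void main(String[] args) {
        LruCacheSketch<String, String> cache = new LruCacheSketch<>(2);
        cache.put("uid1#tagA", "response A");
        cache.put("uid2#tagB", "response B");
        cache.get("uid1#tagA");               // touch A so B becomes the eldest
        cache.put("uid3#tagC", "response C"); // over capacity: B is evicted
        System.out.println(cache.get("uid2#tagB")); // cache miss
        System.out.println(cache.get("uid1#tagA")); // cache hit
    }
}
```

<p>A request handler would call <code>get</code> first and only query the database on a miss, then <code>put</code> the result. The methods are synchronized because a web server usually serves requests from several threads.</p>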
<h3 id="etl">ETL:</h3>
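<p>Once the database schema has been designed around the requests, the ETL needs careful design too. We used EMR to load the Twitter dataset into the data store; each run takes 10-20 hours and EMR is expensive, so it is best to avoid reruns. Start by testing on a small dataset.</p>
<p>In this phase we handle two types of requests that fetch data from the storage systems. The web service has to connect to two different storage back ends (MySQL and HBase), and the front end has to accept HTTP GET requests on port 80.</p>
<h4 id="操作过程">Procedure:</h4>
<p>The main work here is writing a Map program and a Reduce program to process the data. The raw data is JSON, and we need to turn it into the format the queries require:</p>
<p>Request format
userid+hashtag
<code>GET /q2?userid=uid&amp;hashtag=hashtag</code></p>
<p>Response format (if matching tweets exist)</p>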
<ul>
<li>the tweet's sentiment density</li>
<li>the tweet's posting time</li>
<li>the tweet id</li>
<li>the censored tweet text; many things can go wrong here, such as emoji, backslashes, and characters from other languages</li>
</ul>
<p><code>TEAMID,TEAM_AWS_ACCOUNT_ID\n Sentiment_density1:Tweet_time1:Tweet_id1:Cencored_text1\n Sentiment_density2:Tweet_time2:Tweet_id2:Cencored_text2\n Sentiment_density3:Tweet_time3:Tweet_id3:Cencored_text3\n</code></p>
<p>Response format (if no matching tweet exists)
<code>TEAMID,TEAM_AWS_ACCOUNT_ID\n \n</code></p>
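<p>As a rough sketch of how the two response formats above could be assembled (not the project's actual code): the class <code>Q2ResponseSketch</code>, the <code>TweetRow</code> type, and the sample values are hypothetical, and I am assuming the spaces after each <code>\n</code> in the format strings are just layout, so the sketch emits none.</p>

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical formatter for the q2 response; names are illustrative only.
class Q2ResponseSketch {
    // Placeholders: the real values are the team's id and AWS account id.
    static final String TEAM_ID = "TEAMID";
    static final String TEAM_AWS_ACCOUNT_ID = "TEAM_AWS_ACCOUNT_ID";

    // One matching tweet; every field is kept as text exactly as stored.
    static final class TweetRow {
        final String sentimentDensity;
        final String tweetTime;
        final String tweetId;
        final String censoredText;

        TweetRow(String sentimentDensity, String tweetTime, String tweetId, String censoredText) {
            this.sentimentDensity = sentimentDensity;
            this.tweetTime = tweetTime;
            this.tweetId = tweetId;
            this.censoredText = censoredText;
        }
    }

    static String format(List<TweetRow> rows) {
        StringBuilder sb = new StringBuilder();
        // Header line is always present.
        sb.append(TEAM_ID).append(',').append(TEAM_AWS_ACCOUNT_ID).append('\n');
        if (rows.isEmpty()) {
            // No matching tweet: header followed by an empty line.
            return sb.append('\n').toString();
        }
        for (TweetRow row : rows) {
            sb.append(row.sentimentDensity).append(':')
              .append(row.tweetTime).append(':')
              .append(row.tweetId).append(':')
              .append(row.censoredText).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<TweetRow> rows = new ArrayList<>();
        // Sample values are made up for the demo.
        rows.add(new TweetRow("0.125", "2014-05-01+12:00:00", "123", "hello ****"));
        System.out.print(format(rows));
    }
}
```
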
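<p>After the map and reduce programs are written, run them on EMR. Keep in mind:</p>
<ul>
<li>Test on a small dataset first.</li>
<li>Pay attention to all the small details.</li>
<li>As for the EMR steps themselves, I plan to write up the earlier EMR project from the cloud computing course when I have time.</li>
</ul>
<h3 id="query-文本清理和分析">Query: text cleaning and analysis</h3>
<p>Target throughput: 10000 rps.
Off-the-shelf caching products are not allowed, but you may write your own cache.
The query asks for the tweets that a given user posted with a given hashtag; it mainly tests how to design an efficient back end that handles a large volume of requests.</p>
<h3 id="后端数据库">Back-end databases</h3>
<p>When the ETL is done, the data has to be loaded into the databases. At this point we went back and forth between replication and sharding.
Replication stores a complete copy of the database on every machine, while sharding splits the data into parts and stores each part on a different machine. In the end we chose sharding.</p>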
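<p>To make the sharding idea concrete, here is a minimal sketch of routing a request to a shard. It assumes the data is partitioned by user id with a simple modulo rule; the post does not say which rule we actually used, and <code>ShardRouterSketch</code> is a made-up name.</p>

```java
// Hypothetical router: decides which machine holds a given user's rows.
class ShardRouterSketch {
    private final int shardCount;

    ShardRouterSketch(int shardCount) {
        if (shardCount <= 0) {
            throw new IllegalArgumentException("shardCount must be positive");
        }
        this.shardCount = shardCount;
    }

    // Sharding: each user id lives on exactly one machine.
    int shardFor(long userId) {
        // floorMod keeps the result in [0, shardCount) even for negative ids.
        return (int) Math.floorMod(userId, (long) shardCount);
    }

    public static void main(String[] args) {
        ShardRouterSketch router = new ShardRouterSketch(3);
        for (long userId = 100; userId < 106; userId++) {
            System.out.println("user " + userId + " -> shard " + router.shardFor(userId));
        }
    }
}
```

<p>With replication no such routing is needed, because any machine can answer any query; the price is that every machine has to store the whole dataset.</p>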
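<h4 id="数据库设计">Database design:</h4>
<p>We designed the MySQL and HBase schemas around the request and response formats described above:</p>
<h5 id="mysql">MySQL:</h5>
<h6 id="设计模式">Schema design:</h6>
<p>(This follows the winning design from Yuki's team, which is very simple and blunt.)
Our original schema had a clear, separate column for each field, but it produced far more rows than the later design, which made reads much slower.
So the new schema looks rows up by id only, reads all of the matching tweets in one go, and leaves the parsing to the front end.</p>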
<p><img src="http://i.imgur.com/Zmk6QUp.png" alt=""></p>
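<h6 id="优化方法">Optimizations:</h6>
<ul>
<li>Build indexes.</li>
<li>MySQL has two commonly used storage engines, MyISAM and InnoDB. MyISAM suits read-heavy workloads but is not write-friendly, because an update locks the whole table, whereas InnoDB uses row-level locking.
Raising key_buffer_size and query_cache_size increases the buffer capacity.</li>
<li>Declare every column NOT NULL so that MySQL does not have to reserve space for and check NULL values, which speeds up reads.</li>
</ul>
<h5 id="hbase">HBase:</h5>
<p>Because HBase is a key-value store, the only thing to decide here is what goes into the key; all the remaining data can simply go into a column family.
We use tweet_id + user_id + hashtag as the rowkey.</p>
<h6 id="优化方法摘自小土刀博客">Optimizations (taken from 小土刀's blog):</h6>
<p>1. Allocate an appropriate amount of memory to the RegionServer:
For example, append <code>export HBASE_REGIONSERVER_OPTS="-Xmx16000m $HBASE_REGIONSERVER_OPTS"</code> to the end of hbase-env.sh in HBase's conf directory,
where 16000m is the amount of memory given to the RegionServer.</p>
<p>2. Number of RegionServer request-handling IO threads:
Fewer IO threads suit Big Put scenarios, where a single request consumes a lot of memory (a large single Put, or a Scan with a large cache setting, both count as Big Put), or cases where RegionServer memory is tight.
More IO threads suit scenarios where each request uses little memory and the required TPS (transactions per second) is very high. When setting this value, use memory monitoring as the main reference.
In hbase-site.xml the setting is hbase.regionserver.handler.count, for example 200.</p>
<p>3. Tune the block cache:
hfile.block.cache.size: the limit on the RegionServer's block cache, as a fraction of its memory. The default given in the source is 0.25; for read-heavy workloads it can be raised. The exact value depends on the workload the HBase cluster serves and should be weighed together with the share of memory given to the memstore.</p>
<h2 id="总结">Summary:</h2>
<p>The team project was a while ago and I have forgotten many details. Data cleaning in particular has many pitfalls and cannot really be covered in the sentence or two written here. Database tuning is also a road of no return: blind tuning can make things worse. Judging from the winning team's report, tuning made little difference, and a good schema design is the most fundamental way to improve performance.
This project is the essence of the cloud computing course and covers most of what its labs teach: from load balancing to sharding and replication, to SQL and NoSQL databases, to using EMR. Only the parallelism and concurrency part is missing.
Learning is not hard, and doing a project with guidance is not hard either. In real-world applications nobody knows the right answer, and everything comes down to thinking and experience.</p>
<p><strong>References:</strong></p>
<ol>
<li><a href="http://wdxtub.com/" target="_blank" rel="external">小土刀's cloud computing corpus analysis &amp; retrospective session</a></li>
<li>Yuki's report</li>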
</ol>
]]></content>
    
    <summary type="html">
    
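      &lt;script src=&quot;/assets/js/DPlayer.min.js&quot;&gt; &lt;/script&gt;&lt;p&gt;This post introduces and reviews the Twitter Analytics on the Cloud project. It was a team assignment that we finished in a hurry, and looking back there is a lot that could be optimized. Thanks to my teammates @shuangshuan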
    
    </summary>
    
      <category term="Cloud Computing" scheme="http:starllap.space/categories/Cloud-Computing/"/>
    
    
      <category term="Cloud Computing" scheme="http:starllap.space/tags/Cloud-Computing/"/>
    
      <category term="Big Data" scheme="http:starllap.space/tags/Big-Data/"/>
    
      <category term="Twitter" scheme="http:starllap.space/tags/Twitter/"/>
    
      <category term="MySQL" scheme="http:starllap.space/tags/MySQL/"/>
    
      <category term="HBase" scheme="http:starllap.space/tags/HBase/"/>
    
  </entry>
  
  <entry>
    <title>Cloud Computing Project: A Social Networking Timeline with Heterogeneous Backends</title>
    <link href="http:starllap.space/2017/02/23/%E4%BA%91%E8%AE%A1%E7%AE%97Project%EF%BC%9A%E5%9F%BA%E4%BA%8E%E5%A4%9A%E4%B8%AA%E5%90%8E%E7%AB%AF%E7%9A%84%E7%A4%BE%E4%BA%A4%E7%BD%91%E7%BB%9C%E6%97%B6%E9%97%B4%E7%BA%BF%E7%9A%84%E5%AE%9E%E7%8E%B0/"/>
    <id>http:starllap.space/2017/02/23/云计算Project：基于多个后端的社交网络时间线的实现/</id>
    <published>2017-02-23T23:12:14.000Z</published>
    <updated>2017-03-30T01:47:44.000Z</updated>
    
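    <content type="html"><![CDATA[<script src="/assets/js/DPlayer.min.js"> </script><p>This post introduces the CMU15619 Cloud Computing project <code>Social Networking Timeline with Heterogeneous Backends</code>, followed by a summary and some reflections.</p>
<p>Main goals of the project:</p>
<ul>
<li>Explore how to provision, configure, and manage AWS's DBaaS offerings.</li>
<li>Compare MySQL, HBase, and MongoDB when loading data through their Java APIs.</li>
<li>Serve data to a single complex web application from multiple back ends.</li>
<li>Compare the characteristics of the different databases in a real application.</li>
</ul>
<h2 id="背景介绍">Background</h2>
<h3 id="dbaasdatabase-as-a-services">DBaaS (Database as a Service):</h3>
<p>On AWS, we can use the MySQL engine offered by RDS.</p>
<h3 id="mongodb">MongoDB:</h3>
<p>MongoDB is a typical NoSQL database. It is document-oriented and, in the versions available when this was written, had no multi-document transactions or relational table joins, so queries are comparatively easy to write, understand, and optimize. I plan to write a separate summary of NoSQL later (a promise I still owe).
Unlike HBase's key-value model, MongoDB's document model can hold complex data types, and it also supports indexes.
MongoDB stores data as BSON, a binary encoding of JSON-like documents. BSON is designed with three goals in mind:</p>
<ul>
<li>Space efficiency: BSON tries to keep spatial overhead small, although in some cases a BSON document can be larger than the equivalent JSON.</li>
<li>Traversability: BSON is designed to be easy to traverse.</li>
<li>Performance: encoding data to BSON and decoding it back is fast in most programming languages.</li>
</ul>
<h3 id="数据结构-图">Data structure: graphs</h3>
<h4 id="1邻接矩阵adjacent-matrix空间复杂度为on2">1. Adjacency matrix: space complexity O(n^2)</h4>
<p>For example: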
<img src="http://i.imgur.com/Qyrp0rL.png" alt="">
<img src="http://i.imgur.com/x2yyDEI.png" alt=""></p>
<h4 id="2邻接表adjacent-list-空间较少">2. Adjacency list: uses less space</h4>
<p><img src="http://i.imgur.com/ofeLFxn.png" alt=""></p>
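<p>The two representations can be compared side by side with a small sketch. It assumes vertices are numbered from 0 to n-1 and edges are directed pairs; <code>GraphRepresentationsSketch</code> and its method names are invented for this example.</p>

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo of the two graph representations discussed above.
class GraphRepresentationsSketch {
    // Adjacency matrix: n x n cells, O(n^2) space regardless of edge count.
    static boolean[][] toAdjacencyMatrix(int n, int[][] edges) {
        boolean[][] matrix = new boolean[n][n];
        for (int[] edge : edges) {
            int from = edge[0];
            int to = edge[1];
            matrix[from][to] = true;
        }
        return matrix;
    }

    // Adjacency list: one list per vertex, O(n + e) space for e edges.
    static List<List<Integer>> toAdjacencyList(int n, int[][] edges) {
        List<List<Integer>> lists = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            lists.add(new ArrayList<Integer>());
        }
        for (int[] edge : edges) {
            int from = edge[0];
            int to = edge[1];
            lists.get(from).add(to);
        }
        return lists;
    }

    public static void main(String[] args) {
        int[][] edges = {{0, 1}, {0, 2}, {2, 1}};
        boolean[][] matrix = toAdjacencyMatrix(3, edges);
        List<List<Integer>> lists = toAdjacencyList(3, edges);
        System.out.println("matrix says 0 -> 1: " + matrix[0][1]);
        System.out.println("list for vertex 0: " + lists.get(0));
    }
}
```

<p>For a sparse graph such as a follow graph, where most users follow only a tiny fraction of all users, the adjacency list stores far fewer entries than the matrix.</p>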
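<h3 id="社交网络应用基础">Social network application basics:</h3>
<p>Today, services such as Facebook, Twitter, and Instagram need complex, well-designed back ends to handle many types of user data and to deliver sustained high-performance, low-latency service. They also run real-time analytics to provide valuable information to the company and its advertisers.</p>
<ul>
<li>Different data types (video, text, links, etc.) need to be stored in different databases.</li>
<li>A single HTTP request that renders a social network page triggers a whole series of back-end requests and database operations, as the figure below shows: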
<img src="http://i.imgur.com/r3gf00x.png" alt=""></li>
</ul>
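<p>Data in a social network usually falls into three groups:</p>
<ul>
<li>User information:
<ul>
<li>Authentication system</li>
<li>User info / profile</li>
<li>Activity logs</li>
<li>Social graph (covered in more detail below)</li>
</ul>
</li>
<li>User activity:
<ul>
<li>User-generated multimedia data</li>
</ul>
</li>
<li>Big data analytics systems:
<ul>
<li>Search system</li>
<li>Recommender system</li>
<li>User behavior analysis (OLAP on a cloud data warehouse; I may cover this separately when I get the chance)</li>
</ul>
</li>
</ul>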
<p><img src="http://i.imgur.com/Qwd8z0q.png" alt=""></p>
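<p>The social network's front end is already provided. We need to load four different datasets into three databases (MySQL, HBase, MongoDB), and the finished back end must be able to answer four different kinds of requests.</p>
<h2 id="项目操作">Implementation</h2>
<h3 id="通过rds的mysql实现基本登录">Basic login with MySQL on RDS:</h3>
<p>Set up MySQL on AWS RDS and import the users.csv and userinfo.csv datasets.
Note when connecting to MySQL on AWS RDS:
to import data over a remote connection, add --local-infile so that the client is allowed to load local files. With a bare -p the client prompts for the password.
<code>mysql -u username -p -h hostname --port=port --local-infile database</code></p>
<p>Dataset formats:</p>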
<ul>
<li>users.csv [UserID, Password]</li>
<li>userinfo.csv [UserID, Name, Profile Image URL]</li>
</ul>
<p>MySQL import statement:
<code>LOAD DATA LOCAL INFILE 'filename' INTO TABLE tablename CHARACTER SET utf8mb4 FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n';</code></p>
<p>Request format:
<code>GET /task1?id=[UserID]&amp;pwd=[Password]</code></p>
<p>Response format:
<code>returnRes({&quot;name&quot;:&quot;my_name&quot;, &quot;profile&quot;:&quot;profile_image_url&quot;})</code></p>
<p>After that, connect to the database from the Java code and build the matching JSON response.
Testing:</p>
<ol>
<li>Start the front-end and back-end servers and visit <code>http://&lt;your_front_end_dns&gt;:3000</code></li>
<li>Log in with both correct and incorrect credentials to test.</li>
</ol>
<p><img src="http://i.imgur.com/rv5LnYr.png" alt=""></p>
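<h3 id="利用hbase存储社交图谱">Storing the social graph in HBase:</h3>
<p>HBase stores the follow relationships between users. Pick one of the two graph representations introduced earlier, the adjacency matrix or the adjacency list, to store the data.
Raw data format: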
<code>&lt;followee, follower&gt;</code></p>
<p>Request format:
<code>GET /task2?id=[UserID]</code></p>
<p>Response format:
<code>{&quot;followers&quot;:[{&quot;name&quot;:&quot;follower_name_1&quot;, &quot;profile&quot;:&quot;profile_image_url_1&quot;}, {&quot;name&quot;:&quot;follower_name_2&quot;, &quot;profile&quot;:&quot;profile_image_url_2&quot;}, ...]}</code></p>
<h4 id="思路">Approach:</h4>
<ul>
<li>Store rows in HBase in the form followee: follower1, follower2, ...</li>
<li>Load the data once the HBase schema is designed.</li>
<li>Start the front-end and back-end servers, then visit http://&lt;your_front_end_dns&gt;:3000</li>
<li>Enter a userid to test.</li>
</ul>
<h3 id="用mongdb搭建主页">Building the home page with MongoDB:</h3>
<p>As described earlier, MongoDB is a good fit for storing posts that come in many shapes. The queries here filter on a few specific fields, so indexes on those fields can speed them up.</p>
<p>Format of the post data:
<figure class="highlight plain"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div><div class="line">4</div><div class="line">5</div><div class="line">6</div><div class="line">7</div><div class="line">8</div><div class="line">9</div><div class="line">10</div><div class="line">11</div><div class="line">12</div><div class="line">13</div><div class="line">14</div><div class="line">15</div><div class="line">16</div><div class="line">17</div><div class="line">18</div><div class="line">19</div><div class="line">20</div><div class="line">21</div><div class="line">22</div><div class="line">23</div></pre></td><td class="code"><pre><div class="line">&#123;</div><div class="line">    &quot;pid&quot;:xxx,                                      // PostID</div><div class="line">    &quot;uid&quot;:xxx,                                      // UserID of poster</div><div class="line">    &quot;name&quot;:&quot;xxx&quot;,                                   // User name of poster</div><div class="line">    &quot;profile&quot;:&quot;xxx&quot;,                                // Poster profile image URL</div><div class="line">    &quot;timestamp&quot;:&quot;YYYY-MM-DD HH:MM:SS&quot;,              // When post is posted</div><div class="line">    &quot;image&quot;:&quot;xxx&quot;,                                  // Post image</div><div class="line">    &quot;content&quot;:&quot;xxx&quot;,                                // Post text content</div><div class="line">    &quot;comments&quot;:[                                    // comments json array</div><div class="line">        &#123;</div><div class="line">            &quot;uid&quot;:xxx,                              // UserID of commenter</div><div class="line">            &quot;name&quot;:&quot;xxx&quot;,                           // User name of commenter</div><div class="line">            &quot;profile&quot;:&quot;xxx&quot;,                        // Commenter profile image 
URL</div><div class="line">            &quot;timestamp&quot;:&quot;YYYY-MM-DD HH:MM:SS&quot;,      // When comment is made</div><div class="line">            &quot;content&quot;:&quot;xxx&quot;                         // Comment text content</div><div class="line">        &#125;,</div><div class="line">        &#123;</div><div class="line">            &quot;uid&quot;:xxx,</div><div class="line">            .......</div><div class="line">        &#125;,</div><div class="line">        ......</div><div class="line">    ]</div><div class="line">&#125;</div></pre></td></tr></table></figure></p>
<p>For creating indexes in MongoDB, see the <a href="https://docs.mongodb.com/manual/reference/method/db.collection.createIndex/" target="_blank" rel="external">createIndex reference</a>.</p>
<p>Request format:
<code>GET /task3?id=[UserID]</code></p>
<p>Response format:
<code>{&quot;posts&quot;:[{post1_json}, {post2_json}, ...]}</code></p>
<p>How to test:</p>
<ul>
<li>Start the front-end and back-end servers and enter a userid.</li>
</ul>
<p><img src="http://i.imgur.com/8p2pep3.png" alt=""></p>
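<h3 id="最终整合">Putting it all together</h3>
<p>The previous three parts each set up storage in one database. Now a single userid should return the user's information (MySQL), the user's follower list (HBase), and the 30 most recent posts from the people the user follows (MongoDB).</p>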
<p>Sorting rules:</p>
<ul>
<li>Sorting the followers:
<ul>
<li>by name, ascending</li>
<li>then by profile image URL, ascending</li>
</ul>
</li>
<li>Sorting the 30 most recent posts:
<ul>
<li>by timestamp, ascending</li>
<li>then by PostID, ascending</li>
</ul>
</li>
</ul>
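<p>The sorting rules above map naturally onto chained comparators. The sketch below is illustrative only: the <code>Follower</code> and <code>Post</code> classes are hypothetical stand-ins for whatever the back end actually holds, the second key is treated as a tie-breaker, and timestamps are assumed to be <code>YYYY-MM-DD HH:MM:SS</code> strings, which sort chronologically when compared as text.</p>

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch of the two sort orders; not the project's real classes.
class TimelineSortSketch {
    static final class Follower {
        final String name;
        final String profile;

        Follower(String name, String profile) {
            this.name = name;
            this.profile = profile;
        }
    }

    static final class Post {
        final long pid;
        final String timestamp; // assumed format: YYYY-MM-DD HH:MM:SS

        Post(long pid, String timestamp) {
            this.pid = pid;
            this.timestamp = timestamp;
        }
    }

    // Name ascending, ties broken by profile image URL ascending.
    static final Comparator<Follower> FOLLOWER_ORDER =
            Comparator.comparing((Follower f) -> f.name).thenComparing(f -> f.profile);

    // Timestamp ascending, ties broken by PostID ascending.
    static final Comparator<Post> POST_ORDER =
            Comparator.comparing((Post p) -> p.timestamp).thenComparingLong(p -> p.pid);

    static List<Follower> sortFollowers(List<Follower> followers) {
        List<Follower> copy = new ArrayList<>(followers);
        copy.sort(FOLLOWER_ORDER);
        return copy;
    }

    static List<Post> sortPosts(List<Post> posts) {
        List<Post> copy = new ArrayList<>(posts);
        copy.sort(POST_ORDER);
        return copy;
    }

    public static void main(String[] args) {
        List<Post> posts = new ArrayList<>();
        posts.add(new Post(7, "2017-02-01 10:00:00"));
        posts.add(new Post(3, "2017-02-01 10:00:00"));
        posts.add(new Post(9, "2017-01-15 08:30:00"));
        for (Post p : sortPosts(posts)) {
            System.out.println(p.timestamp + " pid=" + p.pid);
        }
    }
}
```
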
<p>Request format:
<code>GET /task4?id=[UserID]</code></p>
<p>Response format:
<code>{&quot;name&quot;:&quot;my_name&quot;, &quot;profile&quot;:&quot;my_profile_image_url&quot;, &quot;followers&quot;:[{&quot;name&quot;:&quot;follower_name_1&quot;, &quot;profile&quot;:&quot;profile_image_url_1&quot;}, {&quot;name&quot;:&quot;follower_name_2&quot;, &quot;profile&quot;:&quot;profile_image_url_2&quot;}, ...], &quot;posts&quot;:[{post1_json}, {post2_json}, ...]}</code></p>
<p><img src="http://i.imgur.com/EpRB33c.png" alt=""></p>
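<h3 id="简单推荐的实现">A simple recommender</h3>
<p>Recommender systems are too large a topic to cover here; see <a href="http://www.shuang0420.com/categories/Recommender-System/" target="_blank" rel="external">shuang's blog</a> (a shameless plug).
This time we build a simple recommender with a collaborative filtering approach that uses "friends of friends" to suggest people to follow.</p>
<h4 id="graph-distance">Graph Distance:</h4>
<p>For example: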
<figure class="highlight plain"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div><div class="line">4</div></pre></td><td class="code"><pre><div class="line">A follows &#123;B, C, D&#125;</div><div class="line">Followee B follows &#123;C, E, A&#125;</div><div class="line">followee C follows &#123;F, G&#125;</div><div class="line">followee D follows &#123;G, H&#125;</div></pre></td></tr></table></figure></p>
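<p><img src="http://i.imgur.com/83cuJ8G.png" alt="">
Counting, for each user, how many of A's followees follow them gives:
<code>{A:1, C:1, E:1, F:1, G:2, H:1}</code>
After dropping A itself and C, whom A already follows, what remains is:
<code>{G: 2, E: 1, F: 1, H: 1}</code></p>
<h4 id="思路">Approach:</h4>
<ul>
<li>Find the set of people the given userid follows.</li>
<li>For each person in that set, add the people they follow to a new collection: the first appearance gets a count of 1, and every later appearance adds 1.</li>
<li>Store the candidates in a priority queue; nobody from the first set (the people already followed) should be in this queue.</li>
<li>Return the name and profile URL of the top ten.</li>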
</ul>
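<p>The steps above can be sketched as follows. This is a simplified in-memory version under a few assumptions: the follow graph is a plain <code>Map</code> instead of HBase rows, ties are broken by user id (the post does not say how ties are ordered), and the method returns ids only, so the name and profile lookups would happen afterwards. <code>FriendRecommenderSketch</code> is a made-up name.</p>

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;
import java.util.Set;

// Hypothetical "friends of friends" recommender over an in-memory follow graph.
class FriendRecommenderSketch {
    static List<String> recommend(String user, Map<String, List<String>> follows, int limit) {
        // Step 1: the people this user already follows.
        Set<String> followees =
                new HashSet<>(follows.getOrDefault(user, Collections.<String>emptyList()));

        // Step 2: count how many followees follow each candidate.
        Map<String, Integer> scores = new HashMap<>();
        for (String followee : followees) {
            for (String candidate : follows.getOrDefault(followee, Collections.<String>emptyList())) {
                if (candidate.equals(user) || followees.contains(candidate)) {
                    continue; // skip the user and anyone already followed
                }
                scores.merge(candidate, 1, Integer::sum);
            }
        }

        // Step 3: highest score first; ties by id (an assumption of this sketch).
        Comparator<Map.Entry<String, Integer>> byScoreThenId = (a, b) -> {
            int byScore = Integer.compare(b.getValue(), a.getValue());
            return byScore != 0 ? byScore : a.getKey().compareTo(b.getKey());
        };
        PriorityQueue<Map.Entry<String, Integer>> queue = new PriorityQueue<>(byScoreThenId);
        queue.addAll(scores.entrySet());

        // Step 4: take the top entries.
        List<String> result = new ArrayList<>();
        while (!queue.isEmpty() && result.size() < limit) {
            result.add(queue.poll().getKey());
        }
        return result;
    }

    public static void main(String[] args) {
        // The example graph from the post.
        Map<String, List<String>> follows = new HashMap<>();
        follows.put("A", Arrays.asList("B", "C", "D"));
        follows.put("B", Arrays.asList("C", "E", "A"));
        follows.put("C", Arrays.asList("F", "G"));
        follows.put("D", Arrays.asList("G", "H"));
        System.out.println(recommend("A", follows, 10));
    }
}
```

<p>On the example graph, G is the only candidate followed by two of A's followees, so it ranks first, ahead of E, F, and H.</p>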
<p>Request format:
<code>http://backend-public-dns:8080/MiniSite/task5?id=&lt;user_id&gt;</code></p>
<p>Response format:
<code>returnRes({&quot;recommendation&quot;:[{name:&lt;name1&gt;, profile:&lt;profile1&gt;},{name:&lt;name2&gt;, profile:&lt;profile2&gt;},...,{name:&lt;name10&gt;, profile:&lt;profile10&gt;}]})</code></p>
<p>Done!</p>
<p><strong>Reference:</strong><br>
CMU15619 course material: <code>Social Networking Timeline with Heterogeneous Backends</code>
小土刀's blog: <code>http://wdxtub.com/vault/cc-17.html</code></p>
]]></content>
    
    <summary type="html">
    
      &lt;script src=&quot;/assets/js/DPlayer.min.js&quot;&gt; &lt;/script&gt;&lt;p&gt;This post introduces the CMU15619 Cloud Computing project &lt;code&gt;Social Networking Timeline with Heterogeneous Back
    
    </summary>
    
      <category term="Cloud Computing" scheme="http:starllap.space/categories/Cloud-Computing/"/>
    
    
      <category term="Cloud Computing" scheme="http:starllap.space/tags/Cloud-Computing/"/>
    
      <category term="MySQL" scheme="http:starllap.space/tags/MySQL/"/>
    
      <category term="HBase" scheme="http:starllap.space/tags/HBase/"/>
    
      <category term="Social Network" scheme="http:starllap.space/tags/Social-Network/"/>
    
      <category term="MongoDB" scheme="http:starllap.space/tags/MongoDB/"/>
    
  </entry>
  
  <entry>
    <title>Java Interview Questions</title>
    <link href="http:starllap.space/2017/02/11/Java-Interview-Questions/"/>
    <id>http:starllap.space/2017/02/11/Java-Interview-Questions/</id>
    <published>2017-02-11T17:51:39.000Z</published>
    <updated>2017-06-03T20:01:51.000Z</updated>
    
    <content type="html"><![CDATA[<script src="/assets/js/DPlayer.min.js"> </script><p>Continuously updated...</p>
<h3 id="jvm-is-plantform-dependent">Is the JVM platform dependent?</h3>
<blockquote>
<p>Is the JVM (Java Virtual Machine) platform dependent or platform independent? What is the advantage of using the JVM, and having Java be a translated language?</p>
</blockquote>
<hr>
<p><strong>JVM translates bytecode into machine language</strong>
Every Java program is first compiled into an intermediate language called Java bytecode. The JVM is used primarily for 2 things: the first is to translate the bytecode into the machine language for a particular computer, and the second thing is to actually execute the corresponding machine-language instructions as well. The JVM and bytecode combined give Java its status as a &quot;portable&quot; language – this is because Java bytecode can be transferred from one machine to another.</p>
<p><strong>Machine language is OS dependent</strong>
Since the JVM must translate the bytecode into machine language, and since the machine language depends on the processor and operating system being used, it is clear that the JVM is platform dependent. In other words, the JVM is not platform independent.</p>
<p><strong>The JVM is not platform independent</strong>
The key here is that the JVM depends on the operating system – so if you are running Mac OS X you will have a different JVM than if you are running Windows or some other operating system.</p>
<h3 id="overloading-amp-overriding">Overloading &amp; Overriding</h3>
<blockquote>
<p>In Java, what’s the difference between method overloading and method overriding?</p>
</blockquote>
<hr>
<p><strong>Overloading:</strong>
Method overloading in Java occurs when two or more methods in the same class have <strong>the exact same name but different parameters</strong> (remember that method parameters accept values passed into the method). Method overloading is resolved at compile time.
<strong>Counts as overloading:</strong>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div></pre></td><td class="code"><pre><div class="line">1.) The number of parameters is different for the methods.</div><div class="line">2.) The parameter types are different (like</div><div class="line">changing a parameter that was a float to an int).</div></pre></td></tr></table></figure></p>
<p><strong>Not overloading</strong>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div></pre></td><td class="code"><pre><div class="line">1. Just changing the return type of the method. (Compiler Error)</div><div class="line">2. Changing just the name of the method parameters, but</div><div class="line">not changing the parameter types.</div></pre></td></tr></table></figure></p>
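<p>A small sketch of the rules above; the class and method names are invented for illustration.</p>

```java
// Hypothetical example: three overloads of the same method name.
class OverloadingSketch {
    // Different parameter types: int vs. float.
    static String describe(int value) {
        return "int";
    }

    static String describe(float value) {
        return "float";
    }

    // Different number of parameters.
    static String describe(int first, int second) {
        return "two ints";
    }

    // Not overloading: a method that differs only in return type, such as
    //   static int describe(int value) { ... }
    // would be rejected by the compiler, and so would one that only renames
    // the parameter, such as describe(int other).

    public static void main(String[] args) {
        // The compiler picks the overload from the argument types.
        System.out.println(describe(1));
        System.out.println(describe(1.0f));
        System.out.println(describe(1, 2));
    }
}
```
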
<p><strong>Overriding:</strong>
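[I can never remember this, so my trick is to picture a kid riding on dad's shoulders: the frame is the same and does not change, i.e. the method name, parameters, and return type stay the same, but the body changes. <em>(:зゝ∠)</em>]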
Overriding means that a method inherited from a parent class will be changed. But, when overriding a method everything remains <strong>exactly the same except the method definition – basically what the method does is changed slightly to fit in with the needs of the child class</strong>. But, the method name, the number and types of parameters, and the return type will all remain the same.
Method overriding is a run-time phenomenon that is the driving force behind polymorphism.</p>
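<p>A minimal sketch of overriding and the run-time dispatch it enables; <code>Animal</code> and <code>Dog</code> are made-up classes for the example.</p>

```java
// Hypothetical example: the child keeps the signature and changes the body.
class OverridingSketch {
    static class Animal {
        String speak() {
            return "...";
        }
    }

    static class Dog extends Animal {
        // Same name, same parameters, same return type; only the body differs.
        @Override
        String speak() {
            return "Woof";
        }
    }

    public static void main(String[] args) {
        // The declared type is Animal, but the object is a Dog, so the
        // method that runs is chosen at run time from the actual object.
        Animal pet = new Dog();
        System.out.println(pet.speak());
    }
}
```
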
<h3 id="private-constructor">Private Constructor</h3>
<blockquote>
<p>What’s the point of having a private constructor?</p>
</blockquote>
<hr>
<p>Defining a constructor with the private modifier says that only the native class (as in the class in which the private constructor is defined) is allowed to create an instance of the class, and no other caller is permitted to do so.</p>
<p>There are two possible reasons why one would want to use a private constructor – the first is that <strong>you don’t want any objects of your class to be created at all,</strong> and the second is that <strong>you only want objects to be created internally – as in only created in your class.</strong></p>
<p>A <strong>singleton</strong> is a design pattern that allows only one instance of your class to be created, and this can be accomplished by using a private constructor.</p>
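<p>One common way to write a singleton with a private constructor is eager initialization, sketched below; <code>ConfigRegistrySketch</code> is a made-up class name, and other variants (lazy holders, enums) exist.</p>

```java
// Hypothetical singleton: the private constructor blocks outside instantiation.
final class ConfigRegistrySketch {
    // The single instance, created once when the class is loaded.
    private static final ConfigRegistrySketch INSTANCE = new ConfigRegistrySketch();

    private ConfigRegistrySketch() {
        // Only code inside this class can call the constructor.
    }

    static ConfigRegistrySketch getInstance() {
        return INSTANCE;
    }

    public static void main(String[] args) {
        ConfigRegistrySketch first = ConfigRegistrySketch.getInstance();
        ConfigRegistrySketch second = ConfigRegistrySketch.getInstance();
        // Both variables refer to the same object.
        System.out.println(first == second);
    }
}
```
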
<h3 id="an-object-and-a-class">An object and a class</h3>
<blockquote>
<p>In Java, what’s the difference between an object and a class?</p>
</blockquote>
<p>In short: an object is an instance of a class.
Objects have a lifespan but classes do not.</p>
<h3 id="java-platform">Java platform</h3>
<blockquote>
<p>What is the main difference between Java platform and other platforms?</p>
</blockquote>
<p>The Java platform differs from most other platforms in the sense that it's a software-based platform that runs on top of other hardware-based platforms. It has two components:</p>
<ul>
<li>Runtime Environment</li>
<li>API(Application Programming Interface)</li>
</ul>
<h3 id="write-once-and-run-anywhere">write once and run anywhere</h3>
<blockquote>
<p>What gives Java its 'write once and run anywhere' nature?</p>
</blockquote>
<p>The bytecode. Java source is compiled to bytecode, an intermediate language between source code and machine code. Bytecode is not platform specific, so it can run on any platform that has a JVM.</p>
<h3 id="is-empty-java-file-name-a-valid-source-file-name">Is Empty .java file name a valid source file name?</h3>
<p>Yes. Save your Java file as just .java, compile it with javac .java, and run it with java yourclassname. Here is a simple example:
<figure class="highlight plain"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div><div class="line">4</div><div class="line">5</div><div class="line">6</div><div class="line">7</div><div class="line">8</div></pre></td><td class="code"><pre><div class="line">//save by .java only  </div><div class="line">class A&#123;  </div><div class="line">public static void main(String args[])&#123;  </div><div class="line">System.out.println(&quot;Hello java&quot;);  </div><div class="line">&#125;  </div><div class="line">&#125;  </div><div class="line">//compile by javac .java  </div><div class="line">//run by     java A</div></pre></td></tr></table></figure></p>
<p>compile it by javac .java</p>
<p>run it by java A</p>
<h3 id="if-i-dont-provide-any-arguments-on-the-command-line-then-the-string-array-of-main-method-will-be-empty-or-null">If I don't provide any arguments on the command line, then the String array of Main method will be empty or null?</h3>
<p>It is empty. But not null.</p>
<h3 id="what-if-i-write-static-public-void-instead-of-public-static-void">What if I write static public void instead of public static void?</h3>
<p>Program compiles and runs properly.</p>
<h3 id="what-is-the-default-value-of-the-local-variables">What is the default value of the local variables?</h3>
<p>The local variables are not initialized to any default value, neither primitives nor object references.</p>
<h3 id="what-is-difference-between-object-oriented-programming-language-and-object-based-programming-language">What is difference between object oriented programming language and object based programming language?</h3>
<p>Object based programming languages follow all the features of OOPs except Inheritance. Examples of object based programming languages are JavaScript, VBScript etc.</p>
<h3 id="what-will-be-the-initial-value-of-an-object-reference-which-is-defined-as-an-instance-variable">What will be the initial value of an object reference which is defined as an instance variable?</h3>
<p>Object references declared as instance variables are initialized to null in Java.</p>
<h3 id="constructor-in-java">Constructor in Java</h3>
<p>Constructor in java is a special type of method that is used to initialize the object.</p>
<p>Java constructor is invoked at the time of object creation. It constructs the values i.e. provides data for the object that is why it is known as constructor.</p>
<p>Rules for creating a Java constructor</p>
<p>There are basically two rules defined for the constructor:</p>
<ul>
<li>The constructor name must be the same as its class name.</li>
<li>The constructor must have no explicit return type.</li>
</ul>
<p>Types of Java constructors</p>
<p>There are two types of constructors:</p>
<ul>
<li>Default constructor (no-arg constructor)</li>
<li>Parameterized constructor</li>
</ul>
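<p>Both constructor types, and both rules, in one small sketch; <code>PointSketch</code> is a made-up class for illustration.</p>

```java
// Hypothetical class showing a no-arg and a parameterized constructor.
class PointSketch {
    final int x;
    final int y;

    // No-arg constructor: same name as the class, no return type.
    PointSketch() {
        this(0, 0); // delegate to the parameterized constructor
    }

    // Parameterized constructor: the caller supplies the initial state.
    PointSketch(int x, int y) {
        this.x = x;
        this.y = y;
    }

    public static void main(String[] args) {
        PointSketch origin = new PointSketch();
        PointSketch point = new PointSketch(2, 3);
        System.out.println(origin.x + "," + origin.y);
        System.out.println(point.x + "," + point.y);
    }
}
```
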
<hr>
<p><strong>References:</strong>
http://www.programmerinterview.com/
https://www.javatpoint.com/constructor</p>
]]></content>
    
    <summary type="html">
    
      &lt;script src=&quot;/assets/js/DPlayer.min.js&quot;&gt; &lt;/script&gt;&lt;p&gt;Continuously updated...&lt;/p&gt;
&lt;h3 id=&quot;jvm-is-plantform-dependent&quot;&gt;Is the JVM platform dependent?&lt;/h
    
    </summary>
    
      <category term="Interview" scheme="http:starllap.space/categories/Interview/"/>
    
    
      <category term="Interview" scheme="http:starllap.space/tags/Interview/"/>
    
      <category term="Java" scheme="http:starllap.space/tags/Java/"/>
    
  </entry>
  
  <entry>
    <title>Leetcode 271. Encode and Decode Strings</title>
    <link href="http:starllap.space/2017/02/10/271-Encode-and-Decode-Strings/"/>
    <id>http:starllap.space/2017/02/10/271-Encode-and-Decode-Strings/</id>
    <published>2017-02-11T07:52:01.000Z</published>
    <updated>2017-02-11T05:50:31.000Z</updated>
    
    <content type="html"><![CDATA[<script src="/assets/js/DPlayer.min.js"> </script><h3 id="question">Question:</h3>
<p>Design an algorithm to encode a list of strings to a string. The encoded string is then sent over the network and is decoded back to the original list of strings.</p>
<p>Machine 1 (sender) has the function:</p>
<p><figure class="highlight plain"><figcaption><span>string encode(vector&lt;string&gt; strs) &#123;</span></figcaption><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div></pre></td><td class="code"><pre><div class="line">  // ... your code</div><div class="line">  return encoded_string;</div><div class="line">&#125;</div></pre></td></tr></table></figure></p>
<p>Machine 2 (receiver) has the function:
<figure class="highlight plain"><figcaption><span>vector&lt;string&gt; decode(string s) &#123;</span></figcaption><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div></pre></td><td class="code"><pre><div class="line">  //... your code</div><div class="line">  return strs;</div><div class="line">&#125;</div></pre></td></tr></table></figure></p>
<p>So Machine 1 does:</p>
<p><figure class="highlight plain"><table><tr><td class="gutter"><pre><div class="line">1</div></pre></td><td class="code"><pre><div class="line">string encoded_string = encode(strs);</div></pre></td></tr></table></figure></p>
<p>and Machine 2 does:</p>
<p><figure class="highlight plain"><table><tr><td class="gutter"><pre><div class="line">1</div></pre></td><td class="code"><pre><div class="line">vector&lt;string&gt; strs2 = decode(encoded_string);</div></pre></td></tr></table></figure></p>
<p>strs2 in Machine 2 should be the same as strs in Machine 1.</p>
<p>Implement the encode and decode methods.</p>
<p>Note:</p>
<ul>
<li>The string may contain any possible characters out of 256 valid ascii characters. Your algorithm should be generalized enough to work on any possible characters.</li>
<li>Do not use class member/global/static variables to store states. Your encode and decode algorithms should be stateless.</li>
<li>Do not rely on any library method such as eval or serialize methods. You should implement your own encode/decode algorithm.</li>
</ul>
<h3 id="explanation">Explanation:</h3>
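<p>At first glance it is hard to tell what the question is asking. The task is to take a list of strings and join them into one string (encode), then split that string back into the original list (decode).
So the real question is how to delimit the strings so that the boundaries can still be recognized afterwards.
This is less an algorithm problem than a test of serialization, a basic concept in computer systems.
<figure class="highlight plain"><table><tr><td class="gutter"><pre><div class="line">1</div></pre></td><td class="code"><pre><div class="line">Serialization is a concept in computer science: storing an object to a medium (such as a file or a memory buffer) or transmitting it over a network in binary form. The object can later be rebuilt, with the same state as the original, by deserializing that sequence of bytes, so in certain cases one can say a copy is obtained, although this does not hold in every case.</div></pre></td></tr></table></figure></p>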
<h3 id="code">Code:</h3>
<p><figure class="highlight plain"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div><div class="line">4</div><div class="line">5</div><div class="line">6</div><div class="line">7</div><div class="line">8</div><div class="line">9</div><div class="line">10</div><div class="line">11</div><div class="line">12</div><div class="line">13</div><div class="line">14</div><div class="line">15</div><div class="line">16</div><div class="line">17</div><div class="line">18</div><div class="line">19</div><div class="line">20</div><div class="line">21</div><div class="line">22</div><div class="line">23</div><div class="line">24</div><div class="line">25</div><div class="line">26</div><div class="line">27</div><div class="line">28</div><div class="line">29</div></pre></td><td class="code"><pre><div class="line">import java.util.ArrayList;</div><div class="line">import java.util.List;</div><div class="line"></div><div class="line">public class Codec &#123;</div><div class="line"></div><div class="line">    // Encodes a list of strings to a single string.</div><div class="line">    // Each string is written as its length, a &apos;/&apos;, then its content.</div><div class="line">    public String encode(List&lt;String&gt; strs) &#123;</div><div class="line">        StringBuilder sb = new StringBuilder();</div><div class="line">        for (String s : strs) &#123;</div><div class="line">            sb.append(s.length()).append(&apos;/&apos;).append(s);</div><div class="line">        &#125;</div><div class="line">        return sb.toString();</div><div class="line">    &#125;</div><div class="line"></div><div class="line">    // Decodes a single string to a list of strings.</div><div class="line">    public List&lt;String&gt; decode(String s) &#123;</div><div class="line">        List&lt;String&gt; ret = new ArrayList&lt;&gt;();</div><div class="line">        int i = 0;</div><div class="line">        while (i &lt; s.length()) &#123;</div><div class="line">            // The first &apos;/&apos; at or after i ends the length prefix.</div><div class="line">            int slash = s.indexOf(&apos;/&apos;, i);</div><div class="line">            int size = Integer.parseInt(s.substring(i, slash));</div><div class="line">            ret.add(s.substring(slash + 1, slash + size + 1));</div><div class="line">            i = slash + size + 1;</div><div class="line">        &#125;</div><div class="line">        return ret;</div><div class="line">    &#125;</div><div class="line">&#125;</div></pre></td></tr></table></figure></p>
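<p>The length prefix is what lets the payload contain the delimiter itself. The following is a minimal round-trip sketch of the same scheme as the Codec above; the class name <code>LengthPrefixDemo</code> and the sample strings are assumptions made for illustration, not part of the original solution.</p>

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class, not part of the original solution.
final class LengthPrefixDemo {

    // Same scheme as Codec above: length, '/', then the raw content.
    static String encode(List<String> strs) {
        StringBuilder sb = new StringBuilder();
        for (String s : strs) {
            sb.append(s.length()).append('/').append(s);
        }
        return sb.toString();
    }

    static List<String> decode(String s) {
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < s.length()) {
            // Only the first '/' after position i is a delimiter;
            // any later '/' inside the next `size` characters is payload.
            int slash = s.indexOf('/', i);
            int size = Integer.parseInt(s.substring(i, slash));
            out.add(s.substring(slash + 1, slash + 1 + size));
            i = slash + 1 + size;
        }
        return out;
    }

    public static void main(String[] args) {
        // Sample input chosen to include '/', digits, and an empty string.
        List<String> input = Arrays.asList("a/b", "", "3/x", "hello");
        String wire = encode(input);
        System.out.println(wire);                       // 3/a/b0/3/3/x5/hello
        System.out.println(decode(wire).equals(input)); // true
    }
}
```

<p>Because the decoder jumps ahead by the stated length instead of searching for the next delimiter, strings such as "3/x" that look like an encoded record are still read back unchanged.</p>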
]]></content>
    
    <summary type="html">
    
      &lt;script src=&quot;/assets/js/DPlayer.min.js&quot;&gt; &lt;/script&gt;&lt;h3 id=&quot;question&quot;&gt;Question:&lt;/h3&gt;
&lt;p&gt;Design an algorithm to encode a list of strings to a s
    
    </summary>
    
      <category term="LeetCode" scheme="http:starllap.space/categories/LeetCode/"/>
    
    
      <category term="Google" scheme="http:starllap.space/tags/Google/"/>
    
      <category term="LeetCode" scheme="http:starllap.space/tags/LeetCode/"/>
    
      <category term="Medium" scheme="http:starllap.space/tags/Medium/"/>
    
      <category term="String" scheme="http:starllap.space/tags/String/"/>
    
      <category term="Design" scheme="http:starllap.space/tags/Design/"/>
    
  </entry>
  
  <entry>
    <title>Leetcode 346. Moving Average from Data Stream</title>
    <link href="http:starllap.space/2017/02/10/346-Moving-Average-from-Data-Stream/"/>
    <id>http:starllap.space/2017/02/10/346-Moving-Average-from-Data-Stream/</id>
    <published>2017-02-11T05:17:27.000Z</published>
    <updated>2017-02-11T02:20:33.000Z</updated>
    
    <content type="html"><![CDATA[<script src="/assets/js/DPlayer.min.js"> </script><h3 id="question">Question</h3>
<p>Given a stream of integers and a window size, calculate the moving average of all integers in the sliding window.</p>
<p>For example,
<figure class="highlight plain"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div><div class="line">4</div><div class="line">5</div></pre></td><td class="code"><pre><div class="line">MovingAverage m = new MovingAverage(3);</div><div class="line">m.next(1) = 1</div><div class="line">m.next(10) = (1 + 10) / 2</div><div class="line">m.next(3) = (1 + 10 + 3) / 3</div><div class="line">m.next(5) = (10 + 3 + 5) / 3</div></pre></td></tr></table></figure></p>
<h3 id="explanation">Explanation:</h3>
<p>A very simple problem: the values passed to next can be stored in a queue, an ArrayList, or an array. Keep a running sum and compute the average on every call.</p>
<h3 id="code">Code:</h3>
<p><figure class="highlight plain"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div><div class="line">4</div><div class="line">5</div><div class="line">6</div><div class="line">7</div><div class="line">8</div><div class="line">9</div><div class="line">10</div><div class="line">11</div><div class="line">12</div><div class="line">13</div><div class="line">14</div><div class="line">15</div><div class="line">16</div><div class="line">17</div><div class="line">18</div><div class="line">19</div><div class="line">20</div><div class="line">21</div><div class="line">22</div><div class="line">23</div><div class="line">24</div><div class="line">25</div><div class="line">26</div></pre></td><td class="code"><pre><div class="line">import java.util.LinkedList;</div><div class="line">import java.util.Queue;</div><div class="line"></div><div class="line">public class MovingAverage &#123;</div><div class="line"></div><div class="line">    // Values currently in the window, oldest first.</div><div class="line">    private final Queue&lt;Integer&gt; q = new LinkedList&lt;&gt;();</div><div class="line">    private final int size;</div><div class="line">    // Running sum of the values in the window.</div><div class="line">    private int total = 0;</div><div class="line"></div><div class="line">    /** Initialize your data structure here. */</div><div class="line">    public MovingAverage(int size) &#123;</div><div class="line">        this.size = size;</div><div class="line">    &#125;</div><div class="line"></div><div class="line">    public double next(int val) &#123;</div><div class="line">        if (q.size() == size) &#123;</div><div class="line">            // The window is full: drop the oldest value first.</div><div class="line">            total -= q.poll();</div><div class="line">        &#125;</div><div class="line">        q.offer(val);</div><div class="line">        total += val;</div><div class="line">        return total * 1.0 / q.size();</div><div class="line">    &#125;</div><div class="line">&#125;</div></pre></td></tr></table></figure></p>
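<p>The explanation mentions that a plain array works as well. The following is a minimal sketch of that variant using a circular buffer, assuming a window size of at least 1; the class name <code>ArrayMovingAverage</code> and its fields are hypothetical and not taken from the original post.</p>

```java
// Hypothetical array-based variant; the class name is not from the original post.
final class ArrayMovingAverage {
    private final int[] window; // circular buffer holding the most recent values
    private int count = 0;      // number of valid values in the buffer
    private int head = 0;       // next slot to write; once full, it holds the oldest value
    private long total = 0;     // running sum of the valid values

    // Assumes size >= 1.
    ArrayMovingAverage(int size) {
        window = new int[size];
    }

    double next(int val) {
        if (count == window.length) {
            total -= window[head]; // evict the oldest value
        } else {
            count++;
        }
        window[head] = val;
        total += val;
        head = (head + 1) % window.length;
        return (double) total / count;
    }

    public static void main(String[] args) {
        ArrayMovingAverage m = new ArrayMovingAverage(3);
        System.out.println(m.next(1));  // 1.0
        System.out.println(m.next(10)); // 5.5
        System.out.println(m.next(3));  // (1 + 10 + 3) / 3
        System.out.println(m.next(5));  // 6.0
    }
}
```

<p>Compared with the queue version, this avoids boxing each value into an Integer, at the cost of managing the write index by hand.</p>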
]]></content>
    
    <summary type="html">
    
      &lt;script src=&quot;/assets/js/DPlayer.min.js&quot;&gt; &lt;/script&gt;&lt;h3 id=&quot;question&quot;&gt;Question&lt;/h3&gt;
&lt;p&gt;Given a stream of integers and a window size, calculate
    
    </summary>
    
      <category term="LeetCode" scheme="http:starllap.space/categories/LeetCode/"/>
    
    
      <category term="Google" scheme="http:starllap.space/tags/Google/"/>
    
      <category term="LeetCode" scheme="http:starllap.space/tags/LeetCode/"/>
    
      <category term="Design" scheme="http:starllap.space/tags/Design/"/>
    
      <category term="Easy" scheme="http:starllap.space/tags/Easy/"/>
    
      <category term="Queue" scheme="http:starllap.space/tags/Queue/"/>
    
  </entry>
  
  <entry>
    <title>Distributed System-Indirect Communication and Naming</title>
    <link href="http:starllap.space/2017/02/06/ds69/"/>
    <id>http:starllap.space/2017/02/06/ds69/</id>
    <published>2017-02-06T20:49:33.000Z</published>
    <updated>2017-03-30T01:52:40.000Z</updated>
    
    <content type="html"><![CDATA[<script src="/assets/js/DPlayer.min.js"> </script><p>CMU-95702 Distributed Systems
Summary notes for chapters 6 and 9</p>
<ul>
<li>
<p><strong>Indirect Messaging:</strong>
Indirect communication is defined as communication between entities in a distributed system through an intermediary with no direct coupling between the sender and the receiver(s).</p>
<ul>
<li>Decoupled in space:
<ul>
<li>The sender does not need to know the identity of the receiver(s), and vice versa</li>
<li>Good for handling legacy systems</li>
</ul>
</li>
<li>Decoupled in time:
<ul>
<li>A component need not even be running</li>
<li>The messaging system can store messages until they are successfully delivered</li>
<li>Reliable delivery is ensured</li>
</ul>
</li>
</ul>
</li>
<li>
<p><strong>Two messaging modes:</strong></p>
<ul>
<li>Point-to-point:
<ul>
<li>Inventory to Factory</li>
<li>Inventory to Sales</li>
<li>Factory to Accounting</li>
</ul>
</li>
<li>Publish/subscribe:
<ul>
<li>Parts to parts inventory and parts order</li>
</ul>
</li>
</ul>
</li>
<li>
<p><strong>Some example scenarios:</strong></p>
<ul>
<li>Asynchronous communication: chat and Twitter-style apps; reporting information to one or more interested systems</li>
<li>Event-driven problems</li>
<li>Decoupled/multiple consumers</li>
<li>Multiple interested parties</li>
</ul>
</li>
<li>
<p><strong>Indirect messaging protocols:</strong></p>
<ul>
<li>Two open standards: XMPP &amp; AMQP</li>
</ul>
</li>
<li>
<p><strong>Java's JMS API:</strong></p>
<ul>
<li>An API for performing indirect messaging.</li>
<li>It is an abstraction API like JNDI and JDBC.</li>
<li>Interacts with some Message Oriented Middleware
(MOM)</li>
<li>JMS is a client-facing interface, meant to abstract
away the particulars of any MOM.</li>
<li>In theory, you should have portability of systems
written with JMS such that they can work with any
MOM.</li>
<li>API is javax.jms</li>
</ul>
</li>
<li>
<p><strong>JMS Queues and topics</strong></p>
</li>
<li>
<p><strong>JMS message types:</strong></p>
<ul>
<li>Stream
<ul>
<li>Sequential stream of Java primitive data types.</li>
</ul>
</li>
<li>Map
<ul>
<li>Set of name-value pairs</li>
<li>Names are String objects</li>
<li>Values are Java primitives (including String)</li>
</ul>
</li>
<li>Text
<ul>
<li>Message is a String object</li>
<li>Plain-text message</li>
<li>XML messages</li>
</ul>
</li>
<li>Object
<ul>
<li>Serialized Java object</li>
<li>Simple if in Java-only environment</li>
</ul>
</li>
<li>Bytes
<ul>
<li>Stream of un-interpreted bytes.</li>
<li>To encode a message body to match an existing message format</li>
</ul>
</li>
</ul>
</li>
<li>
<p><strong>Message-driven beans:</strong>
Components that are executed asynchronously when messages become
available in a Queue or Topic.</p>
</li>
</ul>
<p>Line Question on class:</p>
<ul>
<li>abstract indirect messaging API: JMS</li>
<li>point-to-point messaging: Queue</li>
<li>MOM used in lab: GlassFish</li>
<li>Publish/subscribe messaging: Topic</li>
<li>Indirect messaging destinations: Queue and Topic</li>
<li>Administratively managed resources (external to your program):
ConnectionFactory, Queue, Topic</li>
</ul>
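<p>The difference between the two messaging modes in these notes (point-to-point via a Queue, publish/subscribe via a Topic) can be sketched with plain in-memory classes. This is only an illustration under stated assumptions: <code>PointToPointQueue</code> and <code>Topic</code> below are hypothetical stand-ins written for this sketch and are not the javax.jms types, and a real MOM would add persistence, acknowledgements, and network transport.</p>

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.function.Consumer;

// Hypothetical in-memory stand-ins; these are NOT the javax.jms types.
final class MessagingModes {

    // Point-to-point: a message waits in the queue until exactly one receiver takes it.
    static final class PointToPointQueue {
        private final Queue<String> pending = new ArrayDeque<>();

        void send(String message) {
            pending.offer(message);
        }

        // Returns the oldest waiting message, or null if none is waiting.
        String receive() {
            return pending.poll();
        }
    }

    // Publish/subscribe: every current subscriber gets its own copy of each message.
    static final class Topic {
        private final List<Consumer<String>> subscribers = new ArrayList<>();

        void subscribe(Consumer<String> subscriber) {
            subscribers.add(subscriber);
        }

        void publish(String message) {
            for (Consumer<String> subscriber : subscribers) {
                subscriber.accept(message);
            }
        }
    }

    public static void main(String[] args) {
        // The sender finishes before any receiver runs: decoupled in time.
        PointToPointQueue inventoryToFactory = new PointToPointQueue();
        inventoryToFactory.send("order 10 widgets");
        System.out.println(inventoryToFactory.receive()); // order 10 widgets
        System.out.println(inventoryToFactory.receive()); // null

        // The publisher does not know who is listening: decoupled in space.
        Topic parts = new Topic();
        List<String> partsInventory = new ArrayList<>();
        List<String> partsOrder = new ArrayList<>();
        parts.subscribe(partsInventory::add);
        parts.subscribe(partsOrder::add);
        parts.publish("part #42 received");
        System.out.println(partsInventory.equals(partsOrder)); // true
    }
}
```

<p>A queue message is consumed once, while a topic message reaches every subscriber, which is the practical distinction between the two destinations.</p>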
]]></content>
    
    <summary type="html">
    
      &lt;script src=&quot;/assets/js/DPlayer.min.js&quot;&gt; &lt;/script&gt;&lt;p&gt;CMU-95702分布式系统
第6、9章总结笔记&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Indirect Messaging:&lt;/strong&gt;
Indirect
    
    </summary>
    
      <category term="Distributed System" scheme="http:starllap.space/categories/Distributed-System/"/>
    
    
      <category term="Distributed System" scheme="http:starllap.space/tags/Distributed-System/"/>
    
      <category term="Indirect Messaging" scheme="http:starllap.space/tags/Indirect-Messaging/"/>
    
      <category term="JMS" scheme="http:starllap.space/tags/JMS/"/>
    
  </entry>
  
  <entry>
    <title>Distributed System-Mobile and ubiquitous computing</title>
    <link href="http:starllap.space/2017/02/06/Distributed-System/"/>
    <id>http:starllap.space/2017/02/06/Distributed-System/</id>
    <published>2017-02-06T20:28:32.000Z</published>
    <updated>2017-03-30T01:53:16.000Z</updated>
    
    <content type="html"><![CDATA[<script src="/assets/js/DPlayer.min.js"> </script><p>CMU-95702 Distributed Systems
Summary notes for chapter 19</p>
<ul>
<li><strong>Design issues in distributed mobile applications:</strong>
<ul>
<li><strong>Association:</strong>
<ul>
<li>Devices:
<ul>
<li>Appear and disappear from the space.</li>
<li>Do so unpredictably.</li>
<li>May be totally new to the space.</li>
<li>Or may be returning to the space.</li>
</ul>
</li>
<li>They need to be:
<ul>
<li>Perhaps added to the network</li>
<li>Brought into association with resources and applications</li>
</ul>
</li>
<li>Examples of association:
<ul>
<li>Come on campus and be associated with the printers that are close to you.</li>
<li>Be alerted if someone you know is walking near you.</li>
<li>Be provided with selling prices in your local area for your goods (not prices in far-away areas).</li>
</ul>
</li>
</ul>
</li>
<li><strong>Application-level Association:</strong>
often by discovery and broadcasts</li>
</ul>
</li>
</ul>
<ul>
<li>
<p><strong>How does a new device become part of the local network?</strong></p>
<ul>
<li>ARP</li>
<li>DHCP</li>
</ul>
</li>
<li>
<p><strong>Sensing and Context Awareness:</strong></p>
<ul>
<li>sensing: camera, time, acceleration, location, speed, temperature</li>
<li>context awareness: in terms of sensed data, or associated data</li>
</ul>
</li>
<li>
<p><strong>Location Sensing:</strong></p>
<ul>
<li>GPS</li>
<li>Database of collected Wi-Fi access points: stores each access point's MAC address and the GPS location at which it was observed</li>
<li>Cellular: computed from signal strength to multiple cellular tower locations</li>
<li>RFID tags: tags are associated with a location</li>
</ul>
</li>
<li>
<p><strong>Adaptation:</strong></p>
<ul>
<li>Presentation	to	fit	the	screen</li>
<li>Use	of	JavaScript	to	fit	the	devices	capabilities</li>
<li>Media	quality	to	fit	the	screen	and	device	capabilities</li>
<li>Language	to	fit	the	user</li>
<li>Information to fit the physical context
<ul>
<li>E.g. give only movie times in the future, and in nearby theaters</li>
</ul>
</li>
</ul>
</li>
<li>
<p><strong>Device awareness / browser detection:</strong>
Reply differently depending on what device makes the request.
Three HTTP headers provide clues about what the device is:</p>
</li>
</ul>
<ol>
<li>
<p>User-Agent</p>
<ul>
<li>Identifies the mobile browser and almost always the device manufacturer and model.</li>
<li>E.g. BlackBerry8330/4.3.0 Profile/MIDP-2.0 Configuration/CLDC-1.1 VendorID/105</li>
<li>Collection of mobile agent strings: http://www.zytrax.com/tech/web/mobile_ids.html</li>
</ul>
</li>
<li>
<p>X-Wap-Profile</p>
<ul>
<li>Link to an XML profile of the phone's capabilities</li>
<li>E.g. http://www.blackberry.net/go/mobile/profiles/uaprof/8310/4.2.2.rdf</li>
</ul>
</li>
<li>
<p>Accept</p>
<ul>
<li>Supported MIME (Multipurpose Internet Mail Extensions) types</li>
<li>E.g. text/html, application/xhtml+xml, etc.</li>
</ul>
</li>
</ol>
<p>These three headers can provide enough information, but a header can:</p>
<ul>
<li>be missing</li>
<li>have inaccurate values</li>
<li>have invalid URLs</li>
</ul>
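<p>As a rough illustration of header-based device detection and of why it is fragile, the sketch below classifies a request from its headers and falls back to "unknown" when the User-Agent is missing. Everything here is an assumption made for the example: the class name <code>DeviceHints</code>, the lower-cased header map, and the short token list are hypothetical and far from a complete detection scheme.</p>

```java
import java.util.Collections;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper; the token list and names are illustrative assumptions.
final class DeviceHints {

    enum DeviceClass { MOBILE, DESKTOP, UNKNOWN }

    // Illustrative substrings only; real User-Agent values vary widely.
    private static final String[] MOBILE_TOKENS = { "mobile", "blackberry", "midp" };

    // Assumes header names have already been lower-cased by the caller.
    static DeviceClass classify(Map<String, String> headers) {
        // An X-Wap-Profile header is treated here as a hint that the client is a phone.
        if (headers.containsKey("x-wap-profile")) {
            return DeviceClass.MOBILE;
        }
        String userAgent = headers.get("user-agent");
        if (userAgent == null || userAgent.isEmpty()) {
            return DeviceClass.UNKNOWN; // the header can be missing, so do not guess
        }
        String lower = userAgent.toLowerCase(Locale.ROOT);
        for (String token : MOBILE_TOKENS) {
            if (lower.contains(token)) {
                return DeviceClass.MOBILE;
            }
        }
        return DeviceClass.DESKTOP;
    }

    public static void main(String[] args) {
        Map<String, String> blackberry = Collections.singletonMap(
            "user-agent",
            "BlackBerry8330/4.3.0 Profile/MIDP-2.0 Configuration/CLDC-1.1 VendorID/105");
        System.out.println(classify(blackberry));                                // MOBILE
        System.out.println(classify(Collections.<String, String>emptyMap()));    // UNKNOWN
    }
}
```

<p>Any client that sends an unexpected or inaccurate User-Agent is misclassified by a token list like this one, which is the weakness the notes point out.</p>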
<ul>
<li><strong>Feature detection (a more flexible and reliable solution):</strong>
Bottom line: use feature detection, not browser detection.
<ul>
<li>Two strategies for feature detection:
<ul>
<li>Graceful	degradation
<ul>
<li>Design	for	modern	browsers</li>
<li>Where	features	are	not	available,	provide	a	simpler
alternative
<ul>
<li>If	not	possible,	alert	the	user</li>
</ul>
</li>
<li>Don't	allow	it	to	invisibly	fail</li>
</ul>
</li>
<li>Progressive enhancement
<ul>
<li>Design with a baseline of usable functionality</li>
<li>Enrich the user experience step by step by testing for features before using them.</li>
</ul>
</li>
</ul>
</li>
</ul>
</li>
</ul>
<ul>
<li><strong>Responsive web design:</strong>
<ul>
<li>A	strategy	of	web	design	for	multiple	screen	sizes</li>
<li>Uses:
<ul>
<li>Fluid	grids	expressing	sizes	in	terms	of	percents,	not
pixels</li>
<li>Modify	size	of	media	using	relative	units
<ul>
<li>Keep	them	within	their	bounding	elements</li>
<li>Images</li>
<li>Media</li>
<li>Font	size</li>
</ul>
</li>
<li>Crossing	size	thresholds	switch	to	completely	different
designs
<ul>
<li>Accomplished	using	media	queries</li>
</ul>
</li>
</ul>
</li>
</ul>
</li>
</ul>
<ul>
<li>
<p><strong>Mobile first:</strong></p>
</li>
<li>
<p>A	philosophy	of	web	design</p>
</li>
<li>
<p>Design	for	mobile	first,	and	desktop	second</p>
</li>
<li>
<p>Counter to the historical practice of designing for mobile second</p>
</li>
<li>
<p>Benefits of Mobile First:</p>
<ul>
<li>Focus on the platform on which you will reach the most users</li>
<li>Forces designers to focus on the most important content and functionality</li>
<li>Allows for using technologies available on mobile:
<ul>
<li>touch events</li>
<li>geolocation</li>
<li>accelerometer</li>
</ul>
</li>
</ul>
</li>
<li>
<p><strong>Mobile deployment strategies:</strong></p>
<ul>
<li>Native
<ul>
<li>E.g.	Android,	iOS</li>
<li>Requires	redeveloping	for	each	architecture</li>
</ul>
<ul>
<li>2	code	bases</li>
</ul>
</li>
<li>Native	with	Development	Framework
<ul>
<li>Use	a	framework	that	compiles	to	multiple	native
applications</li>
<li>E.g.	Corona	(http://www.coronalabs.com)</li>
</ul>
</li>
<li>Mobile	Web
<ul>
<li>Develop	in	HTML	/	CSS	/JavaScript</li>
<li>Accessed	in	a	browser</li>
<li>Can	install	local	icon	to	launch	to	site
<ul>
<li>Use	local	storage	to	store	information	when	off	line</li>
<li>Use	manifest	to	cache	application	to	use	when	off	line</li>
<li>Sync	when	Internet	is	again	available.</li>
</ul>
</li>
</ul>
</li>
<li>Hybrid
<ul>
<li>Develop in HTML / CSS / JavaScript</li>
<li>Wrap in a browser wrapper to create native apps</li>
<li>The wrapper provides access to phone hardware not accessible from the browser</li>
<li>Apache Cordova (https://cordova.apache.org) is an open source native wrapper</li>
</ul>
</li>
</ul>
</li>
</ul>
]]></content>
    
    <summary type="html">
    
      &lt;script src=&quot;/assets/js/DPlayer.min.js&quot;&gt; &lt;/script&gt;&lt;p&gt;CMU-95702分布式系统
第19章总结笔记&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Design issues in distributed mobile applic
    
    </summary>
    
      <category term="Distributed System" scheme="http:starllap.space/categories/Distributed-System/"/>
    
    
      <category term="Distributed System" scheme="http:starllap.space/tags/Distributed-System/"/>
    
      <category term="mobile" scheme="http:starllap.space/tags/mobile/"/>
    
  </entry>
  
  <entry>
    <title>Distributed System-ACID</title>
    <link href="http:starllap.space/2017/02/06/17/"/>
    <id>http:starllap.space/2017/02/06/17/</id>
    <published>2017-02-06T20:20:32.000Z</published>
    <updated>2017-03-30T01:53:48.000Z</updated>
    
    <content type="html"><![CDATA[<script src="/assets/js/DPlayer.min.js"> </script><p>CMU-95702 Distributed Systems
Summary of the ACID concepts</p>
<ul>
<li><strong>ACID Transactions:</strong>
<ul>
<li><strong>Atomic:</strong>  All or nothing. No intermediate states are visible. No possibility that
only part of the transaction ran. If a transaction fails or aborts prior to
committing, the TP system will undo the effects of any updates (will
recover). We either commit or abort the entire process. Checkpointing and
Logging and recoverable objects can be used to ensure a transaction is
atomic with respect to failures.</li>
<li><strong>Consistent:</strong>  system invariants preserved, e.g., if there were n dollars in a
bank before a transfer transaction then there will be n dollars in the bank
after the transfer. This is largely in the hands of the application programmer.</li>
<li><strong>Isolated:</strong>  Two transactions do not interfere with each other. They appear as
serial executions. This is the case even though transactions may run
concurrently. Locking is often used to prevent one transaction from
interfering with another.</li>
<li><strong>Durable:</strong> The commit causes a permanent change to stable storage. This
property may be obtained with log-based recovery algorithms. If there has
been a commit but updates have not yet been completed due to a crash,
the logs will hold the necessary information on recovery.</li>
</ul>
</li>
</ul>
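<p>The bank-transfer example used for atomicity and consistency above can be sketched in a few lines. This is a toy, single-threaded illustration under stated assumptions: the class <code>TransferSketch</code>, its methods, and the account names are hypothetical, an in-memory copy stands in for checkpointing, and neither isolation nor durability is modeled.</p>

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory bank; it only illustrates atomicity and consistency.
final class TransferSketch {
    private final Map<String, Long> balances = new HashMap<>();

    void open(String account, long dollars) {
        balances.put(account, dollars);
    }

    // Assumes the account exists.
    long balance(String account) {
        return balances.get(account);
    }

    // The invariant to preserve: the total amount of money in the bank.
    long total() {
        long sum = 0;
        for (long b : balances.values()) {
            sum += b;
        }
        return sum;
    }

    // Returns true if the transfer committed, false if it aborted and was undone.
    boolean transfer(String from, String to, long amount) {
        Map<String, Long> checkpoint = new HashMap<>(balances); // state to recover to
        try {
            debit(from, amount);
            credit(to, amount); // may fail after the debit already happened
            return true;        // commit
        } catch (IllegalArgumentException e) {
            balances.clear();
            balances.putAll(checkpoint); // abort: undo the partial update
            return false;
        }
    }

    private void debit(String account, long amount) {
        Long current = balances.get(account);
        if (current == null || current < amount) {
            throw new IllegalArgumentException("cannot debit " + account);
        }
        balances.put(account, current - amount);
    }

    private void credit(String account, long amount) {
        Long current = balances.get(account);
        if (current == null) {
            throw new IllegalArgumentException("unknown account " + account);
        }
        balances.put(account, current + amount);
    }

    public static void main(String[] args) {
        TransferSketch bank = new TransferSketch();
        bank.open("alice", 100);
        bank.open("bob", 50);
        System.out.println(bank.transfer("alice", "bob", 30));   // true
        System.out.println(bank.transfer("alice", "carol", 30)); // false: carol does not exist
        System.out.println(bank.balance("alice") + " " + bank.total()); // 70 150
    }
}
```

<p>The second transfer fails after the debit, yet the balance of alice and the bank total are unchanged afterwards: no intermediate state survives, and the n-dollar invariant holds.</p>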
]]></content>
    
    <summary type="html">
    
      &lt;script src=&quot;/assets/js/DPlayer.min.js&quot;&gt; &lt;/script&gt;&lt;p&gt;CMU-95702分布式系统
ACID概念总结&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;ACID Transactions：&lt;/strong&gt;
&lt;ul&gt;
&lt;li&gt;&lt;stro
    
    </summary>
    
      <category term="Distributed System" scheme="http:starllap.space/categories/Distributed-System/"/>
    
    
      <category term="Distributed System" scheme="http:starllap.space/tags/Distributed-System/"/>
    
      <category term="ACID" scheme="http:starllap.space/tags/ACID/"/>
    
  </entry>
  
  <entry>
    <title>Where To</title>
    <link href="http:starllap.space/2017/01/24/%E5%93%AA%E5%84%BF/"/>
    <id>http:starllap.space/2017/01/24/哪儿/</id>
    <published>2017-01-24T09:02:55.000Z</published>
    <updated>2017-01-24T06:06:42.000Z</updated>
    
    <content type="html"><![CDATA[<script src="/assets/js/DPlayer.min.js"> </script><p><embed src="http://www.xiami.com/widget/12753884_1769553081/singlePlayer.swf" type="application/x-shockwave-flash" width="257" height="33" wmode="transparent"></p>
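<p>In Advanced Cloud Computing today, the three professors got into a debate over a single question.</p>
<p>That moment really moved me: three top academics whose ages add up to nearly 150, standing in front of the projector, still as focused and passionate about knowledge as children. It was adorable.</p>
<p>It suddenly reminded me that Greg's homepage says, "I am not a professor who never takes vacations. I went surfing in 2012!"</p>
<p>Choosing courses gives me a headache every time: too little time, too many courses I want to take, and I only wish I could study for a few more years. I also know that last semester, when I was very unhappy, I swore I would not pick hard courses this semester. But I only get to be a graduate student once, and I only get one chance to study at CMU (of course, I would not mind coming back, hahaha), so I could not help talking myself into pushing a little harder.</p>
<p>It is just that one question has been growing lately.</p>
<p>Where exactly am I going?</p>
<p>Many people do not know where they are going either. But everyone can see some faint light that seems to be that place. Some choose to run hard toward it, some doubt whether they really saw the light, and some pretend not to see it.</p>
<p>When I chatted with Xuan Ge a while ago, the answer was refreshingly direct: wherever it ends up being, no more staying in Beijing breathing smog, unless this attempt at getting into the Beijing Film Academy succeeds. Two years on, the Beijing Film Academy dream is still alive.</p>
<p>See? People who are determined are always at ease.</p>
<p>Just like Mengyao, who got a perfect score in Machine Learning and keeps telling me about not finding a job after graduation and planning to squat outside Starbucks with a begging bowl.</p>
<p>So sometimes, not knowing where you are going is not so scary after all.</p>
<p>After all, an unknown future is not frightening; a known one is the most frightening.</p>
<p>I have finally figured it out: I have been wearing myself out thinking about life. It is plainly a problem with no solution, and even if at some moment I feel satisfied that I have figured it out, it must be an illusion.</p>
<p>There is no answer. So stop thinking about it.</p>
<p>Better to just live once, fully and freely.</p>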
]]></content>
    
    <summary type="html">
    
      &lt;script src=&quot;/assets/js/DPlayer.min.js&quot;&gt; &lt;/script&gt;&lt;p&gt;&lt;embed src=&quot;http://www.xiami.com/widget/12753884_1769553081/singlePlayer.swf&quot; type=&quot;app
    
    </summary>
    
      <category term="真实的生命" scheme="http:starllap.space/categories/%E7%9C%9F%E5%AE%9E%E7%9A%84%E7%94%9F%E5%91%BD/"/>
    
    
      <category term="真实的生命" scheme="http:starllap.space/tags/%E7%9C%9F%E5%AE%9E%E7%9A%84%E7%94%9F%E5%91%BD/"/>
    
  </entry>
  
  <entry>
    <title>Sorting in Java</title>
    <link href="http:starllap.space/2017/01/08/Java%E4%B8%AD%E7%9A%84Sort/"/>
    <id>http:starllap.space/2017/01/08/Java中的Sort/</id>
    <published>2017-01-08T20:56:31.000Z</published>
    <updated>2017-06-24T02:00:22.000Z</updated>
    
    <content type="html"><![CDATA[<script src="/assets/js/DPlayer.min.js"> </script><p>A summary of several sorting methods:
bubble sort, selection sort, insertion sort, Quick Sort, Merge Sort (to be continued...).</p>
<p><strong>Simple sorting</strong></p>
<h2 id="bubble-sort">Bubble Sort</h2>
<p>Slow, but simple.
<strong>Time complexity:    O(N^2)</strong></p>
<p><strong>Steps:</strong></p>
<p><figure class="highlight plain"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div></pre></td><td class="code"><pre><div class="line">1. Compare only two adjacent values at a time.</div><div class="line">2. If the left value is larger, swap the two values.</div><div class="line">3. This moves the larger value toward the right.</div></pre></td></tr></table></figure></p>
<p><strong>Example:</strong></p>
<p><figure class="highlight plain"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div><div class="line">4</div><div class="line">5</div><div class="line">6</div><div class="line">7</div><div class="line">8</div><div class="line">9</div><div class="line">10</div><div class="line">11</div><div class="line">12</div><div class="line">13</div><div class="line">14</div><div class="line">15</div></pre></td><td class="code"><pre><div class="line">Original array: [4,7,2,5,3]</div><div class="line">Round 1:</div><div class="line">  -&gt; [4,7,2,5,3]</div><div class="line">  -&gt; [4,2,7,5,3]</div><div class="line">  -&gt; [4,2,5,7,3]</div><div class="line">  -&gt; [4,2,5,3,7]</div><div class="line">Round 2:</div><div class="line">  -&gt; [2,4,5,3,7]</div><div class="line">  -&gt; [2,4,5,3,7]</div><div class="line">  -&gt; [2,4,3,5,7]</div><div class="line">Round 3:</div><div class="line">  -&gt; [2,4,3,5,7]</div><div class="line">  -&gt; [2,3,4,5,7]</div><div class="line">Round 4:</div><div class="line">  -&gt; [2,3,4,5,7]</div></pre></td></tr></table></figure></p>
<p><strong>Code:</strong></p>
<p>int[] data = {4, 7, 2, 5, 3};</p>
<p>Swap Method
<figure class="highlight plain"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div><div class="line">4</div><div class="line">5</div><div class="line">6</div></pre></td><td class="code"><pre><div class="line">// a helper method that swaps two values in an int array</div><div class="line">private static void swap(int[] data, int one, int two) &#123;</div><div class="line">    int temp = data[one];</div><div class="line">    data[one] = data[two];</div><div class="line">    data[two] = temp;</div><div class="line">&#125;</div></pre></td></tr></table></figure></p>
<p>Bubble Sort:
<figure class="highlight plain"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div><div class="line">4</div><div class="line">5</div><div class="line">6</div><div class="line">7</div><div class="line">8</div><div class="line">9</div><div class="line">10</div><div class="line">11</div><div class="line">12</div></pre></td><td class="code"><pre><div class="line">public static void bubbleSort(int[] data) &#123;</div><div class="line">       // move backward from the last index</div><div class="line">       for (int out = data.length - 1; out &gt;= 1; out--) &#123;</div><div class="line">           // move forward from the beginning</div><div class="line">           // bubble up the largest value to the right</div><div class="line">           for (int in = 0; in &lt; out; in++) &#123;</div><div class="line">               if (data[in] &gt; data[in + 1]) &#123;</div><div class="line">                   swap(data, in, in + 1);</div><div class="line">               &#125;</div><div class="line">           &#125;</div><div class="line">       &#125;</div><div class="line">   &#125;</div></pre></td></tr></table></figure></p>
<h2 id="selection-sort">Selection Sort</h2>
<p>Faster than bubble sort, but still not fast enough.
<strong>Time complexity:    O(N^2)</strong>
It performs far fewer swaps than bubble sort, so it is somewhat faster.</p>
<p><strong>Steps:</strong>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div></pre></td><td class="code"><pre><div class="line">1. Find the smallest value in the unsorted part.</div><div class="line">2. Move it to the leftmost position of the unsorted part.</div></pre></td></tr></table></figure></p>
<p><strong>Example:</strong></p>
<p><figure class="highlight plain"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div><div class="line">4</div><div class="line">5</div><div class="line">6</div><div class="line">7</div><div class="line">8</div><div class="line">9</div><div class="line">10</div><div class="line">11</div><div class="line">12</div><div class="line">13</div></pre></td><td class="code"><pre><div class="line">Original array: [4,7,2,5,3]</div><div class="line">Round 1:</div><div class="line">  The smallest is 2:</div><div class="line">  [2,7,4,5,3]</div><div class="line">Round 2:</div><div class="line">  The smallest is 3:</div><div class="line">  [2,3,4,5,7]</div><div class="line">Round 3:</div><div class="line">  The smallest is 4:</div><div class="line">  [2,3,4,5,7]</div><div class="line">Round 4:</div><div class="line">  The smallest is 5:</div><div class="line">  [2,3,4,5,7]</div></pre></td></tr></table></figure></p>
<p><strong>Code:</strong></p>
<p><figure class="highlight plain"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div><div class="line">4</div><div class="line">5</div><div class="line">6</div><div class="line">7</div><div class="line">8</div><div class="line">9</div><div class="line">10</div><div class="line">11</div><div class="line">12</div><div class="line">13</div><div class="line">14</div></pre></td><td class="code"><pre><div class="line">public static void selectionSort(int[] data) &#123;</div><div class="line">  for (int out = 0; out &lt; data.length; out++) &#123;</div><div class="line">    // index of the smallest value found so far in data[out..]</div><div class="line">    int min = out;</div><div class="line">    for (int in = out + 1; in &lt; data.length; in++) &#123;</div><div class="line">      if (data[in] &lt; data[min]) &#123;</div><div class="line">        min = in;</div><div class="line">      &#125;</div><div class="line">    &#125;</div><div class="line">    if (out != min) &#123;</div><div class="line">      swap(data, out, min);</div><div class="line">    &#125;</div><div class="line">  &#125;</div><div class="line">&#125;</div></pre></td></tr></table></figure></p>
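<p>The claim that selection sort performs fewer swaps than bubble sort can be checked by counting them on the example array. The sketch below is an illustration only: the class <code>SwapCount</code> and its two counting methods are hypothetical helpers written for this comparison, following the bubble sort and selection sort loops shown above.</p>

```java
// Hypothetical helper that counts swaps; not part of the original post.
final class SwapCount {

    private static void swap(int[] data, int one, int two) {
        int temp = data[one];
        data[one] = data[two];
        data[two] = temp;
    }

    // Bubble sort, returning how many swaps it performed.
    static int bubbleSwaps(int[] data) {
        int swaps = 0;
        for (int out = data.length - 1; out >= 1; out--) {
            for (int in = 0; in < out; in++) {
                if (data[in] > data[in + 1]) {
                    swap(data, in, in + 1);
                    swaps++;
                }
            }
        }
        return swaps;
    }

    // Selection sort, returning how many swaps it performed.
    static int selectionSwaps(int[] data) {
        int swaps = 0;
        for (int out = 0; out < data.length; out++) {
            int min = out;
            for (int in = out + 1; in < data.length; in++) {
                if (data[in] < data[min]) {
                    min = in;
                }
            }
            if (out != min) {
                swap(data, out, min);
                swaps++;
            }
        }
        return swaps;
    }

    public static void main(String[] args) {
        System.out.println(bubbleSwaps(new int[] {4, 7, 2, 5, 3}));    // 6
        System.out.println(selectionSwaps(new int[] {4, 7, 2, 5, 3})); // 2
    }
}
```

<p>Both algorithms still make O(N^2) comparisons; selection sort only reduces the number of swaps to at most one per round.</p>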
<h2 id="insertion-sort">Insertion Sort:</h2>
<p>The most intuitive sorting method.
<strong>Time complexity: O(N^2)</strong></p>
<p><strong>Steps:</strong>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div></pre></td><td class="code"><pre><div class="line">Imagine a dividing line in the array.</div><div class="line">1. Everything to the left of the line is already sorted.</div><div class="line">2. The first element to the right of the line is inserted into its proper position on the left.</div></pre></td></tr></table></figure></p>
<p><strong>Example:</strong>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div><div class="line">4</div><div class="line">5</div><div class="line">6</div></pre></td><td class="code"><pre><div class="line">Original array: [4,7,2,5,3]</div><div class="line">-&gt; [4,|7,2,5,3]</div><div class="line">-&gt; [4,7,|2,5,3]</div><div class="line">-&gt; [2,4,7,|5,3]</div><div class="line">-&gt; [2,4,5,7,|3]</div><div class="line">-&gt; [2,3,4,5,7]</div></pre></td></tr></table></figure></p>
<p><strong>Code:</strong>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div><div class="line">4</div><div class="line">5</div><div class="line">6</div><div class="line">7</div><div class="line">8</div><div class="line">9</div><div class="line">10</div><div class="line">11</div><div class="line">12</div><div class="line">13</div><div class="line">14</div><div class="line">15</div><div class="line">16</div><div class="line">17</div><div class="line">18</div><div class="line">19</div><div class="line">20</div></pre></td><td class="code"><pre><div class="line">public static void insertionSort(int[] data) &#123;</div><div class="line">   // start from the 1st index till the last index</div><div class="line">  for (int out = 1; out&lt;data.length; out++) &#123;</div><div class="line">    int in = out;</div><div class="line">    int temp = data[out];</div><div class="line">    /*</div><div class="line">    * loop to check the sorted section going backward</div><div class="line">    * but not necessarily all the way to the 0th</div><div class="line">    * On average, look halfway through the sorted section</div><div class="line">    */</div><div class="line">    while (in&gt;0 &amp;&amp; data[in-1]&gt;=temp) &#123;</div><div class="line">      data[in] = data[in-1];</div><div class="line">      in --;</div><div class="line">    &#125;</div><div class="line">    if (out != in) &#123;</div><div class="line">      data[in] = temp;</div><div class="line">    &#125;</div><div class="line">  &#125;</div><div class="line"></div><div class="line">&#125;</div></pre></td></tr></table></figure></p>
<p>##Quick Sort##
<strong>Time complexity: O(N log N) on average, O(N^2) in the worst case</strong></p>
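<p><strong>Steps:</strong>
Quick sort is an improvement on bubble sort. A single pass partitions the data into two independent parts so that every element in one part is smaller than or equal to every element in the other. Each part is then sorted the same way, recursively, until the whole sequence is in order.</p>
<p><strong>Example</strong>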
<figure class="highlight plain"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div><div class="line">4</div><div class="line">5</div></pre></td><td class="code"><pre><div class="line">Original array: [4,7,2,5,3]</div><div class="line">-&gt; pivot 4, partition whole array: [3,2,|4|,5,7]</div><div class="line">-&gt; pivot 3, partition left part [3,2]: [2,|3|,4,5,7]</div><div class="line">-&gt; pivot 5, partition right part [5,7]: [2,3,4,|5|,7]</div><div class="line">-&gt; [2,3,4,5,7]</div></pre></td></tr></table></figure></p>
<p><strong>Code</strong>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div><div class="line">4</div><div class="line">5</div><div class="line">6</div><div class="line">7</div><div class="line">8</div><div class="line">9</div><div class="line">10</div><div class="line">11</div><div class="line">12</div><div class="line">13</div><div class="line">14</div><div class="line">15</div><div class="line">16</div><div class="line">17</div><div class="line">18</div><div class="line">19</div><div class="line">20</div><div class="line">21</div><div class="line">22</div><div class="line">23</div><div class="line">24</div></pre></td><td class="code"><pre><div class="line">static void quicksort(int n[], int left, int right) &#123;</div><div class="line">    int dp;</div><div class="line">    if (left &lt; right) &#123;</div><div class="line">        dp = partition(n, left, right);</div><div class="line">        quicksort(n, left, dp - 1);</div><div class="line">        quicksort(n, dp + 1, right);</div><div class="line">    &#125;</div><div class="line">&#125;</div><div class="line"></div><div class="line">static int partition(int n[], int left, int right) &#123;</div><div class="line">    int pivot = n[left];</div><div class="line">    while (left &lt; right) &#123;</div><div class="line">        while (left &lt; right &amp;&amp; n[right] &gt;= pivot)</div><div class="line">            right--;</div><div class="line">        if (left &lt; right)</div><div class="line">            n[left++] = n[right];</div><div class="line">        while (left &lt; right &amp;&amp; n[left] &lt;= pivot)</div><div class="line">            left++;</div><div class="line">        if (left &lt; right)</div><div class="line">            n[right--] = n[left];</div><div class="line">    &#125;</div><div class="line">    n[left] = pivot;</div><div class="line">    return left;</div><div 
class="line">&#125;</div></pre></td></tr></table></figure></p>
<p>##Merge Sort##
<strong>Time complexity: O(NlogN)</strong></p>
<p><strong>Code</strong>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><div class="line">1</div><div class="line">2</div><div class="line">3</div><div class="line">4</div><div class="line">5</div><div class="line">6</div><div class="line">7</div><div class="line">8</div><div class="line">9</div><div class="line">10</div><div class="line">11</div><div class="line">12</div><div class="line">13</div><div class="line">14</div><div class="line">15</div><div class="line">16</div><div class="line">17</div><div class="line">18</div><div class="line">19</div><div class="line">20</div><div class="line">21</div><div class="line">22</div><div class="line">23</div><div class="line">24</div><div class="line">25</div><div class="line">26</div><div class="line">27</div><div class="line">28</div><div class="line">29</div><div class="line">30</div><div class="line">31</div><div class="line">32</div><div class="line">33</div><div class="line">34</div><div class="line">35</div><div class="line">36</div></pre></td><td class="code"><pre><div class="line"></div><div class="line">public void sort(int[] A) &#123;</div><div class="line">  int[] tmp = new int[A.length];</div><div class="line">  mergeSort(A, 0, A.length - 1, tmp);</div><div class="line">&#125;</div><div class="line"></div><div class="line">public void mergeSort(int[] A, int start, int end, int[] tmp) &#123;</div><div class="line">  if (start &gt;= end) return;</div><div class="line">  // sort each half recursively, then merge the two sorted halves</div><div class="line">  int mid = start + (end - start) / 2;</div><div class="line">  mergeSort(A, start, mid, tmp);</div><div class="line">  mergeSort(A, mid + 1, end, tmp);</div><div class="line">  merge(A, start, mid, end, tmp);</div><div class="line">&#125;</div><div class="line"></div><div class="line">public void merge(int[] A, int start, int mid, int end, int[] tmp) &#123;</div><div class="line">  int left = start;</div><div class="line">  int right = mid + 1;</div><div class="line">  int index = start;</div><div class="line">  while (left &lt;= mid &amp;&amp; right &lt;= end) &#123;</div><div class="line">    if (A[left] &lt;= A[right]) &#123;</div><div class="line">      tmp[index++] = A[left++];</div><div class="line">    &#125; else &#123;</div><div class="line">      tmp[index++] = A[right++];</div><div class="line">    &#125;</div><div class="line">  &#125;</div><div class="line">  while (left &lt;= mid) &#123;</div><div class="line">    tmp[index++] = A[left++];</div><div class="line">  &#125;</div><div class="line">  while (right &lt;= end) &#123;</div><div class="line">    tmp[index++] = A[right++];</div><div class="line">  &#125;</div><div class="line">  for (index = start; index &lt;= end; index++) &#123;</div><div class="line">    A[index] = tmp[index];</div><div class="line">  &#125;</div><div class="line">&#125;</div></pre></td></tr></table></figure></p>
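<p>The merge step does most of the work in merge sort, so it may help to look at it in isolation. The following is a minimal sketch, not the code from this post: the class <code>MergeStepDemo</code> and the helper <code>mergeSorted</code> are hypothetical names, and unlike the in-place version above it returns a new array.</p>

```java
import java.util.Arrays;

final class MergeStepDemo {
    // Hypothetical helper: merges two already-sorted arrays into a new sorted array.
    static int[] mergeSorted(int[] a, int[] b) {
        int[] out = new int[a.length + b.length];
        int i = 0, j = 0, k = 0;
        while (i < a.length && j < b.length) {
            // "<=" takes from the left array on ties, which keeps the merge stable.
            if (a[i] <= b[j]) {
                out[k++] = a[i++];
            } else {
                out[k++] = b[j++];
            }
        }
        // Copy whatever remains; at most one of these loops does any work.
        while (i < a.length) out[k++] = a[i++];
        while (j < b.length) out[k++] = b[j++];
        return out;
    }

    public static void main(String[] args) {
        int[] merged = mergeSorted(new int[]{2, 4, 7}, new int[]{3, 5});
        System.out.println(Arrays.toString(merged)); // prints [2, 3, 4, 5, 7]
    }
}
```

<p>Each element is copied once per merge and there are about log N levels of merging, which is where the O(N log N) bound comes from.</p>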
]]></content>
    
    <summary type="html">
    
      &lt;script src=&quot;/assets/js/DPlayer.min.js&quot;&gt; &lt;/script&gt;&lt;p&gt;A summary of several sorting methods:
bubble sort, selection sort, insertion sort, Quick Sort, Merge Sort (to be continued...)&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Simple sor
    
    </summary>
    
      <category term="Data Structure" scheme="http:starllap.space/categories/Data-Structure/"/>
    
    
      <category term="Java" scheme="http:starllap.space/tags/Java/"/>
    
      <category term="Sort" scheme="http:starllap.space/tags/Sort/"/>
    
  </entry>
  
</feed>
