
Performance Tuning and Configuration of the Distributed Search Engine Elasticsearch

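I. Memory Optimization

Configure this in bin/elasticsearch.in.sh

Raise the heap settings as far as is practical (common guidance is to give the heap no more than half of physical RAM and to stay below roughly 32 GB):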

ES_MIN_MEM=8g
ES_MAX_MEM=8g

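It is best to set both to the same value; otherwise the JVM resizes the heap at runtime, which easily triggers long GC pauses (stop-the-world).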
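As a rough sketch of how one might choose the value for ES_MIN_MEM and ES_MAX_MEM: the helper names below are hypothetical, and the rule they encode (half of physical RAM, capped at 31 GB) is an assumed rule of thumb rather than anything the settings file enforces.

```python
def recommended_heap_gb(total_ram_gb, cap_gb=31):
    """Return a heap size in whole GB: half of RAM, capped at cap_gb (assumed limit)."""
    if total_ram_gb < 2:
        raise ValueError("this rule of thumb needs at least 2 GB of RAM")
    return min(total_ram_gb // 2, cap_gb)


def heap_settings(total_ram_gb):
    """Render the two lines for bin/elasticsearch.in.sh with identical min and max."""
    heap = recommended_heap_gb(total_ram_gb)
    return ["ES_MIN_MEM=%dg" % heap, "ES_MAX_MEM=%dg" % heap]


# A 16 GB machine yields the same 8g values used in the example above.
print("\n".join(heap_settings(16)))
```

Emitting both lines from one number keeps the minimum and maximum from drifting apart.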

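Elasticsearch uses the CMS collector by default. With a heap larger than about 6 GB, CMS struggles and is prone to stop-the-world pauses, so the author recommends switching to G1 GC. (Be aware that Elastic's own guidance for the 2.x releases advised staying with CMS, so test before switching.)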

JAVA_OPTS="$JAVA_OPTS -XX:+UseParNewGC"
JAVA_OPTS="$JAVA_OPTS -XX:+UseConcMarkSweepGC"

JAVA_OPTS="$JAVA_OPTS -XX:CMSInitiatingOccupancyFraction=75"
JAVA_OPTS="$JAVA_OPTS -XX:+UseCMSInitiatingOccupancyOnly"

# Change to:

JAVA_OPTS="$JAVA_OPTS -XX:+UseG1GC"
JAVA_OPTS="$JAVA_OPTS -XX:MaxGCPauseMillis=200"

The advantage of G1 GC is a lower chance of stop-the-world pauses, at the cost of higher CPU usage.

II. Server Configuration

1. Node configuration

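cluster.name: es-cluster  # Cluster name; nodes use it to find each other during discovery
node.name: "node-100"  # Name of this node; it must be unique within the cluster
node.master: false
node.data: true
# These two settings can be combined in four ways. They control whether the node
# may become master and whether it stores data.
# With only two nodes, assign the master and data roles by hand to avoid split brain.
# In other cases leave both unset; each defaults to true.

network.host: "0.0.0.0"  # Bind address; 0.0.0.0 means all IPs. For security, bind to an internal IP instead
transport.tcp.port: 9300  # Node-to-node traffic runs over TCP; this sets the port it uses
http.port: 9200  # Port for the external HTTP service; for security, consider changing it from the default 9200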

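When master is false and data is true, the node is a data-only node: it stores shards and executes indexing and search, so it carries a heavy load.
When master is true and data is false, the node holds no data and serves purely as a coordinator for the cluster.
When master is false and data is also false, the node becomes a load balancer that only routes requests.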
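The four combinations can be laid out as a small lookup. This is only an illustrative sketch: the function name and the role labels are hypothetical, chosen to match the descriptions in this section.

```python
def node_role(master, data):
    """Map the node.master / node.data flags to the role described in the text."""
    if master and data:
        return "default: master-eligible and stores data"
    if data:
        return "data only: stores shards, carries the heavy load"
    if master:
        return "master-eligible only: coordinates the cluster"
    return "neither: acts as a load balancer"


# Print all four combinations in elasticsearch.yml notation.
for master in (True, False):
    for data in (True, False):
        flags = "node.master: %s, node.data: %s" % (str(master).lower(), str(data).lower())
        print("%s -> %s" % (flags, node_role(master, data)))
```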

2. Automatic discovery

ES offers four options: one is the default implementation and the others are provided by plugins.

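Azure discovery: plugin; finds nodes through the Azure API (unicast)
EC2 discovery: plugin; finds nodes through the AWS API (unicast)
Google Compute Engine (GCE) discovery: plugin; finds nodes through the GCE API (unicast)
zen discovery: default implementation; multicast or unicast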

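Automatic discovery is built into every Elasticsearch cluster.

With unicast, a node sends unicast requests to the hosts you specify. The configuration looks like this: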

discovery.zen.ping.multicast.enabled: false
discovery.zen.fd.ping_timeout: 100s
discovery.zen.ping.timeout: 100s
discovery.zen.minimum_master_nodes: 2
discovery.zen.ping.unicast.hosts: ["172.31.26.200", "172.31.26.222"]

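With multicast, you only need to set the cluster name and node name on each node. Nodes that can reach each other then use Elasticsearch's own discovery protocol to search, via multicast, for other nodes on the network that are configured with the same cluster name. Like other service-discovery mechanisms, ES supports both multicast and unicast, and both are controlled by the following parameters: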

discovery.zen.ping.multicast.enabled: true
discovery.zen.fd.ping_timeout: 100s
discovery.zen.ping.timeout: 100s
discovery.zen.minimum_master_nodes: 2
discovery.zen.ping.unicast.hosts: ["172.31.26.200"]

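discovery.zen.ping.multicast.enabled: setting this to false turns off multicast discovery, which keeps nodes on other machines from joining automatically.
discovery.zen.fd.ping_timeout and discovery.zen.ping.timeout: the ping timeouts between nodes, for fault detection and for discovery respectively.
discovery.zen.minimum_master_nodes: guards against split brain. In a 3-node cluster with this set to 2, a node that gets cut off cannot make itself master.
discovery.zen.ping.unicast.hosts: the hosts to contact during discovery.
action.auto_create_index: false  # Disables automatic index creation. This is also for security: even on an internal network there are plenty of scanners, and with this enabled they will create many indices for you.

Multicast depends on whether the network supports it. For security reasons, most cloud providers today (Alibaba Cloud, for example) do not, so even with multicast enabled you will only find the nodes on the local machine. Unicast is both safe and efficient; its drawback is that when you add a new machine, you have to update the configuration on every node before it takes effect.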
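Because unicast means touching every node's configuration when a machine is added, one way to keep the nodes consistent is to render the hosts line from a single list. The sketch below is hypothetical (the function name and the third address are made up for illustration); it only formats text and does not write any file.

```python
def unicast_hosts_line(hosts):
    """Format the discovery.zen.ping.unicast.hosts line from a list of addresses."""
    quoted = ", ".join('"%s"' % host for host in hosts)
    return "discovery.zen.ping.unicast.hosts: [%s]" % quoted


hosts = ["172.31.26.200", "172.31.26.222"]
print(unicast_hosts_line(hosts))

# Adding a machine (hypothetical address): regenerate the line and roll it out to every node.
hosts.append("172.31.26.230")
print(unicast_hosts_line(hosts))
```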

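3. Automatic election

Once an Elasticsearch cluster is up, it elects one master and the other nodes act as slaves. In day-to-day operation, though, every node accepts both reads and writes. Whichever node receives a write forwards it to the node holding the target primary shard, and from there it is copied to that shard's replicas; it is not copied to every node in the cluster.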

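If a slave node goes down

The first concern is whether data is lost. It is not, provided replication is enabled: a copy of the data then exists on another machine, and a replica shard on a surviving node is automatically promoted to primary for that shard.

Note that the cluster spends a short time in yellow status while this happens.

If the master node goes down

The slave nodes notice they can no longer reach the master and elect a new master among themselves. This is where split brain comes in. Suppose there are 5 machines, 3 in one data center and 2 in another. If the link between the data centers breaks, the nodes in each data center group together and elect their own master. Two masters now exist, and when the link is restored the data conflicts.

The fix is to set discovery.zen.minimum_master_nodes to 3 (more than half the nodes). When the link between the data centers breaks, the side that still has at least 3 nodes keeps its master, and the nodes in the other data center stop serving.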
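The "more than half" rule can be written out as arithmetic. This is a minimal sketch with hypothetical function names; it assumes all five machines are master-eligible.

```python
def minimum_master_nodes(master_eligible):
    """Smallest node count that is strictly more than half of the master-eligible nodes."""
    return master_eligible // 2 + 1


def can_elect_master(partition_size, master_eligible):
    """A partition may elect a master only if it still reaches the quorum."""
    return partition_size >= minimum_master_nodes(master_eligible)


quorum = minimum_master_nodes(5)
print("quorum for 5 nodes:", quorum)
# The 3-node data center keeps a master; the 2-node data center cannot elect one.
print("3-node side:", can_elect_master(3, 5))
print("2-node side:", can_elect_master(2, 5))
```

Since two disjoint partitions cannot both hold more than half of the nodes, at most one side can elect a master.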

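It follows that if clients connect directly to individual nodes, there is always a single point of failure no matter how the master changes. For that reason a load balancer is usually placed in front of the nodes that serve requests.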

4. Too many open files

Check max_file_descriptors

curl http://localhost:9200/_nodes/process\?pretty
{
  "cluster_name" : "elasticsearch",
  "nodes" : {
    "qAZYd8jbSWKxFAcOu9Ax9Q" : {
      "name" : "Masque",
      "transport_address" : "127.0.0.1:9300",
      "host" : "127.0.0.1",
      "ip" : "127.0.0.1",
      "version" : "2.2.1",
      "build" : "d045fc2",
      "http_address" : "127.0.0.1:9200",
      "process" : {
        "refresh_interval_in_millis" : 1000,
        "id" : 31917,
        "mlockall" : true
      }
    }
  }
}

However, max_file_descriptors does not appear in this output, so read the limit from /proc instead:

# ps -ef | grep 'Xms' | grep -v grep | awk '{print $2}'
31917

# cat /proc/31917/limits  | grep 'Max open files'
Max open files            102400               102400               files
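The check above can also be scripted. The following is a small sketch, not part of Elasticsearch: the function name is hypothetical, it only parses text in the format shown above, and it does not handle a limit reported as "unlimited".

```python
def max_open_files(limits_text):
    """Return (soft, hard) open-file limits from the text of /proc/<pid>/limits."""
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            fields = line.split()
            # Fields are: "Max", "open", "files", soft limit, hard limit, units.
            return int(fields[3]), int(fields[4])
    raise ValueError("no 'Max open files' line found")


sample = "Max open files            102400               102400               files"
soft, hard = max_open_files(sample)
print("soft=%d hard=%d" % (soft, hard))
```

On a live system the input would come from reading /proc/<pid>/limits for the Elasticsearch process id.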

The official documentation recommends:

Make sure to increase the number of open files descriptors on the machine (or for the user running elasticsearch). Setting it to 32k or even 64k is recommended.　

The simplest approach is to add the following at the start of the bin/elasticsearch file:

ulimit -n 102400

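5. Set a reasonable refresh interval

Newly indexed documents cannot be found immediately, because Elasticsearch is near-real-time: a document becomes searchable only after the next refresh. The index.refresh_interval parameter controls this and defaults to 1s.

index.refresh_interval: 1s

With this setting, every newly created index uses this refresh interval.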
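The interval can also be changed on an existing index through the update-settings endpoint (PUT /<index>/_settings). As a hedged sketch, the hypothetical helper below only builds the JSON body; the "30s" value is an example, not a recommendation from the original text.

```python
import json


def refresh_settings_body(interval):
    """Build the JSON body for PUT /<index>/_settings that changes refresh_interval."""
    return json.dumps({"index": {"refresh_interval": interval}})


# Example value only: a longer interval trades search freshness for indexing throughput.
print(refresh_settings_body("30s"))
```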

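6. Large numbers of unassigned shards

Right after setup the cluster already reports status: yellow (all primary shards are available, but some replica shards are not). With a single node, the primary shards start and handle requests normally, but there are unassigned_shards, that is, replica shards that have not been assigned to any node, because a replica is never placed on the same node as its primary.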

curl -XGET http://localhost:9200/_cluster/health\?pretty
{
  "cluster_name" : "elasticsearch",
  "status" : "yellow",
  "timed_out" : false,
  "number_of_nodes" : 1,
  "number_of_data_nodes" : 1,
  "active_primary_shards" : 15,
  "active_shards" : 15,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 15,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 50.0
}

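Fix: find a machine on the internal network and deploy a second node. Checking cluster health again shows unassigned_shards dropping and active_shards rising; once allocation finishes, cluster health goes from yellow back to green.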
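To make the health output easier to read at a glance, here is a small sketch that summarizes a parsed response. The function name and hint texts are hypothetical, and the percentage formula (active over active plus initializing plus unassigned) is an assumption chosen because it reproduces the 50.0 in the sample above.

```python
def summarize_health(health):
    """Summarize a parsed _cluster/health response (a dict)."""
    total = (health["active_shards"]
             + health["initializing_shards"]
             + health["unassigned_shards"])
    percent = 100.0 * health["active_shards"] / total if total else 100.0
    if health["status"] == "yellow" and health["number_of_data_nodes"] == 1:
        hint = "replicas have no second data node to live on; add a node"
    elif health["unassigned_shards"]:
        hint = "some shards are unassigned; inspect _cat/shards"
    else:
        hint = "all shards are assigned"
    return {"active_percent": percent, "hint": hint}


# Values copied from the sample response above.
sample = {
    "status": "yellow",
    "number_of_data_nodes": 1,
    "active_shards": 15,
    "initializing_shards": 0,
    "unassigned_shards": 15,
}
print(summarize_health(sample))
```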

7. Fix unassigned shards

Find the UNASSIGNED shards

curl -s "http://localhost:9200/_cat/shards" | grep UNASSIGNED
index                3 p UNASSIGNED
index                3 r UNASSIGNED
index                1 p UNASSIGNED
index                1 r UNASSIGNED
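The shard numbers needed for the reroute step can be pulled out of this output programmatically. A minimal sketch, assuming the four-column layout shown above (index, shard, p/r, state); the function name is hypothetical.

```python
def unassigned_shards(cat_output):
    """Return sorted (index, shard, kind) tuples for UNASSIGNED rows of _cat/shards."""
    rows = []
    for line in cat_output.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[3] == "UNASSIGNED":
            rows.append((fields[0], int(fields[1]), fields[2]))
    return sorted(rows)


sample = """index                3 p UNASSIGNED
index                3 r UNASSIGNED
index                1 p UNASSIGNED
index                1 r UNASSIGNED"""

rows = unassigned_shards(sample)
# Distinct shard numbers that still need a primary allocated.
print(sorted({shard for _, shard, kind in rows if kind == "p"}))
```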

Look up the unique identifier of the master node

curl 'localhost:9200/_nodes/process?pretty'

{
  "cluster_name" : "elasticsearch",
  "nodes" : {
    "AfUyuXmGTESHXpwi4OExxx" : {
      "name" : "Master",

Run reroute (once per shard, changing the shard value to each number from the UNASSIGNED query; the previous step returned 1 and 3)

curl -XPOST 'localhost:9200/_cluster/reroute' -d '{
        "commands" : [ {
              "allocate" : {
                  "index" : "pv-2015.05.22",
                  "shard" : 1,
                  "node" : "AfUyuXmGTESHXpwi4OExxx",
                  "allow_primary" : true
              }
            }
        ]
    }'
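Since the command has to be repeated once per shard, the request bodies can be generated rather than edited by hand. This sketch only builds the JSON for the allocate command shown above and sends nothing; the function name is hypothetical. Keep in mind that allow_primary set to true can allocate an empty primary, so data that existed only on the lost copy is gone.

```python
import json


def reroute_body(index, shard, node, allow_primary=True):
    """Build the JSON body for POST /_cluster/reroute with one allocate command."""
    command = {
        "allocate": {
            "index": index,
            "shard": shard,
            "node": node,
            "allow_primary": allow_primary,
        }
    }
    return json.dumps({"commands": [command]})


# One body per unassigned shard number found earlier (1 and 3).
for shard in (1, 3):
    print(reroute_body("pv-2015.05.22", shard, "AfUyuXmGTESHXpwi4OExxx"))
```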

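III. Plugin Installation

Once the cluster is installed, you will want to inspect its index data, running state and so on, so it is worth installing a few plugins to make later work easier.

1. head

With head you can view almost all information about the cluster, run simple search queries, watch automatic recovery, and more.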

ES_HOME/bin/plugin -install mobz/elasticsearch-head

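After installation, open: http://ip:9200/_plugin/head/

For example, cluster health: green (2, 20) means the cluster is currently healthy, contains 2 machines, and its indices have 20 shards in total. A thick green border marks a primary shard; a thin green border marks a replica shard.

2. bigdesk

bigdesk is a cluster monitoring plugin. It shows resource consumption across the whole cluster: CPU, memory, HTTP connections and so on.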

ES_HOME/bin/plugin -install lukas-vlcek/bigdesk

After installation, open http://ip:9200/_plugin/bigdesk/#nodes

You can inspect the resource usage of a single node, including JVM, Thread Pools, OS, Process, HTTP & Transport, Indices, and File system.

Plugin list: http://www.searchtech.pro/elasticsearch-plugins
