Elasticsearch: Getting Started with the Whole Stack

I've recently moved into the big-data space, yet another area I hadn't touched before. The work is largely built on Elasticsearch: ES reads and writes, combined with data collection, transport, and analysis, provide the core service, and once you mix in middleware, messaging, and the microservice pieces such as a registry and a gateway, you get a complete data-collection and analysis platform. As with any new technology, I go at my own pace: stand up each of the open-source components locally one by one, learn what each does and how it's used, and then wire them together to get a feel for the whole system.

Taking the components one at a time, the rough flow looks like this:

[Diagram: rough data flow across the components]

 

1. First up, naturally, is the core component: ES. Many companies have an ops team that maintains an ES cluster for log search; developers ship their service logs into it, and when an alert fires they go to Kibana and search for INFO, WARN, ERROR, and similar entries to pinpoint the problem. Elasticsearch itself is not only a distributed document store but also a distributed search engine, so let's deploy one locally and try it out.

You can download it straight from the official site, or pull the Docker image if you're deploying in containers. I grabbed the full release package; the download link is below, pick whichever version you like:

https://www.elastic.co/cn/downloads/past-releases#elasticsearch

I downloaded version 7.10.0, which works out of the box. Since it's written in Java, it's worth a look at the JVM config file: config/jvm.options defaults the heap to 1 GB, so adjust it to fit your machine:

################################################################

# Xms represents the initial size of total heap space
# Xmx represents the maximum size of total heap space

-Xms1g
-Xmx1g
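
For a throwaway local test, for instance, the heap could be dialed down by editing those same two lines (just an illustration; size it to your own machine):

-Xms512m
-Xmx512m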

Then start ES directly: bin/elasticsearch

Once it's up, curl port 9200 and you can see the ES version I'm running is 7.10.0:

 lihui@2023  ~  curl 'http://localhost:9200'
{
  "name" : "2023",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "Np25HHCVTQ6X-WVVGMyNHA",
  "version" : {
    "number" : "7.10.0",
    "build_flavor" : "default",
    "build_type" : "tar",
    "build_hash" : "51e9d6f22758d0374a0f3f5c6e8f3a7997850f96",
    "build_date" : "2020-11-09T21:30:33.964949Z",
    "build_snapshot" : false,
    "lucene_version" : "8.7.0",
    "minimum_wire_compatibility_version" : "6.8.0",
    "minimum_index_compatibility_version" : "6.0.0-beta1"
  },
  "tagline" : "You Know, for Search"
}

The jps command shows the Elasticsearch process:

 lihui@2023  ~  jps
65538 Elasticsearch
65831 Jps

Since Elasticsearch is a distributed system, it naturally supports high availability: one node failing or being stopped shouldn't affect the data or the service.

Next, let's start several Elasticsearch node instances and form a cluster. You can either edit the names in the config file or pass them on the command line with -E cluster.name=xxxx; here the cluster is named demo, for example:

lihui@2023  ~/elk/elasticsearch-7.10.0  bin/elasticsearch -E node.name=one -E cluster.name=demo -E path.data=one_data -d
lihui@2023  ~/elk/elasticsearch-7.10.0  bin/elasticsearch -E node.name=two -E cluster.name=demo -E path.data=two_data -d
lihui@2023  ~/elk/elasticsearch-7.10.0  bin/elasticsearch -E node.name=three -E cluster.name=demo -E path.data=three_data -d
lihui@2023  ~/elk/elasticsearch-7.10.0  jps
66192 Elasticsearch
66089 Elasticsearch
66137 Elasticsearch
66202 Jps
 lihui@2023  ~/elk/elasticsearch-7.10.0  lsof -i:9200
COMMAND   PID  USER   FD   TYPE             DEVICE SIZE/OFF NODE NAME
java    66089 lihui  274u  IPv6 0xd49f2a75bef7ea0d      0t0  TCP localhost:wap-wsp (LISTEN)
java    66089 lihui  275u  IPv6 0xd49f2a75bef7f64d      0t0  TCP localhost:wap-wsp (LISTEN)
 lihui@2023  ~/elk/elasticsearch-7.10.0  lsof -i:9201
COMMAND   PID  USER   FD   TYPE             DEVICE SIZE/OFF NODE NAME
java    66137 lihui  300u  IPv6 0xd49f2a7595c047ad      0t0  TCP localhost:wap-wsp-wtp (LISTEN)
java    66137 lihui  301u  IPv6 0xd49f2a758a2492ed      0t0  TCP localhost:wap-wsp-wtp (LISTEN)
 lihui@2023  ~/elk/elasticsearch-7.10.0  lsof -i:9202
COMMAND   PID  USER   FD   TYPE             DEVICE SIZE/OFF NODE NAME
java    66192 lihui  313u  IPv6 0xd49f2a75c0cdf90d      0t0  TCP localhost:wap-wsp-s (LISTEN)
java    66192 lihui  327u  IPv6 0xd49f2a75c0cdf2ed      0t0  TCP localhost:wap-wsp-s (LISTEN)

Sure enough, three Java processes are running, i.e. three Elasticsearch instances, listening on ports 9200-9202.
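
To double-check that the three instances really formed one cluster, the _cat/nodes API can be queried against any of them (a quick sanity check; the exact columns in the output will vary):

curl 'http://localhost:9200/_cat/nodes?v'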

 

Next, we can install Cerebro to inspect the Elasticsearch nodes in detail. Download: https://github.com/lmenezes/cerebro/releases/tag/v0.9.4

Start Cerebro:

 lihui@2023  ~/elk/cerebro-0.9.4  bin/cerebro
[info] play.api.Play - Application started (Prod) (no global state)
[info] p.c.s.AkkaHttpServer - Listening for HTTP on /0:0:0:0:0:0:0:0:9000

Open the link: http://localhost:9000/

You'll land on the web page. The node address is simply the Elasticsearch address; since all three of my instances run on the same local machine, entering port 9200 is enough.

[Screenshot: Cerebro connection page with the node address]

Once connected, you can see every Elasticsearch instance in the demo cluster: three nodes named one, two, and three, all deployed locally on 127.0.0.1, with 0 indices, 0 shards, 0 documents, and 0 B of data stored.

[Screenshot: Cerebro overview of the demo cluster]

Here is how primary shards and replicas relate to each other:

[Screenshot: shard and replica distribution across the three nodes]

Primary shards P0, P1, and P2 are spread across the three nodes, and each shard's replica lives on a different node than its primary. For high availability, a primary and its replica must never sit on the same node; otherwise a single failure would lose the data outright. Only when they live on different nodes can a failure be handled by failing over to the replica. One thing to note: the number of primary shards is fixed when the index is created. If an index already has its shards spread across three nodes and you later add a fourth node, the existing index will not be re-split onto the new node; at most its existing shards can be relocated.
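
To make that concrete, here is a minimal sketch of pinning the shard and replica counts at index-creation time (Dev Tools syntax; demo-index is just a made-up name):

PUT /demo-index
{
  "settings": {
    "number_of_shards": 3,
    "number_of_replicas": 1
  }
}

number_of_replicas can still be changed on a live index, but number_of_shards cannot; growing an existing index means reindexing (or using the split API).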

 

2. Next up is Kibana, which provides data visualization on top of Elasticsearch. Downloads for all versions: https://www.elastic.co/cn/downloads/past-releases#kibana

Again I downloaded 7.10.0, and it works out of the box. The one setting to watch is this: since I started three instances locally, all of them need to be listed:

# The URLs of the Elasticsearch instances to use for all your queries.
elasticsearch.hosts: ["http://localhost:9200","http://localhost:9201","http://localhost:9202"]

If Kibana and Elasticsearch are not on the same machine, point this at the Elasticsearch host instead.
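
For example, pointing Kibana at a remote cluster might look like this (es-node1 and es-node2 are hypothetical hostnames):

elasticsearch.hosts: ["http://es-node1:9200", "http://es-node2:9200"]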

Start Kibana: bin/kibana

It threw an error and failed to start:

  log   [09:59:33.646] [error][data][elasticsearch] [resource_already_exists_exception]: index [.kibana_1/42rHlHt5RAmAe6xeywF3Cg] already exists
  log   [09:59:33.647] [warning][savedobjects-service] Unable to connect to Elasticsearch. Error: resource_already_exists_exception
  log   [09:59:33.648] [warning][savedobjects-service] Another Kibana instance appears to be migrating the index. Waiting for that migration to complete. If no other Kibana instance is attempting migrations, you can get past this message by deleting index .kibana_1 and restarting Kibana.
  log   [09:59:33.654] [info][savedobjects-service] Creating index .kibana_task_manager_1.
  log   [09:59:33.662] [error][data][elasticsearch] [resource_already_exists_exception]: index [.kibana_task_manager_1/mvyFrbD_TEC7EZVN7NVLEw] already exists
  log   [09:59:33.663] [warning][savedobjects-service] Unable to connect to Elasticsearch. Error: resource_already_exists_exception
  log   [09:59:33.663] [warning][savedobjects-service] Another Kibana instance appears to be migrating the index. Waiting for that migration to complete. If no other Kibana instance is attempting migrations, you can get past this message by deleting index .kibana_task_manager_1 and restarting Kibana.

A round of googling made this look odd: my Kibana and ES versions match exactly, so a compatibility problem shouldn't be possible. The support matrix is here: https://www.elastic.co/cn/support/matrix#matrix_compatibility

Following a suggestion from the internet, I deleted the leftover indices:

curl -XDELETE http://localhost:9200/.kibana*

That still didn't help, oddly enough, so I simply killed the cluster's ES processes and started a single ES instance; the extra entries in elasticsearch.hosts above can then be commented out. The page is at http://localhost:5601

[Screenshot: Kibana home page]

Dev Tools is extremely handy: you can call the APIs to inspect any ES data. For instance, to check the cluster state, just issue a GET request to _cluster/health:
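
In Dev Tools that's a one-liner, and the same call works over curl as well:

GET _cluster/health

curl 'http://localhost:9200/_cluster/health?pretty'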

[Screenshot: _cluster/health response in Dev Tools]

You can see this is an Elasticsearch cluster with status green, i.e. healthy. The cluster has a single node, which is also its only data node, and all 6 shards are primaries. Compare that with Cerebro:

Cerebro shows one node, 2023, with 6 indices in total; all 6 shards are primaries with no replicas, holding 30 docs and 10.66 MB of data.

[Screenshot: Cerebro overview of the single-node cluster]
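
The same index list can also be pulled straight from Dev Tools (the output will differ on your machine):

GET _cat/indices?v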

Some Elasticsearch concepts differ a bit from a traditional SQL database. An index, for example, is not a B+Tree index but a collection of documents of the same kind. The rough analogy:

RDBMS     ES
Table     Index
Row       Document
Column    Field
Schema    Mapping
SQL       DSL

Through Kibana you can inspect any data in ES. You could in fact create indices and documents right here through API calls, but to show the end-to-end effect I'll leave the data part for later.
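
Just to show what such calls look like, a minimal sketch in Dev Tools (test-user and the field values are made up):

PUT /test-user/_doc/1
{
  "name": "lihui",
  "age": 20
}

GET /test-user/_search
{
  "query": {
    "match": { "name": "lihui" }
  }
}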

 

3. Logstash collects, transforms, filters, and outputs data. Downloads for all versions: https://www.elastic.co/cn/downloads/past-releases#logstash

Once more I downloaded 7.10.0. Let's run a test: take the data in a log file, have Logstash read, filter, and output it, and write it into ES.

Create a source file, user.csv (each line is a JSON object):

{"name":"lilei", "age":22, "url":"lilei.com"}
{"name":"lucy", "age":23, "url":"lucy.com"}
{"name":"lihao", "age":24, "url":"lihao.com"}
{"name":"lily", "age":25, "url":"lily.com"}
{"name":"hanmeimei", "age":26, "url":"haimeimei.com"}
{"name":"tom", "age":27, "url":"tom.com"}

This serves as the source data. Next, create the Logstash config file logstash.conf:

input {
  file {
    path => "/Users/lihui/user.csv"
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}

filter {

}

output {
  elasticsearch {
    hosts => "http://localhost:9200"
    index => "user"
    document_id => "%{id}"
  }
  stdout {}
}

The config file is easy to read at face value: input specifies the source file; filter can hold filtering rules (none here); output specifies where the filtered data goes, in this case the local ES on port 9200, written into the index user.

Now start Logstash with that config file: ./logstash -f logstash.conf

It threw an error:

  Pipeline_id:main
  Plugin: "beginning", path=>["/Users/lihui/user.csv"], id=>"cb4f1782edea376545ea345a79bf77bd21052f6a3077abec87011b432c583322", sincedb_path=>"/dev/null", enable_metric=>true, codec=>"plain_d5a484b6-2475-4e82-a084-40bda805fd0b", enable_metric=>true, charset=>"UTF-8">, stat_interval=>1.0, discover_interval=>15, sincedb_write_interval=>15.0, delimiter=>"\n", close_older=>3600.0, mode=>"tail", file_completed_action=>"delete", sincedb_clean_after=>1209600.0, file_chunk_size=>32768, file_chunk_count=>140737488355327, file_sort_by=>"last_modified", file_sort_direction=>"asc", exit_after_read=>false, check_archive_validity=>false>
  Error: Permission denied - Permission denied
  Exception: Errno::EACCES
  Stack: org/jruby/RubyFile.java:1267:in `utime'
uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/fileutils.rb:1133:in `block in touch'

It looks like a permissions problem; running with sudo gets past it: sudo ./logstash -f logstash.conf

The terminal prints the following content being written to ES:

[2023-02-26T20:21:01,028][INFO ][logstash.javapipeline    ][main] Pipeline Java execution initialization time {"seconds"=>0.93}
[2023-02-26T20:21:01,236][INFO ][logstash.javapipeline    ][main] Pipeline started {"pipeline.id"=>"main"}
[2023-02-26T20:21:01,282][INFO ][filewatch.observingtail  ][main][cb4f1782edea376545ea345a79bf77bd21052f6a3077abec87011b432c583322] START, creating Discoverer, Watch with file and sincedb collections
[2023-02-26T20:21:01,305][INFO ][logstash.agent           ] Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
[2023-02-26T20:21:01,548][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600}
{
       "message" => "{\"name\":\"lilei\", \"age\":22, \"url\":\"lilei.com\"}",
    "@timestamp" => 2023-02-26T12:21:01.647Z,
          "path" => "/Users/lihui/user.csv",
      "@version" => "1",
          "host" => "2023"
}
{
       "message" => "{\"name\":\"lihao\", \"age\":24, \"url\":\"lihao.com\"}",
    "@timestamp" => 2023-02-26T12:21:01.667Z,
          "path" => "/Users/lihui/user.csv",
      "@version" => "1",
          "host" => "2023"
}
{
       "message" => "{\"name\":\"tom\", \"age\":27, \"url\":\"tom.com\"}",
    "@timestamp" => 2023-02-26T12:21:01.668Z,
          "path" => "/Users/lihui/user.csv",
      "@version" => "1",
          "host" => "2023"
}
{
       "message" => "{\"name\":\"lucy\", \"age\":23, \"url\":\"lucy.com\"}",
    "@timestamp" => 2023-02-26T12:21:01.667Z,
          "path" => "/Users/lihui/user.csv",
      "@version" => "1",
          "host" => "2023"
}
{
       "message" => "{\"name\":\"hanmeimei\", \"age\":26, \"url\":\"haimeimei.com\"}",
    "@timestamp" => 2023-02-26T12:21:01.668Z,
          "path" => "/Users/lihui/user.csv",
      "@version" => "1",
          "host" => "2023"
}
{
       "message" => "{\"name\":\"lily\", \"age\":25, \"url\":\"lily.com\"}",
    "@timestamp" => 2023-02-26T12:21:01.668Z,
          "path" => "/Users/lihui/user.csv",
      "@version" => "1",
          "host" => "2023"
}

In Kibana you can see the data that Logstash wrote into the ES index user:

[Screenshot: index user in Kibana index management]

Note that the health of the index user is yellow rather than green. Yellow usually means the primary shards are fine but the replicas are not. Let's check the shards in Kibana:

[Screenshot: shard status showing the unassigned replica]

As expected, the replica of user is unhealthy. That's easy to explain: with a single-node deployment there is nowhere to place a replica, and we don't need one anyway, so just drop it, straight from Kibana:

[Screenshot: removing the replica in Kibana]
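
The same change can also be made from Dev Tools by setting the replica count to zero (a sketch equivalent to the Kibana operation):

PUT /user/_settings
{
  "index": { "number_of_replicas": 0 }
}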

With that, it's back to normal:

[Screenshot: index user back to green]

And so the file's data has been written into ES through Logstash:

[Screenshot: documents of index user in Kibana]

There's a catch, though: only one document made it in. My guess is that the _id was never populated properly (the source lines have no id field, so "%{id}" can't resolve), leaving every subsequent document with the same _id and turning the writes into updates rather than inserts. The fix is to stop specifying document_id in logstash.conf; document _ids are strings and ES will auto-generate one. After redoing the import, the messages do show up in Discover:

[Screenshot: Discover showing the imported messages]
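
For reference, a minimal sketch of the adjusted output section, with document_id removed so that ES auto-generates the _id:

output {
  elasticsearch {
    hosts => "http://localhost:9200"
    index => "user"
  }
  stdout {}
}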

So this is Logstash collecting text log data, running it through a filter, reassembling it, and shipping it to Elasticsearch for storage, where it becomes searchable, for example through Kibana. Besides reading directly from a file, Logstash can take its input from stdin, Filebeat, Kafka, Redis, and so on, for instance the lightweight collector Filebeat.
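
As a tiny illustration of another input, a stdin-to-stdout pipeline can be sketched like this (an assumed minimal config, handy for trying out filters interactively):

input {
  stdin {}
}

output {
  stdout { codec => rubydebug }
}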

 

4. Filebeat

This one is very lightweight, written in Go, and very efficient at collecting data. Downloads for all versions:

https://www.elastic.co/cn/downloads/past-releases#filebeat

Again I downloaded 7.10.0. This time Filebeat does the collecting while Logstash receives, filters, and forwards to ES, so the configs need to change to reflect the new data flow.

The test log file, demo.log:

2023-02-26T22:00:00Z The first Document
2023-02-26T22:01:00Z The second Document
2023-02-26T22:02:00Z The third Document
2023-02-26T22:03:00Z The fourth Document
2023-02-26T22:04:00Z The fifth Document
2023-02-26T22:05:00Z The sixth Document
2023-02-26T22:06:00Z The seventh Document
2023-02-26T22:07:00Z The eighth Document
2023-02-26T22:08:00Z The nighth Document
2023-02-26T22:09:00Z The tenth Document

The filebeat-demo.yml config file specifies the input source file and points the output at Logstash:

filebeat.inputs:
  type: log
  enabled: true
  paths:
    ./demo.log

output.logstash:
  hosts: ["localhost:8888"]

The logstash.conf file takes its input from Filebeat, sends output to Elasticsearch, and adds a filter in between:

input {
  beats {
    port => 8888
  }
}

filter {
  grok {
    match => [
      "message", "%{TIMESTAMP_ISO8601:timestamp_string}%{SPACE}%{GREEDYDATA:line}"
    ]
  }

  date {
    match => ["timestamp_string", "ISO8601"]
  }

  mutate {
    remove_field => ["message", "timestamp_string"]
  }
}

output {
  elasticsearch {
    hosts => "http://localhost:9200"
    document_type => "_doc"
    index => "demo"
  }
  stdout {
    codec => rubydebug
  }
}

The grok match works like a regular expression: timestamp_string captures the timestamp portion of the line, SPACE matches the whitespace, and line captures the rest of the string.

The date filter parses timestamp_string as ISO8601 and stores it in the @timestamp field; without this step, @timestamp would be the processing time rather than the time recorded in the log line.

Finally, mutate removes the original message and timestamp_string fields.
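
For example, for the line 2023-02-26T22:00:00Z The first Document, the grok pattern should yield roughly the following fields before mutate drops the intermediate ones (my annotation, not actual Logstash output):

timestamp_string => "2023-02-26T22:00:00Z"
line             => "The first Document"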

Start Logstash; it listens on port 8888 as configured, which is why the Filebeat config points at port 8888:

[2023-02-26T22:19:12,031][INFO ][logstash.javapipeline    ][main] Pipeline Java execution initialization time {"seconds"=>0.97}
[2023-02-26T22:19:12,057][INFO ][logstash.inputs.beats    ][main] Beats inputs: Starting input listener {:address=>"0.0.0.0:8888"}
[2023-02-26T22:19:12,085][INFO ][logstash.javapipeline    ][main] Pipeline started {"pipeline.id"=>"main"}
[2023-02-26T22:19:12,159][INFO ][logstash.agent           ] Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
[2023-02-26T22:19:12,221][INFO ][org.logstash.beats.Server][main][1da348c69b339b6e040093af28eaebe2fa75547c695d42ba57cc09a242ef50a7] Starting server on port: 8888
[2023-02-26T22:19:12,447][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600}

Start Filebeat, and it throws an error:

./filebeat -c filebeat-demo.yml
Exiting: no modules or inputs enabled and configuration reloading disabled. What files do you want me to watch?

Gotcha: the config file was missing two dashes:

filebeat.inputs:
- type: log
  enabled: true
  paths:
    - ./demo.log

output.logstash:
  hosts: ["localhost:8888"]

That does it. Once demo.log is collected and handed to Logstash, the following logs get printed:

[2023-02-26T22:19:12,221][INFO ][org.logstash.beats.Server][main][1da348c69b339b6e040093af28eaebe2fa75547c695d42ba57cc09a242ef50a7] Starting server on port: 8888
[2023-02-26T22:19:12,447][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600}



{
      "@version" => "1",
           "ecs" => {
        "version" => "1.6.0"
    },
         "agent" => {
                "name" => "2023",
                  "id" => "c171d8ac-3546-44d4-a156-73aef217e0c4",
                "type" => "filebeat",
             "version" => "7.10.0",
        "ephemeral_id" => "144bc0ef-4562-4d65-91de-6f58b40217b0",
            "hostname" => "2023"
    },
          "tags" => [
        [0] "beats_input_codec_plain_applied"
    ],
           "log" => {
          "file" => {
            "path" => "/Users/lihui/elk/filebeat-7.10.0-darwin-x86_64/demo.log"
        },
        "offset" => 202
    },
          "line" => "The sixth Document",
    "@timestamp" => 2023-02-26T22:05:00.000Z,
          "host" => {
        "name" => "2023"
    },
         "input" => {
        "type" => "log"
    }
}
{
      "@version" => "1",
           "ecs" => {
        "version" => "1.6.0"
    },
         "agent" => {
                "name" => "2023",
                  "id" => "c171d8ac-3546-44d4-a156-73aef217e0c4",
                "type" => "filebeat",
             "version" => "7.10.0",
        "ephemeral_id" => "144bc0ef-4562-4d65-91de-6f58b40217b0",
            "hostname" => "2023"
    },
          "tags" => [
        [0] "beats_input_codec_plain_applied"
    ],
           "log" => {
          "file" => {
            "path" => "/Users/lihui/elk/filebeat-7.10.0-darwin-x86_64/demo.log"
        },
        "offset" => 40
    },
          "line" => "The second Document",
    "@timestamp" => 2023-02-26T22:01:00.000Z,
          "host" => {
        "name" => "2023"
    },
         "input" => {
        "type" => "log"
    }
}
{
      "@version" => "1",
           "ecs" => {
        "version" => "1.6.0"
    },
         "agent" => {
                "name" => "2023",
                  "id" => "c171d8ac-3546-44d4-a156-73aef217e0c4",
                "type" => "filebeat",
             "version" => "7.10.0",
        "ephemeral_id" => "144bc0ef-4562-4d65-91de-6f58b40217b0",
            "hostname" => "2023"
    },
          "tags" => [
        [0] "beats_input_codec_plain_applied"
    ],
           "log" => {
          "file" => {
            "path" => "/Users/lihui/elk/filebeat-7.10.0-darwin-x86_64/demo.log"
        },
        "offset" => 366
    },
          "line" => "The tenth Document",
    "@timestamp" => 2023-02-26T22:09:00.000Z,
         "input" => {
        "type" => "log"
    },
          "host" => {
        "name" => "2023"
    }
}
{
      "@version" => "1",
           "ecs" => {
        "version" => "1.6.0"
    },
         "agent" => {
                "name" => "2023",
                  "id" => "c171d8ac-3546-44d4-a156-73aef217e0c4",
                "type" => "filebeat",
             "version" => "7.10.0",
        "ephemeral_id" => "144bc0ef-4562-4d65-91de-6f58b40217b0",
            "hostname" => "2023"
    },
          "tags" => [
        [0] "beats_input_codec_plain_applied"
    ],
           "log" => {
          "file" => {
            "path" => "/Users/lihui/elk/filebeat-7.10.0-darwin-x86_64/demo.log"
        },
        "offset" => 81
    },
          "line" => "The third Document",
    "@timestamp" => 2023-02-26T22:02:00.000Z,
         "input" => {
        "type" => "log"
    },
          "host" => {
        "name" => "2023"
    }
}
{
      "@version" => "1",
           "ecs" => {
        "version" => "1.6.0"
    },
         "agent" => {
                "name" => "2023",
                "type" => "filebeat",
                  "id" => "c171d8ac-3546-44d4-a156-73aef217e0c4",
             "version" => "7.10.0",
        "ephemeral_id" => "144bc0ef-4562-4d65-91de-6f58b40217b0",
            "hostname" => "2023"
    },
          "tags" => [
        [0] "beats_input_codec_plain_applied"
    ],
           "log" => {
          "file" => {
            "path" => "/Users/lihui/elk/filebeat-7.10.0-darwin-x86_64/demo.log"
        },
        "offset" => 121
    },
          "line" => "The fourth Document",
    "@timestamp" => 2023-02-26T22:03:00.000Z,
         "input" => {
        "type" => "log"
    },
          "host" => {
        "name" => "2023"
    }
}
{
      "@version" => "1",
           "ecs" => {
        "version" => "1.6.0"
    },
         "agent" => {
                "name" => "2023",
                  "id" => "c171d8ac-3546-44d4-a156-73aef217e0c4",
                "type" => "filebeat",
             "version" => "7.10.0",
        "ephemeral_id" => "144bc0ef-4562-4d65-91de-6f58b40217b0",
            "hostname" => "2023"
    },
          "tags" => [
        [0] "beats_input_codec_plain_applied"
    ],
           "log" => {
          "file" => {
            "path" => "/Users/lihui/elk/filebeat-7.10.0-darwin-x86_64/demo.log"
        },
        "offset" => 284
    },
          "line" => "The eighth Document",
    "@timestamp" => 2023-02-26T22:07:00.000Z,
          "host" => {
        "name" => "2023"
    },
         "input" => {
        "type" => "log"
    }
}
{
      "@version" => "1",
           "ecs" => {
        "version" => "1.6.0"
    },
         "agent" => {
                "name" => "2023",
                "type" => "filebeat",
                  "id" => "c171d8ac-3546-44d4-a156-73aef217e0c4",
             "version" => "7.10.0",
        "ephemeral_id" => "144bc0ef-4562-4d65-91de-6f58b40217b0",
            "hostname" => "2023"
    },
          "tags" => [
        [0] "beats_input_codec_plain_applied"
    ],
           "log" => {
          "file" => {
            "path" => "/Users/lihui/elk/filebeat-7.10.0-darwin-x86_64/demo.log"
        },
        "offset" => 0
    },
          "line" => "The first Document",
    "@timestamp" => 2023-02-26T22:00:00.000Z,
          "host" => {
        "name" => "2023"
    },
         "input" => {
        "type" => "log"
    }
}
{
      "@version" => "1",
           "ecs" => {
        "version" => "1.6.0"
    },
         "agent" => {
                "name" => "2023",
                  "id" => "c171d8ac-3546-44d4-a156-73aef217e0c4",
                "type" => "filebeat",
             "version" => "7.10.0",
        "ephemeral_id" => "144bc0ef-4562-4d65-91de-6f58b40217b0",
            "hostname" => "2023"
    },
          "tags" => [
        [0] "beats_input_codec_plain_applied"
    ],
           "log" => {
          "file" => {
            "path" => "/Users/lihui/elk/filebeat-7.10.0-darwin-x86_64/demo.log"
        },
        "offset" => 325
    },
          "line" => "The nighth Document",
    "@timestamp" => 2023-02-26T22:08:00.000Z,
         "input" => {
        "type" => "log"
    },
          "host" => {
        "name" => "2023"
    }
}
{
      "@version" => "1",
           "ecs" => {
        "version" => "1.6.0"
    },
         "agent" => {
                "name" => "2023",
                  "id" => "c171d8ac-3546-44d4-a156-73aef217e0c4",
                "type" => "filebeat",
             "version" => "7.10.0",
        "ephemeral_id" => "144bc0ef-4562-4d65-91de-6f58b40217b0",
            "hostname" => "2023"
    },
          "tags" => [
        [0] "beats_input_codec_plain_applied"
    ],
           "log" => {
          "file" => {
            "path" => "/Users/lihui/elk/filebeat-7.10.0-darwin-x86_64/demo.log"
        },
        "offset" => 242
    },
          "line" => "The seventh Document",
    "@timestamp" => 2023-02-26T22:06:00.000Z,
          "host" => {
        "name" => "2023"
    },
         "input" => {
        "type" => "log"
    }
}
{
      "@version" => "1",
           "ecs" => {
        "version" => "1.6.0"
    },
         "agent" => {
                "name" => "2023",
                  "id" => "c171d8ac-3546-44d4-a156-73aef217e0c4",
                "type" => "filebeat",
             "version" => "7.10.0",
        "ephemeral_id" => "144bc0ef-4562-4d65-91de-6f58b40217b0",
            "hostname" => "2023"
    },
          "tags" => [
        [0] "beats_input_codec_plain_applied"
    ],
           "log" => {
          "file" => {
            "path" => "/Users/lihui/elk/filebeat-7.10.0-darwin-x86_64/demo.log"
        },
        "offset" => 162
    },
          "line" => "The fifth Document",
    "@timestamp" => 2023-02-26T22:04:00.000Z,
         "input" => {
        "type" => "log"
    },
          "host" => {
        "name" => "2023"
    }
}

You can see the data has been formatted and filtered; this should be the shape in which it gets stored in ES.

Finally, let's look at the data in Kibana:

[Screenshot: demo index data in Kibana]

And that's a wrap.
