Logstash config file (note the webhdfs plugin must sit inside an output block):[code]input {
  kafka {
    zk_connect => "xxxxxxx:2181/kafka/xxx"
    group_id => "logstash_hadoop"
    topic_id => "log"
    reset_beginning => true
    consumer_threads => 5
    decorate_events => true
    codec => json
  }
}
filter {
  date {
    locale => "en"
    match => ["timestamp", "yyyy-MM-dd HH:mm:ss"]
  }
}
output {
  webhdfs {
    use_httpfs => false
    flush_size => 5
    idle_flush_time => 0.5
    workers => 1
    host => "xxx.xxx.xx.x"
    port => 50070
    user => "logstash"
    path => "/api/logs/dt=%{+YYYY-MM-dd}/logstash-%{+HH}.log"
  }
}[/code]
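Before starting the pipeline, the config can be syntax-checked. A minimal sketch, assuming the file is saved as kafka-to-hdfs.conf (a placeholder name) and Logstash 2.x is installed under /usr/local/logstash-2.0.0 as in the backtrace below:[code]# Check the config for valid syntax without starting the pipeline (Logstash 2.x)
/usr/local/logstash-2.0.0/bin/logstash --configtest -f kafka-to-hdfs.conf

# Run the pipeline in the foreground
/usr/local/logstash-2.0.0/bin/logstash -f kafka-to-hdfs.conf[/code]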
Problem 1: Failed to flush outgoing items[code]{:timestamp=>"2016-06-25T09:26:37.151000+0800", :message=>"Failed to flush outgoing items", :outgoing_count=>1, :exception=>"WebHDFS::ServerError", :backtrace=>[
  "/usr/local/logstash-2.0.0/vendor/bundle/jruby/1.9/gems/webhdfs-0.8.0/lib/webhdfs/client_v1.rb:351:in `request'",
  "/usr/local/logstash-2.0.0/vendor/bundle/jruby/1.9/gems/webhdfs-0.8.0/lib/webhdfs/client_v1.rb:349:in `request'",
  "/usr/local/logstash-2.0.0/vendor/bundle/jruby/1.9/gems/webhdfs-0.8.0/lib/webhdfs/client_v1.rb:270:in `operate_requests'",
  "/usr/local/logstash-2.0.0/vendor/bundle/jruby/1.9/gems/webhdfs-0.8.0/lib/webhdfs/client_v1.rb:73:in `create'",
  "/usr/local/logstash-2.0.0/vendor/bundle/jruby/1.9/gems/logstash-output-webhdfs-2.0.4/lib/logstash/outputs/webhdfs.rb:210:in `write_data'",
  "/usr/local/logstash-2.0.0/vendor/bundle/jruby/1.9/gems/logstash-output-webhdfs-2.0.4/lib/logstash/outputs/webhdfs.rb:205:in `write_data'",
  "/usr/local/logstash-2.0.0/vendor/bundle/jruby/1.9/gems/logstash-output-webhdfs-2.0.4/lib/logstash/outputs/webhdfs.rb:195:in `flush'",
  "org/jruby/RubyHash.java:1342:in `each'",
  "/usr/local/logstash-2.0.0/vendor/bundle/jruby/1.9/gems/logstash-output-webhdfs-2.0.4/lib/logstash/outputs/webhdfs.rb:183:in `flush'",
  "/usr/local/logstash-2.0.0/vendor/bundle/jruby/1.9/gems/stud-0.0.22/lib/stud/buffer.rb:219:in `buffer_flush'",
  "org/jruby/RubyHash.java:1342:in `each'",
  "/usr/local/logstash-2.0.0/vendor/bundle/jruby/1.9/gems/stud-0.0.22/lib/stud/buffer.rb:216:in `buffer_flush'",
  "/usr/local/logstash-2.0.0/vendor/bundle/jruby/1.9/gems/stud-0.0.22/lib/stud/buffer.rb:159:in `buffer_receive'",
  "/usr/local/logstash-2.0.0/vendor/bundle/jruby/1.9/gems/logstash-output-webhdfs-2.0.4/lib/logstash/outputs/webhdfs.rb:166:in `receive'",
  "/usr/local/logstash-2.0.0/vendor/bundle/jruby/1.9/gems/logstash-core-2.0.0-java/lib/logstash/outputs/base.rb:80:in `handle'",
  "(eval):409:in `output_func'",
  "/usr/local/logstash-2.0.0/vendor/bundle/jruby/1.9/gems/logstash-core-2.0.0-java/lib/logstash/pipeline.rb:252:in `outputworker'",
  "/usr/local/logstash-2.0.0/vendor/bundle/jruby/1.9/gems/logstash-core-2.0.0-java/lib/logstash/pipeline.rb:169:in `start_outputs'"
], :level=>:warn}[/code]
Solution:
- On the machine running Logstash, add the Hadoop host(s) to /etc/hosts (see the sketch after the commands below).
- The user configured in Logstash must have write permission on the target HDFS directory (when writing as the root/superuser, chown/chgrp are not needed):
[code]hdfs dfs -mkdir -p /api/logs/
hdfs dfs -chown logstash /api/logs/
hdfs dfs -chgrp logstash /api/logs/[/code]
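For the first point, a minimal /etc/hosts sketch on the Logstash machine; the IP addresses and hostnames here are placeholders and must match the NameNode/DataNode hostnames of your cluster. This matters because WebHDFS redirects the client to a DataNode by hostname, so the Logstash host has to be able to resolve those names:[code]# /etc/hosts on the Logstash host (example values only)
192.168.1.10   master
192.168.1.11   datanode1[/code]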
Problem 2: AlreadyBeingCreatedException
Cause:
A single-node cluster was running with the default replication factor of 3, so writes through webhdfs failed with this exception. Setting the replication factor to 1 made writes succeed.
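Before editing the config, the effective replication factor can be confirmed with something like the following (getconf reads the value from the client-side Hadoop configuration):[code]# Print the effective value of dfs.replication
hdfs getconf -confKey dfs.replication[/code]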
[quote]
hdfs-site.xml
[/quote]
[code]<configuration>
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>master:9001</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/data/hadoop/dfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/data/hadoop/dfs/data</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.support.append</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.support.broken.append</name>
    <value>true</value>
  </property>
</configuration>[/code]
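After changing hdfs-site.xml, HDFS needs to be restarted for the new replication factor to take effect. Independent of Logstash, the WebHDFS write path can be smoke-tested with a raw REST call; the hostname and file name below are placeholders:[code]# Step 1: the NameNode answers with a 307 redirect to a DataNode
curl -i -X PUT "http://master:50070/webhdfs/v1/api/logs/test.log?op=CREATE&user.name=logstash"

# Step 2: send the data to the URL from the Location header of step 1
curl -i -X PUT -T test.log "<Location-from-step-1>"[/code]
This also helps isolate the two problems above: if step 2 cannot connect, it is usually hostname resolution; if it connects but returns an error body, it is typically permissions or replication.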