Flume: Building a Highly Available, Scalable Massive Log Collection System (Flume:構建高可用、可擴展的海量日誌採集系統)

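"Flume: Building a Highly Available, Scalable Massive Log Collection System" is a book by Hari Shreedharan (USA), published by Publishing House of Electronics Industry in August 2015.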

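Basic Information

  • Title: Flume: Building a Highly Available, Scalable Massive Log Collection System
  • Author: Hari Shreedharan (USA)
  • Translators: Ma Yanhui (馬延輝), Shi Dongjie (史東傑)
  • ISBN: 978-7-121-26558-7
  • Pages: 232
  • Price: 69.00 yuan
  • Publisher: Publishing House of Electronics Industry
  • Publication date: August 2015
  • Binding: Paperback
  • Format: 16mo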

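Summary

Starting from Flume's basic concepts and design principles, the book introduces the different kinds of components, how to configure them, and how to run a Flume Agent. It then discusses the three core components, Source, Channel, and Sink, in turn: beyond explaining the basic concepts behind each one, it uses practical programming examples to cover each component's usage thoroughly and in depth. These components are also the most important part of the Flume framework. Next, it covers interceptors, Channel selectors, Sink groups, and Sink processors, which give Flume its flexible extensibility. Finally, it turns to advanced usage: how to use the Flume software development kit (SDK) and the Embedded Agent API, and how to design, deploy, and monitor a production Flume cluster.
In short, the book combines theory with practice and treats massive log collection systems with both depth and breadth.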

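About the Author

Hari Shreedharan is a software engineer at Cloudera, where he works on Apache Spark, Apache Flume, and Apache Sqoop. He is also a committer and PMC member on the Flume project, helping to decide the project's direction.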

Table of Contents

Translators' Preface
Foreword
Preface
Chapter 1: Introducing Apache Hadoop and Apache HBase
The HDFS Distributed File System
HDFS Data Formats
Processing Data in HDFS
Apache HBase
Summary
References
Chapter 2: Processing Streaming Data with Apache Flume
The Need for Flume
Is Flume a Good Fit?
Inside a Flume Agent
Configuring a Flume Agent
Communication Between Flume Agents
Complex Flows
Replicating Data to Different Destinations
Dynamic Routing
Flume's No-Data-Loss Guarantee, Channels, and Transactions
Transactions in Flume Channels
Agent Failure and Data Loss
The Importance of Batching
What About Duplicates?
Running a Flume Agent
Summary
References
Chapter 3: Sources
Lifecycle of a Source
Sink-to-Source Communication
Avro Source
Thrift Source
Failure Handling in RPC Sources
HTTP Source
Writing Handlers for the HTTP Source*
Spooling Directory Source
Reading Custom Formats Using Deserializers*
Spooling Directory Source Performance
Syslog Source
Exec Source
JMS Source
Converting JMS Messages to Flume Events*
Writing Custom Sources*
Event-Driven Sources and Pollable Sources
Summary
References
Chapter 4: Channels
Transaction Workflow
Channels Bundled with Flume
Memory Channel
File Channel
Summary
References
Chapter 5: Sinks
Lifecycle of a Sink
Optimizing Sink Performance
Writing to HDFS: The HDFS Sink
Understanding Buckets
Configuring the HDFS Sink
Controlling the Data Format Using Serializers*
HBase Sink
Converting Flume Events to HBase Puts and Increments Using Serializers*
RPC Sink
Avro Sink
Thrift Sink
Morphline Solr Sink
Elastic Search Sink
Customizing the Data Format*
Other Sinks: Null Sink, Rolling File Sink, and Logger Sink
Writing Custom Sinks*
Summary
References
Chapter 6: Interceptors, Channel Selectors, Sink Groups, and Sink Processors
Interceptors
Timestamp Interceptor
Host Interceptor
Static Interceptor
Regex Filtering Interceptor
Morphline Interceptor
UUID Interceptor
Writing Interceptors*
Channel Selectors
Replicating Channel Selector
Multiplexing Channel Selector
Custom Channel Selectors*
Sink Groups and Sink Processors
Load-Balancing Sink Processor
Failover Sink Processor
Summary
References
Chapter 7: Sending Data to Flume*
Building Flume Events
Flume Client SDK
Creating Flume RPC Clients
RPC Client Interface
Configuration Parameters Common to All RPC Clients
Default RPC Client
Load-Balancing RPC Client
Failover RPC Client
Thrift RPC Client
Embedded Agent
Configuring an Embedded Agent
log4j Appender
Load-Balancing log4j Appender
Summary
References
Chapter 8: Planning, Deploying, and Monitoring Flume
Planning a Flume Deployment
Repair Time
How Much Capacity Do My Flume Channels Need?
How Many Tiers?
Sending Data over Cross-Data-Center Links
Tier Sharding
Deploying Flume
Deploying Custom Code
Monitoring Flume
Reporting Metrics from Custom Components
Summary
References
Index