博客
关于我
强烈建议你试试无所不能的chatGPT,快点击我
Flink学习笔记记录
阅读量:2080 次
发布时间:2019-04-29

本文共 4374 字,大约阅读时间需要 14 分钟。

技术交流

  • 微信:thinktothings
  • 微博:
  • Flink版本为1.7.2
  • 本站持续更新中.......2019-03-13.......

源码

中文文档

  • Flink 1.7中文文档(官网英文翻译过来) 在线版:
  • Flink 1.7中文文档(官网英文翻译过来) PDF版:

Flink 本地运行交互Shell

  • start-scala-shell.sh local
  • 参数说明: [local | remote | yarn]
benv.fromElements(1,2,3).map(i => i * i ).print
  • 输出结果
149

运行 jar 到 Flink 集群

flink run -c  com.opensourceteams.module.bigdata.flink.example.stream.worldcount.nc.SocketWindowWordCount    ./flink-maven-scala-2-0.0.1.jar

创建flink java 项目

mvn archetype:generate                               \      -DarchetypeGroupId=org.apache.flink              \      -DarchetypeArtifactId=flink-quickstart-java     \      -DarchetypeVersion=1.7.1      -DgroupId=com.opensourceteams \      -DartifactId=flink-maven-java \      -Dversion=0.0.1 \      -Dpackage=com.opensourceteams.module.bigdata.flink  \      -DinteractiveMode=false

创建flink scala项目

mvn archetype:generate                               \      -DarchetypeGroupId=org.apache.flink              \      -DarchetypeArtifactId=flink-quickstart-scala     \      -DarchetypeVersion=1.7.1      -DgroupId=com.opensourceteams \      -DartifactId=flink-maven-scala-2 \      -Dversion=0.0.1 \      -Dpackage=com.opensourceteams.module.bigdata.flink  \      -DinteractiveMode=false

查看jar中文件列表

jar tvf test.jar

maven 运行某个类

mvn exec:java -Dexec.mainClass=wikiedits.WikipediaAnalysis

执行计划图

  • 用Firefox 打开,显示的比较全(别浏览器有显示不全的现象)
  • 地址:
//执行计划      //println(env.getExecutionPlan)      //StreamGraph     //println(env.getStreamGraph.getStreamingPlanAsJSON)

Execute Plan

{"nodes":[{"id":1,"type":"Source: Socket Stream","pact":"Data Source","contents":"Source: Socket Stream","parallelism":1},{"id":2,"type":"Flat Map","pact":"Operator","contents":"Flat Map","parallelism":1,"predecessors":[{"id":1,"ship_strategy":"FORWARD","side":"second"}]},{"id":3,"type":"Map","pact":"Operator","contents":"Map","parallelism":1,"predecessors":[{"id":2,"ship_strategy":"FORWARD","side":"second"}]},{"id":5,"type":"Window(TumblingProcessingTimeWindows(3000), ProcessingTimeTrigger, SumAggregator, PassThroughWindowFunction)","pact":"Operator","contents":"Window(TumblingProcessingTimeWindows(3000), ProcessingTimeTrigger, SumAggregator, PassThroughWindowFunction)","parallelism":1,"predecessors":[{"id":3,"ship_strategy":"HASH","side":"second"}]},{"id":6,"type":"Sink: Print to Std. Out","pact":"Data Sink","contents":"Sink: Print to Std. Out","parallelism":1,"predecessors":[{"id":5,"ship_strategy":"FORWARD","side":"second"}]}]}

StreamGraph Plan

{"nodes":[{"id":1,"type":"Source: Socket Stream","pact":"Data Source","contents":"Source: Socket Stream","parallelism":1},{"id":2,"type":"Flat Map","pact":"Operator","contents":"Flat Map","parallelism":1,"predecessors":[{"id":1,"ship_strategy":"FORWARD","side":"second"}]},{"id":3,"type":"Map","pact":"Operator","contents":"Map","parallelism":1,"predecessors":[{"id":2,"ship_strategy":"FORWARD","side":"second"}]},{"id":5,"type":"Window(TumblingProcessingTimeWindows(3000), ProcessingTimeTrigger, SumAggregator, PassThroughWindowFunction)","pact":"Operator","contents":"Window(TumblingProcessingTimeWindows(3000), ProcessingTimeTrigger, SumAggregator, PassThroughWindowFunction)","parallelism":1,"predecessors":[{"id":3,"ship_strategy":"HASH","side":"second"}]},{"id":6,"type":"Sink: Print to Std. Out","pact":"Data Sink","contents":"Sink: Print to Std. Out","parallelism":1,"predecessors":[{"id":5,"ship_strategy":"FORWARD","side":"second"}]}]}

Flink 环境,配置

  • Flink 源码debug方法:
  • Flink 名词术语 :
  • Flink 源码编译:

example

  • scala 版Flink WordCount单词统计 :
  • wordCount Dataset批处理

    • start-scala-shell.sh local
    • 参数说明:[local | remote | yarn]
    benv.fromElements("a b a c").flatMap(x => x.split(" ")).map((_,1)).groupBy(0).sum(1).print
    • 输出结果
    (a,2)  (b,1)  (c,1)
  • Flink 1.7.2 DataStream operator 示例 :
  • Flink1.7.2 Dataset transformation示例:

Flink1.7.2 DataStream 源码分析(流处理)

  • Flink MiniCluster 作业提交:
  • Flink1.7.2 local WordCount源码分析:
  • Flink Sink 接收数据的顺序(Window发送数据顺序):
  • Flink Window 排序:
  • Flink1.7.2 Source、Window数据交互源码分析:
  • Flink1.7.2 并行计算源码分析:
  • Flink 1.7.2 业务时间戳分析流式数据源码分析:
  • Flink 005-source-operation-sink源码分析:

Flink1.7.2 Dataset 源码分析(批处理)

  • Flink1.7.2 Dataset local 源码分析 :
  • Flink1.7.2 Dataset 文件切片计算方式和切片数据读取源码分析:
  • Flink1.7.2 Dataset 并行计算源码分析:

Flink1.7.2 时序图

  • Flink 客户端提交程序到MiniCluster(时序图):
  • Flink ExecutionGraph的构建和Execution.deploy之前(时序图):
  • Flink Execution deploy和source数据读取(时序图):
  • Flink OperatorChian计算source数据(时序图):

Flink 1.7.2 Error 收集

  • Flink 1.7.2 Error 收集:

转载地址:http://jqfqf.baihongyu.com/

你可能感兴趣的文章
linux中fork()函数详解
查看>>
C语言字符、字符串操作偏僻函数总结
查看>>
Git的Patch功能
查看>>
分析C语言的声明
查看>>
TCP为什么是三次握手,为什么不是两次或者四次 && TCP四次挥手
查看>>
C结构体、C++结构体、C++类的区别
查看>>
进程和线程的概念、区别和联系
查看>>
CMake 入门实战
查看>>
绑定CPU逻辑核心的利器——taskset
查看>>
Linux下perf性能测试火焰图只显示函数地址不显示函数名的问题
查看>>
c结构体、c++结构体和c++类的区别以及错误纠正
查看>>
Linux下查看根目录各文件内存占用情况
查看>>
A星算法详解(个人认为最详细,最通俗易懂的一个版本)
查看>>
利用栈实现DFS
查看>>
逆序对的数量(递归+归并思想)
查看>>
数的范围(二分查找上下界)
查看>>
算法导论阅读顺序
查看>>
Windows程序设计:直线绘制
查看>>
linux之CentOS下文件解压方式
查看>>
Django字段的创建并连接MYSQL
查看>>