1. What is Storm?
Apache Storm is a free and open source distributed realtime computation system. Apache Storm makes it easy to reliably process unbounded streams of data, doing for realtime processing what Hadoop did for batch processing. Apache Storm is simple, can be used with any programming language, and is a lot of fun to use!
Apache Storm has many use cases: realtime analytics, online machine learning, continuous computation, distributed RPC, ETL, and more. Apache Storm is fast: a benchmark clocked it at over a million tuples processed per second per node. It is scalable, fault-tolerant, guarantees your data will be processed, and is easy to set up and operate.
Apache Storm integrates with the queueing and database technologies you already use. An Apache Storm topology consumes streams of data and processes those streams in arbitrarily complex ways, repartitioning the streams between each stage of the computation however needed. Read more in the tutorial.
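To make the topology idea above concrete, here is a minimal sketch in plain Python. It does not use the Storm API at all: `sentence_spout`, `split_bolt`, `fields_grouping`, and `run_topology` are hypothetical names chosen for illustration, and the stream is finite so the example terminates. It only shows the shape of the computation: a source emits tuples, one stage transforms them, and the stream is repartitioned by key before the next stage.

```python
from collections import Counter
import zlib


def sentence_spout():
    """Stand-in for a spout: emits a small, finite stream of sentences."""
    for sentence in ["storm is fast", "storm is scalable"]:
        yield sentence


def split_bolt(sentences):
    """First stage: split every sentence into individual word tuples."""
    for sentence in sentences:
        for word in sentence.split():
            yield word


def fields_grouping(word, num_tasks):
    """Repartition by key: the same word is always routed to the same task."""
    return zlib.crc32(word.encode("utf-8")) % num_tasks


def run_topology(num_count_tasks=2):
    """Second stage: each simulated task keeps counts for the words routed to it."""
    tasks = [Counter() for _ in range(num_count_tasks)]
    for word in split_bolt(sentence_spout()):
        tasks[fields_grouping(word, num_count_tasks)][word] += 1
    return tasks


if __name__ == "__main__":
    for task_id, counts in enumerate(run_topology()):
        print(task_id, dict(counts))
```

Because the routing is keyed on the word, every occurrence of a given word lands on one counting task, so each task can keep a correct local count without coordinating with the others.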
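Features
- Fast: a benchmark clocked it at over a million tuples processed per second per node.
- Scalable: processing can be spread across more nodes as load grows.
- Fault-tolerant: it guarantees your data will be processed, and it is easy to set up and operate.
Storm can process high-frequency data and large-scale data in real time.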
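2. Advantages of Storm
- Programming model: lets Java engineers write highly concurrent real-time processing jobs for big data quickly and efficiently, which greatly reduces development cost
- Scalability
- Reliability: guarantees that every message emitted by a data source is fully processed (at least once by default; exactly-once semantics are available through the Trident API)
- Fault tolerance
- Multi-language support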
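The reliability point in the list above can be pictured with a small simulation. This is only a sketch under stated assumptions, not Storm's implementation: `ReplayingSpout` and `run` are hypothetical names, and the "lost acknowledgement" is simulated by a set of payloads passed in by the caller. The source keeps each message until it is acknowledged and replays it on failure, so nothing is lost, but a replayed message can be seen more than once.

```python
class ReplayingSpout:
    """Hypothetical source that remembers each message until it is acked."""

    def __init__(self, messages):
        self.queue = list(enumerate(messages))  # (msg_id, payload) pairs
        self.pending = {}                       # emitted but not yet acked

    def next_tuple(self):
        if not self.queue:
            return None
        msg_id, payload = self.queue.pop(0)
        self.pending[msg_id] = payload
        return msg_id, payload

    def ack(self, msg_id):
        # Fully processed: safe to forget the message.
        del self.pending[msg_id]

    def fail(self, msg_id):
        # Not confirmed: put the message back so it is emitted again.
        self.queue.append((msg_id, self.pending.pop(msg_id)))


def run(messages, lose_first_ack_for):
    """Process all messages; simulate one lost ack for the given payloads."""
    spout = ReplayingSpout(messages)
    attempts = {}
    seen = []
    while True:
        item = spout.next_tuple()
        if item is None:
            break
        msg_id, payload = item
        attempts[msg_id] = attempts.get(msg_id, 0) + 1
        seen.append(payload)  # the side effect happens on every attempt
        if payload in lose_first_ack_for and attempts[msg_id] == 1:
            spout.fail(msg_id)  # ack never arrived, so the message is replayed
        else:
            spout.ack(msg_id)
    return seen, spout.pending


if __name__ == "__main__":
    seen, pending = run(["a", "b", "c"], lose_first_ack_for={"b"})
    print(seen, pending)
```

In this run no message is dropped and nothing is left pending, but "b" shows up twice in `seen`. That is why downstream logic under an at-least-once guarantee is usually written to tolerate duplicates.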
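3. Use cases
E-commerce
- Etao (一淘): analyzes user attributes in real time and feeds the results back to the search engine
- Ctrip (携程): a real-time analytics system monitors the performance of the Ctrip website
- Alimama (阿里妈妈): computes user interest data in real time…
Telecom
- For example, sending an SMS notice as soon as a subscriber exceeds their mobile data allowance; the closer to real time the notice is, the less the user loses
- Call interception, for example analysis that identifies fraudulent calls…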
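The data-allowance case above boils down to keeping a running total per subscriber and reacting the moment it crosses a limit. A minimal sketch in plain Python follows; the function name `quota_alerts`, the event shape `(user, megabytes)`, and the 1000 MB allowance are all assumptions made for illustration, and a real deployment would send an SMS instead of yielding a value.

```python
from collections import defaultdict


def quota_alerts(usage_events, quota_mb):
    """Yield (user, total_mb) once, the first time a user exceeds the quota."""
    used = defaultdict(int)  # running total per user
    alerted = set()          # users who have already been notified
    for user, megabytes in usage_events:
        used[user] += megabytes
        if used[user] > quota_mb and user not in alerted:
            alerted.add(user)
            yield user, used[user]


if __name__ == "__main__":
    events = [
        ("alice", 400),
        ("bob", 100),
        ("alice", 700),
        ("alice", 50),
        ("bob", 200),
    ]
    for user, total in quota_alerts(events, quota_mb=1000):
        print(f"notify {user}: {total} MB used")
```

The alert fires on the event that crosses the limit rather than at the end of a batch, which is the property the telecom example depends on.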