Abstract
Humans have long explored the physical world by means of information. Only recently, with the rapid development of information technologies and the growing accumulation of data, have data-driven methods made it possible to learn more about the unknown world. Because the value of data depends on its timeliness, awareness of the importance of real-time data is growing. Data processing technologies fall into two categories, batch processing of big data and stream processing, which have not been well integrated. We therefore propose an incremental processing technology named Stream Cube that processes both big data and stream data. We also implement a real-time intelligent data processing system based on real-time acquisition, real-time processing, real-time analysis, and real-time decision-making. The system is equipped with a batch big data platform, data analysis tools, and machine learning models. Our applications and analysis indicate that the real-time intelligent data processing system is a crucial solution to problems in the national society and economy.
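The abstract does not describe the internals of Stream Cube, so the following is only a generic sketch of what incremental processing means: a historical batch and live stream events pass through the same update path, and each new event adjusts a small running state instead of triggering a full recomputation. The names `RunningStats` and `IncrementalAggregator` are hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class RunningStats:
    """Aggregate state that can be updated one event at a time."""
    count: int = 0
    total: float = 0.0

    def update(self, value: float) -> None:
        # Constant-time update: no earlier events need to be revisited.
        self.count += 1
        self.total += value

    @property
    def mean(self) -> float:
        return self.total / self.count if self.count else 0.0


class IncrementalAggregator:
    """Hypothetical per-key aggregator shared by batch and stream input."""

    def __init__(self) -> None:
        self._state = defaultdict(RunningStats)

    def ingest(self, key: str, value: float) -> RunningStats:
        # Stream path: apply a single event and return the fresh state.
        stats = self._state[key]
        stats.update(value)
        return stats

    def load_batch(self, events) -> None:
        # Batch path: replay historical events through the same update.
        for key, value in events:
            self.ingest(key, value)

    def stats(self, key: str) -> RunningStats:
        return self._state[key]


agg = IncrementalAggregator()
agg.load_batch([("a", 1.0), ("a", 3.0), ("b", 10.0)])  # historical data
latest = agg.ingest("a", 5.0)                           # live event
print(latest.count, latest.total, latest.mean)
```

Because the batch loader reuses `ingest`, the state after a batch load followed by live events equals the state obtained by streaming every event in order, which is the property a unified batch and stream design relies on.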
Author information
Authors and Affiliations
College of Computer Science and Technology, Zhejiang University, Hangzhou, 310027, China
Tongya Zheng, Gang Chen, Xinyu Wang, Chun Chen, Xingen Wang & Sihui Luo
Corresponding author
Correspondence to Xingen Wang.
About this article
Cite this article
Zheng, T., Chen, G., Wang, X. et al. Real-time intelligent big data processing: technology, platform, and applications. Sci. China Inf. Sci. 62, 82101 (2019). https://doi.org/10.1007/s11432-018-9834-8