Part of the book series: Communications in Computer and Information Science (CCIS, volume 729)
Abstract
Evaluating the performance of a cloud platform and identifying the main factors that influence it is crucial. Moreover, analysis of the relevant performance indicators can support theoretical predictions of the platform's performance status. This work studies the relationship between the performance indicators of a Spark-based cloud platform and the load performance of the cluster, and uses that relationship to predict the load. First, we put forward a framework for Spark performance analysis, the analysis of specific indicators, and a prediction model for the cluster load. Second, we discuss the basis for selecting the evaluation indicators and what each of them means, and then use the MLR (Multiple Linear Regression) method to calculate the correlation formula between the performance parameters measured while the Spark cluster runs batch applications and the load performance of the cluster, thereby determining the main factors that affect the load performance. Finally, we predict the load value using the Spark indicator analysis and the load prediction model. The results indicate that accuracy is up to 92.307%. Consequently, the solution presented in this paper predicts the cluster load value efficiently.
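As a rough illustration of the MLR step named in the abstract, the following is a minimal sketch of fitting a multiple linear regression by least squares and using it to predict a load value. It is not the authors' implementation. The three indicators, the coefficients in `TRUE_COEF`, the synthetic data, and all function names are assumptions made for the example. The abstract does not say how the reported accuracy is computed, so the sketch reports the coefficient of determination (R^2) on held-out samples instead.

```python
import random


def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_mlr(X, y):
    """Least-squares fit of y = b0 + b1*x1 + ... + bk*xk via the normal equations."""
    rows = [[1.0] + list(x) for x in X]  # leading 1.0 models the intercept b0
    k = len(rows[0])
    AtA = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    Aty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(AtA, Aty)


def predict(coef, x):
    """Evaluate the fitted regression for one indicator vector x."""
    return coef[0] + sum(c * v for c, v in zip(coef[1:], x))


def r_squared(y, y_hat):
    """Coefficient of determination of predictions y_hat against observations y."""
    mean_y = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot


# Synthetic data (assumption): three hypothetical indicators in [0, 1],
# for example CPU, memory, and disk I/O utilisation, and a noisy linear load.
random.seed(0)
TRUE_COEF = [0.5, 2.0, 1.0, 0.2]  # hypothetical intercept and weights
X = [[random.random() for _ in range(3)] for _ in range(200)]
y = [predict(TRUE_COEF, x) + random.gauss(0.0, 0.05) for x in X]

# Fit on the first 150 samples, evaluate on the remaining 50.
X_train, y_train = X[:150], y[:150]
X_test, y_test = X[150:], y[150:]
coef = fit_mlr(X_train, y_train)
y_hat = [predict(coef, x) for x in X_test]

print("coefficients:", [round(c, 3) for c in coef])
print("R^2 on held-out samples:", round(r_squared(y_test, y_hat), 3))
```

In a sketch like this, the relative size of the fitted coefficients (for indicators on a common scale) gives a first indication of which indicators weigh most on the load, which is the kind of conclusion the abstract draws from the correlation formula.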
Acknowledgments
This work is sponsored by the National Natural Science Foundation of P. R. China (Nos. 61373017, 61572260, 61572261, 61672296, 61602261), the Natural Science Foundation of Jiangsu Province (Nos. BK20140886, BK20140888, BK20160089), the Scientific & Technological Support Project of Jiangsu Province (Nos. BE2015702, BE2016777, BE2016185), the China Postdoctoral Science Foundation (Nos. 2014M551636, 2014M561696), the Jiangsu Planned Projects for Postdoctoral Research Funds (Nos. 1302090B, 1401005B), the Jiangsu High Technology Research Key Laboratory for Wireless Sensor Networks Foundation (No. WSNLBZY201508), and the Research Innovation Program for College Graduates of Jiangsu Province (SJZZ16_0148).
Author information
Authors and Affiliations
School of Computer Science and Technology, Nanjing University of Posts and Telecommunications, Nanjing, 210003, China
Lu Dong, Peng Li, He Xu, Baozhou Luo & Yu Mi
Jiangsu High Technology Research Key Laboratory for Wireless Sensor Networks, Jiangsu Province, Nanjing, 210003, China
Peng Li & He Xu
Corresponding author
Correspondence to Peng Li.
Editor information
Editors and Affiliations
Nanjing University of Posts and Telecommunications, Nanjing, Jiangsu, China
Guoliang Chen
Sun Yat-sen University, Guangzhou, Guangdong, China
Hong Shen
Hainan University, Haikou, Hainan, China
Mingrui Chen
Copyright information
© 2017 Springer Nature Singapore Pte Ltd
About this paper
Cite this paper
Dong, L., Li, P., Xu, H., Luo, B., Mi, Y. (2017). Performance Prediction of Spark Based on the Multiple Linear Regression Analysis. In: Chen, G., Shen, H., Chen, M. (eds) Parallel Architecture, Algorithm and Programming. PAAP 2017. Communications in Computer and Information Science, vol 729. Springer, Singapore. https://doi.org/10.1007/978-981-10-6442-5_7
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-6441-8
Online ISBN: 978-981-10-6442-5
eBook Packages: Computer Science, Computer Science (R0)