
Challenges in Resource Provisioning for the Execution of Data Wrangling Workflows on the Cloud: A Case Study

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12392)

Abstract

Data Wrangling (DW) is an essential component of any big data analytics job, encompassing a large variety of complex operations to transform, integrate and clean sets of unrefined data. The inherent complexity and execution cost of DW workflows make provisioning resources from a cloud provider a sensible way to execute them in a reasonable amount of time. However, without detailed profiles of the input data and of the operations composing these workflows, selecting cloud resources to run them is hard: the search space is large, and the interactions, dependencies, trade-offs and prices of the candidate resources all need to be considered. In this paper, we investigate the complex problem of provisioning cloud resources for DW workflows by carrying out a case study on a specific Traffic DW workflow from the Smart Cities domain. We carry out a number of simulations in which we vary resource provisioning, focusing on what may impact the execution of the DW workflow most. The insights obtained from our results suggest that fine-grained cloud resource provisioning based on the workflow execution profile and input data properties has the potential to improve resource utilization and prevent significant over- and under-provisioning.
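
To make the provisioning trade-off described above concrete, the following is a minimal illustrative sketch and not the simulation setup used in the paper. It assumes a hypothetical VM catalogue with made-up prices, and it assumes workflow runtime follows a simple Amdahl's-law model; the names `VmType`, `CATALOGUE`, `estimated_runtime_hours`, `enumerate_options` and `cheapest_within_deadline` are invented for this example.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class VmType:
    name: str
    vcpus: int
    price_per_hour: float  # hypothetical price, not a real provider quote


# Hypothetical catalogue: every value here is an assumption for illustration.
CATALOGUE = [
    VmType("small", 2, 0.10),
    VmType("medium", 4, 0.20),
    VmType("large", 8, 0.40),
]


def estimated_runtime_hours(serial_hours, parallel_fraction, total_vcpus):
    """Amdahl's-law estimate: only parallel_fraction of the work speeds up."""
    return serial_hours * ((1 - parallel_fraction) + parallel_fraction / total_vcpus)


def enumerate_options(serial_hours, parallel_fraction, max_vms):
    """List (vm name, vm count, runtime in hours, total cost) for each choice."""
    options = []
    for vm, count in product(CATALOGUE, range(1, max_vms + 1)):
        runtime = estimated_runtime_hours(
            serial_hours, parallel_fraction, vm.vcpus * count
        )
        cost = runtime * vm.price_per_hour * count
        options.append((vm.name, count, runtime, cost))
    return options


def cheapest_within_deadline(options, deadline_hours):
    """Return the lowest-cost option meeting the deadline, or None if none does."""
    feasible = [o for o in options if o[2] <= deadline_hours]
    return min(feasible, key=lambda o: o[3]) if feasible else None


if __name__ == "__main__":
    # Assumed workload profile: 10 hours on one vCPU, 90% parallelisable.
    opts = enumerate_options(serial_hours=10, parallel_fraction=0.9, max_vms=3)
    print(cheapest_within_deadline(opts, deadline_hours=2))
```

Even this toy model shows why a workflow profile matters: without an estimate of the parallel fraction, there is no way to tell which configurations are over-provisioned (more vCPUs than the workflow can exploit) and which are under-provisioned (unable to meet the deadline at any price).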

Acknowledgement

Partial support from the H2020 I-BiDaaS project (grant agreement No. 780787) is gratefully acknowledged.

Author information

Authors and Affiliations

  1. Department of Computer Science, The University of Manchester, Manchester, UK

    Abdullah Khalid A. Almasaud, Agresh Bharadwaj, Sandra Sampaio & Rizos Sakellariou

Authors
  1. Abdullah Khalid A. Almasaud

  2. Agresh Bharadwaj

  3. Sandra Sampaio

  4. Rizos Sakellariou

Corresponding author

Correspondence to Sandra Sampaio.

Editor information

Editors and Affiliations

  1. Clausthal University of Technology, Clausthal-Zellerfeld, Germany

    Sven Hartmann

  2. Johannes Kepler University of Linz, Linz, Austria

    Josef Küng

  3. Johannes Kepler University of Linz, Linz, Austria

    Gabriele Kotsis

  4. IFS, Vienna University of Technology, Vienna, Wien, Austria

    A Min Tjoa

  5. Johannes Kepler University of Linz, Linz, Austria

    Ismail Khalil

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Almasaud, A.K.A., Bharadwaj, A., Sampaio, S., Sakellariou, R. (2020). Challenges in Resource Provisioning for the Execution of Data Wrangling Workflows on the Cloud: A Case Study. In: Hartmann, S., Küng, J., Kotsis, G., Tjoa, A.M., Khalil, I. (eds) Database and Expert Systems Applications. DEXA 2020. Lecture Notes in Computer Science, vol 12392. Springer, Cham. https://doi.org/10.1007/978-3-030-59051-2_5
