Bayesian Methods to Estimate Future Load in Web Farms

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3034)

Included in the following conference series:

International Atlantic Web Intelligence Conference

Abstract

Web farms are clustered systems designed to provide high-availability, high-performance web services. A web farm is a group of replicated HTTP servers that reply to web requests forwarded by a single point of access to the service. To handle this task, the point of access executes a load balancing algorithm that distributes web requests among the group of servers. Present algorithms provide a short-term dynamic configuration for this operation, but some corrective actions (granting different session priorities or distributed WAN forwarding) cannot be achieved without a long-term estimation of the future web load. In this paper we propose a method to forecast web service workload. Our approach also includes an innovative segmentation method for web pages using EDAs (estimation of distribution algorithms) and the application of semi-naïve Bayes classifiers to predict future web load several minutes in advance. All our analysis has been performed using real data from a worldwide academic portal.
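To make the forecasting idea in the abstract concrete, the following is a minimal sketch of load prediction with a plain naïve Bayes classifier. It is not the method of the paper, which uses semi-naïve Bayes structures learned with EDAs and a page segmentation step; the load thresholds, the window and horizon sizes, and all function and class names here are assumptions chosen only for illustration.

```python
import math
from collections import Counter, defaultdict

LEVELS = ("low", "medium", "high")


def discretize(count, low_max=100, medium_max=300):
    """Map a request count to a coarse load level (thresholds are hypothetical)."""
    if count <= low_max:
        return "low"
    if count <= medium_max:
        return "medium"
    return "high"


def make_examples(levels, window=3, horizon=2):
    """Pair each window of past levels with the level `horizon` intervals later."""
    examples = []
    for i in range(window, len(levels) - horizon + 1):
        features = tuple(levels[i - window:i])   # last `window` observed levels
        target = levels[i + horizon - 1]         # level `horizon` steps after them
        examples.append((features, target))
    return examples


class NaiveBayes:
    """Categorical naive Bayes with Laplace smoothing (illustrative only)."""

    def __init__(self, classes=LEVELS, alpha=1.0):
        self.classes = classes
        self.alpha = alpha
        self.class_counts = Counter()
        # (feature position, class) -> counts of the values seen at that position
        self.feature_counts = defaultdict(Counter)

    def fit(self, examples):
        for features, target in examples:
            self.class_counts[target] += 1
            for pos, value in enumerate(features):
                self.feature_counts[(pos, target)][value] += 1
        return self

    def predict(self, features):
        total = sum(self.class_counts.values())
        k = len(self.classes)  # each feature takes one of k load levels
        best, best_score = None, -math.inf
        for c in self.classes:
            # log prior plus the sum of per-position log likelihoods
            score = math.log((self.class_counts[c] + self.alpha) / (total + self.alpha * k))
            for pos, value in enumerate(features):
                num = self.feature_counts[(pos, c)][value] + self.alpha
                den = self.class_counts[c] + self.alpha * k
                score += math.log(num / den)
            if score > best_score:
                best, best_score = c, score
        return best


# Synthetic request counts per interval; real use would read these from server logs.
counts = [50, 50, 200, 200, 400, 400] * 20
levels = [discretize(c) for c in counts]
model = NaiveBayes().fit(make_examples(levels, window=3, horizon=2))
forecast = model.predict(("low", "low", "medium"))
```

A point of access could use such a forecast to trigger corrective actions some intervals ahead. The independence assumption between past intervals is exactly what semi-naïve Bayes approaches relax.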

Author information

Authors and Affiliations

DATSI, Universidad Politécnica de Madrid, Madrid, Spain
José M. Peña, Víctor Robles & María S. Pérez
DLSIS, Universidad Politécnica de Madrid, Madrid, Spain
Óscar Marbán

Editor information

Editors and Affiliations

Department of Computer Science, CICESE Research Center, Ensenada, México
Jesús Favela
Facultad de Informática, Universidad Politécnica de Madrid, Campus de Montegancedo s/n, 28660, Boadilla del Monte (Madrid), Spain
Ernestina Menasalvas
Escuela de Ciencias Físico-Matemáticas, Universidad Michoacana de San Nicolás de Hidalgo, Av. Francisco J. Mujica, Morelia - Michoacán, México
Edgar Chávez

About this paper

Cite this paper

Peña, J.M., Robles, V., Marbán, Ó., Pérez, M.S. (2004). Bayesian Methods to Estimate Future Load in Web Farms. In: Favela, J., Menasalvas, E., Chávez, E. (eds) Advances in Web Intelligence. AWIC 2004. Lecture Notes in Computer Science, vol 3034. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24681-7_24

DOI: https://doi.org/10.1007/978-3-540-24681-7_24
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-22009-1
Online ISBN: 978-3-540-24681-7
eBook Packages: Springer Book Archive
