Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3034)
Abstract
Web farms are clustered systems designed to provide high-availability, high-performance web services. A web farm is a group of replicated HTTP servers that reply to web requests forwarded by a single point of access to the service. To handle this task, the point of access executes a load balancing algorithm that distributes web requests among the group of servers. Current algorithms provide a short-term dynamic configuration for this operation, but some corrective actions (granting different session priorities or distributed WAN forwarding) cannot be achieved without a long-term estimation of the future web load. In this paper we propose a method to forecast web service workload. Our approach also includes an innovative segmentation method for web pages using EDAs (estimation of distribution algorithms) and the application of semi-naïve Bayes classifiers to predict future web load several minutes in advance. All our analysis has been performed using real data from a world-wide academic portal.
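As a rough illustration of the forecasting idea, the following minimal sketch trains a plain naïve Bayes classifier to predict a discretized load level a few time steps ahead from the most recent levels. It is not the method of the paper: it omits the EDA-based page segmentation and the semi-naïve structure learning, and the thresholds, window length, horizon, and all function and class names are hypothetical choices made only for this example.

```python
from collections import Counter, defaultdict
import math


def discretize(load, thresholds=(100, 300)):
    """Map a request count to a level: 0 = low, 1 = medium, 2 = high.

    The thresholds are illustrative assumptions, not values from the paper.
    """
    return sum(load >= t for t in thresholds)


def make_examples(levels, window=3, horizon=2):
    """Pair each window of past levels with the level `horizon` steps later."""
    examples = []
    for i in range(len(levels) - window - horizon + 1):
        features = tuple(levels[i:i + window])
        target = levels[i + window + horizon - 1]
        examples.append((features, target))
    return examples


class NaiveBayes:
    """Categorical naive Bayes with Laplace smoothing (assumed, simplified model)."""

    def __init__(self, n_values=3, alpha=1.0):
        self.n_values = n_values            # number of distinct load levels
        self.alpha = alpha                  # smoothing constant
        self.class_counts = Counter()
        # (feature position, class) -> counts of observed feature values
        self.feature_counts = defaultdict(Counter)

    def fit(self, examples):
        for features, target in examples:
            self.class_counts[target] += 1
            for pos, value in enumerate(features):
                self.feature_counts[(pos, target)][value] += 1
        return self

    def predict(self, features):
        """Return the class with the highest log-posterior score."""
        total = sum(self.class_counts.values())
        best_class, best_score = None, -math.inf
        for cls, count in self.class_counts.items():
            score = math.log(count / total)
            for pos, value in enumerate(features):
                num = self.feature_counts[(pos, cls)][value] + self.alpha
                den = count + self.alpha * self.n_values
                score += math.log(num / den)
            if score > best_score:
                best_class, best_score = cls, score
        return best_class
```

In this sketch, a series of per-interval request counts would be passed through `discretize`, turned into training pairs with `make_examples`, and fed to `NaiveBayes.fit`; `predict` then returns the estimated load level `horizon` intervals after the end of the given window.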
Author information
Authors and Affiliations
DATSI, Universidad Politécnica de Madrid, Madrid, Spain
José M. Peña, Víctor Robles & María S. Pérez
DLSIS, Universidad Politécnica de Madrid, Madrid, Spain
Óscar Marbán
Editor information
Editors and Affiliations
Department of Computer Science, CICESE Research Center, Ensenada, México
Jesús Favela
Facultad de Informática, Universidad Politécnica de Madrid, Campus de Montegancedo s/n, 28660, Boadilla del Monte (Madrid), Spain
Ernestina Menasalvas
Escuela de Ciencias Físico-Matemáticas, Universidad Michoacana de San Nicolás de Hidalgo, Av. Francisco J. Mujica, Morelia - Michoacán, México
Edgar Chávez
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Peña, J.M., Robles, V., Marbán, Ó., Pérez, M.S. (2004). Bayesian Methods to Estimate Future Load in Web Farms. In: Favela, J., Menasalvas, E., Chávez, E. (eds) Advances in Web Intelligence. AWIC 2004. Lecture Notes in Computer Science, vol 3034. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24681-7_24
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-22009-1
Online ISBN: 978-3-540-24681-7
eBook Packages: Springer Book Archive