Wednesday, October 25, 2006

PHP and Database Connection Pooling

During my recent vacation I worked on this blog entry, which talks about connection pooling in PHP and some interesting results IBM published in a recent Zend Developer Zone article. I just saw Christopher Jones’ blog entry on a new feature coming in Oracle 11g, so I’ve slightly adjusted it to include the news.

As is apparent from the likes of Yahoo!, Facebook and other large Web 2.0 companies who have billions of page views per month, PHP can indeed be deployed in a way which scales extremely well. Naturally with any technology that has to scale to huge volumes of traffic using large clusters of servers, there are always scalability issues that need resolving. Common issues that IT personnel deal with are monitoring and management of their clusters, application deployment, file system scalability, network topology, high availability and database scalability.

In my experience, databases have had a long history of being the typical bottleneck in PHP applications. There are many reasons for that including:
A) Dynamic Web applications tend to be heavily database driven and use the database as a searchable content store (CMS) and for transactional operations (e-commerce).
B) PHP developers prefer writing PHP code to writing SQL, which often leads to suboptimal queries and to most of the processing logic moving into PHP.
C) Web sites often serve millions of users, thus requiring PHP to deal with very heavy traffic. This in turn puts huge loads on the databases.
D) A lack of good, cheap tools for finding bottlenecks in database queries. Such tools have always been very expensive; it will be interesting to see how MySQL's new service, which also promises performance monitoring, will change this.
E) PHP works best in multi-process environments such as Apache. There are many benefits to such a model, especially stability, but it also typically requires each process to manage its own connection to the database.

What I'd like to talk about is the last point. Most production PHP deployments prefer to use persistent database connections in order to save the overhead of opening new connections on each request. This overhead doesn't only include the TCP/IP connection and the handshake with the database, but also server-side resource allocation such as processes, threads and the connection context which can take considerable time.

Using persistent database connections with the multi-process model creates a huge problem for large PHP clusters. Let's assume that the average PHP server is configured to spawn 200 Apache worker processes. For companies that have clusters with dozens or hundreds of PHP servers, this can end up being a huge strain on database servers, which have to deal with thousands of concurrent connections. For example, a 100-server cluster with 200 workers per server will typically require 20,000 concurrent connections. The performance hit on the database server can be devastating for reasons including large memory consumption and crippling context switch overhead due to large quantities of threads and/or processes.
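The arithmetic behind that figure can be sketched in a few lines of Python. The function name and its parameters are purely illustrative and not part of any PHP or Apache API; the sketch assumes each worker process holds one persistent connection, as described above.

```python
def peak_persistent_connections(servers: int, workers_per_server: int,
                                connections_per_worker: int = 1) -> int:
    """Upper bound on concurrent database connections when every worker
    process keeps its own persistent connection(s) open."""
    return servers * workers_per_server * connections_per_worker


# 100 web servers, each spawning 200 Apache worker processes
print(peak_persistent_connections(100, 200))  # 20000
```

The bound grows linearly with both the number of servers and the number of workers per server, which is why adding web servers to a cluster translates directly into extra load on the database.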

Such problems are not foreign to IBM's DB2 team. They have faced such multi-process architectures in the past and implemented a feature called connection concentrator. This feature efficiently handles large numbers of connections that have short-lived transactions, which are very typical of PHP applications. Just recently IBM did some PHP-specific benchmarks of this feature and wrote an article describing the results. These results really impressed me, as they managed to easily scale to 10,000 concurrent DB connections (roughly the size of a PHP cluster with 50 servers) on a standard two-socket, dual-core processor Linux machine with 4 GB of memory. And not only was it on what is considered a lightweight machine for a database, there was memory to spare.

Although the stability of the multi-process model has always been one of the main selling points for Apache, it has often been criticized for not allowing application-server-like sharing of database connections between the various worker processes. This not only affects PHP but also other languages such as Python and Perl. Application server advocates have tended to point this out time and again, when in fact it's a workaround for the database servers not scaling in this kind of scenario. Why should we be solving this problem by abandoning the rock-solid multi-process model rather than having the databases handle this common and increasingly widespread runtime architecture? The traditional J2EE application server model is dying in favor of a more loosely coupled, robust and stable set of technologies that all interoperate and scale well together. Whether it's LAMP, OPAL, WIMP or other combinations of technologies, the world is moving away from monolithic systems to heterogeneous loosely-coupled systems. In such environments it becomes increasingly important for each part of the system to scale in various configurations and system architectures.
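For readers unfamiliar with what application-server-style sharing looks like, here is a minimal Python sketch of a connection pool. It assumes the workers are threads inside one process, which is exactly what the multi-process model does not offer directly. `ConnectionPool`, `fake_connect` and `handle_request` are hypothetical names made up for this illustration, and `fake_connect` merely stands in for a real database connect call.

```python
import queue
import threading


class ConnectionPool:
    """Shares a fixed number of connections among many workers."""

    def __init__(self, connect, size):
        self._idle = queue.Queue()
        for _ in range(size):
            self._idle.put(connect())

    def run(self, work):
        conn = self._idle.get()        # blocks until a connection is free
        try:
            return work(conn)
        finally:
            self._idle.put(conn)       # hand it back for the next worker


opened = []


def fake_connect():
    # Hypothetical stand-in for a real database connect call.
    conn = object()
    opened.append(conn)
    return conn


pool = ConnectionPool(fake_connect, size=5)


def handle_request():
    pool.run(lambda conn: None)        # a real handler would run queries here


threads = [threading.Thread(target=handle_request) for _ in range(200)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(opened))  # 5 connections served all 200 workers
```

With one persistent connection per worker, the same 200 workers would hold 200 connections open. A database-side feature such as a connection concentrator aims for a comparable saving without requiring the web tier to give up separate processes.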

IBM has shown with this feature that they are successfully playing the loosely coupled game. Now Oracle are following and adding a feature which is also supposed to solve this problem and allow them to better scale in a heavy duty PHP environment. It will be interesting to see how their performance numbers stack up against IBM’s when the feature is released.


  1. If PHP could run safely in a multithreaded environment, PHP itself could manage a pool of connections and use them on demand, instead of relying on middleware or any specific database API to do it.

    This is one more reason why it is necessary to audit and fix the PHP extensions that currently are not yet thread safe.

  2. There is a major solution missed: simply have lightweight connection establishment costs on the database.

    Say, connecting to a properly tuned MySQL database in a heavy workload environment still takes just one or two milliseconds, and that holds even with multiple thousands of requests per second.

    At such costs pooling doesn't make too much sense, so it is more of a problem for systems where it takes quite some time to connect.

  3. It appears that PHP is already able to manage persistent connections with the mysql_pconnect() function without the need for strong multithreading. So what's the catch? Therefore, good programmers optimize their queries and use caching properly. Also, Oracle connections are much slower than MySQL connections, and that is why Oracle needs pooling.

  4. Hi Andy,

    a while back when I started with PHP and MySQL I learned that persistent connections will never work in an environment where the web server is forked.

    The reason is that the Apache process spawns, the PHP script is executed, opens a persistent connection and then "dies". The next process is a totally new process and therefore unable to use database connections opened by the previous one.

    So that is what I was told on various forums, and so on - there are even comments on which hint the same.

    Anyway. Assuming this is correct (please correct me if I am wrong), when you use Apache 1.3, persistent connections do not work.

    Are you saying that with a threaded Apache2 this is different? Do you know if you have to check the model you use - e.g. prefork, are there limits to when it works and when not?

    What about lighttpd? Probably not at all, right?

    I'll keep this blog entry on my watch list. Would be great if you reply.


  5. Tom,

    Apache processes do not die after they execute the request. There's a pool of processes handling requests over and over, hence the persistent connections still work. The problem is that as you have multiple Apache children, you have n*(host,user,database) connections, which in environments that have multiple users may be really unacceptable.

    Lighttpd does not use PHP internally, just over FastCGI interface, which has persistent PHP processes, where again, persistent connections work.

    I'd still advocate tuning your network/database environment, rather than using persistent connections.


  6. Yeah, all things aside, we tuned that part and actually gained far more with non-persistent connections. :-)

    When I said "died", I really meant the PHP script itself, since from what I understand this is what "shared nothing" is all about.
    So I assumed (wrongly ;-)) that handles are lost after a script is executed and this process "ends".

    Anyway, what is the formula for a good max_connections setting when I have a forking Apache (1.3) and a MySQL server and want to use persistent connections? Is it the number of Apache children?

  7. Just wanted to give an honorable mention to pgpool, an open-source "connection concentrator" for the PostgreSQL database. I have used it in several big projects and it worked like a charm. I especially like that it works transparently, so neither the PHP application nor the PostgreSQL database itself needs any modifications.
