Peruser MPM for Apache2

apache2-mpm-peruser [mirror / fork]

Peruser is an Apache 2.x module based on metuxmpm, a working implementation of the perchild MPM. The fundamental concept behind all of them is to run each apache child process as its own user and group, each handling its own set of virtual hosts. Peruser and recent metuxmpm releases can also chroot() apache processes. The result is a sane and secure web server environment for your users, without kludges like PHP’s safe_mode.

Metuxmpm creates one child process per unique user and group, which then spawns threads to handle requests. This requires you to use multithreaded versions of PHP, as well as Perl and Python if you want to use mod_perl and mod_python. Between the three of them, and all the third-party modules and libraries they link to, there can be a lot of non-threadsafe code involved. That can cause nasty crashes that are very hard to reproduce and diagnose.

Currently, a non-threaded Apache, along with non-threaded PHP, Perl, and Python, is the most stable solution for hosting services. Unfortunately, simply removing thread support from metuxmpm leaves you with a single Apache child handling requests for one or more virtual hosts. Peruser provides multiple processes for each unique user/group/chroot. It is working well so far, but there is still a lot of room for improvement. Leave a comment if you have questions, suggestions, or patches 🙂

Download

Peruser is distributed as a patch to the Apache source: httpd-2.2.16-peruser-0.4.0rc2.patch

I’ve precompiled packages (i386) for Debian Squeeze and Debian Wheezy.

You may browse the complete download directory containing the mirror backups of the original sources and my Debian Apache2 and PHP5 patches for Squeeze and Wheezy under https://4ufiles.flo.sh/webhosting/apache2-mpm-peruser/.

Configuration

So now that you have the peruser MPM installed, you want to configure it. For this, it helps to know how peruser works internally, so you can make the most of it.

Basic configuration

To get a minimal setup working, you need to specify the server environments and attach them to the virtualhosts. Note that the server environments must be specified before the virtualhosts. A single server environment can be applied to multiple virtualhosts.

To create a simple server environment, write this to your apache configuration file:

<Processor myserver>
    User john
    Group john
</Processor>

This creates a server environment named myserver running under the john user and group. To apply this server environment to a virtualhost, add the ServerEnvironment directive to your virtualhost:

<Virtualhost *:80>
    ServerName john.example.com
    ServerEnvironment myserver
</Virtualhost>

So here you go, all requests for john.example.com will be using the myserver server environment and will be run under the john user and group.

Now just remember that every virtualhost needs to have a server environment specified. A virtualhost without a server environment will have all of its requests dropped with error 503 (Service Unavailable).
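
As a rough sketch of the rules above, the following fragment defines two server environments first and then attaches them to three virtualhosts, two of which share one environment. The second user (jane), the extra host names and the DocumentRoot paths are made up for illustration, and the fragment assumes that NameVirtualHost *:80 is already set elsewhere in the configuration:

```apache
# Server environments must be defined before the virtualhosts that use them.
<Processor john_env>
    User john
    Group john
</Processor>

<Processor jane_env>
    User jane
    Group jane
</Processor>

# Two virtualhosts sharing the same server environment.
<VirtualHost *:80>
    ServerName john.example.com
    DocumentRoot /home/john/www
    ServerEnvironment john_env
</VirtualHost>

<VirtualHost *:80>
    ServerName blog.john.example.com
    DocumentRoot /home/john/blog
    ServerEnvironment john_env
</VirtualHost>

# Every virtualhost needs a ServerEnvironment, including this one.
<VirtualHost *:80>
    ServerName jane.example.com
    DocumentRoot /home/jane/www
    ServerEnvironment jane_env
</VirtualHost>
```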

The limits

Now that you have your server environments set up, you should specify the process limits. Here are your default limits:

ServerLimit 256
MaxClients 256
MinProcessors 0
MaxProcessors 10
MinSpareProcessors 2
MaxSpareProcessors 0
MinMultiplexers 3
MaxMultiplexers 10
MultiplexerIdleTimeout 0
MaxRequestsPerChild 1000
ExpireTimeout 1800
IdleTimeout 900
ProcessorWaitTimeout 5 10

ServerLimit and MaxClients determine the total limit of all children. You should set them to the same value.
MinProcessors and MaxProcessors set the minimum and maximum number of workers for a single server environment.
MinSpareProcessors and MaxSpareProcessors set the limits of idle workers in a single server environment.
MinMultiplexers and MaxMultiplexers set the minimum and maximum multiplexer count.
MultiplexerIdleTimeout sets the idle timeout (the time a multiplexer can be idle before it is stopped). 0 = disabled.
ExpireTimeout sets the maximum time a child can spend handling a single request (set this high if you want to allow long downloads). 0 = disabled.
IdleTimeout sets the maximum time a child can be idle. 0 = disabled.
ProcessorWaitTimeout sets the amount of time the multiplexer waits for the processor if it is busy. The first argument sets the time; the second argument sets the number of levels between not waiting and waiting the maximum time. See Waiting for the processor.
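
For illustration only, a server that has to allow long downloads might override a few of the defaults like this. The numbers are assumptions chosen for the example, not recommendations, and the comments assume that the timeouts are given in seconds, as the ProcessorWaitTimeout description suggests:

```apache
# Keep the two global child limits identical.
ServerLimit 512
MaxClients 512

# Let a single request run for up to two hours (long downloads).
ExpireTimeout 7200

# Stop children that have been idle for 15 minutes.
IdleTimeout 900

# Wait at most 5 seconds for a busy processor, in 10 levels.
ProcessorWaitTimeout 5 10
```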

Peruser also allows setting the following directives inside a <Processor> tag, to make them server-environment-specific:

MinProcessors
MaxProcessors
MinSpareProcessors
MaxSpareProcessors
Chroot
Cgroup
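
A sketch of per-environment overrides, assuming that Chroot takes a directory path as its argument. The environment name, the path and the numbers are hypothetical, and Cgroup is left out because its argument format is not described here:

```apache
<Processor busy_site>
    User john
    Group john
    # Confine this environment's processes to john's directory tree.
    Chroot /home/john
    # Allow more workers than the global default, for this environment only.
    MinProcessors 2
    MaxProcessors 30
    MinSpareProcessors 2
    MaxSpareProcessors 5
</Processor>
```

Directives that are not set inside the <Processor> tag fall back to the global values.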

Monitoring

Now that you have your server set up and running, you want to know how peruser is coping and whether the specified limits are high enough for the load the server has to handle.

To do this, peruser has a built-in server-status hook, which displays the list of all children and statistics about them. To see this, you need to enable the server-status page and ExtendedStatus:

<Location /server-status>
    SetHandler server-status
</Location>

ExtendedStatus On

After enabling this, restart the server and go to the server-status page. Under the request list, you should see the peruser status.
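
Since the status page exposes process details, you may want to limit who can see it. A minimal sketch using the standard Apache 2.2 access directives, assuming you only need access from the local machine:

```apache
ExtendedStatus On

<Location /server-status>
    SetHandler server-status
    # Apache 2.2 style access control: only allow requests from localhost.
    Order deny,allow
    Deny from all
    Allow from 127.0.0.1
</Location>
```

Keep in mind that the virtualhost serving this URL needs a ServerEnvironment like any other.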

Peruser status

An example of the peruser status is below:

ID  PID   STATUS   SB STATUS   Type         Processor    Pss     AVAIL
0   0     STANDBY  DEAD        PROCESSOR    senv1        0/0/30  100%
1   0     STANDBY  DEAD        PROCESSOR    senv2        0/0/30  100%
2   0     STANDBY  DEAD        PROCESSOR    senv3        0/0/30  100%
3   2653  ACTIVE   BUSY_WRITE  PROCESSOR    senv4        5/3/30  100%
4   2654  READY    READY       MULTIPLEXER  Multiplexer  2/2/10  100%
5   2655  READY    READY       MULTIPLEXER  Multiplexer  2/2/10  100%
6   2656  READY    READY       WORKER       senv4        5/3/30  100%
7   2657  READY    READY       WORKER       senv4        5/3/30  100%
8   2658  READY    READY       WORKER       senv4        5/3/30  100%
9   2659  ACTIVE   BUSY_WRITE  WORKER       senv4        5/3/30  100%
10  0     STANDBY  DEAD        UNKNOWN      (null)       0/0/0   0%
11  0     STANDBY  DEAD        UNKNOWN      (null)       0/0/0   0%
12  0     STANDBY  DEAD        UNKNOWN      (null)       0/0/0   0%
13  0     STANDBY  DEAD        UNKNOWN      (null)       0/0/0   0%
14  0     STANDBY  DEAD        UNKNOWN      (null)       0/0/0   0%
The ID and PID fields should be self-explanatory.
STATUS displays the peruser status of the child. It can be one of:
- STANDBY (child is dying or is dead)
- STARTING (child is starting up)
- READY (child is idle)
- ACTIVE (child is handling a request)
The SB STATUS is the child’s scoreboard status (just for reference).
Type displays the child type (MULTIPLEXER, PROCESSOR, WORKER or UNKNOWN). See Process types for a detailed explanation.
Processor displays the server environment.
Pss displays the server environment’s children (alive/idle/max).
AVAIL displays the server environment’s availability. This should be 100% at all times. If this gets below 100%, then there aren’t enough free children for the server environment and the multiplexers have dropped requests to save the rest of the server from collapsing.

You can also see some odd entries in this list with STANDBY status and UNKNOWN child type. These are slots that were previously in use but are currently free, so it’s completely normal to have them.

Peruser statistics

As of Peruser 0.4.0rc2, you can see some statistics about each server environment. To see them, append the ?peruser_stats query string to the server-status URL, for example /server-status?peruser_stats.

This displays the list of server environments set in the server, ordered by activity (the server environment with the most children alive comes first). In the list you can see the count of handled connections, handled requests and the number of requests that have been dropped because the child limit was hit.

Under the Hood

Here’s a description of how peruser works internally and how it handles requests.

The request cycle

Here’s an example of how a simple request makes its way to processing:

Client connects to HTTP port, sends the request
Multiplexer receives the connection, reads the request and checks which virtualhost it is for
Multiplexer checks if a worker is available for the request, then forwards it.
- If no worker is available, it waits for some time (see Waiting for the processor)
Multiplexer returns to waiting for new connections

Worker receives the connection and handles the request
If KeepAlive is enabled, the worker listens on the connection for more requests:
- If the client happens to send a request for another virtualhost (which the worker cannot handle) within the same connection, it sends the connection back to the multiplexer
Worker returns to waiting for new connections

Process types

Main process

The main process never handles any connections from the outside, but only handles the maintenance of the children and spawns new children if required.

This process runs under root privileges as these are required to switch users after forking.

Multiplexer

Multiplexers listen on the public port (:80) and read each request to determine which virtualhost it is for. After determining the virtualhost, the multiplexer forwards the request to its server environment (worker pool). Multiplexers run under the user/group defined by the “User” and “Group” directives.

It does this by firstly determining the virtualhost to which the request is made and then passing the request to the worker pool assigned to that virtualhost.

If the request is made to an SSL-enabled ip/port then the virtualhost determination is skipped and the socket is directly passed to the first virtualhost’s worker pool defined in that ip/port. This also means that the multiplexer does no SSL handshaking – this is all done by the worker.
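
Because the multiplexer hands an SSL connection to the first virtualhost defined for that ip/port, one reading of this is that each SSL-enabled server environment should get its own ip/port. A sketch of such a setup, where the IP address, host name and certificate paths are placeholders and myserver is the environment from the basic configuration:

```apache
Listen 192.0.2.10:443

# The only virtualhost on this ip/port, so all SSL connections arriving
# here are passed to the worker pool of the myserver environment.
<VirtualHost 192.0.2.10:443>
    ServerName secure.john.example.com
    ServerEnvironment myserver
    # The worker, not the multiplexer, performs the SSL handshake.
    SSLEngine on
    SSLCertificateFile /etc/ssl/certs/john.example.com.pem
    SSLCertificateKeyFile /etc/ssl/private/john.example.com.key
</VirtualHost>
```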

Processor / worker

The worker is the process where all requests are finally executed. Workers run under the user/group defined in the <Processor> tag.

It receives connections from the multiplexer and processes them.

Internally, each <Processor> tag creates a PROCESSOR type worker in the child table that never gets cleaned up. This means that even if the server limit has been reached, a single worker can still handle requests for its virtualhosts.

Waiting for the processor

In Peruser 0.3.0 and earlier, the multiplexer would send the connection to the worker pool without checking whether any free workers were available to receive it. This left the multiplexer blocked until a worker came around and received the connection. If there is a major problem with the virtualhost (e.g. the MySQL server is not responding), no worker may become available until the parent kills one of them because its ExpireTimeout has been reached. In the meantime, every new request made to that virtualhost blocks another multiplexer, which gradually brings the server to a halt.

To fix this problem, the multiplexers now check whether there are any free workers in the pool before passing the socket, and try to wait for one if there aren’t any (or drop the request if none becomes available).

If the workers are busy when the multiplexer starts to pass the connection, it will try to wait for them to finish. The maximum time to wait, in seconds, is calculated by the formula (availability / 100) * ProcessorWaitTimeout, where availability is 100 by default and ProcessorWaitTimeout is the first argument of the directive in the configuration (default is 5).

If the workers are still busy after waiting the maximum time, the multiplexer reduces the availability of the worker pool by 10 and drops the request with error 503 (Service Unavailable). However, if any of the workers becomes available, the availability is reset to 100 and the request is passed.
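
As a worked example with the default ProcessorWaitTimeout 5 10, and assuming the availability simply drops by 10 for each dropped request as described above, the maximum wait shrinks as follows:

```latex
\begin{align*}
t_{\text{wait}} &= \frac{\text{availability}}{100} \times \text{ProcessorWaitTimeout} \\[4pt]
\text{availability} = 100: \quad t_{\text{wait}} &= \frac{100}{100} \times 5 = 5\ \text{s} \\
\text{availability} = 90:  \quad t_{\text{wait}} &= \frac{90}{100} \times 5 = 4.5\ \text{s} \\
\text{availability} = 50:  \quad t_{\text{wait}} &= \frac{50}{100} \times 5 = 2.5\ \text{s} \\
\text{availability} = 0:   \quad t_{\text{wait}} &= \frac{0}{100} \times 5 = 0\ \text{s}
\end{align*}
```

So a worker pool that keeps timing out is given less and less waiting time, until its requests are dropped without waiting at all. As soon as a worker becomes available, the availability goes back to 100 and the full 5 seconds apply again.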

—

DISCLAIMER: The software and patches are provided “as is” and you use the stuff at your own risk. The information above and the original source code is a copy of the deprecated peruser.org (trac) website. Apache2-mpm-peruser was written by Stefan Klingner (Hey! Many thanks for this great code 🙂 ) – I’ve “only” patched the stuff for Debian Squeeze and Wheezy.

8 thoughts on “Peruser MPM for Apache2”

Go here 2014-03-18 at 11:16 pm

Nice post, thank you!

flo (post author) 2014-09-26 at 4:17 am

UPDATE!

Latest Debian wheezy deb files are now available for the i386 and amd64 architecture.

flo (post author) 2014-10-08 at 4:17 am

Latest Debian squeeze (lts) deb files are now available for the i386 architecture.

flo (post author) 2014-10-17 at 2:46 pm

Debian squeeze (lts) deb files are updated (i386):
– apache2 (mpm-peruser): 2.2.16-6+squeeze14
– php5: 5.3.3-7+squeeze22

Debian wheezy deb files are updated (i386 and amd64):
– apache2 (mpm-peruser): 2.2.22-13+deb7u3
– php5: 5.4.4-14+deb7u14

flo (post author) 2014-11-03 at 2:36 am

Hi Evgenij,

indeed I’m currently only using dedicated ip/port combinations for SSL-enabled virtualhosts. I guess this is the standard scenario having to use individual SSL certificates for website hosting.

Which scenario do you have requiring one SSL certificate being valid for multiple SSL enabled virtualhosts using different user IDs and permissions?

flo (post author) 2014-11-05 at 3:37 am

Hi Evgenij,

usually for technical reasons you can only bind one SSL certificate per ip/port.
If you have to use multiple SSL certificates per ip/port you may have a look at http://en.wikipedia.org/wiki/Server_Name_Indication …

But for security reasons I prefer the old style configuration using one ip/port per SSL hosted customer domain space.

-Flo

flo (post author) 2016-08-15 at 1:03 am

Debian wheezy deb files are updated (i386 and amd64):
– apache2 (mpm-peruser): 2.2.22-13+deb7u7
– php5: 5.4.45-0+deb7u4

flo (post author) 2017-03-08 at 3:07 am

Debian wheezy deb files are updated (i386 and amd64)
– php5: 5.6.30-1~dotdeb+7.1

