Saturday 5 November 2016

SELinux Sucks!

If you find yourself here then you probably have at least some vague idea about how security is enforced on Unix systems. Or maybe you just wanted to read my continuing woes with computer systems.

I did spend some time thinking about a suitable title for this post. There were so many to choose from:
  • SELinux considered harmful 
  • The emperor's new clothes
  • I want to believe
...but SELinux sucks sums it up nicely.

TL&DR

SELinux is ridiculously expensive and is unlikely to improve the Security of your system. It may make it worse.

Introduction

For those who know nothing about SELinux.....don't be hard on yourself. As a lot of this post discusses, there are no SELinux experts. But in case you really know nothing about SELinux then a bit of context may help.

Unix (and therefore Linux, BSD etc) has a very elegant permissions system. There are lots of descriptions of how it works on the internet. Its read/write/execute and owner/group/other primitives can be combined to implement complex authorization models, not to mention the setuid, setgid and sticky bits.

But it doesn't end there.

There's sudo, capabilitites, filesystem ACLs, chroot, filesystem/network namespaces and containers. Or privilege seperation using process isolation with shared memory, local/network sockets.

Apparently this still leaves some gaps. Step forward SELinux.

What is SELinux?

(In fairness, what I'm talking about here is not SELinux per se - but the combination of SELinux and the targeted policy. There are other policies which I don't have any experience of, but given the complexity of the targeted policy, these will likely be similar in practice. For that reason, given that those other policies are not as widely used, and for brevity I will simply refer to the implementation of the targeted policy and SELinux as SElinux).

SELinux (and its policy) is a set of rules which are compiled into a configuration loaded and implemented at runtime.

Operations on entities are mediated by this abstract set of rules based on the labels attached to those entities and the user trying to effect the change.

So apart from the compilation step, not that different from permissions?

Well, actually, yes – the configuration of SELinux is a mystery black box. Most experienced Linux/Unix users can tell by looking at permissions exposed in 'ls -l' and be able to make an accurate prediction about the outcome of an operation – and how to resolve the problem when this is not as required. The permissions are presented as 10 characters, sometimes more if we're talking about the directory the file is in or facls. While 'ls -Z' displays the SELinux labels on files, it doesn't say much about the permissions these enable. For that you need to look at the policy.

The targeted SELinux policy from Fedora is currently distributed as 1271 files containing 118815 lines of configuration. The rpm contains no documentation. On the other hand, the standard installation of Apache 2.4 on the machine I'm sitting in front of, has 143 configuration files (an unusually high number due to Ubuntu distributing stub files for every available module) and 2589 lines of configuration. So, SELinux has 10 times as many files and 45 times as much config as a very complex software package. Do I need to spell out the problem here?

Indeed, the recommended practice is not to change these files, but rather add more configuration to change the behaviour of SELinux.

One consequence of this Gordian knot is that upgrades to the configuration (which at least won't trash the extra config you have added) often need to change the labels on the filesystem; a simple upgrade can unexpectedly morph from a brief outage to hours or days of disk thrashing while every file on your disks is relabelled. And hopefully that didn't also break your security model. But...

It breaks existing security models

The targeted policy not only overrules the filesystem permissions, but also the access control mechanisms built into programs, for example 'at' is unable to read from /etc/at.allow running as a system_u user!

With the setuid bit set on an executable, you can run it as a different user, but retains the original SELinux context!

 

It is inconsistent by design

"By default, Linux users in the guest_t and xguest_t domains cannot execute applications in their home directories or the /tmp/ directory, preventing them from executing applications, which inherit users' permissions, in directories they have write access to. This helps prevent flawed or malicious applications from modifying users' files"
        - https://access.redhat.com/
       
In other words, Linux users can't run compiled C programs but can run (malicious) Java, shell script, python, PDF, Flash.... where the logic is bootstrapped by an existing executable but does not require the executable bit to be set.

What about networking?

Of course SELinux can solve every security problem; it has the capability to restrict network access. This is not available in AppArmor, and you can't apply restrictions on a per-user or per binary application using iptables.

OTOH, TCP wrappers , Network namespaces, iptables and network interfaces of type 'dummy' provide primitives which can be combined to implement complex security policies on multi (or single) tenant machines.

Debugging SELinux problems

Selinux has an option to only report, and not prevent actions. Great, that should simplify fixing things, right? However, it is my experience that it does not log all exceptions that it subsequently enforces.

Under very controlled conditions, I was investigating a problem with a system_u user running 'at'. Suspecting that SELinux was the culprit, I setenforcing 0, tried running application - it worked, no log entries found. Maybe SELinux was not the problem? So I setenforcing 1, ran app - got message "cannot set euid: Operation not permitted", no log entries found.

WTF?

Again, I set enforcing 0, ran the app. Again it worked. Again, no log entries. Just to be sure I run some stuff which I knew would violate the policy – and that produced log entries. With no idea how to begin to fix the problem, I setforcing 1 again, ran the app, this time it worked!

Yeah! problem solved.

Then, 10 minutes later "cannot set euid: Operation not permitted", but now I was getting log entries.

Automated Baselining

You don't start reading through the kernel source every time something misbehaves on Linux, so maybe you should treat the default policy in the same way, as a black box. It sounds like a reasonable strategy. Just run your system in a learning mode then apply those lessons to the policy. Indeed several commentators advocate just such an approach.  
(Trying to fix permissions in enforcing mode is a thankless task - each blocked operation is usually masking 3 or 4 further rules preventing your application from working).
So the first step to getting your application working is to switch off a critical security control? Really??!!!

Anyone who has worked on developing a complex system will tell you that getting full code coverage in test environments is a myth.

Yes, as Darwin proved, evolution works really well - but it takes a very long time.

And there are "times when it [audit2allow] doesn't get it right"

sealert is painfully slow;  in recent exercise I clocked it at around 20-25 log entries per second. Not great when you have a 100Mb log file to read. Amongst some of the oddities it identified:

SELinux is preventing /usr/libexec/mysqld from write access on the file /tmp/vmware-root/vmware145. 

- you might think this means that mysqld was attempting to write to /tmp/vmware-root/vmware145, and you'd be wrong. This is part of vmware's memory management. But vmware also uses this as a general dumping ground. The odd thing is the directory is:
       
        drwxrwxrwt.   6 root root        4096 Jul 11 15:20 tmp
        drwx------.   2 root root        4096 Jul 11 15:10 /tmp/vmware-root
       
SELinux is preventing /sbin/telinit from using the setuid capability.
SELinux is preventing /sbin/telinit from read access on the file /var/run/utmp.
SELinux is preventing /sbin/telinit from write access on the file wtmp.

Clearly Redhat are not reading their audit logs, or maybe they just disable SELinux?

SELinux encourages dumb workarounds

One the problems we ran into when SELinux was enabled on a server I recently migrated was that email stopped working. The guys with root access started working on this (I made sure they had a test script to replicate the problem) while I started looking at other ways of solving the problem - it was having a significant impact on the service. Guess who came up with a solution first?

In about 2 hours I had a working drop in replacement for '/usr/sbin/sendmail -t -i' which PHP uses for sending emails.

I'm not criticizing the Unix guys. The people working on this are very capable and do have expertise in SELinux. The problem is SELinux.

But go back and re-read my previous sentence; in 2 hours I had written a MTA from scratch and bypassed the SELinux policy written by the experts at RedHat. WTF????? If I am some sort of uber-cracker then I really am not getting paid enough.

(spookily enough one of the reasons the server could not send email is shown in the screen shot at https://fedorahosted.org/setroubleshoot/ ! This might be why Redhat 7 now has a selinux bool httpd_can_sendmail)

Now, which do you think is more secure, the original Postfix installation using a standardized config which has been extensively tested in house or the MTA I knocked up in between other jobs?

Maybe its just me?

I've spent a very long time searching the internet for stories about how people have used SELinux to prevent and investigate attacks. While there are a huge number of articles proclaiming its benefits, I struggled to find many which demonstrated any real effectiveness.

Excluding the cases where a patch had been available for at least a week before the alleged incident, I was left with:

Mambo exploint blocked by SELinux – http://www.linuxjournal.com/article/9176?page=0,0

HPLIP Security flaw – https://rhn.redhat.com/errata/RHSA-2007-0960.html

OpenPegasus vulnerability blocked by SELinux – http://james-morris.livejournal.com/25421.html
       
Just 3 cases. The first one is a very good article and I recommend reading it (although there are some gaps in the story).

Exaggerating for dramatic effect?

Am I? Yes, SELinux does have documentation. Indeed there's masses of it (after all, my point is that there *needs* to be!). The following extract comes from the httpd_t man page; this document is intended to equip administrators with enough information to manage the sandbox that the webserver runs within on a machine running RedHat's Type Enforcement policy:

If you want to allow HTTPD scripts and modules to connect to databases over the network, you must turn on the httpd_can_network_connect_db boolean.
setsebool -P httpd_can_network_connect_db 1

What this actually means is if you want your httpd to connect to entities labelled as type database_t across the network then you need to enable this. And, of course, create labels representing those resources as well. Was that obvious from the instructions? Of course, there are a whole lot of databases which are not lablled as databases in RedHat - ldap and memcache for instance.

It is interesting to note that despite specific booleans for Bugzilla, Cobbler and Zabbix (does that mean I should create a new type for every application I might run on the box?) there's no boolean controlling integration of an application server such as mod_php, fastCGI, mod_jk, mod_perl....

It also seems SELinux doesn't like other security tools muscling in on its action:
If you want to allow all daemons to use tcp wrappers, you must turn on the daemons_use_tcp_wrapper boolean. Disabled by default.
setsebool -P daemons_use_tcp_wrapper 1
 
I thought SELinux was supposed to be granular - but I can only use TCPWrappers with everything or nothing?

Should I care?

One of the problems developing secure web based applications is that everything in the site ends up running as the same user id. This doesn't mean you can't do privilege separation using sudo or daemons, but it does mean that you always have to implement security controls in your applications. A mandatory access system does not solve these problems, but it should simplify some some of them. Fortunately SELinux is not the only game in town; AppArmor, GRSecurity and Smack are all available, well tested and widely implemented on Linux systems.

Of course, if you are Google or Facebook, then you can afford to spend 100's of man years working out how to get SELinux working properly (and of course there are no security bugs in Android).


What is wrong with SELinux?

The people developing SELinux (or insisting on its use) have missed out on something I have drummed into every junior programmer I have trained:

We don't write code for computers to understand; we write it for humans to understand.
 
SELinux/Targeted policy is 

- really bad for productivity, 

- bad for availability, 

- bad for functionality

It is quicker to bypass SELinux/Targeted policy restrictions than change them to allow a denied action.

What is the solution?

The time you would spend aligning the off-the-shelf SELinux policies with your application will be better spent addressing the security of your system in conventional ways. Switch off SELinux and fix your security.

Still not convinced?

The post above was written in 2016. Here's a more recent update on my SELinux adventures.