[Updated 2015-03-10] How to work with root shells in a PCI-DSS 10.2.2 compliant environment

Table of contents

  • Context and objectives
  • PCI-DSS 10.2.2
  • How it limits your productivity
  • Solutions
    • PAM
    • Bash PROMPT_COMMAND
      • Concepts
      • How it works
      • Examples of usage

Context and objectives

If you have ever worked as a system administrator in a Linux PCI-DSS environment, you know it is sometimes (often) difficult to administer your servers from the command line because of PCI-DSS Requirement 10.2.2. It effectively forbids you from opening a real root shell, by any means, because actions taken in a root shell are not logged anywhere and you must log every use of privileges. That’s why most of us end up using sudo, but sudo is limited and not always easy to work with, as I will explain later (PATH completion, wildcards, etc.).

The purpose of this post is to provide a way for administrators to open a root shell (by whatever means) and work in that environment while remaining compliant with 10.2.2, i.e. having every action taken in that shell logged and centralized.

PCI-DSS 10.2.2

PCI-DSS Requirement 10 : Track and monitor all access to network resources and cardholder data

Requirement 10.2.2: All actions taken by any individual with root or administrative privileges.

Testing procedure 10.2.2: Verify all actions taken by any individual with root or administrative privileges are logged.

Guidance: Accounts with increased privileges, such as the “administrator” or “root” account, have the potential to greatly impact the security or operational functionality of a system. Without a log of the activities performed, an organization is unable to trace any issues resulting from an administrative mistake or misuse of privilege back to the specific action and individual.

How it limits your productivity

Because of that, it is impossible to use wildcard facilities or auto-completion from the CLI. For example, if the /etc/bar/ folder is not publicly readable, fgrep foo /etc/bar/*.xml will not work as the wildcard will not expand. You’d have to use:

  • ls /etc/bar, then one grep foo per XML file,
  • or grep -R foo /etc/bar/ (which includes non-xml files and subdirectories),
  • or find /etc/bar -maxdepth 1 -name '*.xml' -exec grep foo {} + (which is far from trivial; note that -maxdepth must come before -name or GNU find complains)
  • or any other half-smart workaround which sucks anyway…
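To see why the wildcard example fails under sudo, remember that the glob is expanded by your unprivileged shell before sudo ever runs; if the pattern cannot match (because the directory is unreadable to you), it is passed through literally. A minimal sketch of that failure mode (the path is illustrative and intentionally does not exist):

```shell
# The calling shell tries to expand the glob first. If it cannot read the
# directory, the pattern stays literal and grep would receive '*.xml' verbatim:
pattern='/etc/bar/*.xml'           # stands in for a directory you cannot read
expanded=$(sh -c "echo $pattern")  # unmatched glob: echoed back unchanged
echo "$expanded"                   # -> /etc/bar/*.xml

# One workaround: quote the command so the glob only expands *after*
# sudo has elevated privileges:
#   sudo sh -c 'fgrep foo /etc/bar/*.xml'
```

This also shows why a real root shell is so much more comfortable: in it, the glob expands natively.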

It would be a lot easier if you could just gain root privileges using sudo -s, sudo -i, or, for old-schoolers who haven’t read the sudo man page, sudo su - # Eww!. Direct logins as root (ssh root@foo.bar) are still forbidden, and using su is strongly discouraged. Remember that we are trying to find a way to gain root privileges, not to work around PCI-DSS: the standard states that you must not log in as root and that no single person may know the entire root password, which makes su unusable without having someone else enter their half of the password. So sudo -s would be allowed, if only we could provide logs for every action taken in the privileged shell… How do we do that? Let’s look at possible solutions. We will also see how to generalize this concept to any shell so we can track any command, anywhere, anytime!

Solutions

PAM

At first, I looked at PAM and how to log every keystroke entered within a root shell.
To track such actions, we need to configure the following:

  • the PAM module: pam_tty_audit.so
  • the auditd service: /sbin/chkconfig auditd on && service auditd start
  • append to /etc/pam.d/system-auth: session required pam_tty_audit.so disable=* enable=root
  • append to /etc/pam.d/su and /etc/pam.d/su-l: session required pam_tty_audit.so enable=root
  • append to /etc/pam.d/sudo and /etc/pam.d/sudo-i: session required pam_tty_audit.so open_only enable=root

It works, I tried it. I could see all keystrokes in /var/log/audit/audit.log, or by using aureport --tty -if /var/log/audit/audit.log (because you cannot read them in plain text in the raw logs).

The downsides are:

  • It is not trivial to deploy
  • You cannot exploit the logs right away; you need a tool: aureport
  • There is a lot of noise, because you are logging keystrokes, not commands, so you see lots of “<^C><^C><Up><Del>”…
  • It logs every keystroke! Kerberos passwords, MySQL passwords, anything you type at a password prompt is logged, even with echoing off, because it is still a keystroke

Update 2015-03-10: pam_tty_audit has been patched to support a password mode and stop logging keystrokes while the terminal is in “password mode” (echo off). You can find the patch here https://www.redhat.com/archives/linux-audit/2013-May/msg00007.html and the Red Hat documentation updated accordingly here https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Security_Guide/sec-Configuring_PAM_for_Auditing.html

    It is great for security. To be honest, I believe everyone should enable this feature, not only for root but for every user, so that if one of your applications ever gets hacked (say, Apache through a vulnerable website), you can retrieve every keystroke and command the attacker entered while exploiting the unprivileged shell (e.g. uid=apache).

    So okay, it’s great news for security, and one can generalize this concept without risking storing plain-text passwords, but it’s not enough… As someone said, “we have to go deeper”. With pam_tty_audit alone, my coworkers and I could open root shells and start working, but reviewing who did what would be a pain because the produced logs are not easy to read (you need aureport --tty …). We are used to reading /var/log/secure whenever we want to know who restarted a daemon and such… In addition to the security layer added by pam_tty_audit, we need a solution that produces sudo-like logs for legitimate root shells; this is where bash’s PROMPT_COMMAND enters the game.

    Bash PROMPT_COMMAND

    The idea is to make clever use of the PROMPT_COMMAND bash variable. The man page says: “If set, the value is interpreted as a command to execute before the printing of each primary prompt ($PS1).”

    Concepts

    It means that you can execute a custom command every time a command is entered in a bash shell (each time Enter is pressed at a prompt). Using this feature and a script that uses logger, we can easily recreate sudo-like logs for any shell, including root’s. This is the first step.
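A minimal sketch of the mechanism, stripped down from the full script shown later (the tag and log format here are illustrative): bash runs $PROMPT_COMMAND just before printing each prompt, at which point the command that just finished is the last history entry, retrievable with the builtin fc.

```shell
# Build a sudo-like log line (illustrative format, close to the one used
# later in this post).
format_audit_line() {
    printf 'TTY=%s ; PWD=%s ; USER=%s ; COMMAND=%s' \
        "${SSH_TTY:-}" "$PWD" "${USER:-}" "$1"
}

# Runs right before each prompt: grab the last history entry with the
# builtin 'fc', strip leading whitespace, and hand it to syslog.
log_last_cmd() {
    local last
    last=$(fc -ln -0 2>/dev/null | sed 's/^[[:space:]]*//')
    logger -p authpriv.info -t shell-audit "$(format_audit_line "$last")"
}

PROMPT_COMMAND='log_last_cmd'
```

The full script below adds the two things this sketch lacks: deduplication of repeated commands, and tracking of the real source user across nested shells.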

    If you think about it a little more: since there are ways to keep track of the real source user (the initial user who opened a shell in a shell in a shell… etc.), you can even track down anyone changing identity (sudo -u foo -s). It also means that if someone goes through N levels of shells before reaching the root shell, as in sudo -u foo -s ; sudo sudo su -, you can still identify the real source user and keep track of their privileged actions accordingly.

    For logging purposes, I had to choose an app-name for both cases: root shells and switched-identity shells. For this example, I chose su-company and chusr-company; replace “company” with whatever you want. I also defined and hard-coded a log format, but you can easily change that by editing the logger lines. I did it this way so I could write OSSEC decoders and rules to match these logs.

    How it works

    You just have to set the variable system-wide and assign it a bash function that does the testing and logging.

    • /etc/profile.d/zz-log-root-cmd.sh
    ##################
    # Written by Florian Crouzat & Anwar El Fatayri
    # Contact: 
    # Feel free to do whatever you want with this file.
    # Just make sure to credit what deserve credits.
     
    # Warning: do not use a shebang if you are to place this script in /etc/profile.d/
    ##################
     
    # Get information about the parent of the current process via PPID.
    # The idea is to walk up the process tree until we find the first login shell
    function get_ppid_information() {
            # Get information about the PPID
            command=$(ps -o cmd= -p $ppid)
            user=$(ps -o user= -p $ppid)
            # Get the PPID of the PPID for the next iteration
            ppid=$(ps -o ppid= -p $ppid)
    }
     
    # Init: get the parent PID of the current shell using $PPID
    ppid=$PPID
     
    # First pass: get information about our PPID (command and user) and initialize the next PPID
    get_ppid_information
     
    # Then, travel the process tree until we get the first user that logged in
    while true ; do
             [[ $command != *bash* ]] && [[ $command != *su* ]] && break || get_ppid_information
    done
     
    function log_root_shell_cmd() {
     
            # Get the last command from history properly (delete white spaces) using bash's builtin fc
            shell_cmd=$(fc -ln | tail -n1 | sed 's/^\t //')
     
            # If the last command has been repeated, then skip logging
            if [[ $previous_shell_cmd != "$shell_cmd" ]] ; then
     
                    # If this is a root shell, we log.
                    if [ $UID -eq  0 ] ; then
                            logger -p authpriv.crit -t su-company "TTY=$SSH_TTY ; PWD=$PWD ; USER=$user ; COMMAND=$shell_cmd"
                    # If this is a user shell, but the source-user and current user differ, we log.
            elif [[ $UID != 0 ]] && [[ $user != "$USER" ]] ; then
                            logger -p authpriv.crit -t chusr-company "TTY=$SSH_TTY ; PWD=$PWD ; SRC_USER=$user ; USER=$LOGNAME ; COMMAND=$shell_cmd"
                    fi
     
                    # Save the last command
                    previous_shell_cmd=$shell_cmd
                    export previous_shell_cmd
            fi
    }
     
    # Append log_root_shell_cmd function to PROMPT_COMMAND
    PROMPT_COMMAND="${PROMPT_COMMAND:-:} ; log_root_shell_cmd"
    • Raw script available for download here.

    Maybe you are thinking that it is very easy to hide your tracks and cancel this behavior. Don’t over-think it: it is!
    You can use any other shell, for example sh or ksh, which don’t source the same files; you can unset the variable, edit the script, etc. But, as I said earlier in this post, the intent is not to protect yourself from a malicious root user, because you just can’t: as soon as someone is root, they can stop any tracking process you created, whatever its complexity. You’ll just log the “stop” command, and after that, you are in the dark. So don’t lose time trying to be smart against malicious root users, who can cancel whatever you do, and focus on the main idea: simplify the life of your goodwill sysadmins, who are not trying to hide things from you and just want to work in a better environment.
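To make the point concrete, here is how short the “cover your tracks” step really is against the script above; the unset itself is the last thing your hook will ever log:

```shell
# A malicious root user can switch the logging off trivially; you will only
# ever see the command that disabled it in /var/log/secure:
unset PROMPT_COMMAND

# ...or they can simply start a shell that never sources /etc/profile.d/:
#   exec sh

# After the unset, bash has nothing left to run before each prompt:
echo "${PROMPT_COMMAND:-<unset>}"   # prints: <unset>
```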

    Examples of usage
    • For root shells:
    # The actions
    (11:48) (florian@bar.wnd) (~) $ sudo -s # let's open a root shell
    (11:49) (root@bar.wnd) (/home/florian) # ls -al /etc/pki/tls/private/*.key # yay, completion !
    [...]
    (11:49) (root@bar.wnd) (/home/florian) # whoami
    root
    
    # And the logs ...
    (11:49) (florian@bar.wnd) (~) $ sudo tail -n20 /var/log/secure
    [...]
    May  7 05:48:53 bar sudo:  florian : TTY=pts/1 ; PWD=/home/florian ; USER=root ; COMMAND=/bin/bash
    May  7 11:49:02 bar su-company: TTY= ; PWD=/home/florian ; USER=florian ; COMMAND=ls -al /etc/pki/tls/private/*.key
    May  7 11:49:08 bar su-company: TTY= ; PWD=/home/florian ; USER=florian ; COMMAND=whoami
    • For switched-identity shells:
    # The actions
    (11:53) (florian@bar.wnd) (~) $ sudo -u jpc -i # let's take-over someone else identity
    [jpc@bar ~]$ id # and do harmless stuff in this shell. Yet, it's good to know.
    uid=501(jpc) gid=501(jpc) groups=501(jpc),504(sftp)
    
    # And the logs ...
    (11:53) (florian@bar.wnd) (~) $ sudo tail -n20 /var/log/secure
    [...]
    May  7 05:53:28 bar sudo:  florian : TTY=pts/1 ; PWD=/home/florian ; USER=jpc ; COMMAND=/bin/bash
    May  7 05:53:31 bar chusr-company: TTY= ; PWD=/home/jpc ; SRC_USER=florian ; USER=jpc ; COMMAND=id

    Feel free to ask any questions in the comments, and to provide patches by email; I’ll surely integrate them and mention your name.

    CentOS/RHEL 6.4, Squid 3.1.10, IPV6 and TCP_MISS/503 errors

    Since Squid 3.1.0, IPv6 support is “native” [1] and the following behavior occurs: “The most active [IPv6 operation] will be DNS, as IPv6 addresses are looked up for each website.”

    Unfortunately, until Squid 3.1.16, which introduced the configuration parameter dns_v4_first [2], you cannot change the order of Squid’s DNS queries, and IPv6 AAAA queries will always occur first. (Well, you could always recompile with --disable-ipv6, but I cannot afford to recompile anything in my environment.)

    This is where it hurts: some nameservers out there on the Internet are very badly configured and time out on any AAAA query instead of simply answering NXDOMAIN, or NOERROR with an empty AAAA answer.
    It means that when Squid queries one of these broken nameservers (directly or via a resolver) for an AAAA record, the lookup times out (by default Squid retries for 15 s, which is 3 * dns_retransmit_interval) and the request fails with a TCP_MISS/503 code.
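Until an upgrade past 3.1.16 is possible, the only knobs available in 3.1.10 are the DNS timers themselves. A squid.conf sketch (values are illustrative); note this does not prevent the TCP_MISS/503 against a broken nameserver, it only shortens the 15-second stall:

```
# squid.conf fragment -- dns_v4_first does not exist in 3.1.10, so the best
# we can do is fail faster on dead AAAA lookups (values are illustrative):
dns_retransmit_interval 2 seconds
dns_timeout 6 seconds
```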

    To summarize: EL 6.4 ships Squid 3.1.10, which is stuck between 3.1.0 (native IPv6) and 3.1.16 (which lets you try IPv4 A queries first), and won’t work with broken nameservers that don’t handle AAAA queries correctly…

    PS: I’m interested in any insight about this, and/or why Squid doesn’t fall back to IPv4 after the three retries.

    [1] – http://wiki.squid-cache.org/Features/IPv6#IPv6_in_Squid
    [2] – http://www.squid-cache.org/Versions/v3/3.1/cfgman/dns_v4_first.html

    RHEL 6.4: Pacemaker 1.1.8, adding CMAN Support (and getting rid of “the plugin”)

    Since RHEL 6.0, Pacemaker has shipped as a technology preview (TP).
    As explained in a blog post [1] from The Cluster Guy, you could choose between different setups for membership and quorum data; three options, actually.

    But since RHEL 6.4 and Pacemaker 1.1.8, in a move towards what Red Hat supports best (CMAN), you should consider dropping option 1 (aka “the plugin”) and moving to CMAN. It is not mandatory yet, but it will be soon enough, probably starting with 6.5. As a reminder, “the plugin” was configured like this:

    $ cat /etc/corosync/service.d/pcmk
    service {
            # Load the Pacemaker Cluster Resource Manager
            name: pacemaker
            ver:  1
    }
    

    Scared to move to CMAN? Probably a little, but remember that a tech preview cannot be considered stable and you cannot expect the product to stay consistent. Fortunately, we were warned about this in Red Hat’s release notes for 6.4, see [2]. What we were not warned about is the loss of crmsh (aka the crm shell), replaced by pcs. If you don’t want to migrate from crmsh to pcs, you’ll have to install crmsh from the opensuse.org repository [3].

    Anyway, let’s see how to migrate a production cluster without incident.
    This how-to is just a mix of personal experience (also, thanks to Akee), the “Quickstart Red Hat” guide [4] from clusterlabs.org, and the up-to-date “Clusters from Scratch” [5].

    • On a single node:
    # Do not manage anything anymore. This is persistent across reboot.
    crm configure property maintenance-mode=true
    
    • On all nodes
    # Shutdown the stack.
    service pacemaker stop && service corosync stop
    
    # Remove corosync from runlevels, CMAN will start corosync
    chkconfig corosync off
    
    # Install CMAN
    yum install cman ccs
    
    # Specify the cluster can start without quorum
    sed -i.sed "s/.*CMAN_QUORUM_TIMEOUT=.*/CMAN_QUORUM_TIMEOUT=0/g" /etc/sysconfig/cman
    
    # Get rid of the old "plugin"
    rm /etc/corosync/service.d/pcmk
    
    # Prepare your host file for rings definitions
    vim /etc/hosts
    > 192.168.1.1 node01.example.com
    > 192.168.100.1 node01_alt.example.com
    > 192.168.2.1 node02.example.com
    > 192.168.200.1 node02_alt.example.com
    

    Okay, so now, we have set-up the environment, we must define the ring(s) and the nodes. We also must configure CMAN to delegate fencing to pacemaker.

    • On a single node:
    # Define the cluster
    ccs -f /etc/cluster/cluster.conf --createcluster pacemaker1
    
    # Create redundant rings
    ccs -f /etc/cluster/cluster.conf --addnode node01.example.com
    ccs -f /etc/cluster/cluster.conf --addalt node01.example.com node01_alt.example.com
    ccs -f /etc/cluster/cluster.conf --addnode node02.example.com
    ccs -f /etc/cluster/cluster.conf --addalt node02.example.com node02_alt.example.com
    
    # Delegate fencing
    ccs -f /etc/cluster/cluster.conf --addmethod pcmk-redirect node01.example.com
    ccs -f /etc/cluster/cluster.conf --addmethod pcmk-redirect node02.example.com
    ccs -f /etc/cluster/cluster.conf --addfencedev pcmk agent=fence_pcmk
    ccs -f /etc/cluster/cluster.conf --addfenceinst pcmk node01.example.com pcmk-redirect port=node01.example.com
    ccs -f /etc/cluster/cluster.conf --addfenceinst pcmk node02.example.com pcmk-redirect port=node02.example.com
    
    # Encrypt rings and define port (important to stick with default port if SELinux=enforcing)
    ccs -f /etc/cluster/cluster.conf --setcman keyfile="/etc/corosync/authkey" transport="udpu" port="5405"
    # Finally, choose your favorite rrp_mode
    ccs -f /etc/cluster/cluster.conf --settotem rrp_mode="active" 
    

    Now we must validate CMAN’s configuration and propagate it to the other nodes. This is done only once in the entire life of the cluster, as the resource-level configuration is obviously still maintained across all nodes in Pacemaker’s CIB.

    ccs_config_validate -f /etc/cluster/cluster.conf
    scp /etc/cluster/cluster.conf node02.example.com:/etc/cluster/cluster.conf
    # If you were not using corosync's secauth, then
    scp /etc/corosync/authkey node02.example.com:/etc/corosync/authkey
    
    • On all nodes:
    # Add CMAN to the runlevels
    chkconfig cman on
    
    # Start CMAN
    service cman start
    
    # Check the rings
    corosync-objctl | fgrep members
    # Check secauth, rrp_mode, transport etc.
    corosync-objctl | egrep ^totem
    
    # Start pacemaker
    service pacemaker start
    
    • Finally, on a single node:
    # Validate everything
    crm_mon -Arf1
    
    # Exit maintenance-mode
    crm configure property maintenance-mode=false
    

    At this point, everything should be back to normal and you won’t have to worry about anything else than resource management anymore ;)

    [1] – http://blog.clusterlabs.org/blog/2012/pacemaker-and-cluster-filesystems/
    [2] – https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/6.4_Technical_Notes/pacemaker.html
    [3] – http://download.opensuse.org/repositories/network:/ha-clustering/CentOS_CentOS-6/network:ha-clustering.repo
    [4] – http://clusterlabs.org/quickstart-redhat.html
    [5] – http://clusterlabs.org/doc/en-US/Pacemaker/1.1-plugin/html/Clusters_from_Scratch/_adding_cman_support.html