Like big brother Apache, default Tomcat logging leaves a little something to be desired, especially in regard to forensics. And you know what they say: When Tomcat forensic logging is away, the hackers will play! Well fine, maybe nobody ever said that, but you get the point. In any case, let's play cat and mouse with those wily hackers and bolster default Tomcat logging! For this blog post we'll be working with Tomcat 7.0.56 running on Debian Linux:
root@debian $ /usr/share/tomcat7/bin/version.sh | grep "Server version"
Server version: Apache Tomcat/7.0.56 (Debian)
Tomcat offers rich logging functionality. For example, Tomcat web applications can utilize the system logging API java.util.logging, the servlet logging method javax.servlet.ServletContext.log(), or a custom logging solution. In addition, Tomcat writes console messages to the /var/log/tomcat7/catalina.out file. However, what we are interested in is the Tomcat access logs. The access logfile format is defined by a Valve of class org.apache.catalina.valves.AccessLogValve within the /etc/tomcat7/server.xml configuration file:
<Valve className="org.apache.catalina.valves.AccessLogValve" directory="logs" prefix="localhost_access_log." suffix=".txt" pattern="%h %l %u %t &quot;%r&quot; %s %b" />
Tomcat access logs are stored within the /var/log/tomcat7 directory and are named "localhost_access_log.YYYY-MM-DD.txt" where "YYYY-MM-DD" is the logfile date. For example, the Halloween access log would be named "localhost_access_log.2014-10-31.txt". Logfile entries are stored in the Common Log Format as specified by the pattern attribute of the Valve component. Consequently, Tomcat log entries will look like this little guy:
10.1.1.1 - - [31/Oct/2014:09:02:00 -0500] "GET /example.html?foo=bar HTTP/1.1" 200 999
That looks like a whole lot of crazy talk, so let's break down the Common Log Format piece by piece:
The %h pattern code logs the remote hostname, or the remote IP address if enableLookups is disabled for the connector. In the example log entry this value is "10.1.1.1".
The %l pattern code logs the remote logical username from the rarely deployed identd daemon. Tomcat does not query identd, so this value is always "-", as it is in the example log entry.
The %u pattern code logs the remote username if the request was authenticated with HTTP Basic or Digest authentication. In the example log entry this value is "-", meaning that the request was not authenticated with HTTP Basic or Digest authentication.
The %t pattern code logs the date and time that the request was received in Common Log Format. In the example log entry this value is "[31/Oct/2014:09:02:00 -0500]".
The %r pattern code logs the first line of the request. In the example log entry this value is "GET /example.html?foo=bar HTTP/1.1".
The %s pattern code logs the status code of the response. In the example log entry this value is "200".
The %b pattern code logs the number of bytes sent to the client, excluding HTTP headers. In the example log entry this value is "999".
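Once you know what each field means, pulling entries apart programmatically is straightforward. The following is a minimal sketch, not part of Tomcat, of how a Common Log Format entry could be parsed during an investigation. The CommonLogEntry class and its field names are hypothetical, and the regular expression assumes the simple case where the request line itself contains no double quotes:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of Tomcat): parses one Common Log Format entry.
final class CommonLogEntry {
    // One capture group per pattern code: %h %l %u [%t] "%r" %s %b
    private static final Pattern CLF = Pattern.compile(
        "^(\\S+) (\\S+) (\\S+) \\[([^\\]]+)\\] \"([^\"]*)\" (\\d{3}) (\\d+|-)$");

    final String host;        // %h
    final String ident;       // %l
    final String user;        // %u
    final String timestamp;   // %t, without the surrounding brackets
    final String requestLine; // %r, without the surrounding quotes
    final int status;         // %s
    final long bytes;         // %b, where "-" is treated as zero bytes

    private CommonLogEntry(Matcher m) {
        host = m.group(1);
        ident = m.group(2);
        user = m.group(3);
        timestamp = m.group(4);
        requestLine = m.group(5);
        status = Integer.parseInt(m.group(6));
        bytes = "-".equals(m.group(7)) ? 0L : Long.parseLong(m.group(7));
    }

    static CommonLogEntry parse(String line) {
        Matcher m = CLF.matcher(line);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a Common Log Format entry: " + line);
        }
        return new CommonLogEntry(m);
    }

    public static void main(String[] args) {
        CommonLogEntry e = parse(
            "10.1.1.1 - - [31/Oct/2014:09:02:00 -0500] \"GET /example.html?foo=bar HTTP/1.1\" 200 999");
        System.out.println(e.host + " | " + e.requestLine + " | " + e.status + " | " + e.bytes);
    }
}
```

Entries that do not match are rejected rather than guessed at, which is usually what you want when the logfile may later be used as evidence.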
The default Common Log Format clearly provides some useful information, but surely we can flex our forensic muscle and let Tomcat logging out of the bag! Let's implement the enhanced log format by modifying the pattern attribute of the Valve component accordingly:
<Valve className="org.apache.catalina.valves.AccessLogValve" directory="logs" prefix="localhost_access_log." suffix=".txt" pattern="%{E M/d/y @ h:mm:ss.S a z}t %a (%{X-Forwarded-For}i) > %A:%p &quot;%r&quot; %{requestBodyLength}r %D %s %B %I &quot;%{Referer}i&quot; &quot;%{User-Agent}i&quot; %u %{username}s %{sessionTracker}s"/>
Consequently, Tomcat enhanced log entries will now look like this bad boy:
Fri 10/31/2014 @ 9:02:00.666 PM CDT 10.1.1.1 (-) > 192.168.1.1:443 "GET /example.html?foo=bar HTTP/1.1" - 2 200 999 http-bio-443-exec-1 "https://192.168.1.1/previous.html" "Mozilla/5.0 (X11; Linux x86_64; rv:33.0) Gecko/20100101 Firefox/33.0" - billy 0bcdd30af79b6aca8c8f7808c02ab530ac4c4a75
Holy guacamole! That's even more crazy talk than the Common Log Format, so let's break down the enhanced log format piece by piece:
The "%{E M/d/y @ h:mm:ss.S a z}t" pattern code logs the time in a more intuitive format by utilizing a custom SimpleDateFormat specification. The day of the week is now included, the time is now specified in 12-hour format with millisecond precision, and the time zone is now specified by abbreviation. Note that all production servers should be synchronized with a Network Time Protocol (NTP) server in order to ensure consistent time settings across the enterprise. In the example log entry this value is "Fri 10/31/2014 @ 9:02:00.666 PM CDT".
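If you want to try a date and time pattern before touching server.xml, you can run it through SimpleDateFormat directly. This is a small sketch under a few assumptions: the TimestampPatternDemo class is hypothetical, the locale is pinned to Locale.US, and the time zone is pinned to America/Chicago, whereas Tomcat will use whatever locale and time zone the server is configured with:

```java
import java.text.SimpleDateFormat;
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical demo class: previews the custom date and time pattern outside Tomcat.
final class TimestampPatternDemo {
    // The same SimpleDateFormat specification used inside %{...}t above.
    static final String PATTERN = "E M/d/y @ h:mm:ss.S a z";

    static String format(Date when, TimeZone zone) {
        // Locale.US is an assumption made so that day names and AM/PM are in English.
        SimpleDateFormat formatter = new SimpleDateFormat(PATTERN, Locale.US);
        formatter.setTimeZone(zone);
        return formatter.format(when);
    }

    public static void main(String[] args) {
        ZoneId chicago = ZoneId.of("America/Chicago");
        // 31 October 2014, 9:02:00.666 PM local time, matching the example entry.
        Date when = Date.from(
            ZonedDateTime.of(2014, 10, 31, 21, 2, 0, 666_000_000, chicago).toInstant());
        System.out.println(format(when, TimeZone.getTimeZone(chicago)));
    }
}
```

With those settings the output should match the timestamp in the example log entry. A server in a different locale or time zone will render the day name and zone abbreviation differently.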
The "%a (%{X-Forwarded-For}i) > %A:%p" pattern logs the source and destination of the request. The %a pattern code logs the IP address of the client. In the example log entry this value is "10.1.1.1". The "%{X-Forwarded-For}i" pattern code logs the underlying client IP address for requests from proxy servers. Note that the value of the "X-Forwarded-For" header could be spoofed by the client. In the example log entry this value is "-", meaning that the request was either not received from a proxy server or the proxy server did not include the "X-Forwarded-For" header. The %A pattern code logs the IP address of the server, and the %p pattern code logs the server port. The server port is useful to determine whether requests were transmitted over cleartext HTTP or encrypted SSL network connections. In the example log entry these values are "192.168.1.1" and "443", respectively.
The "&quot;%r&quot; %{requestBodyLength}r %D %s %B %I" pattern logs details about the request and response. The %r pattern code matches the first line of the request, namely the request method, URL path, query string, and protocol ("&quot;" simply specifies a literal double quote). Suspicious anomalies within the first line of the request could indicate automated scanning tools or targeted attacks:
Suspicious methods such as "PUT"
Suspicious URL paths such as "/admin.html"
Suspicious query strings such as "' or 1=1--"
Suspicious protocols such as "HTTP/1.0"
The query string is of particular interest, as it can contain a wealth of useful forensic information. Common attacks such as SQL injection (SQLi) and cross-site scripting (XSS) could be identified by telltale attack signatures within the query string such as "' or 1=1--" or "<script>", respectively. In the example log entry this value is "GET /example.html?foo=bar HTTP/1.1". The "%{ }r" pattern code can be utilized to log arbitrary ServletRequest attributes from the incoming request. The %{requestBodyLength}r pattern code logs the ServletRequest attribute named requestBodyLength. The application would explicitly set this attribute to contain the length of the request body for POST requests. For example, the application would include the following code within each doPost() method:
request.setAttribute("requestBodyLength", request.getContentLength());
This code would not be required within each doGet() method as GET requests do not normally contain a request body. Note that getContentLength() returns -1 when the length is not known, for example when the client uses chunked transfer encoding. An unusually large request body could indicate certain types of attacks such as buffer overflows. In the example log entry this value is "-", meaning that the attribute was never set because the client sent a GET request.
The %D pattern code logs the number of milliseconds taken to serve the request. Unusually long times could indicate certain types of attacks such as time-based SQL injection. In the example log entry this value is "2".
The %s pattern code logs the status code of the response. Uncommon status codes such as "405" (Method Not Allowed) could indicate automated scanning tools or targeted attacks. In the example log entry this value is "200".
The %B pattern code logs the total number of bytes sent to the client, excluding headers. An unusually high number of bytes could indicate certain types of attacks, such as SQL injection being used to extract large amounts of data. In the example log entry this value is "999".
Finally, the %I pattern code logs the Tomcat thread that processed the request. The thread name can be utilized to correlate the request with subsequent Tomcat stacktraces. In the example log entry this value is "http-bio-443-exec-1".
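The request-line anomalies described above lend themselves to a quick first-pass triage. Here is a rough sketch of that idea, with the caveat that the RequestLineTriage class, its flag names, and its tiny signature list are all hypothetical and lifted straight from the examples in this post. It also assumes that special characters in the query string arrive URL-encoded, so it decodes the query string before matching:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper: flags the example anomalies within a logged request line (%r).
final class RequestLineTriage {
    static List<String> flags(String requestLine) {
        List<String> flags = new ArrayList<>();
        String[] parts = requestLine.split(" ");
        if (parts.length != 3) {
            flags.add("malformed request line");
            return flags;
        }
        String method = parts[0];
        String target = parts[1];
        String protocol = parts[2];

        // Split the request target into URL path and query string.
        int q = target.indexOf('?');
        String path = q < 0 ? target : target.substring(0, q);
        String query = q < 0 ? "" : decode(target.substring(q + 1)).toLowerCase(Locale.ROOT);

        if (method.equals("PUT")) flags.add("suspicious method");
        if (path.equals("/admin.html")) flags.add("suspicious path");
        if (query.contains("' or 1=1--")) flags.add("possible SQL injection");
        if (query.contains("<script>")) flags.add("possible XSS");
        if (protocol.equals("HTTP/1.0")) flags.add("suspicious protocol");
        return flags;
    }

    // Decodes %xx escapes and '+' so that signatures match the attacker's intent.
    private static String decode(String value) {
        try {
            return URLDecoder.decode(value, "UTF-8");
        } catch (UnsupportedEncodingException | IllegalArgumentException e) {
            return value; // fall back to the raw text if it cannot be decoded
        }
    }

    public static void main(String[] args) {
        System.out.println(flags("GET /search.html?q=%27+or+1%3D1-- HTTP/1.1"));
        System.out.println(flags("PUT /admin.html HTTP/1.0"));
    }
}
```

A handful of substring checks will never catch a determined attacker, so treat this as a way to surface entries worth a closer look rather than as a detection engine.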
The "&quot;%{Referer}i&quot;" pattern logs the "Referer" header sent by the client. Note that the value of the "Referer" header could be spoofed by the client. In addition, note that the name of the "Referer" header is deliberately misspelled due to a mistake within RFC 1945. In the example log entry this value is "https://192.168.1.1/previous.html".
The "&quot;%{User-Agent}i&quot;" pattern logs the "User-Agent" header sent by the client. Note that the value of the "User-Agent" header could be spoofed by the client. In the example log entry this value is "Mozilla/5.0 (X11; Linux x86_64; rv:33.0) Gecko/20100101 Firefox/33.0".
The "%u %{username}s %{sessionTracker}s" pattern logs details regarding the application user. The %u pattern code logs the username if the request was authenticated with HTTP Basic or Digest authentication. The identity of the authenticated user can be extremely useful during forensic investigations. In the example log entry this value is "-", meaning that the user was not authenticated with HTTP Basic or Digest authentication. The "%{username}s" pattern code logs the "username" attribute of the associated HttpSession. The application would set the "username" attribute of the associated HttpSession to the name of the user upon successful form-based authentication. For example, the application would implement the following code upon successful form-based authentication:
session.setAttribute("username", authenticatedUsername);
This code assumes that the authenticatedUsername variable contains the name of the authenticated user. Note that because the "username" attribute is only stored on the server it cannot be spoofed by attackers. As previously mentioned, the identity of the authenticated user can be extremely useful during forensic investigations. In the example log entry this value is "billy". The "%{sessionTracker}s" pattern code logs the "sessionTracker" attribute of the associated HttpSession. The application would set the "sessionTracker" attribute of the associated HttpSession to a unique identifier in order to track requests throughout the duration of a session. For example, the application would implement the following code upon session initialization:
session.setAttribute("sessionTracker", DigestUtils.sha1Hex(session.getId()));
This code utilizes the Apache Commons Codec, specifically the sha1Hex() method of the org.apache.commons.codec.digest.DigestUtils class, in order to generate the SHA-1 hash of the "JSESSIONID" session identifier. Consequently, the commons-codec-1.9.jar file must be included within the Java classpath during compilation and copied to the "WEB-INF/lib" directory of the web application. SHA-1 is a one-way function, so the session identifier cannot feasibly be recovered from its hash. In addition, the odds of two session identifiers generating the same SHA-1 hash are negligible. It is important to note that because the "sessionTracker" is not the actual session identifier it cannot be utilized to resume a session. The actual session identifier could be tracked with the %S pattern code, but then attackers could leverage a compromised logfile in order to hijack authenticated sessions. Therefore, session identifiers and other sensitive security tokens should never be logged. In addition, because the "sessionTracker" attribute is only stored on the server it cannot be spoofed by attackers. Tracking requests throughout the duration of a session can be extremely useful during forensic investigations. In the example log entry this value is "0bcdd30af79b6aca8c8f7808c02ab530ac4c4a75".
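If you would rather not add a library just for one hash, the same kind of tracker can be sketched with the JDK alone. The SessionTrackerHash class below is hypothetical and is offered as an alternative to the DigestUtils call above, not as what that library does internally. It assumes the session identifier is hashed as UTF-8 text:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper: JDK-only way to derive a loggable tracker from a session identifier.
final class SessionTrackerHash {
    static String sha1Hex(String value) {
        try {
            MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
            byte[] digest = sha1.digest(value.getBytes(StandardCharsets.UTF_8));
            // Render each byte as two lowercase hexadecimal digits.
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every compliant Java platform is required to provide SHA-1.
            throw new IllegalStateException("SHA-1 is not available", e);
        }
    }

    public static void main(String[] args) {
        // "abc" is a standard SHA-1 test input, used here instead of a real session identifier.
        System.out.println(sha1Hex("abc"));
    }
}
```

In the web application you would pass session.getId() instead of the test string and store the result as the "sessionTracker" attribute, exactly as in the one-liner above.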
In addition to the pattern codes included within our enhanced log format, Tomcat provides several others that can be utilized to capture other pieces of relevant information:
The "%{ }t" pattern code can be utilized to log the date and time according to a custom SimpleDateFormat specification. For example, we utilized "%{E M/d/y @ h:mm:ss.S a z}t" in order to log the date and time in a more intuitive format.
The "%{ }i" pattern code can be utilized to log arbitrary request headers. For example, we utilized "%{X-Forwarded-For}i" in order to log the "X-Forwarded-For" header.
The "%{ }r" pattern code can be utilized to log arbitrary ServletRequest attributes. For example, we utilized the "%{requestBodyLength}r" pattern code in order to log the "requestBodyLength" ServletRequest attribute.
The "%{ }s" pattern code can be utilized to log arbitrary HttpSession attributes. For example, we utilized the "%{username}s" pattern code in order to log the "username" HttpSession attribute.
The "%{ }c" pattern code can be utilized to log arbitrary HTTP cookies. For example, we could utilize "%{JSESSIONID}c" in order to log the "JSESSIONID" HTTP cookie, although as discussed above, session identifiers should never actually be logged.
If you would like to log additional pieces of relevant information, you can refer to the complete list of supported AccessLogValve pattern codes. If you would like to modify the date and time format, you can refer to the complete list of supported SimpleDateFormat pattern codes. All that's left now is to restart the Tomcat daemon in order to load the configuration changes:
root@debian $ service tomcat7 restart
Tomcat will now begin logging each request in the enhanced log format, providing a wealth of additional information that will be extremely useful during forensic investigations. Look out unsuspecting hackers, your nine lives are now in danger! The claws are out and Tomcat forensic logging is on the prowl!