Web Security

Overview

First widely reported exploit.
Technical details of a web exploit.
Importance of validating data.
Cross-site scripting.
Same-origin policy and CORS
Cross-site request forgery.
Authentication.

Morris Worm

1988, first hack which received attention in the popular press.
Replicating worm.
Overflowed a gets() buffer in the fingerd daemon.
Misused DEBUG command to sendmail daemon.
Ran dictionary attack against publicly readable /etc/passwd file.
Given a password, would use .forward and .rhosts file to break into other hosts where user had accounts.
Was supposedly meant for innocent research, but a bug caused indiscriminate propagation.

Morris worm wikipedia link

HB Gary Hack Background

WikiLeaks gained world-wide prominence in 2010 with releasing, among other dumps, US State Department emails in Nov 2010.
At end of 2010, bowing to political pressure, major payment processors stopped processing donations to WikiLeaks.
The hacktivist group Anonymous mounted distributed-denial-of-service attacks on web sites of payment processors.
At beginning of 2011, CEO Aaron Barr of security company HBGary Federal publicized the fact that he could reveal the identity of Anonymous.
Anonymous took down the HBGary Federal website, extracted 40K emails from the email server, and deleted 1TB of backup data.

Links: New York Times, Ars Technica Story, sequel and follow-up.

Details of Break-In

HBGary Federal website was running a proprietary content-management system which had a SQL injection flaw. By providing specially crafted inputs, attackers were able to run arbitrary queries against the database, and accessed the login and password table.
Passwords were hashed using a fast hash algorithm (MD5) without any salt, making them amenable to a rainbow-table attack (comparing the hashed password with a table of precomputed hashes for passwords).
CEO and COO had passwords which were cracked. Same passwords were used on other machines and email, twitter, etc.
Used COO password to access Linux support machine which was accessed using password-based ssh (rather than public/private key access).
Linux system had not been patched and contained a well-known security flaw which allowed access as root! Deleted backups and research data.
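
The unsalted fast hash described above is what made the rainbow-table attack practical. A minimal sketch of the usual alternative, assuming Node.js and its built-in crypto module: a fresh random salt per password plus a deliberately slow key-derivation function (scrypt here). The function names and the `salt:hash` storage format are illustrative assumptions, not anything HBGary used.

```javascript
const crypto = require('node:crypto');

// Hash a password with a fresh random salt using scrypt, a deliberately
// slow, memory-hard function (unlike a fast hash such as MD5).
function hashPassword(password) {
  const salt = crypto.randomBytes(16);
  const hash = crypto.scryptSync(password, salt, 32);
  // The salt is not secret; store it alongside the hash.
  return `${salt.toString('hex')}:${hash.toString('hex')}`;
}

// Recompute the hash with the stored salt and compare in constant time.
function verifyPassword(password, stored) {
  const [saltHex, hashHex] = stored.split(':');
  const expected = Buffer.from(hashHex, 'hex');
  const actual = crypto.scryptSync(password, Buffer.from(saltHex, 'hex'), expected.length);
  return crypto.timingSafeEqual(actual, expected);
}
```

Because each stored value has its own salt, two users with the same password get different hashes, so a single precomputed table no longer applies.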

Details of Break-In Continued

Barr's password allowed access to Google Apps as administrator. Allowed attackers to examine email archives.
Email contained root password for another machine, rootkit.com! But remote root access was not possible.
With access to the email account, Anonymous sent an email to the system administrator requesting that ssh access be allowed (gave possible root passwords in email!). Administrator set things up, changed password to changeme123 and verified user-name!!
Attackers dumped user database for all users who ever registered on rootkit.com. Contained crackable passwords.

Sequel

Extremely poor security practices for a commercial company and even worse for a security company.
Anonymous members arrested 2012 (ringleader was arrested in 2011 and was cooperating with the FBI).
HBGary received additional business!!
HBGary purchased by defense contractor ManTech.
CEO Barr stepped down from HBGary. Took position as cybersecurity director at another federal contractor. Fired 2012 for concentrating on social media and Anonymous.
In 2013, list of rootkit.com user-names/passwords allowed possible identification of Chinese hackers suspected in other hacking incidents.

The Security Problem

A user (person or program) with limited authorization interacts with a program which has or may (temporarily) get different authorization.
By choosing certain inputs or interacting with the program in a certain way, the user forces the program to take an unintended action using the program's authorization.
Ultimate exploit is to spawn a shell with program's authorization.

Types of Malicious Programs

Trojan Horse: A program which provides normal useful functionality while also performing hidden malicious activity. Eg. a login program which allows users to login normally, while storing plaintext passwords in a file.
Worm: A program that propagates itself over a network, reproducing itself as it goes.
Virus: A program that searches out other programs and infects them by embedding a copy of itself in them, so that they become Trojan horses. When these programs are executed, the embedded virus is executed too, thus propagating the infection. This normally happens invisibly to the user. Unlike a worm, a virus cannot infect other computers without assistance. It is propagated by vectors such as humans trading programs with their friends.

Principle of Least Privilege

Always use lowest possible privilege.
When running programs using a privileged program, use privileged user only for portions of execution where it is strictly necessary, reverting to non-privileged user wherever possible.
When done with privileged operations, revert to non-privileged user permanently.
Typically use a user with very limited privileges to run externally accessed programs like web servers. Typical user for Unix web servers is nobody or www-data with very limited privileges.

Program Bugs Which Affect Security

Buffer overflows. The program does not check whether a buffer overflows, hence specially crafted inputs can overflow the buffer and overwrite other memory, which can compromise security. Notorious in C/C++ programs. Can be avoided by using memory-safe programming languages like JavaScript, Java or Perl.
Untrusted data used in executing other programs. JavaScript eval() and Function() constructor vulnerable to this.
Environment variables.
Race conditions. Program checks something and then acts based on the result of the check. The check and act are not atomic, allowing a malicious user to change the result of the check before the act.
Randomness. Cryptographic schemes often depend on a source of randomness. If the source of randomness is compromised, then the security of the cryptographic scheme is also compromised.
Denial of service.

Validating Data

Heartbleed Bug

Bug in widely used OpenSSL cryptography library; 2012-2014.
Normal operation: To check if remote end is alive, a computer is supposed to send a heartbeat message consisting of a message payload along with the length of the message. Remote end is supposed to echo back message.
Attacker sent malformed heartbeat request containing a small message with large length. Validation did not verify that length matched message length. Remote end allocated buffer for incorrect length, filled only initial portion with message and returned full buffer; basically allowing attacker to access secrets which may have been previously stored in the memory used by the unused part of buffer.
Unintended memory access caused by insufficient data validation.
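
The missing check can be illustrated with a toy simulation. This is only a sketch: the `heap` buffer, its contents and both function names are made up for illustration, and real OpenSSL code is C, not JavaScript.

```javascript
// Simulated memory that still holds data from an earlier, unrelated request.
const heap = Buffer.from('user=alice&password=hunter2');

// Vulnerable handler: copies the payload into memory, then echoes back as
// many bytes as the sender claimed, without checking the claim.
function heartbeatVulnerable(payload, claimedLength) {
  payload.copy(heap, 0);
  return heap.subarray(0, claimedLength).toString('latin1');
}

// Checked handler: rejects any request whose claimed length does not match
// the actual payload length.
function heartbeatChecked(payload, claimedLength) {
  if (claimedLength !== payload.length) {
    throw new RangeError('claimed length does not match payload length');
  }
  return Buffer.from(payload).toString('latin1');
}
```

With a 2-byte payload and a claimed length of 27, the vulnerable handler returns the payload followed by leftover bytes of the earlier request, while the checked handler refuses the request.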

Not Validating Data

This page allows attacker to run arbitrary JavaScript on page.

Input not validated; hence attacker can input arbitrary HTML, including scripts.
Consider what would happen if this "name" was stored in a db and then output on the page of another user, say Mary with a request: like "john would like to chat with you".
Script would have access to Mary's session.

Web Programming Errors

Cross-site scripting (XSS).
Code injection (SQL, JavaScript, PHP, etc).
Force unintended execution of code.
Unintended memory access: buffer overflows or other methods.
Reveal site info: data, paths.

Cross-Site Scripting

A script from an attacker domain is injected into the scripts for the subject domain being attacked.
Injected script is executed as though it originated from the subject domain.

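Typical Scenario (strictly speaking a cross-site request forgery, since no script is injected into the bank's pages):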

Alice is logged into bank and visits Mallory's website.
Mallory's website loads an "<img>" from bank with src set to a URL which transfers funds from Alice's bank account to Mallory's bank account.
Browser submits authentication cookies, bank has no way of knowing that request was not made by Alice and transfers funds.

Non-Persistent XSS Attack

Does not involve persistent storage on the server; the malicious script arrives with the current request.

Alice often logs into Bob's web site.
Mallory finds an XSS vulnerability on Bob's site and sends Alice an email with a cute cat link to Bob's web site containing a malicious script.
Alice clicks on the link and is redirected to Bob's web site and unintentionally executes the malicious script using her login credentials for Bob's web site.
The malicious script could take an action like capturing Alice's authentication cookie (if any).

Persistent XSS Attack

Involves persistent storage on server.

Mallory creates an account on Bob's server.
Mallory realizes that Bob's web site allows comments to contain arbitrary HTML including <script> tags. She adds a comment containing a malicious script.
When Alice views the comment she unintentionally runs the (normally invisible) malicious script.

Mitigating XSS

Always validate all input. If possible, restrict to safe characters like alphanumeric and whitespace.
When rendering variables which may have potentially unsafe input, always escape based on context. From OWASP:
- In HTML contexts, escape &, <, >, " and ' as &amp;, &lt;, &gt;, &quot; and &#39; respectively. Alternatively, render into a context without HTML parsing like textContent.
- In HTML attribute contexts, encode all non-alphanumeric characters as &#xHH;. Not necessary when setAttribute() is used.
- In JavaScript contexts, encode all non-alphanumeric characters as the Unicode codepoints as \uXXXX.
- In URL contexts, using %-url-encoding to encode query parameters.
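
A minimal sketch of two of these escaping rules, the HTML context and the URL context. The names `escapeHtml` and `chatUrl` and the `/chat?with=` path are hypothetical; `encodeURIComponent` is the standard JavaScript built-in.

```javascript
// Map each HTML-significant character to its entity.
const HTML_ESCAPES = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#39;',
};

// HTML context: replace the five significant characters with entities.
function escapeHtml(text) {
  return String(text).replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch]);
}

// URL context: percent-encode an untrusted value used as a query parameter.
function chatUrl(name) {
  return `/chat?with=${encodeURIComponent(name)}`;
}
```

After escaping, a "name" such as a script tag is displayed as literal text rather than being parsed as markup.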

The following server-side script illustrates XSS attacks in all of the above contexts. Assuming server started on port 3000 on localhost, access at <http://localhost:3000/>.

Same Origin Policy

Cookies are often used to retain authentication state; i.e. once a user has authenticated with a web site, that web site gives the browser a cookie set to a special authentication token specifying an authenticated session.
Whenever the browser sends a request to that web site domain it will include that cookie to indicate that the user is authenticated.
It is possible that the JavaScript loaded from a third-party web site also makes a request to the original web site. If this is permitted then the browser would send the authentication token and the third-party web site would have authenticated access to the original web site.
To prevent such attacks, browsers enforce the same origin policy: JavaScript loaded from one web site cannot make requests to another web site.
Same-origin means identical protocol (i.e. http, https), domain and port.
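
The origin comparison can be sketched with the standard URL class, available as a global in browsers and Node.js. The helper name `sameOrigin` is an illustrative assumption; browsers perform this check internally.

```javascript
// Two URLs share an origin when protocol, host and port all match.
// URL.origin serializes exactly those three parts, dropping default ports.
function sameOrigin(a, b) {
  return new URL(a).origin === new URL(b).origin;
}
```

For example, `https://x.com/a` and `https://x.com:443/b` compare as same-origin because 443 is the default https port, while `http://x.com` versus `https://x.com`, or `x.com` versus `www.x.com`, do not.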

Same Origin Policy Continued

Browser permits scripts originating from some origin server to access data only if the data has the same origin.
Origin includes protocol (like http, https), domain (like www.binghamton.edu), and port (like 1234, 443).
document.domain property can be used to allow pages from within same domain hierarchy like x.com and www.x.com to interact (deprecated).
Cross-Origin Resource Sharing uses Access-Control-Allow-Origin HTTP headers to specify authorized origins (wildcard for any).
JSON with Padding (JSONP) allows loading JSON from another origin. The JSON is loaded using a <script> element whose src is the original JSON URL with an additional callback=parseResponse parameter. Server returns the JSON payload json wrapped in a call parseResponse(json). The page's parseResponse() function receives the JSON payload. Libraries like jQuery have JSONP helpers.

Cross-Origin Resource Sharing

Acronym CORS.
Allows relaxing same origin policy.
To allow mashup web pages where a page includes contents from multiple web sites, the standards make it possible to relax the same-origin policy.
A web site can whitelist other domains to access its services.
For non-sensitive services, all domains can be whitelisted.
Project web services use cors package to whitelist all domains and allow selected headers.
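
A rough sketch of what whitelisting amounts to at the HTTP level. The allowlist contents and the helper `corsHeaders` are hypothetical; this is not the API of the cors package, only the response headers a server would end up sending.

```javascript
// Hypothetical allowlist of origins permitted to call this service.
const ALLOWED_ORIGINS = new Set(['https://app.example.com']);

// Return the CORS response headers for a request's Origin header value.
function corsHeaders(requestOrigin, allowAll = false) {
  // Non-sensitive service: any origin may read the response.
  if (allowAll) return { 'Access-Control-Allow-Origin': '*' };
  if (ALLOWED_ORIGINS.has(requestOrigin)) {
    return {
      'Access-Control-Allow-Origin': requestOrigin,
      // Tell caches that the response differs per requesting origin.
      'Vary': 'Origin',
    };
  }
  // No CORS header: the browser blocks cross-origin reads of the response.
  return {};
}
```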

Cross Site Request Forgery (CSRF)

Third party attacker website tricks user into making a hidden request to targeted website; the request will automatically include the user's cookies for the targeted website.

Can be prevented using randomly generated CSRF tokens.
Every form submission contains an HTTP header or hidden field specifying CSRF token.
CSRF Token also submitted using Secure cookie.
Server compares submitted CSRF field value with CSRF cookie value. Request rejected if they do not match.
No server state involved in prevention.
Depends on fact that 3rd party attacker website cannot submit a cookie to the targeted website.
Vulnerable to subdomains setting the cookie or to man-in-the-middle (MITM) attacks.
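
The token generation and comparison steps above can be sketched as follows, assuming Node.js. The function names are illustrative; how the token reaches the cookie and the form depends on the web framework and is not shown.

```javascript
const crypto = require('node:crypto');

// Generate an unpredictable token to place in both a cookie and the form.
function newCsrfToken() {
  return crypto.randomBytes(32).toString('hex');
}

// Accept the request only if the submitted field matches the cookie value.
function csrfValid(cookieToken, fieldToken) {
  if (typeof cookieToken !== 'string' || typeof fieldToken !== 'string') return false;
  if (cookieToken.length === 0) return false; // a missing token never validates
  const a = Buffer.from(cookieToken);
  const b = Buffer.from(fieldToken);
  // timingSafeEqual requires equal lengths, so check that first.
  return a.length === b.length && crypto.timingSafeEqual(a, b);
}
```

The server keeps no per-user state here: it only compares the two values that arrive with the request.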

Session Fixation

Session fixation.

Application assigns session at first contact with anonymous user.
User authenticates; application does not change session when user authenticates.
Attacker creates their own session.
Attacker tricks user into authenticating using the attacker's session. Possibilities include using the session token in a URL argument, hidden form field or cookie.
Attacker exploits authenticated session.
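
The usual defense is to issue a new session ID at the moment of authentication. A minimal sketch with an in-memory store, assuming Node.js; the `sessions` map and function names are hypothetical, and real applications would normally rely on their framework's session regeneration.

```javascript
const crypto = require('node:crypto');

// In-memory session store, for illustration only.
const sessions = new Map();

function newSessionId() {
  return crypto.randomBytes(16).toString('hex');
}

// First contact: create an anonymous session.
function startSession() {
  const id = newSessionId();
  sessions.set(id, { user: null });
  return id;
}

// On successful authentication, discard the old session ID and issue a new
// one, so an ID planted by an attacker never becomes an authenticated session.
function login(oldId, user) {
  const data = sessions.get(oldId) ?? {};
  sessions.delete(oldId);
  const id = newSessionId();
  sessions.set(id, { ...data, user });
  return id;
}
```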

Session Hijacking

Attacker intercepts or guesses session ID token identifying authenticated session.
Session ID tokens should not be predictable.
Session ID tokens should never be transmitted insecurely over HTTP; always use HTTPS.
Do not transmit session ID tokens as part of GET/POST parameters. Always use secure cookies.
Make cookie HttpOnly to avoid XSS browser attacks from reading cookie using document.cookie.
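
These recommendations can be combined in a single Set-Cookie header value. A sketch assuming Node.js; the cookie name `sid` is arbitrary, and the SameSite attribute is an addition not mentioned above.

```javascript
const crypto = require('node:crypto');

// Build a Set-Cookie header value carrying an unpredictable session ID.
function sessionCookie(name = 'sid') {
  const id = crypto.randomBytes(16).toString('hex'); // 128 random bits
  // Secure: only sent over HTTPS. HttpOnly: hidden from document.cookie.
  // SameSite=Lax: withheld from most cross-site requests.
  return `${name}=${id}; Path=/; Secure; HttpOnly; SameSite=Lax`;
}
```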

Secure Programming

Try to avoid using CORS to relax the same origin policy.
Do not allow GET for non-safe requests (like funds transfer).
Always validate all input.
Escape any untrusted input (using escaping mechanisms dependent on the context).
Make sensitive cookies HttpOnly.
Do not send sensitive information like passwords or credit card numbers to browser.

RegEx Denial of Service

Wikipedia

> function time(fn) {
    t0 = Date.now(); fn(); return Date.now() - t0;
  }
> re = /^(a|aa)+$/
/^(a|aa)+$/
> time(() => ('a'.repeat(30) + 'c').match(re))
18
> time(() => ('a'.repeat(40) + 'c').match(re))
1835
> time(() => ('a'.repeat(46) + 'c').match(re))
32408
>

Example: 2019 Cloudflare RegEx DoS caused by a backtracking regex containing .*.*=.*.

Mitigation: use linear time regex engines like Google's re2.
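
When switching engines is not an option, an ambiguous pattern can sometimes be rewritten into an equivalent unambiguous one. A sketch for the pattern in the transcript above; the rewrite is specific to this pattern and the variable names are illustrative.

```javascript
// Ambiguous pattern from the transcript: a run of n 'a's can be split into
// (a|aa) pieces in exponentially many ways, all tried when the match fails.
const slow = /^(a|aa)+$/;

// Equivalent unambiguous pattern: one or more 'a's, matched in only one way.
const fast = /^a+$/;

// Milliseconds taken by fn, as in the transcript but without the global t0.
function time(fn) {
  const t0 = Date.now();
  fn();
  return Date.now() - t0;
}
```

Both patterns accept exactly the non-empty strings of 'a' characters, but the second gives the engine nothing to backtrack over, so a failing input such as `'a'.repeat(46) + 'c'` should be rejected almost immediately.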

Encryption

Unlike hashing, encryption is reversible.
Encryption converts plain text into cipher text based on some algorithm and key.
Decryption converts cipher text into plain text based on some algorithm and key.
Symmetric encryption: uses the same secret key for encryption and decryption. Example algorithms include Blowfish, RC4, AES.
Asymmetric encryption: uses different keys for encryption and decryption. Often slower than symmetric encryption.
Public-Key Encryption: Asymmetric algorithm. Has a public and private key pair. In some schemes such as RSA, either key can be used for encryption with the other used for decryption. It is believed to be impractical to derive the private key from the public key (security rests on 1-way functions; e.g. multiplying two large primes is easy but factoring the product is hard). Example algorithms include RSA, ECC.
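
A small symmetric-encryption round trip, sketched with Node.js's built-in crypto module. AES-256-GCM is chosen here as one reasonable mode; the function names and the shape of the returned object are assumptions for illustration.

```javascript
const crypto = require('node:crypto');

// Encrypt with AES-256-GCM; returns the random nonce, cipher text and auth tag.
function encrypt(key, plaintext) {
  const iv = crypto.randomBytes(12); // fresh nonce for every message
  const cipher = crypto.createCipheriv('aes-256-gcm', key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, 'utf8'), cipher.final()]);
  return { iv, ciphertext, tag: cipher.getAuthTag() };
}

// Decrypt with the same key; throws if the key, cipher text or tag is wrong.
function decrypt(key, { iv, ciphertext, tag }) {
  const decipher = crypto.createDecipheriv('aes-256-gcm', key, iv);
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString('utf8');
}
```

The same 32-byte secret key is used in both directions, which is what makes the scheme symmetric.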

HTTPS

HTTP over Transport Layer Security (TLS).
Provides authentication of remote web site based on certificate signed by a trusted certificate authority. Certificate provides proof of ownership over a public key.
Uses public key cryptography to securely setup a session key which is used to symmetrically encrypt the rest of the session. Specifically:
1. Client creates a random session key which is then sent securely to the server using the server's public key.
2. Once server decrypts the session key both client and server share the same session key.
3. All subsequent communications are encrypted using a symmetric encryption algorithm using the session key.
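
The three steps can be sketched with Node.js's crypto module. This is a simplified illustration of the idea only, not the TLS wire protocol; current TLS versions typically agree on the session key with an ephemeral Diffie-Hellman exchange instead of sending it encrypted. All function names are hypothetical.

```javascript
const crypto = require('node:crypto');

// Server's long-lived key pair; clients learn the public half from the certificate.
const { publicKey, privateKey } = crypto.generateKeyPairSync('rsa', { modulusLength: 2048 });

// Step 1: client picks a random session key and encrypts it to the server.
function clientWrapSessionKey() {
  const sessionKey = crypto.randomBytes(32);
  return { sessionKey, wrapped: crypto.publicEncrypt(publicKey, sessionKey) };
}

// Step 2: server recovers the same session key with its private key.
function serverUnwrapSessionKey(wrapped) {
  return crypto.privateDecrypt(privateKey, wrapped);
}

// Step 3: both sides encrypt traffic symmetrically with the session key.
function seal(sessionKey, message) {
  const iv = crypto.randomBytes(12);
  const cipher = crypto.createCipheriv('aes-256-gcm', sessionKey, iv);
  const data = Buffer.concat([cipher.update(message, 'utf8'), cipher.final()]);
  return { iv, data, tag: cipher.getAuthTag() };
}

function open(sessionKey, { iv, data, tag }) {
  const decipher = crypto.createDecipheriv('aes-256-gcm', sessionKey, iv);
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(data), decipher.final()]).toString('utf8');
}
```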

Basic Authentication

Note that "client" below is often a browser.

Client sends a request for some resource which requires authentication.
Server replies with a 401 UNAUTHORIZED status and a WWW-Authenticate header with value Basic.
Client retrieves user-id and password from its credential cache; if not found in its credential cache, it will get them from the user.
Client resends original request, but with an additional Authorization header which contains Basic along with a base-64 encoding of UserId:Password; i.e., a base64 encoding of the concatenation of the provided UserId, a single colon character :, and the provided Password.
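
The header construction in the last step, and the matching server-side decoding, can be sketched as follows (Node.js Buffer assumed; the function names are illustrative).

```javascript
// Build the Authorization header value for Basic authentication.
function basicAuthHeader(userId, password) {
  const encoded = Buffer.from(`${userId}:${password}`, 'utf8').toString('base64');
  return `Basic ${encoded}`;
}

// Server side: recover the credentials from the header value.
function parseBasicAuth(header) {
  const [scheme, encoded] = header.split(' ');
  if (scheme !== 'Basic' || !encoded) return null;
  const decoded = Buffer.from(encoded, 'base64').toString('utf8');
  const colon = decoded.indexOf(':'); // split at the first colon only
  if (colon < 0) return null;
  return { userId: decoded.slice(0, colon), password: decoded.slice(colon + 1) };
}
```

Splitting at the first colon lets the password contain colons; the user-id cannot.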

Basic Authentication Properties

Since HTTP is a stateless protocol, Authorization header must be sent with each request.
Client (browser) will cache the authorization for a particular realm so that it can send that authorization whenever it receives a 401 UNAUTHORIZED challenge without needing to reprompt the user for user-id and password.
Base64 is an encoding, not encryption. Hence the password can be read by any intermediary. So basic authentication by itself is not secure and should only be used over HTTPS which provides encryption.

WWW-Authenticate Header

WWW-Authenticate: type realm=Realm

type: Specifies authentication scheme. Possibilities include Basic, Digest (extension of basic with nonce and MD5 hashing), Bearer (OAuth 2.0 token).
realm: An opaque string specifying the resources being protected. Examples could include "production server", "CGI scripts".

Base64 Encoding

Base64 is used for encoding arbitrary binary data.

Partition binary sequence into sequence of 6-bit blocks starting with most-significant bits. If # of bytes not divisible by 3, pad with 0-bytes on right to make a multiple of 3.
Map each 6-bit block into an ASCII alphanumeric (upper-alpha, lower-alpha, digits for a total of 62 possibilities) or one of the + or / characters, for 64 possibilities in all.
Replace any padding 0's with = characters. Hence a single trailing = indicates 1-byte of padding, 2-trailing = characters indicate 2 bytes of padding.

Base64 Encoding Examples

$ echo 'hello world' | base64
aGVsbG8gd29ybGQK
$ echo 'hello world1' | base64
aGVsbG8gd29ybGQxCg==

#-n removes newline at end
$ echo -n 'hello world' | base64
aGVsbG8gd29ybGQ=
$ echo -n 'hello world1' | base64
aGVsbG8gd29ybGQx
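
The encoding steps can be written out by hand, as a sketch for illustration; in practice Node.js code would simply call `Buffer.from(data).toString('base64')`. The name `base64Encode` is hypothetical.

```javascript
const ALPHABET =
  'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';

// Encode a byte sequence by hand, three bytes (24 bits) at a time.
function base64Encode(bytes) {
  let out = '';
  for (let i = 0; i < bytes.length; i += 3) {
    const n = bytes.length - i; // bytes remaining, at least 1
    // Pack up to three bytes into a 24-bit number, zero-padding on the right.
    const chunk = (bytes[i] << 16) | ((bytes[i + 1] ?? 0) << 8) | (bytes[i + 2] ?? 0);
    // Emit four 6-bit blocks, most-significant bits first.
    out += ALPHABET[(chunk >> 18) & 63];
    out += ALPHABET[(chunk >> 12) & 63];
    // Blocks that would encode only padding bytes become '='.
    out += n > 1 ? ALPHABET[(chunk >> 6) & 63] : '=';
    out += n > 2 ? ALPHABET[chunk & 63] : '=';
  }
  return out;
}
```

Called on `Buffer.from('hello world')` (11 bytes, so one byte of padding) this should reproduce the `echo -n` output shown above, including the single trailing `=`.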

OAuth2

OAuth2 is a delegation protocol for delegating authorization (often misused for authentication).

A server contains resources owned by a user.
An example server would be a photo storage service photo-storage storing photos owned by a user Alice.
The user authorizes a client (an app or third-party web site) to access these resources on her behalf.
Hence Alice could authorize a photo-print client to read selected photos on photo-storage for a limited period of time in order to print them out. She may not want to allow photo-print to read all stored photos; she definitely would not want photo-print to be able to delete stored photos.
OAuth2 is used to set up this authorization.

OAuth2 Protocol Overview

In order for a human user to give access to server resources to some client program:

The client sends the user to a page which requests the user to authorize access to server resources.
The user interacts with the server to authorize the request.
The user is redirected to the client with an authorization code.
The client sends the authorization code to the server.
The server returns an access token to the client.
The client uses the access token to access user resources on the server.

OAuth2 Prerequisites

Client must preregister with server. When registering, it must provide a set of callback URLs.
After registration, server provides client with a CLIENT_ID (which is public) and a CLIENT_SECRET which is private.

OAuth2 Authorization Request

Client asks user to click on a link which looks something like:

      https://api.server.com/authorize?
        response_type=code
        &client_id=CLIENT_ID
        &redirect_uri=CALLBACK_URL
        &scope=RESOURCES
        &state=STATE

The URL can be anything, but the query parameter names are fixed.
CLIENT_ID and CALLBACK_URL were set up during the registration process.
RESOURCES specifies the resources on the server the client would like to access on behalf of the user.
The state parameter is optional. If provided, STATE is a string whose meaning is interpreted only by the client.
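
A sketch of how a client might assemble such a link with the standard URL and URLSearchParams classes. The function name and argument names are placeholders; the callback parameter is written as `redirect_uri`, the name defined by the OAuth 2.0 specification (RFC 6749).

```javascript
const crypto = require('node:crypto');

// Build the authorization link; the endpoint and values are placeholders.
function authorizationUrl({ endpoint, clientId, callbackUrl, scope }) {
  // Random per-request value the client can check again in the callback.
  const state = crypto.randomBytes(16).toString('hex');
  const url = new URL(endpoint);
  url.search = new URLSearchParams({
    response_type: 'code',
    client_id: clientId,
    redirect_uri: callbackUrl,
    scope,
    state,
  }).toString();
  return { url: url.toString(), state };
}
```

The client would remember `state` (for example in the user's session) before sending the user to `url`.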

Authorization Grant

If user successfully grants the requested access to the client on the server, then the server redirects the user back to the client:

         CALLBACK_URL?code=AUTH_CODE&state=STATE

CALLBACK_URL is the CALLBACK_URL parameter provided during the authorization request.
AUTH_CODE is the authorization code.
STATE is the state parameter provided in the authorization request (if any).
- The client can include a nonce in the STATE parameter to ensure that the callback originated from it.
- The client can use the STATE parameter to redirect the user to different URLs based on the initial request by the user.
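
A sketch of the client's callback handling, assuming the client stored the state value it sent with the authorization request. The function name and error messages are illustrative.

```javascript
// Handle the redirect back to the client: check state, then extract the code.
function handleCallback(callbackUrl, expectedState) {
  const params = new URL(callbackUrl).searchParams;
  if (params.get('state') !== expectedState) {
    throw new Error('state mismatch: callback was not initiated by this client');
  }
  const code = params.get('code');
  if (!code) throw new Error('missing authorization code');
  return code;
}
```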

Exchanging Authorization Token for an Access Token

The client makes a token request to the server:

  POST https://api.server.com/access_token?
    grant_type=authorization_code
    &code=AUTH_CODE
    &redirect_uri=CALLBACK_URL

Authorization header set to Basic BASE-64 where BASE-64 is the base-64 encoding of CLIENT_ID:CLIENT_SECRET. (Alternately, can also be passed as POST parameters client_id and client_secret)
URL can be anything, but query parameter names are fixed.
AUTH_CODE is the authorization code granted by the server.
CALLBACK_URL is exactly the same as what was used when getting the authorization code.
CLIENT_ID and CLIENT_SECRET are the values set up for the client during the registration process.
CLIENT_SECRET is omitted if it is a mobile app or single-page browser app since using it would necessarily involve making it public.
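
A sketch that assembles, but does not send, such a token request for a confidential client. The function and argument names are placeholders; the parameters are placed in a form-encoded POST body and the callback is named `redirect_uri`, following the OAuth 2.0 specification (RFC 6749).

```javascript
// Assemble (but do not send) the token request for a confidential client.
function tokenRequest({ endpoint, clientId, clientSecret, code, callbackUrl }) {
  // Client credentials go in a Basic Authorization header.
  const credentials = Buffer.from(`${clientId}:${clientSecret}`).toString('base64');
  return {
    url: endpoint,
    method: 'POST',
    headers: {
      Authorization: `Basic ${credentials}`,
      'Content-Type': 'application/x-www-form-urlencoded',
    },
    body: new URLSearchParams({
      grant_type: 'authorization_code',
      code,
      redirect_uri: callbackUrl,
    }).toString(),
  };
}
```

The returned object has the shape accepted by `fetch(req.url, req)`; the extra `url` property is only for the caller's convenience.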

Receiving an Access Token

If the exchange is successful, the server responds with a JSON object:

{ "access_token": ACCESS_TOKEN,
  "token_type": "Bearer"
}

ACCESS_TOKEN is the token used for authorizing subsequent requests.
Can also contain an "expires_in" field specifying the token lifetime in seconds.
The ACCESS_TOKEN can be used by the client to make subsequent requests to the server:
```
    Authorization: Bearer ACCESS_TOKEN
```

References

Wikipedia article on the Morris Worm.

John Viega, Gary McGraw, Building Secure Software, Addison-Wesley, 2002.

Peter Bright, Anonymous speaks: the inside story of the HBGary hack, Ars Technica (strongly recommended).

Peter Bright, With arrests, HBGary hack saga finally ends, Ars Technica.

Nate Anderson, How Anonymous accidentally helped expose two Chinese hackers, Ars Technica.

Mainstream press coverage of the HBGary hack in the New York Times: Eric Lipton and Charles Savage, Hackers Reveal Offers to Spy on Corporate Rivals, New York Times, Feb 11, 2011.

Port Swigger Labs: free online security labs.