Your privacy on the World Wide Web:
Connection information:
When you connect to any Transmission Control Protocol (TCP) based server, such as a web or email
server, you pass your IP address to that server. Once a server possesses your IP address, it can perform
reverse-DNS (Domain Name System) lookups to discover who provides you with your Internet access,
and perhaps, therefore, what geographic area you are located in. While such DNS information can be
falsified (this is one of the skills that can be adapted to one of the zebulun challenges), the server can
perform similar lookups using the American Registry for Internet Numbers (ARIN). Depending upon the
nature of your connection to the Internet, this may yield your actual physical address and contact
information, or it may merely provide information about your upstream Internet access provider.
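As a rough illustration of the reverse-DNS step described above, the following Python sketch shows the name that a reverse (PTR) lookup queries, and how a server might ask its resolver for the hostname registered for a client address. The function names are hypothetical, and 192.0.2.1 is a reserved documentation address rather than a real client.

```python
import ipaddress
import socket
from typing import Optional


def ptr_query_name(ip: str) -> str:
    """Return the DNS name that a reverse (PTR) lookup for `ip` queries."""
    return ipaddress.ip_address(ip).reverse_pointer


def reverse_lookup(ip: str) -> Optional[str]:
    """Ask the resolver for the hostname registered for `ip`, if any."""
    try:
        hostname, _aliases, _addresses = socket.gethostbyaddr(ip)
        return hostname
    except OSError:
        # No PTR record exists, or the resolver could not be reached.
        return None


# 192.0.2.1 is reserved for documentation, so it stands in for a client IP.
print(ptr_query_name("192.0.2.1"))  # 1.2.0.192.in-addr.arpa
```

Calling reverse_lookup() with a real client address requires network access, and the hostname it returns often contains the name of the access provider.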
Yet another resource for similar information can be found in an rwhois database, if your service provider
runs one, or in an identd service. All of these services are designed to reveal information about who is
using specific IP addresses, so that you can be granted access to services that require this information.
Once they know who your ISP is, and what your username is, they can then request any information that
they do not already have (such as your name, address, and telephone number) from the ‘finger’ service (if
your ISP runs it), or by submitting VRFY and/or EXPN requests to the designated Simple Mail Transfer
Protocol (SMTP) server for your email domain.
Although not every user is exposed in all of these ways, this can result in a great deal of information
about you getting out without your knowledge or consent, all because you had the gall to connect to
someone else’s server on the Internet. And this exposure occurs even before you have made any
request for data!
Request data:
When you request data from a website, a considerable amount of information exchange takes place, some
of which may intrude upon your privacy, without your knowledge.
Firstly, it is worth mentioning that just because you ask for specific data does not mean that you will get it.
Lots of web sites purport to provide back doors into password recovery mechanisms, or simulate login
screens for services with which they have no affiliation at all. These sites exist solely to mislead you into
providing information to the site that you would otherwise protect, such as your usernames and passwords.
Additionally, when you request a specific domain or URL, there is no particular guarantee that your web
client software will take you to the site that you requested, or that the content that you see on the page will
belong exclusively to that site.
One common example of this situation is the banner advertising schemes of such organizations as FlyCast
(now operated by Engage, Inc.), or DoubleClick, in which site redirection and banner ads, containing
cookies which could serve as identifying tags, are
Notes: What you asked for, and what you got; refresh tags, webserver logging, cookies, and embedded
scripts/images.
Authentication and Ciphering:
Ciphers:
Typically, data that you receive or transmit via the Web is carried completely unencrypted, the exception
being data transmitted via the Secure Sockets Layer (SSL). Most Web client software will warn you when
you are transmitting data to a form, unencrypted, and it will typically let you know when you have
switched from SSL-ciphered data to cleartext (or unciphered) data. Such software will not, however, warn
you each time that you send a request for, or receive, unencrypted data.
Since most data is passed unencrypted over the Internet, your requests, as well as the related responses, are
readily monitored by anyone with access to the transmission path. This means that your ISP, the access
provider for the web site that you are talking to, and anyone involved in the intervening network have the
capacity to insert themselves into your data stream, examine your traffic, and even make changes to it!
This is the principle upon which the FBI’s Carnivore (now called DCS1000) and the international Echelon
electronic surveillance systems are based. While SSL-based ciphering may offer some assurance of
privacy, it is worth noting that the United States’ National Security Agency (NSA) has traditionally
blocked efforts to export cipher technologies that are not either based upon crippled cipher strengths, or
equipped with key-escrow systems that effectively offer the NSA a back-door into the data. That SSL
technology has been allowed to proliferate, through the licensing of RSA Security Inc.’s algorithms, is
indicative that the ciphers may be vulnerable, where the United States government has an interest in
examining the data. If you think that, by using SSL ciphers, you are safe from Carnivore or Echelon, you
may not be paranoid enough – although, if you are merely trying to protect your email from examination by
parents or employers, then SSL ciphers are probably adequate.
Authentication:
Assuming that you are logging into a web site, without using SSL ciphers, then it is a good bet that you are
using no ciphers at all. This is the case when logging into the CyberArmy web site, just as it is, typically,
with many web-based email sites, and discussion forums.
CGI authentication:
Sometimes the login is accomplished using a web form, with a Common Gateway Interface (CGI) script as
the device collecting the data; in this case, the password is being passed completely in-the-clear, with no
effort made to protect it, at all, in a format similar to the following:
POST /cgi-bin/login.pl HTTP/1.0
Host: www.someplace.com
Content-length: 29

usernm=mryowler&passwd=secret
Setting aside the obvious risks associated with having your password exposed in this way, it’s also worth
mentioning that any data that you subsequently access or transmit, having now authenticated, is also
probably being passed unencrypted. This leaves the very data that the password was intended to protect
as exposed as the password itself.
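To make the exposure concrete, here is a minimal Python sketch of what anyone on the transmission path could do with the login request shown earlier. It assumes the captured request is available as a string; the CAPTURED constant and the extract_form_fields() helper are illustrative names, not part of any real tool.

```python
from urllib.parse import parse_qs

# The login request from the example, as it would appear on the wire.
CAPTURED = (
    "POST /cgi-bin/login.pl HTTP/1.0\r\n"
    "Host: www.someplace.com\r\n"
    "Content-length: 29\r\n"
    "\r\n"
    "usernm=mryowler&passwd=secret"
)


def extract_form_fields(raw_request: str) -> dict:
    """Split the headers from the body and decode the form fields."""
    _headers, _separator, body = raw_request.partition("\r\n\r\n")
    # parse_qs returns a list of values per key; keep the first of each.
    return {key: values[0] for key, values in parse_qs(body).items()}


fields = extract_form_fields(CAPTURED)
print(fields["usernm"], fields["passwd"])  # mryowler secret
```

No decoding step beyond ordinary form parsing is needed, which is the point: the password travels exactly as it was typed.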
htpasswd authentication (Authorization: BASIC):
While web servers are typically capable of supporting many different types of authentication, most sites use
one that is referred to, in the HyperText Transfer Protocol (HTTP), as Authorization: BASIC. This
authentication method uses the base64 encoding scheme to encode the username and password within an
HTTP header, whenever a page is accessed that requires a password. This is, in fact, the scheme which is
employed by the CyberArmy web server.
The base64 algorithm takes text in 3-byte (24-bit) cleartext blocks, and converts each of them to a
4-character encoded block, which also represents 24 bits (6 bits per character). While the details of the
base64 algorithm are beyond the scope of this document, the element of the algorithm that is relevant here
is that base64 encodes the text without using a key of any kind. It can be decoded by anyone who
possesses knowledge of the algorithm, which is widely available. As a result, passwords encoded in this
way are not safe from interception; this type of encoding merely obfuscates the password, and does not
effectively encrypt it.
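The block arithmetic can be checked with Python's standard base64 module. This is only a small sketch: the three-byte sample is the start of the example username used in this document.

```python
import base64

block = b"mry"                        # one 3-byte (24-bit) input block
encoded = base64.b64encode(block)     # four characters, 6 bits each
decoded = base64.b64decode(encoded)   # reversed without any key

print(encoded)  # b'bXJ5'
print(decoded)  # b'mry'
```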
When a password is transmitted, using the Authorization: BASIC form of htpasswd authentication, it
usually looks something like this:
GET /protectedpage.htm HTTP/1.0
Host: www.someplace.com
Authorization: Basic bXJ5b3dsZXI6c2VjcmV0
The text ‘bXJ5b3dsZXI6c2VjcmV0’ decodes to ‘mryowler:secret’, and is, of course, easy to spot in the
HTTP request.
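Recovering the credentials from an intercepted header line takes only a few lines of Python. The sketch below assumes the header has already been captured as text; decode_basic_auth() is a hypothetical helper name.

```python
import base64


def decode_basic_auth(header_line: str):
    """Return (username, password) from an 'Authorization: Basic ...' line."""
    name, _, value = header_line.partition(":")
    scheme, _, token = value.strip().partition(" ")
    if name.strip().lower() != "authorization" or scheme.lower() != "basic":
        raise ValueError("not a Basic Authorization header")
    # The token is just base64 text of 'username:password'.
    credentials = base64.b64decode(token).decode("utf-8")
    username, _, password = credentials.partition(":")
    return username, password


print(decode_basic_auth("Authorization: Basic bXJ5b3dsZXI6c2VjcmV0"))
# ('mryowler', 'secret')
```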
Netscape Cookies:
Once authenticated into a system (particularly using the CGI authentication method), a user’s identity must
be maintained. Since the Web operates over a stateless protocol (meaning that information about a user is
generally not maintained on the server under the HyperText Transfer Protocol), Netscape came up with
the notion of cookies, to allow the web client software to retain state information. This concept was very
popular among web sites, and is now implemented in nearly every popular web client program. The client
software is expected to keep track of specific information, usually some form of identity tag, and transmit
this information back to the server that assigned it, according to rules established by that server.
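The round trip can be sketched with Python's standard http.cookies module. The Set-Cookie value below is a hypothetical example of what a server might assign; the identity tag reuses the sample value from this document.

```python
from http.cookies import SimpleCookie

# Hypothetical value of a Set-Cookie header sent by the server.
set_cookie_value = "id=nozxcpoi; Path=/"

jar = SimpleCookie()
jar.load(set_cookie_value)

# On later requests to the same server, the client sends the tag back.
cookie_header = "Cookie: " + "; ".join(
    f"{name}={morsel.value}" for name, morsel in jar.items()
)
print(cookie_header)  # Cookie: id=nozxcpoi
```

The Path attribute is one of the rules established by the server: it tells the client which requests the cookie should accompany.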
htpasswd-based authentication will typically reassert itself with every protected page that is loaded from a
web server, since the web server will challenge the web client each time it attempts to access protected
content. In cases where this type of authentication is used, cookie-based authentication may not be
necessary, though it is often used regardless.
In cases where identity is established using a cookie, the cookie itself is subject to interception, and to use
as an authenticating mechanism by someone who wants to pose as you. This is a standard tactic for gaining
access to web-based email accounts. Although CyberArmy does not trust cookies to authenticate users
into zebulun rank-restricted areas of the server, the web server does use cookies to identify the username
used to post to the message forums. Cookie-based authentication looks something like this:
GET /protectedpage.htm HTTP/1.0
Host: www.someplace.com
Cookie: id=nozxcpoi
Although the meaning of the cookie may be unclear, it is simple enough for anyone who intercepts it to
insert it into their own HTTP request headers, thereby adopting whatever identity the cookie represents to
the target web server.
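The following Python sketch shows how little is involved in attaching an intercepted cookie to a new request. It only builds the request object and never sends it; the host and page are the placeholder names used in the example above.

```python
from urllib.request import Request

# Cookie value taken from the intercepted request in the example.
intercepted = "id=nozxcpoi"

# Build (but do not send) a request that carries the intercepted cookie.
forged = Request(
    "http://www.someplace.com/protectedpage.htm",
    headers={"Cookie": intercepted},
)

print(forged.get_method(), forged.selector)  # GET /protectedpage.htm
print(forged.get_header("Cookie"))           # id=nozxcpoi
```

Nothing in the request distinguishes the original client from the person who copied the cookie, which is why the server cannot tell them apart.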
Client-side scripting and Plug-ins:
Through the use of client-side scripting, it is possible for web sites to execute programs on your computer.
While technologies such as Java applets and JavaScript limit the kinds of things that the programs can do,
others such as VBScript allow much broader access. Additionally, some web sites ask the user to
install ‘plug-ins’, some of which are ostensibly digitally signed by some authority.
Plug-ins:
Rather obviously, one should consider the source, when receiving a ‘plug-in’. If you are downloading it
from a site that you have no particular reason to trust, or if it is digitally signed by an organization that you
have no particular reason to trust, then it’s worth thinking twice. Just because the plug-in is from a big-
name company, like Microsoft, or Adobe, does not make it trustworthy; lots of people do not trust these
companies. Remember, their interests are best served, by gathering marketing data about you. Some
companies may even sell the information that they collect – companies make money by understanding what
people will pay for – and privacy has traditionally been something that people have been willing to give a
great deal more to invade, than to protect.
As with downloading programs from the Internet, remember that you typically have no idea what a plug-in
is doing, beyond whatever the sender has told you about it. A plug-in can easily possess the attributes of
any trojan, worm, or virus, and it has every opportunity to access information stored on, or passing
through, your computer and the connected network. If you have a firewall, the plug-in is now inside of the
firewall, and burrowed into the soft underbelly of your network, and such a plug-in now has the capability
to begin transmitting data out.
JavaScript:
Cookies:
Spyware: