A. Glossary – How to Use Web 2.0 and Social Networking Sites Securely

Appendix A. Glossary

* Terms marked with an asterisk are taken from the definitions included in Alan Calder and Steve Watkins, A Dictionary of Information Security Terms, Abbreviations and Acronyms (IT Governance Publishing, 2007).

Adware

– Adware (spelled all lower case) is the name given to any software application in which advertising banners are displayed on the web page. The advertisements can generally be viewed through pop-up windows or through a bar that appears on a computer screen[50].

Ajax

– (Asynchronous JavaScript and Extensible Markup Language or XML) is a set of technologies which enable greater processing to be carried out on the client computer, rather than the server. In the traditional web application, the user clicked and then waited some number of seconds for the server to respond and refresh the page. In contrast, Ajax-enabled web pages are far more reactive, giving the user the appearance that pages are updating instantly. This is illustrated by the application ‘Google Maps’, where the page and map are refreshed instantly as the cursor is moved. Ajax is not a new technology, but rather a combination of existing technologies being used in a new way.

Ajax endpoints

– In contrast to typical Web 1.0 applications, Ajax applications send a greater number of smaller requests to the server, which creates many more points of input.

Figure 7. Ajax endpoints

These points of input are also referred to as Ajax endpoints, and they provide a greater number of opportunities for that traffic to be attacked.

Blogs

– ‘Blog’ is an abbreviation of ‘Weblog’, which is a term originally used to describe a web page where the ‘blogger’ (author or writer of the page) ‘logs’ all other web pages they find interesting. Readers can subscribe to a blog, post comments to a blog, and select links on a blog.

Botnet*

– A network of zombie computers, usually created and controlled by criminals, either for distributing spam or for mounting DDoS attacks.

Cache

– A place which is used to store something temporarily. Within the context of a web browser, the cache is the storage space, physically located on the hard disk, which contains the most recently downloaded files[51]. This saves the time needed to reload a web page and also avoids additional network traffic.

The cache* is the section of a computer’s memory which retains recently accessed data in order to speed up repeated access to the same data. If the data on the Web has altered since you last visited it, you may need to refresh the page to see the new data, otherwise you will only see what is stored in the cache.
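The behaviour described above, where a stored copy is served until the page is refreshed, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the way any particular browser implements its cache: the names `SimpleCache`, `fetch_from_web` and `live_pages`, and the example URL, are all hypothetical.

```python
class SimpleCache:
    """Toy cache: keeps the last copy of each page until a refresh is forced."""

    def __init__(self, fetch):
        self._fetch = fetch   # function that retrieves a page from its origin
        self._store = {}      # url -> cached content

    def get(self, url, refresh=False):
        # Serve the stored copy unless it is missing or a refresh is requested.
        if refresh or url not in self._store:
            self._store[url] = self._fetch(url)
        return self._store[url]


# Hypothetical stand-in for the live Web: content that changes over time.
live_pages = {"http://example.com/news": "version 1"}


def fetch_from_web(url):
    return live_pages[url]


cache = SimpleCache(fetch_from_web)
first = cache.get("http://example.com/news")                # fetched and stored
live_pages["http://example.com/news"] = "version 2"         # the page changes
stale = cache.get("http://example.com/news")                # cached copy served
fresh = cache.get("http://example.com/news", refresh=True)  # refreshed copy
```

In this sketch `stale` still holds the old content after the page has changed, and only the call with `refresh=True` picks up the new data.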

Within the context of Google[52], a cache is a snapshot of a page which is taken as a backup in case the original page is unavailable. Google retrieves these ‘caches’ as it crawls the Web. The cached content is the content Google uses to judge whether a page is a relevant match for a query. When the cached page is displayed, it will have a header at the top which serves as a reminder that this is not necessarily the most recent version of the page. Terms that match the query are highlighted on the cached version.

The ‘cached’ link will be missing for sites that have not been indexed, as well as for sites whose owners, according to Google, have requested that their content not be cached.

Collaboration tool

– A collaboration tool uses a variety of Web 2.0 technologies with the purpose of aiding internal collaboration and communication within the workplace.

Copyright owner

– Generally speaking, a copyright owner in the first instance is the creator of a literary, dramatic, musical or artistic work. Copyright in works made during the course of employment is owned by the employer and not the employee[53].

CSS (Cascading Style Sheets)

– ‘A W3C (World Wide Web Consortium) recommended language for defining style (look and feel such as font, size, color, spacing, etc.) for web documents’[54]. It is a technology which enables content (written in HTML or a similar mark-up language) to be separated from its presentation (written in CSS). Because style sheets cascade, some style rules take precedence over others.

Data

– A collection of facts from which conclusions may be drawn[55].

Data controller

– Within the context of the UK Data Protection Act, the data controller is the person who determines the purposes for which, and the manner in which, personal information is to be processed. This may be an individual or an organisation and the processing may be carried out jointly or in common with other persons[56].

Data mining

– The process of sorting through data to identify patterns and establish relationships[57]. On the Web, data can be mined using search engines or spiders.

Data subject

– This is the living individual who is the subject of the personal information (data)[58].

Defamatory

– An act of communication that causes someone to be shamed, ridiculed, held in contempt, lowered in the estimation of the community, or to lose employment status or earnings or otherwise suffer a damaged reputation[59].

Denial of service attack (DoS)*

– This sort of attack is designed to put an organisation out of business, or to interrupt the activities of an individual or group of individuals, for a time by freezing its systems. This is usually done by flooding a web server (or other device) with e-mail messages or other data so that it is unable to provide a normal service to authorised users.

DRM (Digital Rights Management)

– A systematic approach to copyright protection for digital media. The Digital Millennium Copyright Act (DMCA) was enacted on 28 October 1998 in the United States in order to protect the digital rights of copyright owners and consumers[60].

Exponential

– Web 2.0 tools enable users to connect with a very large number of people in a short period of time at low cost. This is referred to as the ‘viral’ nature of Web 2.0: the virus metaphor describes the ability of a virus to reproduce itself very rapidly in a short space of time. The speed with which this can happen, and the number of people who can be involved, is also described with more positive connotations, as ‘exponential’.

FTP*

– File Transfer Protocol is a method of transferring files over the Internet.

Gmail

– Google Mail, or Gmail, is a free, search-based WebMail service available from Google, which also enables e-mails to be picked up on mobiles. Security vulnerabilities in Gmail have caused e-mails to be transferred and stolen with consequent potential data disclosure[61]. Although Google patched the vulnerability, users of Gmail were not necessarily made aware of the need to repair the derived vulnerability in their own systems. The fact that Web 2.0 companies apparently prefer to downplay such issues might lead to them becoming a preferred attack vector for hackers and malware jockeys.

Folksonomies

– A ‘folksonomy’ is a collection of tags used to organise and easily find content on the Web. A folksonomy is created collaboratively by the users who contribute tags to it.
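As a rough illustration of the idea, a folksonomy can be modelled as an index from tags to the content that users have labelled with them. The sketch below is only one possible representation; the class name `Folksonomy`, its methods, the user names and the URLs are all hypothetical.

```python
from collections import defaultdict


class Folksonomy:
    """Toy tag index built up collaboratively by many users."""

    def __init__(self):
        # tag -> url -> set of users who applied that tag to that url
        self._index = defaultdict(lambda: defaultdict(set))

    def add_tag(self, user, url, tag):
        # Tags are normalised to lower case so 'Security' and 'security' match.
        self._index[tag.lower()][url].add(user)

    def find(self, tag):
        # Return every URL that at least one user has given this tag.
        return sorted(self._index.get(tag.lower(), {}))

    def popularity(self, tag, url):
        # Number of distinct users who applied the tag to the URL.
        return len(self._index.get(tag.lower(), {}).get(url, set()))


tags = Folksonomy()
tags.add_tag("alice", "http://example.com/a", "Security")
tags.add_tag("bob", "http://example.com/a", "security")
tags.add_tag("bob", "http://example.com/b", "security")
tags.add_tag("carol", "http://example.com/c", "privacy")

security_pages = tags.find("security")
votes_for_a = tags.popularity("security", "http://example.com/a")
```

Here no single user defines the categories: the `security` tag ends up pointing at two pages purely because several users chose to apply it.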

Information*

– The New Shorter Oxford English Dictionary provides these helpful definitions: ‘knowledge or facts communicated about a particular subject, events, etc.; intelligence, news’ and ‘without necessary relation to a recipient: that which inheres in or is represented by a particular arrangement, sequence or set, that may be stored in, transferred by, and responded to by inanimate things’. Clearly information, or data, exists in many forms but, for the purposes of its security, we are concerned with data that has a digital, paper, or voice format. Information is defined by Coleman and Levine** as ‘Data put into context by a human to give it meaning’.

Instant messaging*

– (IM) is a communication methodology that is analogous to a private chat room; it enables you to communicate over the Internet in real time with another person, using text.

Intellectual property

– Intellectual property (IP) can allow you to own things you create in a similar way to owning physical property[62]. Intellectual property implies ownership of content which is created intellectually, through thinking, or the creation of ideas. Intellectual property acts define this ownership in law.

There are four main types of intellectual property:

  • Copyright – Copyright protects material such as literature, art, music, sound recordings, films and broadcasts.

  • Designs – Designs protect the visual appearance or eye appeal of products.

  • Patents – Patents protect the technical and functional aspects of products and processes.

  • Trademarks – Trademarks protect signs that can distinguish the goods and services of one trader from those of another.

Intellectual property rights are a complex area of law; an appreciation of the complexities of the subject can be gained from referring to the FAQs available from the United States Copyright Office and the US Patent and Trademark Office.

Internet, the*

– The massive global network of networks, connecting millions of computers, allowing any computer to communicate with any other by any one of a number of protocols. The Internet is not the (World Wide) Web.

JavaScript

– This is a type of programming language used for web applications whereby the commands are interpreted and run one at a time. JavaScript runs on the client computer, where Web 2.0 applications use it to initiate calls to the server and then to programmatically access and update the client’s browser[63].

Malware

– ‘Malware’ is a term that denotes software designed for some malicious purpose. Common forms of malware include viruses, worms and Trojans.

A virus is able to produce copies of itself but depends on a host file to carry each copy. A worm can also replicate itself but does not rely on a host file to carry it. A worm can replicate itself by means of a transmission medium such as e-mail, instant messaging, Internet Relay Chat or network connections.

The term ‘Trojan’ is an analogy derived from the legend of the wooden horse built by the ancient Greeks to enable them to enter the walled city of Troy by stealth, by concealing themselves inside it. In computer terms a Trojan is hostile code concealed within and purporting to be bona fide code, often with the intention of achieving control over another system or collecting information from within it.

Mashups

– Within the context of Web 2.0 the term is used to describe the mechanism by which multiple sources of information can be combined to create a single application.
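A minimal sketch of the mechanism, assuming two made-up sources of information, is shown below. Both data sets, the venue names, the coordinates and the function name `mashup` are hypothetical; a real mashup would typically fetch each source from a separate web service.

```python
# Source 1 (hypothetical): an event listing service.
EVENTS = [
    {"name": "Security meetup", "venue": "Town Hall"},
    {"name": "Web 2.0 talk", "venue": "Library"},
    {"name": "Pop-up workshop", "venue": "Unknown Venue"},
]

# Source 2 (hypothetical): a mapping service that knows venue coordinates.
COORDINATES = {
    "Town Hall": (51.50, -0.12),
    "Library": (51.52, -0.10),
}


def mashup(events, coordinates):
    """Combine both sources into a single list of events that can be mapped."""
    combined = []
    for event in events:
        location = coordinates.get(event["venue"])
        if location is not None:  # skip events the mapping source cannot place
            combined.append({**event, "lat": location[0], "lon": location[1]})
    return combined


mapped_events = mashup(EVENTS, COORDINATES)
```

Neither source on its own can plot events on a map; the combined list is the single application view that the definition describes.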

Online collaboration

– Web 2.0 online collaboration tools provide users with the ability not only to upload content to the Web, but also to upload content to a single, shared space which can be accessed by many users.

Web 2.0 online collaboration tools incorporate Web 2.0 technologies such as social networking and wikis within a single application or workspace which is visible to the entire team. They enable users to:

  • Create and share team documents.

  • Create individual or group information workspaces.

  • Post to team- or organisation-wide blogs.

  • Manage team projects.

  • Automate employee alerts of changes to content with RSS feeds.

Openness and transparency

– The concept of ‘openness’ within the context of Web 2.0 relates more to making intellectual ideas, developments or creations available so that they can be developed exponentially by a wider, external community. The antonyms of ‘openness’ and ‘open source’ are ‘closed’ and ‘closed source’.

Payload

– ‘Payload,’ within the context of web filtering, is the amount of damaging material contained within a packet of data.

Personal data*

– That information about a living person (i.e. not an organisation) that is protected by legislation and regulation.

Personally identifiable information[64]

– Any information relating to an identified or identifiable individual who is the subject of the information, such as a social security number, date of birth, mother’s maiden name, address, etc.

Phishing*

– This is the sending of e-mails that falsely claim to come from a legitimate company in an attempt to scam users into surrendering information that can be used for identity theft.

RSS

– Really Simple Syndication (RSS) is the best-known type of web feed. A web feed is an automatic notification of an update to a website. Notification of new content requires a subscription to that ‘feed’ as well as RSS and/or Atom reader software, which enables new content to be viewed. The readers are either downloadable programs or available as online services.
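The core job of a feed reader, spotting items that the subscriber has not yet seen, can be sketched as follows. This is an illustrative sketch only: the sample feed, its URLs and the function name `new_items` are assumptions, and the feed is supplied as an inline string rather than downloaded from a real site.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 feed, inlined so the example needs no network access.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example Blog</title>
    <item><title>First post</title><link>http://example.com/1</link></item>
    <item><title>Second post</title><link>http://example.com/2</link></item>
  </channel>
</rss>"""


def new_items(feed_xml, seen_links):
    """Return (title, link) pairs for items whose link has not been seen."""
    root = ET.fromstring(feed_xml)
    items = []
    for item in root.iter("item"):
        link = item.findtext("link")
        if link not in seen_links:
            items.append((item.findtext("title"), link))
    return items


already_read = {"http://example.com/1"}
unread = new_items(SAMPLE_FEED, already_read)
```

A real reader would fetch the feed periodically and remember the links it has already shown, which is what the `already_read` set stands in for here.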

Sensitive PII

– This includes confidential medical information or information relating to racial or ethnic origins, political or religious beliefs, or sexuality that is tied to personal information.

Signature defence

– An electronic signature used by banks to prove that they are the originators of e-mails, in order to combat phishing attacks.

Social networking

– A social network is a virtual community, usually accessed via the Internet but also increasingly available via mobile devices such as the iPhone.

Social networking websites enable users to create their own online page or profile and to construct and display an online network of contacts, often called ‘friends’. Users create their own pages, link to other members and communicate by voice, chat, instant message, videoconference and blog. They can communicate via their profile both with their ‘friends’ and with people outside their list of contacts. This can be on a one-to-one basis or in a more public way such as a comment, typically posted on a message board for all to see.

Spider

– Whatis define a ‘spider’ as follows[65]:

A spider is a program that visits websites and reads their pages and other information in order to create entries for a search engine index. The major search engines on the Web all have such a program, which is also known as a ‘crawler’ or a ‘bot’. Spiders are typically programmed to visit sites that have been submitted by their owners as new or updated. Entire sites or specific pages can be selectively visited and indexed. Spiders are called spiders because they usually visit many sites in parallel at the same time, their ‘legs’ spanning a large area of the ‘Web’. Spiders can crawl through a site’s pages in several ways. One way is to follow all the hypertext links in each page until all the pages have been read.
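The link-following approach mentioned at the end of that definition can be sketched as below. To keep the example self-contained, the ‘website’ is a hypothetical in-memory dictionary of pages rather than a real site, and the names `LinkExtractor`, `crawl` and `SITE` are illustrative assumptions, not part of any search engine’s software.

```python
from collections import deque
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(site, start):
    """Follow links from `start` until every reachable page has been read."""
    index = {}                 # url -> links found on that page
    queue = deque([start])
    while queue:
        url = queue.popleft()
        if url in index or url not in site:
            continue           # already visited, or the link is broken
        parser = LinkExtractor()
        parser.feed(site[url])
        index[url] = parser.links
        queue.extend(parser.links)
    return index


# Hypothetical site: path -> HTML content.
SITE = {
    "/": '<a href="/about">About</a> <a href="/blog">Blog</a>',
    "/about": '<a href="/">Home</a>',
    "/blog": '<a href="/blog/post-1">Post</a> <a href="/missing">Broken</a>',
    "/blog/post-1": "<p>No links here.</p>",
}

index = crawl(SITE, "/")
```

The `index` built here is a stand-in for the search engine index entries described above; a production spider would also need to fetch pages over the network and respect site owners’ crawling preferences.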

Spyware

– Technology which gathers information about a person or organisation from the Web without their permission[66].

Synchronous communication

– In contrast to asynchronous communication, synchronous communication is that which occurs between two or more people within five seconds.

Trojan*

– The term ‘Trojan’ is derived from the Greek story of the Trojan horse. Within the context of IT security a Trojan is hostile code concealed within, and purporting to be, bona fide code. It is designed to reach a target stealthily and to be executed inadvertently. It may have been installed at the time the software was developed. There can be programs that, while perhaps appearing to be a useful utility, are designed to secretly damage the host system. Some will also try to open up host systems to outside attack.

User created content

– Central to Web 2.0 is the idea that content should be created by users, that users can interact with the Web and that users have moved from passive absorbers of web content to being active interactors with the Web. Users not only download content but also upload it. Technologies such as wikis, blogs, video sharing and photo sharing all consist of user-created content. Web 2.0 technologies enable web content to be created easily by anybody, rather than being solely the output of ‘experts’. Web 2.0 technologies support the rapid creation of new content at speeds much faster than is possible in a Web 1.0 environment. Tapscott and Williams[67] describe the creation of a Wikipedia account of the London bombings which occurred in 2005. ‘By the end of the day, over 2,500 users had created a comprehensive 14-page account of the event that was much more detailed than the information provided by any single news outlet’. The established media are therefore increasingly using user-generated reports and video clips which provide valuable, comprehensive, up-to-the-minute, eyewitness accounts of events.

User profiles

– When a user uploads content to a website, it is usual for a ‘user profile’ to be created. The user profile will store personal data relating to the user, which, at a minimum, includes e-mail and password, but which may also include the following data:

  • Name (or user name)

  • Date of birth or birthday

  • Address

  • Marital status

  • Likes and dislikes

  • Education

  • Current and previous employment

  • Sexual orientation

  • Religious views

  • Personal photographs.

Viral

– Web 2.0 tools enable users to connect with a very large number of people in a short period of time at low cost. This is referred to as the ‘viral’ nature of Web 2.0: the virus metaphor describes the ability of a virus to reproduce itself very rapidly in a short space of time. The speed with which this can happen, and the number of people who can be involved, is also described with more positive connotations, as ‘exponential’.

VoIP/VOB*

– Voice over IP/Voice over Broadband is a technology that enables voice-to-voice communication across the Internet.

Vulnerability*

– A weakness of an asset or group of assets that can be exploited by a threat. There are regularly updated central stores of known vulnerabilities.

Vulnerability assessment

– This is the (usually automated) evaluation (or vulnerability scanning) of operating systems and applications to identify missing fixes for known problems so that the necessary fixes can be installed and the systems made safe.

Vulnerability scanning

– An automated process of scanning a network or a series of information assets to establish if they display any of the characteristics of known vulnerabilities.
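In its simplest form, this kind of scan compares what is installed on each asset against a list of known vulnerabilities. The sketch below illustrates only that comparison; the product names, version numbers, host names and advisory identifiers are all invented for the example, and real scanners use far richer detection methods.

```python
# Hypothetical store of known vulnerabilities:
# (product, version) -> advisory identifier (all values are made up).
KNOWN_VULNERABILITIES = {
    ("examplehttpd", "2.0.1"): "EXAMPLE-2008-001",
    ("examplemail", "1.4"): "EXAMPLE-2008-002",
}


def scan(assets):
    """Report every installed product version that matches a known vulnerability."""
    findings = []
    for host, software in assets.items():
        for product, version in software.items():
            advisory = KNOWN_VULNERABILITIES.get((product, version))
            if advisory:
                findings.append((host, product, version, advisory))
    return findings


# Hypothetical inventory: host -> {product: installed version}.
ASSETS = {
    "web01": {"examplehttpd": "2.0.1", "examplemail": "1.5"},
    "mail01": {"examplemail": "1.4"},
}

findings = scan(ASSETS)
```

Each finding identifies a host where a fix is missing, which is the input a vulnerability assessment needs in order to decide what to install.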

Wikis

– Wikipedia describes a wiki as ‘software that allows registered users or anyone to collaboratively create, edit, link, and organise the content of a website, usually for reference material’[68].

World Wide Web* (the Web)

– An information-sharing construct that sits on top of the Internet, and uses HTTP to transmit data. It is not synonymous with the Internet. A browser is required for accessing web content.

‘Zero-day’ vulnerability

– A ‘zero-day’ vulnerability is one where hackers take advantage of the vulnerability on the same day that it is announced. For further details, see the IT Governance Best Practice Report: Data breaches: Trends, costs and best practices.



[53] ‘Who is a copyright owner?’, Australian Government, January 2008, www.ag.gov.au/www/agd/agd.nsf/Page/Copyright_Whoisacopyrightowner.

[54] ‘Glossary’, Egghead Design Ltd, www.eggheaddesign.co.uk/glossary.aspx.

[59] The ’Lectric Law Library’s Lexicon On ‘Defamation’, www.lectlaw.com/def/d021.htm.

[61] ‘Bullseye on Google: Hackers expose holes in Gmail, Blogspot, Search Appliance’, ZDNet, 25 September 2007, http://blogs.zdnet.com/security/?p=539.

[62] ‘What is intellectual property?’, UK Intellectual Property Office, 2008, www.wipo.int/about-ip/en/.

[63] ‘Simplifying content security, ensuring best-practice e-mail and web use, Web 2.0 Security Technical White Paper, Is the web broken?’, Clearswift, July 2007.

[67] ‘Wikinomics’, Don Tapscott and Anthony Williams, 2006.