Web filtering technology explained

Bloxx : 14 August, 2008  (Technical Article)
Eamonn Doyle of Bloxx provides an overview of web filtering technology and explains why a multi-layered approach is the way for the future.
The Internet has become an invaluable source of information and plays an increasingly important role in business and education. Yet without adequate controls in place, organisations face a number of serious issues. These include excessive personal use of the Internet by staff during working hours, which reduces productivity and often results in financial loss, and legal risks associated with users accessing inappropriate content such as pornography, violence and racism.

This has led to the emergence of web filtering products to enable proactive management of Internet access for users. When the Internet consisted of a few hundred thousand web sites, web filtering was relatively simple. However, as the Internet continues to expand at such a remarkable rate, with millions of new websites created daily, the traditional methods of web filtering have become completely unsustainable.

One of the most prevalent methods of web filtering is to create a database of web addresses categorised according to their content, eg shopping, gambling or violence. These databases can then be used to create a range of user Internet access profiles to allow different groups of users controlled access to the Internet.
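
As a rough illustration of the database-and-profile approach, the sketch below maps hostnames to categories and checks a request against a per-group profile. The hostnames, categories, group names and helper functions are all invented for the example; commercial products hold far larger databases and richer policies.

```python
from urllib.parse import urlsplit

# Hypothetical categorised URL database: hostname -> category.
URL_DATABASE = {
    "shop.example.com": "shopping",
    "bets.example.net": "gambling",
    "news.example.org": "news",
}

# Hypothetical access profiles: user group -> categories that group may not visit.
PROFILES = {
    "pupils": {"shopping", "gambling"},
    "staff": {"gambling"},
}


def lookup_category(url):
    """Return the category recorded for the URL's hostname, or None if unlisted."""
    host = urlsplit(url).hostname
    return URL_DATABASE.get(host)


def is_allowed(url, group):
    """Allow the request unless its category is blocked for the group.

    Unlisted URLs are allowed here purely to keep the sketch short.
    """
    category = lookup_category(url)
    return category not in PROFILES[group]
```

With these assumed profiles, the same shopping site is reachable by the "staff" group but blocked for "pupils".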

Such databases are typically maintained by human URL reviewers to ensure accurate categorisation. Each reviewer has to read the content, analyse the images on the website and categorise the site in the database accordingly.

However, this process of classifying web content using URL databases presents a number of challenges:

* Misclassification - With limited time to classify individual websites, sites deliberately seeking to mislead a reviewer (eg pornography under the guise of a cookery site) could quite easily be wrongly categorised as legitimate.

* Keeping pace with the growth and dynamic nature of the Internet - The Internet is currently growing by approximately 7.5 million new or renamed web addresses each day. An average URL classifier reviews and classifies around 500 web addresses a day, so keeping pace with this growth would require around 15,000 classifiers, which is clearly unrealistic from a cost perspective.

* Scale - Web filtering companies usually try to differentiate themselves on the basis of the size of their database of categorised sites, ie the larger the URL database, the better the coverage and protection. It is common for these databases to contain 15 to 35 million categorised sites, but in reality, compared to the size of the web itself, they cover less than 1% of it.
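
The staffing figure in the second point follows directly from the two numbers quoted there. A minimal check, using only those figures (the variable names are just for illustration):

```python
import math

NEW_ADDRESSES_PER_DAY = 7_500_000        # new or renamed addresses per day, as quoted
REVIEWED_PER_CLASSIFIER_PER_DAY = 500    # addresses one reviewer handles per day, as quoted

# Reviewers needed just to keep up with each day's new addresses.
classifiers_needed = math.ceil(NEW_ADDRESSES_PER_DAY / REVIEWED_PER_CLASSIFIER_PER_DAY)
print(classifiers_needed)  # 15000
```

This covers only newly appearing addresses; it makes no allowance for re-reviewing sites whose content changes.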

No matter how extensive the URL database is, IT managers must still decide whether to allow or deny access to requested URLs not listed in it. Allowing them can give users open and uncontrolled access to potentially unsavoury sites, which is not only embarrassing for suppliers, but also completely unacceptable in certain sectors such as education. Alternatively, blocking all requested URLs not listed in the database can cause 'over-blocking', which heightens user frustration and takes up more of the IT manager's time, as individual sites have to be added manually to a 'white list'.
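
The trade-off for unlisted URLs can be sketched as a single policy switch. Everything named here (the `UnlistedPolicy` enum, the host sets and the `decide` function) is hypothetical and only illustrates the choice an IT manager faces.

```python
from enum import Enum
from urllib.parse import urlsplit


class UnlistedPolicy(Enum):
    ALLOW = "allow"  # risks uncontrolled access to uncategorised sites
    DENY = "deny"    # risks over-blocking and white-list maintenance


# Hypothetical data: hosts the database lists, plus a manual white list.
BLOCKED_HOSTS = {"bets.example.net"}      # listed, category blocked
PERMITTED_HOSTS = {"news.example.org"}    # listed, category permitted
WHITE_LIST = {"intranet.example.edu"}     # added by hand by the IT manager


def decide(url, policy):
    """Return True if the request should be let through."""
    host = urlsplit(url).hostname
    if host in WHITE_LIST or host in PERMITTED_HOSTS:
        return True
    if host in BLOCKED_HOSTS:
        return False
    # The host is not in the database: fall back to the configured policy.
    return policy is UnlistedPolicy.ALLOW
```

Under `UnlistedPolicy.DENY`, every legitimate but unlisted site stays blocked until someone adds it to `WHITE_LIST`, which is the maintenance burden described above.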

A number of other technologies have tried to solve these inadequacies of URL databases, with varying degrees of success.

Image scanning is a useful but not foolproof way of blocking pornography from a network. It tends to be expensive and process-intensive and, overall, is not a realistic solution for managing Internet access.

Keyword Scanning and Scoring is typically used to complement URL Database web filtering to provide an additional layer of protection. It examines the keywords within the URL and webpage requested by the user, comparing them to a list of scored words, and if the keywords presented exceed the threshold set for the user, the page will be blocked.
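
A minimal sketch of keyword scoring, assuming a small hand-made word list: the words, their scores and the function names are invented for the example and are not taken from any particular product.

```python
import re

# Hypothetical scored word list: keyword -> score.
SCORED_WORDS = {"casino": 40, "poker": 30, "jackpot": 30, "sex": 60}


def keyword_score(url, page_text):
    """Sum the scores of every listed word found in the URL and page text."""
    words = re.findall(r"[a-z]+", (url + " " + page_text).lower())
    return sum(SCORED_WORDS.get(word, 0) for word in words)


def is_blocked(url, page_text, threshold):
    """Block the page when its total score exceeds the user's threshold."""
    return keyword_score(url, page_text) > threshold
```

Because the score depends only on which words appear, a page titled "Sex education resources for teachers" scores 60 with these weights and would be blocked for any user whose threshold is 50.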

This method, again, has limited uses; it can only be truly effective for a distinct vocabulary and does not take context into consideration, often blocking harmless sites that users should be allowed access to, such as 'sex education' sites.

The majority of web filtering suppliers use variations and combinations of the techniques previously described to filter the web. However, web filtering suppliers are now starting to realise that these filtering techniques alone simply cannot cope with the extensive growth and dynamic nature of the web.

New technology is now emerging using advanced software techniques to analyse the patterns and context of the text on a page at the point it is requested by the user.

The software operates by using multi-layered protection and contextual analysis to provide instant classification of web content as soon as it is accessed. This real-time method of web filtering also enables the categorisation of previously undiscovered web pages with an extremely high level of accuracy. It is highly effective at categorising web pages across a wide range of different categories, not just inappropriate content such as pornography, but also shopping and social networking sites, which can have a dramatic impact on user productivity.

When used in conjunction with existing web filtering approaches, such as a URL database, this multi-layered approach to web filtering provides an extremely effective method that ensures thorough and up-to-date coverage of the web and a higher level of protection for the end user.
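
The layering can be pictured as a lookup followed by a fallback, as in the sketch below. The real-time step here is a deliberately crude stand-in (counting overlapping terms) and is not the contextual analysis the article describes; the database entry, term lists and function names are all assumptions made for illustration.

```python
from urllib.parse import urlsplit

# Hypothetical first layer: categorised URL database.
URL_DATABASE = {"shop.example.com": "shopping"}

# Hypothetical term lists used by the toy second layer.
CATEGORY_TERMS = {
    "shopping": {"basket", "checkout", "discount"},
    "social networking": {"friends", "profile", "followers"},
}


def classify_text(page_text):
    """Toy stand-in for real-time analysis: the category with most matching terms."""
    words = set(page_text.lower().split())
    best, best_hits = None, 0
    for category, terms in CATEGORY_TERMS.items():
        hits = len(words & terms)
        if hits > best_hits:
            best, best_hits = category, hits
    return best  # None when no terms match


def categorise(url, page_text):
    """Try the URL database first, then classify the page content on request."""
    host = urlsplit(url).hostname
    category = URL_DATABASE.get(host)
    if category is not None:
        return category, "database"
    return classify_text(page_text), "real-time"
```

A previously unseen page is no longer left to a blanket allow or deny rule: `categorise` still returns a category for it, provided the second layer recognises the content.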

Recommendations from ICANN this year to open up the available top level domains (currently limited just to the likes of .uk, .fr, .de etc) will lead to a dramatic increase in the number of registered domains and URLs. This will be a significant challenge for first and second-generation web filtering suppliers, whose products depend on keeping a URL database up-to-date. Third-generation filters, by contrast, can analyse and categorise sites on the fly and make an informed decision as to what risks are associated with accessing them.