Waegis Reloaded

Longtime members of the .NET community may remember Waegis, the enterprise spam filtering service that I launched in July 2008 and that ran for 14 months with excellent results. However, because of administration issues and because I could no longer manage, maintain, and host the service at that time, I had to turn it off.

Last spring semester I had a network security project that I dedicated to Waegis. I published a report on it and gave a short presentation about the design and implementation of the service. That was a good reminder and motivation to think again about reloading the service, and I managed to rewrite the application and prepare a satisfactory Beta build to launch today.

Today I'm thrilled to announce that Waegis is available again, fully reloaded and rewritten from scratch to serve its users.

Software

The software for Waegis has been rewritten. Although I haven't measured the changes, I estimate that over 85% of the original codebase has been recoded to use the most recent technologies and tools. The front end of the site now uses ASP.NET MVC 3.0 rather than ASP.NET Web Forms, and the whole project runs on .NET Framework 4.0 with the latest versions of technologies such as WCF and LINQ.

Beyond moving to the newest versions of these technologies, the codebase has been updated and optimized for running in the cloud. Waegis now runs on Windows Azure and uses SQL Azure as its database back end. The architecture has some modifications in the new version that aim at simplification, task automation, and optimization for the cloud, but it essentially follows the same structure.

The user interface of the site keeps the same beautiful, simple, and professional theme that Shaho designed for it and that I personally like very much.

Algorithms and Techniques

For the new version of the service, all the spam filtering techniques and algorithms have been redesigned, retuned, and updated to reflect the current state of spamming. Some of the old techniques have been removed or modified, and new ones have been introduced. This is one of the areas where I made the most significant changes to improve the quality and accuracy of the service.

One major addition in this version is that my friend Hamed Banaei has joined me, bringing his expertise in artificial intelligence to the service so that we can apply state-of-the-art techniques to modern spam filtering in an online scenario.

Hamed and I have been working on applying some of the most recent techniques in Waegis; however, they are not yet included in the first public Beta launched today. We plan to add these new algorithms as part of a second Beta within the next couple of months. The accuracy of the service in this first Beta will therefore be lower than what we expect to reach later, but we still hope for very good, satisfactory accuracy in the current version.

Cloud Hosting

Another major change is that Waegis now runs on Windows Azure, which has some important advantages for the service:

  • The first and foremost advantage of running in the cloud is better reliability and availability, which were difficult to achieve on the same budget when Waegis was running on dedicated servers.
  • The second advantage is that maintenance is now easier for me because, unlike before, I don't need to maintain the operating system and related software on the servers.
  • Third, running in the cloud makes the application more secure and saves me from writing a lot of code for protecting and monitoring the service. In the previous version I had to write a fair amount of code for security checks and guards to prevent DDoS and other types of attacks on the limited resources of the servers. With Windows Azure this is much easier, so I could remove those features and focus more on the service and the spam filtering techniques.
  • The fourth major advantage is that hosting costs are easier to manage: you pay only for the resources that you use, which are directly proportional to the number of clients that you host, and you can easily scale up over time. As I'll explain later, hosting cost was one of the major problems that forced me to shut the service down in 2009.

In general, online services like Waegis are among the best applications of cloud computing, because hosting in the cloud can simplify many aspects of development and administration.

New Administration Strategies

In 2009 I was overwhelmed by several issues that forced me to shut Waegis down despite the excellent results of the service and its very good growth during its 14 months of existence. The hosting fees for keeping the service secure and available were high, and I was offering a totally free service to clients without any limitations, which was made possible by the great help and support of Axosoft as the sponsor.

Unfortunately, spam filtering, especially for modern types of spam like comment spam in an online scenario, is one of the most complex areas of programming and design, but the significance of this task is not well understood. Although I put a lot of effort into building, maintaining, and developing Waegis, I didn't receive much support and help from the Microsoft community. Furthermore, I was in the process of moving to the United States to start my Ph.D. program, which was bringing new concerns to my life.

All in all, at that point I realized that the effort I was putting into Waegis would not have an outcome anywhere close to what it deserved, and that continuing in that style would be a waste of time.

But now that I've settled in the United States and gotten used to the new life, and cloud computing has become mature enough to serve a scenario like this, I have an opportunity to give this idea another try and manage it with fewer hassles. Besides, I now have a friend helping me with some parts, which is another positive point, and I'll try to keep a team working on Waegis to make things easier.

The other way to manage the service is to make it demand-driven by imposing limitations and moving away from the openness and freeness that have proven not to work very well in the Microsoft community. Therefore, as soon as we finish developing the new features and testing the site and service in the Beta stage, we're going to offer commercial plans with competitive pricing in order to at least cover the hosting and maintenance fees. While we keep free accounts available to small sites, we'll use commercial plans to make sure that the more demand we receive, the more resources we will have to scale up. This ensures that if we grow fast, our resources grow proportionally and we won't face the same administration problems.

Major Changes

Other than the technical and administration changes mentioned above, there are some major changes that affect regular users as well. These changes are mostly meant to simplify the user experience.

As a technical change, the API has been revised to be simpler to use. The concept of site instances has been removed, and you can use your API key for all the sections of your site as long as it is used for the same domain.
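To make the "same domain" rule concrete, here is a minimal Python sketch of the kind of check a client could run before sending content to the service with its single key. It is not part of the Waegis API or library: the function name `same_domain` is hypothetical, and treating subdomains as part of the registered domain is an assumption, since how Waegis itself defines "the same domain" isn't specified here.

```python
from urllib.parse import urlsplit


def same_domain(registered_domain: str, page_url: str) -> bool:
    """Return True when page_url belongs to the domain the API key was issued for.

    Hypothetical helper: subdomains are treated as part of the registered
    domain, which is an assumption and not documented Waegis behavior.
    """
    # hostname is None for URLs without a scheme, so fall back to "".
    host = (urlsplit(page_url).hostname or "").lower()
    domain = registered_domain.lower().lstrip(".")
    return host == domain or host.endswith("." + domain)


# One key covers the blog, the forum, and the wiki of the same site.
print(same_domain("example.com", "http://example.com/forum/thread/1"))
print(same_domain("example.com", "https://blog.example.com/post"))
print(same_domain("example.com", "http://notexample.com/"))
```

With a check like this, the blog, forum, and wiki sections of one site can share a single key, while content from a different domain would need its own registration.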

In addition, the site has been simplified and there is no blog anymore. I'll publish the latest news and updates on the Waegis Twitter account.

You will also notice that the site and its content have been simplified to make things easier to follow, just like the new API structure.

Roadmap

Today I'm launching the first public Beta of Waegis, with some major additions planned to come soon in a second Beta. We're launching in this state because we follow a release-early, release-often philosophy and have already achieved satisfactory accuracy with our current implementation. We also want to get user feedback and fix possible issues while letting the service collect training data, because the algorithms that we'll add later rely heavily on the amount of training data that we have.

We expect each Beta stage to be live for 1.5 to 2 months and to release the stable version within the next 3 to 4 months, when we hope to achieve our optimal accuracy, start offering commercial plans, and provide statistical reports.

Along the way, we'll try to publish libraries and plug-ins for different platforms and content providers, prioritized by demand.

Get Started

If you have a site, blog, forum, or wiki and are tired of spam, you can go to the Waegis site and register (old accounts are not available because we are starting with a fresh database), then activate your account and add your site to retrieve a unique private API key that you can use in your application. At the moment, I've published a .NET library for the Waegis API. It is hosted on GitHub and can be downloaded from the Waegis site, and it includes binaries for .NET Framework 2.0, 3.5, and 4.0 along with a Windows desktop application for testing the API. Please note that the API has some simplifications and modifications, so the previous libraries need some updates before they can be used with the service.

All free accounts can make up to 25,000 API calls per month to the service. If you think that you will exceed this limit, contact us and we will lift it for you for free during the Beta period.
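If you want to keep an eye on that limit from your own application, a small client-side counter is enough. The Python sketch below is only an illustration and is not part of the Waegis library: the `MonthlyQuota` class is hypothetical, and resetting the count at the start of each calendar month is an assumption, because how the service measures a month isn't specified here.

```python
from dataclasses import dataclass
from datetime import date

FREE_MONTHLY_LIMIT = 25_000  # free-account limit stated above


@dataclass
class MonthlyQuota:
    """Client-side counter for API calls (hypothetical helper).

    Assumes the quota resets at the start of each calendar month.
    """
    limit: int = FREE_MONTHLY_LIMIT
    used: int = 0
    month: tuple = (0, 0)  # (year, month) the counter currently covers

    def try_consume(self, today: date) -> bool:
        """Record one API call; return False if the monthly limit is reached."""
        current = (today.year, today.month)
        if current != self.month:
            # A new month started: begin counting from zero again.
            self.month = current
            self.used = 0
        if self.used >= self.limit:
            return False
        self.used += 1
        return True

    def remaining(self) -> int:
        """Calls left in the month the counter currently covers."""
        return self.limit - self.used


quota = MonthlyQuota()
if quota.try_consume(date.today()):
    pass  # safe to send the comment to the spam filtering service here
print(quota.remaining())
```

A site that regularly gets close to zero remaining calls knows it should ask for the limit to be lifted before comments start going unchecked.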

How to Help

If you want to help us succeed in our goal of providing a better spam filtering service for modern spam types, you have some options:

  • You can spread the word about Waegis by blogging or tweeting about it (don't forget to mention our Twitter account), sharing the link on social bookmarking/networking sites, and recommending it to your friends.
  • If you have a site, blog, forum, or wiki, a great way to help is simply to use Waegis: it uses self-learning algorithms, and your contributions give us more data to train the service, which has a huge impact on its quality.
  • If you're a programmer, you can develop libraries or plug-ins for your favorite blog engine, CMS, or platform and open new doors for others to use Waegis. The API documentation is very simple to follow, and we're always ready to answer your questions and to hear about your implementation so that we can share it on our website.
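To show why more data matters for a self-learning filter, here is a toy Python sketch. It is not the algorithm Waegis uses, which isn't described here; it swaps in a plain naive Bayes word model purely for illustration, and the class `TinySpamFilter` and its training sentences are made up.

```python
import math
import re
from collections import Counter


class TinySpamFilter:
    """Toy naive Bayes text classifier (illustration only, not Waegis)."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = {"spam": 0, "ham": 0}

    @staticmethod
    def tokenize(text):
        return re.findall(r"[a-z0-9']+", text.lower())

    def train(self, text, label):
        """Learn from one labeled item; label is "spam" or "ham"."""
        self.word_counts[label].update(self.tokenize(text))
        self.doc_counts[label] += 1

    def spam_score(self, text):
        """Log-odds of spam versus ham; positive means more spam-like."""
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_docs = sum(self.doc_counts.values())
        words = self.tokenize(text)
        score = 0.0
        for label, sign in (("spam", 1), ("ham", -1)):
            # Add-one smoothing keeps unseen words from zeroing a class out.
            log_p = math.log((self.doc_counts[label] + 1) / (total_docs + 2))
            total_words = sum(self.word_counts[label].values())
            for word in words:
                count = self.word_counts[label][word]
                log_p += math.log((count + 1) / (total_words + len(vocab) + 1))
            score += sign * log_p
        return score


spam_filter = TinySpamFilter()
spam_filter.train("cheap pills buy now", "spam")
spam_filter.train("buy cheap watches now", "spam")
spam_filter.train("great post thanks for sharing", "ham")
spam_filter.train("thanks, I had the same problem with WCF", "ham")
print(spam_filter.spam_score("buy cheap pills") > 0)            # True
print(spam_filter.spam_score("thanks for the great post") < 0)  # True
```

Every labeled item shifts the word counts, so a model like this gets better at separating spam from legitimate content as more sites feed it data.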

Wrap Up

I had a week off from research work after the spring semester, which gave me the opportunity to work on some ideas. One of them was reloading Waegis by rewriting the codebase to work with .NET Framework 4.0, ASP.NET MVC 3.0, the latest version of WCF, and Windows Azure.

My own problems with the amount of spam on my blog were a major factor in pushing me to reload Waegis, because reCAPTCHA didn't handle the incoming spam very well and wasn't very helpful in practice. Many spam items were passing through the complex CAPTCHA control, which was also hurting the usability of my blog.

Now that I have a skillful friend working with me on this project and cloud hosting has opened new doors for easier maintenance and lower costs, I'm hopeful that Waegis will make good progress and get excellent results in the coming years, helping many site owners save the time they would otherwise waste dealing with spam. To get there, I'll be glad to hear feedback and thoughts from our clients in order to improve the service and meet users' requirements.

Published at DZone with permission of Keyvan Nayyeri. See the original article here.
