TubeExplode

TDD, Django/Python, Fabric, Nginx, monit, WURFL, Apache Bench, Bootstrap/CSS3, HTML5, SEO, Microdata

MobStuff is a content company and, whilst they have numerous brands tailored to specific markets, TubeExplode is their lead brand, marketed globally.

The primary aim of the project was to take the existing site (a set of static pages for a free, ad-supported, mobile-only content website, generated nightly by a Java application) and turn it into a fully dynamic website where the same code-base and basic structure could be shared across various brands and device-specific layouts. This would help reduce the lead time when introducing new features across multiple sites/brands and multiple territories.

In moving the project from static to dynamic pages, special care had to be taken to ensure the new site could handle the relatively high volumes of traffic it receives (60,000+ requests per day).

In introducing new device-type-specific templates (smartphone, lite, tablet and desktop), I also had to integrate with various advertising networks that were specific to certain device types (some mobile-only, some desktop-only).

Another aim of this project was to introduce a new subscription service for the UK market, with the payment solution coming via MobStuff's sister company MobBill. MobStuff's internal Content API was updated to meet the requirements of this project, which was a good opportunity to test its market readiness. For each supported country there would be a free site and a VIP paid-for subscription site (with different content), and different layouts per device type.

An additional concern of mine was that MobStuff was relying solely on paid traffic in its bid to attract new one-off PPV purchases and user subscriptions for its content services. I personally made SEO a high priority on this project, as I felt it was unwise to rely on a single source of traffic for business growth. Besides, on-site SEO and semantic mark-up is just good practice.

This was a large and complex project that took a substantial amount of time to complete, so effective time management was imperative as I also worked on other projects. Being the only developer dedicated to this project made it even more challenging.

Static to dynamic

My first tasks were to take the static website and tidy up the HTML (verified by the W3C HTML validator), remove any unused JavaScript libraries (there were quite a few) and break the pages down into reusable templates using Django's template system.

The cleaned-up templates could now serve as a base, and I then created logic to switch templates based on device type. This logic used a combination of the WURFL device capability repository and data taken from the MobileDetect PHP library for user-agent (UA) detection.
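As a simplified, pure-Python sketch of the idea (the real logic drew on WURFL and MobileDetect data; the patterns, bucket names and the "lite" fallback below are illustrative only):

```python
import re

# Ordered (pattern, device_type) rules. These regexes are a toy stand-in
# for the WURFL/MobileDetect data the project actually used.
UA_RULES = [
    (re.compile(r"iPad|Tablet|Kindle", re.I), "tablet"),
    (re.compile(r"iPhone|Android.*Mobile|Windows Phone", re.I), "smartphone"),
    (re.compile(r"Mozilla|Opera|Chrome", re.I), "desktop"),
]

def device_type(user_agent):
    """Map a raw User-Agent string to one of the four template buckets."""
    for pattern, dtype in UA_RULES:
        if pattern.search(user_agent):
            return dtype
    return "lite"  # feature phones / unknown agents get the lite templates

def template_for(name, user_agent):
    """Resolve a template path per device type, e.g. "smartphone/home.html"."""
    return "%s/%s" % (device_type(user_agent), name)
```

Ordering matters here: tablet UAs often also match generic desktop patterns, so the more specific rules are checked first.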

Once the logic was in place I created new templates and any variations (e.g. ad zones) were defined in a global configuration file for ease of management.
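A minimal sketch of what such a global configuration might look like (the brand key, ad-network names and zone IDs are invented for illustration):

```python
# Illustrative global configuration: ad-network zone IDs per brand and
# device type, kept in one place for ease of management.
AD_ZONES = {
    "tubeexplode": {
        "smartphone": {"header": "mobads-1001", "footer": "mobads-1002"},
        "lite":       {"header": "mobads-2001", "footer": "mobads-2002"},
        "tablet":     {"header": "webads-3001", "footer": "webads-3002"},
        "desktop":    {"header": "webads-4001", "footer": "webads-4002"},
    },
}

def ad_zone(brand, device_type, slot):
    """Look up the ad-zone ID for a template slot; None disables the slot."""
    return AD_ZONES.get(brand, {}).get(device_type, {}).get(slot)
```

Keeping the variations in one dictionary means a new brand or device type only touches configuration, not template logic.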

SEO

Whilst tidying up the HTML in the templates I took special care to ensure that the mark-up was semantic, and came up with a convention for generating page titles. I also employed HTML5 Microdata and the schema.org VideoObject schema to help boost rankings in search engines.
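As a rough illustration, marking a video page up with the VideoObject type looks something like this (the property selection and values are illustrative, not the site's actual mark-up):

```html
<div itemscope itemtype="http://schema.org/VideoObject">
  <h2 itemprop="name">Example video title</h2>
  <meta itemprop="duration" content="PT2M30S">
  <img itemprop="thumbnailUrl" src="/thumbs/example.jpg" alt="">
  <p itemprop="description">Short description of the clip.</p>
</div>
```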

An additional element of SEO is the sitemap, implemented using Django's Sitemap framework, which generated separate sitemaps for the VIP site and the free site.

Payment API integration

I had to integrate the MobBill payments using their API. I was already familiar with it, as I had actually helped them start up their web services and submit for compliance testing with mobile billing operators across Europe.

Server-side caching / Content API integration

Multiple types of cache were employed on this project to ensure that everything was served quickly and that services weren't overwhelmed by the large amount of traffic (60,000+ requests per day).

The two main caches used are:

  1. Memcache (database querysets)
  2. Nginx file cache (complete web pages)

The internal MobStuff Content API was queried nightly and the results stored in the Django Postgres database to be queried by the app. The results of these database queries were then cached using Memcache, which helped ensure pages rendered quickly by preventing costly (in time and resources) database queries from happening.
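As a rough sketch of the cache-aside pattern described above (the project itself used Django's cache framework with a Memcache backend; this pure-Python stand-in just illustrates the idea):

```python
import time

class TTLCache:
    """Minimal in-process stand-in for Memcache: values expire after a TTL.
    Not the project's actual code, which used django.core.cache."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.time() >= expires_at:
            del self._store[key]  # expired: evict and report a miss
            return None
        return value

    def set(self, key, value, ttl=300):
        self._store[key] = (value, time.time() + ttl)

cache = TTLCache()

def cached_query(key, expensive_fn, ttl=300):
    """Cache-aside: return the cached value, or run the query and cache it."""
    value = cache.get(key)
    if value is None:
        value = expensive_fn()
        cache.set(key, value, ttl)
    return value
```

On a cache hit the expensive queryset function is never invoked, which is what keeps page rendering cheap under load.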

When a page request came into the Nginx server, it first checked whether a response had already been stored in its file cache on the filesystem. If it had, the response was served from the cache; otherwise the request was passed on to the uWSGI application server, which rendered a response that Nginx passed back to the user agent, but not before storing it in its file cache for future use.
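This flow maps onto Nginx's built-in uWSGI cache directives; a hedged sketch, with hypothetical paths, zone names and timings:

```nginx
# Illustrative only; paths, zone names and timings are hypothetical.
uwsgi_cache_path /var/cache/nginx/pages levels=1:2 keys_zone=pages:10m
                 max_size=1g inactive=60m;

server {
    listen 80;

    location / {
        uwsgi_cache pages;                      # serve from the file cache when possible
        uwsgi_cache_key $scheme$host$request_uri;
        uwsgi_cache_valid 200 10m;              # keep successful pages for 10 minutes
        include uwsgi_params;
        uwsgi_pass unix:/tmp/tubeexplode.sock;  # fall through to the uWSGI app server
    }
}
```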

This approach ensured that the system could handle the large amounts of traffic hitting the site without overwhelming the uWSGI application server, which was limited in the number of connections it could handle.

The caching element of the system was quite tricky and required extensive R&D.

Testing, Monitoring and task automation

As previously mentioned this project required extensive testing. Different types of testing included:

  1. Test Driven Development (TDD) – Django/Python unit-testing framework
  2. Load testing – Apache Bench

Employing TDD helped to identify issues before they were deployed and helped deconstruct the functionality into separate, testable components.
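As a flavour of the approach, here is a minimal unittest-style test (Django's test framework builds on Python's unittest) against a hypothetical page-title helper; the helper and its convention are illustrative, not the project's actual code:

```python
import unittest

def page_title(video_title, site_name="TubeExplode"):
    """Hypothetical helper illustrating a page-title convention."""
    return "%s | %s" % (video_title.strip().title(), site_name)

class PageTitleTests(unittest.TestCase):
    # Run with: python -m unittest <this_module>
    def test_title_is_cased_and_branded(self):
        self.assertEqual(page_title("  big example clip "),
                         "Big Example Clip | TubeExplode")

    def test_site_name_can_vary_per_brand(self):
        self.assertEqual(page_title("clip", site_name="OtherBrand"),
                         "Clip | OtherBrand")
```

Writing the test first pins down the convention, then the helper is implemented until the test passes.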

The load testing was used to ensure the server-side configuration was correct and that the site didn't crash when it came under real traffic. After a few false starts I finally found the right configuration with the help of our sysadmin. Load tests ran over a weekend at 2,000 requests per minute, which equates to 2,880,000 requests per day - far more than the 40,000-60,000 we were experiencing at the time. This was achieved using an AWS medium dual-core server, which was cost-effective compared to the hardware that would have been needed had caching not been employed.

Task automation and monitoring

Task automation was implemented via a mix of Bash shell scripts and Python Fabric tasks run via crontab jobs.
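An illustrative crontab fragment for this kind of setup (the paths and task names below are hypothetical):

```
# Illustrative crontab; paths and task names are hypothetical.
# m  h  dom mon dow  command
  0  2   *   *   *   /usr/local/bin/fab -f /srv/tubeexplode/fabfile.py sync_content
 30  2   *   *   *   /srv/tubeexplode/scripts/rebuild_sitemaps.sh
```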

Monit was used to ensure that I’d be notified should the service become inaccessible, i.e. if a non-200 status code was returned by the web server.
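A monit check along these lines might look like the following sketch (host name and timings are hypothetical):

```
# Illustrative monit check; host name and details are hypothetical.
check host tubeexplode with address www.example.com
    if failed
        port 80
        protocol http
        status = 200
    then alert
```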
