23 · 01

Released CirruxCache 0.4.2

I have just released a new minor version of CirruxCache. This new version includes several bugfixes and third party library upgrades.

Here is the change list for 0.4.2:
  • Admin panel: upgraded jquery, jquery-ui and jquery-form to the latest versions.
  • Store service: fixed file upload (which was not working anymore on appengine with billing mode enabled).
  • Some code cleanup
This update is not critical. I invite you to upgrade only if you are experiencing problems with the admin panel or the store service.

Thanks to Doug Tung, who shared an appengine account with billing mode enabled with me; that was really helpful for debugging.
26 · 10

CirruxCache 0.4.1 is out!

An easy way to use Google Appengine as a personal CDN! CirruxCache 0.4.1 is out, so let me show you a set of new cool features!

CirruxCache provides a software solution to dynamically cache HTTP objects on Google Appengine (using the Datastore and the Memcache services).

Since the last release, I have received a lot of good comments and smart suggestions. Fortunately, some of you have added to my to-do list, and a bunch of nice ideas are still in the box.

Let’s take a look at this release:

  • The admin panel now includes a config helper (find here screenshots and documentation).
  • Object flushing has been fixed in the admin panel, give it a try!
  • The Cache service now includes a “Content Prefetching” feature. When it is enabled, HTML pages are parsed and all associated content is loaded into the cache before the user requests it, which speeds up the first page load. It is experimental for the moment, but it works (it is enabled on this blog!).
  • A new Image service enables you to delegate image transformation to CirruxCache. For example, you won’t need to upload image previews or thumbnails to your website anymore; let CirruxCache generate them and cache them on the fly! (All images on this page are generated this way).
  • Config has been simplified: you don’t need to edit the main “app.py” file anymore. The configuration now lives in a file named “config.py”. On top of that, all internal services (admin, cron, store) are now included by default, so you can forget about them! (The install documentation page has been updated).
  • A lot of bugfixes which improved the caching mechanism.
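
To give an idea of what the “Content Prefetching” item above involves, here is a minimal sketch of the parsing step only. It is not the CirruxCache implementation: the names AssetCollector and find_assets, and the choice of tags to follow, are assumptions made for illustration, and it uses the Python 3 standard library rather than the App Engine runtime.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class AssetCollector(HTMLParser):
    """Collect the URLs of resources referenced by an HTML page."""

    # (tag, attribute) pairs treated as "associated content" in this sketch.
    ASSET_ATTRS = {('img', 'src'), ('script', 'src'), ('link', 'href')}

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.assets = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if value and (tag, name) in self.ASSET_ATTRS:
                # Resolve relative references against the page URL.
                self.assets.append(urljoin(self.base_url, value))


def find_assets(html, base_url):
    """Return the asset URLs found in html, in document order."""
    collector = AssetCollector(base_url)
    collector.feed(html)
    return collector.assets
```

A prefetcher would then request each returned URL through the cache so that the objects are already stored when the browser asks for them.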

I hope you will enjoy this new release, check it out on the download page.

I would also like to point you to the Users page, where I try to list some of you who use CirruxCache. This page has not had much success so far, so please take a look!

Finally, I warmly encourage you to subscribe to the CirruxCache group; it is a great place to get help and to give advice to other users.

Spread the word!

 

21 · 07

CirruxCache: Advanced configuration sample

That's it! This whole blog is now cached and delivered directly by CirruxCache (only static files were cached before). My origin server is a tiny eeebox connected through my personal ISP, so this setup is a good challenge: offload my tiny web server as much as possible. I think this is a good opportunity to show a configuration example that is a bit more advanced. The point is, I cannot set the same cache TTL for the whole website, and actually, I want to cache several websites...
# URL mapping
urls = {}

base = (
    '/_admin/(.*)', 'Admin',
    '/_store/(.*)', 'Store',
    '/_cron/(.*)', 'Cron'
)

urls['default'] = base + (
    '(/debug/.*)', 'Debug',
    '/(.*)', 'Root'
)

urls['www.shad.cc'] = base + (
    '(/themes/.*)', 'Blog_Static',
    '(/plugins/.*)', 'Blog_Static',
    '(/admin/.*)', 'Blog_Forward',
    '(/.*)', 'Blog_Page'
)

urls['www.zaphod.eu'] = base + (
    '(/pub/.*)', 'Zaphod_Redirect',
    '(/.*)', 'Zaphod'
)

# still supporting the old config

urls['cdn.shad.cc'] = base + (
    '/blog(/.*)', 'Blog_Static',
    '/(.*)', 'Root'
)

urls['cdn.zaphod.eu'] = base + (
    '(/admin/.*)', 'Zaphod',
    '/(.*)', 'Root'
)

# POP definition
# You can define and configure your Point Of Presence

class Blog_Static(cache.Service):
    origin = 'http://orig.shad.cc'
    forceTTL = 2592000 # 1 month
    ignoreQueryString = True
    forwardPost = False
    allowFlushFrom = ['x.x.x.x']

class Blog_Page(cache.Service):
    origin = 'http://orig.shad.cc'
    forceTTL = 3600 # 1 hour
    ignoreQueryString = True
    forwardPost = True
    allowFlushFrom = ['x.x.x.x']

class Blog_Forward(forward.Service):
    origin = 'http://orig.shad.cc'

class Zaphod(cache.Service):
    origin = 'http://orig.zaphod.eu'
    forceTTL = 2592000 # 1 month
    ignoreQueryString = True
    forwardPost = False
    allowFlushFrom = ['x.x.x.x']

class Zaphod_Redirect(redirect.Service):
    origin = 'http://zaphod.eu'

# !POP
I think this configuration is readable enough to need no explanation. However, do not hesitate to leave a comment.
Finally, I created a Google group to centralize all help requests. So if you need help, go to http://groups.google.com/group/cirr... or send an email to cirruxcache 'at' googlegroups 'dot' com.
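
For readers new to webpy-style mappings, the lookup performed on such a configuration can be sketched as follows. This is an illustration and not the CirruxCache source: the resolve function is a hypothetical helper, and it assumes that the virtual host selects a tuple, that patterns are tried in order as anchored regular expressions, and that the first match wins. The mapping is shortened and the code targets plain Python 3.

```python
import re


def resolve(urls, host, path):
    """Return (service_name, captured_groups) for a request, or None."""
    # Unknown hosts fall back to the 'default' mapping.
    mapping = urls.get(host, urls['default'])
    # The tuple alternates: pattern, service name, pattern, service name...
    for pattern, service in zip(mapping[0::2], mapping[1::2]):
        match = re.match('^' + pattern + '$', path)
        if match:
            return service, match.groups()
    return None


base = ('/_admin/(.*)', 'Admin')
urls = {
    'default': base + ('/(.*)', 'Root'),
    'www.shad.cc': base + (
        '(/themes/.*)', 'Blog_Static',
        '(/admin/.*)', 'Blog_Forward',
        '(/.*)', 'Blog_Page',
    ),
}

print(resolve(urls, 'www.shad.cc', '/themes/style.css'))
print(resolve(urls, 'www.shad.cc', '/admin/login'))
print(resolve(urls, 'unknown.example.com', '/index.html'))
```

Because the first match wins, the catch-all pattern has to come last in each tuple, which is why '(/.*)' closes every mapping in the configuration.
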
18 · 07

New release: CirruxCache 0.3.1

I am really glad to announce a new major release of CirruxCache.

This new release includes the following changes:

  • A storage webservice: store big files (<= 2GB) on the Blobstore in order to deliver them through CirruxCache. This feature is useful to bypass the 1MB limit on appengine.
  • An admin panel that enables users to flush objects, manage big files and see some statistics about the resources used.
  • Bugfixes

It is important to note that there are a few limitations in the admin panel:

  • There is no error reporting on the flush panel (it only displays the number of objects it attempted to flush).
  • The storage manager displays a "500 Internal Error" on upload when you don't have a billing account (the Blobstore is only available on billing accounts, refer to appengine).

These two limitations will be addressed in the next release, and there will be more information in the statistics panel.

The Storage WebService will be documented really soon, but you can access the admin panel through "http://your.cirruxcache.app/_admin/"

I would like to take this opportunity to announce some changes on the project website:

I hope you will enjoy this new release.

19 · 05

Minor release: CirruxCache 0.2.2

CirruxCache 0.2.2 has just been released. It contains some bugfixes (thanks to Devattas for reporting errors on Datastore latency). Webpy has been updated to the latest version.

I have also updated the documentation; in particular, I added more details on Point of Presence configuration and on the use of cron tasks for garbage collection.

Finally, some users reported to me that the cached object size limit (currently 1MB) is a real problem. I am working on a solution: I will take advantage of the new Blobstore service on AppEngine to store objects. Maybe I will keep the Datastore only for meta-data. This solution will raise the cache object limit to 50MB.

Stay tuned :)

11 · 03

CirruxCache 0.2.1 is released

I have just released a new version (0.2.1) of CirruxCache. As a reminder:

CirruxCache provides a software solution to dynamically cache HTTP objects on Google Appengine (using the Datastore and the Memcache services).

This new version includes an interesting set of features:

  • allow object flushing from restricted IP
  • configure a PoP (Point of Presence) according to a virtual host
  • several behaviors (cache, redirect, forward)

In more detail, the last feature is the ability to configure a point of presence to behave differently from a classical caching mechanism. For example, I may want to configure "/admin/*" on my website to be redirected to the origin without caching.

Of course, this release includes several bugfixes, especially a fix on the "Expires" HTTP header which improves caching performance.

Do not hesitate to test this new version and to report any bugs or suggestions.
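
The difference between the three behaviors can be sketched as below. This is a simplified illustration, not the CirruxCache code: the handle function, the dict used as a cache store, the fetch callable and the 302 status chosen for redirects are all assumptions, and the sketch is plain Python 3.

```python
def handle(behavior, origin, path, store, fetch):
    """Return (status, headers, body) for one request under a behavior."""
    url = origin + path
    if behavior == 'redirect':
        # Send the client to the origin; nothing is fetched or stored.
        return 302, {'Location': url}, b''
    if behavior == 'forward':
        # Relay the origin response every time, without caching it.
        return 200, {}, fetch(url)
    if behavior == 'cache':
        # Fetch from the origin only on a cache miss.
        if url not in store:
            store[url] = fetch(url)
        return 200, {}, store[url]
    raise ValueError('unknown behavior: %r' % behavior)
```

With this shape, mapping "/admin/*" to the redirect or forward behavior keeps the back office out of the cache while the rest of the site stays cached.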

19 · 02

Adding virtual host support to webpy

Webpy is a tiny web framework. I use it a lot for my web-service applications. In general, I let my web server (lighttpd) handle virtual hosting. But as you may know, I am working on a CDN solution on top of Google App Engine, named CirruxCache. In that case, since I have absolutely no control over the server configuration, I need to handle virtual hosting from the code. Webpy maps urls by iterating through a tuple. So my solution is quite simple: wrap the tuple in an object whose __iter__ method picks the mapping according to an environment variable (HTTP_HOST). Let's take this basic webpy example, without vhosting:

import web

urls = ('/(.*)', 'hello')

class hello(object):
    def GET(self, name):
        if not name:
            name = 'World'
        return 'Hello, %s' % name

if __name__ == "__main__":
    app = web.application(urls, globals())
    app.run()
Let's add the VhostMapper class:
import web

urls = {
    'default' : ('/(.*)', 'hello'),
    'my-vhost.domain.tld' : ('/(.*)', 'helloVhost')
}

class hello(object):
    def GET(self, name):
        if not name:
            name = 'World'
        return 'Hello, %s !' % name

class helloVhost(object):
    def GET(self, name):
        return 'Hello %s' % web.ctx.environ['HTTP_HOST']

class VhostMapper(object):
    def __iter__(self):
        url = urls['default']
        if 'HTTP_HOST' in web.ctx.environ:
            vhost = web.ctx.environ['HTTP_HOST']
            if vhost in urls:
                url = urls[vhost]
        return iter(url)

if __name__ == "__main__":
    app = web.application(VhostMapper(), globals())
    app.run()
Finally, you can use curl or wget to test your vhosts:
$> curl -H "Host: my-vhost.domain.tld" http://localhost:8080/

It is not too early to announce that the next version of CirruxCache will handle virtual hosting :) I am sure this simple hack can easily be reproduced to add virtual hosting to other REST frameworks.

30 · 10

CirruxCache: speeds up your HTTP app using Google Appengine as a CDN

It is a great moment: for the first time since I started working at Zoomorama, I have just released an important part of our server platform as open source.

I previously explained how to use Google AppEngine as a Content Delivery Network (CDN). The CirruxCache project turns this idea into something concrete. The first version I released is based on the one we use in production.

Here are the features it currently supports:

  • honor Cache-Control
  • cache TTL override
  • several POP (Point Of Presence) configurations mapped to custom base URLs
  • ignore query string
  • POST forwarding
  • expired entries garbage collection
  • extensibility
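
As an illustration of the "ignore query string" item in the list above, the cache key for a request could be derived as in this minimal sketch. The cache_key function and its parameter name are assumptions made for the example and are not taken from the CirruxCache source; the code is plain Python 3.

```python
from urllib.parse import urlsplit


def cache_key(request_uri, ignore_query_string=False):
    """Build the key under which a response is stored in the cache."""
    parts = urlsplit(request_uri)
    if ignore_query_string or not parts.query:
        # /logo.png?v=1 and /logo.png?v=2 then share a single cache entry.
        return parts.path
    return parts.path + '?' + parts.query
```

Ignoring the query string suits static files that are requested with cache-busting parameters; it should stay disabled for pages whose content depends on the query.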


CirruxCache is not documented at the moment, although you should be able to use it after reading the comments in the app.py file. I'll document this app in the next few days, but if you need more information, don't hesitate to contact me.

The project website.

30 · 07

Speed up HTTP delivering using Google AppEngine

Google AppEngine provides a high-level cloud service, which means that your application is distributed automatically on top of the Google platform. All of your code will depend on the AppEngine SDK, so it could be risky to develop a complex application on it. I develop a webservice application for content delivery and content publishing at Zoomorama. We currently use the Akamai CDN as a simple cache layer to improve data delivery across the world. It is interesting for me to use AppEngine in the same way: without changing anything in my existing code base. I have found some blog posts dealing with this AppEngine usage, but they are not focused on dynamic HTTP caching like a real CDN. The principle is very simple: the response to every HTTP request handled by my AppEngine application is fetched from the origin and copied to the AppEngine Datastore. Moreover, data delivered through AppEngine is cached by the AppEngine servers. The code below is a tiny proof of concept:
# HTTP caching on Google App Engine
# - by shad <shad@zaphod.eu>
#

import web # webpy 0.3x
from google.appengine.ext import db
from google.appengine.api import urlfetch

origin = 'http://my.website.com'

urls = (
    '(/.*)', 'Root'
)

class Cache(db.Model):
    # Cached response body and its headers, stored as 'Name: value' strings
    data = db.BlobProperty(default=None)
    headers = db.ListProperty(str)

class Root(object):
    def GET(self, request):
        cache = self.readCache(request)
        if cache is None:
            cache = self.writeCache(request)
        # Replay the stored headers through webpy
        for h in cache.headers:
            name, value = h.split(': ', 1)
            web.header(name, value)
        return cache.data

    def readCache(self, key):
        # Returns None on a cache miss
        return Cache.get_by_key_name(key)

    def writeCache(self, request):
        # Fetch the object from the origin and store it in the Datastore
        url = origin + request
        response = urlfetch.fetch(url=url)
        if response.status_code != 200:
            raise web.NotFound()
        cache = Cache(key_name=request)
        cache.data = db.Blob(response.content)
        cache.headers = []
        for k, v in response.headers.iteritems():
            cache.headers.append('%s: %s' % (k, v))
        cache.put()
        return cache

if __name__ == '__main__':
    app = web.application(urls, globals())
    app.cgirun()
I use webpy to depend on the AppEngine SDK as little as possible.
I have almost finished the production version of this application. I am doing some performance tests. This application is closed source for now, but I am going to release the source code in a few weeks. This version will include:
  • Fetch from Memcache (about 10 times faster).
  • Headers forwarding.
  • Read "Cache-Control" and "Expires" to define a TTL (rfc 2616).
  • Multi origins (according to url mount points).
  • Other small features (force TTL, ignore query string, etc...).
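
The TTL item in the list above can be sketched as follows. This is an illustration of the rfc 2616 rules, not the production code: the ttl_from_headers function and its force_ttl parameter are hypothetical names, header names are assumed to be in their canonical case, and the code uses the Python 3 standard library.

```python
import re
from email.utils import parsedate_to_datetime


def ttl_from_headers(headers, force_ttl=None):
    """Return how many seconds a response may be cached (0 = do not cache)."""
    if force_ttl is not None:
        # A forced TTL overrides whatever the origin says.
        return force_ttl
    cache_control = headers.get('Cache-Control', '').lower()
    if 'no-store' in cache_control or 'no-cache' in cache_control:
        return 0
    # max-age takes precedence over Expires.
    match = re.search(r'max-age\s*=\s*(\d+)', cache_control)
    if match:
        return int(match.group(1))
    if 'Expires' in headers and 'Date' in headers:
        try:
            expires = parsedate_to_datetime(headers['Expires'])
            date = parsedate_to_datetime(headers['Date'])
            return max(0, int((expires - date).total_seconds()))
        except (TypeError, ValueError):
            # An unparsable date (for example "Expires: 0") means "already expired".
            return 0
    return 0
```

Returning 0 when the origin gives no freshness information is a conservative choice for this sketch; a real cache might apply a default TTL instead.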
It is important to note that AppEngine does not keep running instances of your application (your CGI is distributed and executed on demand). So this application has to start very quickly (no configuration file, no dynamic generation, etc...).
