Thom Nichols

Technology is evolution outside the gene pool

RESTful File Upload for AppEngine

Google's AppEngine is an awesome platform for simple web apps. Unfortunately, one of its more glaring limitations is the lack of filesystem access. Fortunately, its robust data storage API can make up for the omission in many cases.

The Use Case

I wanted to have integrated image hosting for this blog, so I wrote a REST handler for GAE's Python API to upload and serve image files.

The Code

Google's AppEngine documentation actually provides the meat of the datastore code. The remaining know-how comes from the webapp framework module of the Python API.

First, a simple data model, very similar to the AppEngine documentation example:

from google.appengine.ext import db

class Image(db.Model):
    name = db.StringProperty(required=True)
    data = db.BlobProperty(required=True)
    mimeType = db.StringProperty(required=True)
    created = db.DateTimeProperty(auto_now_add=True)
    owner = db.UserProperty(auto_current_user_add=True)
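
Because the image bytes live inside the entity itself, the 1 MB entity limit mentioned in the conclusion applies to this model. A minimal sketch of a size check that could run before an entity is saved; the names MAX_IMAGE_BYTES and fits_in_entity are hypothetical, and the cap is an assumed conservative value rather than an exact figure from the datastore documentation:

```python
# Assumption: a cap slightly under 1 MB, leaving headroom for the other
# properties (name, mimeType, created, owner) stored in the same entity.
MAX_IMAGE_BYTES = 1000 * 1000


def fits_in_entity(data, limit=MAX_IMAGE_BYTES):
    """Return True if the raw upload is small enough to store in one entity."""
    return len(data) <= limit
```

A handler could call fits_in_entity() on the uploaded bytes and answer with an error status instead of attempting the save.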

Now that we have that out of the way, here's the upload (POST handler) code:

import logging
from google.appengine.ext import webapp
from models.image import Image

urlBase = '/imgstore/%s'

class ImageHandler(webapp.RequestHandler):
    def post(self, id):
        logging.info("ImagestoreHandler#post %s", self.request.path)
        fileupload = self.request.POST.get("file", None)
        # a missing file or a plain form field shows up as a string with no
        # filename attribute, so reject it instead of failing further down
        if fileupload is None or not hasattr(fileupload, 'filename'):
            return self.error(400)

        # it doesn't seem possible for webob to get the Content-Type header for the individual part,
        # so we'll infer it from the file name.
        contentType = getContentType(fileupload.filename)
        if contentType is None:
            self.error(400)
            self.response.headers['Content-Type'] = 'text/plain'
            self.response.out.write("Unsupported image type: " + fileupload.filename)
            return
        logging.info("File upload: %s, mime type: %s", fileupload.filename, contentType)

        img = Image(name=fileupload.filename, data=fileupload.file.read(),
                    mimeType=contentType)
        img.put()
        logging.info("Saved image to key %s", img.key())
        # self.redirect(urlBase % img.key())  # a dummy redirect is acceptable for non-AJAX clients

def getContentType(filename):
    # maps the supported file extensions to MIME types; None means unsupported
    ext = filename.split('.')[-1].lower()
    if ext in ('jpg', 'jpeg'): return 'image/jpeg'
    if ext == 'png': return 'image/png'
    if ext == 'gif': return 'image/gif'
    if ext == 'svg': return 'image/svg+xml'
    return None

The upload from an HTML form will be a multipart MIME (multipart/form-data) request. Because of that, self.request.POST.get("file") must be used to get the object representing that part of the multipart request. fileupload is not a plain dict but a field object whose attributes include the file name (filename) and the uploaded data (file). Note that most user agents will send a Content-Type header for that part of the request; however, I couldn't figure out how to access it using the WebOb API. To work around that, I just created a getContentType() function to infer the file type from the uploaded file name. This also allows you to prohibit unwanted files from being uploaded.
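
As an alternative to a hand-written extension table, the same inference can be sketched with Python's standard mimetypes module plus a whitelist. This is not the code the blog runs; guess_image_type and ALLOWED_TYPES are hypothetical names, and the whitelist simply mirrors the four types accepted above:

```python
import mimetypes

# Assumption: the same four image types the handler above accepts.
ALLOWED_TYPES = frozenset([
    'image/jpeg', 'image/png', 'image/gif', 'image/svg+xml',
])

# Some Python versions may not know .svg out of the box, so register it.
mimetypes.add_type('image/svg+xml', '.svg')


def guess_image_type(filename):
    """Return a whitelisted image MIME type for filename, or None."""
    guessed, encoding = mimetypes.guess_type(filename.lower())
    # Reject compressed variants such as "a.png.gz": guess_type reports
    # them as image/png with a gzip encoding.
    if encoding is None and guessed in ALLOWED_TYPES:
        return guessed
    return None
```

The whitelist keeps the "prohibit unwanted files" behaviour: anything the module maps to a non-image type, or cannot map at all, comes back as None.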

The one last piece of information needed is the URL mapping to the ImageHandler class in main.py:

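import wsgiref.handlers
from google.appengine.ext import webapp
import imagestore  # module containing ImageHandler (import path assumed)

ROUTES = [
    (r'/imgstore/?([\w]*)/?', imagestore.ImageHandler),
    # other handlers here...
]

def main():
    application = webapp.WSGIApplication(ROUTES)
    wsgiref.handlers.CGIHandler().run(application)

if __name__ == "__main__":
    main()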
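
To see which id the handler receives for a given URL, the route pattern can be tried out on its own. A small sketch, assuming webapp anchors each pattern with ^ and $ before matching (worth verifying against the framework source); extract_id is a hypothetical helper used only for illustration:

```python
import re

# Assumption: the framework anchors the pattern at both ends, so the
# anchors are written out explicitly here.
ROUTE = re.compile(r'^/imgstore/?([\w]*)/?$')


def extract_id(path):
    """Return the captured image id for path, or None if the route does not match."""
    match = ROUTE.match(path)
    return match.group(1) if match else None


for path in ('/imgstore/', '/imgstore/abc123', '/imgstore/abc123/', '/imgstore/a-b'):
    print(path, repr(extract_id(path)))
```

An empty id is what lets a single handler cover both the collection URL and an individual image. Note that \w does not include the hyphen, so a key string containing one would not match this pattern; widening the character class would be one option if that ever comes up.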

Now that we're able to upload files, let's access them:

    # part of ImageHandler class
    def get(self, id):
        logging.info("ImagestoreHandler#get for file: %s", id)

        img = None
        try:
            img = Image.get(id)
        except Exception:
            # a malformed key makes the lookup raise; treat it as "not found"
            logging.info("Invalid image key: %s", id)
        if img is None:
            self.error(404)
            self.response.headers['Content-Type'] = 'text/plain'
            self.response.out.write("Could not find image: '%s'" % id)
            return

        logging.info("Found image: %s, mime type: %s", img.name, img.mimeType)

        dl = self.request.get('dl')  # optionally download as attachment
        if dl in ('1', 'true'):
            self.response.headers['Content-Disposition'] = 'attachment; filename="%s"' % str(img.name)

        self.response.headers['Content-Type'] = str(img.mimeType)
        self.response.out.write(img.data)

A GET request carries the Image's datastore key in the URL, like this example. (Note that the image is being served by my blog, which is running on AppEngine.) As a final bonus, I've also implemented a 'list' function for when no image ID is given in the GET request. You can see the full code on GitHub (including authorization, which is covered elsewhere).

Conclusion

This should be everything you need to have a RESTful Python image handler for AppEngine! Finally, here's a sample upload form to test with. Note that this will only work for relatively small files: Google's datastore limits entity size to 1 MB, but that's more than enough for web
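
For testing without a browser, the multipart request can also be built by hand. The following is a minimal sketch rather than part of the original code: encode_multipart is a hypothetical helper, and the only detail taken from the handler above is the form field name "file".

```python
import uuid


def encode_multipart(field_name, filename, data, content_type):
    """Build a multipart/form-data body holding a single file part.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    head = (
        '--%s\r\n'
        'Content-Disposition: form-data; name="%s"; filename="%s"\r\n'
        'Content-Type: %s\r\n'
        '\r\n'
    ) % (boundary, field_name, filename, content_type)
    tail = '\r\n--%s--\r\n' % boundary
    body = head.encode('utf-8') + data + tail.encode('utf-8')
    return body, 'multipart/form-data; boundary=%s' % boundary


# Example: the field name must be "file" to match the POST handler.
body, header = encode_multipart('file', 'logo.png', b'\x89PNG', 'image/png')
print(header)
```

The resulting body can be sent with any HTTP client as a POST to the /imgstore/ URL, with the returned value used as the request's Content-Type header.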

4 Comments

  1. Re: RESTful File Upload for AppEngine Jan. 20, 2010 Benjamin

    That's a very useful feature. You are doing great work and I can't wait to try this out. I think 1MB should be enough for most use cases, although I imagine a blog of high-definition space photos which shows thumbnails and HD photos on click, for example via lightbox or facebox could need more.
  2. Re: Re: RESTful File Upload for AppEngine Jan. 20, 2010 Thom

    In that case, you'll want to use the Picasa support that I just added last week :)

    I'm also considering general attachment support, which could be interesting to support via Google Documents, especially considering they've added support for any file type.

    I might implement it as a datastore or GDocs option, but considering the size limit and the additional capabilities of Picasa and GDocs, it's almost silly to use the local datastore option.  I'll support it as more of an exercise I suppose.

  3. Re: RESTful File Upload for AppEngine Jan. 25, 2011 po

    your example doesn't work, 500 Server error appears, why?
  4. Re: 500 Error Feb. 5, 2011 Thom

    What does your server traceback look like?