Thursday 28 January 2010

Hardening Wordpress

I have started using Wordpress (instead of Blogger) in order to have more control over advertising, more templates, more plugins, etc. It was very easy to get started with. However, this weekend the first test sites were hacked, so I am not sure whether Wordpress was the right choice. I am still considering sticking with Django, or testing Drupal.

Anyway, although Wordpress does not have the best reputation for security, it may actually have been other user accounts on the shared hosting I rent that were cracked first - in that case I should consider swapping to a VPS instead of changing Wordpress.

After some research I have decided to always install these plugins on my Wordpress sites (listed in alphabetical order):
  • AntiVirus
  • Login LockDown
  • Login Logger
  • Secure WordPress
  • WordPress File Monitor
  • Wordpress Firewall
  • WP Security Scan
In addition, I will be using these .htaccess files:

In the main directory of the blog:


Order Deny,Allow
Deny from all
Allow from xx.xxx.xxx.103

In the wp-admin directory:


Order Deny,Allow
Deny from all
Allow from xx.xxx.xxx.103


I cannot really be bothered with setting and managing passwords, so I'll restrict access by IP instead. If I ever work on these sites from somewhere else, I'll just log in to cPanel and add those IPs to the files too.
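Since the same allow-list goes into more than one .htaccess file, the rules could also be generated from a list of addresses. A minimal Python sketch, assuming a hypothetical helper called render_htaccess and placeholder addresses (neither is part of Wordpress or Apache); it only produces the same kinds of lines as shown above:

```python
def render_htaccess(allowed_ips):
    """Build the deny-all / allow-listed rules shown above.

    render_htaccess is a hypothetical helper, for illustration only.
    """
    lines = ["Order Deny,Allow", "Deny from all"]
    # One "Allow from" line per trusted address.
    lines.extend("Allow from %s" % ip for ip in allowed_ips)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # 203.0.113.x is a documentation-only range, used here as a placeholder.
    print(render_htaccess(["203.0.113.10", "203.0.113.11"]))
```

Adding a new location is then a matter of appending one address to the list and writing the result into both directories.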


Since I will use this blog post as my own bookmark when setting up another Wordpress blog, I will also add the other "must have" plugins here:

  • Google XML Sitemaps 
  • Google Integration Toolkit
  • Sociable 
  • Evermore
I always install these from the WP Dashboard - so there is no need to include links.

That's it folks.

Wednesday 16 December 2009

Google App Engine Task Queue with Python - for Dummies...

I have a script that pulls in data from the net and does some work on it before storing it in the Datastore. The script was kicked off by cron every now and then, but it kept running into the 30-second time limit set by App Engine. The obvious solution was to split the work into smaller pieces and assign these to the Task Queue.

However, after reading the documentation a few times I was more confused than before I started. The Task Queue API is actually great, but someone should revise that documentation... Above all, it is not at all clear how the different configuration pieces relate to each other, and the examples in the documentation are over-complicated, IMHO.

Here is Task Queue explained for Dummies (like myself):

First of all, there is the default queue, and in addition you can create your own queues using the queue.yaml config file. The reason to create your own queue would be that you are not happy with the default queue's execution rate of 5 tasks per second. Let's start by using just the default queue, and later on we will expand by creating our own queue too.

In this example the original script was doing work for seven days, and we will split it into seven smaller tasks:
  1. When using the default queue, we do not need to create any queue.yaml file at all.

  2. To start with, we need a URL that cron, or you yourself, can use to kick off the whole affair. In app.yaml add, for example:

    - url: /update
      script: scripts/all-to-q.py
      login: admin

  3. Now, create the all-to-q.py file in the scripts directory with content like:

    #!/usr/bin/python
    # -*- coding: utf-8 -*-

    import logging
    from google.appengine.api.labs import taskqueue

    # Enqueue one task per day; each task will POST 'dayI' to /one-day.
    for i in range(7):
        taskqueue.add(url='/one-day', params={'dayI': i}, countdown=i)
        logging.info('Adding day '+str(i)+' to the Task Queue.')

    The countdown parameter is a delay in seconds before the task becomes eligible to run, so here task i is held back by i seconds.

  4. Now, go back to the app.yaml file and add that new URL you need for each task:

    - url: /one-day
      script: scripts/one-day.py
      login: admin

    Simple, isn't it?

  5. And now the essential parts of the one-day.py file; mainly those that will pick up the POST parameters (here just one called 'dayI'):

    import wsgiref.handlers
    from google.appengine.ext import webapp

    class OneDay(webapp.RequestHandler):

      def post(self):
        # The Task Queue delivers each task as a POST request to this URL.
        i = int(self.request.get('dayI'))

        # ... and here you get your hands dirty; use i and do the work.

    def main():
     
      application = webapp.WSGIApplication([
                    (r'/one-day', OneDay),
                    ], debug=True)
     
      wsgiref.handlers.CGIHandler().run(application)


    if __name__ == '__main__':
      main()


    ...and that's it.

  6. I don't understand why the official documentation could not explain something this simple. I believe my example above makes it fairly clear how the execution logic flows.
PS. Note that I also included some logging above; it is really useful. Expand on it yourself.
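
To see the flow above without the SDK, here is a plain-Python sketch of what the kick-off script hands to the queue and what the worker reads back. The names build_tasks and handle_one_day are hypothetical stand-ins for all-to-q.py and the OneDay.post handler; nothing here calls the real Task Queue API.

```python
def build_tasks(days=7):
    """Mimic all-to-q.py: one task description per day (hypothetical helper)."""
    return [
        {"url": "/one-day", "params": {"dayI": str(i)}, "countdown": i}
        for i in range(days)
    ]


def handle_one_day(params):
    """Mimic OneDay.post: request parameters arrive as strings."""
    i = int(params["dayI"])
    # The real handler would do the work for day i here.
    return i


if __name__ == "__main__":
    for task in build_tasks():
        day = handle_one_day(task["params"])
        print("after %ds: POST %s handles day %d"
              % (task["countdown"], task["url"], day))
```

Each dictionary corresponds to one taskqueue.add(...) call, and each call to handle_one_day corresponds to one POST to /one-day.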

Now, let's say we do not want to overload the sites we pull data from, so we will create our own queue and use that instead of the default queue. All we need to do is:
  • Create that queue.yaml file, with for example:

    queue:
    - name: one-full-day
      rate: 1/s
      bucket_size: 1

  • Now, in order to use that queue, change one single line in all-to-q.py so it reads:

    #!/usr/bin/python
    # -*- coding: utf-8 -*-

    import logging
    from google.appengine.api.labs import taskqueue

    # Same loop as before, but each task now goes to the named queue.
    for i in range(7):
        taskqueue.Task(url='/one-day', params={'dayI': i}, countdown=i).add(queue_name='one-full-day')
        logging.info('Adding day '+str(i)+' to the Task Queue.')

       
    Done. The taskqueue.Task(...) line above may wrap here in the blog, but it is a single line.
Wasn't that easy?
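
The rate and bucket_size settings are easier to picture with a small simulation. What follows is a simplified token-bucket model written for illustration, not App Engine's actual scheduler, and the function name simulate_queue is hypothetical. It shows when tasks would start if they were all ready at the same moment.

```python
def simulate_queue(etas, rate_per_sec, bucket_size):
    """Return task start times under a simple token-bucket model.

    etas: earliest times (in seconds) at which each task may run.
    This is an illustrative model only, not App Engine's real scheduler.
    """
    tokens = float(bucket_size)  # the bucket starts full
    now = 0.0
    starts = []
    for eta in sorted(float(e) for e in etas):
        if eta > now:
            # Time passes until the task is due; tokens refill up to the cap.
            tokens = min(float(bucket_size), tokens + (eta - now) * rate_per_sec)
            now = eta
        if tokens < 1.0:
            # Not enough tokens: wait until exactly one is available.
            now += (1.0 - tokens) / rate_per_sec
            tokens = 1.0
        tokens -= 1.0  # starting a task costs one token
        starts.append(now)
    return starts


if __name__ == "__main__":
    # Seven tasks, all ready immediately, on a queue like one-full-day.
    print(simulate_queue([0] * 7, rate_per_sec=1.0, bucket_size=1))
```

In this model, with a rate of 1/s and a bucket size of 1, the seven tasks start one second apart even without any countdown.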

    Organizing and re-using Python code

    As the code grows it has become necessary to structure it better and re-use it (import it) - so that it can be maintained and expanded effectively. In short, it is time to take a closer look at
    if __name__=="__main__": ...

    I think this is the best explanation... Better start implementing it.
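
As a reminder of what that idiom buys you, here is a minimal sketch. The function name days_to_process is made up for illustration. The point is that the module can be imported for its functions without running anything, while still working as a script:

```python
def days_to_process(count=7):
    """Return the day indexes a job would work through (hypothetical example)."""
    return list(range(count))


def main():
    for day in days_to_process():
        print("Would process day %d" % day)


# Runs only when the file is executed directly,
# not when another module imports it.
if __name__ == "__main__":
    main()
```

Another script can then import the module and call days_to_process(3) without triggering main().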

    Monday 14 December 2009

    ...actually, no thanks.

    Actually trying to produce anything with Django on GAE has been a pain in the ***. All documentation and tutorials seem aimed at people who already know Django and want to put it on GAE.

    I have realized it is a waste of time.

    I will instead just use the template system built into GAE's own webapp (which I know comes from Django) - it is probably all I want from Django anyway. When wrestling with Django itself I discovered its GUI admin interface and other scary things...

    If webapp is not enough I will use Cheetah instead. If I go for Cheetah, there is a good how-to here.

    Sunday 13 December 2009

    ...and Django

    The documentation regarding getting Django up and running isn't that comprehensive.

    This is good: http://code.google.com/appengine/docs/python/tools/libraries.html#Django

    For the actual local installation on the Mac, use this:
    http://www.djangoproject.com/download/
    ...but with the command: sudo python2.5 setup.py install

    Friday 11 December 2009

    Voice Applications too?

    Just wanted to put this in here while I remember it... great potential:

    http://www.twilio.com/

    More Unicode, and more for Google to fix

    Once I finally got the Unicode sorted between BeautifulSoup and Google App Engine Datastore, with some help from this:

    http://khaidoan.wikidot.com/google-app-engine-datastore

    ...I was rather surprised to find out that Google itself still has bugs to sort out... The web-based Datastore viewer of the SDK (http://localhost:8080/_ah/admin/datastore) cannot display non-ASCII characters! ...and it is a rather old bug:

    http://code.google.com/p/googleappengine/issues/detail?id=502

    Well, well...
    It works fine in production though - from the DataViewer in the Dashboard.
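
The post does not say what the Unicode fix actually was, so the following is only an assumption about the general approach: decode bytes to Unicode text once, at the boundary, before handing values to the Datastore. A minimal sketch in present-day Python (the original code ran on Python 2.5); to_text is a hypothetical helper and the sample string is made up:

```python
def to_text(value, encoding="utf-8"):
    """Return value as Unicode text, decoding bytes if needed.

    to_text is a hypothetical helper, for illustration only.
    """
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


if __name__ == "__main__":
    raw = b"Malm\xc3\xb6"   # UTF-8 bytes, as they might arrive from a scraped page
    text = to_text(raw)     # Unicode text, ready to be stored as a string
    # ascii() keeps the output printable on any terminal.
    print(ascii(raw), "->", ascii(text))
```

Values that are already text pass through unchanged, so the helper can be applied to everything on its way into storage.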