The Google App Engine datastore provides convenient data modeling with Python. One important aspect is validating the data stored in a Model instance. Each value is stored as a Property, which is declared as an attribute of the Model class.
While every Property can be validated automatically by specifying a “validator” function, there is no option to validate the Model key name automatically. Since our code can set the key name manually, the key name may carry user data and must therefore be validated. The key name is also the only unique constraint the Google datastore supports, similar to a “primary key” in relational databases.
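To make the “validator” idea concrete: a property validator is just a callable that receives the value and raises an exception when the value is unacceptable. Below is a minimal sketch of one, written so it runs without the App Engine SDK. The name slug_validator and the slug pattern are my own choices for illustration, and I use Python 3's str here where Python 2 code would use basestring.

```python
import re

def slug_validator(value):
    """Raise ValueError unless value is a 2-32 character lowercase slug."""
    if not isinstance(value, str):
        raise ValueError('Value %r is not a string' % (value,))
    if not re.search(r'^[a-z0-9-]{2,32}$', value):
        raise ValueError('Value "%s" is not a valid slug' % value)

# With the SDK available, the function would be attached to a property:
#   url = db.StringProperty(validator=slug_validator)
# There is no equivalent hook for the key name, which is the gap
# this post tries to fill.
```

Calling slug_validator('my-club') returns silently, while slug_validator('X') raises ValueError.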
Here is my version of a validation function for the Model’s key name:
from google.appengine.ext import db
import re

def ModelKeyNameValidator(self, regexp_string, *args, **kwargs):
    gotKey = None
    className = self.__class__.__name__
    if len(args) >= 2:
        if gotKey:
            raise Exception('Found key for second time for Model ' + className)
        gotKey = 'args'
        k = args[1]  # key_name given as an unnamed argument
    if 'key' in kwargs:
        if gotKey:
            raise Exception('Found key for second time for Model ' + className)
        gotKey = 'Key'
        k = kwargs['key'].name()  # key_name given as Key instance
    if 'key_name' in kwargs:
        if gotKey:
            raise Exception('Found key for second time for Model ' + className)
        gotKey = 'key_name'
        k = kwargs['key_name']  # key_name given as a keyword argument
    if not gotKey:
        raise Exception('No key found for Model ' + className)
    id = '%s.key_name(%s)' % (className, gotKey)
    if not re.search(regexp_string, k):
        raise ValueError('(%s) Value "%s" is invalid. It must match the regexp "%s"' % (id, k, regexp_string))

class ClubDB(db.Model):  # key = url
    def __init__(self, *args, **kwargs):
        ModelKeyNameValidator(self, '^[a-z0-9-]{2,32}$', *args, **kwargs)
        # Name the class explicitly: super(self.__class__, self) recurses
        # forever as soon as ClubDB is subclassed.
        super(ClubDB, self).__init__(*args, **kwargs)

    name = db.StringProperty(required = True)
As you can see, this solution is not versatile enough: you have to copy and alter the ModelKeyNameValidator() function for every new validation type. I strictly follow the Don’t Repeat Yourself principle in programming, so after much Googling and struggling with Python, I arrived at the following solution, which I actually use in my projects:
from google.appengine.ext import db
import re

def string_type_validator(v):
    # Not shown in the original listing; a minimal version is assumed here.
    if not isinstance(v, basestring):
        raise ValueError('Value "%s" is invalid. It must be a string' % (v,))

def re_validator(id, regexp_string):
    def validator(v):
        string_type_validator(v)
        if not re.search(regexp_string, v):
            raise ValueError('(%s) Value "%s" is invalid. It must match the regexp "%s"' % (id, v, regexp_string))
    return validator

def length_validator(id, minlen, maxlen):
    def validator(v):
        string_type_validator(v)
        if minlen is not None and len(v) < minlen:
            raise ValueError('(%s) Value "%s" is invalid. It must be at least %s characters' % (id, v, minlen))
        if maxlen is not None and len(v) > maxlen:
            raise ValueError('(%s) Value "%s" is invalid. It must be at most %s characters' % (id, v, maxlen))
    return validator

def ModelKeyValidator(v, self, *args, **kwargs):
    gotKey = None
    className = self.__class__.__name__
    if len(args) >= 2:
        if gotKey:
            raise Exception('Found key for second time for Model ' + className)
        gotKey = 'args'
        k = args[1]  # key_name given as unnamed argument
    if 'key' in kwargs:
        if gotKey:
            raise Exception('Found key for second time for Model ' + className)
        gotKey = 'Key'
        k = kwargs['key'].name()  # key_name given as Key instance
    if 'key_name' in kwargs:
        if gotKey:
            raise Exception('Found key for second time for Model ' + className)
        gotKey = 'key_name'
        k = kwargs['key_name']  # key_name given as a keyword argument
    if not gotKey:
        raise Exception('No key found for Model ' + className)
    v.execute('%s.key_name(%s)' % (className, gotKey), k)  # validate the key now

class DelayedValidator:
    ''' Validator class which allows you to specify the "id" dynamically on validation call '''
    def __init__(self, v, *args):  # specify the validation function and its arguments
        self.validatorArgs = args
        self.validatorFunction = v

    def execute(self, id, value):
        if not isinstance(id, basestring):
            raise Exception('No valid ID specified for the Validator object')
        func = self.validatorFunction(id, *(self.validatorArgs))  # get the validator function
        func(value)  # do the validation

class ClubDB(db.Model):  # key = url
    def __init__(self, *args, **kwargs):
        ModelKeyValidator(DelayedValidator(re_validator, '^[a-z0-9-]{2,32}$'), self, *args, **kwargs)
        # Name the class explicitly: super(self.__class__, self) recurses
        # forever as soon as ClubDB is subclassed.
        super(ClubDB, self).__init__(*args, **kwargs)

    name = db.StringProperty(
        required = True,
        validator = length_validator('ClubDB.name', 1, None))
You probably noticed that in the second example I also added a validator for the “name” property. Note that the re_validator() and length_validator() functions can be re-used for any property. Furthermore, thanks to the DelayedValidator class, which accepts a validator function and its arguments as constructor arguments, the ModelKeyValidator() function can be re-used without any modifications too.
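To see the delayed binding in isolation, here is a stripped-down sketch that runs without the App Engine SDK. It repeats re_validator() and DelayedValidator in reduced form (no type checks) under Python 3, and the second model name, TeamDB, is hypothetical and only there to show one DelayedValidator being re-used with a different id.

```python
import re

def re_validator(id, regexp_string):
    # Factory: returns a validator bound to the given id and pattern.
    def validator(v):
        if not re.search(regexp_string, v):
            raise ValueError('(%s) Value "%s" is invalid. It must match the regexp "%s"'
                             % (id, v, regexp_string))
    return validator

class DelayedValidator(object):
    # Stores the factory and its extra arguments; the id is supplied
    # only when execute() is called.
    def __init__(self, v, *args):
        self.validatorFunction = v
        self.validatorArgs = args

    def execute(self, id, value):
        func = self.validatorFunction(id, *self.validatorArgs)
        func(value)

slug_check = DelayedValidator(re_validator, r'^[a-z0-9-]{2,32}$')

# The same object serves two different ids; both calls pass silently.
slug_check.execute('ClubDB.key_name(key_name)', 'chess-club')
slug_check.execute('TeamDB.key_name(key_name)', 'red-team')  # TeamDB is hypothetical

# An invalid value raises ValueError, and the message carries the id
# that was supplied at call time.
try:
    slug_check.execute('ClubDB.key_name(key_name)', 'Chess Club!')
    message = None
except ValueError as e:
    message = str(e)
```

The point of the design is that the pattern is fixed once in the constructor, while the id, which depends on how the key was passed to the Model, is only known inside ModelKeyValidator().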
P.S. It seems that all “validator” functions are executed every time a Model class is instantiated. This means the assigned values are always validated, whether you are creating or updating the data object or simply reading it from the datastore. This surely wastes some CPU cycles, but for now I have no idea how to easily circumvent it.
Disclaimer: I’m new to Python and Google App Engine. But they seem fun! 🙂 Sorry for the long lines…