Everyone knows that you should filter your inputs, and most good programmers do. But when you are working with a large team of programmers on an open source project, things slip through and errors creep in. At times like this you wish for a mechanism that would prevent your team from making such mistakes, something that forces them to declare their intent.
Obviously I am not the first one to wish for it; in fact, people have worked a lot on this. Probably the best-known solution is the built-in input_filter extension for PHP >= 5.2, written by none other than Rasmus himself. Sadly this is turned off by default and it is V5 only (yes, I know about V4 end of life and all that). Another notable solution appeared in the form of the Zend_Filter_Input component by Chris Shiflett, which was part of the Zend Framework.
Last week I stumbled upon Inspekt, which is derived from the code in Zend_Filter_Input and provides the functionality I was wishing for! To quote from the web page:
Inspekt acts as a sort of ‘firewall’ API between user input and the rest of the application. It takes PHP superglobal arrays, encapsulates their data in an “cage” object, and destroys the original superglobal. Data can then be retrieved from the input data object using a variety of accessor methods that apply filtering, or the data can be checked against validation methods. Raw data can only be accessed via a ‘getRaw()’ method, forcing the developer to show clear intent.
Using Inspekt is quite simple:
$cage_POST = Inspekt::makePostCage();
$userid = $cage_POST->getInt('userid');
Include the basic library in your application and create cage objects for the superglobals; from then on the input can (should!) be accessed only through the cage object's accessor methods. Read more about usage on this page: http://code.google.com/p/inspekt/wiki/BasicUsage
One thing missing from that page is the list of accessor methods. Yes, it says coming soon, but don't be disheartened: the download includes a copy of the phpdoc-generated documentation, which is enough to get you started. In brief, they are:
getAlnum – Returns only the alphabetic characters and digits in value.
getAlpha – Returns only the alphabetic characters in value.
getDigits – Returns only the digits in value. This differs from getInt(), which casts the whole value to an integer instead of stripping the non-digit characters.
getDir – Returns dirname(value).
getInt – Returns (int) value.
getPath – Returns realpath(value).
getRaw – Returns the raw, unfiltered value.
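To make the difference between these accessors concrete, here is a rough plain-PHP approximation of what the descriptions above amount to. The approx* helper names are hypothetical and are not part of Inspekt; the library's own code may differ in details such as locale handling, so treat this as a sketch of the behaviour rather than the implementation.

```php
<?php
// Hypothetical stand-ins for the accessors listed above; NOT Inspekt's own code.
function approxAlnum($value)  { return preg_replace('/[^A-Za-z0-9]/', '', $value); }
function approxAlpha($value)  { return preg_replace('/[^A-Za-z]/', '', $value); }
function approxDigits($value) { return preg_replace('/[^0-9]/', '', $value); }
function approxInt($value)    { return (int) $value; }
function approxDir($value)    { return dirname($value); }

$input = '42 apples, 7 pears';

var_dump(approxAlnum($input));   // string "42apples7pears"
var_dump(approxAlpha($input));   // string "applespears"
var_dump(approxDigits($input));  // string "427": every digit is kept
var_dump(approxInt($input));     // int 42: the cast stops at the first non-digit

var_dump(approxDir('/var/www/uploads/photo.jpg')); // string "/var/www/uploads"
```

Note how the digits-only filter and the integer cast disagree on the same input, which is why the list singles out getDigits as different from getInt().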
Another scenario where I find “firewall”-type input filters handy is when I am saddled with legacy code and want to make it secure ASAP. The approach is the same as setting up a firewall for a server. If I were using Inspekt in such a case, the first step would be to cage *all* the superglobals and then run the application. Obviously there will be a lot of errors; you then start poking appropriate holes in your input firewall. In the end you will still have to refactor the code where you were forced to use the getRaw() method for whatever reason. Yes, it is still an awful lot of work, but now it is much easier!
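The cage-everything-then-poke-holes workflow can be sketched without the library at all. MiniCage below is a hypothetical, stripped-down class written only for this illustration; it is not Inspekt's implementation and it supports just three accessors. The simulated $_POST contents are made up as well.

```php
<?php
// MiniCage: a hypothetical illustration of the cage idea, NOT Inspekt itself.
class MiniCage
{
    private $data;

    public function __construct(array $source)
    {
        $this->data = $source;
    }

    // Filtered access: always returns an integer (0 when the key is missing).
    public function getInt($key)
    {
        return isset($this->data[$key]) ? (int) $this->data[$key] : 0;
    }

    // Filtered access: keeps only the digits, returned as a string.
    public function getDigits($key)
    {
        return isset($this->data[$key])
            ? preg_replace('/[^0-9]/', '', (string) $this->data[$key])
            : '';
    }

    // Unfiltered access: the name makes every use easy to find later.
    public function getRaw($key)
    {
        return isset($this->data[$key]) ? $this->data[$key] : null;
    }
}

// Simulate a request, cage it, then destroy the original superglobal.
$_POST = array('userid' => '17; DROP TABLE users', 'comment' => '<b>hi</b>');
$cagePost = new MiniCage($_POST);
$_POST = null;

// Legacy code that still reads $_POST directly now gets nothing back...
$legacy = isset($_POST['userid']) ? $_POST['userid'] : null;

// ...so each access has to be rewritten with an explicit accessor.
$userid  = $cagePost->getInt('userid');    // 17, the injected tail is gone
$comment = $cagePost->getRaw('comment');   // a deliberate hole to refactor later

var_dump($legacy, $userid, $comment);
```

Once the application runs again, searching the code base for getRaw gives you the list of places that still need a proper filter.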