Photograph by Tostie14 (Creative Commons)
Each time I revisit regular expressions after not using them for a while, it feels like reading hieroglyphics. Then I remember that I can somehow read these ancient runes!
Writing regex from scratch is incredibly enjoyable and rewarding. Refactoring a match pattern to make it more specific usually improves performance too. Trying to capture as many groups as you can without resorting to alternation (OR-ing with |) tends to help as well, and that in itself is great fun.
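To make "specificity" concrete, here is a minimal sketch in Python. The sample line and the attribute names are made up for illustration. The loose pattern lets a greedy `.*` run to the end of the line and then backtrack, while the specific one uses a negated character class that can never cross a quote. The timing benefit will vary by engine and input, so treat the performance point as a tendency rather than a guarantee; the difference in what gets captured is the part you can see directly.

```python
import re

# Hypothetical sample input, invented for this example.
line = 'name="alice" role="admin" team="ops"'

# Loose: greedy .* grabs everything to the end of the line,
# then backtracks until it finds the last closing quote.
loose = re.compile(r'name="(.*)"')

# Specific: [^"]* cannot cross a quote, so it stops at the
# first closing quote without any backtracking over the rest.
specific = re.compile(r'name="([^"]*)"')

loose_value = loose.search(line).group(1)
specific_value = specific.search(line).group(1)

print(loose_value)     # alice" role="admin" team="ops
print(specific_value)  # alice
```

The more specific pattern does less work and also captures what was actually intended.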
Then there's the snag: "I pity the fool who has to debug and maintain that terse string, I do!"
There must be a way to make these things maintainable. I haven't quite worked out what it is yet. Perhaps you should compose the string piecewise, with comments on the intended use of each piece. Or maybe just write a block comment above the expression listing the strings you are actually trying to get hold of. Then, when it comes to extending or maintaining it, the next person can throw the old expression in the bin and write a new one that matches everything you needed previously plus the new stuff. It sounds really inefficient, but knowing what someone was trying to do in the code seems to be more useful than knowing how they went about it.
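One way the piecewise idea could look, sketched in Python: small named fragments, a block comment stating the intent, and the `re.VERBOSE` flag so the final pattern can carry inline comments. The date format and the names `YEAR`, `MONTH`, `DAY` and `DATE` are assumptions chosen for illustration, not a recommendation for real date parsing.

```python
import re

# Intent: find ISO-style dates such as 2024-03-15 in free text
# and pull out the year, month and day.
# Should match:     2024-03-15, 1999-12-31
# Should not match: 2024-13-01 (no month 13), 2024-3-5 (no zero padding)
YEAR = r"(?P<year>\d{4})"
MONTH = r"(?P<month>0[1-9]|1[0-2])"
DAY = r"(?P<day>0[1-9]|[12]\d|3[01])"

# re.VERBOSE ignores whitespace in the pattern and allows # comments.
DATE = re.compile(
    rf"""
    \b{YEAR}   # four-digit year
    -{MONTH}   # month, 01 to 12
    -{DAY}\b   # day, 01 to 31
    """,
    re.VERBOSE,
)

match = DATE.search("Released on 2024-03-15, patched later.")
print(match.groupdict())
```

If the next person decides to bin the expression and start again, the block comment still tells them which strings the replacement has to handle.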
Being able to turn unstructured data into objects feels like creating something from nothing. That is why it feels so rewarding: you are.
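As a small illustration of that feeling, here is a sketch in Python that turns a line of text into a typed object using named groups. The log format, the `LogEntry` class and the `parse` function are all hypothetical, invented for this example.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class LogEntry:
    level: str
    code: int
    message: str


# Hypothetical log format: "[LEVEL] code=NNN free-form message"
LOG_LINE = re.compile(
    r"\[(?P<level>[A-Z]+)\]\s+code=(?P<code>\d+)\s+(?P<message>.+)"
)


def parse(line: str) -> Optional[LogEntry]:
    """Return a LogEntry for a matching line, or None if it does not match."""
    m = LOG_LINE.match(line)
    if m is None:
        return None
    return LogEntry(level=m["level"], code=int(m["code"]), message=m["message"])


entry = parse("[ERROR] code=504 upstream timed out")
print(entry)
```

A plain string goes in, and something with fields you can reason about comes out.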
Some great regex references