
March 15, 2007

Python - two different splits

In Python, there are two different split methods you can use to break up a string into a number of substrings, based on a particular separator. If you know exactly what character(s) your separator will be - e.g. exactly one space - then you can use the split method on the string itself. But if your separator is less well defined - e.g. if it's one or more white space characters - then you'll want to use the split within the re module.

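import re
# Pattern matching one or more white space characters
space = re.compile(r'\s+')
# Three spaces after Python and a tab after Prolog
data = "Perl Python   PHP Prolog\tPascal"
# String method - splits at each single space
langs = data.split(" ")
print(langs)
# Regular expression - splits at each run of white space
langs = space.split(data)
print(langs)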

How does that run?

['Perl', 'Python', '', '', 'PHP', 'Prolog\tPascal']
['Perl', 'Python', 'PHP', 'Prolog', 'Pascal']

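The first split looks fairly poor - we've split at single space characters, but the input string had multiple spaces in one place and a tab in another. That leaves us with empty strings where the extra spaces were, and with Prolog and Pascal still joined together by the tab.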

The second split - on a regular expression meaning "one or more white space characters" - worked much better, and is typically what you might use for data that was user entered or user edited.
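
One thing to watch with user entered data is white space at the very start or end of the line, which the regular expression split reports as an empty string at each end. Here's a minimal sketch of one way round that, stripping the string before splitting it. The input line is made up for illustration and is not from the example above.

```python
import re

space = re.compile(r'\s+')

# Hypothetical user entered line with leading and trailing white space
entered = "  Perl Python\tPHP \n"

# Splitting directly leaves an empty string at each end of the list
raw = space.split(entered)
print(raw)

# Stripping first removes the white space at the ends, so no empty fields
langs = space.split(entered.strip())
print(langs)
```

The first list comes out with an empty string before Perl and another after PHP; the second holds just the three language names.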

Posted by gje at March 15, 2007 05:53 PM



Well House Consultants Ltd. Copyright 2008