Python has many built-in string methods that come in handy when processing data. This post lists, with examples, some of the most common yet powerful ones.
- capitalize()
- center()
- count()
- endswith()
- startswith()
- find()
- format()
- index()
- isalnum()
- isalpha()
- isdecimal()
- isupper()
- islower()
- upper()
- lower()
- casefold()
- isnumeric()
- isprintable()
- isspace()
- join()
- split() and rsplit()
- splitlines()
- strip()
- rstrip()
- lstrip()
capitalize() :
Returns a copy of the string with the first character capitalized and the rest lowercased.
>>> name = 'vijay'
>>> name_new = name.capitalize()
>>> print(name_new)
Vijay
center() :
Pads the string on both sides with spaces, or with a given fill character. The first argument is the total length of the resulting string, including the padding characters.
# In the example below, 'vijay' has length 5 and the requested length is 8, so the output is padded with 3 spaces in total.
>>> name = 'vijay'
>>> name.center(8)
' vijay  '
# In the example below, the requested length is 36, including the fill character '*'.
>>> print(name.center(36, "*"))
***************vijay****************
>>> print(name.center(8, "~"))
~vijay~~
count() :
Counts the non-overlapping occurrences of a substring in a given string.
>>> str1 = 'aabayagh'
>>> str2 = str1.count("aa")
>>> print(str2)
1
>>> str2 = str1.count("a")
>>> print(str2)
4
# Count "a" only from index 1 up to, but not including, index 6 (the slice 'abaya').
>>> str2 = str1.count("a", 1, 6)
>>> print(str2)
3
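Because count() only counts non-overlapping matches, the result can be lower than expected for repeating patterns. The sketch below illustrates this with a made-up string, along with one rough way to count overlapping matches; the variable names are illustrative only.

```python
text = "aaaa"

# Matches do not overlap: "aa" is counted at index 0 and at index 2.
pair_count = text.count("aa")
print(pair_count)  # 2

# One possible way to count overlapping matches: test every start position.
overlapping = sum(text.startswith("aa", i) for i in range(len(text)))
print(overlapping)  # 3
```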
endswith() :
Returns True if the string ends with the substring passed as an argument.
>>> str1 = "welcome to unixutils"
# Check whether "welcome to unixutils" ends with "unixutils". If it does, endswith() returns True.
>>> str2 = str1.endswith("unixutils")
>>> print(str2)
True
>>> str2 = str1.endswith("unix")
>>> print(str2)
False
startswith() :
Returns True if the string starts with the substring passed as an argument.
>>> str1 = "welcome to unixutils"
>>> str1.startswith("wel")
True
# Check for the prefix "unixutils" only from index 1 up to, but not including, index 8. startswith() returns True if the prefix is found there, else False.
>>> str2 = str1.startswith("unixutils", 1, 8)
>>> print(str2)
False
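Both startswith() and endswith() also accept a tuple of candidates, which saves chaining several calls with "or". A small sketch, using a hypothetical file name:

```python
filename = "report.csv"  # hypothetical file name, for illustration only

# True if the string ends with any one of the suffixes in the tuple.
is_data_file = filename.endswith((".csv", ".tsv"))

# True if the string starts with any one of the prefixes in the tuple.
is_temp_file = filename.startswith(("tmp_", "~"))

print(is_data_file, is_temp_file)  # True False
```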
find() :
Returns the lowest index at which the substring is found, or -1 if it is not found.
>>> str1 = "python is good. It is Object oriented. It is easy to learn and is easy to teach"
>>> str2 = str1.find("is")
>>> print(str2)
7
# Note: without extra arguments, as in the above example, find() returns only the lowest index even if the substring occurs multiple times.
# We can make find() search within a range of positions. The example below searches for "is" from index 10 up to, but not including, index 50.
>>> str2 = str1.find('is', 10, 50)
>>> print(str2)
19
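Since find() reports only one position per call, collecting every occurrence takes a loop that restarts the search just past the previous hit. The following is a minimal sketch of that idea, reusing the sentence from the example; the variable names are illustrative.

```python
text = "python is good. It is Object oriented. It is easy to learn and is easy to teach"

positions = []
start = text.find("is")
while start != -1:                      # find() returns -1 when nothing is left
    positions.append(start)
    start = text.find("is", start + 1)  # resume the search after the last hit

print(positions)

# A substring that never occurs gives -1 rather than an error.
missing = text.find("java")
print(missing)  # -1
```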
format() :
Returns a formatted string with the values passed as arguments inserted at the placeholder positions.
>>> text = "i am {}"
>>> print(text.format("vijay"))
i am vijay
>>> text = "i am {name}"
>>> print(text.format(name="vijay"))
i am vijay
>>> text = "i am {0} and my age is {1}"
>>> text_new = text.format("vijay", "22")
>>> print(text_new)
i am vijay and my age is 22
# :d formats integers only.
>>> str1 = "age: {:d}"
>>> str1.format(78)
'age: 78'
>>> str1.format(78.9)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: Unknown format code 'd' for object of type 'float'
The output above shows that :d raises an error when it is given a floating-point number, because it expects an integer. To format floats, use :f with a precision, which sets the number of places after the decimal point.
# Precision for floating-point numbers
# .0f is 0 places after the decimal point, i.e. 78.123 is formatted as 78
# .4f is 4 places after the decimal point, i.e. 78.123 is formatted as 78.1230
>>> text = "my score is {:.0f}"
>>> str2 = text.format(9.9)
>>> print(str2)
my score is 10
>>> str2 = text.format(9.4)
>>> print(str2)
my score is 9
>>> text = "my score is {:.6f}"
>>> str2 = text.format(9.9)
>>> print(str2)
my score is 9.900000
# A number after the colon sets a minimum field width. In the second output, {0:8} pads the string "vijay" with spaces to a width of 8 characters.
>>> print("I am {0} with UnixUtils".format("vijay"))
I am vijay with UnixUtils
>>> print("I am {0:8} with UnixUtils".format("vijay"))
I am vijay    with UnixUtils
# We can choose the alignment: right, left or center. The examples below show right, left and center alignment respectively.
>>> print("I am {0:>8} with UnixUtils".format("vijay"))
I am    vijay with UnixUtils
>>> print("I am {0:<8} with UnixUtils".format("vijay"))
I am vijay    with UnixUtils
>>> print("I am {0:^8} with UnixUtils".format("vijay"))
I am  vijay   with UnixUtils
# We can also choose to format a string with or without quotes: !r applies repr() and !s applies str().
>>> text = "I am {0!r}"
>>> op = text.format("vijay")
>>> print(op)
I am 'vijay'
>>> text = "I am {0!s}"
>>> op = text.format("vijay")
>>> print(op)
I am vijay
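Width, alignment and precision can be combined in a single placeholder, which is handy for lining up columns. The sketch below assumes some made-up sample rows; the column widths are arbitrary choices.

```python
rows = [("vijay", 9.9), ("sam", 78.123)]  # hypothetical sample data

lines = []
for name, score in rows:
    # {0:<8}    -> name, left-aligned in a field of width 8
    # {1:>8.2f} -> score, right-aligned in width 8 with 2 decimal places
    lines.append("{0:<8}|{1:>8.2f}".format(name, score))

print("\n".join(lines))
```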
index() :
Returns the position (index) of a substring. Unlike find(), it raises ValueError when the substring is not found.
>>> str1 = "i am vijay"
>>> str1.index("am")
2
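The practical difference between index() and find() shows up when the substring is missing. A short sketch of handling both cases, with illustrative variable names:

```python
text = "i am vijay"

# index() raises ValueError for a missing substring, so guard it with try/except.
try:
    position = text.index("sam")
except ValueError:
    position = None

# find() signals the same situation by returning -1.
found_at = text.find("sam")

print(position, found_at)  # None -1
```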
isalnum() :
Checks whether a string is alphanumeric. Returns True if it contains only letters and/or digits, and False if it contains special characters or whitespace.
>>> name = "55"
>>> print(name.isalnum())
True
>>> name = "Sam"
>>> print(name.isalnum())
True
>>> name = "Sam99"
>>> print(name.isalnum())
True
>>> name = "Sam 99"
>>> print(name.isalnum())
False
>>> name = "Sam$"
>>> print(name.isalnum())
False
isalpha() :
Checks whether the string contains only alphabetic characters. If it contains whitespace, special characters or digits, it returns False.
>>> name = "vijay"
>>> print(name.isalpha())
True
>>> name = "vijay ?"
>>> print(name.isalpha())
False
>>> name = "vijay99"
>>> print(name.isalpha())
False
>>> name = "vijay g"
>>> print(name.isalpha())
False
isdecimal() :
Checks whether every character in the string is a decimal digit. Returns True if so, else False. A decimal point is not a digit, so "18.98" fails the check.
>>> a = "18"
>>> a.isdecimal()
True
>>> a = "18.98"
>>> a.isdecimal()
False
isupper() :
Returns True only if all the letters in the string are uppercase and there is at least one letter.
>>> name = "VIJAY"
>>> name.isupper()
True
>>> name = "viJaY"
>>> name.isupper()
False
>>> name = "vijay"
>>> name.isupper()
False
islower() :
Returns True only if all the letters in the string are lowercase and there is at least one letter.
>>> name = "vijay"
>>> name.islower()
True
>>> name = "VijaY"
>>> name.islower()
False
>>> name = "VIJAY"
>>> name.islower()
False
upper() :
Converts the entire string to uppercase.
>>> name = "vijay"
>>> name.upper()
'VIJAY'
>>> name = "vij12"
>>> name.upper()
'VIJ12'
lower() :
Converts the entire string to lowercase.
>>> name = "VIJay"
>>> name.lower()
'vijay'
casefold() :
Converts the entire string to a lowercase form intended for case-insensitive comparison. It is more aggressive than lower(), since it also handles characters from other languages that lower() leaves unchanged.
# Works in Python 3 only!
>>> str1 = "vijAY"
>>> str1.casefold()
'vijay'
>>> str1 = "vijay"
>>> str1.casefold()
'vijay'
>>> str1 = "VIJAY"
>>> str1.casefold()
'vijay'
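For plain ASCII text, casefold() and lower() give the same result, so the difference only appears with certain non-English characters. The German "ß" is a common illustration; the sketch below uses it as an assumed example and is not taken from the original post.

```python
a = "STRASSE"
b = "stra\u00dfe"  # "straße", written with an escape for the sharp s

# lower() leaves the sharp s untouched, so the two strings still differ.
same_with_lower = a.lower() == b.lower()

# casefold() expands the sharp s to "ss", so the comparison succeeds.
same_with_casefold = a.casefold() == b.casefold()

print(same_with_lower, same_with_casefold)  # False True
```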
isnumeric() :
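Returns True only if every character in the string is a numeric character. This covers the decimal digits and also other numeric characters such as fractions, so it is broader than isdecimal(). A decimal point is not numeric, so "2.6" returns False.
>>> x = "vijay"
>>> x.isnumeric()
False
>>> x = "26"
>>> x.isnumeric()
True
>>> x = "2.6"
>>> x.isnumeric()
False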
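isdecimal() and isnumeric() agree on ordinary digits but differ on other Unicode numeric characters. The following sketch compares the two on a few sample strings chosen for illustration; each result tuple is (isdecimal, isnumeric).

```python
# "\u00bd" is the fraction one half and "\u00b2" is superscript two.
samples = ["26", "2.6", "-5", "\u00bd", "\u00b2"]

results = {s: (s.isdecimal(), s.isnumeric()) for s in samples}

for sample, (is_dec, is_num) in results.items():
    print(repr(sample), is_dec, is_num)
```

Neither method accepts a sign or a decimal point, so they are not a substitute for trying int() or float() when validating number input.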
isprintable() :
Returns True only if every character in the string is printable. Control characters such as the newline are not printable.
>>> s = "hello !@#$"
>>> s.isprintable()
True
>>> s = "\n"
>>> s.isprintable()
False
isspace() :
Returns True only if the string contains nothing but whitespace (spaces, tabs, newlines) and is not empty.
>>> string = "hello people!"
>>> string.isspace()
False
>>> string = "\n\n"
>>> string.isspace()
True
>>> string = ""
>>> string.isspace()
False
>>> string = " "
>>> string.isspace()
True
>>> string = "\t"
>>> string.isspace()
True
join() :
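Joins the items of a sequence of strings into one string, using the string it is called on as the separator. When the sequence is itself a string, its characters are joined.
>>> list1 = ['1', '2', '3', '4']
>>> s = "-"
>>> s.join(list1)
'1-2-3-4'
>>> string = "vijay"
>>> s.join(string)
'v-i-j-a-y'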
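join() only accepts strings, which is why the list in the example holds '1', '2' and so on rather than integers. A minimal sketch of joining numbers by converting each item first:

```python
numbers = [1, 2, 3, 4]

# "-".join(numbers) would raise TypeError because the items are ints,
# so convert each item to a string first.
joined = "-".join(str(n) for n in numbers)

print(joined)  # 1-2-3-4
```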
split() :
Splits a string into a list. If no argument is given, runs of whitespace are used as the field separator.
>>> string = "Welcome to Unix utils"
>>> string.split('$')
['Welcome to Unix utils']
>>> string = "My+Name+is+vijay"
>>> string.split('+')
['My', 'Name', 'is', 'vijay']
# The second argument is a number that sets the maximum number of splits to perform.
>>> string.split('+', 2)
['My', 'Name', 'is+vijay']
>>> string.split('+', 1)
['My', 'Name+is+vijay']
Note: rsplit() takes the same arguments as split() but starts splitting from the right. The two give the same result unless the second argument limits the number of splits.
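The direction of splitting only matters once the number of splits is limited. The sketch below contrasts split() and rsplit() on the same string, and also shows the no-argument form collapsing repeated whitespace; the padded sample string is made up for illustration.

```python
text = "My+Name+is+vijay"

# With a limit of 1, split() cuts at the first "+" from the left...
left = text.split("+", 1)
# ...while rsplit() cuts at the first "+" from the right.
right = text.rsplit("+", 1)

print(left)   # ['My', 'Name+is+vijay']
print(right)  # ['My+Name+is', 'vijay']

# With no arguments, runs of whitespace count as one separator and
# leading/trailing whitespace is ignored.
words = "  Welcome   to Unix  utils ".split()
print(words)  # ['Welcome', 'to', 'Unix', 'utils']
```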
splitlines() :
Returns the list of lines in a string, splitting at line boundaries such as '\n'.
>>> string = "text1\ntext2\ntext3"
>>> print(string)
text1
text2
text3
>>> string.splitlines()
['text1', 'text2', 'text3']
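splitlines() differs from split("\n") in a few ways that matter for real files: it also recognises Windows-style "\r\n" endings, and it does not produce an empty string for a trailing newline. A short sketch with an assumed mixed-ending sample:

```python
text = "text1\r\ntext2\ntext3\n"  # sample with mixed line endings

lines = text.splitlines()
print(lines)       # ['text1', 'text2', 'text3']

# keepends=True keeps each line's terminator attached.
with_ends = text.splitlines(keepends=True)
print(with_ends)   # ['text1\r\n', 'text2\n', 'text3\n']

# split("\n") leaves the "\r" behind and adds an empty final item.
by_newline = text.split("\n")
print(by_newline)  # ['text1\r', 'text2', 'text3', '']
```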
strip() :
Removes leading and trailing characters that are passed as an argument. If no argument is given, it strips whitespace.
>>> string = " vijay "
>>> string.strip()
'vijay'
>>> string = "####vijay####"
>>> string.strip('#')
'vijay'
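The argument to strip() is treated as a set of characters rather than as a single substring, and only the two ends of the string are affected. A minimal sketch with made-up strings:

```python
text = "#-#vijay-#-"

# Any mix of '#' and '-' is removed from both ends, in any order.
stripped = text.strip("#-")
print(stripped)  # vijay

# Characters in the middle of the string are left alone.
inner = "##vi#jay##".strip("#")
print(inner)     # vi#jay
```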
rstrip() :
rstrip() works just like strip() and takes the same arguments, but strips only the trailing characters.
>>> string = "####vijay####"
>>> string.rstrip('#')
'####vijay'
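Because the argument is a set of characters, rstrip() is easy to misuse for removing a file extension or other fixed suffix. The sketch below shows the pitfall with a hypothetical file name, and one possible alternative using endswith() and slicing.

```python
filename = "test.txt"  # hypothetical file name
suffix = ".txt"

# rstrip(".txt") strips any trailing '.', 't' or 'x', so it also eats
# the final 't' of "test".
wrong = filename.rstrip(suffix)
print(wrong)  # tes

# One way to remove an exact suffix: check for it, then slice it off.
right = filename[:-len(suffix)] if filename.endswith(suffix) else filename
print(right)  # test
```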
lstrip() :
lstrip() works just like strip() and takes the same arguments, but strips only the leading characters.
>>> string = "####vijay####"
>>> string.lstrip('#')
'vijay####'