Formatting Strings With Python
Python offers a variety of options that you can use to format strings. See which one you should use.
Going through the official Python documentation, you find out that the language offers four options to format strings. This is one of the rare contradictions with the Zen of Python, which states that there should be one, and preferably only one, obvious way to do something.
Let's take a look at the four options.
The Old Style: %-formatting
When applied to a string, the % operator performs formatting in the style of C's printf function and creates a new, formatted string. Let's have an example:
house = "Gryffondor"
points = 10
print('%d points for %s !!' % (points, house))
#10 points for Gryffondor !!
%-formatting is limited as to the types it supports. Only ints, strs, and doubles can be formatted. All other types are either not supported or converted to one of these types before formatting. If the conversion specifier does not match the type of the value, a TypeError is raised (see the example below).
house = "Gryffondor"
points = 10
#house is a str, but %d is used to format it
print('%d points for %d !!' % (points, house))
#TypeError: %d format: a number is required, not str
#(the exact wording of the message varies between Python versions)
The advantage of this style is that it is easy to use and simple to write, so it is well suited to short, simple strings. It also offers a certain degree of flexibility for padding, aligning, and truncating strings.
house = "Gryffondor"
points = 10
#Aligning
#Align right in a field of width 10
print('%10d points for %s !!' % (points, house))
#        10 points for Gryffondor !!
#Notice the 8 padding spaces before the 10
#Align left in a field of width 10
print('%-10d points for %s !!' % (points, house))
#10         points for Gryffondor !!
#Notice the 8 padding spaces after the 10
#Truncating
print('%d points for %.5s !!' % (points, house))
#10 points for Gryff !!
#Notice that only the first 5 letters of house were printed
This option can also convert values with the __str__() and __repr__() methods, through the %s and %r specifiers.
class IP_Adress(object):
    def __init__(self, nb_1, nb_2, nb_3, nb_4):
        self.nb_1 = nb_1
        self.nb_2 = nb_2
        self.nb_3 = nb_3
        self.nb_4 = nb_4

    def __str__(self):
        return '%s.%s.%s.%s' % (self.nb_1, self.nb_2, self.nb_3, self.nb_4)

    def __repr__(self):
        return 'IP_Adress(%r,%r,%r,%r)' % (self.nb_1, self.nb_2, self.nb_3, self.nb_4)

if __name__ == '__main__':
    addr = IP_Adress(192, 192, 233, 232)
    #conversion using the __str__() method
    #the one-element tuple (addr,) keeps the argument unambiguous
    print('%s' % (addr,))
    #192.192.233.232
    #conversion using the __repr__() method
    print('%r' % (addr,))
    #IP_Adress(192,192,233,232)
However, %-formatting is error-prone and cumbersome, and it does not help the readability of your code.
The Python documentation does not recommend this option and indicates clearly that it exhibits a variety of quirks that lead to a number of common errors such as failing to display tuples and dictionaries correctly.
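The tuple quirk is easy to reproduce. The short sketch below (Python 3, with a hypothetical scores tuple chosen only for illustration) shows the failure and the usual workaround of wrapping the value in a one-element tuple.

```python
scores = (10, 20)  # hypothetical value, for illustration only

# A tuple on the right-hand side is treated as the list of arguments,
# so a single %s receives two values and the formatting fails.
try:
    message = 'Scores: %s' % scores
except TypeError as error:
    message = 'TypeError: %s' % error
print(message)

# Wrapping the tuple in a one-element tuple formats it as a single value.
wrapped = 'Scores: %s' % (scores,)
print(wrapped)
# Scores: (10, 20)
```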
Although this style is error-prone, it is not deprecated in Python 3.x, and it is still widely used: it is simple to type, the logging module uses it, and, since Python 3.5, the bytes type supports this operator for formatting.
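As a rough illustration of the last two points, the sketch below logs a %-style message and formats a bytes object with the % operator. The logger name and the in-memory stream are assumptions made only to keep the example self-contained.

```python
import io
import logging

# The logging module takes a %-style message and its arguments separately
# and builds the final string only when the record is actually emitted.
stream = io.StringIO()
logger = logging.getLogger('house_points')  # hypothetical logger name
logger.addHandler(logging.StreamHandler(stream))
logger.setLevel(logging.INFO)
logger.propagate = False

logger.info('%d points for %s', 10, 'Gryffondor')
logged = stream.getvalue()
print(logged, end='')
# 10 points for Gryffondor

# Since Python 3.5, bytes objects support the % operator as well.
payload = b'%d points for %s' % (10, b'Gryffondor')
print(payload)
# b'10 points for Gryffondor'
```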
The Not-So-New Style: str.format()
This option was introduced in Python 2.6 and Python 3.0 (automatic numbering of empty {} fields followed in Python 2.7 and 3.1), which is why I call it the not-so-new style: it has been part of the standard Python features for some time now.
It avoids many of the pitfalls of the old style, offers more flexibility, and gives a better level of readability.
Let's revisit the previous examples with str.format():
house = "Gryffondor"
points = 10
print("{} points for {} !!".format(points, house))
#10 points for Gryffondor !!
You can refer to your variables by name, which lets you reuse them several times in the final string while keeping the code readable.
Gryffondor = "Gryffondor"
slytherin = "slytherin"
points = 10
print("{points} points for {Gryffondor} and also {points} for {slytherin}!!".format(
    points=points, Gryffondor=Gryffondor, slytherin=slytherin))
#10 points for Gryffondor and also 10 for slytherin!!
As I already mentioned, this option offers more flexibility. The code below lists some of the tricks it offers and highlights the ones which are not provided with the %-formatting option.
house = "Gryffondor"
points = 10
#Aligning
#numbers are right-aligned and strings left-aligned by default
print("{:10} points for {:20} !!".format(points, house))
#        10 points for Gryffondor           !!
print("{} points for {:>20} !!".format(points, house))
#10 points for           Gryffondor !!
#You are able to choose the padding character '_'
#not available with %-formatting
print("{} points for {:_>20} !!".format(points, house))
#10 points for __________Gryffondor !!
print("{} points for {:_<20} !!".format(points, house))
#10 points for Gryffondor__________ !!
#align center
#not available with %-formatting
print("{} points for {:^20} !!".format(points, house))
#10 points for      Gryffondor      !!
#Truncating long strings
print("{} points for {:.5} !!".format(points, house))
#10 points for Gryff !!
This option also makes value conversion easy. The example below reuses the IP_Adress class from the previous section.
if __name__ == '__main__':
    addr = IP_Adress(192, 192, 233, 232)
    #conversion using the __str__() method
    print('{0!s}'.format(addr))
    #192.192.233.232
    #conversion using the __repr__() method
    print('{0!r}'.format(addr))
    #IP_Adress(192,192,233,232)
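Two more str.format() features are worth a quick sketch: positional indexes, which let one argument appear several times, and format specs for numbers. The values below are hypothetical and chosen only for illustration.

```python
house = "Gryffondor"
total_points = 1234.5  # hypothetical value, for illustration only

# Positional indexes let one argument appear several times.
repeated = "{0} wins! {1} points for {0}".format(house, 10)
print(repeated)
# Gryffondor wins! 10 points for Gryffondor

# A format spec after the colon controls number formatting:
# ',' adds a thousands separator and '.2f' keeps two decimals.
total = "{:,.2f} points in total".format(total_points)
print(total)
# 1,234.50 points in total
```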
The New Style: F-Strings
This option was introduced in Python 3.6. It is also called literal string interpolation. Personally, this is my favorite one. It is simple, quick, and intuitive.
They are called f-strings because you need to precede your string with the letter "f." The "f" is used the same way the "b" prefix is used for byte strings.
The "f" stands for formatted.
It works pretty much like the .format() method; however, you can directly insert the names from the current scope in the format string.
Gryffondor = "Gryffondor"
points = 10
print(f'{points} points for {Gryffondor} !!')
#10 points for Gryffondor !!
With F-Strings, you can format strings on multiple lines using triple quotes.
Gryffondor = "Gryffondor"
points = 10
print(f'''{points} points for {Gryffondor} !!
Of Course, Hermione Granger gave
the right answer''')
#10 points for Gryffondor !!
#Of Course, Hermione Granger gave
#the right answer
F-Strings are evaluated at runtime, which is why you can put any valid Python expression in them.
For example, you can include an arithmetic expression like:
print (f'{1 + 2}')
#3
You can also call functions and methods directly.
#method call
Gryffondor = "Gryffondor"
print(f'{Gryffondor.upper()}')
#GRYFFONDOR
#function call
def upper_case(str_in):
    return str_in.upper()

Gryffondor = "Gryffondor"
print(f'{upper_case(Gryffondor)}')
#GRYFFONDOR
This option offers the same level of flexibility as str.format() and also offers a simple way to perform value conversion. Once again, we consider our IP_Adress class.
# I consider the same IP_Adress class as in the %-formatting example
if __name__ == '__main__':
    addr = IP_Adress(192, 192, 233, 232)
    #by default calls the __str__ method
    print(f'{addr}')
    #192.192.233.232
    #invokes the __repr__ method with the !r specifier
    print(f'{addr!r}')
    #IP_Adress(192,192,233,232)
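To back up the claim about flexibility, here is a minimal sketch showing that the format specs used earlier with str.format() (fill character, alignment, truncation) work unchanged inside F-Strings. The variable names are reused from the earlier examples.

```python
house = "Gryffondor"
points = 10

# Padding and alignment, with '_' as the fill character
padded = f'{points} points for {house:_>20} !!'
print(padded)
# 10 points for __________Gryffondor !!

# Centering in a field of width 20
centered = f'[{house:^20}]'
print(centered)
# [     Gryffondor     ]

# Truncating to the first 5 characters
truncated = f'{points} points for {house:.5} !!'
print(truncated)
# 10 points for Gryff !!
```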
Template Strings
Let's start directly with an example:
from string import Template
house = "Gryffondor"
points = 10
t = Template('$points points for $house')
print(t.substitute(points=points, house=house))
#10 points for Gryffondor
Note that you need to import the Template class from the string module as Template strings are not a core language feature, but they are included in the string module in the standard library.
With this option, you do not use specifiers such as %d, %r, %s, or %10d, which is why it is less powerful than the other options. To convert values, you need to proceed as shown below:
# I consider the same IP_Adress class as in the %-formatting example
from string import Template
if __name__ == '__main__':
    addr = IP_Adress(192, 192, 233, 232)
    t = Template('$addr')
    #substitute() converts the value with str(), so __str__ is used
    print(t.substitute(addr=addr))
    #192.192.233.232
    #template strings have no conversion specifiers, so to print
    #the result of __repr__ we call the repr() function ourselves
    print(t.substitute(addr=repr(addr)))
    #IP_Adress(192,192,233,232)
Although this option is less powerful, it is more secure when the user is the one who provides the format string. With the other options, and str.format() in particular, a user-supplied format can give access to secret variables in your code. The substitute() method of templates avoids that.
Let's have an example to explain this last point:
class IP_Adress(object):
    def __init__(self, nb_1, nb_2, nb_3, nb_4):
        self.nb_1 = nb_1
        self.nb_2 = nb_2
        self.nb_3 = nb_3
        self.nb_4 = nb_4

    def __str__(self):
        return '{}.{}.{}.{}'.format(self.nb_1, self.nb_2, self.nb_3, self.nb_4)

    def __repr__(self):
        return 'IP_Adress({!r},{!r},{!r},{!r})'.format(self.nb_1, self.nb_2, self.nb_3, self.nb_4)

if __name__ == '__main__':
    addr = IP_Adress(192, 192, 234, 233)
    protected_var = "protected value"
    #format provided as an input from a user
    str_format = "{addr.__init__.__globals__[protected_var]}"
    print(str_format.format(addr=addr))
    #protected value
    #__globals__ is the dictionary that maps module-level names to their values,
    #so the user gets access to protected_var through it
With str.format(), the user injected a format that let them access protected_var in our code.
Let's see how the string template can help you avoid this exact same issue:
class IP_Adress(object):
    def __init__(self, nb_1, nb_2, nb_3, nb_4):
        self.nb_1 = nb_1
        self.nb_2 = nb_2
        self.nb_3 = nb_3
        self.nb_4 = nb_4

    def __str__(self):
        return '{}.{}.{}.{}'.format(self.nb_1, self.nb_2, self.nb_3, self.nb_4)

    def __repr__(self):
        return 'IP_Adress({!r},{!r},{!r},{!r})'.format(self.nb_1, self.nb_2, self.nb_3, self.nb_4)

#import the Template class from the string module
from string import Template
if __name__ == '__main__':
    addr = IP_Adress(192, 192, 234, 233)
    protected_value = "hello"
    #format provided as an input from a user
    str_format = "${addr.__init__.__globals__[protected_value]}"
    t = Template(str_format)
    #only plain identifiers are valid placeholders, so the lookup is rejected
    print(t.substitute(addr=addr))
    #ValueError: Invalid placeholder in string: line 1, col 1
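When the template text comes from a user, missing values are another case to handle. The sketch below, built on a hypothetical user-supplied template, contrasts substitute(), which raises KeyError for a placeholder without a value, with safe_substitute(), which leaves the placeholder untouched.

```python
from string import Template

# Hypothetical user-supplied template, for illustration only
user_template = Template('$points points for $house')

# substitute() raises KeyError when a placeholder has no value
try:
    result = user_template.substitute(points=10)
except KeyError as error:
    result = 'KeyError: %s' % error
print(result)
# KeyError: 'house'

# safe_substitute() leaves unknown placeholders as they are
partial = user_template.safe_substitute(points=10)
print(partial)
# 10 points for $house
```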
Conclusion
So which option should you use?
When it is up to the user to decide the string format to use, template strings are the safest choice.
%-formatting is best avoided, or used only with simple Python types like int and float, and not with collections like dictionaries or lists. Between the str.format() option and the F-Strings option, you are free to use the one you feel more comfortable with, or do as I do: with Python 3.6 or later, I use either F-Strings or the str.format() style.
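As a quick recap, the sketch below builds the same sentence with each of the four options, reusing the house and points variables from the earlier examples (Python 3.6 or later is assumed because of the F-String).

```python
from string import Template

house = "Gryffondor"
points = 10

old_style = '%d points for %s' % (points, house)
format_style = '{} points for {}'.format(points, house)
f_string = f'{points} points for {house}'
template = Template('$points points for $house').substitute(points=points, house=house)

# All four approaches build the same string
print(old_style == format_style == f_string == template)
# True
print(f_string)
# 10 points for Gryffondor
```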