Even after using Python for years, we stumble upon functions and features that we did not know about. Some of these can be quite useful, yet underused. With that in mind, I’ve compiled a list of incredibly useful Python functions and features that you should be familiar with.
Functions with Arbitrary Number of Arguments
You may already know that Python allows you to define functions with optional arguments. But there is also a method for allowing completely arbitrary number of function arguments.
First, here is an example with just optional arguments:
|
def function(arg1="",arg2=""):
print "arg1: {0}".format(arg1)
print "arg2: {0}".format(arg2)
function("Hello", "World")
# prints args1: Hello
# prints args2: World
function()
# prints args1:
# prints args2:
|
Now, let’s see how we can build a function that accepts any number of arguments. This time we are going to utilize
Tuples:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
|
def foo(*args): # just use "*" to collect all remaining arguments into a tuple
numargs = len(args)
print "Number of arguments: {0}".format(numargs)
for i, x in enumerate(args):
print "Argument {0} is: {1}".format(i,x)
foo()
# Number of arguments: 0
foo("hello")
# Number of arguments: 1
# Argument 0 is: hello
foo("hello","World","Again")
# Number of arguments: 3
# Argument 0 is: hello
# Argument 1 is: World
# Argument 2 is: Again
|
Using Glob() to Find Files
Many Python functions have long and descriptive names. However it may be hard to tell what a function named
glob() does unless you are already familiar with that term from elsewhere.
Think of it like a more capable version of the
listdir() function. It can let you search for files by using patterns.
|
import glob
# get all py files
files = glob.glob('*.py')
print files
# Output
# ['arg.py', 'g.py', 'shut.py', 'test.py']
|
You can fetch multiple file types like this:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
|
import itertools as it, glob
def multiple_file_types(*patterns):
return it.chain.from_iterable(glob.glob(pattern) for pattern in patterns)
for filename in multiple_file_types("*.txt", "*.py"): # add as many filetype arguements
print filename
# output
#=========#
# test.txt
# arg.py
# g.py
# shut.py
# test.py
|
If you want to get the full path to each file, you can just call the
realpath() function on the returned values:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
|
import itertools as it, glob, os
def multiple_file_types(*patterns):
return it.chain.from_iterable(glob.glob(pattern) for pattern in patterns)
for filename in multiple_file_types("*.txt", "*.py"): # add as many filetype arguements
realpath = os.path.realpath(filename)
print realpath
# output
#=========#
# C:\xxx\pyfunc\test.txt
# C:\xxx\pyfunc\arg.py
# C:\xxx\pyfunc\g.py
# C:\xxx\pyfunc\shut.py
# C:\xxx\pyfunc\test.py
|
Debugging
Some of the examples below make use of the
inspect module. This module can be very useful for debugging purpose and you can get with it much more that what described here.
We are not going to cover each one of these in this article, but I will show you a few use cases.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
|
import logging, inspect
logging.basicConfig(level=logging.INFO,
format='%(asctime)s %(levelname)-8s %(filename)s:%(lineno)-4d: %(message)s',
datefmt='%m-%d %H:%M',
)
logging.debug('A debug message')
logging.info('Some information')
logging.warning('A shot across the bow')
def test():
frame,filename,line_number,function_name,lines,index=\
inspect.getouterframes(inspect.currentframe())[1]
print(frame,filename,line_number,function_name,lines,index)
test()
# Should print the following (with current date/time of course)
#10-19 19:57 INFO test.py:9 : Some information
#10-19 19:57 WARNING test.py:10 : A shot across the bow
#(, 'C:/xxx/pyfunc/magic.py', 16, '', ['test()\n'], 0)
|
Generating Unique ID’s
There may be situations where you need to generate a unique string. I have seen many people use the md5()
function for this, even though it’s not exactly meant for this purpose
There is actually a Python function named uuid() that is meant to be used for this.
|
import uuid
result = uuid.uuid1()
print result
# output => various attempts
# 9e177ec0-65b6-11e3-b2d0-e4d53dfcf61b
# be57b880-65b6-11e3-a04d-e4d53dfcf61b
# c3b2b90f-65b6-11e3-8c86-e4d53dfcf61b
|
You may notice that even though the strings are unique, they seem similar after several characters. This is because the generated string is related to the computer network address.
To reduce the chances of getting a duplicate, you can use this two functions.
|
import hmac,hashlib
key='1'
data='a'
print hmac.new(key, data, hashlib.sha256).hexdigest()
m = hashlib.sha1()
m.update("The quick brown fox jumps over the lazy dog")
print m.hexdigest()
# c6e693d0b35805080632bc2469e1154a8d1072a86557778c27a01329630f8917
# 2fd4e1c67a2d28fced849ee1bb76e7391b93eb12
|
Serialization
Have you ever needed to store a complex variable in a database or a text file? You do not have to come up with a fancy solution to convert your arrays or objects into formatted strings, as Python already has functions for this purpose.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
|
import pickle
variable = ['hello', 42, [1,'two'],'apple']
# serialize content
file = open('serial.txt','w')
serialized_obj = pickle.dumps(variable)
file.write(serialized_obj)
file.close()
# unserialize to produce original content
target = open('serial.txt','r')
myObj = pickle.load(target)
print serialized_obj
print myObj
#output
# (lp0
# S'hello'
# p1
# aI42
# a(lp2
# I1
# aS'two'
# p3
# aaS'apple'
# p4
# a.
# ['hello', 42, [1, 'two'], 'apple']
|
This was the native Python serialization method. However, since JSON
has become so popular in recent years, they decided to add support for it. Now you can decode and encode as well inJSON
:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
|
import json
variable = ['hello', 42, [1,'two'],'apple']
print "Original {0} - {1}".format(variable,type(variable))
# encoding
encode = json.dumps(variable)
print "Encoded {0} - {1}".format(encode,type(encode))
#deccoding
decoded = json.loads(encode)
print "Decoded {0} - {1}".format(decoded,type(decoded))
# output
# Original ['hello', 42, [1, 'two'], 'apple'] - <type 'list'="">
# Encoded ["hello", 42, [1, "two"], "apple"] - <type 'str'="">
# Decoded [u'hello', 42, [1, u'two'], u'apple'] - <type 'list'="">
|
It is more compact, and best of all, compatible with javascript and many other languages. However, for complex objects, some information may be lost.
Compressing Strings
When talking about compression, we usually think about files, such as ZIP archives. It is possible to compress long strings in Python, without involving any archive files.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
|
import zlib
string = """ Lorem ipsum dolor sit amet, consectetur
adipiscing elit. Nunc ut elit id mi ultricies
adipiscing. Nulla facilisi. Praesent pulvinar,
sapien vel feugiat vestibulum, nulla dui pretium orci,
non ultricies elit lacus quis ante. Lorem ipsum dolor
sit amet, consectetur adipiscing elit. Aliquam
pretium ullamcorper urna quis iaculis. Etiam ac massa
sed turpis tempor luctus. Curabitur sed nibh eu elit
mollis congue. Praesent ipsum diam, consectetur vitae
ornare a, aliquam a nunc. In id magna pellentesque
tellus posuere adipiscing. Sed non mi metus, at lacinia
augue. Sed magna nisi, ornare in mollis in, mollis
sed nunc. Etiam at justo in leo congue mollis.
Nullam in neque eget metus hendrerit scelerisque
eu non enim. Ut malesuada lacus eu nulla bibendum
id euismod urna sodales. """
print "Original Size: {0}".format(len(string))
compressed = zlib.compress(string)
print "Compressed Size: {0}".format(len(compressed))
decompressed = zlib.decompress(compressed)
print "Decompressed Size: {0}".format(len(decompressed))
# output
# Original Size: 1022
# Compressed Size: 423
# Decompressed Size: 1022
|
Register Shutdown Function
There is a module called
atexit, which will let you execute some code right before the script finishes running.
Imagine that you want to capture some benchmark statistics at the end of your script execution, such as how long it took to run:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
|
import atexit
import time
import math
def microtime(get_as_float = False) :
if get_as_float:
return time.time()
else:
return '%f %d' % math.modf(time.time())
start_time = microtime(False)
atexit.register(start_time)
def shutdown():
global start_time
print "Execution took: {0} seconds".format(start_time)
atexit.register(shutdown)
# Execution took: 0.297000 1387135607 seconds
# Error in atexit._run_exitfuncs:
# Traceback (most recent call last):
# File "C:\Python27\lib\atexit.py", line 24, in _run_exitfuncs
# func(*targs, **kargs)
# TypeError: 'str' object is not callable
# Error in sys.exitfunc:
# Traceback (most recent call last):
# File "C:\Python27\lib\atexit.py", line 24, in _run_exitfuncs
# func(*targs, **kargs)
# TypeError: 'str' object is not callable
|
At first this may seem trivial. You just add the code to the very bottom of the script and it runs before it finishes. if there is a fatal error, or if the script is terminated by the user, again it may not run.
When you use atexit.register()
, your code will execute no matter why the script has stopped running.
Conclusion
Are you aware of any other Python features that are not widely known but can be quite useful? Please share with us in the comments. And thank you for reading!
Repost from http://pypix.com/tools-and-tips/python-functions/