Follow along here: w0rp.com/migrating
During the day ...
... but at night!
ALE: https://github.com/w0rp/ale
My demo album: holdon.world
Probably the same, and going to get better.
Ignore generic benchmarks. Run your own!
I submitted some Ideas and averaged the time for POST requests.
~349ms in Python 2, ~330ms in Python 3.
Basically, no meaningful difference.
However, much less memory is now consumed in production.
The Timsort algorithm used in list.sort() and sorted() now runs faster
UTF-8 is now 2x to 4x faster. UTF-16 encoding is now up to 10x faster.
collections.OrderedDict is now implemented in C, which makes it 4 to 100 times faster.
The UTF-8 encoder is now up to 75 times as fast for error handlers ignore, replace, surrogateescape, surrogatepass
Optimized case-insensitive matching and searching of regular expressions. Searching some patterns can now be up to 20 times faster.
Performance improvements go where new development is.
New development means Python 3.
*x and **y
super() and class declarations
yield from
async / await
Don't get ahead of yourself, however.
These features are great, but you can't upgrade all in one go.
Python 2 and 3 behave differently in subtle ways.
You will need to follow a methodical upgrade process.
At a high level, all you need to do is...
# Install the version you want via the deadsnakes PPA.
sudo add-apt-repository ppa:deadsnakes/ppa
sudo apt-get update
sudo apt-get install python3.6 python3.6-dev python3.6-venv
I strongly recommend using the official installer. https://www.python.org/downloads/release/python-362/
You could just compile Python. It's not too hard.
Example here: https://github.com/w0rp/tox-travis-example
Run automated tests to measure Python 3 compatibility.
You can target certain tests with marks.
pytest -m python3_supported
@pytest.mark.python3_supported
def test_really_important_thing(...):
    ...
You can also choose to skip some.
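# pytest requires a reason when the skipif condition is a boolean
@pytest.mark.skipif(six.PY3, reason='Not ported to Python 3 yet')
def test_really_important_thing(...):
    ...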
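Recent pytest versions warn about marks they do not recognise, so it may help to register the custom mark. A minimal sketch, assuming a conftest.py at the root of the test suite; the description text after the colon is only illustrative.

```python
# conftest.py (assumed location: the root of your test suite)


def pytest_configure(config):
    """Register the custom mark used by `pytest -m python3_supported`."""
    config.addinivalue_line(
        "markers",
        "python3_supported: test is expected to pass on Python 3",
    )
```

pytest calls this hook itself at startup, so nothing needs to import it.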
Use CI to move forward (10% one week, 20% the next...)
Move imports that fail on Python 3 into the functions that use them.
Before:
from broken_in_python3 import important_function

def do_something_important():
    important_function()
After:
def do_something_important():
    from broken_in_python3 import important_function
    important_function()
Localising imports will take you from output like this...
$ pytest
E..F.
to output more like this.
$ pytest
......FFFFFFF..FF..
You can get more of your tests to run.
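To see why this helps, here is a small sketch. It assumes `broken_in_python3` (the placeholder module name used above) cannot be imported in the current environment; `do_something_else` is a hypothetical neighbouring function. A real incompatible module might raise SyntaxError instead of ImportError, but the effect is the same: the failure moves from import time to call time.

```python
def do_something_important():
    # Deferred import: the module is only loaded when this function is called,
    # so importing this file (and collecting its tests) still succeeds.
    from broken_in_python3 import important_function
    return important_function()


def do_something_else():
    # Hypothetical neighbour that never touches the broken module,
    # so its tests can now run on Python 3.
    return 42


failed_at_call_time = False
try:
    do_something_important()
except ImportError:
    # Only the code path that needs the broken module fails.
    failed_at_call_time = True
```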
Over 90% of the top 200 Python packages support Python 3: https://python3wos.appspot.com
Stop using packages and versions that do not support Python 3.
Start using the packages that do.
First things first. Write this in every Python file. (I mean it!)
from __future__ import absolute_import, division, print_function, unicode_literals
The above will fix problems with...
x / y returning different results.
print as a statement.
flake8 plugin: https://github.com/xZise/flake8-future-import
print statements
print x, y # With the __future__ imports, this is a syntax error
print(x, y) # This works in both versions with the __future__ imports
Old except: syntax
except ExceptionType, e: # Only valid in Python 2
except ExceptionType as e: # Works in both versions
.next() for iterators
x = iterator.next() # Only valid in Python 2
x = next(iterator) # Works in both versions.
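A short sketch combining the two fixes above in code that runs on both versions. The helper names `first_or_none` and `parse_int` are made up for illustration.

```python
def first_or_none(iterable):
    """Return the first item of an iterable, or None if it is empty."""
    iterator = iter(iterable)
    # The builtin next() takes an optional default, which avoids
    # having to catch StopIteration at all.
    return next(iterator, None)


def parse_int(text):
    """Return int(text), or None if text is not a valid integer."""
    try:
        return int(text)
    except ValueError as error:  # `as` syntax: valid in 2.6+ and 3
        # `error` is only available inside this block in Python 3.
        _ = error
        return None
```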
Use list(x for x in ...), not [x for x in ...]
some_list = [x for x in range(3)] # x is bound to function scope in 2.x
some_list = list(x for x in range(3)) # x is not accessible on the outside
Both Python versions offer text sequence types and binary sequence types.
They are named differently in each version.
Text should be your default, not bytes.
from __future__ import ..., unicode_literals # Remember this?
some_text = 'foo' # This is now `unicode` in Python 2, and `str` in Python 3.
also_text = u'foo' # Works in 2.7, removed in 3.0, added back in 3.3
some_bytes = b'bar' # This is now `str` in Python 2, and `bytes` in Python 3.
Never mix text and binary sequences.
confusion = 'foo' + b'bar' # This doesn't work in 3
text_result = 'foo' + b'bar'.decode('utf-8') # This will work
bytes_result = 'foo'.encode('utf-8') + b'bar' # This will also work
Remember: You decode bytes, and encode text.
Never the opposite.
These are correct.
b'xyz'.decode('utf-8') # Correct, you decode some bytes into text
'xyz'.encode('utf-8') # Correct, you encode text into bytes
These are wrong!
b'xyz'.encode('utf-8') # Wrong! Python 3 will raise AttributeError
'xyz'.decode('utf-8') # Wrong! Python 3 will raise AttributeError
Think of byte sequences and text sequences like so.
encoded -> decoded -> encoded
bytes -> str -> bytes
HTTP request -> application -> database
Libraries will almost always handle encoding for you.
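When you do have to handle the boundaries yourself, the pattern can be sketched like this. `handle_request` and its input are hypothetical, and UTF-8 is assumed as the wire encoding.

```python
def handle_request(raw_body):
    """Decode at the edge, work with text in the middle, encode on the way out."""
    text = raw_body.decode('utf-8')       # bytes -> text at the input boundary
    greeting = u'Hello, ' + text.strip()  # text-only processing in the application
    return greeting.encode('utf-8')       # text -> bytes at the output boundary


response = handle_request(b'  w\xc3\xb6rld \n')
```

Everything between the two boundary lines only ever sees text, so there is nothing to mix up.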
... You'll have to deal with a number of issues before you're done.
pip install six
six offers a compatibility layer for many common symbols.
six will fix most of your common standard library issues, so use it.
Builtins behave differently.
Here is how to fix those issues.
Use range, not xrange.
If you need lazy iteration rather than a list, use six.moves
from six.moves import range # Make range compatible for this file
a_generator = range(42) # Same as xrange(42) in 2, range(42) in 3
Use list(range(...)) when you must have a list.
some_generator = xrange(5) # Does not exist in Python 3
something = range(5) # list in 2, but the same as xrange(5) in 3
a_list = list(range(5)) # A list in both
This works for anything which returns an iterable.
some_list = list(may_return_list())
Worst case scenario, you make a redundant copy.
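A sketch of the pattern in use. `first_and_last` is a hypothetical helper that needs indexing, so it normalises whatever it is given into a list first.

```python
def first_and_last(values):
    """Return the first and last items of any non-empty iterable."""
    # list() accepts a list, a range, a generator, a dict view...
    # If `values` is already a list, this is just a shallow copy.
    items = list(values)
    return items[0], items[-1]
```

The same call works whether the caller passes `range(5)`, a generator expression, or a plain list.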
Use expressions instead of map or filter.
Don't bother with this:
doubled_values = map(lambda x: x * 2, some_list) # a list in 2, an iterator in 3
odd_values = filter(lambda x: x % 2, some_list) # a list in 2, an iterator in 3
Do this instead:
doubled_values = list(x * 2 for x in some_list) # a list in each version
odd_values = list(x for x in some_list if x % 2) # a list in each version
If you must use them, import the iterator versions.
from six.moves import filter, map # itertools.ifilter and itertools.imap in 2.x
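The practical difference between a list and a lazy iterator is that the iterator can only be consumed once. A small sketch with made-up values:

```python
some_list = [1, 2, 3, 4, 5]

# Lists: can be indexed, measured and iterated repeatedly.
doubled_values = list(x * 2 for x in some_list)
odd_values = list(x for x in some_list if x % 2)

# A generator expression is lazy, like map() and filter() in Python 3.
lazy_doubled = (x * 2 for x in some_list)
first_total = sum(lazy_doubled)   # consumes the generator
second_total = sum(lazy_doubled)  # nothing left to iterate over
```

Code written for Python 2 that iterates twice over a `map()` result will silently see an empty sequence the second time on Python 3.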
reduce is no longer a builtin in 3, for some reason.
Just import it when you use it.
from six.moves import reduce # Redundant in 2, but fixes code in 3
product_result = reduce(lambda x, y: x * y, range(1, 5)) # Now you can use reduce
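If you would rather not depend on six for this one, `functools.reduce` exists in Python 2.6+ and in Python 3, so the standard library import also works in both. A minimal sketch:

```python
import operator
from functools import reduce  # Same function in Python 2.6+ and Python 3

# 1 * 2 * 3 * 4
product_result = reduce(operator.mul, range(1, 5))

# The optional third argument gives a result for empty input.
empty_product = reduce(operator.mul, [], 1)
```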
Types are different in 2 and 3. Use types from six instead.
Before:
isinstance(value, (int, long)) # Checking for integers
isinstance(value, basestring) # Checking for str or unicode
After:
isinstance(value, six.integer_types) # Checking for integers, no long in 3
isinstance(value, six.string_types) # basestring in 2, and str in 3 ...
isinstance(value, six.text_type) # unicode in 2, and str in 3
isinstance(value, six.binary_type) # Check for bytes in 3, or str in 2
# If you really must, accept both, and convert your value...
isinstance(value, (six.binary_type, six.text_type))
These do not work in Python 3
for key in some_dict.iterkeys(): ...
for value in some_dict.itervalues(): ...
for key, value in some_dict.iteritems(): ...
Use six functions instead.
# You can just do this instead in each version
for key in some_dict: ...
for value in six.itervalues(some_dict): ...
for key, value in six.iteritems(some_dict): ...
There are view functions too.
for key in six.viewkeys(some_dict): ...
for value in six.viewvalues(some_dict): ...
for key, value in six.viewitems(some_dict): ...
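For small dictionaries, the plain methods are also an option: `.keys()`, `.values()` and `.items()` exist in both versions. They build lists in Python 2 and return views in Python 3, which only matters for very large dictionaries. A sketch with made-up data:

```python
prices = {'apple': 3, 'pear': 5}

# .values() and .items() work in both versions.
total = sum(prices.values())
labels = sorted('{}={}'.format(key, value) for key, value in prices.items())

# Iterating a dict directly yields its keys in both versions.
shared = sorted(set(prices) & {'pear', 'plum'})
```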
A whole talk could be devoted to this alone.
Try and use text almost everywhere, not bytes.
u'' always gives you text in 2.x and 3.3+
b'' always gives you bytes in 2.6+ and 3.0+
Wazoku first adopted explicit literals, and then unicode_literals
from __future__ import unicode_literals
When you know the types, decode and encode.
a_string = some_data.decode('utf-8')
some_data = a_string.encode('utf-8')
Always explicitly specify the encoding.
The default encoding is often ascii in Python 2.
A poor man's version here. (Find a good library instead)
def to_text(value, encoding='utf-8'):
    # Check for str in 2, and bytes in 3
    if isinstance(value, six.binary_type):
        return value.decode(encoding)
    # Use six.text_type(x) instead of unicode(x) or str(x)
    return six.text_type(value)

def to_bytes(value, encoding='utf-8'):
    # Re-encode binary data as utf-8, so we get exceptions for invalid bytes
    return to_text(value, encoding).encode('utf-8')
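Here is how those helpers behave on a few inputs. This sketch repeats the two functions so that it runs on its own, and falls back to the Python 3 builtins if six is not installed; that fallback is an assumption of the sketch, not part of the original helpers.

```python
try:
    import six
    text_type, binary_type = six.text_type, six.binary_type
except ImportError:
    # Assumption for this sketch: without six, we are on Python 3.
    text_type, binary_type = str, bytes


def to_text(value, encoding='utf-8'):
    if isinstance(value, binary_type):
        return value.decode(encoding)
    return text_type(value)


def to_bytes(value, encoding='utf-8'):
    return to_text(value, encoding).encode('utf-8')


from_bytes = to_text(b'caf\xc3\xa9')                 # UTF-8 bytes become text
from_number = to_text(42)                            # other objects are converted
recoded = to_bytes(b'caf\xe9', encoding='latin-1')   # Latin-1 in, UTF-8 out

try:
    to_bytes(b'\xff')  # not valid UTF-8
    rejected_invalid = False
except UnicodeDecodeError:
    rejected_invalid = True
```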
Use force_text or force_bytes instead for Django code.
from django.utils.encoding import force_bytes, force_text
Python 2 only:
from urlparse import urlparse, parse_qs
from urllib import urlencode, quote_plus
Python 3 only:
from urllib.parse import urlparse, parse_qs, urlencode, quote_plus
Both with six:
from six.moves.urllib.parse import urlparse, parse_qs, urlencode, quote_plus
See six documentation for other functions.
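If six is not available, a try/except around the imports is a common alternative. This is a sketch of that fallback pattern, not a replacement for six; the example URL is made up.

```python
try:
    # Python 3 location
    from urllib.parse import urlparse, parse_qs, urlencode, quote_plus
except ImportError:
    # Python 2 locations
    from urlparse import urlparse, parse_qs
    from urllib import urlencode, quote_plus

parts = urlparse('https://example.com/search?q=python+3')
query = parse_qs(parts.query)            # '+' is decoded as a space
encoded = urlencode({'q': 'python 3'})   # spaces are encoded as '+'
quoted = quote_plus('a b&c')
```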
Use the Django functions if you can, which work better.
from django.utils.http import urlencode, urlquote_plus
urlencode handles Unicode poorly in Python 2
None )
csv changed to expect text sequences
cmp for sorted or itertools.groupby
__str__: Use @six.python_2_unicode_compatible
__metaclass__: Use six.with_metaclass as a base class, or the @six.add_metaclass decorator
u'%s' % b'foo' produces weird results
Further reading: