
Rewrite parts of django 2.2 post so things make a little more sense

Jake Howard 2019-07-19 12:03:52 +01:00
parent d57051f600
commit d340a66b41
Signed by: jake
GPG key ID: 57AFB45680EDD477
2 changed files with 51 additions and 50 deletions


@ -4,21 +4,23 @@ date: 2019-04-01
tags: [programming]
---
April marks the release of Django 2.2, the latest LTS version of the popular Python web framework. Django 2.2 marks almost 2 years of development since the last LTS release, 1.11 in April 2017, and brings with it some very large improvements and changes which naturally come with a major version increase.
April marks the release of Django 2.2, the latest LTS version of the popular Python web framework. Django 2.2 marks almost 2 years of development since the last LTS release, 1.11 in April 2017, and brings with it some very large improvements and changes which naturally come with a major version bump.
Django historically works off the LTS pattern of software releasing, providing 2 channels. The LTS versions are maintained for longer than the regular versions, and receive regular bug fixes and security patches in line with the main release channel.
Django historically works off the LTS pattern of software releasing, providing 2 channels. LTS versions are maintained far longer than regular versions, and receive regular bug fixes and security patches in line with the main release channel.
![Django update cycle](https://static.djangoproject.com/img/release-roadmap.e844db08610e.png)
The bump between 1.11 and 2.2 also brought with it the updates from 2.0 and 2.1. Those features which have been in use for 18 months finally come to those who need the stability of an LTS release. I've not delved too far into the 2.x releases so far, as most of what I do strongly benefits from using an LTS-based version.
The bump from 1.11 to 2.2 also brought with it the updates from 2.0 and 2.1. Features used by some users for 18 months finally come to those who need the stability of an LTS release. I've not delved too far into the 2.x releases so far, as most of what I do strongly benefits from using an LTS-based version.
## Python 2
This is far from a complete list - that can be found on the [Django website](https://docs.djangoproject.com/en/2.2/releases/). This is simply the parts I found most interesting. If you're about to upgrade a codebase, or are just interested in what's changed, I highly recommend checking the release notes for yourself!
Ironically named, Django 2 is the first Django release to completely drop support for Python 2. Django 2.2 will require at least 3.5. Python 2 (commonly referred to as 'legacy python') will retire [in 2020](https://pythonclock.org/), so it's great to see Django dropping support well beforehand so users can start migrating their larger codebases. For years there's been a debate as to which major version of python is better: 2 or 3. Considering Python 2 now has an end-of-life date, and the performance gap is now a non-issue, hopefully this debate is over.
## ~~Python 2~~
Django 2.0 (ironically named) is the first Django release to completely drop support for Python 2, requiring at least 3.5. Python 2 (commonly referred to as 'legacy Python') will retire [in 2020](https://pythonclock.org/), so it's great to see Django fully drop support beforehand so users have to start migrating their codebases. For years there's been a debate as to which major version of Python is better: 2 or 3. Considering Python 2 now has an end-of-life date, and the performance gap is now almost a non-issue, this debate is over.
## Simplified URL Routing
In previous versions, Django's URL system relied heavily on regular expressions to match paths to views. This works fine for very simple data types (like integers, which can simply be `\d+`), but more complex data structures lead to more interesting URL patterns. In the past, I've had to resort to a simpler URL pattern, and then doing more URL validation in the view, which is less than ideal. UUIDs were famously very difficult to do, with many people resorting to `[0-9a-f-]+`, whereas the correct regex is in fact [`[0-9a-f]{12}4[0-9a-f]{3}[89ab][0-9a-f]{15}\Z`](https://stackoverflow.com/a/18359032).
In previous versions, Django's URL system relied on regular expressions to match paths. This works fine for very simple data types (like integers, `\d+`), but more complex data structures lead to much more _interesting_ URL patterns. In the past, I've had to resort to a simpler URL pattern, and then doing more URL validation in the view, which is less than ideal. UUIDs were famously very difficult to do, with many people resorting to `[0-9a-f-]+`, whereas the correct regex is in fact [`[0-9a-f]{12}4[0-9a-f]{3}[89ab][0-9a-f]{15}\Z`](https://stackoverflow.com/a/18359032), apparently.
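To see why the loose pattern is a problem, here's a small sketch using only Python's standard library (none of this is Django's own code). It compares the `[0-9a-f-]+` shortcut against actually parsing the value with the `uuid` module. The helper names are made up for illustration.

```python
import re
import uuid

# The loose shortcut pattern: any run of hex digits and dashes.
LOOSE_UUID = re.compile(r"[0-9a-f-]+\Z")


def looks_like_uuid_loosely(value):
    """Return True if the loose regex accepts the value."""
    return LOOSE_UUID.match(value) is not None


def is_real_uuid(value):
    """Return True only if the standard library can parse the value as a UUID."""
    try:
        uuid.UUID(value)
    except ValueError:
        return False
    return True


candidate = "dead-beef"
print(looks_like_uuid_loosely(candidate))  # accepted by the loose pattern
print(is_real_uuid(candidate))  # rejected by the uuid module
```

With the loose pattern, the view still has to validate the value itself, which is exactly the extra work described above.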
Thankfully, Django 2.0 fixed this, by drastically simplifying the URL routing syntax to allow for special keywords to be in place of RegEx capture groups. The new syntax is available with the new `path` function:
@ -32,7 +34,7 @@ This means UUID-based paths can now be written as:
path('articles/<uuid:year>/', views.year_archive),
```
This is a significant improvement over the previous methods. The intention is to deprecate the previous `url` function, and rename it `re_path`, but it's undefined when that will happen. There's support for the following shorthand types:
This is a significant improvement over the previous methods. There's support for the following shorthand types:
- `str`
- `int`
@ -48,27 +50,33 @@ The django admin is now mobile friendly. This isn't a massive deal considering l
### Auto-complete
I've personally had to wrestle quite a lot with performance in the Django Admin caused solely by foreign key and many-to-many fields. By default, the Django Admin renders these as `<select />` elements, with all the possible records. This can lead to large lists of models, and potentially some `O(n)` queries if the model's `__str__` method also runs queries.
I've personally had to wrestle quite a lot with performance in the Django Admin caused solely by foreign key and many-to-many fields. By default, the Django Admin renders these as `<select />` elements, with all the possible records. This can lead to large lists of models, and potentially some `O(n)` queries if the model's `__str__` method also runs queries (please don't get into this habit!).
Django 2.0 resolves this by adding an auto-complete widget for these 2, which means rather than rendering a `<select />`, it renders a custom widget which only searches and populates the search results when the user interacts with it. This will greatly increase the performance in large forms.
## Stronger password hashes
Django uses SHA256 to encrypt passwords, and then applies PBKDF2 over the top, to further strengthen the hash. I don't want to go into what those are and why they're there now, but trust that it's a very strong hash.
Django hashes passwords using PBKDF2 with SHA256 as the underlying hash, running many rounds of it to further strengthen the result. Exactly why this is done, and what PBKDF2 brings to the table is an interesting topic, for a later date.
Django 2.0 increases the number of PBKDF2 iterations from 36000 to 100000. This is a very large increase in iterations, and is meant to increase to 180000 rounds in Django 3.0. This only affects new users, or existing users when they change their passwords. For someone like me, this is great news!
Django 2.0 increases the number of PBKDF2 iterations from 36000 to 100000. That's quite an increase, and is meant to increase further to 180000 rounds in Django 3.0. The new round will only be applied to new users, or when existing users change their passwords.
This may have an impact on how long tests take to run, if they're constantly creating and destroying users, as PBKDF2 can increase how long it takes to calculate passwords, which is compounded by needing to run it many times in test cases. I've seen 25% increases in test speeds just by swapping to an MD5-based hashing backend, which doesn't use PBKDF2 at all.
The increase in iterations means hashing passwords is much slower, which can have a huge impact during tests, if they're constantly creating and destroying users. This issue can be compounded if tests need to create lots of users. I've seen 25% increases in test speeds just by swapping to an MD5-based hashing backend, which is significantly faster, as it doesn't use PBKDF2 at all.
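To get a feel for the cost, here's a rough sketch using `hashlib.pbkdf2_hmac` from the standard library. It isn't Django's hasher, and `hash_password` is a hypothetical helper, but it runs the same kind of work at both iteration counts. Timings will vary by machine.

```python
import hashlib
import os
import time


def hash_password(password, salt, iterations):
    """Derive a key with PBKDF2-HMAC-SHA256 (standard library, not Django's hasher)."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)


salt = os.urandom(16)
for iterations in (36000, 100000):
    start = time.perf_counter()
    digest = hash_password("correct horse battery staple", salt, iterations)
    elapsed = time.perf_counter() - start
    print(f"{iterations} iterations: {elapsed * 1000:.1f} ms, {len(digest)} byte hash")
```

For tests, the hasher can be swapped through Django's `PASSWORD_HASHERS` setting, which is how the MD5-based backend mentioned above gets used.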
## Files can be opened as context managers
If you've opened files with Python, you'll have seen the context manager pattern, and hopefully understand why it's significantly better than opening and closing the file handle manually. This pattern can now be used with Django files from file / image fields. This results in slightly cleaner code which is less prone to leaving handles open to files which aren't needed anymore.
If you've opened files with Python, you'll have seen the context manager pattern, and hopefully understand why it's significantly better than opening and closing the file handle manually.
```python
with open("file.txt", "w") as f:
    f.write("Hello world")
```
This pattern can now be used with Django files from file / image fields. This results in slightly cleaner code which is less prone to leaving handles open to files which aren't needed anymore.
## New Database functions
One of the largest changes in Django 2.0 - 2.2 is the plethora of database functions added. These database functions allow more complex queries than were previously allowed, enabling more computation to be done by the database, rather than requiring pulling all the data into python land and operating there.
One of the largest changes in Django 2.0 - 2.2 is the plethora of new database functions added. These database functions allow more complex queries than were previously allowed, enabling more computation to be done by the database, rather than pulling all the data into python land and operating there.
As with many other things in Django, said functions are named fairly well:
As with many other things in Django, said functions are named fairly well, so don't require much explanation:
### 2.0
- `StrIndex`
@ -112,39 +120,41 @@ As with many other things in Django, said functions are named fairly well:
- `Tan`
- `ExtractIsoYear`
With all these new functions, focusing around maths and string manipulation, database servers can be leveraged more, and less data returned to the application server. Exactly how useful these will be to most users, only time will tell.
With all these new functions, focusing around maths and string manipulation, database servers can be leveraged more, and less data returned to the application server. Exactly how useful these are, I don't know - So far I've not found a need for many of them.
## `QuerySet` API
### `QuerySet.iterator` chunk size
`QuerySet.iterator` is an efficient way of loading very large datasets into Django to be used. Simply iterating over a queryset loads the entire result set into memory, and then iterates over it. `iterator` uses cursors and pagination to chunk up the data, so a much smaller amount of data is stored in memory at once. The ability to specify a chunk size allows tuning of this to improve performance. The default is 2000, which represents something [close to how it worked before](https://www.postgresql.org/message-id/4D2F2C71.8080805%40dndg.it)
`QuerySet.iterator` is an efficient way of loading very large datasets into Django to be used. Simply iterating over a queryset loads the entire result set into memory, and then iterates over it as a `list`. `.iterator` uses cursors and pagination to chunk up the data, so a much smaller amount of data is stored in memory at once.
The new ability to specify a chunk size allows tuning of this to improve performance. The default is 2000, which represents something [close to how it worked before](https://www.postgresql.org/message-id/4D2F2C71.8080805%40dndg.it)
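The idea behind chunking can be sketched with the standard library's `sqlite3` module. This is only an illustration of fetching rows in batches, not how Django implements `.iterator`, and `iterate_in_chunks` plus the `article` table are made up for the example. In Django itself, the equivalent knob is the `chunk_size` argument to `.iterator()`.

```python
import sqlite3


def iterate_in_chunks(cursor, chunk_size=2000):
    """Yield rows one at a time while only fetching chunk_size rows per batch."""
    while True:
        rows = cursor.fetchmany(chunk_size)
        if not rows:
            return
        yield from rows


connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE article (id INTEGER PRIMARY KEY, title TEXT)")
connection.executemany(
    "INSERT INTO article (title) VALUES (?)",
    [(f"Article {n}",) for n in range(5000)],
)

cursor = connection.execute("SELECT id, title FROM article ORDER BY id")
total = sum(1 for _row in iterate_in_chunks(cursor, chunk_size=500))
print(total)
```

Only one batch of rows is held in memory at a time, which is the trade-off the chunk size lets you tune.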
### `QuerySet.values_list` can return named tuples
Named tuples are much like tuples, but their keys are, well, named! This is a much lighter data structure than a dictionary, and is also immutable. `values_list` being able to return these means the returned objects can be deconstructed in a much nicer way, and allow stronger type hinting.
[Named tuples](https://docs.python.org/3/library/collections.html#collections.namedtuple) are much like tuples, but their keys are, well, named! Named-tuples are a type-safe, lightweight, immutable alternative to passing around dictionaries when there's a limited set of keys. `values_list` being able to return these means the returned objects can be deconstructed in a much nicer way, and allow stronger type inference.
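As a quick refresher, here's what working with a named tuple looks like in plain Python. The `Row` shape is a hypothetical stand-in for a query result. In Django you'd get rows like this by passing `named=True` to `values_list`.

```python
from collections import namedtuple

# Hypothetical row shape, standing in for a query result.
Row = namedtuple("Row", ["id", "title"])

row = Row(id=1, title="Django 2.2")

# Fields are available by name as well as by position.
print(row.title)
print(row[1])

# Rows unpack like any other tuple.
article_id, title = row

# Named tuples are immutable, so assignment raises AttributeError.
try:
    row.title = "Something else"
except AttributeError:
    print("rows can't be modified")
```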
### `QuerySet.explain`, explained.
The new `explain` method on a `QuerySet` hooks into the existing SQL `EXPLAIN` statement to provide additional execution detail on queries. This may be useful to diagnose slow running queries and attempt to optimise them.
The new `explain` method on a `QuerySet` hooks into the existing SQL `EXPLAIN` statement to provide additional execution detail on queries. This allows a deeper understanding into the queries Django is using, and how the database will execute them. If you just want to see the query Django will execute, there's a `.query` property on a queryset, but that's been around a while.
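If you've never run an `EXPLAIN` by hand, here's a small sketch of the kind of output it produces, using `sqlite3` from the standard library rather than Django. The table and index names are made up, and the exact wording of the plan differs between databases and versions.

```python
import sqlite3

connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE article (id INTEGER PRIMARY KEY, title TEXT)")
connection.execute("CREATE INDEX article_title ON article (title)")

query = "SELECT id FROM article WHERE title = ?"
plan = connection.execute("EXPLAIN QUERY PLAN " + query, ("Django 2.2",)).fetchall()

for step in plan:
    # The last column is SQLite's human-readable description of the step.
    print(step[-1])
```

The plan shows whether the database intends to scan the whole table or use an index, which is usually the first thing to check on a slow query.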
### `QuerySet.bulk_update`
Updating many model instances at once often required either using a separate update query, or iterating over queries, resulting in `O(n)` queries. Neither of these are ideal. Django 2.2 introduces the `bulk_update` method, which takes a list of modified model instances, and saves them in a single query.
Updating many model instances at once often required either using a separate update query, or iterating over queries, resulting in `O(n)` queries, neither of which is ideal. Django 2.2 introduces the `bulk_update` method, which takes a list of modified model instances, and saves them in a single query.
`bulk_update` requires knowledge on which fields it's updating as the second argument, therefore if there may be modifications to differing fields per instance, this may not be ideal.
Personally, I can't think of many places this will be necessary, but I'm sure someone can!
Personally, I can't think of many places this will be necessary, but I'm sure someone can! It's always better to have features like this than work around them.
## `createsuperuser` password validators
Django 1.11 added support for password validators, which can be used to measure the strength of users' passwords against pre-defined requirements. Often during development, these may not be useful, and may be annoying. `createsuperuser` now prompts if these validators should be ignored, allowing super users to have weaker passwords for development.
Django 1.11 added support for password validators, which can be used to measure the strength of users' passwords against pre-defined requirements. Often during development, it's easier to have a simpler password than your validators allow (`password` is a perfectly fine password for once!). `createsuperuser` now prompts if these validators should be ignored, allowing super users to have weaker passwords for development.
Even though this exists, please don't use it in production!
## Secure JSON serialization into HTML
Anyone who's had to dump JSON blobs into HTML pages should have come across [`django-argonauts`](https://github.com/fusionbox/django-argonauts) (if you're doing this _without_ `django-argonauts`, fear). `django-argonauts` helps prevent multiple different classes of XSS attacks, there's more information on this in the [project's README](https://github.com/fusionbox/django-argonauts#filter).
Anyone who's had to dump JSON blobs into HTML pages should have come across [`django-argonauts`](https://github.com/fusionbox/django-argonauts) (if you're doing this _without_ `django-argonauts`, fear). `django-argonauts` helps prevent multiple different classes of XSS attacks, with some great examples in the [project's README](https://github.com/fusionbox/django-argonauts#filter).
Django now has some built-in support for protecting against these kinds of attacks, from the new `json_script` filter. This takes an object in template context, serializes it to JSON (securely), and wraps it in a `script` tag, resulting in:
@ -156,7 +166,7 @@ This can then be used by JavaScript directly by getting the tag by ID. If you st
## Constraints
The new constraints API in Django 2.2 allows for far greater control of database-level validation on model fields than previously available in field validators. These validators are applied at the model level, rather than the field level. Django 2.2 comes with 2 existing constraints: `UniqueConstraint` and `CheckConstraint`. Both constraints are executed at the database level (as additional queries rather than column-level constraints), which whilst making them faster when doing complex relationship-level validation, also increases the number of queries executed when modifying a model instance.
The new constraints API in Django 2.2 allows for far greater control of database-level validation on model fields than previously available in field validators, because they're applied at the model level, rather than the field level. Django 2.2 comes with 2 built-in constraints: `UniqueConstraint` and `CheckConstraint`. Both are created through migrations and enforced by the database itself, rather than by extra queries run from Python, so invalid rows are rejected even when they're written without going through Django's validation.
`UniqueConstraint` creates a unique constraint with any number of fields, in much the same way `unique_together` worked. `UniqueConstraint` also provides an additional `condition` argument, which specifies additional `Q` objects which must also apply. For example, `UniqueConstraint(fields=['user'], condition=Q(status='DRAFT'))` ensures that each user only has one draft.
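At the database level, a conditional unique constraint behaves much like a partial unique index. Here's a sketch of that behaviour using `sqlite3` from the standard library. It's plain SQL, not Django's generated migration, and the `post` table, `author` column and index name are all made up for the example.

```python
import sqlite3

connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE post (id INTEGER PRIMARY KEY, author TEXT, status TEXT)")

# Only rows matching the WHERE clause take part in the uniqueness check.
connection.execute(
    "CREATE UNIQUE INDEX one_draft_per_author ON post (author) WHERE status = 'DRAFT'"
)

connection.execute("INSERT INTO post (author, status) VALUES ('jake', 'DRAFT')")
connection.execute("INSERT INTO post (author, status) VALUES ('jake', 'PUBLISHED')")
connection.execute("INSERT INTO post (author, status) VALUES ('jake', 'PUBLISHED')")

try:
    connection.execute("INSERT INTO post (author, status) VALUES ('jake', 'DRAFT')")
    second_draft_allowed = True
except sqlite3.IntegrityError:
    second_draft_allowed = False

print(second_draft_allowed)
```

Any number of published posts are fine, but the second draft for the same author is refused by the database.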
@ -164,18 +174,18 @@ The new constraints API in Django 2.2 allows for far greater control of database
## No more headers in migrations
Whenever `manage.py makemigrations` is run, Django injects a header into the file with the generated date and version.
Whenever `manage.py makemigrations` is run, Django injects a header into migrations with the generated date and version. These headers are simply for reference, they serve no value at runtime.
```plain
```python
# -*- coding: utf-8 -*-
# Generated by Django 1.11.1 on 2017-06-07 16:10
```
The new `--no-header` argument removes this when generating new migrations. In a typical workflow, there's little reason to remove this, but it's nice there's the option now!
The new `--no-header` argument removes this when generating new migrations. In a typical workflow, there's little reason to remove this, but it's nice there's the option now! I'd be interested in hearing a use case for this!
## Migration planning
When executing migrations, especially in a production environment, it's useful to know which migrations are going to run. This is especially useful when certain migrations may require some site downtime. Previously, it was possible to see the migrations to run by using `manage.py showmigrations | grep -F "[ ]"`, but this is less than ideal, and a bit of a hack.
When executing migrations, especially in a production environment, it's useful to know which migrations are going to run. This is especially useful when certain migrations may require some site downtime. Previously, it was possible to see the migrations to run by using `manage.py showmigrations | grep -F "[ ]"`, but this is less than ideal, and a bit of a hack (not that that's a bad thing!).
Django 2.2 adds the ability to see the migrations before they are executed, rather than having to roll this functionality yourself. This is done using the new `--plan` flag.
@ -185,26 +195,27 @@ Previously, `request.META` gave access to HTTP headers, in a slightly weird way.
> any HTTP headers in the request are converted to META keys by converting all characters to upper-case, replacing any hyphens with underscores and adding an `HTTP_` prefix to the name. So, for example, a header called `X-Bender` would be mapped to the META key `HTTP_X_BENDER`.
For anyone who's worked with raw HTTP headers in the past, this is a little weird. Now, `request` objects have a `headers` attribute which allows a far more sane API over the raw request headers. As all headers should be, the accessing API is case-insensitive!
For anyone who's worked with raw HTTP headers in the past, this is a little weird.
Now, `request` objects have a `headers` attribute which allows a far more sane API over the raw request headers. As all headers should be, the accessing API is case-insensitive!
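To make the difference concrete, here's a toy sketch in plain Python. `meta_key_for` follows the conversion rule quoted above, and `CaseInsensitiveHeaders` is a made-up stand-in for the idea of case-insensitive lookup, not Django's actual class.

```python
def meta_key_for(header_name):
    """Build the request.META key for a header, following the rule quoted above."""
    return "HTTP_" + header_name.upper().replace("-", "_")


class CaseInsensitiveHeaders:
    """Toy case-insensitive lookup; a stand-in for the idea, not Django's class."""

    def __init__(self, headers):
        self._headers = {name.lower(): value for name, value in headers.items()}

    def __getitem__(self, name):
        return self._headers[name.lower()]


print(meta_key_for("X-Bender"))

headers = CaseInsensitiveHeaders({"X-Bender": "Bending unit 22"})
print(headers["x-bender"] == headers["X-BENDER"])
```

With the new attribute you look a header up by its real name, rather than remembering the mangled `META` key.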
## Use of `sqlparse`
In previous versions, Django's ORM handled every aspect of constructing SQL queries. This added a lot of additional, and arguably unnecessary code to the core of Django. Django 2.2 adds a new dependency which takes care of this: `sqlparse`. `sqlparse` is a library to handle AST parsing of SQL, allowing the conversion from SQL text to Python objects, and vice versa. This doesn't extract Django's ORM into an external package, it just removes a small section of it in favour of an existing library.
Using an external library brings with it many benefits. There's now less code inside the core Django codebase, meaning there's less for the core developers to manage and tie in to Django's release cycle. **(Wild speculation alert!)** It also _might_ mean it gets faster. Society is built on specialisation, therefore hopefully a library designed to do SQL parsing will be faster and more robust than the one originally written for Django.
Using an external library brings with it many benefits. There's now less code inside the core Django codebase, meaning there's less for the core developers to manage and tie in to Django's release cycle. **(Wild speculation alert!)** It also _might_ mean it gets faster. Society is built on specialisation, therefore hopefully a library designed to do SQL parsing will be faster and more robust than the one originally written for Django, and also takes some of the strain off the Django core team!
## Watchman
[Watchman](https://facebook.github.io/watchman/) is a technology from Facebook which enables efficient and powerful file watching in a directory. Django now has the ability to use this when doing live code reload in the dev server, rather than the pure-python alternative. This will give massive performance improvement on large codebases, and use fewer resources as it does.
Watchman support isn't enabled by default. It requires an additional optional dependency `pywatchman` to operate.
Watchman support isn't enabled by default. It requires an additional optional dependency `pywatchman` to operate, along with watchman being installed on your machine.
## Database instrumentation
Django supports many different ways of modifying the querying and model lifecycle, from executing arbitrary SQL, to using signals to listen for specific model events. Django 2.0
introduces instrumentation, which allows intermediary code to be executed for each query, which allows for modification, logging, and other modifications.
Django supports many different ways of modifying the querying and model lifecycle, from executing arbitrary SQL, to using signals to listen for specific model events. Django 2.0 introduces instrumentation, which allows intermediary code to be executed for each query, enabling modification, logging, and any other munging of queries and data you need.
An interesting use for this would be explicitly disabling queries in certain parts of the code, with [`django-zen-queries`](https://github.com/dabapps/django-zen-queries) (ships in https://github.com/dabapps/django-zen-queries/pull/12)
An interesting use for this would be explicitly disabling queries in certain parts of the code, with [`django-zen-queries`](https://github.com/dabapps/django-zen-queries) (ships in https://github.com/dabapps/django-zen-queries/pull/12).
## Upgrading

24
package-lock.json generated

@ -276,7 +276,6 @@
"resolved": "https://registry.npmjs.org/babel-runtime/-/babel-runtime-6.26.0.tgz",
"integrity": "sha1-llxwWGaOgrVde/4E/yM3vItWR/4=",
"dev": true,
"optional": true,
"requires": {
"core-js": "^2.4.0",
"regenerator-runtime": "^0.11.0"
@ -305,7 +304,6 @@
"resolved": "https://registry.npmjs.org/babel-types/-/babel-types-6.26.0.tgz",
"integrity": "sha1-o7Bz+Uq0nrb6Vc1lInozQ4BjJJc=",
"dev": true,
"optional": true,
"requires": {
"babel-runtime": "^6.26.0",
"esutils": "^2.0.2",
@ -317,8 +315,7 @@
"version": "6.18.0",
"resolved": "https://registry.npmjs.org/babylon/-/babylon-6.18.0.tgz",
"integrity": "sha512-q/UEjfGJ2Cm3oKV71DJz9d25TPnq5rhBVL2Q4fA5wcC3jcrdn7+SssEybFIxwAvvP+YCsCYNKughoF33GxgycQ==",
"dev": true,
"optional": true
"dev": true
},
"balanced-match": {
"version": "1.0.0",
@ -754,8 +751,7 @@
"version": "2.6.9",
"resolved": "https://registry.npmjs.org/core-js/-/core-js-2.6.9.tgz",
"integrity": "sha512-HOpZf6eXmnl7la+cUdMnLvUxKNqLUzJvgIziQ0DiF3JwSImNphIqdGqzj6hIKyX04MmV0poclQ7+wjWvxQyR2A==",
"dev": true,
"optional": true
"dev": true
},
"core-util-is": {
"version": "1.0.2",
@ -1163,7 +1159,6 @@
"resolved": "https://registry.npmjs.org/define-properties/-/define-properties-1.1.3.tgz",
"integrity": "sha512-3MqfYKj2lLzdMSf8ZIZE/V+Zuy+BgD6f164e8K2w7dgnpKArBDerGYpM46IYYcjnkdPNMjPk9A6VFB8+3SKlXQ==",
"dev": true,
"optional": true,
"requires": {
"object-keys": "^1.0.12"
}
@ -1870,8 +1865,7 @@
"version": "1.0.0",
"resolved": "https://registry.npmjs.org/has-symbols/-/has-symbols-1.0.0.tgz",
"integrity": "sha1-uhqPGvKg/DllD1yFA2dwQSIGO0Q=",
"dev": true,
"optional": true
"dev": true
},
"hash-base": {
"version": "3.0.4",
@ -2076,8 +2070,7 @@
"version": "1.1.4",
"resolved": "https://registry.npmjs.org/is-callable/-/is-callable-1.1.4.tgz",
"integrity": "sha512-r5p9sxJjYnArLjObpjA4xu5EKI3CuKHkJXMhT7kwbpUyIFD1n5PMAsoPvWnvtZiNz7LjkYDRZhd7FlI0eMijEA==",
"dev": true,
"optional": true
"dev": true
},
"is-date-object": {
"version": "1.0.1",
@ -2755,8 +2748,7 @@
"version": "1.1.1",
"resolved": "https://registry.npmjs.org/object-keys/-/object-keys-1.1.1.tgz",
"integrity": "sha512-NuAESUOUMrlIXOfHKzD6bpPu3tYt3xvjNdRIQ+FeT0lNb4K8WR70CaDxhuNguS2XG+GjkyMwOzsN5ZktImfhLA==",
"dev": true,
"optional": true
"dev": true
},
"object.assign": {
"version": "4.1.0",
@ -3087,8 +3079,7 @@
"version": "0.11.1",
"resolved": "https://registry.npmjs.org/regenerator-runtime/-/regenerator-runtime-0.11.1.tgz",
"integrity": "sha512-MguG95oij0fC3QV3URf4V2SDYGJhJnJGqvIIgdECeODCT98wSWDAJ94SSuVpYQUoTcGUIL6L4yNB7j1DFFHSBg==",
"dev": true,
"optional": true
"dev": true
},
"relateurl": {
"version": "0.2.7",
@ -3691,8 +3682,7 @@
"version": "1.0.3",
"resolved": "https://registry.npmjs.org/to-fast-properties/-/to-fast-properties-1.0.3.tgz",
"integrity": "sha1-uDVx+k2MJbguIxsG46MFXeTKGkc=",
"dev": true,
"optional": true
"dev": true
},
"try-catch": {
"version": "2.0.0",