---
title: Empowering Django with Background Workers
class: text-center
highlighter: shiki
mdc: true
monaco: false
themeConfig:
  primary: '#0c4b33'
---

# Empowering <logos-django class="[&>path]:fill-white! h-15 w-43"/> with Background Workers

## Jake Howard{.mt-8}

<ul class="list-none! [&>li]:m-0!">
  <li>Senior Systems Engineer @ Torchbox <mdi-fire class="fill-white"/></li>
  <li>Core, Security & Performance teams @ Wagtail <logos-wagtail class="fill-white"/></li>
</ul>

<ul class="list-none! text-sm [&>li]:m-0! mt-5">
  <li><mdi-earth /> theorangeone.net</li>
  <li><mdi-github /> @RealOrangeOne</li>
  <li><mdi-twitter /> @RealOrangeOne</li>
  <li><mdi-mastodon /> @jake@theorangeone.net</li>
</ul>

<div class="absolute left-3 bottom-3">
  <img src="/dceu24-qrcode.png" width="150px" />
</div>

<!--
- Hi
- I'm Jake
- Senior Systems Engineer at Torchbox
- I'm also on the security team, and as of last week the core team for Wagtail
- Leading Django-based CMS
- I exist in many places on the internet
- Here to talk about Background Workers
- What they are
- How to use them
- Exciting things _hopefully_ coming to Django
-->
---
layout: center
---
# Django is a web framework

```mermaid
flowchart LR
U(User 🧑‍💻)
D[\Django/]
U---->|Request|D
D---->|Response|U
```
<style>
.mermaid {
text-align: center;
}
</style>
<!--
- Django is a web framework
- It's a magic box which turns HTTP requests into HTTP responses
- What you do inside that box is up to you
- For something like a blog, that's probably as far as it needs to go
-->
---
layout: full
---
# Django isn't _just_ for websites

```mermaid
flowchart BT
    U[User 🧑‍💻]
    D[\Django/]
    DB[(Database)]
    E>Email]
    EA[External API]
    V[[Video Transcoding]]
    R[Reporting]
    ML((Machine<br>Learning))

    U<--->D

    D---DB
    D-..-E & EA & V & R & ML
```
<style>
.mermaid {
text-align: center;
}
</style>
<!--
- For a full web application, you need a little more than that
- Not just "keep information in a database"
- Notification emails
- Talk to external services
- Transcoding video
- Complex reporting
- It's 2024, so lots of ML
- For many of these, you need code which runs outside the magic box
- You don't want your user waiting whilst these happen
- If you had to wait whilst YouTube transcoded all your videos, you'd get pretty annoyed
-->
---
layout: full
---
<v-click>

# Background Workers?

</v-click>

```mermaid
flowchart BT
    U[User 🧑‍💻]
    D[\Django/]

    E>Email]
    EA[External API]
    V[[Video Transcoding]]
    R[Reporting]
    ML((Machine<br>Learning))

    B{{<strong>Background Worker</strong>}}

    U<-->D

    D-..-B

    B---E & EA & V & R & ML
```
<style>
.mermaid {
text-align: center;
}
</style>
<!--
- You need a background worker
- But[click]
- What _are_ background workers
- Let you offload complexity outside of the request-response cycle
- To be run somewhere else, potentially at a later date
- They keep requests nice and fast
- Move the slow bits somewhere else
- User doesn't have to wait
- Improves throughput and latency
-->
---
layout: section
---

## Background worker architecture

```mermaid
flowchart LR
    D[\Django/]
    S[(Queue Store)]
    R1{Runner}
    R2{Runner}
    R3{Runner}

    D<----->S<-....->R1 & R2 & R3
```
<!--
- How does this work?
- Web process submits a function to be run
- Stored in the queue store
- A runner then grabs a task, runs it, and returns the result to the queue store
- You can retrieve its status later if needed
-->
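---

The architecture on the previous slide can be sketched in a single process. This is a minimal illustration only, assuming a `queue.Queue` stands in for the queue store and threads stand in for runners; the names `enqueue`, `runner`, `results` and `run_demo` are made up for this sketch and are not any library's API. Real runners are usually separate processes talking to a database, Redis or a message broker.

```python
import queue
import threading

# Hypothetical in-memory "queue store": pending work goes in, outcomes come out.
task_queue = queue.Queue()
results = {}
results_lock = threading.Lock()


def enqueue(task_id, func, *args):
    """What the web process does: record the work and return immediately."""
    task_queue.put((task_id, func, args))


def runner():
    """What each runner does: take a task, run it, store the outcome."""
    while True:
        item = task_queue.get()
        if item is None:  # Sentinel: no more work for this runner
            task_queue.task_done()
            break
        task_id, func, args = item
        try:
            outcome = ("SUCCEEDED", func(*args))
        except Exception as exc:
            outcome = ("FAILED", repr(exc))
        with results_lock:
            results[task_id] = outcome
        task_queue.task_done()


def run_demo(runner_count=3):
    """Start some runners, enqueue five tasks, and wait for them all."""
    runners = [threading.Thread(target=runner) for _ in range(runner_count)]
    for thread in runners:
        thread.start()
    for task_id in range(5):
        enqueue(task_id, pow, task_id, 2)
    for _ in runners:
        task_queue.put(None)  # One sentinel per runner
    task_queue.join()
    for thread in runners:
        thread.join()
    return dict(results)
```

The caller of `enqueue` never waits for the work itself; it can look the outcome up in `results` later, which mirrors retrieving a task's status from the queue store.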
---
layout: section
---

# When?
<!--
- Background workers are a very useful tool
- But that doesn't mean they're useful for everything, all the time
- As with all great things, whether they're useful is "it depends"
- Trade-off between complexity and functionality
- If you're considering whether an action makes sense in the background:
- There are a few things to consider...
-->
---
layout: cover
background: https://images.unsplash.com/photo-1518729371765-043e54eb5674?q=80&w=1807&auto=format&fit=crop&ixlib=rb-4.0.3
---
# Does it take time?{.text-right}
<!--
- Does it take time
- Or _could_ it take time
- Don't want to make the user wait
- Unable to close the tab or do something else
- Go off and do it in the background, and let them know when it's done
- Even if that's by polling it in the browser
-->
---
layout: fact
---
## Does it leave your infrastructure?{.mb-5}

```mermaid
flowchart BT
    D[\Django/]
    subgraph Slow / Unreliable
        E>Email]
        EA[External API]
        V[[Video Transcode]]
        R[Reporting]
        ML((Machine<br>Learning))
    end
    subgraph Fast & Reliable
        DB[(Database)]
        C[(Cache)]
    end
    D---DB & C
    D-.-E & EA & V & R & ML
```
<!--
- Does control leave your infrastructure?
- The core components (Server, DB, Cache etc) you control and can closely monitor
- And are in a good position to fix it if something goes wrong
- That's not true for external APIs
- It's someone else's SRE team
- Their performance characteristics shouldn't affect your app
-->
---
layout: cover
background: https://images.unsplash.com/photo-1518770660439-4636190af475?q=80&w=3870&auto=format&fit=crop&ixlib=rb-4.0.3
---
# Specialized hardware?
<!--
- Maybe it's less about when, more about where?
- Maybe it's more about the hardware it runs on
- GPUs
- Loads of RAM
- External hardware
- Isolated network
-->
---
layout: image-right
image: https://images.unsplash.com/photo-1711606815631-38d32cdaec3e?q=80&w=2070&auto=format&fit=crop&ixlib=rb-4.0.3
class: text-xl
---

# Examples
- Compiling code
- Complex reporting
- File uploads
- Model training
- PDF generation
- Resizing images
- Sending email
- Transcoding video
- ... 🤯

<!--
- Background workers have a world of uses
- It's important to consider the scale when designing a feature
- It might be fine locally
- As your application grows, there'll be more data, so it'll likely take a lot longer
- The user shouldn't have to wait ages for your application
- They can get back on with their day
- Web servers can get back to processing other requests
- This list is quite long
- And it's nowhere near complete
- It's a very generic tool
- When designing an app at scale with these features
- Maybe consider moving it to the background.
-->
---
layout: section
---

# Background Workers in
<logos-django class="[&>path]:fill-white! h-fit w-60 -mt-20"/>

<!--
- Back to Django
- This is DjangoCon after all
- In Python and Django, there are lots of different frameworks to achieve background workers
-->
---
layout: image-right
image: https://images.unsplash.com/photo-1444703686981-a3abbc4d4fe3?q=80&w=1740&auto=format&fit=crop&ixlib=rb-4.0.3
---

# Libraries

- Celery<br><br>
- arq
- Django DB Queue
- Django Lightweight Queue
- Django Too Simple Q
- Django-Q
- Django-Q2
- Dramatiq
- Huey
- RQ
- Taskiq
- ...
<!--
- All require an external library
- And possibly some external infrastructure
- Celery is probably the biggest one
- But it's not all that exists
- So many different libraries exist
- With different strengths / weaknesses
- Different learning curves (or cliffs)
-->
---
layout: cover
background: https://images.unsplash.com/photo-1522096823084-2d1aa8411c13?q=80&w=1740&auto=format&fit=crop&ixlib=rb-4.0.3
---

## Example:

# Email <mdi-email-fast-outline />

<!--
- Let's look at an example, sending an email
- Very common functionality
- Let's imagine a CMS
- For totally unbiased reasons
- When a page is published, send an email to everyone subscribed
-->
---
layout: center
---

# Sending an email

```python {all|7|8|9-14|all}
from django.contrib.auth.models import User
from django.core.mail import send_mail
from django.template.loader import render_to_string

from wagtail.models import Page

for user in page.subscribers.iterator():
    email_content = render_to_string("notification-email.html", {"user": user, "page": page})
    send_mail(
        subject=f"A change to {page.title} has been published",
        message=email_content,
        from_email=None,  # Use the default sender email
        recipient_list=[user.email],
    )
```

<!--
- Here's the code we might write to do that
1. [click]Find the users to email
2. [click]Construct the email content
3. [click]Send the email
- [click]This works perfectly fine
- Scales _relatively_ well
- But has some issues
- If connecting to the email server takes a while, the user has to wait
- Usually only a few ms
- Might take a few seconds
- If something goes wrong with one email, the others won't send
- What if your email gateway is down altogether - do your requests start erroring?
- How do you handle it if they do?
- That web worker (eg gunicorn) can't process any other requests until this is done
-->
---
layout: center
---
```python {all|18|19|10|11-16|all|18-19|all}
from django.contrib.auth.models import User
from django.core.mail import send_mail
from django.template.loader import render_to_string

import django_rq
from wagtail.models import Page


def send_email_to_user(page: Page, user: User):
    email_content = render_to_string("notification-email.html", {"user": user, "page": page})
    send_mail(
        subject=f"A change to {page.title} has been published",
        message=email_content,
        from_email=None,  # Use the default sender email
        recipient_list=[user.email],
    )

for user in page.subscribers.iterator():
    django_rq.enqueue(send_email_to_user, page, user)
```
<!--
- Let's look at an example of how we might use background workers to help with this
- Use Django-RQ for this
1. [click]Find the users to email
2. [click]New: Start a task for each user
3. [click]Construct the email content
4. [click]Send the email
- [click]Most of this is exactly the same
- If you knew nothing of RQ, you could still maintain this code
- [click]Moving it to the background just quickly puts an item in the queue
- And then the user can get back on with their life
- Emails get sent out by the runners
- Multiple runners means they get sent out faster
- [click]Email sending is an easy action to move to the background
- It's a connection to an external API
- Variable latency
- Infrastructure you don't control
- All of that is simpler to handle when it's already running in the background
-->
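---

One benefit of a task per recipient is that failures stay isolated. The sketch below illustrates that idea in plain Python, assuming a hypothetical `send_email` stand-in that fails for one address; `send_inline` and `send_as_tasks` are illustrative names, and a real runner would execute each task in its own process rather than in a loop.

```python
def send_email(recipient):
    """Hypothetical stand-in for send_mail(); fails for one address."""
    if recipient == "bad@example.com":
        raise ConnectionError("email gateway rejected the message")
    return f"sent to {recipient}"


def send_inline(recipients):
    """Request-time loop: the first failure aborts everything after it."""
    sent = []
    for recipient in recipients:
        sent.append(send_email(recipient))
    return sent


def send_as_tasks(recipients):
    """One task per recipient: each outcome is recorded independently."""
    outcomes = {}
    for recipient in recipients:
        try:
            outcomes[recipient] = ("SUCCEEDED", send_email(recipient))
        except Exception as exc:
            outcomes[recipient] = ("FAILED", str(exc))
    return outcomes
```

With the inline loop, a failure on the second address means the third is never attempted and the request errors. With one task each, only the failing task is marked as failed.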
---
layout: center
---
# Using <span v-click.hide="1">RQ</span><span v-click="1"><s class="opacity-60">RQ</s> Celery</span>

````md magic-move
```python
from django.contrib.auth.models import User
from django.core.mail import send_mail
from django.template.loader import render_to_string

import django_rq
from wagtail.models import Page


def send_email_to_user(page: Page, user: User):
    email_content = render_to_string("notification-email.html", {"user": user, "page": page})
    send_mail(
        subject=f"A change to {page.title} has been published",
        message=email_content,
        from_email=None,  # Use the default sender email
        recipient_list=[user.email],
    )

for user in page.subscribers.iterator():
    django_rq.enqueue(send_email_to_user, page, user)
```
```python {all|7-9,20|all}
from django.contrib.auth.models import User
from django.core.mail import send_mail
from django.template.loader import render_to_string

from wagtail.models import Page

from my_celery_config import app

@app.task
def send_email_to_user(page: Page, user: User):
    email_content = render_to_string("notification-email.html", {"user": user, "page": page})
    send_mail(
        subject=f"A change to {page.title} has been published",
        message=email_content,
        from_email=None,  # Use the default sender email
        recipient_list=[user.email],
    )

for user in page.subscribers.iterator():
    send_email_to_user.delay(page, user)
```
````

<style>
.slidev-vclick-hidden {
display: none;
}
</style>

<!--
- There's something I just said which might end up causing issues
- You'll notice I said "Using RQ" in that example
- That's because each worker library has its own API
- Its own features
- Its own configuration
- Its own caveats / implementation details
- What if we wanted to use Celery instead?
- [click]Well, that's easy
- [click]Just change a few lines
- [click]But therein lies the problem
- You had to make some changes!
- Sure, they're small, but this is only a tiny amount of code
- What if you wanted to support both?
-->
---
layout: image
image: /situation.png
backgroundSize: 50%
---

<!--
- It's hard enough having multiple options
- But how do you choose between them?
- Maybe you have experience with libraries already
- Do you have the time (and patience) to test each one out?
- Maybe you already have a standard you need to work to
- Maybe you need specific features
- If you're new to Django, do you really want to spend the time weighing them all up?
- Knowing it could bite you as you grow or need a specific feature
- Requiring a lot of time refactoring in future
- What about library maintainers
- Like, say, Wagtail
- Do you write and maintain integrations for _all_ task libraries
- Do you choose the big one(s) and force your users' hands?
- Do you expose a hook and let your users integrate themselves?
- It adds a huge maintenance burden, whichever you choose
- There isn't really a right answer
-->
---
layout: image
image: /ridiculous.png
backgroundSize: 49%
---

<!--
- There _should_ be one universal standard which combines them all
- A single API to help developers use a library
- Without tying their hands
- First-party
- Allowing library developers to depend on it
- Instead of supporting every separate API
- Scale easily as your needs change
- Be easy to get started with for small projects
- But feature-packed for larger deployments
- Allowing easy stubbing out during tests
- Tests are important!
-->
---
layout: fact
---
## Introducing*:{.mb-5 .mt-3}

# `django.tasks`

<div class="absolute right-1/2 translate-x-1/2 top-3">
  <img src="/django-tasks-qrcode.png" width="150px" />
</div>

<!--
- In-progress API spec for first-party background workers in Django
-->
---
layout: image-right
image: https://images.unsplash.com/photo-1674027444485-cec3da58eef4?q=80&w=1932&auto=format&fit=crop&ixlib=rb-4.0.3
class: flex items-center text-xl
---

- API contract between library and application developers
- Swappable backends through `settings.py`
- Built-in implementations:
  - ORM
  - "Immediate"
  - "Dummy"
- Django 5.2 🤞
  - Backport for 4.2+

<!--
- An API contract between worker library maintainers and application developers
- Compatibility layer between Django and their native APIs
- Hopefully the promise of "Write once, run anywhere"
- Built-in implementations
- ORM based (production grade, building on the power of the ORM)
- "Immediate" (ie doesn't background anything) loaded by default
- Dummy (for testing)
- Hopefully landing in Django 5.2
- Backwards compatible with Django 4.2, to allow easy adoption
-->
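---

To make "swappable backends" concrete, here is a toy sketch of the idea in plain Python. It is not the `django.tasks` implementation: `ImmediateBackend`, `DummyBackend`, `use_backend` and the `task` decorator below are hypothetical stand-ins that only mirror the behaviours described above (run immediately, or record without running).

```python
class ImmediateBackend:
    """Runs the task straight away, in the calling process."""

    def enqueue(self, func, args, kwargs):
        return func(*args, **kwargs)


class DummyBackend:
    """Runs nothing; records what was enqueued so tests can assert on it."""

    def __init__(self):
        self.enqueued = []

    def enqueue(self, func, args, kwargs):
        self.enqueued.append((func.__name__, args, kwargs))
        return None


# Hypothetical stand-in for a setting: the active backend is chosen by name.
BACKENDS = {"immediate": ImmediateBackend(), "dummy": DummyBackend()}
_active = {"name": "immediate"}


def use_backend(name):
    """Switch backend without touching any task or call site."""
    _active["name"] = name


class Task:
    def __init__(self, func):
        self.func = func

    def enqueue(self, *args, **kwargs):
        # Application code only ever calls .enqueue(); the backend decides how.
        return BACKENDS[_active["name"]].enqueue(self.func, args, kwargs)


def task():
    """Decorator factory returning a Task wrapper around the function."""
    def decorator(func):
        return Task(func)
    return decorator


@task()
def greet(name):
    return f"Hello, {name}"
```

Calling `greet.enqueue("Jake")` runs the function under the immediate backend; after `use_backend("dummy")` the same call is only recorded. The call site does not change in either case, which is the point of the contract.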
---
layout: center
---
# <span v-click.hide="1">Using Celery</span><span v-click="1">Using <code>django.tasks</code></span>

````md magic-move
```python
from django.contrib.auth.models import User
from django.core.mail import send_mail
from django.template.loader import render_to_string

from wagtail.models import Page

from my_celery_config import app

@app.task
def send_email_to_user(page: Page, user: User):
    email_content = render_to_string("notification-email.html", {"user": user, "page": page})
    send_mail(
        subject=f"A change to {page.title} has been published",
        message=email_content,
        from_email=None,  # Use the default sender email
        recipient_list=[user.email],
    )

for user in page.subscribers.iterator():
    send_email_to_user.delay(page, user)
```
```python
from django.contrib.auth.models import User
from django.core.mail import send_mail
from django.template.loader import render_to_string

from wagtail.models import Page

from django.tasks import task

@task()
def send_email_to_user(page: Page, user: User):
    email_content = render_to_string("notification-email.html", {"user": user, "page": page})
    send_mail(
        subject=f"A change to {page.title} has been published",
        message=email_content,
        from_email=None,  # Use the default sender email
        recipient_list=[user.email],
    )

for user in page.subscribers.iterator():
    send_email_to_user.enqueue(page, user)
```
````

<style>
.slidev-vclick-hidden {
display: none;
}
</style>

<!--
- Let's look at the same code example as before
- This is tied to Celery
- If I want to support RQ too, I'd have to duplicate some parts
- Instead, let's write this once to use `django.tasks`[click]
- Still simple, clear, approachable and easy to use
- If I say so myself
- If we swapped to RQ: 0 lines need to change
- If a new library comes out, 0 lines need to change
- If this is in a library, not my own code, I'm not constrained by their preferences
- And the maintainer doesn't have extra work to support my preferences
- They can use what they like, I can use what I like
- For testing, I can use an in-memory backend
- With 0 lines changed
-->
---
layout: center
---

<v-click>

```python
# settings.py
EMAIL_BACKEND = "django.core.mail.backends.tasks.SMTPEmailBackend"
```

</v-click>

<br />

```python
from django.contrib.auth.models import User
from django.core.mail import send_mail
from django.template.loader import render_to_string

from wagtail.models import Page

for user in page.subscribers.iterator():
    email_content = render_to_string("notification-email.html", {"user": user, "page": page})
    send_mail(
        subject=f"A change to {page.title} has been published",
        message=email_content,
        from_email=None,  # Use the default sender email
        recipient_list=[user.email],
    )
```

<!--
- In this case, we can actually make it even easier
- Because email is such a common use case, and so easy to extract
- Let's go back to the simple implementation
- No background workers in sight
- [click]Instead, we can change the email backend
- Emails are magically sent in the background automatically
- Without additional work
-->
---
layout: image-right
image: /soon.png
class: flex justify-center text-2xl flex-col
---
# Q: Why something new?

<!--
- I'm sure you're thinking "Why something new?"
- Celery already has a borderline monopoly on task queues
- Writing a production-grade task queue is hard
- As I've been told whilst working on this DEP
- Why not just vendor something existing?
- If not Celery, then something else
- That's not really the goal
- Shared API contract is
- Interoperate with as opposed to replace
- Let you use the ecosystem which already exists
- The Django backends will hopefully become great
- But must be done with careful planning and consideration
- Django needs to remain the stable and reliable base it always has been
- Don't want to burn out the Django maintainers
-->
---
layout: image-right
image: https://images.unsplash.com/photo-1525683879097-8babce1c602a?q=80&w=1335&auto=format&fit=crop&ixlib=rb-4.0.3
class: flex justify-center text-xl flex-col
---

# Q: Why something built-in?

- Reduce barrier to entry
- Reduce cognitive load
- Reduce complexity for smaller projects
- Improve interoperability
- Use what's already there
- A common API

<!--
- Why should this be built-in?
- Being built-in reduces the barrier to entry
- Integrating becomes much simpler
- There's 1 API to learn
- It will last you a while
- Scale with your needs
- A developer can join a new project and already be productive
- A common API also helps library maintainers
- Maintaining a library is work enough
- Without needing to think about how to move code to the background
- If Django can take complexity off you, great
- Use the tools provided to you
- Currently, it's not really an option
- The burden is too great
- No additional dependencies for your library
- Just import from Django and you're set
- The user can use what they want
- Or what's suitable for their scale and use case
- Now the barrier is reduced, the ecosystem can flourish
- Libraries can assume background workers, without any additional burden
- The ORM backend should work for the majority of projects
- If you just want to send emails in the background, you probably don't need Celery or RQ
- It's overkill
- A vendored solution makes it the easiest to get started with
- Tweak some settings, run an extra process, and you're done.
-->
---
layout: center
transition: fade
---

![](/celery.svg){.h-32.mx-auto}

## vs

![](/postgres.png){.h-36.mx-auto}

<style>
.slidev-layout {
background: white;
color: black;
text-align: center;
}
</style>
<!--
- ORM at scale
- For some scales, an ORM-based worker might not be viable
- The Sentrys and Instagrams of the world
- Postgres scales pretty well, but sometimes not well enough
- And that's ok!
-->
---
layout: center
---
![](/elasticsearch.png){.h-32.mx-auto}

## vs

![](/postgres.png){.max-h-36.mx-auto}

<style>
.slidev-layout {
background: white;
color: black;
text-align: center;
}
</style>

<!--
- But the same is also true for Postgres FTS vs ElasticSearch
- A debate that's been going on for a while
- And I've had many times
- ElasticSearch is quite likely better for the ~10% of people who need it
- But that doesn't mean the other 90% of people won't be happy with PostgreSQL
- Probably wouldn't benefit from ElasticSearch anyway
- Definitely won't get a return on the extra hosting cost and complexity
- They'll be perfectly happy with Postgres FTS
- Let them get started the easiest way possible
- We can still invite them into ElasticSearch when they're ready
-->
---
layout: section
---
# Where are we now?
<!--
- I mean, other than Vigo
-->
---
layout: image
image: /dep.png
---
---
layout: section
---

# `pip install django-tasks`

<div class="absolute right-1/2 translate-x-1/2 top-3">
  <img src="/django-tasks-qrcode.png" width="150px" />
</div>

<!--
- You can play with this right now!
- Download it, play around with it
- The dummy backend is great for testing
- The immediate backend can help get you started
- The ORM backend is where the magic happens
- Tell me about all the bugs in my code
- The more testing we can do now, the better
- There's still work to do
- Features
- Improvements
- Performance
- Scalability
- etc etc
-->
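---

For anyone trying the backport, configuration is roughly a settings change. The fragment below is a sketch based on my understanding of the `django-tasks` package around the time of this talk; the module paths (`django_tasks.backends.immediate`, `.dummy`, `.database`) are assumptions that may differ between releases, so check the package README before copying them.

```python
# settings.py (sketch; module paths are assumptions, see the django-tasks README)
INSTALLED_APPS = [
    # ... your existing apps ...
    "django_tasks",
    "django_tasks.backends.database",  # Only needed for the ORM-backed backend
]

TASKS = {
    "default": {
        # Getting started: run tasks inline, no extra process needed.
        "BACKEND": "django_tasks.backends.immediate.ImmediateBackend",
        # Tests: record tasks without running them.
        # "BACKEND": "django_tasks.backends.dummy.DummyBackend",
        # ORM-backed queue: also requires a separate worker process.
        # "BACKEND": "django_tasks.backends.database.DatabaseBackend",
    }
}
```

Switching backend is then a one-line change here, with no changes to the task functions or the places that enqueue them.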
---
layout: section
---
# Where will we be _soon_™?

<!--
- More testing
- Upstreaming
- That's the big benefit
- Else it really is just another standard
- Once `django-tasks` is in a better state, it can become `django.tasks`
- Hopefully in time for the 5.2 release window
- Adoption
- The more people know about this, the better it is for everyone
- Developers can start working on integrating now
- Knowing they can trivially upgrade once it's in Django
-->
---
layout: cover
background: /celery.svg
---

# Is this the end?

<style>
.slidev-layout {
  background: white;
  background-size: contain !important;
}
</style>

<!--
- Is this the end for Celery and the like?
- Not at all!
- It's a great choice
- They have quite a head start
- This is much more about usability and flexibility
- If you need certain features, keep using them!
- Now you have the option of a Django-native API
- Which could even be Celery under the hood
-->
---
layout: image-right
image: https://images.unsplash.com/photo-1451187580459-43490279c0fa?q=80&w=1744&auto=format&fit=crop&ixlib=rb-4.0.3
class: flex justify-center flex-col text-xl
---

# Out of scope (_for now_)

- Completion / failed hooks
- Bulk queueing
- Automated task retrying
- Task runner API
- Unified observability
- Cron-based scheduling
- Task timeouts
- Swappable argument serialization
- ...

<!--
- The world of background workers is huge
- There are countless nice features
- Not everything is making it into the initial version(s)
- And that's ok!
- Existing libraries have a head start
- But I hope we can slowly catch up with them
- Bringing the stability and longevity guarantees that come with Django
- Doesn't mean they'll never come
- With your help, we can make these happen
-->
---
layout: cover
background: https://images.unsplash.com/photo-1519187903022-c0055ec4036a?q=80&w=1335&auto=format&fit=crop&ixlib=rb-4.0.3
---

# The future is bright

<!--
- The future is bright though
- In time, I see more and more people reaching for `django.tasks`
- And background workers in general
- Moving work to the background will make Django apps _seem_ faster
- Improve throughput
- Reduce latency
- Improve reliability
- Gone are the days of needing additional research and testing to find the tooling you need
- You can use the ones built into Django
- And as you scale, it's easy to change
- _without_ rewriting half your application
- With all the knowledge to make an informed decision
- A lower barrier helps everyone!
-->
---
layout: section
---
# What's next?
## `pip install django-tasks`
<div class="absolute right-1/2 translate-x-1/2 top-3">
<img src="/django-tasks-qrcode.png" width="150px" />
</div>

<!--
- Time to turn the dream into a reality!
- If you've realised you could use a background queue, give `django-tasks` a try
- Test it out
- Report back your issues
- Suggest improvements
- Issues / Discussions are open
- If you want to get involved, please do!
- There's plenty of work to do
- And I can't do it alone!
- If you maintain a worker library
- Have interesting use cases
- Or have been burned by one...
-->
---
layout: section
class: text-center text-2xl
---

# Let's chat!

<ul class="list-none! [&>li]:m-0!">
<li><mdi-earth /> theorangeone.net</li>
<li><mdi-github /> @RealOrangeOne</li>
<li><mdi-twitter /> @RealOrangeOne</li>
<li><mdi-mastodon /> @jake@theorangeone.net</li>
</ul>
<div class="absolute right-5 bottom-1/2 translate-y-1/2">
<h3>Me</h3>
<img src="/dceu24-qrcode.png" width="150px" />
</div>
<div class="absolute left-5 bottom-1/2 translate-y-1/2">
<h3><code>django-tasks</code></h3>
<img src="/django-tasks-qrcode.png" width="150px" />
</div>
<style>
.slidev-layout {
background-color: #17181c;
color: #e85537;
}
</style>
---
layout: end
---
END