Skip to content

SQLALchemy: One to many-relationships

March 7, 2022

When learning a new tech stack, the database layer is usually one of the harder and the most fun activities. In my current project, our services are written in Python, and most of them use PostgreSQL for storage. For Python, the most commonly used SQL library is SQLAlchemy. It is usually used as an Object–relational mapping (ORM). While I have used Python in the past, I haven’t used either SQLAlchemy or ORMs very much in production. In this article, I will explore how to model relationships in SQLAlchemy.

Note: Much of what I will cover is already covered by the SQLAlchemy documentation on basic relationships. Post is mainly for my own understanding.

One to many

A one to many-relationship in SQL means that a row in one table is linked to zero or more rows in another table. For example: One company (in the companies table) could have three employees (in the employees table).

Some things I expect from a one to many-relationship.

Let’s get started.

Tables

from sqlalchemy import Column, Integer, String, ForeignKey
from sqlalchemy.orm import declarative_base, relationship

Base = declarative_base()


class Company(Base):
    __tablename__ = 'companies'
    id = Column(Integer, primary_key=True)
    employees = relationship("Employee")


class Employee(Base):
    __tablename__ = 'employees'
    id = Column(Integer, primary_key=True)
    name = Column(String)
    company_id = Column(Integer, ForeignKey('companies.id'))

We have two models Company and Employee. They have a one to many-relationship.

The following code inserts a company with two employees. The relationship("Employee") on the Company model ensure that the employee-table will be updated as well.

with Session.begin() as session:
    company = Company()
    company.employees = [
        Employee(name='Magnus'),
        Employee(name='John')
    ]
    session.add(company)

Result

company-table

id
1

employees-table

idnamecompany_id
1Magnus1
2John1

Cascading deletes

If we are a ruthless manager, we can also do updates like firing all our employees (managers are stored in a separate table).

with Session.begin() as session:
    company = session.query(Company).get(1)
    company.employees = []

Unfortunately, the ruthless manager’s upper management can still see the traces of the employees in the dabase.

employees-table

idnamecompany_id
1MagnusNULL
2JohnNULL

If we want that these entries should be removed from that table when they are removed from the parent table, we need to change the relationship()

  class Company(Base):
      __tablename__ = 'companies'
      id = Column(Integer, primary_key=True)
--    employees = relationship("Employee")
++    employees = relationship("Employee", cascade="all, delete-orphan")

Now, if we run our code again, everything looks nice and clean.

employees-table

idnamecompany_id

Note: The employees will be deleted any time the association between them and a company is lost, i.e. also if the company row itself is deleted.

Back propagation

The last requirement from the list is back propagation. If a child entity is added to its table, will it be visible when the parent is queried?

with Session.begin() as session:
    session.add(Employee(name='Lisa', company_id=1))
    
with Session.begin() as session:
    company = session.query(Company).get(1)
    print(f"Company {company.id} has the following employees:")
    for e in company.employees:
        print("-", e.name)

Output

Company 1 has the following employees:
- Magnus
- John
- Lisa

Relationship loading techniques

In SQLAlchemy there are a few gotchas around how relationships are loaded. While I think in many scenarios should avoid complicated relationship updating technique (and if needed, should be covered by tests), it is worth knowing some things about it.

Example 1: Lazy loading (default)

In the scope of a single session, we can perform both the insert, and the listing of employees. This will work, because the company.employees is actually loaded lazily when it is first used.

with Session.begin() as session:
    session.add(Employee(name='Lisa', company_id=1))

    company = session.query(Company).get(1)
    print(f"Company {company.id} has the following employees:")
    for e in company.employees: # 
        print("-", e.name)

This can be observed by setting echo=True on the engine, like so engine = create_engine(url, echo=True), and running the script again.

Company 1 has the following employees:
+2022-04-03 15:35:45,680 INFO sqlalchemy.engine.Engine SELECT employees.id AS employees_id, employees.name AS employees_name, employees.company_id AS employees_company_id
+FROM employees
+WHERE %(param_1)s = employees.company_id
+2022-04-03 15:35:45,680 INFO sqlalchemy.engine.Engine [generated in 0.00021s] {'param_1': 1}
\- Magnus
\- John
\- Lisa

As you can see, SQLAlchemy will do an additional trip to the DB when looking up the company employees. In fact, company.employees will only be loaded once per session.

We can observe this by adding another employee after the first fetch employees is done:

with Session.begin() as session:
    # Add an employee
    session.add(Employee(name='Lisa', company_id=1))

    company = session.query(Company).get(1)
    print(f"Company {company.id} has {len(company.employees)} employees")

    # Add another employee
    session.add(Employee(name='Lisa\'s friend', company_id=1))

    company = session.query(Company).get(1)
    print(f"Company {company.id} has {len(company.employees)} employees")

Output

Company 1 has 3 employees
Company 1 has 3 employees

Example 2: Eager loading

We can run in to the same problem even earlier if we enable eager loading. Then SQL alchemy will load the relationship data directly when the parent is queried for, even if it is not used. This is typically done by the option lazy=‘joined’.

class Company(Base):
    __tablename__ = 'companies'
    id = Column(Integer, primary_key=True)
-    employees = relationship("Employee", cascade="all, delete-orphan")
+    employees = relationship("Employee", cascade="all, delete-orphan", lazy='joined')
with Session.begin() as session:
    company = session.query(Company).get(1) # Child data will be loaded here
    session.add(Employee(name='Lisa', company_id=1))

    print(f"Company {company.id} has {len(company.employees)} employees")

    # Add another employee
    session.add(Employee(name='Lisa\'s friend', company_id=1))
    
    print(f"Company {company.id} has {len(company.employees)} employees")

Output

Company 1 has 2 employees
Company 1 has 2 employees

So what can we do?

Both of these scenarios are unintuitive for me, and can certainly lead to bugs. SQLAlchemy has a few possible ways of handling proper back-propagation using backref. Exactly how backref works is also not intuitive, so I do not recommend using it without having a good understanding of SQLAlchemy.

Good luck 🙃!

Conclusion

We have fulfilled our initial expectations for the one-to-many relationships.

Working with SQLAlchemy is initially quite straightforward, and fun. The library is very powerful and versatile. You can pretty much do anything you want, and there are many things that can be fined tuned. The post above is just a tiny, tiny fraction of what can be done.

✅ Pros

❌ Cons

〰️ Both

Possible next steps

Resources