SQLAlchemy: Single Table Inheritance and One-to-One Relationships

While playing around in SQLAlchemy’s ORM API I stumbled over ambiguous foreign keys when a class contained relationship references to two different subclasses, both of which inherited from the same superclass under a single-table inheritance pattern.

My project was to build a personal finance application to track stock transactions. I built a Transaction superclass withBuyTransaction and SellTransaction as subclasses, and all Transaction objects would be backed by a single database table (i.e. use a single-table inheritance pattern). There would also be a Lot class. A Lot would be created and inserted when and only when a new BuyTransaction was inserted, and when a new SellTransaction was inserted, a query would return the earliest purchased Lot with an empty SellTransaction so that returned Lot could be updated to point to the new SellTransaction.

Because the table for Lot contained foreign keys for both the BuyTransaction and the SellTransaction, and because the data for BuyTransaction and SellTransaction lived on the same table, I ran into the following AmbiguousForeignKeysError:

sqlalchemy.exc.AmbiguousForeignKeysError: Could not determine join condition between parent/child tables on relationship Transaction.lot - there are multiple foreign key paths linking the tables.  Specify the 'foreign_keys' argument, providing a list of those columns which should be counted as containing a foreign key reference to the parent table.

In the steps below I walk you through the setup of the declarative models with each step increasing complexity of the models. If you’re eager to jump to the resolution of the AmbiguousForeignKeyErrors, skip to step 4.

Example 1: The Basic Relationship Model

For the first attempt, we’ll define the BuyTransaction, the SellTransaction and the Lot. Forget about inheritance for now; BuyTransaction and SellTransaction will derive directly from the declarative base class. We’ll also drop the requirement to create a Lot when and only when a BuyTransaction is created. We’re simply showing the simplest possible relationship between a BuyTransaction, a SellTransaction, and Lot.

Here’s the model declaration code:


from sqlalchemy import Column, ForeignKey, Integer
from sqlalchemy.orm import declarative_base, relationship
from sqlalchemy import create_engine

Base = declarative_base()


class BuyTransaction(Base):
    __tablename__ = 'buy_transaction'

    buy_transaction_id = Column(Integer, primary_key=True)
    purchased_lot = relationship("Lot",back_populates='buy_transaction',
                                 uselist=False)
class SellTransaction(Base):
    __tablename__ = 'sell_transaction'

    sell_transaction_id = Column(Integer, primary_key=True)
    sold_lot = relationship("Lot",back_populates='sell_transaction',
                                 uselist=False)

class Lot(Base):
    __tablename__ = 'lot'

    # Each lot has one non-null buy transaction and zero or one sell transactions

    buy_transaction_id = Column(Integer,
                                ForeignKey('buy_transaction.buy_transaction_id'),
                                primary_key=True)

    sell_transaction_id = Column(Integer,
                                 ForeignKey('sell_transaction.sell_transaction_id'))

    buy_transaction = relationship(BuyTransaction,
                                   back_populates='purchased_lot')
    sell_transaction = relationship(SellTransaction,
                                    back_populates='sold_lot')

    def __repr__(self):
        return "<Lot(buy_transaction={buy},sell_transaction={sell}>".format(
            buy=self.buy_transaction,sell=self.sell_transaction)

The driver code for this model is below. This code manually instantiates a BuyTransaction, SellTransaction, and Lot (1) and manually points the new lot to the buy– and sell_transaction (2).

engine = create_engine('sqlite+pysqlite:///:memory:')
Base.metadata.create_all(engine)

b = BuyTransaction()        #(1)
s = SellTransaction()
l = Lot()

l.buy_transaction = b       #(2)
l.sell_transaction = s

print('\n\nb: ',b,'\n')
print('s: ',s,'\n')
print('l: ',l,'\n')

A python executable to run the first example is here. The printed output shows that the new lot correctly points to the buy– and sell_transactions.

b:  <__main__.BuyTransaction object at 0x7fa86dc659a0> 

s:  <__main__.SellTransaction object at 0x7fa86da68280> 

l:  <Lot(buy_transaction=<__main__.BuyTransaction object at 0x7fa86dc659a0>,sell_transaction=<__main__.SellTransaction object at 0x7fa86da68280>> 

Example 2: Inserting a BuyTransaction Creates a Lot

Next let’s address the requirement that a new lot should be created when and only when a new buy_transaction is created. We’d like to honor the separation of concerns between our ORM model definitions and the database session. By using an event listener on database insertion for a buy_transaction we can identify when a buy_transaction is inserted to the database and use the listener callback to create and link a new Lot object.

from sqlalchemy import Column, ForeignKey, Integer
from sqlalchemy.orm import declarative_base, relationship, sessionmaker
from sqlalchemy import create_engine
from sqlalchemy import event

Base = declarative_base()

'''
Now, make it a little more interesting by creating the lot when a BuyTransaction is created.  
'''

class BuyTransaction(Base):
    __tablename__ = 'buy_transaction'

    buy_transaction_id = Column(Integer, primary_key=True)
    purchased_lot = relationship("Lot",back_populates='buy_transaction',
                                 uselist=False)

@event.listens_for(BuyTransaction, "after_insert")
def after_insert(mapper,connection,instance):
    connection.execute(instance.metadata.tables['lot'].insert(),
                       {"buy_transaction_id":instance.buy_transaction_id})

class SellTransaction(Base):
    __tablename__ = 'sell_transaction'

    sell_transaction_id = Column(Integer, primary_key=True)
    sold_lot = relationship("Lot",back_populates='sell_transaction',
                                 uselist=False)

class Lot(Base):
    __tablename__ = 'lot'

    # Each lot has one non-null buy transaction and zero or one sell transactions
    # Creation of a buy transaction should create a lot
    buy_transaction_id = Column(Integer,
                                ForeignKey('buy_transaction.buy_transaction_id'),
                                primary_key=True)

    sell_transaction_id = Column(Integer,
                                 ForeignKey('sell_transaction.sell_transaction_id'))

    buy_transaction = relationship(BuyTransaction,
                                   back_populates='purchased_lot')
    sell_transaction = relationship(SellTransaction,
                                    back_populates='sold_lot')

    def __repr__(self):
        return "<Lot(buy_transaction={buy},sell_transaction={sell}>".format(
            buy=self.buy_transaction,sell=self.sell_transaction)

We’ve added to the driver code for this example because the lot won’t be created until the buy_transaction is inserted into the database. For a single file containing the model definitions and driver code, please see here.

engine = create_engine('sqlite+pysqlite:///:memory:',future=True)
Base.metadata.create_all(engine)
DBSession = sessionmaker(bind=engine)
session=DBSession()

b = BuyTransaction()   #(3)
s = SellTransaction()
l = b.purchased_lot

print('\n\nb: ',b,'\n')
print('s: ',s,'\n')
print('l: ',l,'\n')

print('\n==================ADDING AND COMMITTING TO DATABASE===============\n')  #(4)
session.add_all([b,s])
session.commit()

l = b.purchased_lot   #(5)

print('l: ',l,'\n')

Note that before b and s are inserted into the database (3), l is None. After b and s are inserted (4), l exists and it points back to b (5). The example output is below.

b:  <__main__.BuyTransaction object at 0x7fd06edfcf40> 

s:  <__main__.SellTransaction object at 0x7fd06ec58040> 

l:  None 


==================ADDING AND COMMITTING TO DATABASE===============

l:  <Lot(buy_transaction=<__main__.BuyTransaction object at 0x7fd06edfcf40>,sell_transaction=None> 

Example 3: Inserting a SellTransaction Finds and References a Lot

Now we will add an insertion listener to SellTransaction so that when a sell transaction is created and inserted Lot can be queried to return an unsold lot to which the new sell transaction will point. Note that the implementation here is over simplified. Eventually we will want to select lots with the correct ticker, and we will need to make sure that the number of shares sold is equal to the number of shares in the lot. But these features will be easy once the data model is correct. Here is the updated model declaration code:

from sqlalchemy import Column, ForeignKey, Integer, Date
from sqlalchemy.orm import declarative_base, relationship, sessionmaker
from sqlalchemy import create_engine
from sqlalchemy import event
from sqlalchemy import select, update 

Base = declarative_base()

'''
Now, when we create a SellTransaction, find the first Lot with no buy transaction 
and point the lot's sell_transaction to this newly created SellTransaction.  
Also add 'transaction_date' field to BuyTransaction so that we can sort the 
Lots in order when attaching the SellTransaction.
Note we need to use after_insert so that IDs are assigned before we update Lot.
'''

class BuyTransaction(Base):
    __tablename__ = 'buy_transaction'

    transaction_date = Column(Date,nullable=False)                              #(6)
    buy_transaction_id = Column(Integer, primary_key=True)
    purchased_lot = relationship("Lot",back_populates='buy_transaction',
                                 uselist=False)

@event.listens_for(BuyTransaction, "after_insert")
def after_insert(mapper,connection,instance):
    connection.execute(instance.metadata.tables['lot'].insert(),
                       {"buy_transaction_id":instance.buy_transaction_id})

class SellTransaction(Base):
    __tablename__ = 'sell_transaction'

    sell_transaction_id = Column(Integer, primary_key=True)
    sold_lot = relationship("Lot",back_populates='sell_transaction',
                                 uselist=False)
    
@event.listens_for(SellTransaction, "after_insert")
def after_insert(mapper,connection,instance):
    # Find the earliest Lot with no SellTransaction
    tables = instance.metadata.tables
    lot_table = tables['lot']
    buy_transaction_table = tables['buy_transaction']
    qstmt = select(lot_table.c.buy_transaction_id). \                          #(7)
            where(lot_table.c.sell_transaction_id == None). \
            join_from(lot_table,buy_transaction_table). \            
            order_by(buy_transaction_table.c.transaction_date) 
    res = connection.execute(qstmt).first()
    update_stmt = update(lot_table).where(lot_table.c.buy_transaction_id == res[0]). \
            values(sell_transaction_id=instance.sell_transaction_id)
    connection.execute(update_stmt)

class Lot(Base):
    __tablename__ = 'lot'

    buy_transaction_id = Column(Integer,
                                ForeignKey('buy_transaction.buy_transaction_id'),
                                primary_key=True)

    sell_transaction_id = Column(Integer,
                                 ForeignKey('sell_transaction.sell_transaction_id'))

    buy_transaction = relationship(BuyTransaction,
                                   back_populates='purchased_lot')
    sell_transaction = relationship(SellTransaction,
                                    back_populates='sold_lot')

    def __repr__(self):
        return "<Lot(buy_transaction={buy},sell_transaction={sell}>".format(
            buy=self.buy_transaction,sell=self.sell_transaction)

Note that in (7) we added a transaction_date attribute to BuyTransaction. This is so that when we query Lot for a data member to associate with the new sell_transaction, the result is the lot with the earliest purchase date. Also, the query to find the correct lot (8) uses an explicit join condition onto the buy_transaction. Without the join condition we would not be able to reference the buy_transaction‘s transaction_date and thus sort our query results.

The driver code and example printout are below, and the executable can be found here.


engine = create_engine('sqlite+pysqlite:///:memory:')
Base.metadata.create_all(engine)
DBSession = sessionmaker(bind=engine)
session = DBSession()

import datetime
b = BuyTransaction(transaction_date=datetime.date(2021,9,1))
s = SellTransaction()
l = b.purchased_lot

print('\n\nb: ',b,'\n')
print('s: ',s,'\n')
print('l: ',l,'\n')

print('\n==================ADDING AND COMMITTING TO DATABASE===============\n')
session.add_all([b,s])
session.commit()

l = b.purchased_lot

print('l: ',l,'\n')

This time we see in the output that the new sell transaction successfully linked to the lot when the sell_transaction was inserted into the database.

b:  <__main__.BuyTransaction object at 0x7f3ee9a48ee0> 

s:  <__main__.SellTransaction object at 0x7f3ee98928e0> 

l:  None 


==================ADDING AND COMMITTING TO DATABASE===============

l:  <Lot(buy_transaction=<__main__.BuyTransaction object at 0x7f3ee9a48ee0>,sell_transaction=<__main__.SellTransaction object at 0x7f3ee98928e0>> 

Step 4: Inheriting from a Transaction Superclass

The last step in the implementation is to make BuyTransaction and SellTransaction subclasses of a Transaction superclass. We will do this using the single-table inheritance pattern provided by sqlalchemy.

The first step is to update the model declarations to show the inheritance relationships:

class Transaction(Base):
    __tablename__ = 'transaction'
    
    transaction_id = Column(Integer, primary_key=True)
    date = Column(Date,nullable=False)
    type = Column(String(20))

    # 
    __mapper_args__ = {
        'polymorphic_on':type
    }

class BuyTransaction(Transaction):

    purchased_lot = relationship("Lot",back_populates='buy_transaction',
                                 uselist=False)

    __mapper_args__ = {
        'polymorphic_identity': 'buy_transaction'
    }


class SellTransaction(Transaction):
    
    sold_lot = relationship("Lot",back_populates='sell_transaction',
                                 uselist=False)
    
    __mapper_args__ = {
        'polymorphic_identity': 'sell_transaction'
    }

But we’re not quite done! If we left it at this, we’d return an AmbiguousForeignKeysError:

sqlalchemy.exc.AmbiguousForeignKeysError: Could not determine join condition between parent/child tables on relationship BuyTransaction.purchased_lot - there are multiple foreign key paths linking the tables.  Specify the 'foreign_keys' argument, providing a list of those columns which should be counted as containing a foreign key reference to the parent table.

This error happens because now that BuyTransaction and SellTransaction share a single table, there are two columns on the lot_table that contain foreign keys from the transaction_table. We need to explicitly state which column from lot_table corresponds to the transaction_id for a buy_transaction, and which column from lot_table corresponds to the foreign key for the sell_transaction. We do this with the foreign_keys parameter on relationship. Here is the fixed code showing explicit foreign keys for both BuyTransaction and SellTransaction:

class Transaction(Base):
    __tablename__ = 'transaction'
    
    transaction_id = Column(Integer, primary_key=True)
    date = Column(Date,nullable=False)
    type = Column(String(20))

    # 
    __mapper_args__ = {
        'polymorphic_on':type
    }

class BuyTransaction(Transaction):

    purchased_lot = relationship("Lot",back_populates='buy_transaction',
                                 foreign_keys='lot.c.buy_transaction_id',
                                 uselist=False)

    __mapper_args__ = {
        'polymorphic_identity': 'buy_transaction'
    }

@event.listens_for(BuyTransaction, "after_insert")
def after_insert(mapper,connection,instance):
    connection.execute(instance.metadata.tables['lot'].insert(),
                       {"buy_transaction_id":instance.transaction_id})

class SellTransaction(Transaction):
    
    sold_lot = relationship("Lot",back_populates='sell_transaction',
                            foreign_keys='lot.c.sell_transaction_id',
                            uselist=False)
    
    __mapper_args__ = {
        'polymorphic_identity': 'sell_transaction'
    }

We are almost there! If we stopped here, we’d run in to this error:

sqlalchemy.exc.AmbiguousForeignKeysError: Can't determine join between 'lot' and 'transaction'; tables have more than one foreign key constraint relationship between them. Please specify the 'onclause' of this join explicitly. 

This error comes from the query on Lot that we execute when we’re trying to find an unsold lot with the earliest buy_transaction transaction_date, which currently is written as:

qstmt = select(lot_table.c.buy_transaction_id). \
        where(lot_table.c.sell_transaction_id == None). \
        join_from(lot_table,transaction_table). \
        order_by(transaction_table.c.date) 

Again, the query requires a join from the lot_table to the transaction_table, and we need to explicitly state which column on lot_table the join should use. Use the onclause parameter in join_from to specify the relationship:

qstmt = select(lot_table.c.buy_transaction_id). \
           where(lot_table.c.sell_transaction_id == None). \
           join_from(lot_table,transaction_table,
             onclause=lot_table.c.buy_transaction_id==transaction_table.c.transaction_id). \
           order_by(transaction_table.c.date) 

The complete example with driver code is as follows, and the file can be downloaded from here.

from sqlalchemy import Column, ForeignKey, Integer, Date, String
from sqlalchemy.orm import declarative_base, relationship, sessionmaker
from sqlalchemy import create_engine
from sqlalchemy import event
from sqlalchemy import select, update 

Base = declarative_base()

'''
Last step:  buy transaction and sell transaction are subclasses of transaction. i

Use Single Table Inheritance pattern (https://docs.sqlalchemy.org/en/14/orm/inheritance.html?highlight=single%20table%20inheritance#single-table-inheritance)
'''

class Transaction(Base):
    __tablename__ = 'transaction'
    
    transaction_id = Column(Integer, primary_key=True)
    date = Column(Date,nullable=False)
    type = Column(String(20))

    # 
    __mapper_args__ = {
        'polymorphic_on':type
    }

class BuyTransaction(Transaction):

    purchased_lot = relationship("Lot",back_populates='buy_transaction',
                                 foreign_keys='lot.c.buy_transaction_id',
                                 uselist=False)

    __mapper_args__ = {
        'polymorphic_identity': 'buy_transaction'
    }

@event.listens_for(BuyTransaction, "after_insert")
def after_insert(mapper,connection,instance):
    connection.execute(instance.metadata.tables['lot'].insert(),
                       {"buy_transaction_id":instance.transaction_id})

class SellTransaction(Transaction):
    
    sold_lot = relationship("Lot",back_populates='sell_transaction',
                            foreign_keys='lot.c.sell_transaction_id',
                            uselist=False)
    
    __mapper_args__ = {
        'polymorphic_identity': 'sell_transaction'
    }

@event.listens_for(SellTransaction, "after_insert")
def after_insert(mapper,connection,instance):
    # Find the earliest Lot with no SellTransaction
    tables = instance.metadata.tables
    lot_table = tables['lot']
    transaction_table = tables['transaction']
    
    '''
    sqlalchemy.exc.AmbiguousForeignKeysError: Can't determine join between 'lot' and 'transaction'; tables have more than one foreign key constraint relationship between them. Please specify the 'onclause' of this join explicitly.
    '''

    qstmt = select(lot_table.c.buy_transaction_id). \
            where(lot_table.c.sell_transaction_id == None). \
            join_from(lot_table,transaction_table,
                      onclause=lot_table.c.buy_transaction_id==transaction_table.c.transaction_id). \
            order_by(transaction_table.c.date) 
    res = connection.execute(qstmt).first()
    update_stmt = update(lot_table).where(lot_table.c.buy_transaction_id == res[0]). \
            values(sell_transaction_id=instance.transaction_id)
    connection.execute(update_stmt)

''' Lot is like an edge connecting two transactions'''
class Lot(Base):
    __tablename__ = 'lot'

    # Each lot has one non-null buy transaction and zero or one sell transactions
    # Creation of a buy transaction should create a lot
    buy_transaction_id = Column(Integer,
                                ForeignKey('transaction.transaction_id'),
                                primary_key=True)

    sell_transaction_id = Column(Integer,
                                 ForeignKey('transaction.transaction_id'))



    buy_transaction = relationship(BuyTransaction,
                                   back_populates='purchased_lot',
                                  foreign_keys=buy_transaction_id)
    sell_transaction = relationship(SellTransaction,
                                    back_populates='sold_lot',
                                   foreign_keys=sell_transaction_id)

    def __repr__(self):
        return "<Lot(buy_transaction={buy},sell_transaction={sell}>".format(
            buy=self.buy_transaction,sell=self.sell_transaction)



engine = create_engine('sqlite+pysqlite:///:memory:')
Base.metadata.create_all(engine)
DBSession = sessionmaker(bind=engine)
session = DBSession()

import datetime
b = BuyTransaction(date=datetime.date(2021,9,1))
s = SellTransaction(date=datetime.date(2021,9,8))
l = b.purchased_lot

print('\n\nb: ',b,'\n')
print('s: ',s,'\n')
print('l: ',l,'\n')

print('\n==================ADDING AND COMMITTING TO DATABASE===============\n')
session.add_all([b,s])
session.commit()

l = b.purchased_lot

print('l: ',l,'\n')