Create a Fact class for storing facts

Include the URL of your launchpad blueprint:

https://blueprints.launchpad.net/congress/+spec/fact-datastructure

Today, the congress runtime stores facts as Rules. This is inefficient from a memory perspective since each Rule contains a Literal, which in turn contains a Term for each column. Each of these is a python object, and each of them also contains a few extra fields like head, body, location, negated, etc. Using Rules is also inefficient with CPU since congress needs to construct all these objects. This blueprint proposes to use a Fact data structure to store each fact. A Fact is a subclass of a native tuple, plus one field for table name. This is much more efficient memorywise and CPUwise than using a Rule because there are no extra objects like Literal and Term. Preliminary tests show a 10x reduction in CPU for initializing tables plus a 3x reduction in memory use.

Problem description

A detailed description of the problem:

  • Today, congress stores each fact as a Rule object

  • A Rule object contains many objects and fields

  • Many objects and fields means that creating and storing a fact uses lots of CPU and memory resources.

  • High CPU and memory use makes congress unable to scale to larger datasets.

Proposed change

We propose to create a new class called Fact to store each fact. A Fact is a native tuple plus one string for table name. Using a Fact will eliminate all the subfields and subobjects in Rule.

Alternatives

None

Policy

None

Policy Actions

None

Data Sources

None

Data model impact

None

REST API impact

None

Security impact

None

Notifications impact

None

Other end user impact

None

Performance impact

Preliminary testing shows a 10x reduction in CPU use and a 3x reduction in memory use in initialize_table() for 7M facts where the payload is 700MB.

Other deployer impact

None

Developer impact

A Theory object will internally contain Rules and Facts. The caller an insert a Fact into a RuleSet. However, whenever someone fetches the rules from a Theory the RuleSet converts Facts to Rules before returning them.

Implementation

Assignee(s)

Primary assignee:

ayip

Work items

  • Implement FactSet

  • Use FactSet inside of RuleSet

  • Change initialize_tables() to avoid instantiating a list of all facts coming from DSE

Dependencies

None

Testing

Add a unit test for FactSet

Documentation impact

None

References

None