Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

POC: OrderedSet implementation #25

Open
wants to merge 15 commits into
base: main
Choose a base branch
from

Conversation

DandyLyons
Copy link
Collaborator

@DandyLyons DandyLyons commented Sep 16, 2024

Note: This PR is not meant to be merged onto main. Rather it's meant to be merged onto another branch such as a poc-orderedset branch.

This PR is a proof of concept of a reimplementation of BijectiveDictionary using two OrderedSets rather than two Dictionarys.
Per our earlier conversation, I don't recommend this implementation until AFTER a 1.0 release.

Why OrderedSet is a good candidate

To maintain bijectivity, we need the following guarantees:

  1. the left values must be unique
  2. the right values must be unique
  3. each left value maps to one and only one right value (and vice versa)

An OrderedSet should be able to guarantee these characteristics. Since it's a set, it already guarantees uniqueness. (This guarantees 1 and 2).

Since it's ordered, it is indexed with an Int just like an Array. (In fact, the OrderedSet is implemented with an Array). If we can keep the left and right collections then we will be able to use the same index on both sides. Lookups on an OrderedSet are O(1) (just like lookups by index on an Array).

Finally, while two dictionaries is a good solution, with very fast lookup times, it is very wasteful when it comes to memory. The memory usage is roughly twice that of a vanilla Dictionary. With two ordered sets, we are storing each value only once, so we should be using roughly half as much memory compared to the current implementation.

Status

This PR is still in a very early state and I haven't yet proven that it is viable, but it is looking promising. There are some blockers, but I feel confident that we should be able to find solutions to these.

While there are definite potential benefits, including cutting our memory usage roughly in half, there are definitely some tradeoffs. For example, it seems likely that we would lose O(1) mutation (while keeping O(1) lookups). This still needs to be explored much more.

Current Blockers

UPDATE: Unblocked.

I've discovered that the OrderedSet API does not make it easy to update a value in place.
update(_:at:) requires that the new value be equal to the old value. I'm not sure what use case this would be useful in, but it certainly doesn't match our use case.

Potential Solutions

It may not be possible to update values in place. If so we may have to first remove the values, then append the updated values. If this were the case, it would be a major bummer. It would mean that we would prefer O(1) lookups, but would almost certainly lose O(1) mutation.

Mutating OrderedSet through it's UnorderedView

UPDATE: This approach seems unfeasible.
See: https://swiftpackageindex.com/apple/swift-collections/1.1.3/documentation/orderedcollections/orderedset/unorderedview

This view is O(1) for both the setter and getter.
This view should be conceptually equivalent to a Set since it is generic over SetAlgebra.

Mutating OrderedSet through it's Array View

UPDATE: This is the approach that I used.
The OrderedSet API allows you to mutate values in place using a view on its Array. However this requires some post-processing, and we'll probably lose O(1) mutation. We should expect O(n) mutation instead.

Other Potential Opportunities

OrderedSet has a few other properties that could be beneficial to us, and our users. It could even have an effect on the public API.

Ordered

It's conceivable that since OrderedSet is ordered, we could guarantee that BijectiveDictionary is ordered. I'm not quite sure if this is benefit, since a vanilla Dictionary is not ordered, but it's nice to know that it is a potential benefit.

Random Access Collections

Ordered sets are random-access collections. This could be beneficial for certain use cases.
For more info, see: https://swiftpackageindex.com/apple/swift-collections/1.1.0/documentation/orderedcollections/orderedset#Sequence-and-Collection-Operations

We might not need invariantCheck() anymore

I haven't yet figured out how I would implement invariantCheck and in fact I'm not even certain if a variant would be possible in this new implementation.

@DandyLyons
Copy link
Collaborator Author

I'll explore more later and push more changes.

@DandyLyons
Copy link
Collaborator Author

Updates

  • Added Hashable
  • Add removal
  • Add Collection

@DandyLyons
Copy link
Collaborator Author

Interesting Areas for Further Exploration

The new implementation has semantics like Dictionary but internal characteristics that are similar to Array, Set and Dictionary.

It may or may not be advantageous to add conformance to RangeReplaceableCollection, RandomAccessCollection and MutableCollection.

From: https://itwenty.me/posts/04-swift-collections/collection_hierarchy.png

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant