Examples
For each example, the complete source code is available on GitHub in the examples directory.
Example 1 - Multiple Data Sources
This is a simple example to show how DiffSync can be used to compare and synchronize multiple data sources.
For this example, we have a shared model for Device and Interface defined in models.py.
We also have three DiffSync instances based on the same model but with different values (BackendA, BackendB, and BackendC).
The source code for this example is on GitHub in the examples/01-multiple-data-sources/ directory.
First create and populate all 3 objects:
from backend_a import BackendA
from backend_b import BackendB
from backend_c import BackendC
# Create each backend, load its data, and print its contents
a = BackendA()
a.load()
print(a.str())
b = BackendB()
b.load()
print(b.str())
c = BackendC()
c.load()
print(c.str())
Configure verbosity of DiffSync’s structured logging to console; the default is full verbosity (all logs including debugging):
from diffsync.logging import enable_console_logging
enable_console_logging(verbosity=0) # Show WARNING and ERROR logs only
# enable_console_logging(verbosity=1) # Also include INFO logs
# enable_console_logging(verbosity=2) # Also include DEBUG logs
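The verbosity values in the comments above map onto the usual log severities. The sketch below illustrates that mapping with the standard library's logging levels only. It is not DiffSync's logging code, and `VERBOSITY_TO_LEVEL` and `is_shown` are hypothetical names introduced for illustration.

```python
import logging

# Mapping taken from the comments above: 0 shows WARNING and ERROR,
# 1 also shows INFO, 2 also shows DEBUG.
VERBOSITY_TO_LEVEL = {0: logging.WARNING, 1: logging.INFO, 2: logging.DEBUG}


def is_shown(verbosity, record_level):
    """Return True if a record of record_level is displayed at this verbosity."""
    return record_level >= VERBOSITY_TO_LEVEL[verbosity]


for verbosity in (0, 1, 2):
    shown = [
        logging.getLevelName(level)
        for level in (logging.DEBUG, logging.INFO, logging.WARNING, logging.ERROR)
        if is_shown(verbosity, level)
    ]
    print(verbosity, shown)
```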
Show the differences between A and B:
diff_a_b = a.diff_to(b)
print(diff_a_b.str())
Show the differences between B and C:
diff_b_c = c.diff_from(b)
print(diff_b_c.str())
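The two calls above use different methods, but they describe the same kind of comparison from opposite ends: a.diff_to(b) treats a as the source and b as the destination, and c.diff_from(b) treats b as the source and c as the destination. The toy sketch below illustrates that symmetry with plain sets. It does not use DiffSync, the function bodies are a simplification, and the "+" and "-" keys are an assumption of this sketch rather than the library's output format.

```python
# Toy stand-in, not DiffSync code: each "backend" is just a set of record names.
def diff_from(dest, source):
    """Describe what dest must gain ("+") or drop ("-") to match source."""
    return {"+": sorted(source - dest), "-": sorted(dest - source)}


def diff_to(source, dest):
    """Same comparison, written from the source's point of view."""
    return diff_from(dest, source)


backend_b = {"nyc-spine1", "nyc-spine2"}
backend_c = {"nyc-spine1", "sfo-spine1"}

from_b = diff_from(backend_c, backend_b)  # mirrors c.diff_from(b)
to_c = diff_to(backend_b, backend_c)      # mirrors b.diff_to(c)
print(from_b)
print(to_c == from_b)
```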
Synchronize A and B (update B with the contents of A):
a.sync_to(b)
print(a.diff_to(b).str())
# Alternatively you can pass in the diff object from above to prevent another diff calculation
# a.sync_to(b, diff=diff_a_b)
Now A and B will show no differences:
diff_a_b = a.diff_to(b)
print(diff_a_b.str())
In the Device model, the site_name and role are not included in the _attributes, so they are not shown when we are comparing the different objects, even if the value is different.
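To make this note concrete, here is a minimal pure-Python sketch of the idea. It does not use DiffSync: `compare_records` and the sample dictionaries are hypothetical stand-ins, and the only assumption is that fields take part in a comparison only when they are listed as attributes.

```python
# Simplified stand-in, not DiffSync code: only the fields named in
# `attributes` are compared, so other fields can differ without being reported.
def compare_records(source, target, attributes):
    """Return {field: (source_value, target_value)} for compared fields that differ."""
    return {
        field: (source.get(field), target.get(field))
        for field in attributes
        if source.get(field) != target.get(field)
    }


device_in_a = {"name": "nyc-spine1", "site_name": "nyc", "role": "spine"}
device_in_b = {"name": "nyc-spine1", "site_name": "sfo", "role": "leaf"}

# Neither site_name nor role is listed, so no difference is reported.
hidden = compare_records(device_in_a, device_in_b, attributes=())
# Listing role makes its difference visible; site_name is still ignored.
visible = compare_records(device_in_a, device_in_b, attributes=("role",))
print(hidden)
print(visible)
```

If you want such a field to appear in diffs, the corresponding change in a real model would be to list it in _attributes.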
Example 2 - Callback Function
This example shows how you can set up DiffSync to invoke a callback function to update its status as a sync proceeds. This could be used to, for example, update a status bar (such as with the tqdm library), although here for simplicity we’ll just have the callback print directly to the console.
The source code for this example is on GitHub in the examples/02-callback-function/ directory.
from diffsync.logging import enable_console_logging
from main import DiffSync1, DiffSync2, print_callback
enable_console_logging(verbosity=0) # Show WARNING and ERROR logs only
# Create a DiffSync1 instance and populate it with records numbered 1-100
ds1 = DiffSync1()
ds1.load(count=100)
# Create a DiffSync2 instance and populate it with 100 random records in the range 1-200
ds2 = DiffSync2()
ds2.load(count=100)
# Identify and attempt to resolve the differences between the two,
# periodically invoking print_callback() as DiffSync progresses
ds1.sync_to(ds2, callback=print_callback)
You should see output similar to the following:
diff: Processed 1/200 records.
diff: Processed 3/200 records.
...
diff: Processed 199/200 records.
diff: Processed 200/200 records.
sync: Processed 1/134 records.
sync: Processed 2/134 records.
...
sync: Processed 134/134 records.
A few points to note:
For each record in ds1 and ds2, either it exists in both, exists only in ds1, or exists only in ds2.
The total number of records reported during the "diff" stage is the sum of the number of records in both ds1 and ds2.
For this very simple set of models, the progress counter during the "diff" stage will increase at each step by 2 (if a corresponding pair of models is identified between ds1 and ds2) or by 1 (if a model exists only in ds1 or only in ds2).
The total number of records reported during the "sync" stage is the number of distinct records existing across ds1 and ds2 combined, so it will be less than the total reported during the "diff" stage.
By design for this example, ds2 is populated semi-randomly with records, so the exact number reported during the "sync" stage may differ for you.
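The counting rules in these notes can be reproduced with a small simulation. This is a sketch of the arithmetic only, not of how DiffSync walks its models: `simulate_progress` is a hypothetical helper, and the callback signature (stage, current, total) is assumed from the output format shown above.

```python
def print_callback(stage, current, total):
    """Print one progress line in the same shape as the output above."""
    print(f"{stage}: Processed {current}/{total} records.")


def simulate_progress(ids1, ids2, callback):
    """Replay the diff and sync counters for two sets of record IDs."""
    ids1, ids2 = set(ids1), set(ids2)
    all_ids = sorted(ids1 | ids2)

    # "diff" total: every record in ds1 plus every record in ds2.
    diff_total = len(ids1) + len(ids2)
    processed = 0
    for record_id in all_ids:
        # A matched pair accounts for two records, an unmatched record for one.
        processed += 2 if (record_id in ids1 and record_id in ids2) else 1
        callback("diff", processed, diff_total)

    # "sync" total: the number of distinct records across both sets.
    sync_total = len(all_ids)
    for count in range(1, sync_total + 1):
        callback("sync", count, sync_total)
    return diff_total, sync_total


diff_total, sync_total = simulate_progress({1, 2, 3, 4}, {3, 4, 5}, print_callback)
```

With these small sets, records 3 and 4 exist on both sides, so the "diff" counter jumps by 2 for each of them and the "sync" total is smaller than the "diff" total.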
Example 3 - Work with a remote system
This is a simple example to show how DiffSync can be used to compare and synchronize data with a remote system like Nautobot via a REST API.
For this example, we have a shared model for Region and Country defined in models.py.
A country must be part of a region and has an attribute to capture its population.
The comparison and synchronization of the datasets is done between a local JSON file and the public instance of Nautobot.
This example also shows:
How to set a global flag to ignore objects that do not match
How to provide a custom Diff class to change the ordering of a group of objects
The source code for this example is on GitHub in the examples/03-remote-system/ directory.
Install the requirements
To use this example you must have some dependencies installed. Please make sure to run:
pip install -r requirements.txt
Setup the environment
By default, this example interacts with the public sandbox of Nautobot at https://demo.nautobot.com, but you can use your own Nautobot instance by providing a new URL and a new API token through the environment variables NAUTOBOT_URL and NAUTOBOT_TOKEN:
export NAUTOBOT_URL="https://demo.nautobot.com"
export NAUTOBOT_TOKEN="aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"
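On the Python side, settings like these are typically read from the environment with a fallback to the defaults. The sketch below shows one way to do that. `nautobot_settings`, `DEFAULT_URL`, and `DEFAULT_TOKEN` are hypothetical names, the token default is a placeholder, and the example's actual main.py may read these variables differently.

```python
import os

DEFAULT_URL = "https://demo.nautobot.com"
DEFAULT_TOKEN = "a" * 40  # hypothetical placeholder, not a real credential


def nautobot_settings(environ=None):
    """Return (url, token), preferring NAUTOBOT_URL and NAUTOBOT_TOKEN if set."""
    env = os.environ if environ is None else environ
    url = env.get("NAUTOBOT_URL", DEFAULT_URL)
    token = env.get("NAUTOBOT_TOKEN", DEFAULT_TOKEN)
    return url, token


# With no overrides, the public sandbox defaults are used.
url, token = nautobot_settings({})
print(url)
```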
Try the example
The first time you run this example, a lot of changes should be reported between Nautobot and the local data because by default the demo instance doesn’t have the subregion defined.
After the first sync, on subsequent runs, the diff should show no changes.
At this point, DiffSync will be able to identify and fix all changes in Nautobot. You can add, update, or delete any country in Nautobot, and DiffSync will detect the change and correct it when run in sync mode.
DIFF
Compare the data between Nautobot and the local JSON file.
python main.py --diff
SYNC
Update the list of countries in Nautobot.
python main.py --sync
Example 4 - Using get or update helpers
This example expands on Example 1 and takes advantage of two helper methods on the DiffSync class: get_or_instantiate and update_or_instantiate.
Both methods act similarly to Django’s get_or_create function: they return the object together with a boolean that indicates whether the object was created. Let’s dive into each of them.
get_or_instantiate
The following arguments are supported: model (DiffSyncModel), ids (dictionary), and attrs (dictionary). The model and ids are used to find an existing object. If the object does not currently exist within the DiffSync adapter, it will then use model, ids, and attrs to add the object.
It will then return a tuple that can be unpacked.
obj, created = self.get_or_instantiate(Interface, {"device_name": "test100", "name": "eth0"}, {"description": "Test description"})
If the object already exists, created will be False; if the object had to be created, it will be True.
update_or_instantiate
This helper is similar to get_or_instantiate, but it will update an existing object or add a new instance with the provided ids and attrs. It accepts the same arguments, but attrs is required, whereas it is optional for get_or_instantiate.
obj, created = self.update_or_instantiate(Interface, {"device_name": "test100", "name": "eth0"}, {"description": "Test description"})
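The difference between the two helpers is easiest to see side by side. The following toy sketch mimics the behavior described above with a plain dictionary. `MiniAdapter` is a hypothetical stand-in, not the DiffSync class: it takes a model name string instead of a DiffSyncModel class and stores plain dictionaries instead of model instances.

```python
class MiniAdapter:
    """Toy in-memory store that mimics the two helpers described above."""

    def __init__(self):
        self.store = {}

    @staticmethod
    def _key(model, ids):
        # Objects are looked up by model name plus their identifiers.
        return (model, tuple(sorted(ids.items())))

    def get_or_instantiate(self, model, ids, attrs=None):
        """Return (obj, created); an existing object is returned unchanged."""
        key = self._key(model, ids)
        if key in self.store:
            return self.store[key], False
        self.store[key] = {**ids, **(attrs or {})}
        return self.store[key], True

    def update_or_instantiate(self, model, ids, attrs):
        """Return (obj, created); an existing object is updated with attrs."""
        key = self._key(model, ids)
        if key in self.store:
            self.store[key].update(attrs)
            return self.store[key], False
        self.store[key] = {**ids, **attrs}
        return self.store[key], True


adapter = MiniAdapter()
ids = {"device_name": "test100", "name": "eth0"}
first, created_first = adapter.get_or_instantiate("interface", ids, {"description": "Test description"})
second, created_second = adapter.get_or_instantiate("interface", ids, {"description": "Ignored"})
description_after_get = second["description"]
third, created_third = adapter.update_or_instantiate("interface", ids, {"description": "Updated"})
print(created_first, created_second, created_third, third["description"])
```

In this sketch the second get_or_instantiate call leaves the stored description untouched, while update_or_instantiate overwrites it.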
Example Walkthrough
We can take a look at the data we will be loading into each backend to understand why these helper methods are valuable.
Example Data
BACKEND_DATA_A = [
    {
        "name": "nyc-spine1",
        "role": "spine",
        "interfaces": {"eth0": "Interface 0", "eth1": "Interface 1"},
        "site": "nyc",
    },
    {
        "name": "nyc-spine2",
        "role": "spine",
        "interfaces": {"eth0": "Interface 0", "eth1": "Interface 1"},
        "site": "nyc",
    },
]
Example Load
def load(self):
    """Initialize the BackendA Object by loading some site, device and interfaces from DATA."""
    for device_data in BACKEND_DATA_A:
        device, instantiated = self.get_or_instantiate(
            self.device, {"name": device_data["name"]}, {"role": device_data["role"]}
        )
        site, instantiated = self.get_or_instantiate(self.site, {"name": device_data["site"]})
        if instantiated:
            device.add_child(site)
        for intf_name, desc in device_data["interfaces"].items():
            intf, instantiated = self.update_or_instantiate(
                self.interface, {"name": intf_name, "device_name": device_data["name"]}, {"description": desc}
            )
            if instantiated:
                device.add_child(intf)
These helper methods are useful here because several devices belong to the same site. Without them, as we iterate over the data and load it into the DiffSync adapter, we would have to handle an ObjectAlreadyExists exception each time we try to add a site we have already encountered, and possibly do the same for several other models, depending on how complex the synchronization between backends is.
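For comparison, here is roughly what the loading loop has to do when the duplicate case is handled by hand. This is a self-contained toy sketch: `MiniAdapter`, its `add` and `get` methods, and the local `ObjectAlreadyExists` class are stand-ins defined here for illustration, not the DiffSync classes, and only the site handling is shown.

```python
class ObjectAlreadyExists(Exception):
    """Local stand-in for the exception named in the text."""


class MiniAdapter:
    """Toy store whose add() refuses duplicates, forcing manual handling."""

    def __init__(self):
        self.store = {}

    def add(self, key, obj):
        if key in self.store:
            raise ObjectAlreadyExists(key)
        self.store[key] = obj

    def get(self, key):
        return self.store[key]


def load_sites(adapter, devices):
    """Add each device's site, falling back to a lookup on duplicates."""
    results = []
    for device in devices:
        key = ("site", device["site"])
        try:
            adapter.add(key, {"name": device["site"]})
            results.append((adapter.get(key), True))
        except ObjectAlreadyExists:
            # A previous device already added this site, so reuse it.
            results.append((adapter.get(key), False))
    return results


devices = [
    {"name": "nyc-spine1", "site": "nyc"},
    {"name": "nyc-spine2", "site": "nyc"},
]
adapter = MiniAdapter()
results = load_sites(adapter, devices)
print(results)
```

Every model that can appear more than once in the source data would need the same try/except block, which is the repetition the helper methods remove.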