[yadifa-users] Multi-master design

yadifa info at yadifa.eu
Tue Jul 19 16:12:42 CEST 2016


1    Introduction

This thread only contains information about YADIFA's multi-master design feature. A different section, elaborating on the triggers to switch from one primary name server to the next one, will be made available soon.

We value the input from our community, and if the community does not share our views, we are always willing to listen and consider making adjustments to improve the usability of our software.

If you do not agree with the reasoning presented in this thread, or have remarks (positive or negative), please contact us via our mailing list to continue the discussion.

In the next YADIFA release, we will update the multi-master chapter in the user manual for clarity purposes. Writing and amending documentation is an ongoing effort and your input is greatly appreciated.


2    YADIFA as a multi-master

First, a short explanation of how YADIFA sees zone files.
On a secondary name server, a zone can fall into one of 9 possible cases:

1.  True Master ON in the zone section of the secondary name server

* Zone data, where the primary name server uses Dynamic Updates to update content
* Zone data, where the primary name server uses Dynamic Updates to update content and the zone uses DNSSEC
* Zone data, where the primary name server does not use Dynamic Updates; the content is updated by reloading the zone data and incrementing the serial of the SOA resource record
* Zone data, where the primary name server does not use Dynamic Updates and the zone uses DNSSEC; the content is updated by reloading the zone data and incrementing the serial of the SOA resource record

2.  True Master OFF

* Zone data, where the primary name server uses Dynamic Updates to update content
* Zone data, where the primary name server uses Dynamic Updates to update content and the zone uses DNSSEC
* Zone data, where the primary name server does not use Dynamic Updates; the content is updated by reloading the zone data and incrementing the serial of the SOA resource record
* Zone data, where the primary name server does not use Dynamic Updates and the zone uses DNSSEC; the content is updated by reloading the zone data and incrementing the serial of the SOA resource record
* Zone data, where there is a single primary name server and intermediary masters.

These 9 possibilities can be divided into two subsections:

1.  True Master option is switched ON

2.  True Master option is switched OFF

When YADIFA receives a NOTIFY, it always communicates with the same primary name server to receive the changes (AXFR or IXFR). If YADIFA receives a NOTIFY containing an SOA resource record whose serial is lower than or equal to its own, it ignores the message.
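
As an illustration only, the serial check described above could be sketched as follows in Python. The function names are hypothetical, and the use of RFC 1982 serial number arithmetic (so that a serial which has wrapped around still counts as newer) is an assumption of this sketch, not a statement about YADIFA's internals.

```python
# Sketch only: decide whether a NOTIFY is worth acting on.
# Assumption: serials are compared with RFC 1982 serial number arithmetic.

SERIAL_BITS = 32
MOD = 1 << SERIAL_BITS          # serials live in [0, 2^32)
HALF = 1 << (SERIAL_BITS - 1)   # half of the serial number space


def serial_gt(a: int, b: int) -> bool:
    """Return True if serial a is strictly newer than serial b."""
    a %= MOD
    b %= MOD
    # a is newer when the forward distance from b to a is non-zero
    # and smaller than half of the serial space.
    return a != b and ((a - b) % MOD) < HALF


def should_transfer(notify_serial: int, local_serial: int) -> bool:
    """A NOTIFY with a lower or equal serial is ignored."""
    return serial_gt(notify_serial, local_serial)
```

With this sketch, should_transfer(2016071901, 2016071900) is true, while a NOTIFY carrying the same or an older serial returns false and is dropped.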

2.1     True Master ON

The true-master option in YADIFA is used for installations where the zone content of the primary name servers is not identical.

The reasons why the zone content for the same serial is not identical are beyond the scope of this thread.

Please Note:

In such scenarios, the same or a higher serial does not mean that the zone content is up-to-date.
Defining multiple primary name servers for a zone means that, if the secondary name server does not receive an answer from the primary name server for that zone, it will contact the next primary name server in its list of masters.
With true-master ON, a primary name server switch results in a full AXFR of the zone data, as the secondary name server cannot be sure about its content.
In all cases (dynamic or static, with or without DNSSEC), delaying a switch to a different primary name server reduces the amount of wasted resources while maintaining the highest operational performance.
You can adjust the number of connection retries to the primary name server accordingly. If, after 'X' retries, no connection can be established with the primary name server, the next primary name server in the list takes its place, resulting in an AXFR. What YADIFA considers to be a trigger for a failure is beyond the scope of this thread and is discussed in a different thread.
The design implemented in version 2.2.0 of YADIFA is considered to be optimal under these scenarios.
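
To make the cost of a switch concrete, here is a minimal Python sketch of how a secondary might choose the transfer type. The function plan_transfer and its parameters are hypothetical names. The rule that an incremental transfer can still be attempted after a switch when true-master is OFF is an assumption drawn from the "identical content" premise, and the rule that a secondary without a local copy always needs an AXFR is general DNS behaviour rather than something stated in this thread.

```python
# Sketch only: which transfer type to request from the selected master.

def plan_transfer(true_master: bool, master_changed: bool, have_zone: bool) -> str:
    """Return "AXFR" or "IXFR" for the next transfer.

    true_master    -- the true-master option for this zone is ON
    master_changed -- the secondary just switched to another primary
    have_zone      -- the secondary already holds a copy of the zone
    """
    if not have_zone:
        # Nothing to apply increments to.
        return "AXFR"
    if true_master and master_changed:
        # The local copy cannot be trusted to match the new master,
        # even at the same serial, so the whole zone is fetched again.
        return "AXFR"
    # Same master, or masters assumed identical: increments are enough.
    return "IXFR"
```

For example, plan_transfer(True, True, True) yields "AXFR", whereas plan_transfer(False, True, True) yields "IXFR".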


2.2     True Master OFF

In this case, YADIFA considers all the primary name servers with the same serial as having identical zone data.

Primary name servers using a dynamic zone file with DNSSEC:

In this case you REALLY cannot be sure, no matter the configuration, that the same SOA resource record serial corresponds to the same zone data. Jitter in signing the zone, re-signing of the zone, and dynamic updates are never applied at exactly the same time on all primary name servers. As a result, the content is not 100 percent the same on all the primary name servers. In this case, true-master ON is the best and only choice.

Please Note:

We are considering real primary name servers and not intermediary masters.

As an example, let's look at cases with a static zone file, both with and without DNSSEC and intermediary masters.

For the DNSSEC-enabled zone, we are assuming that all the signatures are pre-calculated and that the primary name server(s) are not responsible for maintaining the signatures, which would otherwise result in a scenario where true multi-master would be preferred.
When we are absolutely sure that the content is identical, the zone content could be updated more quickly by switching to the first primary name server from which a NOTIFY is received. As different setups (e.g. without BIND's ixfr-from-differences yes;) could result in an AXFR through an update, we have opted to leave control with the hostmaster, as different paths to the primary name servers may have different bandwidth restrictions and costs associated with them.
Additionally, for a hostmaster with thousands of zones to administer, the load can be distributed between different masters by simply rotating the primary name servers in the configuration.
In the case where the zone is guaranteed to contain identical content, the cost of the paths is irrelevant, the load on the primary name servers is dynamic, and there is a long delay between the updates on the primary name servers, requesting updates from the primary name server that sent the first NOTIFY has advantages. However, we opted for a single and consistent model for all cases.
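
The rotation idea mentioned above can be sketched in a few lines of Python. The zone names and addresses are documentation placeholders, and rotated_masters is a hypothetical helper, not part of YADIFA. It only shows how a hostmaster might generate a different master order per zone before writing the configuration.

```python
# Sketch only: give each zone a differently rotated list of masters so
# that the "first" master is spread evenly across zones.

def rotated_masters(masters: list, zone_index: int) -> list:
    """Return the master list rotated left by zone_index positions."""
    if not masters:
        return []
    shift = zone_index % len(masters)
    return masters[shift:] + masters[:shift]


# Placeholder data (assumptions for the example).
zones = ["a.example", "b.example", "c.example", "d.example"]
masters = ["192.0.2.1", "192.0.2.2", "192.0.2.3"]

# Zone number i starts with master number i modulo the number of masters.
plan = {zone: rotated_masters(masters, i) for i, zone in enumerate(zones)}
```

Here "a.example" and "d.example" start with 192.0.2.1, "b.example" with 192.0.2.2 and "c.example" with 192.0.2.3.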

Please Note:

To avoid flapping services, we have implemented a round-robin scheme. When the first primary name server is known to be bad (the threshold is configurable), we switch to the next primary name server in the list. When the hostmaster has addressed the issue and wants to switch back to the "first" primary name server, this can be done by issuing a config reload.
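
A minimal Python sketch of such a round-robin scheme follows. The class MasterSelector, its method names, and the interpretation of "known to be bad" as a number of consecutive failures are all assumptions made for illustration; what YADIFA actually counts as a failure is outside the scope of this thread.

```python
# Sketch only: sticky master selection with round-robin failover.

class MasterSelector:
    def __init__(self, masters, max_failures):
        if not masters:
            raise ValueError("at least one master is required")
        self.masters = list(masters)
        self.max_failures = max_failures  # hypothetical "bad" threshold
        self.index = 0                    # position of the current master
        self.failures = 0                 # consecutive failures so far

    @property
    def current(self):
        return self.masters[self.index]

    def report_success(self):
        # Any success clears the failure count; the master is kept.
        self.failures = 0

    def report_failure(self):
        """Record a failure; return True if the current master changed."""
        self.failures += 1
        if self.failures < self.max_failures:
            return False
        previous = self.index
        # Move to the next master, wrapping around at the end of the list.
        self.index = (self.index + 1) % len(self.masters)
        self.failures = 0
        return self.index != previous

    def reload(self, masters=None):
        # A config reload goes back to the first configured master.
        if masters is not None:
            self.masters = list(masters)
        self.index = 0
        self.failures = 0
```

The selector never moves back on its own: once it has left the first master, only reload() returns to it, which is what prevents flapping between masters.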



3    Conclusion

In a world where more and more zones are configured with DNSSEC, the overall benefits of our current model are clear.

In most cases described above, the current model is the only model that can be considered, and in the other cases there could be a slight improvement, depending on the delay between the updates of the master(s). However, in most real-world scenarios, this model provides a good balance between protection against failure of the primary name server (or network) and having the most current zone information. The consistent behavior also allows us to switch seamlessly and on-the-fly between dynamic zones, with and without DNSSEC.



4    Summary

1. On a multi-master where the zone content of all primary name servers is perfectly in sync (for a given serial), changing the "preferred" primary name server is virtually costless. In this setup, if the "first/preferred" one does not answer an SOA or IXFR query from the secondary name server (YADIFA), trying the next one is a trivial task.


2. On a multi-master where the zone content of the primary name servers is not in sync, slight differences may occur for different reasons (update chunk sizes, DNSSEC, ...) even if, in the end, the content should be the same. Changing the "first" is very costly as it requires a complete zone transfer. In this case, changing the master is only worthwhile when absolutely needed.

When designing the multi-master, we wanted to have YADIFA behave in a consistent manner for all cases.

When an update request is triggered by notification, there are a few choices:

1. Only query the primary name server that sent the notification. Doing this may trigger frequent primary name server changes, which we don't want for a true multi-master or when "expensive" paths exist.

2. Query all primary name servers and take the one that has the highest serial. Doing this may trigger frequent master changes, which we don't want for a true multi-master. In addition, we are doing more queries and disturbing more servers, which is not desirable for a slave with thousands of zones.

3. Query the master deemed as "primary" by the slave, changing only after a number of hiccups. There is little risk of changing the "first" without cause, which is what we want for true multi-master. The predetermined ordering of the masters allows admins to initially spread the load between all master servers simply by using a different order for each zone.
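
As a rough illustration of why choice 3 changes masters less often than choice 1, the following Python sketch counts master changes for both policies. The function names, the event lists, and the threshold of three consecutive failures are made up for the example and do not come from YADIFA.

```python
# Sketch only: count master changes under choice 1 and choice 3.

def switches_follow_notifier(notifiers, initial):
    """Choice 1: always move to the master that sent the NOTIFY."""
    current, switches = initial, 0
    for sender in notifiers:
        if sender != current:
            current = sender
            switches += 1
    return switches


def switches_sticky(query_ok, max_failures):
    """Choice 3: keep the current master until it fails max_failures
    times in a row; query_ok lists the outcome of each query to it."""
    switches = failures = 0
    for ok in query_ok:
        if ok:
            failures = 0
            continue
        failures += 1
        if failures >= max_failures:
            switches += 1
            failures = 0
    return switches


# Two masters sending NOTIFYs alternately cause a change every time
# under choice 1 ...
follow = switches_follow_notifier(["m2", "m1", "m2", "m1"], "m1")
# ... while a single hiccup causes none under choice 3.
sticky = switches_sticky([True, True, False, True], 3)
```

With true-master ON, every change counted by the first function would cost a full AXFR, which is the behaviour the chosen design avoids.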

Kind regards,

R&D Team
EURid VZW
