
Client automount fails occasionally due to DNS query failures

Category:
fas-systems (tags: DNS load balancing, BIND, 2009-176716)
Specialty:
nfs

Applies to

On-box DNS load balancing
 

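Issue

  • In a mount storm, an external Linux DNS server running BIND bind-9.9.9.4-69.el7 or later (the release that starts to support a TTL of 0) may forward DNS queries to an on-box DNS LIF that is not a listener
  • If the external DNS server forwards a DNS query to a non-listener LIF, ONTAP rejects the initial connection request
  • As a result, the client's DNS query fails
 
Example: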
 
//IP information
 
NFS client:
192.168.0.61
External DNS server IP:
192.168.0.188
Storage IPs:
192.168.0.134
192.168.0.135 (the only IP allowed to handle DNS queries; listen-for-dns-query: true)
192.168.0.136
192.168.0.137
192.168.0.139
 
//ONTAP configuration
 
cluster1::*> net int show -vserver svm2_cluster1 -fields listen-for-dns-query,dns-zone,address
(network interface show)
vserver       lif                   address       dns-zone                  listen-for-dns-query
------------- --------------------- ------------- ------------------------- --------------------
svm2_cluster1 lif_svm2_cluster1_433 192.168.0.135 storage_hostname.test.com true
svm2_cluster1 lif_svm2_cluster1_453 192.168.0.137 storage_hostname.test.com false
svm2_cluster1 lif_svm2_cluster1_583 192.168.0.134 storage_hostname.test.com false
svm2_cluster1 lif_svm2_cluster1_858 192.168.0.136 storage_hostname.test.com false
svm2_cluster1 lif_svm2_cluster1_996 192.168.0.139 storage_hostname.test.com false
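
The listener LIFs can be picked out of this output programmatically, which is handy when an SVM has many LIFs. The following is a minimal sketch, assuming the output has been saved as text with the five fields shown above; `listener_addresses` is a hypothetical helper name, not an ONTAP tool.

```python
def listener_addresses(cli_output):
    """Return the addresses of LIFs whose listen-for-dns-query value is true."""
    listeners = []
    for line in cli_output.splitlines():
        fields = line.split()
        # Data rows carry five columns: vserver, lif, address, dns-zone, flag.
        # The header and separator rows also split into five fields, so the
        # flag value itself is what identifies a listener row.
        if len(fields) == 5 and fields[4] == "true":
            listeners.append(fields[2])
    return listeners


# Rows copied from the example output in this article.
SAMPLE_OUTPUT = """\
vserver  lif   address  dns-zone  listen-for-dns-query
------------- --------------------- ------------- ----------------------- --------------------
svm2_cluster1 lif_svm2_cluster1_433 192.168.0.135 storage_hostname.test.com true
svm2_cluster1 lif_svm2_cluster1_453 192.168.0.137 storage_hostname.test.com false
svm2_cluster1 lif_svm2_cluster1_583 192.168.0.134 storage_hostname.test.com false
"""

print(listener_addresses(SAMPLE_OUTPUT))
```

For the sample rows this prints `['192.168.0.135']`, the only address that should ever receive forwarded DNS queries.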
 
///Zone file configuration
 
/var/named/test.zone
------------------------------------------
;TR guide setting
$ORIGIN storage_hostname.com.
@   IN   NS   storage_hostname.com.
IN   NS   ansible.seccae.com.
storage_hostname.test.com.  IN   A  192.168.0.135
 

///Log analysis

The NFS client returns SERVFAIL errors and automount failure messages

Sun Jun 26 15:14:06 UTC 2022
storage_hostname.test.com has address 192.168.0.134
Host storage_hostname.test.com not found: 2(SERVFAIL)
Host storage_hostname.test.com not found: 2(SERVFAIL)
Sun Jun 26 15:14:07 UTC 2022
Host storage_hostname.test.com not found: 2(SERVFAIL)

May 9 07:55:12 client_hostname automount[1364] : add_host_addrs: hostname lookup for storage_hostname failed: Name or service not known
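
To gauge how often lookups fail on a client, output like the above can be tallied. This is a rough sketch, assuming the text comes from repeated runs of the `host` command in the format shown; `summarize_host_output` is a hypothetical helper name.

```python
import re

# Patterns for the two kinds of result lines shown in the client output.
HAS_ADDRESS = re.compile(r"^(\S+) has address (\S+)$")
NOT_FOUND = re.compile(r"^Host (\S+) not found: \d+\((\w+)\)$")


def summarize_host_output(text):
    """Return (resolved addresses, failure counts keyed by response code)."""
    resolved = []
    failures = {}
    for line in text.splitlines():
        line = line.strip()
        match = HAS_ADDRESS.match(line)
        if match:
            resolved.append(match.group(2))
            continue
        match = NOT_FOUND.match(line)
        if match:
            code = match.group(2)
            failures[code] = failures.get(code, 0) + 1
    return resolved, failures


# Lines copied from the client output in this article.
SAMPLE = """\
Sun Jun 26 15:14:06 UTC 2022
storage_hostname.test.com has address 192.168.0.134
Host storage_hostname.test.com not found: 2(SERVFAIL)
Host storage_hostname.test.com not found: 2(SERVFAIL)
Sun Jun 26 15:14:07 UTC 2022
Host storage_hostname.test.com not found: 2(SERVFAIL)
"""

resolved, failures = summarize_host_output(SAMPLE)
print(resolved, failures)
```

In the sample, one lookup resolves to 192.168.0.134 and the three that follow fail with SERVFAIL.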

The named.run log on the external DNS server records info-level connection refused messages

26-Jun-2022 15:10:02.301 resolver: debug 1: fetch: storage_hostname.test.com/AAAA
26-Jun-2022 15:10:02.335 resolver: debug 1: fetch: storage_hostname.test.com/MX
26-Jun-2022 15:14:06.990 resolver: debug 1: fetch: storage_hostname.test.com/A
26-Jun-2022 15:14:07.004 resolver: debug 1: fetch: storage_hostname.test.com/AAAA
26-Jun-2022 15:14:07.005 lame-servers: info: connection refused resolving 'storage_hostname.test.com/AAAA/IN': 192.168.0.134#53
26-Jun-2022 15:14:07.005 resolver: debug 1: fetch: storage_hostname.test.com/AAAA
26-Jun-2022 15:14:07.005 query-errors: debug 1: client @0x7f5362edac20 192.168.0.61#60563 (storage_hostname.test.com): query failed (SERVFAIL) for storage_hostname.test.com/IN/AAAA at ../../../bin/named/query.c:8580
26-Jun-2022 15:14:07.005 lame-servers: debug 1: lame server resolving 'storage_hostname.test.com' (in 'storage_hostname.test.com'?): 192.168.0.188#53
26-Jun-2022 15:14:07.005 resolver: debug 1: fetch: storage_hostname.test.com/MX
26-Jun-2022 15:14:07.006 lame-servers: info: connection refused resolving 'storage_hostname.test.com/AAAA/IN': 192.168.0.134#53
26-Jun-2022 15:14:07.006 lame-servers: info: connection refused resolving 'storage_hostname.test.com/MX/IN': 192.168.0.134#53
26-Jun-2022 15:14:07.006 query-errors: debug 1: client @0x7f53600aa070 192.168.0.61#57471 (storage_hostname.test.com): query failed (SERVFAIL) for storage_hostname.test.com/IN/MX at ../../../bin/named/query.c:8580
26-Jun-2022 15:14:07.017 resolver: debug 1: fetch: storage_hostname.test.com/A
26-Jun-2022 15:14:07.017 lame-servers: info: connection refused resolving 'storage_hostname.test.com/A/IN': 192.168.0.134#53
26-Jun-2022 15:14:07.017 query-errors: debug 1: client @0x7f5362edac20 192.168.0.61#48698 (storage_hostname.test.com): query failed (SERVFAIL) for storage_hostname.test.com/IN/A at ../../../bin/named/query.c:8580
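
The connection refused entries name the address that rejected the query, so they can be checked against the listener list. The sketch below assumes the log format shown above and takes the listener set from this article's example configuration; `refused_targets` and `LISTENERS` are hypothetical names.

```python
import re
from collections import Counter

# Matches: connection refused resolving '<name>/<type>/IN': <ip>#<port>
REFUSED = re.compile(r"connection refused resolving '([^']+)': ([0-9.]+)#(\d+)")

# Assumption: the only LIF with listen-for-dns-query set to true.
LISTENERS = {"192.168.0.135"}


def refused_targets(log_text):
    """Count connection refused entries per target IP address."""
    counts = Counter()
    for line in log_text.splitlines():
        match = REFUSED.search(line)
        if match:
            counts[match.group(2)] += 1
    return counts


# Lines copied from the named.run excerpt in this article.
SAMPLE_LOG = """\
26-Jun-2022 15:14:07.005 lame-servers: info: connection refused resolving 'storage_hostname.test.com/AAAA/IN': 192.168.0.134#53
26-Jun-2022 15:14:07.005 lame-servers: debug 1: lame server resolving 'storage_hostname.test.com' (in 'storage_hostname.test.com'?): 192.168.0.188#53
26-Jun-2022 15:14:07.017 lame-servers: info: connection refused resolving 'storage_hostname.test.com/A/IN': 192.168.0.134#53
"""

counts = refused_targets(SAMPLE_LOG)
non_listeners = sorted(ip for ip in counts if ip not in LISTENERS)
print(dict(counts), non_listeners)
```

Any address reported in `non_listeners` received a forwarded query even though it is not configured to answer DNS; in the sample that is 192.168.0.134.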

The storage-side packet trace clearly records this behavior. Only 192.168.0.135 is the on-box DNS listener IP

72   2022-06-26 23:14:19.572835   6.154698   192.168.0.188   38078    192.168.0.135   53   DNS   Standard query 0x1d9c A storage_hostname.test.com OPT
73   2022-06-26 23:14:19.585266   0.012431   192.168.0.135   53   192.168.0.188   38078   DNS   Standard query response 0x1d9c A storage_hostname.test.com A 192.168.0.134 NS storage_hostname.test.com OPT
74   2022-06-26 23:14:19.586980   0.001714   192.168.0.188   33860    192.168.0.134   53   DNS   Standard query 0x88a9 AAAA storage_hostname.test.com OPT
75   2022-06-26 23:14:19.587027   0.000047   192.168.0.134   33860    192.168.0.188   53   ICMP   Destination unreachable (Port unreachable)

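The complete behavior can be examined in the packet trace captured on the external DNS server

  • A query succeeds
    • The client sends an A query to the external DNS server
    • The external DNS server forwards the query to the on-box DNS IP 192.168.0.135
    • On-box DNS returns the query response to the external DNS server (at this point, the external DNS server appears to remember IP 192.168.0.134)
    • The external DNS server returns the query response to the client (the client mounts the storage system using 192.168.0.134)
  • AAAA query fails
    • The client sends an AAAA query to the external DNS server
    • The external DNS server forwards the AAAA query to 192.168.0.134, a non-listener LIF
    • 192.168.0.134 replies to the external DNS server with Destination unreachable
    • The external DNS server responds to the client with query failed (SERVFAIL)
  • MX query fails
    • The client sends an MX query to the external DNS server
    • The external DNS server forwards the MX query to 192.168.0.134, a non-listener LIF
    • 192.168.0.134 replies to the external DNS server with Destination unreachable
    • The external DNS server responds to the client with query failed (SERVFAIL)
  • A query fails
    • The client sends an A query to the external DNS server
    • The external DNS server forwards the A query to 192.168.0.134, a non-listener LIF
    • 192.168.0.134 replies to the external DNS server with Destination unreachable
    • The external DNS server responds to the client with query failed (SERVFAIL)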

967   2022-06-26 23:14:06.989631   0.000501   192.168.0.61   36365    192.168.0.188   53   DNS   Standard query 0x6fea A storage_hostname.test.com
969   2022-06-26 23:14:06.990708   0.000557   192.168.0.188   38078    192.168.0.135   53   DNS   Standard query 0x1d9c A storage_hostname.test.com OPT
970   2022-06-26 23:14:07.003287   0.012579   192.168.0.135   53   192.168.0.188   38078    DNS   Standard query response 0x1d9c A storage_hostname.test.com A 192.168.0.134 NS storage_hostname.test.com OPT
971   2022-06-26 23:14:07.003852   0.000565   192.168.0.188   53   192.168.0.61   36365    DNS   Standard query response 0x6fea A storage_hostname.test.com A 192.168.0.134 NS storage_hostname.test.com
972   2022-06-26 23:14:07.004491   0.000639   192.168.0.61   60563    192.168.0.188   53   DNS   Standard query 0x47bc AAAA storage_hostname.test.com
973   2022-06-26 23:14:07.004892   0.000401   192.168.0.188   33860    192.168.0.134   53   DNS   Standard query 0x88a9 AAAA storage_hostname.test.com OPT
974   2022-06-26 23:14:07.005017   0.000125   192.168.0.134   33860    192.168.0.188   53   ICMP   Destination unreachable (Port unreachable)
975   2022-06-26 23:14:07.005451   0.000434   192.168.0.188   53   192.168.0.61   60563    DNS   Standard query response 0x47bc Server failure AAAA storage_hostname.test.com
976   2022-06-26 23:14:07.005710   0.000259   192.168.0.61   57471    192.168.0.188   53   DNS   Standard query 0xe348 MX storage_hostname.test.com
977   2022-06-26 23:14:07.005742   0.000032   192.168.0.188   59080    192.168.0.134   53   DNS   Standard query 0xde2d AAAA storage_hostname.test.com OPT
978   2022-06-26 23:14:07.005809   0.000067   192.168.0.134   59080    192.168.0.188   53   ICMP   Destination unreachable (Port unreachable)
979   2022-06-26 23:14:07.005901   0.000092   192.168.0.188   51515    192.168.0.134   53   DNS   Standard query 0x0a65 MX storage_hostname.test.com OPT
980   2022-06-26 23:14:07.005999   0.000098   192.168.0.134   51515    192.168.0.188   53   ICMP   Destination unreachable (Port unreachable)
981   2022-06-26 23:14:07.006199   0.000200   192.168.0.188   53   192.168.0.61   57471    DNS   Standard query response 0xe348 Server failure MX storage_hostname.test.com
982   2022-06-26 23:14:07.016776   0.010577   192.168.0.61   48698    192.168.0.188   53   DNS   Standard query 0x573a A storage_hostname.test.com
983   2022-06-26 23:14:07.017217   0.000441   192.168.0.188   46874    192.168.0.134   53   DNS   Standard query 0xf57a A storage_hostname.test.com OPT
984   2022-06-26 23:14:07.017343   0.000126   192.168.0.134   46874    192.168.0.188   53   ICMP   Destination unreachable (Port unreachable)
985   2022-06-26 23:14:07.017536   0.000193   192.168.0.188   53   192.168.0.61   48698    DNS   Standard query response 0x573a Server failure A storage_hostname.test.com
987   2022-06-26 23:14:07.027914   0.010016   192.168.0.61   53770    192.168.0.188   53   DNS   Standard query 0x4a57 A storage_hostname.test.com
989   2022-06-26 23:14:07.028215   0.000217   192.168.0.188   53   192.168.0.61   53770    DNS   Standard query response 0x4a57 Server failure A storage_hostname.test.com
995   2022-06-26 23:14:07.038529   0.009511   192.168.0.61   41057    192.168.0.188   53   DNS   Standard query 0x498a A storage_hostname.test.com
996   2022-06-26 23:14:07.038756   0.000227   192.168.0.188   53   192.168.0.61   41057    DNS   Standard query response 0x498a Server failure A storage_hostname.test.com
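
The same pattern can be pulled out of a text export of the trace by pairing each forwarded query with the ICMP error that answers it. This is a minimal sketch, assuming whitespace-separated rows in the column order used above (frame, date, time, delta, source, source port, destination, destination port, protocol, info) and assuming that the port columns on ICMP rows echo the ports of the rejected query, as they appear to in these rows. `refused_queries` is a hypothetical helper name.

```python
def refused_queries(trace_text, dns_server):
    """Return (target IP, record type) for forwarded queries rejected via ICMP."""
    pending = {}  # source port -> (destination IP, record type)
    refused = []
    for line in trace_text.splitlines():
        f = line.split()
        if len(f) < 10:
            continue
        src, sport, dst, dport, proto = f[4:9]
        info = f[9:]
        is_query = info[:2] == ["Standard", "query"] and info[2] != "response"
        if proto == "DNS" and src == dns_server and dport == "53" and is_query:
            # info looks like: Standard query 0x88a9 AAAA <name> OPT
            pending[sport] = (dst, info[3])
        elif proto == "ICMP" and "unreachable" in info and sport in pending:
            target, rtype = pending.pop(sport)
            if target == src:  # the error came from the queried address
                refused.append((target, rtype))
    return refused


# Rows copied from the external DNS server trace in this article.
SAMPLE_TRACE = """\
969   2022-06-26 23:14:06.990708   0.000557   192.168.0.188   38078    192.168.0.135   53   DNS   Standard query 0x1d9c A storage_hostname.test.com OPT
970   2022-06-26 23:14:07.003287   0.012579   192.168.0.135   53   192.168.0.188   38078    DNS   Standard query response 0x1d9c A storage_hostname.test.com A 192.168.0.134 NS storage_hostname.test.com OPT
973   2022-06-26 23:14:07.004892   0.000401   192.168.0.188   33860    192.168.0.134   53   DNS   Standard query 0x88a9 AAAA storage_hostname.test.com OPT
974   2022-06-26 23:14:07.005017   0.000125   192.168.0.134   33860    192.168.0.188   53   ICMP   Destination unreachable (Port unreachable)
983   2022-06-26 23:14:07.017217   0.000441   192.168.0.188   46874    192.168.0.134   53   DNS   Standard query 0xf57a A storage_hostname.test.com OPT
984   2022-06-26 23:14:07.017343   0.000126   192.168.0.134   46874    192.168.0.188   53   ICMP   Destination unreachable (Port unreachable)
"""

print(refused_queries(SAMPLE_TRACE, "192.168.0.188"))
```

For the sample rows, the query to 192.168.0.135 is answered normally, while the AAAA and A queries sent to 192.168.0.134 are both reported as refused.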

