End-user troubleshooting of bad c-ares interaction with router

Nicholas Chammas nicholas.chammas at gmail.com
Tue Jan 23 17:25:01 CET 2024


Thank you for all the troubleshooting help, Brad.

I am using gRPC via Apache Spark Connect (a Python library), so I am two levels removed from c-ares itself. Looking in the Python virtual environment where gRPC is installed, I’m not sure what file to run otool on. The only seemingly relevant file I could find is called cygrpc.cpython-311-darwin.so, and otool didn’t turn up anything interesting on it.
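
For anyone else in the same position, here is a minimal sketch of how the compiled gRPC extension could be located from Python and handed to otool -L. It assumes the extension is importable as grpc._cython.cygrpc (which matches the cygrpc.cpython-311-darwin.so file mentioned above); the helper names find_module_file and linked_libraries are made up for illustration. If c-ares is compiled in statically, no c-ares library will show up in the output.

```python
import importlib.util
import shutil
import subprocess


def find_module_file(module_name):
    """Return the on-disk path of a module, or None if it has no file or is missing."""
    try:
        spec = importlib.util.find_spec(module_name)
    except (ImportError, ValueError):
        return None
    if spec is None or spec.origin in (None, "built-in", "frozen"):
        return None
    return spec.origin


def linked_libraries(path):
    """Return the lines printed by `otool -L path`, or None if otool is not installed."""
    if shutil.which("otool") is None:  # otool ships with the macOS developer tools
        return None
    result = subprocess.run(
        ["otool", "-L", path], capture_output=True, text=True, check=False
    )
    return result.stdout.splitlines()


if __name__ == "__main__":
    # Assumption: the gRPC Python package keeps its native code in this module.
    path = find_module_file("grpc._cython.cygrpc")
    print("extension:", path)
    if path:
        for line in linked_libraries(path) or []:
            print(line)
```

Run this with the interpreter from the virtual environment that Spark Connect uses, so that the same copy of gRPC is inspected.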

I will take this issue up with the gRPC folks.
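
In the meantime, the GRPC_DNS_RESOLVER=native workaround mentioned further down the thread can also be applied from Python rather than from the shell. This is only a sketch, and it assumes the variable has to be in the environment before gRPC is first imported, so it is set at the very top of the program.

```python
import os

# Assumption: no module that pulls in gRPC has been imported yet.
# "native" tells gRPC to use the system resolver instead of c-ares.
os.environ["GRPC_DNS_RESOLVER"] = "native"

# Import grpc (or the Spark Connect client) only after this point.
```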

I see in several places that the gRPC folks are using ares_gethostbyname:
https://github.com/grpc/grpc/blob/v1.60.0/src/core/lib/event_engine/ares_resolver.cc#L287-L293
https://github.com/grpc/grpc/blob/v1.60.0/src/core/ext/filters/client_channel/resolver/dns/c_ares/grpc_ares_wrapper.cc#L748-L758
https://github.com/grpc/grpc/blob/v1.60.0/src/core/ext/filters/client_channel/resolver/dns/c_ares/grpc_ares_wrapper.cc#L1075-L1086
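
To make Brad's point (quoted below) concrete, here is a toy simulation of the two behaviours: a resolver that needs replies for both A and AAAA, and one that is satisfied once at least one address family has answered. It is purely illustrative and is not how gRPC or c-ares are implemented; fake_query, make_queries, and resolve are made-up names, and 192.0.2.10 is a documentation address.

```python
import asyncio


async def fake_query(record_type, delay, answer):
    """Stand-in for one DNS query; answer=None means the reply never arrives."""
    if answer is None:
        await asyncio.Event().wait()  # the event is never set, so this waits forever
    await asyncio.sleep(delay)
    return record_type, answer


def make_queries():
    """One A query that is answered and one AAAA query that the router drops."""
    return [
        fake_query("A", 0.01, ["192.0.2.10"]),
        fake_query("AAAA", 0.01, None),
    ]


async def resolve(make_queries, timeout, require_all):
    """Wait up to `timeout` seconds, then apply one of the two policies."""
    tasks = [asyncio.ensure_future(q) for q in make_queries()]
    done, pending = await asyncio.wait(tasks, timeout=timeout)
    for task in pending:
        task.cancel()
    await asyncio.gather(*pending, return_exceptions=True)  # let cancellations finish
    if not done or (require_all and pending):
        raise TimeoutError("missing replies: %d" % len(pending))
    return dict(task.result() for task in done)


if __name__ == "__main__":
    # Policy 1: one answered address family is enough.
    print(asyncio.run(resolve(make_queries, 0.2, require_all=False)))
    # Policy 2: both families must answer, so the missing AAAA reply is fatal.
    try:
        asyncio.run(resolve(make_queries, 0.2, require_all=True))
    except TimeoutError as exc:
        print("lookup failed:", exc)
```

With the same two queries, the first policy returns the A record while the second fails, which matches the symptom in my logs.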


> On Jan 22, 2024, at 1:39 PM, Brad House <brad at brad-house.com> wrote:
> 
> Are you using gRPC installed via Homebrew, or is it bundled with something else?  Usually package maintainers like Homebrew will dynamically link to the system versions of dependencies so they can be updated independently.  You might be able to run otool -L on grpc to see which c-ares library it's picking up (and if none is listed, it might be compiled in statically).
> 
> That said, according to your grpc logs, it appears that grpc may itself be performing both A and AAAA queries and expecting responses to both.  I see the "A" reply come back, but the "AAAA" reply never arrives, and it bails at that point.  Many years ago c-ares didn't have a way to request both A and AAAA records with one query, but these days it does via ares_getaddrinfo(), which was recently enhanced with logic to assist in the exact scenario you are seeing: it stops retrying once at least one address family has been returned.
> 
> You might need to escalate this to the gRPC folks.
> 
> On 1/22/24 12:10 PM, Nicholas Chammas wrote:
>> Here’s the output of adig and ahost <https://gist.github.com/nchammas/a4c9873d8158c323796e9b47c064e63a#file-adig-ahost-txt>, both with and without the DNS servers set directly on the network interface (vs. just on the router).
>> 
>> I also learned that gRPC 1.60.0 may be using c-ares 1.19.1 <https://github.com/grpc/grpc/tree/v1.60.0/third_party/cares>, though again that’s just via looking at the gRPC source and not via some runtime query.
>> 
>> 
>>> On Jan 21, 2024, at 7:34 AM, Brad House <brad at brad-house.com> wrote:
>>> 
>>> I think homebrew distributes the 'adig' and 'ahost' utilities from c-ares.  Can you try using those to do the same lookup so we can see the results?
>>> 
>>> On 1/19/24 11:01 AM, Nicholas Chammas wrote:
>>>> 
>>>>> On Jan 17, 2024, at 3:38 PM, Brad House <brad at brad-house.com> wrote:
>>>>> What version of c-ares is installed?
>>>>> 
>>>> Sorry about the delay in responding. Answering this question is more difficult than I expected.
>>>> 
>>>> I know that Spark Connect is running gRPC 1.60.0. Looking through the gRPC repo, I see mention of c-ares 1.13.0 <https://github.com/grpc/grpc/blob/v1.60.0/cmake/cares.cmake#L42>, but I don’t know how that translates to my runtime. Homebrew tells me I have c-ares 1.25.0 installed, but again, I’m not sure if that’s what I’m actually running.
>>>> 
>>>> Is there a way I can directly query the version of c-ares being run via Spark Connect / gRPC? I asked this question on the gRPC forum <https://groups.google.com/g/grpc-io/c/3tZCa48Xvh8> but no response yet.
>>>> 
>>>> For the record, I know that c-ares is involved because if I tell gRPC to not use it (via GRPC_DNS_RESOLVER=native <https://github.com/grpc/grpc/blob/b34d98fbd47834845e3f9cdaa4aa706f1aa4eddb/doc/environment_variables.md>) then my problem disappears.
>>>>> What DNS servers are configured on your macOS system when it's not operating properly?  The output of "scutil --dns" would be helpful here.
>>>>> 
>>>> Here’s that output. <https://gist.github.com/nchammas/a4c9873d8158c323796e9b47c064e63a#file-scutil-dns-txt> I believe 192.168.1.1 is just my local router, and on there is where I have the default DNS servers set to 1.1.1.1 and 1.0.0.1.
>>>> 
>> 
