[Mobile/Russia] dns_probe_finished_nxdomain when visiting lm.facebook.com #1518

Open
nickcarterney opened this issue May 2, 2024 · 22 comments

Comments

@nickcarterney

nickcarterney commented May 2, 2024

Shadowsocks version: 1.17.2 - 1.18.0 - 1.18.1 - 1.18.3
Client device: iPhone (iOS)
Server: Ubuntu 23 (Vultr and OVH)
Config:

{
    "server": "207.148.90.219",
    "mode": "tcp_and_udp",
    "server_port": 55837,
    "local_port": 1080,
    "password": "d0c222d5bd70ab16f4df9dd3caa70a54",
    "timeout": 60,
    "method": "chacha20-ietf-poly1305",
    "fast_open": true,
    "nameserver": "8.8.8.8"
}

When I click a link on Facebook, it returns an error screen. When I copy the link into the browser, it says dns_probe_finished_nxdomain. This only occurs on the 4G network (OctopusNet Ltd), not on Wi-Fi (a different ISP: Zelenaya Vladivostok Network).

There are no errors in the log. I tried ping and traceroute to l.facebook.com and lm.facebook.com from my VPS, and both still responded.

https://l.facebook.com/l.php?u=https%3A%2F%2Fbaonga.com%2Fuav-va-ten-lua-nga-pha-huy-trung-tam-hau-can-cua-ukraine-o-odessa.html%3Ffbclid%3DIwZXh0bgNhZW0CMTAAAR1tMfKhQpRQwdEeWKPs6bkHYpaxsUCaL-9gWUmLBkJx_XlMKMjG37Abphg_aem_AUQ06mMNOEoxacNHH9qatPBT50R57-qs79wnQvKI_ufWHY6MrCxo5BXActIIHjclcK_JzP5U8l1Zu8KVBZyW6nK1&h=AT3TRiH86M5iyQke2wCwr_-mBUpipxOI9uTm1P_hdeM7MeI-k3DQT6VoQd1Ku8kHa3sUPQVmVLWBIRwos53RE_5Cf2L9R4k-Z846hAClXIowZKwbTG0EkCEwHxwKsRfu1Zea

@zonyitoo
Collaborator

zonyitoo commented May 2, 2024

I think the DNS query should be handled by your iOS app, which serves as the local Shadowsocks client. So I think you should first open this issue in the iOS app's repository.

@nickcarterney
Author

nickcarterney commented May 2, 2024

I think the DNS query should be handled by your iOS app, which serves as the local Shadowsocks client. So I think you should first open this issue in the iOS app's repository.

But this issue only happens on the 4G network, not on other networks. It only affects lm.facebook.com; Facebook's other domains still work well. And it only happens in Russia.

In Vietnam, Finland, and Thailand it still works.

@zonyitoo
Collaborator

zonyitoo commented May 5, 2024

If you suspect something is happening on the server side, you could run ssserver with -vvv and see what happens at the moment you see errors on your mobile phone.

@frozzway

frozzway commented May 6, 2024

If you suspect something is happening on the server side, you could run ssserver with -vvv and see what happens at the moment you see errors on your mobile phone.

I have also recently been experiencing issues accessing most sites when connecting to the SS server over a mobile network in Russia.
How do I force the SS Docker container to run with the -vvv option?

@zonyitoo
Collaborator

zonyitoo commented May 6, 2024

You can run whatever command you want with docker run, right?

@frozzway

frozzway commented May 6, 2024

You can run whatever command you want with docker run, right?

Ok, I've figured it out.

I have captured logs in -vvv mode and would appreciate your help in determining the issue.

https://gist.github.com/frozzway/0a5c84739e75770b114268c449ff417c

And some more:
https://gist.github.com/frozzway/608c0f6edd7e9639df616c21ab6d684c

Chrome says "This site can't be reached, unexpectedly closed the connection"

For now I had to deploy WireGuard in parallel. It works fine, by the way.

@zonyitoo
Collaborator

zonyitoo commented May 6, 2024

From the first log, I can see that a tunnel to ya.ru was established. So your browser still said that the site (ya.ru) cannot be reached?

@zonyitoo
Collaborator

zonyitoo commented May 6, 2024

In the second one, I can see connections to 149.154.167.41:5222, graph.facebook.com:443, 1.1.1.1:853, and chrome.cloudflare-dns.com:443. They were all established and finished successfully.

For example, lines like this:

DEBUG   tokio-runtime-worker ThreadId(02) shadowsocks_service::server::tcprelay: crates/shadowsocks-service/src/server/tcprelay.rs:255: established tcp tunnel 85.140.23.181:14785 <-> chrome.cloudflare-dns.com:443 with ConnectOpts { fwmark: None, bind_local_addr: None, bind_interface: None, tcp: TcpSocketOpts { send_buffer_size: None, recv_buffer_size: None, nodelay: false, fastopen: true, keepalive: Some(15s), mptcp: false }, udp: UdpSocketOpts { mtu: None } }    

indicate that the tunnel 85.140.23.181:14785 <-> chrome.cloudflare-dns.com:443 has been established, i.e. the TCP socket connect() to the remote target succeeded.

and lines like:

TRACE   tokio-runtime-worker ThreadId(02) shadowsocks_service::server::tcprelay: crates/shadowsocks-service/src/server/tcprelay.rs:264: tcp tunnel 85.140.23.181:14785 <-> chrome.cloudflare-dns.com:443 closed, L2R 1563 bytes, R2L 4424 bytes    

tell us that the tunnel has finished. It copied 1563 bytes from local to remote, and 4424 bytes in the other direction. This shows the tunnel worked and actually transferred data between local and remote.
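For longer logs, reading these lines by eye gets tedious. Below is a rough sketch of a tally script, assuming the log lines look exactly like the three message shapes quoted in this thread ("established tcp tunnel", "closed, L2R ... R2L ..." and "closed with error"). The function name `summarize` and the counter keys are made up for illustration, and other ssserver versions may word these messages differently.

```python
import re
from collections import Counter

# Patterns modelled on the ssserver -vvv lines quoted in this thread;
# other versions may word these messages differently.
ESTABLISHED = re.compile(r"established tcp tunnel (\S+) <-> (\S+)")
CLOSED_OK = re.compile(
    r"tcp tunnel (\S+) <-> (\S+) closed, L2R (\d+) bytes, R2L (\d+) bytes"
)
CLOSED_ERR = re.compile(r"tcp tunnel (\S+) <-> (\S+) closed with error: (.+?)\s*$")


def summarize(lines):
    """Tally tunnel outcomes per remote target (host:port)."""
    stats = {}
    for line in lines:
        ok = CLOSED_OK.search(line)
        err = CLOSED_ERR.search(line)
        est = ESTABLISHED.search(line)
        if ok:
            entry = stats.setdefault(ok.group(2), Counter())
            entry["closed_ok"] += 1
            entry["l2r_bytes"] += int(ok.group(3))
            entry["r2l_bytes"] += int(ok.group(4))
        elif err:
            entry = stats.setdefault(err.group(2), Counter())
            entry["closed_with_error"] += 1
            entry["error: " + err.group(3)] += 1
        elif est:
            stats.setdefault(est.group(2), Counter())["established"] += 1
    return stats


# Shortened copies of lines from the logs discussed in this thread.
SAMPLE = [
    "DEBUG ... established tcp tunnel 85.140.23.181:14785 <-> chrome.cloudflare-dns.com:443 with ConnectOpts { .. }",
    "TRACE ... tcp tunnel 85.140.23.181:14785 <-> chrome.cloudflare-dns.com:443 closed, L2R 1563 bytes, R2L 4424 bytes    ",
    "TRACE ... tcp tunnel 85.140.23.181:14792 <-> ya.ru:443 closed with error: Broken pipe (os error 32)    ",
]

for target, counters in summarize(SAMPLE).items():
    print(target, dict(counters))
```

Targets that show many "closed_with_error" entries and few "closed_ok" ones are the first candidates to compare against what the browser reported.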

I didn't see anything abnormal in these logs.

@zonyitoo
Collaborator

zonyitoo commented May 6, 2024

BTW, I didn't see any UDP logs. Did you enable UDP mode? Or do you not need it in your environment?

@frozzway

frozzway commented May 6, 2024

So your browser still said that the site (ya.ru) cannot be reached?

Yep(

I didn't see anything abnormal in these logs.

Yes. That is the issue. Some connections are established fine, especially those that go through applications like Telegram or Instagram. But some break immediately (about 90% of those that come through Chrome or YouTube). And it is always random. I might 'get through' to some site only because I visited it a few minutes earlier without SS enabled (or maybe not because of that, I really do not know). But then I open one, two, three other random sites and poof: no more connections get established to any of them.

BTW, I didn't see any UDP logs. Or you don't need that in your environment?

I have used a tcp_only configuration just fine for several months. So I guess I don't need it.

As for the ya.ru connection: how can the logs show no abnormalities when I still couldn't reach the site?

Btw, I have not changed any server or client configuration myself since the issue appeared, or in the weeks before.

And the issue persists only on the mobile 4G network. Wi-Fi home/work networks are just fine.

@frozzway

frozzway commented May 6, 2024

I have rented another VM from a different hosting provider and deployed my configuration to it.
It's all the same.

@zonyitoo
Collaborator

zonyitoo commented May 6, 2024

And the issue persists only on the mobile 4G network. Wi-Fi home/work networks are just fine.

That's very interesting. You can see access logs on the server when you are using the 4G network, right? (For example, "accepted connection xxxx".)

As for the ya.ru connection: how can the logs show no abnormalities when I still couldn't reach the site?

I can only guess:

  1. The TCP connection to ya.ru works fine, but the TLS handshake between your browser and the remote server fails (data is transferred, but the connection closes right after the handshake).
  2. The data sent from ssserver was tampered with by a middleman, which made your browser show errors (probably not).
  3. The connection in ssserver's log was not the actual connection your browser was using.

@zonyitoo
Collaborator

zonyitoo commented May 6, 2024

I saw an error in your log:

TRACE   tokio-runtime-worker ThreadId(02) shadowsocks::net::tcp: crates/shadowsocks/src/net/tcp.rs:76: connected ya.ru:443 77.88.55.242:443    
DEBUG   tokio-runtime-worker ThreadId(02) shadowsocks_service::server::tcprelay: crates/shadowsocks-service/src/server/tcprelay.rs:255: established tcp tunnel 85.140.23.181:14792 <-> ya.ru:443 with ConnectOpts { fwmark: None, bind_local_addr: None, bind_interface: None, tcp: TcpSocketOpts { send_buffer_size: None, recv_buffer_size: None, nodelay: false, fastopen: true, keepalive: Some(15s), mptcp: false }, udp: UdpSocketOpts { mtu: None } }    
DEBUG   tokio-runtime-worker ThreadId(02) shadowsocks::relay::tcprelay::utils: crates/shadowsocks/src/relay/tcprelay/utils.rs:262: copy bidirection ends with error: Broken pipe (os error 32), a_to_b: Done(2136), b_to_a: Running(CopyBuffer { read_done: false, pos: 0, cap: 63, amt: 67337, .. })    
TRACE   tokio-runtime-worker ThreadId(02) shadowsocks_service::server::tcprelay: crates/shadowsocks-service/src/server/tcprelay.rs:273: tcp tunnel 85.140.23.181:14792 <-> ya.ru:443 closed with error: Broken pipe (os error 32)    

As you can see, a -> b (local -> remote) had already finished (probably EOF), but b -> a (remote -> local) failed with Broken pipe. The only explanation is that the local client closed the connection before it had received the whole response.

This is the connection that showed the error in Chrome. Chrome or your sslocal client actively closed the connection before it finished receiving the whole response.
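To illustrate what "Broken pipe (os error 32)" means at the socket level, here is a small standalone sketch. It does not use any shadowsocks code: it uses a local `socket.socketpair()` as a stand-in for the relay and client ends, closes the "client" end early, and reports the error the "relay" end gets when it keeps writing. The function name is made up for illustration, and the EPIPE result is what I would expect on Linux; other platforms may report a different errno.

```python
import errno
import socket


def write_after_peer_close():
    """Return the errno raised when writing to a socket whose peer has closed."""
    relay_side, client_side = socket.socketpair()
    try:
        client_side.close()  # the "local" end goes away before the response is delivered
        try:
            relay_side.sendall(b"x" * 4096)  # the relay still has response bytes to write
        except OSError as exc:
            return exc.errno
        return None
    finally:
        relay_side.close()


code = write_after_peer_close()
print(errno.errorcode.get(code, code))
```

Note that the writing side only learns that the other end of the connection is gone. From this error alone it cannot tell whether the client application closed the socket or something on the path tore the connection down.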

But we can still see that some connections to ya.ru finished successfully.

Which client application were you using? Could you see its logs about this connection?

@nickcarterney
Author

nickcarterney commented May 6, 2024

Which client application were you using? Could you see its logs about this connection?

I'm using Shadowsocks by Max Lv on Android (downloaded from the Google Play Store). I've never had this issue on 4G networks before.

@zonyitoo
Collaborator

zonyitoo commented May 7, 2024

@madeye Do you have any idea about this issue?

@frozzway

frozzway commented May 7, 2024

Which client application were you using? Could you see its logs about this connection?

I tried Shadowsocks by Max Lv and v2rayNG. Same symptoms.

Some logs from v2rayNG: https://gist.github.com/frozzway/28ef2645eefac19964fc14a618246e50
I do not see any abnormalities in them, but I still couldn't connect.

@frozzway

frozzway commented May 7, 2024

Found a similar issue report: shadowsocks/shadowsocks-android#3151 (comment)

@madeye
Contributor

madeye commented May 9, 2024

It looks like your ISP blocks all UDP traffic, causing DNS issues. Enabling a SIP003 plugin can solve the problem, as it makes the SS app work in TCP-only mode.
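One rough way to check the UDP theory from a laptop on the affected mobile network is to send the same DNS question once over UDP and once over TCP and compare. The sketch below is only an approximation of the claim: it probes plain DNS on port 53, not the Shadowsocks UDP relay itself. The helper names are made up for illustration; the server 8.8.8.8 and the name lm.facebook.com are taken from this thread.

```python
import socket
import struct


def build_query(name, qtype=1, txid=0x1234):
    """Encode a minimal DNS query (RFC 1035) for `name`; qtype 1 is an A record."""
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)  # RD flag, one question
    labels = name.rstrip(".").split(".")
    qname = b"".join(bytes([len(l)]) + l.encode("ascii") for l in labels) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)  # class IN


def rcode(reply):
    """Return the response code from a DNS reply header (0 = NOERROR, 3 = NXDOMAIN)."""
    if len(reply) < 4:
        raise ValueError("truncated DNS reply")
    return reply[3] & 0x0F


def recv_exact(sock, n):
    """Read exactly n bytes from a stream socket."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("connection closed mid-reply")
        buf += chunk
    return buf


def probe_udp(server, name, timeout=3.0):
    """Return the DNS rcode received over UDP, or None if nothing usable came back."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.settimeout(timeout)
        try:
            s.sendto(build_query(name), (server, 53))
            reply, _ = s.recvfrom(4096)
            return rcode(reply)
        except (OSError, ValueError):  # timeouts are OSError subclasses
            return None


def probe_tcp(server, name, timeout=3.0):
    """Return the DNS rcode received over TCP, or None if the exchange failed."""
    query = build_query(name)
    try:
        with socket.create_connection((server, 53), timeout=timeout) as s:
            s.sendall(struct.pack("!H", len(query)) + query)  # TCP DNS uses a length prefix
            (length,) = struct.unpack("!H", recv_exact(s, 2))
            return rcode(recv_exact(s, length))
    except (OSError, ValueError):
        return None
```

Calling `probe_udp("8.8.8.8", "lm.facebook.com")` and `probe_tcp("8.8.8.8", "lm.facebook.com")` on both Wi-Fi and 4G gives four results to compare. If UDP returns None or 3 only on 4G while TCP returns 0, that would be consistent with UDP being dropped or tampered with on the mobile network.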

@frozzway

frozzway commented May 9, 2024

It looks like your ISP blocks all UDP traffic, causing DNS issues. Enabling a SIP003 plugin can solve the problem, as it makes the SS app work in TCP-only mode.

Any thoughts on this one: #1518 (comment)?
We sort of discussed it in this topic because it is related to Russian mobile networks.

@volodalexey

Hello! I wanted to open a separate bug, but I think this is almost the same issue. I am also testing from Russia.
I have sslocal (port 7771) on my laptop, always connected to ssserver on a VPS.
Over Wi-Fi/LAN/WAN (local ISP) I can connect to e.g. https://bitbucket.com (tested with curl):

curl -x http://127.0.0.1:7771/ https://bitbucket.com -v
*   Trying 127.0.0.1:7771...
* Connected to (nil) (127.0.0.1) port 7771 (#0)
* allocate connect buffer!
* Establish HTTP proxy tunnel to bitbucket.com:443
> CONNECT bitbucket.com:443 HTTP/1.1
> Host: bitbucket.com:443
> User-Agent: curl/7.81.0
> Proxy-Connection: Keep-Alive
> 
< HTTP/1.1 200 OK
< Date: Thu, 26 Sep 2024 15:56:40 GMT
< 
* Proxy replied 200 to CONNECT request
* CONNECT phase completed!
* ALPN, offering h2
* ALPN, offering http/1.1
*  CAfile: /etc/ssl/certs/ca-certificates.crt
*  CApath: /etc/ssl/certs
* TLSv1.0 (OUT), TLS header, Certificate Status (22):
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.2 (IN), TLS header, Certificate Status (22):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS header, Finished (20):
* TLSv1.2 (IN), TLS header, Supplemental data (23):
* TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
* TLSv1.3 (IN), TLS handshake, Certificate (11):
* TLSv1.3 (IN), TLS handshake, CERT verify (15):
* TLSv1.3 (IN), TLS handshake, Finished (20):
* TLSv1.2 (OUT), TLS header, Finished (20):
* TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.2 (OUT), TLS header, Supplemental data (23):
* TLSv1.3 (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / TLS_AES_256_GCM_SHA384
* ALPN, server accepted to use h2
* Server certificate:
*  subject: C=US; ST=California; L=San Francisco; O=Atlassian US, Inc.; CN=*.bitbucket.com
*  start date: Feb 22 00:00:00 2024 GMT
*  expire date: Mar 24 23:59:59 2025 GMT
*  subjectAltName: host "bitbucket.com" matched cert's "bitbucket.com"
*  issuer: C=US; O=DigiCert Inc; CN=DigiCert Global G2 TLS RSA SHA256 2020 CA1
*  SSL certificate verify ok.
* Using HTTP2, server supports multiplexing
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* TLSv1.2 (OUT), TLS header, Supplemental data (23):
* TLSv1.2 (OUT), TLS header, Supplemental data (23):
* TLSv1.2 (OUT), TLS header, Supplemental data (23):
* Using Stream ID: 1 (easy handle 0x62921d699eb0)
* TLSv1.2 (OUT), TLS header, Supplemental data (23):
> GET / HTTP/2
> Host: bitbucket.com
> user-agent: curl/7.81.0
> accept: */*
> 
* TLSv1.2 (IN), TLS header, Supplemental data (23):
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* old SSL session ID is stale, removing
* TLSv1.2 (IN), TLS header, Supplemental data (23):
* Connection state changed (MAX_CONCURRENT_STREAMS == 64)!
* TLSv1.2 (OUT), TLS header, Supplemental data (23):
* TLSv1.2 (IN), TLS header, Supplemental data (23):
< HTTP/2 301 
< location: https://bitbucket.org/
< x-content-type-options: nosniff
< x-xss-protection: 1; mode=block
< atl-traceid: e5583a715c5247c18de313f7cc0bfc7f
< report-to: {"endpoints": [{"url": "https://dz8aopenkvv6s.cloudfront.net"}], "group": "endpoint-1", "include_subdomains": true, "max_age": 600}
< nel: {"failure_fraction": 0.001, "include_subdomains": true, "max_age": 600, "report_to": "endpoint-1"}
< strict-transport-security: max-age=63072000; includeSubDomains; preload
< access-control-allow-origin: *
< vary: Accept-Encoding
< server-timing: atl-edge;dur=2,atl-edge-internal;dur=3,atl-edge-upstream;dur=0,atl-edge-pop;desc="aws-eu-central-1"
< date: Thu, 26 Sep 2024 15:56:40 GMT
< server: AtlassianEdge
< 
* Connection #0 to host (nil) left intact

Using 4G from a smartphone, I cannot connect to https://bitbucket.com; it just hangs:

curl -x http://127.0.0.1:7771/ https://bitbucket.com -v
*   Trying 127.0.0.1:7771...
* Connected to (nil) (127.0.0.1) port 7771 (#0)
* allocate connect buffer!
* Establish HTTP proxy tunnel to bitbucket.com:443
> CONNECT bitbucket.com:443 HTTP/1.1
> Host: bitbucket.com:443
> User-Agent: curl/7.81.0
> Proxy-Connection: Keep-Alive
> 
< HTTP/1.1 200 OK
< Date: Thu, 26 Sep 2024 16:00:12 GMT
< 
* Proxy replied 200 to CONNECT request
* CONNECT phase completed!
* ALPN, offering h2
* ALPN, offering http/1.1
*  CAfile: /etc/ssl/certs/ca-certificates.crt
*  CApath: /etc/ssl/certs
* TLSv1.0 (OUT), TLS header, Certificate Status (22):
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* SSL connection timeout
* Closing connection 0
curl: (28) SSL connection timeout
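The same comparison can be scripted so that it reports which stage stalls. The sketch below mirrors the curl run: it sends a CONNECT request to a local HTTP proxy and then attempts a TLS handshake through it. The function names are made up for illustration, and the proxy address 127.0.0.1:7771 is just the sslocal port mentioned above.

```python
import socket
import ssl
import time


def connect_request(host, port):
    """Build a minimal CONNECT request for an HTTP proxy."""
    target = f"{host}:{port}"
    return f"CONNECT {target} HTTP/1.1\r\nHost: {target}\r\n\r\n".encode("ascii")


def proxy_status(response_head):
    """Extract the status code from the proxy's reply, e.g. b'HTTP/1.1 200 OK'."""
    return int(response_head.split(b"\r\n", 1)[0].split()[1])


def handshake_via_proxy(proxy, host, port=443, timeout=10.0):
    """Return (result, seconds); result names the stage that failed, if any."""
    start = time.monotonic()
    stage = "tcp connect to proxy"
    sock = None
    try:
        sock = socket.create_connection(proxy, timeout=timeout)
        stage = "proxy CONNECT"
        sock.sendall(connect_request(host, port))
        head = b""
        while b"\r\n\r\n" not in head:
            chunk = sock.recv(4096)
            if not chunk:
                raise ConnectionError("proxy closed the connection")
            head += chunk
        if proxy_status(head) != 200:
            raise ConnectionError(f"proxy answered {proxy_status(head)}")
        stage = "TLS handshake"
        # wrap_socket performs the handshake; the socket timeout still applies.
        sock = ssl.create_default_context().wrap_socket(sock, server_hostname=host)
        return f"ok ({sock.version()})", time.monotonic() - start
    except OSError as exc:  # covers timeouts and ssl.SSLError
        return f"failed during {stage}: {exc}", time.monotonic() - start
    finally:
        if sock is not None:
            sock.close()
```

Running `handshake_via_proxy(("127.0.0.1", 7771), "bitbucket.com")` once on Wi-Fi and once on 4G should show whether the failure is consistently in the TLS handshake stage, as in the curl output above, rather than in the CONNECT stage.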

@volodalexey

I compiled and enabled https://github.com/shadowsocks/v2ray-plugin for both sslocal and ssserver, and now it works!
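The exact plugin options used here were not posted, so the following is only a guess at the shape of the change: a small helper that adds the "plugin" and "plugin_opts" keys to an existing config. It assumes the v2ray-plugin binary is on PATH under that name and uses the plugin's default mode, where only the server side needs the "server" option. The helper name and the placeholder config values are hypothetical.

```python
import json


def with_plugin(config, server_side):
    """Return a copy of a shadowsocks config with v2ray-plugin enabled."""
    patched = dict(config)
    patched["plugin"] = "v2ray-plugin"  # assumed binary name, must be on PATH
    if server_side:
        patched["plugin_opts"] = "server"  # v2ray-plugin's server-mode switch
    return patched


# Placeholder values, not a real server.
base = {
    "server": "203.0.113.10",
    "server_port": 8388,
    "password": "change-me",
    "method": "chacha20-ietf-poly1305",
}

print(json.dumps(with_plugin(base, server_side=True), indent=4))
print(json.dumps(with_plugin(base, server_side=False), indent=4))
```

Both ends need the plugin enabled with matching settings, otherwise the client and server will not understand each other.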

@zonyitoo
Collaborator

So it is suspected that there is a firewall in RU cellular networks that tries to detect this kind of data packet.
